Tralics, a LaTeX to XML translator; Part I

3. Mathematics

3.1. Introduction

Mathematics plays a great role in TeX and Tralics. For instance, TeX has three modes: vertical mode, in which no typesetting is done, horizontal mode (where everything happens) and math mode, a mode in which special objects are handled; a two-phase process converts these special objects into normal ones. Fonts to be used in math mode have special properties (see appendices F and G of the TeXbook). Not all subtleties of TeX math can be implemented in Tralics; on the other hand, the XML translation conforms to MathML. MathML defines some entities; for instance, isoamsc.ent defines &rceil; as the character ⌉ (U+2309). As a consequence, Tralics will translate \rceil to <mo>&rceil;</mo> or <mo>&#x2309;</mo>, depending on an option. The translation of a footnote is in general a <footnote> element, and the user can change the name of this element; this is not done for maths: the name <mo> is a constant.

The syntax of mathematics is often strange. Instead of

\math{E=\fraction{1}{2} m\superscript{v}{2}}

you say

$E={1\over 2} mv^2$

Three category codes are defined for use in math mode; they correspond to the dollar sign (math shift), the underscore character (subscript) and the hat character (superscript). If you want a dollar or underscore character, you can say \$ or \_, but \^ produces an accent over what follows, not a hat character (in LaTeX, you can say \textasciicircum, provided that you can guess the name).

In the example above, we have two pseudo commands \fraction and \superscript (followed by two arguments) whereas the plain TeX version uses infix operators (placed between the arguments). The first operator is greedy. This means that, without the braces in the example above, everything before \over would be the numerator, and everything after it would be the denominator. On the other hand, you sometimes see 2¹6 instead of 2¹⁶, when people forget braces around the superscript. The essential difference however is that arguments are typeset in different styles: the nucleus (what precedes the hat operator) is typeset in text style, while numerator, denominator, superscripts and subscripts are in script style; moreover, if two objects are placed one above the other, cramped style is used for the object that is below the other one (i.e., the denominator or a subscript). The style influences spacing; because of commands like \over, the current style is known only after the whole expression is parsed. This explains why you may see: Package amsmath Warning: Foreign command \over; \frac or \genfrac should be used instead.
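The greediness of \over and the one-token rule for superscripts can be checked with a small test file. This is only a sketch; the formulas are ours and were chosen for illustration.

```latex
\documentclass{article}
\begin{document}
% \over is greedy: it takes everything in the current group.
$a+b \over c+d$      % numerator a+b, denominator c+d
${a+b \over c}+d$    % the braces limit the fraction to (a+b)/c
$\frac{a+b}{c}+d$    % prefix syntax, same meaning as the line above

% Without braces, a superscript is a single token.
$2^16$               % the superscript is 1; the 6 is on the baseline
$2^{16}$             % the superscript is 16
\end{document}
```

Loading amsmath in the preamble should trigger the warning quoted above for the first two formulas.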

TeX also has a notion of “inner” mode. Inside an inner object, you cannot put an outer one. Such a distinction exists also in HTML, where <div> is outer and <span> is inner. We explained in the previous chapter that \ifinner can be used to check whether the current mode is inner or outer, and we mentioned that, outside math mode, this is not well defined in Tralics. This may produce surprising results. Consider for instance \hbox{$$}. Inner mode is the rule inside a box, and a double dollar sign signals the start of an outer (display math) formula. You would expect this expression to provoke an error. In fact, TeX assumes that you know what you are doing, enters inner math mode when it sees the first dollar sign, and quits when it sees the second one; this gives an empty math formula (in fact, it will contain all tokens from the \everymath hook), surrounded by some space: the value of \mathsurround (this can be set to zero using \m@th). Note that a math formula defines a group: assignments made inside the formula are forgotten after full evaluation (in particular after this space is added).

The essential difference between inner (normal, inline) math and outer (display) math is that a display formula uses a line of its own (very often the formula is centered on the line). One could say that a display formula terminates the current paragraph. In fact, it is just interrupted, the paragraph continues after the formula (this is only interesting in constructions like \parshape, whose scope is the current paragraph; here a formula counts for three lines; not implemented in Tralics). The construction \hbox{$$ x$$} produces a display math formula in Tralics, instead of two empty math formulas. Before version 2.11.7, an error was signaled (because Tralics started a new paragraph at the end of the equation, and this is illegal in a box).
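The behaviour of a double dollar sign inside a box can be observed with a sketch like the one below. The 2pt value of \mathsurround and the surrounding letters are arbitrary choices of ours, made only so that the extra space becomes visible.

```latex
\documentclass{article}
\begin{document}
\mathsurround=2pt
% TeX reads an empty inline formula: \mathsurround is added on both sides.
a\hbox{$$}b

% With \mathsurround set to zero (locally), the empty formula adds no space.
{\mathsurround=0pt a\hbox{$$}b}

% TeX: two empty inline formulas around the letter x.
% Tralics (since version 2.11.7, as said above): one display formula.
\hbox{$$ x$$}
\end{document}
```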

A display math formula can have an equation number (via commands \eqno, \leqno, \tag, \notag; these commands were not implemented in early versions, and are described in the last chapter of the second part of this report). The MathML documentation says “One of the important uses of <mlabeledtr> is for numbered equations. In a <mlabeledtr>, the label represents the equation number and the elements in the row are the equation being numbered. The side and minlabelspacing attributes of <mtable> determine the placement of the equation number.” Thus, the recommended way, for MathML, is to use a table, like this (replace ellipsis by an expression)

<mtable>
  <mlabeledtr id='e-is-m-c-square'>
    <mtd>
      <mtext> (2.1) </mtext>
    </mtd>
    <mtd>
     ...
    </mtd>
  </mlabeledtr>
</mtable>

This mechanism is not yet implemented. We do not know how to insert numbers automatically, so the proposed solution is the following: you can use \label and \ref for any display math formula. This will add an id attribute to the <formula> object, which is a wrapper for the <math>.
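A minimal sketch of this usage is shown below. The label name is borrowed from the MathML example above, and the sentence around \ref is ours; the exact id value that Tralics generates is not shown because it depends on the document.

```latex
\documentclass{article}
\begin{document}
% The label is attached to the display formula; in Tralics it becomes
% an id attribute on the enclosing <formula> element.
\[ E = mc^2 \label{e-is-m-c-square} \]

% The reference points to that formula. In LaTeX itself, an unnumbered
% \[...\] gives \ref nothing meaningful to print.
Formula~\ref{e-is-m-c-square} relates energy and mass.
\end{document}
```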

When you say {\alpha^2}, TeX will enter math mode with an error of the form Missing $ inserted. On the other hand, Tralics will signal two errors, the first is Math only command \alpha. Missing dollar not inserted, the second is Missing dollar not inserted, token ignored: {Character ^ of catcode 7}. If you want a command that works in math mode and outside math mode, you can say:

\def\foo{\ifmmode \alpha^2 \else $\alpha^2$\fi}
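
As a quick check, the command can be called in both modes. The surrounding text and the extra term are ours; the definition is repeated so that the fragment stands alone.

```latex
\def\foo{\ifmmode \alpha^2 \else $\alpha^2$\fi}
% Outside math mode, \foo opens and closes its own formula.
The value \foo\ is positive.
% Inside math mode, \foo expands to the bare expression.
$\foo + 1 > 1$
```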

This can be generalised, using the following command

\DeclareRobustCommand{\ensuremath}{%
  \ifmmode
    \expandafter\@firstofone
  \else
    \expandafter\@ensuredmath
  \fi}
\long\def\@ensuredmath#1{$\relax#1$}

The purpose of the \relax on the last line is for the case of an empty argument: we do not want \ensuremath{} to expand to $$. Note that the argument is handled only once (i.e., \ensuremath does not read it, but calls a helper), because of subtle bugs; see the LaTeX bugs database, amslatex/2104. We shall say later `Mode independent commands are interpreted as usual´; this implies that the \relax token will do nothing. We shall see later that, in non-mathml mode, \relax appears in the result unless it is the first token in the list. Other commands, not listed in this chapter, may signal an error. For instance, \par is forbidden. Note that \mathchar provokes an Unimplemented command error. If you want a random Unicode character, you should use commands like \mathmi, \mathmo, \mathmn. You can also define a command via \chardef or \mathchardef (the result is the same), and use it; the result is always a <mi> element. The following example shows that \amp produces an ampersand sign in some cases; it must be used with care.

\chardef\AAA"1000
\chardef\CCC`x
\mathchardef\BBB"2000
$\mathbf{x\AAA\BBB\CCC} \mathmi{foo}\mathmo{\&\#666;}\mathmo{\amp\#777;}$

Translation

<formula type='inline'>
  <math xmlns='http://www.w3.org/1998/Math/MathML'>
    <mrow>
      <mi mathvariant='bold'>x</mi>
      <mi>&#x1000;</mi>
      <mi>&#x2000;</mi>
      <mi>x</mi>
      <mi>foo</mi>
      <mo>&amp;#666;</mo>
      <mo>&#777;</mo>
    </mrow>
  </math>
</formula>

Because a math expression translates as a <math> element inside a <formula>, and the <math> element has a long namespace attribute, examples will never fit on a single line. In order to make the result easier to read, we have inserted some newline characters, and reindented all these examples. Two consecutive newline characters are scanned by TeX as space plus \par. This space is ignored by TeX (see TeXbook, the text between exercises 14.12 and 14.13). Hence the general rule in Tralics: when a <p> element is ended, a trailing space or newline is removed from the content of the element, and a newline character is added to the parent of the <p>. As a result, you will very often see <p> at the start of a line and </p> at the end of a line in an XML file generated by Tralics.

Consider the following simple example:

$\alpha$ and $$\beta \label{foo}$$

The translation is the following

<p>
 <formula type='inline'>
  <math xmlns='http://www.w3.org/1998/Math/MathML'>
   <mi>&alpha;</mi>
  </math>
 </formula> and</p>
<formula id='uid1' type='display'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mi>&beta;</mi>
 </math>
</formula>

You can also say

\(\alpha\) and \[\beta \label{foo}\]

The result is exactly the same. In LaTeX, the commands \(, \), \[ and \] test the current mode. No such test is done by Tralics. The LaTeX implementation of \[ is a bit strange. If the formula is in vertical mode, it will be preceded by a box of width .6\linewidth containing nothing (except two \hss commands to fill it) preceded by the current paragraph indentation. The command \] executes \ignorespaces.

As you can see, there is some difference between a single dollar and a double dollar. In the first case, we are in normal math mode, otherwise in display math mode. One difference is the initial style: it is \textstyle for normal mode and \displaystyle otherwise (this will be explained later). A second difference is that the token list inserted when scanning the formula depends on the mode: it is \everymath or \everydisplay. The third difference is specific to Tralics. A display math formula is never `trivial´ (see section 3.5), and it can have a label (not more than one): in this case, the <formula> element has an id attribute. In any case, the <formula> element has a type attribute that says whether the formula is inline or display.

A non-display formula starts a paragraph; a display math formula cannot appear in a paragraph (the equivalent of \par is executed). If the first non-space token (after expansion) that follows the math formula is not \par, a \noindent token will be inserted (see line 34 of the transcript in section 3.3). Note that, in TeX, a math formula does not end a paragraph, in the sense that a \parshape is valid across math formulas; however, what precedes the formula is split into lines, according to the parameters in force at the start of the formula. Tralics does not split paragraphs into lines, and does not implement \parshape.

3.2. The basic objects

The following environments are recognized outside math mode, and produce a math formula: eqnarray*, align*, aligned, split, multline, equation*, math and displaymath. When Tralics sees a dollar character, it looks at the next character (without expansion). If this is a dollar sign, it will be read, and display math mode is entered; otherwise, normal math mode is entered. All environments shown above start display math mode (except math, which enters normal math mode). The environments math and displaymath are equivalent to $...$ and \[...\] respectively. The environments eqnarray and split are implemented as arrays. There is no difference between

\begin{eqnarray} a&b\\ c&d \end{eqnarray}
\begin{split} a&b\\ c&d \end{split}

and

\[\begin{array}{rcl} a&b\\ c&d \end{array}\]
\[\begin{array}{rl} a&b\\ c&d \end{array}\]

Environments equation and align are translated as normal math. A star after the environment name is ignored. In the case of normal math mode, the content of the token list \everymath is inserted before the formula, for displaymath it is \everydisplay. For instance, if you say

\everymath={(N)\ }
\everydisplay={(D)\ }
$\alpha$ and $$\beta$$

the translation will be

<p>
 <formula type='inline'>
  <math xmlns='http://www.w3.org/1998/Math/MathML'>
   <mrow>
    <mo>(</mo><mi>N</mi><mo>)</mo><mspace width='6pt'/>
    <mi>&alpha;</mi></mrow></math></formula> and</p>
<formula type='display'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
   <mo>(</mo><mi>D</mi><mo>)</mo><mspace width='6pt'/>
   <mi>&beta;</mi>
  </mrow>
 </math>
</formula>

In TeX, you can put anything inside a math formula, provided it is hidden in a box; this is not possible in Tralics, because we want the XML result to be conforming to MathML. We shall list here all commands valid in math mode, and explain later on how they are translated.

The commands \limits, \nolimits and \displaylimits can be used just after an operator and before subscripts or superscripts, as in \int \limits _x. They are currently ignored by Tralics.
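For reference, here is a sketch of where these three commands may appear; the operators and bounds are ours. In TeX they change the placement of the bounds; according to the statement above, Tralics reads them and drops them.

```latex
\documentclass{article}
\begin{document}
% \limits: bounds above and below the operator, even in text style.
$\sum\limits_{i=1}^{n} x_i$
% \nolimits: bounds set as ordinary subscript and superscript.
$\int\nolimits_0^1 f$
% \displaylimits: bounds above and below only in display style.
$\sum\displaylimits_{i} x_i$
\end{document}
```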

The following environments are recognized: array, matrix, pmatrix, bmatrix, Bmatrix, vmatrix, Vmatrix. All these environments produce arrays. For the first, an argument is required, explaining how cells are aligned. For all other environments, cells are centered. Environments of the form Xmatrix have fences, an implicit \left and \right. In order: parentheses, braces, brackets, simple bars, double bars. There is also an environment cases, with two columns, left aligned, that has an open brace as left delimiter, an empty right delimiter. Example

$\begin{array}{lcr}a&b&c\end{array}
\begin{bmatrix}d&e\\f&g\end{bmatrix}$

The translation is the following.

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
    <mtable>
     <mtr>
      <mtd columnalign='left'><mi>a</mi></mtd>
      <mtd><mi>b</mi></mtd>
      <mtd columnalign='right'><mi>c</mi></mtd>
     </mtr>
    </mtable>
   <mfenced open='{' close='}'>
    <mtable>
     <mtr>
      <mtd><mi>d</mi></mtd>
      <mtd><mi>e</mi></mtd>
     </mtr>
     <mtr>
      <mtd><mi>f</mi></mtd>
      <mtd><mi>g</mi></mtd>
     </mtr>
    </mtable>
   </mfenced>
  </mrow>
 </math>
</formula>
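
The cases environment mentioned above can be tried in the same way. The following is a sketch only: the formula and the use of \text for the conditions are ours, and the amsmath package is assumed when the file is compiled with LaTeX.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Two left-aligned columns, an open brace on the left, nothing on the right.
$|x| = \begin{cases}
  x  & \text{if } x \ge 0\\
  -x & \text{otherwise}
\end{cases}$
\end{document}
```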

The following delimiters are recognized: <, >, ., (, ), [, ], |, \{, \}, \langle, \rangle, \lbrace, \rbrace, \lceil, \rceil, \lgroup, \rgroup, \lfloor, \rfloor, \lmoustache, \rmoustache, \vert, \Vert, \uparrow, \downarrow, \updownarrow, \Uparrow, \Downarrow, \Updownarrow. A delimiter is anything that can follow \left or \right. For MathML, this has to be a character. As the following example shows, we use in most cases a character entity.

$\left\lceil \left\uparrow x\right\}\right.$
$\lceil \uparrow x\}$

The translation is

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
   <mfenced open='&lceil;' close='.'>
     <mfenced open='&uparrow;' close='&rbrace;'>
       <mi>x</mi></mfenced></mfenced></math></formula>
<formula type='inline'>
  <math xmlns='http://www.w3.org/1998/Math/MathML'>
    <mrow><mo>&lceil;</mo><mo>&uparrow;</mo><mi>x</mi><mo>}</mo>
    </mrow></math></formula>
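
A few more delimiter pairs, taken from the list above, are sketched below; the formulas are ours. The period stands for an empty delimiter, which is how a one-sided fence is written.

```latex
\documentclass{article}
\begin{document}
% A matching pair of angle brackets.
$\left\langle \frac{a}{b} \right\rangle$
% An empty left delimiter and a vertical bar on the right.
$\left. \frac{df}{dx} \right\vert_{x=0}$
% Left and right delimiters need not match.
$\left( x \right]$
\end{document}
```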

This is the list of commands allowed in math mode, as well as in text mode: \dots, \ldots, \quad, \qquad, \␣, \$, \%, \&, \!, \,, \{, \}, \i, \sharp, \natural, \flat, \_. The following commands produce space: \;, \:, \>. Note that \! produces a negative space in math mode, and nothing outside math mode. Example of use:

\def\alist{\i\j\$\,\_\&\{\}\%\ \^^J\^^I\^^M\!}
\def\blist{\quad,\qquad,\dots,\sharp,\natural,\flat}
\alist\blist
$\alist\blist$

This is the translation, with nobreak space replaced by tilde:

&#x131;j$ _&amp;{}%    ~~~,~~~~~~,...,&#x266F;,&#x266E;,&#x266D;
<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow><mo>&inodot;</mo><mi>j</mi><mi>$</mi><mspace width='0.166667em'/>
  <mo>~</mo><mo>&amp;</mo><mo>{</mo><mo>}</mo><mo>%</mo>
  <mspace width='6pt'/><mspace width='6pt'/>
  <mspace width='6pt'/><mspace width='6pt'/>
  <mspace width='-0.166667em'/><mspace width='1.em'/><mo>,</mo>
  <mspace width='2.em'/>
  <mo>,</mo><mo>&ctdot;</mo><mo>,</mo><mo>&sharp;</mo><mo>,</mo>
  <mo>&natur;</mo><mo>,</mo><mo>&flat;</mo></mrow>
 </math>
</formula>

We give here the list of all symbols that have a translation of the form <mi>α</mi>. They are of type Ord (ordinary symbol). We start with the lower case Greek letters: \alpha, \beta, \gamma, \delta, \epsilon, \varepsilon, \zeta, \eta, \theta, \iota, \kappa, \lambda, \mu, \nu, \xi, \pi, \rho, \sigma, \tau, \upsilon, \phi, \chi, \psi, \omega, \varpi, \varrho, \varsigma, \varphi, \vartheta, \varkappa, then upper case Greek letters: \Gamma, \Delta, \Theta, \Lambda, \Xi, \Sigma, \Upsilon, \Phi, \Pi, \Psi, \Omega, then other symbols: \hbar, \ell, \wp, \Re, \Im, \partial, \infty, \emptyset, \nabla, \surd, \top, \bottom, \bot, \angle, \triangle. Example

$\alpha\Gamma \surd$

This translates as

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow><mi>&alpha;</mi><mi>&Gamma;</mi><mi>&radic;</mi>
  </mrow></math></formula>

Next comes the list of all symbols whose translation is like <mo form='prefix'>log</mo>. They are of type Ord (ordinary symbol), though they should be Op (large operator). The list is divided in two parts: these have movable limits: \det, \gcd, \inf, \injlim, \liminf, \limsup, \max, \min, \sup, \projlim, and these have not: \dim, \exp, \hom, \ker, \lg, \lim, \ln, \log, \Pr, \arccos, \arcsin, \arctan, \arg, \cos, \cosh, \cot, \coth, \csc, \deg, \sec, \sin, \@mod, \sinh, \tan, \tanh. Example

$\displaystyle\lim_a \liminf_a \sin_a \hom_a$

The LaTeX translation is $lim_{a} \underset{a}{lim inf} {sin}_{a} {hom}_{a}$ , and the Tralics version is

<formula type='inline'>
<math xmlns='http://www.w3.org/1998/Math/MathML'>
<mstyle scriptlevel='0' displaystyle='true'>
<mrow>
 <msub><mo movablelimits='true' form='prefix'>lim</mo> <mi>a</mi> </msub>
 <msub><mo movablelimits='true' form='prefix'>lim inf</mo><mi>a</mi></msub>
 <msub><mo form='prefix'>sin</mo> <mi>a</mi> </msub>
 <msub><mo form='prefix'>hom</mo> <mi>a</mi> </msub>
</mrow></mstyle></math></formula>

From now on, all symbols translate into the form <mo>...</mo>. We start with symbols of type Ord. In reality, most of them should be of type Op (large operator). \mho, \clubsuit, \diamondsuit, \heartsuit, \spadesuit, \aleph, \backslash, \Box, \imath, \jmath, \square, \cong, \lnot, \neg, \forall, \exists, \coprod, \bigvee, \bigwedge, \biguplus, \bigcap, \bigcup, \int, \sum, \prod, \bigotimes, \bigoplus, \bigodot, \oint, \bigsqcup, \smallint. Examples

$\bigcap \int\oint$

The translation is

<mrow><mo>&bigcap;</mo><mo>&int;</mo><mo>&oint;</mo></mrow>

These are of type Bin (binary operator). \triangleleft, \triangleright, \bigtriangleup, \bigtriangledown, \wedge, \land, \vee, \lor, \cap, \cup, \multimap, \dagger, \ddagger, \sqcap, \sqcup, \amalg, \diamond, \Diamond, \bullet, \wr, \div, \odot, \oslash, \otimes, \ominus, \oplus, \uplus, \mp, \pm, \circ, \bigcirc, \setminus, \cdot, \ast, \times, \star, \in. Example

$\cap \cup \wr$

The translation is

<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'>
<mrow><mo>&cap;</mo><mo>&cup;</mo><mo>&wr;</mo></mrow></math></formula>

These are of type Rel (relation). \propto, \sqsubseteq, \sqsupseteq, \sqsubset, \sqsupset, \parallel, \mid, \dashv, \vdash, \Vdash, \models, \nearrow, \searrow, \nwarrow, \swarrow, \Leftrightarrow, \Leftarrow, \Rightarrow, \ne, \neq, \le, \leq, \ge, \geq, \succ, \approx, \succeq, \preceq, \prec, \doteq, \supset, \subset, \supseteq, \subseteq, \bindnasrepma, \ni, \gg, \ll, \gtrless, \geqslant, \leqslant, \not, \notin, \leftrightarrow, \leftarrow, \owns, \gets, \rightarrow, \to, \mapsto, \sim, \simeq, \perp, \equiv, \asymp, \smile, \iff, \leftharpoonup, \leftharpoondown, \rightharpoonup, \rightharpoondown, \hookrightarrow, \hookleftarrow, \Longrightarrow, \longrightarrow, \longleftarrow, \Join, \longmapsto, \frown, \bowtie, \Longleftarrow, \longleftrightarrow, \Longleftrightarrow. Example.

$\approx\leftrightarrow\Longleftrightarrow$

Translation:

<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'>
<mrow><mo>&approx;</mo><mo>&leftrightarrow;</mo>
<mo>&Longleftrightarrow;</mo></mrow></math></formula>

These are of type Inner: \cdots, \hdots, \vdots, \ddots. These are of type Between (they are of type Ord in TeX, but are used as opening or closing delimiters): \Vert, \|, \vert, \uparrow, \downarrow, \Uparrow, \Downarrow, \Updownarrow, \updownarrow. These are of type Open and Close: \rangle, \langle, \rmoustache, \lmoustache, \rgroup, \lgroup, \rbrace, \lbrace, \lceil, \rceil, \lfloor, \rfloor.

The following characters are classified as `small´: <>,.:;*?!x, these are classified as `small-l´ and `small-r´: ()[], the vertical bar is small-l, these are bin: +/ and the equals sign is of type Rel. Note: what you see here as x is in reality the character 215. It cannot be printed in verbatim mode by LaTeX.

$<>,.:;*?!x ()[]|+-/=$

Translation:

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
   <mrow><mo>&lt;</mo><mo>&gt;</mo><mo>,</mo><mo>.</mo><mo>:</mo>
      <mo>;</mo><mo>*</mo><mo>?</mo><mo>!</mo><mi>&times;</mi><mo>(</mo>
      <mo>)</mo><mo>[</mo><mo>]</mo><mo>|</mo><mo>+</mo><mo>-</mo>
      <mo>/</mo><mo>=</mo>
   </mrow></math></formula>

The following commands are used for accents: \acute, \grave, \mathring, \ddddot, \dddot, \ddot, \tilde, \widetilde, \bar, \breve, \check, \hat, \widehat, \vec, \overrightarrow, \overleftarrow, \underrightarrow, \underleftarrow, \dot.

The following commands are special. They will be explained later: \overline, \underline, \stackrel, \underset, \overset, \mathchoice, \frac, \overbrace, \underbrace, \genfrac, \dfrac, \tfrac, \sqrt, \root.
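Before these commands are explained, the following sketch shows their input syntax only; the formulas are ours, and the amsmath package is assumed for \underset, \overset, \dfrac and \tfrac. It says nothing about how Tralics translates them.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Roots: \sqrt has an optional first argument, \root uses infix \of.
$\sqrt{x} + \sqrt[3]{y} + \root 3 \of z$
% One object placed above or below another.
$a \stackrel{\mathrm{def}}{=} b \quad \underset{n}{\max} \quad \overset{*}{X}$
% Lines and braces over or under an expression.
$\overline{a+b} \quad \underline{c} \quad \underbrace{x+\cdots+x}_{n}$
% Fractions in the current, display and text styles.
$\frac12 \quad \dfrac12 \quad \tfrac12$
\end{document}
```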

3.3. Parsing a math formula

This is a non-trivial operation; for this reason, in verbose mode, the math expression is printed in the transcript file. For instance, given

\tracingall
$\begin{cases} x &y\\a&b \end{cases} \mkern18mu x^{ {2 }}!$

whose translation in no-mathml mode is

<texmath type='inline'>
 {\left\rbrace \begin{array}{ll} x &amp;y\\a&amp;b \end{array}\right.}
 \hspace{10.0pt}x^{ {2 }}!
</texmath>

the transcript file will contain

1 {math shift character $}
2 +stack: level + 2 for math entered on line 2
3 +stack: level + 3 for math entered on line 2
4 \cases ->\left \{\begin {array}{ll}
5 +stack: level + 4 for math entered on line 2
6 +stack: level + 5 for cell entered on line 2
7 +stack: level + 6 for math entered on line 2
8 +stack: level - 6 for math from line 2
9 +stack: level - 5 for cell from line 2
10 +stack: level + 5 for cell entered on line 2
11 +stack: level - 5 for cell from line 2
12 +stack: level + 5 for cell entered on line 2
13 +stack: level - 5 for cell from line 2
14 +stack: level + 5 for cell entered on line 2
15 \endcases ->\end {array}\right .
16 +stack: level - 5 for cell from line 2
17 +stack: level - 4 for math from line 2
18 +stack: level - 3 for math from line 2
19 +scanint for \mkern->18
20 +scandimen for \mkern->18.0mu
21 +stack: level + 3 for math entered on line 2
22 +stack: level - 3 for math from line 2
23 +stack: level + 3 for math entered on line 2
24 +stack: level + 4 for math entered on line 2
25 +stack: level - 4 for math from line 2
26 +stack: level - 3 for math from line 2
27 +stack: level - 2 for math from line 2
28 Math: $\begin {cases}{\left\{\begin {array}{ll} x &y\\a&b\end{cases}
29 \end {array}\right.} \mkern\hspace{10.0pt}x^{ {2 }}!$
30 +scanint for \hspace->10
31 +scandimen for \hspace->10.0pt
32 {scanglue 10.0pt\relax }
33 Realloc xml math table to 20
34 {Push p 1}

We shall explain for each line in the transcript file where it comes from. Math mode scanning is entered when the translator sees a math shift character (line 1). The scanner reads some tokens and puts them in a list. The list is printed at the end (lines 28-29). The start of the formula is a bit special, in that the token that follows the first dollar sign is considered unexpanded when we check for a double dollar sign. A new group is entered, before scanning the whole formula (line 2).

The loop is as follows:

A token is read and expanded. Lines 4 and 15 show expansion of user commands. An error is signaled in the case of end of data.
In the case of \nobreakspace, we insert a ~.
If we get a font command, we proceed as follows. First \cal is transformed into \mathcal. The font can be \mathtt, \mathcal, \mathbf, \mathrm, \mathit, \mathbb, \mathsf. These are basic math fonts; they have an inner variant, of the form \@mathtt. There is also \mathnormal. The command \mathfrak selects a Fraktur variant. We allow old fonts (like \rm, \sf, \tt, \bf, \it, \sl), font switches of the form \rmfamily, or font commands that take an argument like \textrm. These fonts have an inner variant, say T. If the font takes no argument, then the token T is inserted (as explained for \cal above). Otherwise, let S be the current math font. In this case, an argument is read, then S, T and this argument are pushed back, to be read again. For instance, if the current font is `sf´, then \mathrm{foo} produces \@mathrm foo\@mathsf: these are five tokens to be read again. Note: \mathbbm is an alias for \mathbb. The name of the internal font has changed since the first edition; it has the form mml@font@xxx, where the suffix is one of normal, upright, bold, italic, bolditalic, script, boldscript, fraktur, doublestruck, boldfraktur, sansserif, boldsansserif, sansserifitalic, sansserifbolditalic, and monospace. Details can be found in the second part of this report.
In all other cases the current token is added to the list. In particular, this explains why the trace starts with a dollar and \begin.
If the token is an open brace, in fact any character of category code 1, a new math group is read. You can see on lines 23 and 24 that the stack level increases (a new semantic level is entered, all assignments are local).
If the token is a close brace, in fact any character of category code 2, this terminates the current math group (see lines 25 and 26). An error is signaled in case the current group should be closed differently (for instance with \end, or \right, etc.)
If the token is a dollar sign, in fact any character of category code 3, then there are four alternatives. This dollar sign can be the end of the math formula. If we are in display math, a token is read with expansion. The error Display math should end with $$ is signaled if this is not a dollar sign (in fact, a character of category code 3). If the current group is defined by \hbox, this can be the start of a math formula (never display math). The tokens in \everymath are inserted, then a math formula is read. Otherwise, the error Extra $ ignored... is signaled, and parsing continues.
In the case of \label, an argument is read. If we are not in display math, or if the formula already has a label, you get an error: Some labels may be lost. Whatever the location of the label, an attribute will be added to the <formula> element that contains the <math> element.
In the case of \ensuremath, a token list is read, and pushed back, so that this command acts as \@firstofone.
The case \begin or \end is considered next. We make the assumption that this is a user defined environment, or a math environment. In the case of the example, we have a user environment that expands to a math environment. For a user defined environment, the following is executed:
```
  {\cases .... \endcases}
```
In the trace, lines 28-29, you will see both \begin{cases} and the result of the expansion of \cases, but not \cases or the brace. However, you can see on lines 4 and 15 the expansion of the user defined commands, and on lines 3 and 18 the braces; these braces can also be seen in the translation in no-mathml mode. You can see on lines 6 and 16 that a group (named `cell´) is opened and closed, because the builtin math environment starts a cell. This allows & or \\ tokens. The group defined on lines 7-8 does not exist in TeX. Let's hope for the best: the argument of the array should contain only letters. Whether these characters should be expanded is unclear.
In the case \left and \right, a delimiter is read. The rules are: \relax and space tokens are ignored. After full expansion, the result should be one of the tokens listed above as valid delimiters, otherwise an error of the form Invalid character in \left or \right can be signaled. These commands come in pairs; you might get errors like Missing \right. inserted, or Unexpected \right. These commands define a group, see lines 5 and 17.
Case & and \\. These are valid only inside a cell group. They terminate a cell group and start a new one. See lines 9 to 14.
The \of token is ignored.
The \mathchoice command reads 4 arguments, which are remembered.
Case of \frac, etc. These commands read their arguments. The main token list will contain a special slot, with the name of the command and the arguments. In the case of \sqrt, the first argument is optional. You say \root A \of B.
The syntax of \genfrac is special. It takes six arguments. The equivalent of the following commands is executed when Tralics bootstraps:
```
  \def\binom{\genfrac()\z@{}}
  \def\dbinom{\genfrac(){0pt}0}
  \def\tbinom{\genfrac(){0pt}1}
```
This defines three commands that take two arguments with regular syntax. The first two arguments of \genfrac are delimiters or empty. The next one is a dimension or empty. If empty, a default dimension will be used. In the example, the first argument is an opening parenthesis, the second is a closing parenthesis, the third is zero. The next argument is empty or a number between 0 and 3. These numbers correspond to a style: \displaystyle, \textstyle, \scriptstyle, and \scriptscriptstyle respectively. Currently an explicit number is required; everything else is treated as an empty list. On the other hand, the dimension is scanned via the scandimen routine (the procedure that prints lines of the form 19-20).
\hbox and friends. Currently, only \hbox is implemented. The current value of \everyhbox token list is inserted, and the argument is read. There are restrictions, see later.

Case of \mbox, \text, \makebox. Like \hbox, but no \everyXXX token list is inserted. Example

\everyhbox{A}
\everymath{B}
\everydisplay{C}
\[a=\hbox{bc d $ef g$}h i\text{OK}\]

Translation, in no-mathml mode.

<texmath type='display'>Ca=\text{Abc} \text{d} Bef gh i\text{OK}</texmath>

Case of a math font, for instance \@mathcal. The font command is inserted in the token list, but the variable holding the current font is (locally) updated.
Case of a math command (like \alpha, listed above). The command is read. There is a special hack: \not\in is converted to \notin, and \not= to \ne.
Case of \hspace, \vspace. An argument, preceded by an optional space, is read. In the case of \hspace, a space is added, otherwise the command is ignored.
\kern, \mskip, \mkern, \hskip, \vskip. See the example: on lines 19 and 20, there is the trace of the routines that read the argument. The result is converted into a \hspace, with argument delimited by braces. Look at the trace, line 29. This will be read again later, see lines 30 to 32. In the case of \vskip, we should convert to \vspace, and re-insert, but the argument is ignored. In the case of \mkern, the result is converted into pt units, using the rule 18mu=10pt; in the case of \mskip, the stretch and shrink parts of the glue are discarded.
An apostrophe is handled in a special way, as explained in the TeXbook. Essentially x' is x^{\prime} and x'^2 is x^{\prime2}.
Mode independent commands are interpreted as usual (this includes undefined commands). This should not typeset anything.
A character is remembered, together with the current font.
Mode-independent tokens are evaluated. These are commands like \def that change the environment, and have an empty translation. For instance, \relax commands are handled here.
Everything else is inserted without evaluation.
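
Some of the cases of the loop above can be exercised with a sketch like the following. The formulas are ours and the amsmath package is assumed when compiling with LaTeX. The comments restate the rules given above and are not output copied from Tralics; the value 5pt follows from the rule 18mu=10pt, since 9 × 10/18 = 5.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% \dbinom is defined as \genfrac(){0pt}0, so both sides are the same.
$\dbinom{n}{k} = \genfrac(){0pt}0{n}{k}$
% The special hack: \not\in is read as \notin, and \not= as \ne.
$x \not\in A \quad a \not= b$
% Apostrophes: f' is f^{\prime}, and h'^2 is h^{\prime2}.
$f'(x) + h'^2$
% \mkern is converted to \hspace in pt units: 9mu corresponds to 5pt.
$a \mkern9mu b$
\end{document}
```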

We give here an example with some fonts.

$\mathtt{Ab}\mathcal{Cd}\mathbf{Ef}\mathrm{Gh}\mathit{Ij}
\mathbb{Kl}\mathsf{Mn}$

The translation is as follows. You can notice that some variants affect only uppercase letters.

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
   <mi mathvariant='monospace'>A</mi>
   <mi mathvariant='monospace'>b</mi>
   <mi>&Cscr;</mi>
   <mi>d</mi>
   <mi mathvariant='bold'>E</mi>
   <mi mathvariant='bold'>f</mi>
   <mi> G </mi>
   <mi> h </mi>
   <mi>I</mi>
   <mi>j</mi>
   <mi>&Kopf;</mi>
   <mi>l</mi>
   <mi mathvariant='sans-serif'>M</mi>
   <mi mathvariant='sans-serif'>n</mi>
  </mrow>
 </math>
</formula>

3.4. Translation of arrays

Whenever we see an array (this can be a global environment like eqnarray or a local one, like array), we translate all cells one after the other. The character & is the cell separator. The command \\ is the row separator. In the case where an array ends with a \\, this gives an empty row: it will be removed. Each cell has an alignment, left, right, or center. An attribute is added only if this is not center. The array environment has an argument that explains the type of the columns (columns not indicated are centered). The default alignment is `rl´ for split and align, `rcl´ for eqnarray, centered for matrix. You can use \multicolumn. This command takes three arguments: the span which should be some integer, then the alignment (one of r, l or c) and the content of the cell. The program may signal errors in case of wrong syntax. Here is an example:

$\begin{array}{rcl}
a&b&c&d\\
A&\multicolumn{1}{r}{B}&C&D\\
\end{array}$

This is the translation of the array.

<mtable>
 <mtr>
  <mtd columnalign='right'><mi>a</mi></mtd>
  <mtd><mi>b</mi></mtd>
  <mtd columnalign='left'><mi>c</mi></mtd>
  <mtd><mi>d</mi></mtd>
 </mtr>
 <mtr>
  <mtd columnalign='right'><mi>A</mi></mtd>
  <mtd columnalign='right' columnspan='1'><mi>B</mi></mtd>
  <mtd columnalign='left'><mi>C</mi></mtd>
  <mtd><mi>D</mi></mtd>
 </mtr>
</mtable>
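
The way alignment attributes are chosen in the example above can be sketched in Python. This is an illustration under simplifying assumptions, not the Tralics implementation: the column specification is assumed to contain only the letters r, c and l, and the helper name column_attributes is hypothetical.

```python
def column_attributes(spec: str, ncols: int) -> list:
    """Return the columnalign value of each cell, or None when no attribute is added."""
    names = {"l": "left", "r": "right", "c": None}  # centered cells get no attribute
    attrs = []
    for i in range(ncols):
        # Columns not described by the specification are centered.
        letter = spec[i] if i < len(spec) else "c"
        attrs.append(names[letter])
    return attrs


# First row of the example: spec 'rcl', four cells.
print(column_attributes("rcl", 4))
```

A \multicolumn cell overrides the value found here with its own alignment, as the second cell of the second row shows.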

3.5. Trivial math

If you say ` $x$ and $123$ ´, the translation will be

<p><formula type='inline'><simplemath>x</simplemath></formula> and 123</p>

Initially, we found this a good idea, because the result can easily be converted to HTML as a plain x. Moreover ` $2^{i\grave eme}$ ´ gives

<temporary>2<hi rend='sup'>e</hi></temporary>

Here the <temporary> element will not show in the XML tree, but is printed on the terminal if Tralics is called with the `interactivemath´ switch. If you invoke Tralics with the `-notrivialmath´ switch, these hacks are not tried, and the formula translates into:

<formula type='inline'>
  <math xmlns='http://www.w3.org/1998/Math/MathML'>
   <msup>
    <mn>2</mn>
    <mrow>
     <mi>i</mi>
     <mover accent='true'><mi>e</mi> <mo>&grave;</mo></mover>
     <mi>m</mi>
     <mi>e</mi>
    </mrow>
   </msup>
  </math>
</formula>

There are three hacks: the first is when the formula contains only a letter, the second is when the formula contains only digits, and the last one is when people use a math formula instead of \textsuperscript. This last hack is applied only if the math formula starts with digits (no digit at all is OK; braces are ignored) followed by an exponent marker, followed by a special exponent; this has to be a single token or a token list. In the case of a single token, the hack is applied only if this is e or o. Typically, it applies in cases like 2^e and N^o. In the case of more than one token, it applies when the exponent is `th´, `st´, `rd´ or `nd´, for cases like 1^st, 2^nd, 3^rd, and 4^th. There are four rules for French: `e´, `eme´, `ieme´, `ème´ and `ième´ convert to `e´; `ier´ and `er´ convert to `er´; `iemes´, `ièmes´ and `es´ convert to `es´; `ère´ and `re´ convert to `re´. The accented letter can be given as è, or \`e or \`{e} or \grave{e} or \grave e. The hack is applied in a case like:

$2 ^{\text{\small\rm \grave ere}} $

Instead of \text, \hbox can be used. Instead of \small or \rm any font change or font size command can be used. Up to two commands can be given. The original Perl version had 30 exceptions, including $\Sigma{}^{{\rm it}}$ and \ddot{\rm o}. Compare $Σ^{it}$ with Σ^it and $\ddot{o}$ with ö.
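
The exponent rules listed above can be summarized as a lookup table. The Python sketch below is only a restatement of these rules; the names FRENCH, ENGLISH and special_exponent are hypothetical, and the exponent is assumed to be already reduced to a plain string (font commands removed, accents resolved).

```python
# The four French rules: each variant maps to its normalized form.
FRENCH = {
    "e": "e", "eme": "e", "ieme": "e", "ème": "e", "ième": "e",
    "ier": "er", "er": "er",
    "iemes": "es", "ièmes": "es", "es": "es",
    "ère": "re", "re": "re",
}
# English ordinals are kept unchanged.
ENGLISH = {"th", "st", "rd", "nd"}


def special_exponent(exp: str):
    """Return the text of the superscript, or None if the hack does not apply."""
    if exp in ENGLISH or exp == "o":  # 'o' is the single-token case N^o
        return exp
    return FRENCH.get(exp)


print(special_exponent("ième"))  # normalized French ordinal
print(special_exponent("xyz"))   # not a special exponent
```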

Since version 2.8, there is an integer register named \notrivialmath that controls these hacks; it initially contains 1, it is set to zero if Tralics is called with the -notrivialmath switch, to seven if Tralics is called with the -trivialmath switch (and to 349 if Tralics is called with -trivialmath=349). If the value is $A + 2 B + 4 C$ modulo 8, where A, B, and C are zero (false) or one (true), then the behavior is the following (by default A is true, other flags are false).

If A is true, in the case where the math formula contains optional digits followed by a special exponent, the rule explained above is applied; the special exponent can be one of th, rd, nd, st (for English), or those shown above for French.
If B is true, some math formulas containing a single token produce a non-math result. We have shown above the translation of a digit or a letter. The translation of a minus sign is an en-dash (so that $-$ is the same as --). Other characters are not considered trivial math. Most commands whose translation is a single MathML element, including Greek letters, are converted into their content. Thus, the translation of $\alpha$ is α rather than some long formula.
If C is true, in the case where the formula starts with a hat or underscore, followed by a character, or a simple token list, and nothing more, then the result is a superscript, or subscript, out of math mode. It is as if you had used \textsuperscript or \textsubscript. A simple list is a list of characters, with an optional font change at the beginning. This rule has precedence over the first. Said otherwise, in X$^{eme}$ the exponent is translated as a superscript `eme´, not `e´. Example

$1^e$, $3^{eme}$ X$^{eme}$ $4^{i\grave{e}me}$
$1^{st}$ $2^{nd}$ $3^{rd}$  $4^{th}$
$x$ $1$ $\alpha$ $\pm$ $\longleftrightarrow$ $-$
$_{foo}$ $^{2+3}$  $_{\bf Foo}$
$+$ $x^{eme}$ $\log$ $_{F\bf oo}$

Translation (with MathML namespace removed), all hacks enabled:

<p>1<hi rend='sup'>e</hi>, 3<hi rend='sup'>e</hi>
    X<hi rend='sup'>eme</hi> 4<hi rend='sup'>e</hi>
1<hi rend='sup'>st</hi> 2<hi rend='sup'>nd</hi> 3<hi rend='sup'>rd</hi>
    4<hi rend='sup'>th</hi>
<formula type='inline'><simplemath>x</simplemath></formula>
   1 &alpha; &pm; &longleftrightarrow; &#x2013;
<hi rend='sub'>foo</hi> <hi rend='sup'>2+3</hi>
   <hi rend='sub'><hi rend='bold'>Foo</hi></hi>
<formula type='inline'><math><mo>+</mo></math></formula>
<formula type='inline'><math><msup><mi>x</mi>
    <mrow><mi>e</mi><mi>m</mi><mi>e</mi></mrow> </msup></math></formula>
<formula type='inline'><math><mo form='prefix'>log</mo></math></formula>
<formula type='inline'><math><msub><mrow></mrow>
     <mrow><mi>F</mi><mi mathvariant='bold'>o</mi>
  <mi mathvariant='bold'>o</mi></mrow> </msub></math>
</formula></p>
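
The decomposition of \notrivialmath into the three flags A, B and C described above can be sketched as follows. This is a plain restatement of the formula A + 2B + 4C modulo 8; the function name decode_notrivialmath is hypothetical.

```python
def decode_notrivialmath(value: int) -> dict:
    """Split the register value into the flags A, B, C (only value mod 8 matters)."""
    v = value % 8
    return {"A": bool(v & 1), "B": bool(v & 2), "C": bool(v & 4)}


for value in (0, 1, 7, 349):
    print(value, decode_notrivialmath(value))
```

With the value 349, the remainder modulo 8 is 5, so that A and C are true and B is false.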

3.6. Conversion to XML

In the case where the value of the counter \@nomathml is negative, then the translation is a <texmath> element containing all tokens of the math list. For instance,

\csname@nomathml\endcsname=-1
$\begin{pmatrix}
\binom 12&\int_0^\infty f(x)dx\\[2cm]
\mathfrak{W}_2&\text{xyz}=\sqrt{xxyyzz}
\end{pmatrix}$

translates as

<p><texmath type='inline'>\begin{pmatrix}
\genfrac(){0.0pt}{}{1}{2}&amp;\int _0^\infty f(x)dx\\[2cm]
\@mathfrak W\@mathit _2&amp;\text{xyz}=\sqrt{xxyyzz}
\end{pmatrix}</texmath></p>

In all other cases we use a highly recursive procedure that converts a math list into a formula. The procedure takes as argument the current style. This is one of D, T, S, or SS (display, text, script, or scriptscript style). It is D for a display math formula, T for a normal formula.

Consider first the case where the formula has an \over, or a variant, not hidden inside braces. This example has 6 subexpressions, each of which has such an operator.

${a\over b}{a\above2mm b}{a\atop b}
{a\overwithdelims[] b}{a\abovewithdelims[]2mm b}{a\atopwithdelims[] b}$

The translation is

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
   <mfrac><mi>a</mi> <mi>b</mi></mfrac>
   <mfrac linethickness='2mm'><mi>a</mi> <mi>b</mi></mfrac>
   <mfrac linethickness='0.0pt'><mi>a</mi> <mi>b</mi></mfrac>
   <mfenced open='[' close=']'>
       <mfrac><mi>a</mi> <mi>b</mi></mfrac></mfenced>
   <mfenced open='[' close=']'>
       <mfrac linethickness='2mm'><mi>a</mi><mi>b</mi></mfrac></mfenced>
   <mfenced open='[' close=']'>
       <mfrac linethickness='0.0pt'><mi>a</mi> <mi>b</mi></mfrac></mfenced>
  </mrow>
 </math>
</formula>

It is an error if the formula has more than one such operator. Otherwise, we have two parts: what precedes the operator and what follows the operator. As the example shows, some operators need delimiters. Other operators read a dimension. This dimension must be given explicitly as a sequence of digits and a unit of measure (we could do better; if you want \parindent instead of 2mm, you should use \genfrac instead). After splitting the formula into two parts, the same idea as for \genfrac is used. If the current style is C, the style that follows C in the list is used for both parts of the formula (if the style is D or T, the next style is S, otherwise it is SS). Note that \choose is like \over; you should use \binom instead.
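
The rule that gives the style of the numerator and denominator can be written in one line. The Python function below is a sketch of that rule only; the name next_style is hypothetical.

```python
def next_style(style: str) -> str:
    """Style used for both parts of a fraction, given the current style."""
    # D and T are followed by S; S and SS are followed by SS.
    return "S" if style in ("D", "T") else "SS"


print([next_style(s) for s in ("D", "T", "S", "SS")])
```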

We assume from now on that the formula contains no more operators like \over. This means that the current style can be used for the current object. Items are handled as follows:

A space is ignored.

If the current token is \text, \hbox, \mbox, this is a command with an argument, that is interpreted using special rules. A sequence of characters produces an <mtext> element, a space produces an <mspace> element, and math formulas are allowed. Errors may be signaled if the content of the argument is too complicated. The translation of

$ x=0 \text{provided that $y=0$ or } a=1$

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
   <mi>x</mi> <mo>=</mo>  <mn>0</mn>
   <mrow>
     <mtext>provided</mtext>
     <mspace width='0.5em'/>
     <mtext>that</mtext>
     <mspace width='0.5em'/>
   <mrow> <mi>y</mi><mo>=</mo><mn>0</mn> </mrow>
   <mspace width='0.5em'/>
   <mtext>or</mtext><mspace width='0.5em'/></mrow>
   <mi>a</mi><mo>=</mo><mn>1</mn>
  </mrow>
 </math>
</formula>

In the case of $\mathop{\rm sin}$ , the translation is <mo form='prefix'>sin</mo>. Any sequence of characters is allowed instead of `sin´. Instead of \rm, any font change command that switches to `rm´ can be used.
In the case of \hspace, an argument is read, converted to a dimension (in fact, a glue is read via the scanglue routine, the shrink and stretch parts of the glue are discarded), and the result is a <mspace> element. For instance \hspace{2cm plus 3pt} produces <mspace width='56.9055pt'/>.
In the case of \displaystyle, \textstyle, \scriptstyle, \scriptscriptstyle, the current style is changed, to D, T, S and SS respectively.
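
The \hspace conversion described above (keep the natural width, drop the stretch and shrink parts, express the result in pt) can be sketched in Python. This is an illustration and not the scanglue routine: only the units pt, in, cm and mm are handled, the output is always printed with four decimals, and the name glue_to_mspace is hypothetical.

```python
import re

# Points per unit, using the TeX definitions 1in = 72.27pt and 1in = 2.54cm.
PT_PER_UNIT = {"pt": 1.0, "in": 72.27, "cm": 72.27 / 2.54, "mm": 72.27 / 25.4}


def glue_to_mspace(spec: str) -> str:
    """Translate a glue specification into an <mspace> element."""
    # Only the leading dimension is read; 'plus ...' and 'minus ...' are ignored.
    m = re.match(r"\s*([0-9]*\.?[0-9]+)\s*(pt|in|cm|mm)", spec)
    if m is None:
        raise ValueError("cannot read a dimension in: " + spec)
    width = float(m.group(1)) * PT_PER_UNIT[m.group(2)]
    return "<mspace width='%.4fpt'/>" % width


print(glue_to_mspace("2cm plus 3pt"))
```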

In the case of \nonscript, the token is discarded if the style is D or T, kept otherwise. We shall see later that space disappears after such a token, if it is not discarded. Example.

$\def\foo{\nonscript~} \foo x^{y\foo}_{\textstyle z\foo}$

The translation is

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
   <mspace width='3.33333pt'/>
   <msubsup>
    <mi>x</mi>
    <mstyle scriptlevel='0' displaystyle='false'>
       <mrow><mi>z</mi><mspace width='3.33333pt'/></mrow>
    </mstyle>
    <mi>y</mi>
   </msubsup>
  </mrow>
 </math>
</formula>

If the token is a character of category code 7 or 8, it is left unchanged (typically, case of ^ and _).
If the token is \limits, \nolimits, \mathord, \mathop, \mathbin, \mathrel, \mathopen, \mathclose, \mathpunct, \mathinner, \ensuremath, \nonumber, \nolinebreak, it is ignored. However, we remember that the next object should be Rel or Bin (if \mathrel or \mathbin has been seen).
If the token is \big, \bigl, \bigm, \bigr, \bigg, \biggl, \biggm, \biggr, \Big, \Bigl, \Bigm, \Bigr, \Bigg, \Biggl, \Biggm, \Biggr, we remember that a big object is wanted, with subtype left, right, middle, or other.
The current token or group is translated, according to the rules given below. After that, flags may be added (if the object is declared Bin or Rel or big). The current style is changed to the next style in the case we are in a group, and the group is preceded by ^ or _.
If the current token is a character, its translation will be a <mi>, <mn> or <mo> element. In the case of a letter, this may depend on the font attribute associated to the character. For instance {\bf x=1} gives <mi mathvariant='bold'>x</mi> and <mo>=</mo> and <mn>1</mn>.
If the current token is \left, \right, or already translated, it is left unchanged.
If the token is a constant, like \alpha, see the big list at the start of the chapter, its XML value is inserted.
If the token is a list, like {...}, it will be translated, using a copy of the current style.
If the token is a list of the form \left ... \right, it will be translated. After that, fences will be added (using what follows the \left and \right).
If the token is a list of the form \begin{xxx}...\end{xxx}, we assume that this is an array, or a matrix, we already explained how it can be translated.

If the token is \mathchoice, with its four arguments, one of them is selected according to the current style. For instance

\def\foo{\mathchoice{1}{2}{3}{4}}
$\foo{\displaystyle \foo}^{\foo^{\foo}}$

translates to

  <mrow>
   <mn>2</mn>
   <msup>
    <mstyle scriptlevel='0' displaystyle='true'><mn>1</mn></mstyle>
    <msup>
     <mn>3</mn>
     <mn>4</mn>
    </msup>
   </msup>
  </mrow>
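
The selection made by \mathchoice can be sketched as an index into the list of styles. The Python code below only illustrates the example above; the names STYLES and mathchoice are hypothetical.

```python
STYLES = ("D", "T", "S", "SS")


def mathchoice(style: str, display, text, script, scriptscript):
    """Return the argument of \\mathchoice that matches the current style."""
    return (display, text, script, scriptscript)[STYLES.index(style)]


# In the example: the base is in style T, the \displaystyle group in D,
# the exponent in S, and the exponent of the exponent in SS.
print([mathchoice(s, "1", "2", "3", "4") for s in ("T", "D", "S", "SS")])
```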

It is an error if the current token is not a command of the form \acute, etc, or \overline, those listed at the end of the section `basic objects´. As a general rule, for instance for \frac, arguments are translated using the next style (i.e., smaller), unless the style is indicated (for \dfrac and \tfrac, the arguments are translated in style T and S respectively; the style may be indicated for \genfrac). If \foo is as above, the translation of

$\tfrac{\foo}{\foo} = \dfrac{\foo}{\foo} = \frac{\foo}{\foo}
{\displaystyle \frac{\foo}{\foo}}$

  <mrow>
   <mstyle scriptlevel='0' displaystyle='false'>
    <mfrac><mn>3</mn> <mn>3</mn></mfrac>
   </mstyle>
   <mo>=</mo>
   <mstyle scriptlevel='0' displaystyle='true'>
     <mfrac><mn>2</mn> <mn>2</mn></mfrac>
   </mstyle>
   <mo>=</mo>
   <mfrac><mn>3</mn> <mn>3</mn></mfrac>
   <mstyle scriptlevel='0' displaystyle='true'>
     <mfrac><mn>2</mn> <mn>2</mn></mfrac>
   </mstyle>
  </mrow>

The translation of

\def\xbar#1{\genfrac{}{}{}{#1}{\foo}{\foo}}
$\xbar{0}\xbar{1}\xbar{2}\xbar{3}\xbar{}$

is the following. You can notice that, if the argument of \xbar is 2 or 3, this does not change the translation of the fraction. In TeX we get two formulas that have the same size but are not vertically aligned (why?).

   <mstyle scriptlevel='0' displaystyle='true'>
    <mfrac><mn>2</mn> <mn>2</mn></mfrac>
   </mstyle>
   <mstyle scriptlevel='0' displaystyle='false'>
    <mfrac><mn>3</mn> <mn>3</mn></mfrac>
   </mstyle>
   <mstyle scriptlevel='1' displaystyle='false'>
    <mfrac><mn>4</mn> <mn>4</mn></mfrac>
   </mstyle>
   <mstyle scriptlevel='2' displaystyle='false'>
    <mfrac><mn>4</mn> <mn>4</mn></mfrac>
   </mstyle>
   <mfrac><mn>3</mn> <mn>3</mn></mfrac>
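
The correspondence between the fourth argument of \genfrac and the two MathML attributes, as it appears in the translation above, can be sketched in Python. The table is read off this example only; the names GENFRAC_STYLE and genfrac_mstyle are hypothetical.

```python
# Fourth argument of \genfrac -> (scriptlevel, displaystyle), as in the example.
GENFRAC_STYLE = {0: (0, True), 1: (0, False), 2: (1, False), 3: (2, False)}


def genfrac_mstyle(code, fraction: str) -> str:
    """Wrap a translated fraction in <mstyle>; an empty code adds nothing."""
    if code is None:
        return fraction
    level, display = GENFRAC_STYLE[code]
    return "<mstyle scriptlevel='%d' displaystyle='%s'>%s</mstyle>" % (
        level, "true" if display else "false", fraction)


print(genfrac_mstyle(2, "<mfrac><mn>4</mn> <mn>4</mn></mfrac>"))
```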

As a final example, the translation of

$\overline{x}\grave{y} \underbrace{z}\stackrel{a}{b}\overset{a}{b} $

<mover accent='true'><mi>x</mi> <mo>&OverBar;</mo></mover>
<mover accent='true'><mi>y</mi> <mo>&grave;</mo></mover>
<munder accentunder='true'><mi>z</mi> <mo>&UnderBrace;</mo>
</munder><mover><mi>b</mi> <mi>a</mi></mover>
<mover><mi>b</mi> <mi>a</mi></mover>

3.7. Final math mode hacks

Before we forget it: when the formula is completely translated, we have a list of XML elements. If the list is empty, the result is <mrow/>. For instance, in the case of x^{}, the exponent is empty. If the list has a single XML token, this will be the result. Otherwise, everything is put in a <mrow>. If the current formula, or subformula, contains a style change, it is put in a <mstyle> element. This is not always a good solution, because the same style is used for everything, what precedes and what follows the style command. If you look at the \genfrac example above, you can see that styles are added by the \genfrac interpreter (the single TeX switch is associated with two MathML attributes).
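
The three cases for the list of translated elements (empty, single element, several elements) can be sketched as follows. Elements are represented as strings for the sake of the illustration; the function name wrap is hypothetical.

```python
def wrap(elements: list) -> str:
    """Combine the translated elements of a formula or subformula."""
    if not elements:
        return "<mrow/>"        # empty list, as for the exponent of x^{}
    if len(elements) == 1:
        return elements[0]      # a single element is the result
    return "<mrow>" + "".join(elements) + "</mrow>"


print(wrap([]))
print(wrap(["<mi>x</mi>"]))
print(wrap(["<mi>x</mi>", "<mo>=</mo>", "<mn>1</mn>"]))
```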

If we have a formula, of the form $_x^{2}_{abc}$ , the translation rules explained so far tell us that we have: an underscore character, an XML element for x, a hat character, an XML element for {2}, an underscore, and an XML element for {abc}. We may have \nonscript tokens; they will be removed, as well as a space that follows. We have to evaluate the commands that control subscripts and superscripts. A hat character gives <msup>, an underscore character gives <msub>, and both give <msubsup>. It is possible for a formula to start with an underscore or a hat: in this case, the kernel is empty. It is not possible for a formula to end with hat or underscore. A kernel can have at most one subscript and at most one superscript; hence the formula above is wrong: the letter x is the first subscript to the empty kernel. A valid formula is for instance $_yx^2$ . It translates as

<mrow>
  <msub><mrow></mrow> <mi>y</mi> </msub>
  <msup><mi>x</mi> <mn>2</mn> </msup>
</mrow>
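
The handling of hat and underscore characters described above can be sketched in Python. This is a simplified model and not the Tralics code: the input is a list of strings where '^' and '_' are the script markers and every other string is an already translated element; \nonscript and spaces are assumed to be removed; the name attach_scripts is hypothetical.

```python
def attach_scripts(items: list) -> list:
    """Group each kernel with its optional subscript and superscript."""
    markers = ("^", "_")
    out, i = [], 0
    while i < len(items):
        if items[i] in markers:
            kernel = "<mrow></mrow>"   # a script without kernel: empty kernel
        else:
            kernel, i = items[i], i + 1
        sub = sup = None
        while i < len(items) and items[i] in markers:
            if i + 1 >= len(items) or items[i + 1] in markers:
                raise ValueError("missing argument after " + items[i])
            if items[i] == "^":
                if sup is not None:
                    raise ValueError("double superscript")
                sup = items[i + 1]
            else:
                if sub is not None:
                    raise ValueError("double subscript")
                sub = items[i + 1]
            i += 2
        if sub is not None and sup is not None:
            out.append("<msubsup>%s%s%s</msubsup>" % (kernel, sub, sup))
        elif sub is not None:
            out.append("<msub>%s%s</msub>" % (kernel, sub))
        elif sup is not None:
            out.append("<msup>%s%s</msup>" % (kernel, sup))
        else:
            out.append(kernel)
    return out


# The valid formula $_yx^2$ gives two elements.
print(attach_scripts(["_", "<mi>y</mi>", "<mi>x</mi>", "^", "<mn>2</mn>"]))
```

With the input that corresponds to $_x^{2}_{abc}$, the second underscore is rejected as a double subscript, since the empty kernel already has x as subscript.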

We have mentioned above that some operators can be flagged as left or right, and that adding \bigr may convert a left operator into a right operator. There is some magic that converts, in some cases, the \big operator into fences. For instance

$\bigl [ A\big ( x^2 \big) B \bigr[  $

translates as

<mfenced open='[' close='['>
  <mi>A</mi>
  <mfenced open='(' close=')'><msup><mi>x</mi> <mn>2</mn> </msup></mfenced>
  <mi>B</mi>
</mfenced>

There is another trick, that works in some cases. Consider:

$\int_0^\infty f(x) dx = \big[  U \big ]$

the translation is

<mrow>
 <msubsup><mo>&int;</mo> <mn>0</mn> <mi>&infin;</mi> </msubsup>
 <mrow>
   <mi>f</mi><mo>(</mo><mi>x</mi><mo>)</mo><mi>d</mi><mi>x</mi>
 </mrow>
 <mo>=</mo>
 <mfenced open='[' close=']'><mi>U</mi></mfenced>
</mrow>

The interesting point here is the placement of the inner <mrow>. The idea is that the parentheses should remain small (not larger than the <mrow>). In particular, they should not be influenced by the integral that precedes and the fence that follows. In some cases, it works.

3.8. Extensions

In Tralics, you can use the following three commands \mathmo, \mathmi, and \mathmn. They take an argument and produce a <mo>, <mi>, or <mn>. There is a file tralics-iso.sty that contains

\def\makecmd#1{\expandafter\newcommand\csname math#1\endcsname}
\def\makemo#1#2{\makecmd{#2}{\mathmo{\amp\##1;}}}
\def\makemi#1#2{\makecmd{#2}{\mathmi{\amp\##1;}}}
\def\makemn#1#2{\makecmd{#2}{\mathmn{\amp\##1;}}}

Then you can say \makemo{x02190}{slarr}, and this will define a command \mathslarr, whose translation (in math mode only) is <mo>←</mo>. The file provides nearly 2000 such definitions, taken from the MathML entity files, with the MathML names. These commands can be used instead of TeX commands like \mathchar: remember that a math-char is a 15bit integer, where 8 bits are used for the position in a font table, 3 bits for the type, and 4 bits for the family. Only three types are defined for Tralics, but the content of the element is arbitrary (most math symbols are between U+2100 and U+27FF, there are also letters between U+1D400 and U+1D7FF). There is a command \mathattribute that adds an attribute pair to the last created math element. You can say for instance

\providecommand\operatorname[1]{%
  \mathmo{#1}%
  \mathattribute{form}{prefix}%
  \mathattribute{movablelimits}{true}%
}

After that,

$\min _xf(x) >\operatorname{min} _xf(x)$

translates as

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
   <msub><mo movablelimits='true' form='prefix'>min</mo> <mi>x</mi> </msub>
   <mrow>
    <mi>f</mi> <mo>(</mo> <mi>x</mi> <mo>)</mo> <mo>&gt;</mo>
   </mrow>
   <msub><mo movablelimits='true' form='prefix'>min</mo> <mi>x</mi> </msub>
   <mrow>
    <mi>f</mi> <mo>(</mo> <mi>x</mi> <mo>)</mo>
   </mrow>
  </mrow>
 </math>
</formula>

The command \DeclareMathOperator takes two arguments (say `foo´ and `bar´), with an optional star before the first argument. It defines \foo to be the command \operatorname applied to `bar´ (with a star when required). The command \operatorname is as shown above (the movablelimits attribute is only added if the command is followed by a star).

You can use the command \mathchardef. This is like \chardef, it reads a command and a number. The number should fit in 15 bits. Otherwise, you will see an error of the form: Bad mathchar replaced by 0: 1234567. The \mathchardef command reads a command, say \foo, and an integer N; there is no difference between \foo and \mathcharN, except that \the\foo returns the integer N, and that \foo is faster to parse. Some constants, like \@cclvi=256, are defined in this way by the TeX kernel and should not be used as math characters. Some commands, like \eta=111₁₆, are meant to be used as a math character. In Tralics, until version 2.8, an error was signaled. In version 2.9, the translation, in math mode, is a <mi> element containing this character; you might say \mathchardef\eta"3B7. Outside math mode, this gives an error that takes the form Undefined command \eta; command code = 264, instead of Math only command \theta. Missing dollar not inserted; inside math mode, the behavior is the same as the standard one.

TeX has a special register called \fam. If you say something like

\fam3 ${\fam9 \the\fam}\ \the\fam$

then the second \the expands to minus one. The first gives 9, but LaTeX complains with: \textfont 9 is undefined (character 9). In Tralics, you would see

<mrow><mn>9</mn><mspace width='6pt'/><mn>3</mn></mrow>

As the example shows, the family is unused, and not correctly restored. Each character has a \mathcode. The following

\mathcode`\a="0941 $a\the \mathcode`\a$

is interpreted by Tralics as $a2369$ . However TeX complains, with \textfont 9 is undefined (character A), because you ask the lower case letter a to be printed like the upper case letter A with textfont 9. A mathcode is a 15bit integer, with an exception: a character whose mathcode is 32768 behaves like an active character, the action associated to it must be defined somehow, for instance like this:

{\catcode`\'=\active \global\let'\active@math@prime}
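
The structure of a 15-bit math character (3 bits for the type, 4 bits for the family, 8 bits for the position) explains the numbers that appear above. The Python sketch below decodes such an integer; it illustrates the TeX encoding only, and the name split_mathchar is hypothetical.

```python
def split_mathchar(n: int) -> tuple:
    """Split a 15-bit math character into (type, family, position)."""
    if not 0 <= n < 0x8000:
        # Same condition as the one that triggers `Bad mathchar' in the text.
        raise ValueError("Bad mathchar: %d" % n)
    return (n >> 12) & 0x7, (n >> 8) & 0xF, n & 0xFF


# \mathcode`\a="0941: type 0, family 9, position "41 (the letter A).
print(0x0941, split_mathchar(0x0941), chr(0x0941 & 0xFF))
```

The value 32768 is the first integer that does not fit in 15 bits, which is why it can serve as the special mathcode of an active character.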

There is a command \delimiter, it reads a number, but you cannot use it. There is a command \radical, it reads a number, then signals an error. The \mathaccent command is similar.

There are commands \raise and \lower, as well as \vcenter. The last one is not implemented in Tralics. The translation of

a\raise2cm\xbox{foo}{bar}\lower 2pt\xbox{xfoo}{xbar}

<p>a<foo>bar</foo><xfoo>xbar</xfoo></p>

As you can see, the specification disappears. Maybe in a future version, we will add an attribute to the box. You cannot use these commands in math mode in Tralics. In TeX, you can get an error of the form: You can't use `\raise´ in vertical mode, while \vcenter is a math only command. Currently \indent and \noindent are ignored in math mode (in TeX $\indent_b$ produces a kernel and an index; the kernel is an empty box of width \parindent, of type Ord).
