Tralics, a LaTeX to XML translator; Part I

Mathematics plays a major role in TeX and *Tralics*. For instance, TeX has
three modes: vertical mode, in which no typesetting is done, horizontal mode
(where everything happens) and math mode, a mode in which special objects are
handled; a two-phase process converts these special objects into normal ones.
Fonts to be used in math mode have special properties (see appendices F and G
of the TeXbook). Not all subtleties of TeX math can be implemented in
*Tralics*; on the other hand, the XML translation conforms to MathML.
MathML defines some entities; for instance, in isoamsc.ent, the entity
`&rceil;` is defined as `&#x02309;`. As a consequence,
*Tralics* will translate `\rceil` to `<mo>&rceil;</mo>` or
`<mo>&#x2309;</mo>`, depending on an option. Translation of a
footnote is in general a

The syntax of mathematics is often strange. Instead of

\math{E=\fraction{1}{2} m\superscript{v}{2}}

you say

$E={1\over 2} mv^2$

Three category codes are defined for use in math mode; they correspond to
the dollar sign (math shift), underscore character (subscript) and hat
character (superscript).
If you want a dollar or underscore character, you can say `\$`,
or `\_`, but `\^` produces an accent over what follows, not a
hat character (In LaTeX, you can say `\textasciicircum`, provided that you
can guess the name).
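
As a small sketch of the escapes just described (the sentence content is invented for illustration), the following document prints a dollar sign, an underscore, an accented letter and a literal hat character in text mode:

```latex
\documentclass{article}
\begin{document}
% \$ and \_ print the characters themselves.
Price: \$5, identifier: foo\_bar.

% \^ is an accent command: it puts a circumflex over the next letter.
Accent: \^o.

% A free-standing hat character needs \textasciicircum.
Hat: \textasciicircum.
\end{document}
```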

In the example above, we have two pseudo commands `\fraction` and
`\superscript` (followed by two arguments) whereas the plain
TeX version uses infix operators (placed between the arguments). The first
operator is greedy. This means that, without the braces in the example above,
everything before `\over` would be the numerator, and everything after it
would be the denominator. On the other hand, you sometimes see
`2^16` (typeset as $2^{1}6$) instead of `2^{16}`,
when people forget braces around the superscript. The essential
difference however is that arguments are typeset in different styles: the
nucleus (what precedes the hat operator) is typeset in text style, while
numerator, denominator, superscripts and subscripts are in script style;
moreover, if two objects are placed one above the other, cramped style is
used for the object that is below the other one (i.e., the denominator or a
subscript). The style influences spacing; because of commands like `\over`,
the current style is known only after the whole expression is parsed. This
explains why you may see:
*Package amsmath Warning: Foreign command \over;
\frac or \genfrac should be used instead*.
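
The greedy behavior of `\over` can be checked with a minimal document. This is only a sketch; it deliberately does not load amsmath, so the warning quoted above is not triggered.

```latex
\documentclass{article}
\begin{document}
% Braces limit the scope of \over: only 1 and 2 form the fraction.
$E={1\over 2} mv^2$

% Without braces, \over takes the whole formula:
% the numerator is "E=1", the denominator is "2 mv^2".
$E=1\over 2 mv^2$

% The prefix form with two explicit arguments.
$E=\frac{1}{2} mv^2$
\end{document}
```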

TeX has also a notion of “inner” mode. Inside an inner object, you cannot
put an outer one. Such a distinction exists also in HTML, where `<div>` is
outer and `<span>` is inner. We explained in the previous chapter that
`\ifinner` can be used to check whether current mode is inner or outer, and
we mentioned that, outside math mode, this is not well defined in
*Tralics*. This may produce surprising results. Consider for instance
`\hbox{$$}`. Inner mode is the rule inside a box, and a double
dollar sign signals the start of an outer (display math) formula. You would
expect this expression to provoke an error. In fact, TeX assumes that you
know what you do, enters inner math mode when it sees the first dollar
sign, and quits when it sees the second one; this gives an empty math formula
(in fact, it will contain all tokens from the `\everymath` hook),
surrounded by some space: the value of `\mathsurround` (this can be set to
zero using `\m@th`). Note that a math formula defines a group: assignments
made inside the formula are forgotten after full evaluation (in particular
after this space is added).
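
The effect of `\mathsurround` on an empty formula can be observed in TeX with a sketch like the following (the value 10pt is an arbitrary choice made for visibility):

```latex
\documentclass{article}
\begin{document}
% Inside \hbox, the two dollar signs form an empty inline formula.
% The formula still contributes \mathsurround on each side,
% so A and B are pushed apart.
\mathsurround=10pt
A\hbox{$$}B

% With \mathsurround set to zero, the empty formula takes no space.
\mathsurround=0pt
A\hbox{$$}B
\end{document}
```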

The essential difference between inner (normal, inline) math and outer
(display) math is that a display formula uses a line of its own (very often
the formula is centered on the line). One could say that a display formula
terminates the current paragraph. In fact, it is just interrupted, the
paragraph continues after the formula (this is only interesting in
constructions like `\parshape`, whose scope is the current paragraph; here
a formula counts for three lines; not implemented in *Tralics*).
The construction `\hbox{$$ x$$}` produces a display math
formula in *Tralics*, instead of two empty math formulas. Before version
2.11.7, an error was signaled (because *Tralics* started a new paragraph at
the end of the equation, and this is illegal in a box).

A display math formula can have an equation number
(via commands `\eqno`, `\leqno`, `\tag`, `\notag`;
these commands were not implemented in early versions, and are described in
the last chapter of the second part of this report).
The MathML documentation says “One of the important uses of
`<mlabeledtr>` is for numbered equations. In a `<mlabeledtr>`,
the label represents the equation number and the elements in the row are the
equation being numbered.
The side and minlabelspacing attributes of `<mtable>`
determine the placement of the equation number.” Thus, the recommended way,
for MathML, is to use a table, like this (replace ellipsis by an expression)

<mtable>
 <mlabeledtr id='e-is-m-c-square'>
  <mtd> <mtext> (2.1) </mtext> </mtd>
  <mtd> ... </mtd>
 </mlabeledtr>
</mtable>

This mechanism is not yet implemented. We do not know how to insert numbers
automatically, so the proposed solution is the following: you can use `\label`,
`\ref` for any display math formula. This will add an id attribute
to the `<formula>` object, which is a wrapper for the `<math>`.
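
A minimal input using this solution could look as follows. This is a sketch: the label name is borrowed from the MathML sample above and the surrounding text is invented. Note that in LaTeX itself an unnumbered display has no equation number of its own, so what `\ref` prints there differs from what a numbered equation would give.

```latex
\documentclass{article}
\begin{document}
% The label is placed inside the display formula; in Tralics this is
% what adds an id attribute to the enclosing <formula> element.
\[ E = mc^2 \label{e-is-m-c-square} \]

% The reference points at that formula.
Formula~\ref{e-is-m-c-square} is well known.
\end{document}
```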

When you say `{\alpha^2}`, TeX will enter math mode with an error of the form *Missing $ inserted*.
On the other hand, *Tralics* will signal two errors, the first is *Math
only command \alpha.
Missing dollar not inserted*, the second is
*Missing dollar not inserted, token ignored: {Character ^ of catcode 7}*. If you want a command that works
in math mode and outside math mode, you can say:

\def\foo{\ifmmode \alpha^2 \else $\alpha^2$\fi}
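
A short document exercising both branches of this definition might look like this (a sketch; the surrounding text is invented):

```latex
\documentclass{article}
\def\foo{\ifmmode \alpha^2 \else $\alpha^2$\fi}
\begin{document}
% Outside math mode: the \else branch adds the dollar signs.
Text mode: \foo.

% Inside math mode: the \ifmmode branch inserts the bare expression.
Math mode: $x = \foo + 1$.
\end{document}
```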

This can be generalised, using the following command

\DeclareRobustCommand{\ensuremath}{%
  \ifmmode
    \expandafter\@firstofone
  \else
    \expandafter\@ensuredmath
  \fi}
\long\def\@ensuredmath#1{$\relax#1$}

The purpose of the `\relax` on the last line is for the case of an empty
argument: we do not want `\ensuremath{}` to expand to
`$$`. Note that the argument is handled only once (i.e.,
`\ensuremath` does not read it, but calls a helper), because of subtle
bugs, see the LaTeX bugs database, amslatex/2104.
We shall say later that “mode independent commands are interpreted as usual”; this
implies that the `\relax` token will do nothing. We shall see later that,
in non-mathml mode, `\relax` appears in the result unless it is the first in
the list. Other commands, not listed in
this chapter, may signal an error. For instance, `\par` is forbidden.
Note that `\mathchar` provokes an
*Unimplemented command* error. If you want a random Unicode character,
you should use commands like `\mathmi`, `\mathmo`, `\mathmn`.
You can also define a command via
`\chardef` or `\mathchardef` (the result is the same), and use it; the
result is always a `<mi>` element. The following example shows that
`\amp` produces an ampersand sign in some cases; it must be used with care.

\chardef\AAA"1000
\chardef\CCC`x
\mathchardef\BBB"2000
$\mathbf{x\AAA\BBB\CCC} \mathmi{foo}\mathmo{\&\#666;}\mathmo{\amp\#777;}$

Translation

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
   <mi mathvariant='bold'>x</mi>
   <mi>&#x1000;</mi>
   <mi>&#x2000;</mi>
   <mi>x</mi>
   <mi>foo</mi>
   <mo>&amp;#666;</mo>
   <mo>&#777;</mo>
  </mrow>
 </math>
</formula>

Because a math expression translates as `<math>` inside a `<formula>`, and
because the `<math>` element has a long namespace attribute, examples will
never fit on a single line.
In order to make the result easier to read, we have inserted some newline
characters, and reindented all these examples.
Two consecutive newline characters are scanned by
TeX as space plus `\par`. This space is ignored by TeX (see TeXbook, the text between exercises 14.12 and 14.13). Hence the general
rule in *Tralics*: when a `<p>` element is ended, a trailing space or
newline is removed from the content of the element,
a newline character is added to the parent of the `<p>`. As a result,
you will very often see `<p>` at the start of a line and `</p>` at the end
of a line in an XML file generated by *Tralics*.

Consider the following simple example:

$\alpha$ and $$\beta \label{foo}$$

The translation is the following

<p><formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mi>α</mi>
 </math>
</formula> and</p>
<formula id='uid1' type='display'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mi>β</mi>
 </math>
</formula>

\(\alpha\) and \[\beta \label{foo}\]

Written with `\(...\)` and `\[...\]` as above, the result is exactly the same. In LaTeX, the commands `\(`, `\)`,
`\[` and `\]` test the current mode. No such test is done by *Tralics*.
The LaTeX implementation of `\[` is a bit strange. If the formula is in
vertical mode, it will be preceded by a box of width `.6\linewidth`
containing nothing (except two `\hss` commands to fill it) preceded by the
current paragraph indentation. The command `\]` executes
`\ignorespaces`. As you can see, there is some difference between a single
dollar and a double dollar. In the first case, we are in normal math mode,
otherwise in display math mode. One difference is the initial style: it is
`\textstyle` (for normal mode) and `\displaystyle` otherwise (this will be explained
later). A second difference is that the token list inserted when
scanning the formula depends on the mode: it is `\everymath` or
`\everydisplay`. The third difference is specific to
*Tralics*. A display math formula is never “trivial” (see section
3.5), it can have a label (not more than one): in this case,
the `<formula>` element has an id attribute. In any case, the
`<formula>` element has a type attribute that explains that the formula is
inline or display. A non-display formula starts a paragraph; a display math
formula cannot appear in a paragraph (the equivalent of `\par` is executed); if the first
non-space token (after expansion) that follows the math formula is not
`\par`, a `\noindent` token will be inserted
(see line 34 of the transcript at page 3.3). Note
that, in TeX, a math formula does not end a paragraph, in the sense that a
`\parshape` is valid across math formulas; however what precedes the
formula is split into lines, according to parameters in force at the start of
the formula. *Tralics* does not split paragraphs into lines, and does not
implement `\parshape`.

The following environments are recognized outside math mode, and
produce a math formula: `eqnarray*`,
`align*`,
`aligned`,
`split`,
`multline`,
`equation*`,
`math` and
`displaymath`. When *Tralics* sees a dollar character, it
looks at the next character (without expansion). If this is a dollar sign, it
will be read, and display math mode is entered, otherwise, normal math mode is
entered. All environments shown above start display math mode (except
`math`, which enters normal math mode).
The environments `math` and `displaymath` are equivalent to `\(...\)` and
`\[...\]` respectively. The environments `eqnarray` and `split` are
implemented as arrays. There is no difference between

\begin{eqnarray} a&b\\ c&d \end{eqnarray}
\begin{split} a&b\\ c&d \end{split}

and

\[\begin{array}{rcl} a&b\\ c&d \end{array}\]
\[\begin{array}{rl} a&b\\ c&d \end{array}\]

Environments `equation` and `align` are translated as normal math.
A star
after the environment name is ignored. In the case of normal math mode, the
content of the token list `\everymath` is inserted before
the formula, for displaymath it is `\everydisplay`.
For instance, if you say

\everymath={(N)\ }
\everydisplay={(D)\ }
$\alpha$ and $$\beta$$

the translation will be

<p><formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
   <mo>(</mo><mi>N</mi><mo>)</mo><mspace width='6pt'/>
   <mi>α</mi>
  </mrow>
 </math>
</formula> and</p>
<formula type='display'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
   <mo>(</mo><mi>D</mi><mo>)</mo><mspace width='6pt'/>
   <mi>β</mi>
  </mrow>
 </math>
</formula>

In TeX, you can put anything inside a math formula, provided it is hidden in
a box; this is not possible in *Tralics*, because we want the XML result to be
conforming to MathML. We shall list here all commands valid in math mode, and
explain later on how they are translated.

Commands `\limits`, `\nolimits` and `\displaylimits`
can be used just after an operator and before subscripts or superscripts, as in
`\int\limits_x`. They are currently ignored by *Tralics*.
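
For reference, here is a sketch of what these three commands do in TeX itself (the integrand and the bounds are arbitrary):

```latex
\documentclass{article}
\begin{document}
% \limits puts the bounds above and below the operator, even inline.
$\int\limits_0^1 f(x)\,dx$

% \nolimits puts them to the side, even in display style.
\[ \sum\nolimits_{i=1}^{n} a_i \]

% \displaylimits gives limits in display style and side scripts otherwise.
\[ \int\displaylimits_0^1 f(x)\,dx \]
\end{document}
```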

The following environments are recognized: `array`,
`matrix`, `pmatrix`, `bmatrix`,
`Bmatrix`, `vmatrix`, `Vmatrix`.
All these environments produce arrays. For the first, an argument is required,
explaining how cells are aligned. For all other environments, cells are
centered. Environments of the form `Xmatrix` have fences, an implicit
`\left` and `\right`. In order: parentheses, braces, brackets, simple
bars, double bars. There is also an environment `cases`, with two
columns, left aligned, that has an open brace as left delimiter, an empty
right delimiter. Example

$\begin{array}{lcr}a&b&c\end{array} \begin{bmatrix}d&e\\f&g\end{bmatrix}$

The translation is the following.

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
   <mtable>
    <mtr>
     <mtd columnalign='left'><mi>a</mi></mtd>
     <mtd><mi>b</mi></mtd>
     <mtd columnalign='right'><mi>c</mi></mtd>
    </mtr>
   </mtable>
   <mfenced open='{' close='}'>
    <mtable>
     <mtr>
      <mtd><mi>d</mi></mtd>
      <mtd><mi>e</mi></mtd>
     </mtr>
     <mtr>
      <mtd><mi>f</mi></mtd>
      <mtd><mi>g</mi></mtd>
     </mtr>
    </mtable>
   </mfenced>
  </mrow>
 </math>
</formula>
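
The `cases` environment and the other fenced variants are not shown in the example above. The following input is a sketch of typical usage, assuming amsmath is loaded when the file is compiled with LaTeX; its translation is not reproduced here.

```latex
\documentclass{article}
\usepackage{amsmath} % provides cases and the matrix environments in LaTeX
\begin{document}
% cases: two left-aligned columns, an open brace on the left, nothing on the right.
$|x| = \begin{cases} x & \text{if } x \ge 0\\ -x & \text{otherwise} \end{cases}$

% pmatrix and vmatrix: centered cells, with parentheses and simple bars as fences.
$\begin{pmatrix} a&b\\ c&d \end{pmatrix}
 \begin{vmatrix} a&b\\ c&d \end{vmatrix}$
\end{document}
```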

The following delimiters are recognized: `<`,
`>`, `.`, `(`,
`)`, `[`, `]`,
`|`,
`\{`, `\}`,
`\langle`, `\rangle`,
`\lbrace`, `\rbrace`,
`\lceil`, `\rceil`,
`\lgroup`, `\rgroup`,
`\lfloor`, `\rfloor`,
`\lmoustache`, `\rmoustache`,
`\vert`, `\Vert`,
`\uparrow`, `\downarrow`,
`\updownarrow`,
`\Uparrow`, `\Downarrow`,
`\Updownarrow`. A delimiter is anything that can follow
`\left` or `\right`. For MathML, this has to be a character. As the
following example shows, we use in most cases a character entity.

$\left\lceil \left\uparrow x\right\}\right.$ $\lceil \uparrow x\}$

The translation is

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mfenced open='⌈' close='.'>
   <mfenced open='↑' close='}'>
    <mi>x</mi>
   </mfenced>
  </mfenced>
 </math>
</formula>
<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow><mo>⌈</mo><mo>↑</mo><mi>x</mi><mo>}</mo></mrow>
 </math>
</formula>

This is the list of commands allowed in math mode, as well as in text mode:
`\dots`, `\ldots`, `\quad`,
`\qquad`, `\␣`, `\$`,
`\%`, `\&`, `\!`, `\,`,
`\{`, `\}`, `\i`,
`\sharp`, `\natural`,
`\flat`,
`\_`. The following commands produce space: `\;`,
`\:`, `\>`. Note that `\!` produces a negative space in
math mode, nothing outside math mode.
Example of use:

\def\alist{\i\j\$\,\_\&\{\}\%\ \^^J\^^I\^^M\!} \def\blist{\quad,\qquad,\dots,\sharp,\natural,\flat} \alist\blist $\alist\blist$

This is the translation, with nobreak space replaced by tilde:

ıj$ _&{}% ~~~,~~~~~~,...,♯,♮,♭ <formula type='inline'> <math xmlns='http://www.w3.org/1998/Math/MathML'> <mrow><mo>ı</mo><mi>j</mi><mi>$</mi><mspace width='0.166667em'/> <mo>~</mo><mo>&</mo><mo>{</mo><mo>}</mo><mo>%</mo> <mspace width='6pt'/><mspace width='6pt'/> <mspace width='6pt'/><mspace width='6pt'/> <mspace width='-0.166667em'/><mspace width='1.em'/><mo>,</mo> <mspace width='2.em'/> <mo>,</mo><mo>⋯</mo><mo>,</mo><mo>♯</mo><mo>,</mo> <mo>♮</mo><mo>,</mo><mo>♭</mo></mrow> </math> </formula>

We give here the list of all symbols that have a translation of the form
`<mi>α</mi>`. They are of type Ord (ordinary symbol).
We start with the lower case Greek letters:
`\alpha`, `\beta`, `\gamma`,
`\delta`, `\epsilon`, `\varepsilon`, `\zeta`,
`\eta`, `\theta`, `\iota`, `\kappa`,
`\lambda`, `\mu`, `\nu`, `\xi`, `\pi`,
`\rho`, `\sigma`, `\tau`, `\upsilon`,
`\phi`, `\chi`, `\psi`, `\omega`,
`\varpi`, `\varrho`, `\varsigma`, `\varphi`,
`\vartheta`, `\varkappa`, then upper case Greek letters:
`\Gamma`, `\Delta`, `\Theta`, `\Lambda`,
`\Xi`, `\Sigma`, `\Upsilon`, `\Phi`,
`\Pi`, `\Psi`, `\Omega`, then other symbols:
`\hbar`, `\ell`, `\wp`, `\Re`, `\Im`,
`\partial`, `\infty`, `\emptyset`, `\nabla`,
`\surd`, `\top`, `\bottom`, `\bot`,
`\angle`, `\triangle`. Example

$\alpha\Gamma \surd$

This translates as

<formula type='inline'> <math xmlns='http://www.w3.org/1998/Math/MathML'> <mrow><mi>α</mi><mi>Γ</mi><mi>√</mi> </mrow></math></formula>

Next comes the list of all symbols whose translation is like log. They are of
type Ord (ordinary symbol), though they should be Op (large operator).
The list is divided into two parts: these have movable limits:
`\det`, `\gcd`, `\inf`, `\injlim`,
`\liminf`, `\limsup`, `\max`, `\min`,
`\sup`, `\projlim`, and these have not:
`\dim`, `\exp`, `\hom`, `\ker`, `\lg`,
`\lim`, `\ln`, `\log`, `\Pr`,
`\arccos`, `\arcsin`, `\arctan`, `\arg`,
`\cos`, `\cosh`, `\cot`, `\coth`,
`\csc`, `\deg`, `\sec`, `\sin`,
`\@mod`, `\sinh`, `\tan`,
`\tanh`. Example

$\displaystyle\lim_a \liminf_a \sin_a \hom_a$

The LaTeX translation is $\underset{a}{lim}\underset{a}{lim\; inf}{sin}_{a}{hom}_{a}$, and
the *Tralics* version is

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mstyle scriptlevel='0' displaystyle='true'>
   <mrow>
    <msub><mo movablelimits='true' form='prefix'>lim</mo> <mi>a</mi> </msub>
    <msub><mo movablelimits='true' form='prefix'>lim inf</mo><mi>a</mi></msub>
    <msub><mo form='prefix'>sin</mo> <mi>a</mi> </msub>
    <msub><mo form='prefix'>hom</mo> <mi>a</mi> </msub>
   </mrow>
  </mstyle>
 </math>
</formula>

From now on, all symbols translate into the form `<mo>...</mo>`. We start with symbols
of type Ord. In reality, most of them should be of type Op (large
operator).
`\mho`, `\clubsuit`, `\diamondsuit`,
`\heartsuit`, `\spadesuit`, `\aleph`,
`\backslash`, `\Box`, `\imath`, `\jmath`,
`\square`, `\cong`, `\lnot`, `\neg`,
`\forall`, `\exists`, `\coprod`, `\bigvee`,
`\bigwedge`, `\biguplus`, `\bigcap`,
`\bigcup`, `\int`, `\sum`, `\prod`,
`\bigotimes`, `\bigoplus`, `\bigodot`,
`\oint`, `\bigsqcup`, `\smallint`. Examples

$\bigcap \int\oint$

The translation is

<mrow><mo>⋂</mo><mo>∫</mo><mo>∮</mo></mrow>

These are of type Bin (binary operator).
`\triangleleft`, `\triangleright`, `\bigtriangleup`,
`\bigtriangledown`, `\wedge`, `\land`, `\vee`,
`\lor`, `\cap`, `\cup`, `\multimap`,
`\dagger`, `\ddagger`, `\sqcap`, `\sqcup`,
`\amalg`, `\diamond`, `\Diamond`, `\bullet`,
`\wr`, `\div`, `\odot`, `\oslash`,
`\otimes`, `\ominus`, `\oplus`, `\uplus`,
`\mp`, `\pm`, `\circ`, `\bigcirc`,
`\setminus`, `\cdot`, `\ast`, `\times`,
`\star`, `\in`. Example

$\cap \cup \wr$

The translation is

<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'> <mrow><mo>∩</mo><mo>∪</mo><mo>≀</mo></mrow></math></formula>

These are of type Rel (relation).
`\propto`, `\sqsubseteq`, `\sqsupseteq`,
`\sqsubset`, `\sqsupset`, `\parallel`, `\mid`,
`\dashv`, `\vdash`, `\Vdash`, `\models`,
`\nearrow`, `\searrow`, `\nwarrow`,
`\swarrow`, `\Leftrightarrow`, `\Leftarrow`,
`\Rightarrow`, `\ne`, `\neq`, `\le`,
`\leq`, `\ge`, `\geq`, `\succ`,
`\approx`, `\succeq`, `\preceq`, `\prec`,
`\doteq`, `\supset`, `\subset`, `\supseteq`,
`\subseteq`, `\bindnasrepma`, `\ni`,
`\gg`, `\ll`, `\gtrless`, `\geqslant`,
`\leqslant`, `\not`, `\notin`,
`\leftrightarrow`, `\leftarrow`, `\owns`, `\gets`,
`\rightarrow`, `\to`, `\mapsto`, `\sim`,
`\simeq`, `\perp`, `\equiv`, `\asymp`,
`\smile`, `\iff`, `\leftharpoonup`,
`\leftharpoondown`, `\rightharpoonup`,
`\rightharpoondown`, `\hookrightarrow`,
`\hookleftarrow`,
`\Longrightarrow`, `\longrightarrow`,
`\longleftarrow`,
`\Join`,
`\longmapsto`,
`\frown`, `\bowtie`,
`\Longleftarrow`, `\Longleftrightarrow`.
Example.

$\approx\leftrightarrow\Longleftrightarrow$

Translation:

<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'> <mrow><mo>≈</mo><mo>↔</mo> <mo>⟺</mo></mrow></math></formula>

These are of type Inner: `\cdots`, `\hdots`, `\vdots`,
`\ddots`. These are of type Between (they are of type Ord in TeX, but
are used as opening or closing delimiters): `\Vert`,
`\|`, `\vert`, `\uparrow`, `\downarrow`,
`\Uparrow`, `\Downarrow`, `\Updownarrow`,
`\updownarrow`. These are of type Open and Close:
`\rangle`, `\langle`, `\rmoustache`,
`\lmoustache`, `\rgroup`, `\lgroup`,
`\rbrace`, `\lbrace`, `\lceil`, `\rceil`,
`\lfloor`, `\rfloor`.

The following characters are classified as “small”:
`<>,.:;*?!x`; these are
classified as “small-l” and “small-r”: `()[]`; the vertical bar is small-l;
these are bin: `+/`; and the equals sign is of type Rel. Note: what you see
here as x is in reality the character 215. It cannot be printed in verbatim
mode by LaTeX.

$<>,.:;*?!x ()[]|+-/=$

Translation:

<formula type='inline'>
 <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow><mo>&lt;</mo><mo>&gt;</mo><mo>,</mo><mo>.</mo><mo>:</mo>
   <mo>;</mo><mo>*</mo><mo>?</mo><mo>!</mo><mi>×</mi><mo>(</mo>
   <mo>)</mo><mo>[</mo><mo>]</mo><mo>|</mo><mo>+</mo><mo>-</mo>
   <mo>/</mo><mo>=</mo>
  </mrow>
 </math>
</formula>

The following commands are used for accents: `\acute`,
`\grave`, `\mathring`, `\ddddot`, `\dddot`,
`\ddot`, `\tilde`, `\widetilde`, `\bar`,
`\breve`, `\check`, `\hat`, `\widehat`,
`\vec`, `\overrightarrow`, `\overleftarrow`,
`\underrightarrow`, `\underleftarrow`, `\dot`.
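
As an illustration of the syntax, every accent command listed above takes one argument. The sketch below assumes amsmath is loaded when compiling with LaTeX; the letters are arbitrary.

```latex
\documentclass{article}
\usepackage{amsmath} % provides \dddot, \ddddot, \underrightarrow, \underleftarrow
\begin{document}
% Accents that sit above a single symbol.
$\acute{a}\ \grave{a}\ \mathring{a}\ \dot{a}\ \ddot{a}\ \dddot{a}\ \ddddot{a}$

% Accents with a wide variant that stretches over several symbols.
$\tilde{a}\ \widetilde{abc}\ \bar{a}\ \breve{a}\ \check{a}\ \hat{a}\ \widehat{abc}\ \vec{a}$

% Arrows placed over or under an expression.
$\overrightarrow{AB}\ \overleftarrow{AB}\ \underrightarrow{AB}\ \underleftarrow{AB}$
\end{document}
```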

The following commands are special. They will be explained later:
`\overline`, `\underline`, `\stackrel`,
`\underset`, `\overset`, `\mathchoice`,
`\frac`, `\overbrace`, `\underbrace`,
`\genfrac`,
`\dfrac`, `\tfrac`, `\sqrt`, `\root`.
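
Before their translation is explained, here is a sketch of the LaTeX syntax of these special commands. It assumes amsmath is loaded for `\overset`, `\underset`, `\dfrac`, `\tfrac` and `\genfrac`; the operands are arbitrary.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Lines and braces over or under an expression.
$\overline{x+y}\quad \underline{x+y}\quad
 \overbrace{a+\cdots+a}^{n}\quad \underbrace{a+\cdots+a}_{n}$

% One object stacked over or under another.
$a \stackrel{\mathrm{def}}{=} b\quad \overset{*}{X}\quad \underset{*}{X}$

% Fractions: default, display-style, text-style, and the general form.
$\frac{1}{2}\quad \dfrac{1}{2}\quad \tfrac{1}{2}\quad \genfrac{[}{]}{0pt}{}{n}{k}$

% Roots: \sqrt has an optional argument, \root uses the delimiter \of.
$\sqrt{x}\quad \sqrt[3]{x}\quad \root 3 \of {x}$

% \mathchoice takes one argument per style: display, text, script, scriptscript.
$\mathchoice{D}{T}{S}{SS}\quad x^{\mathchoice{D}{T}{S}{SS}}$
\end{document}
```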

Scanning a math formula is a non-trivial operation; for this reason, in verbose mode, the math expression is printed in the transcript file. For instance, given

\tracingall
$\begin{cases} x &y\\a&b \end{cases} \mkern18mu x^{ {2 }}!$

whose translation in no-mathml mode is

<texmath type='inline'> {\left\rbrace \begin{array}{ll} x &y\\a&b \end{array}\right.} \hspace{10.0pt}x^{ {2 }}! </texmath>

the transcript file will contain

1 {math shift character $}
2 +stack: level + 2 for math entered on line 2
3 +stack: level + 3 for math entered on line 2
4 \cases ->\left \{\begin {array}{ll}
5 +stack: level + 4 for math entered on line 2
6 +stack: level + 5 for cell entered on line 2
7 +stack: level + 6 for math entered on line 2
8 +stack: level - 6 for math from line 2
9 +stack: level - 5 for cell from line 2
10 +stack: level + 5 for cell entered on line 2
11 +stack: level - 5 for cell from line 2
12 +stack: level + 5 for cell entered on line 2
13 +stack: level - 5 for cell from line 2
14 +stack: level + 5 for cell entered on line 2
15 \endcases ->\end {array}\right .
16 +stack: level - 5 for cell from line 2
17 +stack: level - 4 for math from line 2
18 +stack: level - 3 for math from line 2
19 +scanint for \mkern->18
20 +scandimen for \mkern->18.0mu
21 +stack: level + 3 for math entered on line 2
22 +stack: level - 3 for math from line 2
23 +stack: level + 3 for math entered on line 2
24 +stack: level + 4 for math entered on line 2
25 +stack: level - 4 for math from line 2
26 +stack: level - 3 for math from line 2
27 +stack: level - 2 for math from line 2
28 Math: $\begin {cases}{\left\{\begin {array}{ll} x &y\\a&b\end{cases}
29 \end {array}\right.} \mkern\hspace{10.0pt}x^{ {2 }}!$
30 +scanint for \hspace->10
31 +scandimen for \hspace->10.0pt
32 {scanglue 10.0pt\relax }
33 Realloc xml math table to 20
34 {Push p 1}

We shall explain for each line in the transcript file where it comes from. Math mode scanning is entered when the translator sees a math shift character (line 1). The scanner reads some tokens and puts them in a list. The list is printed at the end (lines 28-29). The start of the formula is a bit special, in that the token that follows the first dollar sign is considered unexpanded when we check for a double dollar sign. A new group is entered, before scanning the whole formula (line 2).

The loop is as follows:

A token is read and expanded. Lines 4 and 15 show expansion of user commands. An error is signaled in the case of end of data.

If we get a font command, we proceed as follows. First `\cal` is transformed into `\mathcal`. The font can be `\mathtt`, `\mathcal`, `\mathbf`, `\mathrm`, `\mathit`, `\mathbb`, `\mathsf`. These are basic math fonts; they have an inner variant, of the form `\@mathtt`. There is also `\mathnormal`. The command `\mathfrak` selects a Fraktur variant. We allow old fonts (like `\rm`, `\sf`, `\tt`, `\bf`, `\it`, `\sl`), font switches of the form `\rmfamily`, or font commands that take an argument like `\textrm`. These fonts have an inner variant, say *T*. If the font takes no argument, then the token *T* is inserted (as explained for `\cal` above). Otherwise, let *S* be the current math font. In this case, an argument is read, then *S*, *T* and this argument are pushed back, to be read again. For instance, if the current font is “sf”, then `\mathrm{foo}` produces `\@mathrm foo\@mathsf`: these are five tokens to be read again. Note: `\mathbbm` is an alias for `\mathbb`. The name of the internal font has changed since the first edition; it has the form mml@font@xxx, where the suffix is one of normal, upright, bold, italic, bolditalic, script, boldscript, fraktur, doublestruck, boldfraktur, sansserif, boldsansserif, sansserifitalic, sansserifbolditalic, and monospace. Details can be found in the second part of this report.

In all other cases the current token is added to the list. In particular, this explains why the trace starts with a dollar and `\begin`.

If the token is an open brace, in fact any character of category code 1, a new math group is read. You can see on lines 23 and 24 that the stack level increases (a new semantic level is entered, all assignments are local).

If the token is a close brace, in fact any character of category code 2, this terminates the current math group (see lines 25 and 26). An error is signaled in case the current group should be closed differently (for instance with `\end`, or `\right`, etc.).

If the token is a dollar sign, in fact any character of category code 3, then four alternatives can be chosen. This dollar sign can be the end of the math formula. If we are in display math, a token is read with expansion. The error *Display math should end with $$* is signaled if this is not a dollar sign (in fact, a character of category code 3). If the current group is defined by `\hbox`, this can be the start of a math formula (never display math). The tokens in `\everymath` are inserted, then a math formula is read. Otherwise, the error *Extra $ ignored...* is signaled, and parsing continues.

In the case of `\label`, an argument is read. If we are not in display math, or if the formula already has a label, you get an error: *Some labels may be lost*. Whatever the location of the label, an attribute will be added to the `<formula>` element that contains the `<math>` element.

In the case of `\ensuremath`, a token list is read, and pushed back, so that this command acts as `\@firstofone`.

The case `\begin` or `\end` is considered next. We make the assumption that this is a user defined environment, or a math environment. In the case of the example, we have a user environment that expands to a math environment. For a user defined environment, the following is executed:

{\cases .... \endcases}

In the trace, lines 28-29, you will see both `\begin{cases}` and the result of the expansion of `\cases`, but not `\cases` or the brace. However, you can see on lines 4 and 15 the expansion of the user defined commands, and on lines 3 and 18 the braces; these braces can also be seen in the translation in no-mathml mode. You can see on lines 6 and 16 that a group (named “cell”) is opened and closed, because the builtin math environment starts a cell. This allows `&` or `\\` tokens. The group defined on lines 7-8 does not exist in TeX. Let's hope for the best: the argument of the array should contain only letters. Whether these characters should be expanded is unclear.

In the case `\left` and `\right`, a delimiter is read. The rules are: `\relax` and space tokens are ignored. After full expansion, the result should be one of the tokens listed above as valid delimiters, otherwise an error of the form *Invalid character in \left or \right* can be signaled. These commands come in pairs; you might get errors like *Missing \right. inserted*, or *Unexpected \right*. These commands define a group, see lines 5 and 17.

Case `&` and `\\`. These are valid only inside a cell group. They terminate a cell group and start a new one. See lines 9 to 14.

The `\mathchoice` command reads 4 arguments, which are remembered.

Case of `\frac`, etc. These commands read their arguments. The main token list will contain a special slot, with the name of the command and the arguments. In the case of `\sqrt`, the first argument is optional. You say `\root A \of B`.

The syntax of `\genfrac` is special. It takes six arguments. The equivalent of the following commands is executed when *Tralics* bootstraps:

\def\binom{\genfrac()\z@{}}
\def\dbinom{\genfrac(){0pt}0}
\def\tbinom{\genfrac(){0pt}1}

This defines three commands that take two arguments with regular syntax. The first two arguments of `\genfrac` are delimiters or empty. The next one is a dimension or empty. If empty, a default dimension will be used. In the example, the first argument is an opening parenthesis, the second is a closing parenthesis, the third is zero. The next argument is empty or a number between 0 and 3. These numbers correspond to a style: `\displaystyle`, `\textstyle`, `\scriptstyle`, and `\scriptscriptstyle` respectively. Currently an explicit number is required; everything else is treated as an empty list. On the other hand, the dimension is scanned via the scandimen routine (the procedure that prints lines of the form 19-20).

`\hbox` and friends. Currently, only `\hbox` is implemented. The current value of the `\everyhbox` token list is inserted, and the argument is read. There are restrictions, see later.

Case of `\mbox`, `\text`, `\makebox`. Like `\hbox`, but no `\everyXXX` token list is inserted. Example

\everyhbox{A} \everymath{B} \everydisplay{C}
\[a=\hbox{bc d $ef g$}h i\text{OK}\]

Translation, in no-mathml mode.

<texmath type='display'>Ca=\text{Abc} \text{d} Bef gh i\text{OK}</texmath>

Case of a math font, for instance `\@mathcal`. The font command is inserted in the token list, but the variable holding the current font is (locally) updated.

Case of a math command (like `\alpha` listed as above). The command is read. There is a special hack: `\not\in` is converted to `\notin`, and `\not=` to `\ne`.

Case of `\hspace`, `\vspace`. An argument, preceded by an optional space, is read. In the case of `\hspace`, a space is added, otherwise the command is ignored.

`\kern`, `\mskip`, `\mkern`, `\hskip`, `\vskip`. See the example: on lines 19 and 20, there is the trace of the routines that read the argument. The result is converted into a `\hspace`, with argument delimited by braces. Look at the trace, line 29. This will be read again later, see lines 30 to 32. In the case of `\vskip`, we should convert to `\vspace`, and re-insert, but the argument is ignored. In the case of `\mkern`, the result is converted into pt units, using the rule 18mu=10pt; in the case of `\mskip`, the stretch and shrink parts of the glue are discarded.

An apostrophe is handled in a special way, as explained in the TeXbook. Essentially `x'` is `x^{\prime}` and `x'^2` is `x^{\prime2}`.

Mode independent commands are interpreted as usual (this includes undefined commands). This should not typeset anything.

A character is remembered, together with the current font.

Mode-independent tokens are evaluated. These are commands like `\def` that change the environment, and have an empty translation. For instance, `\relax` commands are handled here.

Everything else is inserted without evaluation.
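
The relation between `\genfrac` and the three binomial commands described above can be checked in LaTeX with a sketch like the following. It assumes amsmath, which provides these commands there; the operands n and k are arbitrary.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% \binom is \genfrac with parentheses, a rule of thickness zero
% and an empty style argument (the current style is kept).
$\binom{n}{k} = \genfrac{(}{)}{0pt}{}{n}{k}$

% Style 0 forces \displaystyle, style 1 forces \textstyle.
$\dbinom{n}{k} = \genfrac{(}{)}{0pt}{0}{n}{k} \qquad
 \tbinom{n}{k} = \genfrac{(}{)}{0pt}{1}{n}{k}$

% Empty delimiters and a non-zero thickness give an ordinary fraction bar.
$\genfrac{}{}{1pt}{}{a}{b}$
\end{document}
```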

We give here an example with some fonts.

$\mathtt{Ab}\mathcal{Cd}\mathbf{Ef}\mathrm{Gh}\mathit{Ij} \mathbb{Kl}\mathsf{Mn}$

The translation is as follows. You can notice that some variants affect only uppercase letters.

<formula type='inline'> <math xmlns='http://www.w3.org/1998/Math/MathML'> <mrow> <mi mathvariant='monospace'>A</mi> <mi mathvariant='monospace'>b</mi> <mi>𝒞</mi> <mi>d</mi> <mi mathvariant='bold'>E</mi> <mi mathvariant='bold'>f</mi> <mi> G </mi> <mi> h </mi> <mi>I</mi> <mi>j</mi> <mi>𝕂</mi> <mi>l</mi> <mi mathvariant='sans-serif'>M</mi> <mi mathvariant='sans-serif'>n</mi> </mrow> </math> </formula>

Whenever we see an array (this can be a global environment like
`eqnarray` or a local one, like `array`), we translate all cells one after the other. The
character `&` is the cell separator. The command `\\` is the row
separator. In the case where an array ends with a `\\`, this gives an empty
row: it will be removed. Each cell has an alignment, left, right, or
center. An attribute is added only if this is not center. The `array`
environment has an argument that explains the type of the columns (columns not
indicated are centered). The default alignment is `rl` for `split` and
`align`, `rcl` for `eqnarray`, centered for `matrix`. You can use
`\multicolumn`. This command takes three arguments: the span which
should be some integer, then the alignment (one of r, l or c) and the content
of the cell. The program may signal errors in case of wrong syntax. Here is an
example:

$\begin{array}{rcl} a&b&c&d\\ A&\multicolumn{1}{r}{B}&C&D\\ \end{array}$

This is the translation of the array.

<mtable> <mtr> <mtd columnalign='right'><mi>a</mi></mtd> <mtd><mi>b</mi></mtd> <mtd columnalign='left'><mi>c</mi></mtd> <mtd><mi>d</mi></mtd> </mtr> <mtr> <mtd columnalign='right'><mi>A</mi></mtd> <mtd columnalign='right' columnspan='1'><mi>B</mi></mtd> <mtd columnalign='left'><mi>C</mi></mtd> <mtd><mi>D</mi></mtd> </mtr> </mtable>
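The cell-splitting rules described above (cells separated by `&`, rows by `\\`, a trailing empty row removed, unspecified columns centered, an attribute only when the alignment is not center) can be sketched as follows. This is an illustration under simplifying assumptions, not the *Tralics* algorithm: the function names `split_array` and `to_mtd` are hypothetical, and the sketch ignores `\multicolumn`, braces, and optional arguments such as `\\[2cm]`.

```python
def split_array(body: str, colspec: str):
    """Split an array body into rows of (alignment, cell) pairs."""
    names = {"l": "left", "r": "right", "c": "center"}
    rows = [row.strip() for row in body.split(r"\\")]
    if rows and rows[-1] == "":
        rows.pop()  # a trailing \\ yields an empty row, which is dropped
    table = []
    for row in rows:
        cells = []
        for index, cell in enumerate(row.split("&")):
            # Columns beyond the specification default to centered.
            letter = colspec[index] if index < len(colspec) else "c"
            cells.append((names[letter], cell.strip()))
        table.append(cells)
    return table


def to_mtd(alignment: str, cell: str) -> str:
    """Emit a cell; the attribute appears only when not centered."""
    attr = "" if alignment == "center" else f" columnalign='{alignment}'"
    return f"<mtd{attr}>{cell}</mtd>"
```

For a body like `a&b&c&d\\ A&B&C&D\\ ` with specification `rcl`, the fourth column falls outside the specification and is centered, so it carries no attribute, as in the translation shown above.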

If you say `$x$ and $123$`, the translation will be

<p><formula type='inline'><simplemath>x</simplemath></formula> and 123</p>

Initially, we found this a good idea, because this can easily be converted to
HTML as `<i>x</i>`. Moreover `$2^{i\grave eme}$`
gives

<temporary>2<hi rend='sup'>e</hi></temporary>

Here the `<temporary>` element will not show in the XML tree, but is
printed on the terminal if *Tralics* is called with the `interactivemath`
switch. If you invoke *Tralics* with the `-notrivialmath` switch,
these hacks are not tried, and the formula translates into:

<formula type='inline'> <math xmlns='http://www.w3.org/1998/Math/MathML'> <msup> <mn>2</mn> <mrow> <mi>i</mi> <mover accent='true'><mi>e</mi> <mo>`</mo></mover> <mi>m</mi> <mi>e</mi> </mrow> </msup> </math> </formula>

There are three hacks: the first is when the formula contains only a letter,
the second is when the formula contains only digits, and the last one is when
people use a math formula instead of `\textsuperscript`. This hack is
applied only if the math formula starts with digits (no digit at all is OK;
braces are ignored) followed by an exponent marker, followed by a special
exponent; this has to be a single token or a token list. In the case of a
single token, the hack is applied only if this is `e` or `o`.
Typically, it
applies in cases like 2^{e} and N^{o}. In the
case of more than one token, it applies when the exponent is `th`,
`st`, `rd` or `nd`, for cases like
1^{st}, 2^{nd},
3^{rd}, and 4^{th}. There are four rules for
French: `e`, `eme`, `ieme`, `ème` and
`ième` convert to `e`,
`ier` and `er` convert to `er`, `iemes`,
`ièmes` and `es` convert to `es`,
`ère` and `re` convert to `re`.
The accented letter can be given as `è`, or
`` \`e `` or `` \`{e} `` or `\grave{e}` or
`\grave e`. The hack is applied in a case
like:

$2 ^{\text{\small\rm \grave ere}} $

Instead of `\text`, `\hbox` can be used. Instead of `\small` or
`\rm` any font change or font size command can be used. Up to two commands
can be given. The original Perl version had 30 exceptions, including
`$\Sigma{}^{{\rm it}}$` and
`\ddot{\rm o}`. Compare
$\Sigma {}^{\mathrm{it}}$ with Σ^{it} and $\ddot{\mathrm{o}}$
with ö.
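The exponent rules listed above can be summarized by a small lookup table. The Python sketch below is only an illustration: the names `special_exponent` and `trivial_superscript` are hypothetical, and it assumes that braces, font commands, and accent commands such as `\grave e` have already been reduced to plain characters (so that the exponent arrives as a string like `ième`).

```python
# Multi-letter exponents accepted by the hack, mapped to their output form.
ENGLISH = {"th": "th", "st": "st", "rd": "rd", "nd": "nd"}
FRENCH = {
    "e": "e", "eme": "e", "ieme": "e", "ème": "e", "ième": "e",
    "ier": "er", "er": "er",
    "iemes": "es", "ièmes": "es", "es": "es",
    "ère": "re", "re": "re",
}


def special_exponent(exponent: str):
    """Return the normalized exponent, or None when the hack does not apply."""
    if len(exponent) == 1:
        # A single token qualifies only if it is e or o.
        return exponent if exponent in ("e", "o") else None
    return ENGLISH.get(exponent) or FRENCH.get(exponent)


def trivial_superscript(digits: str, exponent: str):
    """Sketch of the third hack: optional digits, then a special exponent."""
    if digits and not digits.isdigit():
        return None  # the formula must start with digits (or nothing)
    normalized = special_exponent(exponent)
    if normalized is None:
        return None
    return f"{digits}<hi rend='sup'>{normalized}</hi>"
```

Under these assumptions, `trivial_superscript("2", "ième")` gives `2<hi rend='sup'>e</hi>`, while `trivial_superscript("x", "eme")` returns `None`, meaning that the formula would be translated as ordinary math.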

Since version 2.8, there is an integer register
named `\notrivialmath`, that controls these hacks; it contains initially 1,
it is set to
zero if *Tralics* is called with the `-notrivialmath` switch, to seven if
*Tralics* is called with the `-trivialmath` switch (and to 349 if
*Tralics* is called with `-trivialmath=349`).
If the value is $A+2B+4C$ modulo 8,
where A, B, and C are zero (false) or one (true), then the behavior is the
following (by default A is true, other flags are false).

If A is true, in the case where the math formula contains optional digits followed by a special exponent, the rule explained above is applied; the special exponent can be one of th, rd, nd, st (for English), or those shown above for French.

If B is true, some math formulas containing a single token produce a non-math result. We have shown above the translation of a digit or a letter. The translation of a minus sign is an en-dash (so that `$-$` is the same as `--`). Other characters are not considered trivial math. Most commands, whose translation is a single MathML element, including Greek letters, are converted into their content. Thus, the translation of `$\alpha$` is `α` rather than some long formula.

If C is true, in the case where the formula starts with a hat or underscore, followed by a character, or a simple token list, and nothing more, then the result is a superscript, or subscript, out of math mode. It is as if you had used `\textsuperscript` or `\textsubscript`. A simple list is a list of characters, with an optional font change at the beginning. This rule has precedence over the first. Said otherwise, `X$^{eme}$` is translated as X^{eme}. Example

$1^e$, $3^{eme}$ X$^{eme}$ $4^{i\grave{e}me}$ $1^{st}$ $2^{nd}$ $3^{rd}$ $4^{th}$ $x$ $1$ $\alpha$ $\pm$ $\longleftrightarrow$ $-$ $_{foo}$ $^{2+3}$ $_{\bf Foo}$ $+$ $x^{eme}$ $\log$ $_{F\bf oo}$

Translation (with MathML namespace removed), all hacks enabled:

<p>1<hi rend='sup'>e</hi>, 3<hi rend='sup'>e</hi> X<hi rend='sup'>eme</hi> 4<hi rend='sup'>e</hi> 1<hi rend='sup'>st</hi> 2<hi rend='sup'>nd</hi> 3<hi rend='sup'>rd</hi> 4<hi rend='sup'>th</hi> <formula type='inline'><simplemath>x</simplemath></formula> 1 α ± ⟷ – <hi rend='sub'>foo</hi> <hi rend='sup'>2+3</hi> <hi rend='sub'><hi rend='bold'>Foo</hi></hi> <formula type='inline'><math><mo>+</mo></math></formula> <formula type='inline'><math><msup><mi>x</mi> <mrow><mi>e</mi><mi>m</mi><mi>e</mi></mrow> </msup></math></formula> <formula type='inline'><math><mo form='prefix'>log</mo></math></formula> <formula type='inline'><math><msub><mrow></mrow> <mrow><mi>F</mi><mi mathvariant='bold'>o</mi> <mi mathvariant='bold'>o</mi></mrow> </msub></math> </formula></p>
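The encoding of the three flags in the `\notrivialmath` register (value $A+2B+4C$ modulo 8) can be decoded as in the following Python sketch. The names `TrivialMathFlags` and `decode_notrivialmath` are hypothetical and chosen for this illustration only.

```python
from typing import NamedTuple


class TrivialMathFlags(NamedTuple):
    exponent_hack: bool      # A: digits followed by a special exponent
    single_token_hack: bool  # B: single-token formulas become plain text
    bare_script_hack: bool   # C: formula reduced to a lone ^ or _ script


def decode_notrivialmath(value: int) -> TrivialMathFlags:
    """Decode the register value A + 2B + 4C, taken modulo 8."""
    low = value % 8
    return TrivialMathFlags(bool(low & 1), bool(low & 2), bool(low & 4))
```

For instance, 349 is 5 modulo 8, so `decode_notrivialmath(349)` enables A and C but not B; the default value 1 enables only A, and 7 enables all three.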

In the case where the value of the counter
`\@nomathml` is negative, then the translation
is a `<texmath>` element containing all tokens of the math list. For instance,

\csname@nomathml\endcsname=-1 $\begin{pmatrix} \binom 12&\int_0^\infty f(x)dx\\[2cm] \mathfrak{W}_2&\text{xyz}=\sqrt{xxyyzz} \end{pmatrix}$

translates as

<p><texmath type='inline'>\begin{pmatrix} \genfrac(){0.0pt}{}{1}{2}&\int _0^\infty f(x)dx\\[2cm] \@mathfrak W\@mathit _2&\text{xyz}=\sqrt{xxyyzz} \end{pmatrix}</texmath></p>

In all other cases we use a highly recursive procedure that converts a math list into a formula. The procedure takes as argument the current style. This is one of D, T, S, or SS (display, text, script, or script script style). It is D for a display math formula, T for a normal formula.

Consider first the case where the formula has an `\over`, or a variant, not
hidden inside braces. This example has 6 subexpressions, each of which has
such an operator.

${a\over b}{a\above2mm b}{a\atop b} {a\overwithdelims[] b}{a\abovewithdelims[]2mm b}{a\atopwithdelims[] b}$

The translation is

<formula type='inline'> <math xmlns='http://www.w3.org/1998/Math/MathML'> <mrow> <mfrac><mi>a</mi> <mi>b</mi></mfrac> <mfrac linethickness='2mm'><mi>a</mi> <mi>b</mi></mfrac> <mfrac linethickness='0.0pt'><mi>a</mi> <mi>b</mi></mfrac> <mfenced open='[' close=']'> <mfrac><mi>a</mi> <mi>b</mi></mfrac></mfenced> <mfenced open='[' close=']'> <mfrac linethickness='2mm'><mi>a</mi><mi>b</mi></mfrac></mfenced> <mfenced open='[' close=']'> <mfrac linethickness='0.0pt'><mi>a</mi> <mi>b</mi></mfrac></mfenced> </mrow> </math> </formula>

It is an error if the formula has more than one such operator. Otherwise, we
have two parts: what precedes the operator and what follows the operator. As
the example shows, some operators need delimiters. Other operators read a
dimension. This dimension must be given explicitly as a sequence of digits and
a unit of measure (we could do better; if you want `\parindent` instead of
`2mm`, you should use `\genfrac` instead). After splitting the formula
into two parts, the same idea as for `\genfrac` is used. If the current style
is C, the next style in the list is used for both parts of the formula (if the
style is D or T, the next style is S, otherwise it is SS). Note
that `\choose` is like `\over`; you should use `\binom`
instead.
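The splitting step and the style rule stated in the paragraph above can be sketched in Python as follows. This is a simplified illustration, not the actual implementation: the token list is assumed to be a plain list of strings, and the names `next_style` and `split_at_operator` are hypothetical.

```python
OVER_LIKE = {
    r"\over", r"\above", r"\atop",
    r"\overwithdelims", r"\abovewithdelims", r"\atopwithdelims",
}


def next_style(style: str) -> str:
    """Style rule as stated in the text: D or T gives S, otherwise SS."""
    if style not in ("D", "T", "S", "SS"):
        raise ValueError(f"unknown style: {style!r}")
    return "S" if style in ("D", "T") else "SS"


def split_at_operator(tokens):
    """Split a token list at its single top-level operator, if any."""
    depth = 0
    positions = []
    for index, token in enumerate(tokens):
        if token == "{":
            depth += 1
        elif token == "}":
            depth -= 1
        elif depth == 0 and token in OVER_LIKE:
            positions.append(index)  # operators inside braces are hidden
    if len(positions) > 1:
        raise ValueError("more than one operator in the formula")
    if not positions:
        return None
    at = positions[0]
    return tokens[:at], tokens[at], tokens[at + 1:]
```

An operator hidden inside braces is not seen at the outer level, so `["{", "a", r"\over", "b", "}"]` yields `None` here; it would be handled when the group itself is translated.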

We assume from now on that the formula contains no more operators like
`\over`. This means that the current style can be used for the current
object. Items are handled as follows:

A space is ignored.

If the current token is `\text`, `\hbox`, `\mbox`, this is a command with an argument, that is interpreted using special rules. A sequence of characters produces `<mtext>`, a space produces a `<mspace>`, and math formulas are allowed. Errors may be signaled if the content of the argument is too complicated. The translation of

$ x=0 \text{provided that $y=0$ or } a=1$

is

<formula type='inline'> <math xmlns='http://www.w3.org/1998/Math/MathML'> <mrow> <mi>x</mi> <mo>=</mo> <mn>0</mn> <mrow> <mtext>provided</mtext> <mspace width='0.5em'/> <mtext>that</mtext> <mspace width='0.5em'/> <mrow> <mi>y</mi><mo>=</mo><mn>0</mn> </mrow> <mspace width='0.5em'/> <mtext>or</mtext><mspace width='0.5em'/></mrow> <mi>a</mi><mo>=</mo><mn>1</mn> </mrow> </math> </formula>

In the case of `$\mathop{\rm sin}$`, the translation is `<mo form='prefix'>sin</mo>`. Any sequence of characters is allowed instead of `sin`. Instead of `\rm`, any font change command that switches to `rm` can be used.

In the case of `\hspace`, an argument is read, converted to a dimension (in fact, a glue is read via the scanglue routine, the shrink and stretch parts of the glue are discarded), and the result is a `<mspace>` element. For instance `\hspace{2cm plus 3pt}` produces `<mspace width='56.9055pt'/>`.

In the case of `\displaystyle`, `\textstyle`, `\scriptstyle`, `\scriptscriptstyle`, the current style is changed, to D, T, S and SS respectively.

In the case of `\nonscript`, the token is discarded if the style is D or T, kept otherwise. We shall see later that space disappears after such a token, if it is not discarded. Example.

$\def\foo{\nonscript~} \foo x^{y\foo}_{\textstyle z\foo}$

The translation is

<formula type='inline'> <math xmlns='http://www.w3.org/1998/Math/MathML'> <mrow> <mspace width='3.33333pt'/> <msubsup> <mi>x</mi> <mstyle scriptlevel='0' displaystyle='false'> <mrow><mi>z</mi><mspace width='3.33333pt'/></mrow> </mstyle> <mi>y</mi> </msubsup> </mrow> </math> </formula>

If the token is a character of category code 7 or 8, it is left unchanged (typically, case of `^` and `_`).

If the token is `\limits`, `\nolimits`, `\mathord`, `\mathop`, `\mathbin`, `\mathrel`, `\mathopen`, `\mathclose`, `\mathpunct`, `\mathinner`, `\ensuremath`, `\nonumber`, `\nolinebreak`, it is ignored. However, if `\mathrel` or `\mathbin` has been seen, we remember that the next object should be Rel or Bin.

If the token is `\big`, `\bigl`, `\bigm`, `\bigr`, `\bigg`, `\biggl`, `\biggm`, `\biggr`, `\Big`, `\Bigl`, `\Bigm`, `\Bigr`, `\Bigg`, `\Biggl`, `\Biggm`, `\Biggr`, we remember that a big object is wanted, with subtype left, right, middle, or other.

The current token or group is translated, according to the rules given below. After that, flags may be added (if the object is declared Bin or Rel or big). The current style is changed to the next style in the case we are in a group, and the group is preceded by `^` or `_`.

If the current token is a character, its translation will be a `<mi>`, `<mn>` or `<mo>` element. In the case of a letter, this may depend on the font attribute associated to the character. For instance `{\bf x=1}` gives `<mi mathvariant='bold'>x</mi>` and `<mo>=</mo>` and `<mn>1</mn>`.

If the current token is `\left`, `\right`, or already translated, it is left unchanged.

If the token is a constant, like `\alpha`, see the big list at the start of the chapter, its XML value is inserted.

If the token is a list, like `{...}`, it will be translated, using a copy of the current style.

If the token is a list of the form `\left`...`\right`, it will be translated. After that, fences will be added (using what follows the `\left` and `\right`).

If the token is a list of the form `\begin{xxx}`...`\end{xxx}`, we assume that this is an array, or a matrix; we already explained how it can be translated.

If the token is `\mathchoice`, with its four arguments, one of them is selected according to the current style. For instance

\def\foo{\mathchoice{1}{2}{3}{4}} $\foo{\displaystyle \foo}^{\foo^{\foo}}$

translates to

<mrow> <mn>2</mn> <msup> <mstyle scriptlevel='0' displaystyle='true'><mn>1</mn></mstyle> <msup> <mn>3</mn> <mn>4</mn> </msup> </msup> </mrow>

It is an error if the current token is not a command of the form `\acute`, etc, or `\overline`, those listed at the end of the section "basic objects". As a general rule, for instance for `\frac`, arguments are translated using the next style (i.e., smaller), unless the style is indicated (for `\dfrac` and `\tfrac`, the style is T and S, the style may be indicated for `\genfrac`). If `\foo` is as above, the translation of

$\tfrac{\foo}{\foo} = \dfrac{\foo}{\foo} = \frac{\foo}{\foo} {\displaystyle \frac{\foo}{\foo}}$

is

<mrow> <mstyle scriptlevel='0' displaystyle='false'> <mfrac><mn>3</mn> <mn>3</mn></mfrac> </mstyle> <mo>=</mo> <mstyle scriptlevel='0' displaystyle='true'> <mfrac><mn>2</mn> <mn>2</mn></mfrac> </mstyle> <mo>=</mo> <mfrac><mn>3</mn> <mn>3</mn></mfrac> <mstyle scriptlevel='0' displaystyle='true'> <mfrac><mn>2</mn> <mn>2</mn></mfrac> </mstyle> </mrow>

The translation of

\def\xbar#1{\genfrac{}{}{}{#1}{\foo}{\foo}} $\xbar{0}\xbar{1}\xbar{2}\xbar{3}\xbar{}$

is the following. You can notice that, if the argument of `\xbar` is 2 or 3, this does not change the translation of the fraction. In TeX we get two formulas that have the same size but are not vertically aligned (why?).

<mstyle scriptlevel='0' displaystyle='true'> <mfrac><mn>2</mn> <mn>2</mn></mfrac> </mstyle> <mstyle scriptlevel='0' displaystyle='false'> <mfrac><mn>3</mn> <mn>3</mn></mfrac> </mstyle> <mstyle scriptlevel='1' displaystyle='false'> <mfrac><mn>4</mn> <mn>4</mn></mfrac> </mstyle> <mstyle scriptlevel='2' displaystyle='false'> <mfrac><mn>4</mn> <mn>4</mn></mfrac> </mstyle> <mfrac><mn>3</mn> <mn>3</mn></mfrac>

As a final example, the translation of

$\overline{x}\grave{y} \underbrace{z}\stackrel{a}{b}\overset{a}{b} $

is

<mover accent='true'><mi>x</mi> <mo>‾</mo></mover> <mover accent='true'><mi>y</mi> <mo>`</mo></mover> <munder accentunder='true'><mi>z</mi> <mo>⏟</mo> </munder><mover><mi>b</mi> <mi>a</mi></mover> <mover><mi>b</mi> <mi>a</mi></mover>
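The `\mathchoice` rule from the list above amounts to indexing the four arguments by the current style. A minimal Python sketch, with the hypothetical names `STYLE_INDEX` and `mathchoice`:

```python
# Position of each style among the four arguments of \mathchoice.
STYLE_INDEX = {"D": 0, "T": 1, "S": 2, "SS": 3}


def mathchoice(style, display, text, script, scriptscript):
    """Select the argument that corresponds to the current style."""
    return (display, text, script, scriptscript)[STYLE_INDEX[style]]
```

In the `\foo` example above, the four occurrences are evaluated in styles T, D, S and SS, so `mathchoice(style, "1", "2", "3", "4")` gives 2, 1, 3 and 4 in turn, as in the translation shown.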

Before we forget it: when the formula is completely translated, we have a
list of XML elements. If the list is empty, the result is
`<mrow/>`. For instance, in the case of `x^{}`, the exponent is empty. If
the list has a single XML token, this will be the result. Otherwise, everything is
put in a `<mrow>`. If the current formula, or subformula contains a style
change, it is put in a `<mstyle>` element. This is not always the right
solution, because the same style is used for everything, what precedes and
what follows the style command. If you look at the `\genfrac` example
above, you can see that styles are added by the `\genfrac` interpreter
(the single TeX switch is associated with two MathML attributes).

If we have a formula, of the form `$_x^{2}_{abc}$`, the translation rules
explained so far tell us that we have: an underscore character, an XML element for `x`, a hat character, an XML element for `{2}`, an underscore,
and an XML element for `{abc}`. We may have `\nonscript` tokens; they will
be removed, as well as a space that follows.
We have to evaluate the commands that control subscripts and superscripts. A hat
character gives `<msup>`, an underscore character gives `<msub>`,
and both give `<msubsup>`. It is possible for a formula to start with an
underscore or a hat: in this case, the kernel is empty. It is not possible for
a formula to end with hat or underscore. A kernel can have at most one
subscript and at most one superscript; hence the formula above is wrong:
the letter x is the first subscript to the empty kernel. A valid formula is
for instance `$_yx^2$`. It translates as

<mrow> <msub><mrow></mrow> <mi>y</mi> </msub> <msup><mi>x</mi> <mn>2</mn> </msup> </mrow>
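The rules for attaching subscripts and superscripts to a kernel can be sketched as follows. This is a simplified Python illustration, not the *Tralics* code: items are assumed to be already translated and are represented by plain strings, `_` and `^` stand for the script markers, and the names `attach_scripts` and `element_name` are hypothetical.

```python
def attach_scripts(items):
    """Group a flat item list into (kernel, subscript, superscript) triples."""
    result = []
    current = None  # the triple being built, kept as a mutable list
    index = 0
    while index < len(items):
        item = items[index]
        if item in ("_", "^"):
            # A formula cannot end with a marker, nor have two in a row.
            if index + 1 >= len(items) or items[index + 1] in ("_", "^"):
                raise ValueError(f"missing argument after {item}")
            if current is None:
                current = ["", None, None]  # script on an empty kernel
                result.append(current)
            slot = 1 if item == "_" else 2
            if current[slot] is not None:
                kind = "subscript" if slot == 1 else "superscript"
                raise ValueError(f"double {kind}")
            current[slot] = items[index + 1]
            index += 2
        else:
            current = [item, None, None]  # a new kernel starts here
            result.append(current)
            index += 1
    return [tuple(triple) for triple in result]


def element_name(triple):
    """Name of the MathML element for a triple, or None for a bare kernel."""
    _, sub, sup = triple
    if sub is not None and sup is not None:
        return "msubsup"
    if sub is not None:
        return "msub"
    if sup is not None:
        return "msup"
    return None
```

With this sketch, the items of `$_yx^2$` give an empty kernel with subscript `y`, then kernel `x` with superscript `2`, while the items of `$_x^{2}_{abc}$` raise a "double subscript" error.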

We have mentioned above that some operators can be flagged as left, right,
and that adding `\bigr` may convert a left operator into a right
operator. There is some magic that converts, in some cases, the `\big`
operator into fences. For instance

$\bigl [ A\big ( x^2 \big) B \bigr[ $

translates as

<mfenced open='[' close='['> <mi>A</mi> <mfenced open='(' close=')'><msup><mi>x</mi> <mn>2</mn> </msup></mfenced> <mi>B</mi> </mfenced>

There is another trick, that works in some cases. Consider:

$\int_0^\infty f(x) dx = \big[ U \big ]$

the translation is

<mrow> <msubsup><mo>∫</mo> <mn>0</mn> <mi>∞</mi> </msubsup> <mrow> <mi>f</mi><mo>(</mo><mi>x</mi><mo>)</mo><mi>d</mi><mi>x</mi> </mrow> <mo>=</mo> <mfenced open='[' close=']'><mi>U</mi></mfenced> </mrow>

The interesting point here is the placement of the inner `<mrow>`.
The idea is that the parentheses should remain small (not larger than the
`<mrow>`). In particular, they should not be influenced by the integral that
precedes and the fence that follows. In some cases, it works.

In *Tralics*, you can use the following three commands
`\mathmo`, `\mathmi`, and `\mathmn`.
They take an argument and produce a `<mo>`, `<mi>`, or `<mn>`. There is
a file tralics-iso.sty that contains

\def\makecmd#1{\expandafter\newcommand\csname math#1\endcsname} \def\makemo#1#2{\makecmd{#2}{\mathmo{\amp\##1;}}} \def\makemi#1#2{\makecmd{#2}{\mathmi{\amp\##1;}}} \def\makemn#1#2{\makecmd{#2}{\mathmn{\amp\##1;}}}

Then you can say `\makemo{x02190}{slarr}`, and this will define a command
`\mathslarr`, whose translation (in math mode only) is
`<mo>←</mo>`. The file provides nearly 2000 such definitions, taken
from the MathML entity files, with the MathML names. These commands can be
used instead of TeX commands like `\mathchar`: remember that a math-char
is a 15-bit integer, where 8 bits are used for the position in a font table,
3 bits for the type, and 4 bits for the family. Only three types are defined
for *Tralics*, but the content of the element is arbitrary (most math symbols
are between U+2100 and U+27FF, there are also letters between U+1D400 and
U+1D7FF).
There is a command `\mathattribute` that adds an attribute pair to the
last created math element. You can say for instance

\providecommand\operatorname[1]{% \mathmo{#1}% \mathattribute{form}{prefix}% \mathattribute{movablelimits}{true}% }

After that,

$\min _xf(x) >\operatorname{min} _xf(x)$

translates as

<formula type='inline'> <math xmlns='http://www.w3.org/1998/Math/MathML'> <mrow> <msub><mo movablelimits='true' form='prefix'>min</mo> <mi>x</mi> </msub> <mrow> <mi>f</mi> <mo>(</mo> <mi>x</mi> <mo>)</mo> <mo>></mo> </mrow> <msub><mo movablelimits='true' form='prefix'>min</mo> <mi>x</mi> </msub> <mrow> <mi>f</mi> <mo>(</mo> <mi>x</mi> <mo>)</mo> </mrow> </mrow> </math> </formula>

The command `\DeclareMathOperator` takes two arguments
(say `foo` and `bar`), with an optional star before the first argument.
It defines `\foo` to be the command `\operatorname` applied to `bar`
(with a star when required). The command `\operatorname` is as shown
above (the movablelimits attribute is only added if the command is
followed by a star).

You can use the command `\mathchardef`. This is like `\chardef`,
it reads a command and a number. The number should fit on 15 bits.
Otherwise, you will see an error of the form:
*Bad mathchar replaced by 0: 1234567*. The `\mathchardef` command
reads a command, say `\foo`, and an integer N; there is no difference
between `\foo` and `\mathchar`N, except that `\the\foo` returns
the integer N, and is faster to parse. Some constants, like `\@cclvi`=256,
are defined in this way by the TeX kernel and should not be used as math
characters. Some commands, like `\eta`=111_{16}, are meant to be used as
a math character. In *Tralics*, until version 2.8, an error is signaled.
In version 2.9, the translation, in math mode, is a `<mi>` element
containing this character; you might say `\mathchardef\eta"3B7`.
Outside math mode, this gives an error
that takes the form *Undefined command \eta; command code = 264*,
instead of *Math only command \theta. Missing dollar not inserted*;
inside math mode, the behavior is the same as the standard one.

TeX has a special register called `\fam`. If you say something like

\fam3 ${\fam9 \the\fam}\ \the\fam$

then the second `\the` expands to minus one. The first gives 9, but LaTeX complains with: *\textfont 9 is undefined (character 9)*.
In *Tralics*, you would see

<mrow><mn>9</mn><mspace width='6pt'/><mn>3</mn></mrow>

As the example shows, the family is unused, and not correctly restored.
Each character has a `\mathcode`. The following

\mathcode`\a="0941 $a\the \mathcode`\a$

is interpreted by *Tralics* as `$a2369$`. However TeX complains, with
*\textfont 9 is undefined (character A)*, because you ask the lower
case letter a to be printed like the upper case letter A with textfont 9.
A mathcode is a 15-bit integer, with an exception: a character whose mathcode
is 32768 behaves like an active character, the action associated to it must be
defined somehow, for instance like this:

{\catcode`\'=\active \global\let'\active@math@prime}
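The layout of a 15-bit math code (3 bits for the type, 4 bits for the family, 8 bits for the position) can be made concrete with a short Python sketch. The function names `split_mathcode` and `is_active_mathcode` are hypothetical; the bit layout is the one described in the text.

```python
def split_mathcode(code: int):
    """Split a 15-bit math code into (type, family, position)."""
    if not 0 <= code < 0x8000:
        raise ValueError("not a 15-bit value")
    return code >> 12, (code >> 8) & 0xF, code & 0xFF


def is_active_mathcode(code: int) -> bool:
    """The value 32768 marks a character that behaves as if active."""
    return code == 0x8000
```

For the value `"0941` used above, which is 2369 in decimal, `split_mathcode(0x0941)` gives type 0, family 9 and position `"41` (the letter A), in agreement with the TeX message about textfont 9 and character A.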

There is a command `\delimiter`, it reads a number, but you cannot use
it.
There is a command `\radical`, it reads a number, then signals an
error. The `\mathaccent` command is similar.

There are commands `\raise` and `\lower`, as well as
`\vcenter`. The last one is not implemented in *Tralics*. The translation
of

a\raise2cm\xbox{foo}{bar}\lower 2pt\xbox{xfoo}{xbar}

is

<p>a<foo>bar</foo><xfoo>xbar</xfoo></p>

As you can see, the specification disappears. Maybe in a future version, we
will add an attribute to the box.
You cannot use these commands in math mode in *Tralics*. In TeX, you can get
an error of the form: *You can't use `\raise' in vertical mode*,
while `\vcenter` is a math only command.
Currently `\indent` and `\noindent` are ignored in math mode (in TeX
`$\indent_b$` produces a kernel and an index; the kernel is an empty box of
width `\parindent`, of type Ord).