Tralics, a LaTeX to XML translator; Part II

10. Corrigendum

We explain here some modifications of the TeX files or style sheets as described in version 2 of this document. For each section, we indicate the revision date.

10.1. Breaking Urls in the Pdf, 2007/01/28, 2007/07/30

This section discusses the following quote from Chapter 4: “the code was modified in January 2007, using \url to typeset the argument; this removes overfull hboxes for long URLs; however the \urlstyle has to be changed to `same´, so as to use the current font, and the url package has to be loaded with the `obeyspaces´ option, so as to keep spaces.”

10.1.1. Examples of hyperlinks

The problem arises when typesetting the content of the <fo:basic-link> element. Here is a typical example, taken from the bibliography (it points to the last version of the research report that describes Tralics); it is the result of the translation of the `url´ field of some \bibitem.

<fo:basic-link
      color="red"
      external-destination="https://hal.inria.fr/inria-00069870">
   https://hal.inria.fr/inria-00069870
</fo:basic-link>

Here is a second example. It is on the title page of the report, see Chapter 7, section 7.14.

<fo:basic-link
  external-destination="http://www.inria.fr/recherche/equipes/modele.en.html">
  Team Mod&#xE8;le
</fo:basic-link>

Here is a third example. It is on the TOC, see Chapter 7, section 7.5, or 10.2 below.

<fo:basic-link internal-destination="uid19">
   <fo:page-number-citation ref-id="uid19"/>
</fo:basic-link>

10.1.2. Typesetting hyperlinks

Assume that we have a long URL of the form `www-sop.inria.fr/foo´; some people use a verbatim font: any font different from the current font could be used, the important point is to clearly show the start and the end of the string (in this example, we use quotes and a font change). By default, there is no hyphen character in a verbatim font so that the URL cannot be broken across lines. As a consequence, lots of people use footnotes: if the footnote contains `see´ followed by the URL, no line break is needed. The \url command allows line breaks at some characters like slash or dot, but never at the dash (so that there is no confusion between a dash and a hyphen).
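
As a minimal sketch of the behaviour described above, the following standalone LaTeX file uses the url package outside of Tralics and fotex. It assumes a standard LaTeX installation; the narrow minipage is only there to force a line break, and the two settings are the ones named in the quote from Chapter 4.

```latex
\documentclass{article}
\usepackage[obeyspaces]{url} % keep spaces inside the argument of \url
\urlstyle{same}              % typeset URLs in the current font
\begin{document}
\begin{minipage}{5cm}
% \url may break after a slash or a dot, never at the dash in www-sop.
See \url{http://www-sop.inria.fr/apics/SILA/WebPage/} and \url{Team Foo}.
\end{minipage}
\end{document}
```

Removing the obeyspaces option should make the space in the second argument disappear, which is the `TeamFoo´ effect discussed in section 10.1.3.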

A URL can be typeset using \htmladdnormallink (this was the only method for the Raweb, ten years ago), \href (originally translated by Tralics as the previous one, but now as in the hyperref package, i.e., arguments reversed) or \url (a command that takes a single argument). Note that Tralics translates \url{foo} as \href{foo}{foo} (unless \url is in \href), and the hyperref package redefines \url in the same spirit.

The translation of \href{X}{Y} by Tralics is some element that has an attribute X and a value Y. The attribute X is the URL; it is not typeset, it is the external destination of the examples shown above (the \ref command is used for internal destinations; it has a single argument X, and the third example gives an idea of what the system could use for Y). In some cases, Y is some text (example 2), typeset as usual, and in other cases, it is identical to X. In this case, people prefer to use some kind of verbatim font for typesetting this argument. In Tralics, there is a hook that allows you to change the font of Y (it is empty by default, because an XML processor can always add formatting instructions); on the other hand, the \url command of the url package uses by default a verbatim font. Assume now that the element is translated into a <fo:basic-link> element in XSL/Format syntax. We want to convert it to Pdf. The file fotex.xmt has the following lines (Chapter 4, section 4.17, line 2096):

1    <a line of TeX code shown below>
2 % this breaks any real content in the link text
3 %  \expandafter\@basiclink\relax#1//\@nil#1\@nil\FOexternaldestination\@nil%

In Chapter 4, we felt it unnecessary to add the two commented-out lines. Let´s however try to explain the intent. The idea is to distinguish example 1 from example 2 by looking for the string `//´ in Y. The code on line 3 is a bit strange: the \expandafter command changes the order of expansion; it starts by expanding the \relax, which is unexpandable, hence it is useless. If we assume that \ifnotaurl is false in the case of a URL, the code is equivalent to:

4 \ifnotaurl
5   \href{\FOexternaldestination}{\FO@inlinesequence{#1}}%
6 \else
7   \href{\FOexternaldestination}{\FO@inlinesequence{\XURL{#1}}}%
8 \fi

It is not really clear why the line is commented out. Is it because the test is too complicated? Because of expansion order? Or is it because the \XURL command is incorrect? In any case, we found it worthless to describe it in Chapter 4. You would expect line 1 (replacement text for the code above) to be the same as line 5; however it is the following:

9    \href{\FOexternaldestination}{#1}

We replaced it by the following line (which is the same as line 5); the important point is that the color attribute of the element is taken into account (if it´s red, it´s active, but the converse is false).

10    \href{\FOexternaldestination}{\FO@inlinesequence{#1}}

10.1.3. Using the \url command

In January 2007, we modified the code, in order to allow line breaks; we thought it reasonable to use the \url command. Using line 7 with \url instead of \XURL is a first idea, but the title page has `TeamFoo´ instead of `Team Foo´ (example 2). In other words, spaces are missing, and the font change is unexpected. According to the quote at the start of the section, two lines had to be modified in file fotex-add.sty; they are the following.

11 \RequirePackage[obeyspaces]{url}
12 \urlstyle{same}

Line 1 is now (with \url instead of \XURL):

13 \href{\FOexternaldestination}{\FO@inlinesequence{\XURL{#1}}}

There is a small problem: the hyperref package redefines the \url command, and the code breaks if an ampersand character appears in a URL. The solution we found was to bypass these definitions. All in all, the idea was to use the line shown above, where \XURL is defined by

14 \def\XURL{\begingroup \urlstyle{same}\Url}

For some very strange reason, this definition was in the fotex file, was never used, but nevertheless redefined by the Raweb to \relax (end of chapter 3, code line 245).

This definition has the advantage that it makes LaTeX happy; in our test file, the title page contains `Team Mod&#xE8;le´ in the right font and with the space; the ampersand character is also handled correctly, according to the specification of \url. However, we expected the character entity to be replaced by its value, and to see `Team Modèle´.

10.1.4. Avoiding use of \url

Now here we have a big problem: let Y be, as above, the second argument of the \href command, after conversion into XSL/Format, and Z be the argument of \XURL on line 13; in fact, the \XURL command takes no argument, because it makes some characters active, and redefines some of them. In our case, Y is a sequence of characters (not yet read), and Z is a sequence of tokens (corresponding to Y, but with category code fixed). The \url command converts Z into a character string, using \meaning. As a result all characters (brace, backslash, ampersand) lose their special meaning. On the other hand, a special math code is assigned to some characters (what happens to non-7bit characters is unclear), and the result is typeset in math mode, using a typewriter font (or the current font) as text font 0. The effect of the special math code is, for instance, to get something like a less-than sign (in reality a \langle) instead of an inverted exclamation point (this depends on the font), and, more importantly, to allow line breaks.
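
The effect of \meaning can be observed in isolation. The following sketch is not the code of the url package; it only illustrates, under the assumption of a plain LaTeX run, how a token list becomes a string of ordinary characters. The macro \Z is a hypothetical name, and \foo is deliberately left undefined since it is never executed.

```latex
\documentclass{article}
\begin{document}
% \Z holds a token list containing an ampersand, a command and braces.
\def\Z{a&b\foo{c}}
% \meaning\Z expands to the characters: macro:->a&b\foo {c}
% All of them are typeset literally; none keeps its special meaning.
\texttt{\meaning\Z}
\end{document}
```

The space after \foo in the result shows that the original characters are not recovered exactly, only a printable form of the tokens.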

It is impossible to handle correctly something like &#xE8; using the \url command. Can we modify \url, avoiding the call to \meaning? This is not clear; and what about all commands that are in a math environment, but are assumed to behave as if they were in text? Let´s consider the following alternative: some magic gives a modified Z, with some characters being active, for instance colon, dot, slash, and we define them like this

15 {\catcode`:13 \gdef:{\char`:\allowbreak}}

This attempt failed: in example 3, there is a colon between the namespace and the local name; the redefinition above is nonsense: we cannot modify the behavior of some characters like less-than sign, colon, ampersand that participate in XML parsing. We might try to change the category codes of dot and slash, but this means redefining how Z is obtained from Y; this is not obvious.

Note. Let X be a very long string, obtained by N concatenations of `foobar´. There is a potential linebreak between `foo´ and `bar´, and a second one between `bar´ and `foo´. Only linebreaks before position 64 in the string are considered. As a consequence, if a line starts with a short word, like `foo´, and is followed by X, then TeX will not break X, and this gives an overfull line. This means that the only possible way to allow long URLs to be split is to add explicit \penalty tokens; this means that the break points are marked via an element or an entity in the XML file.

10.1.5. A better solution?

Finally, we consider the following scheme. Links are typeset by the code shown on line 10. The translator produces an XML file containing possible line breaks. We describe the solution of 2007/01/28. Assume that the following lines are inserted in the TeX source.

16 \def\hrefcats{\catcode`.=13\catcode`/=13}
17 \let\xhref\href
18 \def\href#1{\begingroup\hrefcats\yhref{#1}}
19 \def\yhref#1#2{\endgroup\xhref{#1}{#2}}

The result is to make two characters active in the second argument of \href, they can be redefined as in line 15. The question is now how to convert the \allowbreak. If we say that it is \penalty0, the command is lost by Tralics, if we say that the XML translation is <allowbreak>, the element is lost by the style sheet. The simplest hack would be to convert this into a character, like this

20 {\catcode`.=13\catcode`/=13
21 \gdef.{\char`\.^^^^200c}\gdef/{\char`\/^^^^200c}}

The character U+200C (zero-width non-joiner) seems to be invisible in HTML. You can redefine it like this in the fotex.sty file:

22 \DefineCharacter{8204}{200C}{\penalty100 }

For the Raweb2007, the best solution is to use an <allowbreak> element, and to modify the style sheets, so as not to lose it. The element is defined as

23 \XMLelement{allowbreak}
24 {}{\penalty100 }{}

The previous hack is integrated in Tralics2.10.4, for the \url command. In the following example, breakpoints are inserted in the URLs ending with C and E.

25 \href{http://www-sop.inria.fr/apics/SILA/WebPage/}{SilaA}
26 \href{Sila}{http://www-sop.inria.fr/apics/SILA/WebPage/B}
27 \url{http://www-sop.inria.fr/apics/SILA/WebPage/C}
28 \href{http://www-sop.inria.fr}{http://www-sop.inria.fr/apics/SILA/WebPage/D}
29 \href{ok}{\url{http://www-sop.inria.fr/apics/SILA/Web.Page/E}}
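
The catcode trick of lines 16 to 19 can be tried in isolation. The sketch below is plain LaTeX, independent of Tralics and hyperref; the names \breakablelink, \breakable@link and \link@activechars are hypothetical, and the active characters insert \penalty100 directly instead of producing a character or an XML element. It only works where the argument has not been read yet (hence the minipage environment rather than a \parbox).

```latex
\documentclass{article}
\makeatletter
% Inside a group where dot and slash are active, define a macro that
% activates both characters and makes each one print itself followed
% by a legal break point.
\begingroup
  \catcode`\.=\active \catcode`\/=\active
  \gdef\link@activechars{%
    \catcode`\.=\active \catcode`\/=\active
    \def.{\char`\.\penalty100 }%
    \def/{\char`\/\penalty100 }}
\endgroup
% Change the category codes first, then read the argument.
\newcommand\breakablelink{\begingroup\link@activechars\breakable@link}
\newcommand\breakable@link[1]{#1\endgroup}
\makeatother
\begin{document}
\begin{minipage}{4cm}\raggedright
See \breakablelink{http://www-sop.inria.fr/apics/SILA/WebPage/} for details.
\end{minipage}
\end{document}
```

The final dot after `details´ is read after the group has been closed, so it is an ordinary character again.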

10.2. Bad TOC layout, 2007/02/04

A typical line of the TOC in an XSL/Format file is the following (line breaks added for readability):

1 <fo:block text-indent="0.25in">
2    2.1. &#x2003;
3    <fo:inline space-start="20pt">
4       <fo:inline>Overall Objectives</fo:inline>
5    </fo:inline>
6    <fo:leader hack="true" rule-thickness="0pt"/>
7    <fo:inline color="red">
8      <fo:basic-link internal-destination="uid4">
9         <fo:page-number-citation ref-id="uid4"/>
10      </fo:basic-link>
11    </fo:inline>
12 </fo:block>

The <fo:block> element has four children, a section number A, a section title B, a filler C, and the page number D. The section title comes from the XML file, the section number is computed by the style sheet, and the page number by LaTeX. The indentation depends on the section level. The page number is flushed right; this is achieved by inserting a filler: leaders in the case of a section, blank space otherwise. In the simple case shown here, the translation of <fo:leader> is simply \hfill (see Chapter 4, code lines 2472 to 2521).

In some cases there is not enough space on the line, and TeX inserts line breaks; a potential break point is at C; and this is a discardable item: this means that the page contains A, B, linebreak, D. The result is ugly: the page number is flushed left instead of being flushed right. Note that the page number is all the more visible since it is red and the remainder of the text is indented.

This is how the problem is solved in LaTeX:

13 \newcommand*\l@section[2]{%
14     [...]
15     \setlength\@tempdima{1.5em}%
16     \begingroup
17       \parindent \z@ \rightskip \@pnumwidth
18       \parfillskip -\@pnumwidth
19       \leavevmode \bfseries
20       \advance\leftskip\@tempdima
21       \hskip -\leftskip
22       #1\nobreak\hfil \nobreak\hb@xt@\@pnumwidth{\hss #2}\par
23     \endgroup}
24 \def\numberline#1{\hb@xt@\@tempdima{#1\hfil}}
25 \newcommand*\l@subsection{\@dottedtocline{2}{1.5em}{2.3em}}

The piece of code shown above comes from the article class. Ellipses should be replaced by: do nothing if sections should not appear in the TOC; otherwise add some vertical space and penalty. In the case of the Raweb, all lines in the TOC are equivalent, there is no additional space, nor page break hints. Typesetting a line in the TOC depends on 3 dimensions, L, N, and P. Quantity L is the left margin, it is 0 for a section, 1.5em for a subsection, etc. Quantity P is \@pnumwidth, this is defined to be 1.55em in the class file, this is the space allocated for typesetting the page number. Quantity N is 1.5em for a section, 2.3em for a subsection: this is the space allocated for typesetting the section number. The commands \l@something take two arguments: the first one contains A and B, and explains how to format A (in general via \numberline), the second one contains D.

The idea is to insert some space L, typeset A into a box of width N (flushed left), and D into a box of width P (flushed right), and insert B between them. Between B and D there is a filler (space in the case of a section, dots in the case of dottedtocline). Note: in the case of the Raweb, we use dots in the case of a section, and space otherwise.

There are some difficulties if there is not enough space. Note that the width of a digit is half of an em; thus there is in general one em between the section number and the section title; if the section number is greater than nine, there is half an em; if the section number is 100 or more, we get an overfull box of 2pt. In the case of a subsection of level five, we might find quantities like 10.11.12.13.14; this requires at least 100000 subsections; we provide enough space for the case of one number with two digits.

Since the page number uses \hss, no overfull box is signaled (but 1.55em is big enough for pages up to 999). The trouble is when roman numerals are used, both for page numbers and section numbers: VIII is much too big. This is not the case for the Raweb.

What happens if the section title does not fit on a line? The line breaking algorithm is called, because the argument is not in a box. The break appears at a distance P of the right margin, and the line is continued at a distance N after the left margin: this is because left and right margins have been increased (in the group) by N and P. This implies that we must add negative space before A (via \hskip) and after D (via \parfillskip; this is a hack).
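
The mechanism can be observed on a single TOC line. The sketch below is a simplified variant of \l@section for the case L=0; \tocline is a hypothetical helper that does not exist in the article class or in fotex, and it takes N, A, B and D as four separate arguments instead of relying on \numberline.

```latex
\documentclass{article}
\makeatletter
% #1 = width N of the number box, #2 = number A, #3 = title B, #4 = page D.
\newcommand\tocline[4]{%
  \begingroup
    \parindent\z@ \rightskip\@pnumwidth   % right margin increased by P
    \parfillskip-\@pnumwidth              % last line may reach into P
    \leavevmode
    \advance\leftskip#1\relax             % continuation lines indented by N
    \hskip-\leftskip                      % cancel the indentation on line one
    \hb@xt@#1{#2\hfil}%                   % A in a box of width N, flushed left
    #3\nobreak\hfil\nobreak
    \hb@xt@\@pnumwidth{\hss #4}\par       % D in a box of width P, flushed right
  \endgroup}
\makeatother
\begin{document}
\tocline{1.5em}{2.1}{Overall Objectives, with a title that is long enough
  to be broken across two lines of the table of contents}{4}
\end{document}
```

With a title this long, the second line should start below the first word of the title, and the page number should remain flushed right.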

This idea is then used for the Raweb 2006. Our TOC line must be changed into the following:

1 <fo:block toc="true" margin-right="15pt" text-indent="25pt" margin-left="32pt">
2   <fotex:section-number width="32pt">
3      <fo:inline>2.1.1. </fo:inline>
4   </fotex:section-number>
5   <fo:inline>Research Themes</fo:inline>
6   <fotex:page-number>
7      <fo:inline color="red">
8        <fo:basic-link internal-destination="uid5">
9          <fo:page-number-citation ref-id="uid5"/>
10        </fo:basic-link>
11      </fo:inline>
12   </fotex:page-number>
13 </fo:block>

The next piece of code explains how to typeset the section number; it is the equivalent of \numberline. The width attribute holds quantity N.

14 \XMLelement{fotex:section-number}
15   {\XMLattribute{width}{\FOwidth}{10pt}}
16   {\xmlgrab}
17   {\hb@xt@\FOwidth{#1\hss}}

This produces the page number. This is the same code as on line 22 of the \l@section code shown above, without the final \par. The width P is the right margin, value of \rightskip. We use 15pt, since the Raweb is less than one hundred pages.

18 \XMLelement{fotex:page-number}
19   {}
20   {\xmlgrab}
21   {\nobreak\hfil \nobreak\hb@xt@\rightskip{\hss #1}}

A third modification to fotex.xmt is required. We must change the \parfillskip glue inside the paragraph that contains the TOC line. This could be done by creating a clone of <fo:block>; it is however easier to modify it. We add an attribute toc, with `false´ as default value. If `true´, some code is executed, between lines 1714 and 1715 (see section 4.14), just before typesetting the content of the element. The code increments the left margin by the value of the paragraph indentation, sets the \parfillskip glue, and inserts a negative space. Note that we have to start a new paragraph (i.e. leave vertical mode); inserting \null and \nobreak seems unnecessary.

22    \XMLattribute{toc}{\FOcondtoc}{false}
23 ...
24 \ifx\FOcondtoc\att@true
25   \advance\leftskip\parindent
26   \parfillskip=-\rightskip
27   \leavevmode\null\nobreak\hskip-\leftskip
28 \fi

We modify the tocheading template, see lines 713 to 758, section 7.5. The code is given below. The variables $tocindent and $tocwidth depend on the current level (between zero and six). The effective value is in the file raweb-param. We have chosen 0, 14, 25, 40, 55, 85, and 105 pt for the indentation, and 10, 24, 32, 42, 53, 63, 73 pt for the width. Quantities shown as `...´ are unchanged.

29 <xsl:template name="tocheading">
30   <xsl:param name="level"/>
31   <xsl:variable name="tocindent">...</xsl:variable>
32   <xsl:variable name="tocwidth"> ... </xsl:variable>
33   <xsl:variable name="Number">... </xsl:variable>
34   <fo:block toc='true' margin-right='15pt'
35             text-indent ='{$tocindent}' margin-left ='{$tocwidth}'>
36     <fotex:section-number width='{$tocwidth}'>
37       <fo:inline>
38         <xsl:value-of select="$Number"/>
39       </fo:inline>
40     </fotex:section-number>
41     <fo:inline>
42           <xsl:apply-templates mode="section" select="bodyTitle"/>
43     </fo:inline>
44     <fotex:page-number>
45       <fo:inline color="{$linkColor}">
46         <xsl:variable name="pagref">... </xsl:variable>
47         <fo:basic-link internal-destination="{$pagref}">
48           <fo:page-number-citation ref-id="{$pagref}"/>
49         </fo:basic-link>
50       </fo:inline>
51     </fotex:page-number>
52   </fo:block>
53 </xsl:template>

10.3. Math fonts, 2007/02/14, 2007/03/20

There was a discrepancy in handling math fonts before version 2.9.4. This has been corrected in the following way. First, we have 15 math fonts, listed below. When a math expression is parsed, a token list is constructed, this list contains the math-font equivalent of the current font.

\mml@font@normal. This is the default math font. The translation of $xy$ is a math formula that contains a sequence of two <mi> elements, each one containing a single ASCII character with no attribute. This font is selected if you say \textnormal, or \mathnormal, as well as \sl (MathML does not define a slanted font). Note that \sl, \slshape and \textsl select the same math font; the same is true for other commands.
\mml@font@upright. This font is selected by \mathrm or \rm. The translation of $\rm xy$ is a single <mi> element, with no attribute, containing a space, the characters xy, and a space. Since version 2.9.5, translation of $\rm x$ is a <mi> containing the letter x, with an attribute pair mathvariant = `normal´.
\mml@font@fraktur. This font is selected by \mathfrak. The translation of ${\mathfrak x}$ consists of a <mi> element containing one of three possibilities: either an entity reference, &xfr;, or a character entity, &#x1D535;, or an ASCII character (here x) together with an attribute mathvariant = `fraktur´. If x is replaced by ab+2c, the result is a <mi> element for `ab´, another one for `c´; the translation of the plus sign and the digit are independent of the math font. With version 2.9.5, translation of a digit (or a sequence of digits) is a <mn> element, that has an attribute pair mathvariant = `font´ (unless the current font is normal or upright). The same scheme is also used for fonts described below (but entity references are defined only for fraktur, script, and double struck). Note that Unicode provides bold, double struck, sans-serif, sans-serif bold, and monospace digits. There is currently no easy way to get them in Tralics.
\mml@font@bold. This font is selected by \bf or \mathbf.
\mml@font@italic. This font can be selected by \it or \mathit.
\mml@font@bolditalic. This font can be selected by \it or \mathit, if the current math version is bold.
\mml@font@script. This font is selected by \cal or \mathcal.
\mml@font@boldscript. This font is selected by \cal or \mathcal, if the current math version is bold.
\mml@font@doublestruck. This font is selected by \mathbb.
\mml@font@boldfraktur. This font is selected by \mathfrak if the current math version is bold.
\mml@font@sansserif. This font is selected by \sf or \mathsf.
\mml@font@boldsansserif. This font is selected by \sf or \mathsf, if the current math version is bold.
\mml@font@sansserifitalic. This font cannot be directly selected.
\mml@font@sansserifbolditalic. This font cannot be directly selected.
\mml@font@monospace. This font is selected by \tt or \mathtt.

Here is an example of how to use the raw fonts

1 \def\F#1{\csname mml@font@#1\endcsname}
2 \def\A{A}
3 $  \F{normal} \A  \F{upright} \A  \F{bold} \A    \F{italic} \A
4    \F{bolditalic} \A  \F{script} \A \F{boldscript} \A \F{fraktur} \A
5    \F{doublestruck} \A \F{boldfraktur} \A \F{sansserif} \A \F{boldsansserif} \A
6    \F{sansserifitalic} \A \F{sansserifbolditalic} \A \F{monospace} \A $

This is the translation:

7 <formula type='inline'>
8   <math xmlns='http://www.w3.org/1998/Math/MathML'>
9     <mrow>
10       <mi>A</mi><mi mathvariant='normal'>A</mi><mi>&#x1D400;</mi>
11       <mi>&#x1D434;</mi><mi>&#x1D468;</mi><mi>&Ascr;</mi>
12       <mi>&#x1D4D0;</mi><mi>&Afr;</mi><mi>&Aopf;</mi>
13       <mi>&#x1D56C;</mi><mi>&#x1D5A0;</mi><mi>&#x1D5D4;</mi>
14       <mi>&#x1D608;</mi><mi>&#x1D63C;</mi><mi>&#x1D670;</mi>
15     </mrow>
16   </math>
17 </formula>

This is the translation of the same formula, using option -noentnames. No entity names are used here.

18 <formula type='inline'>
19   <math xmlns='http://www.w3.org/1998/Math/MathML'>
20     <mrow>
21       <mi>A</mi><mi> A </mi><mi>&#x1D400;</mi>
22       <mi>&#x1D434;</mi><mi>&#x1D468;</mi><mi>&#x1D49C;</mi>
23       <mi>&#x1D4D0;</mi><mi>&#x1D504;</mi><mi>&#x1D538;</mi>
24       <mi>&#x1D56C;</mi><mi>&#x1D5A0;</mi><mi>&#x1D5D4;</mi>
25       <mi>&#x1D608;</mi><mi>&#x1D63C;</mi><mi>&#x1D670;</mi>
26     </mrow>
27   </math>
28 </formula>

There is an option -mathvariant to Tralics. If you use it, then the translation of a character in a font is an ASCII character, and the font is indicated by an attribute. Thus, the translation is the following.

29 <formula type='inline'>
30   <math xmlns='http://www.w3.org/1998/Math/MathML'>
31     <mrow>
32       <mi>A</mi>
33       <mi> A </mi>
34       <mi mathvariant='bold'>A</mi>
35       <mi mathvariant='italic'>A</mi>
36       <mi mathvariant='bold-italic'>A</mi>
37       <mi mathvariant='script'>A</mi>
38       <mi mathvariant='bold-script'>A</mi>
39       <mi mathvariant='fraktur'>A</mi>
40       <mi mathvariant='double-struck'>A</mi>
41       <mi mathvariant='bold-fraktur'>A</mi>
42       <mi mathvariant='sans-serif'>A</mi>
43       <mi mathvariant='bold-sans-serif'>A</mi>
44       <mi mathvariant='sans-serif-italic'>A</mi>
45       <mi mathvariant='sans-serif-bold-italic'>A</mi>
46       <mi mathvariant='monospace'>A</mi>
47     </mrow>
48   </math>
49 </formula>

There is a counter \@mathversion, whose initial value is zero. If set to a positive value, then \sf selects a bold variant, otherwise a non-bold one, see list above. The user command \mathversion reads an argument and expands it fully (using \csname). If the argument is `bold´ it sets the counter to 1, otherwise to 0. A single <mi> element is produced for a sequence of characters, provided that these characters are letters, and there is no font change command between the characters, and the font is not the default one. This means that $diff$ is a math formula containing four identifiers (with an implicit product), and $\mathit{diff}$ is a formula containing a single identifier that uses an italic font. Example:

50 \def\A{Xx\mathcal{Cal}\mathrm{Rm}\mathbf{Bf}\mathsf{Sf}%
51   \mathtt{Tt}\mathtt{x}\mathtt{y+1}%
52 \mathnormal{No} \mathit{It}\mathfrak{Fr}}
53 \mathversion{normal}
54 $\A$
55 \mathversion{bold}
56 $\A$

The translation is

57 <formula type='inline'>
58  <math xmlns='http://www.w3.org/1998/Math/MathML'>
59    <mrow>
60      <mi>X</mi><mi>x</mi>
61      <mi mathvariant='script'>Cal</mi>
62      <mi> Rm </mi>
63      <mi mathvariant='bold'>Bf</mi>
64      <mi mathvariant='sans-serif'>Sf</mi>
65      <mi mathvariant='monospace'>Tt</mi>
66      <mi mathvariant='monospace'>x</mi>
67      <mi mathvariant='monospace'>y</mi>
68      <mo>+</mo>
69      <mn mathvariant='monospace'>1</mn>
70      <mi>N</mi><mi>o</mi>
71      <mi mathvariant='italic'>It</mi>
72      <mi mathvariant='fraktur'>Fr</mi>
73    </mrow>
74  </math>
75 </formula>
76 <formula type='inline'>
77   <math xmlns='http://www.w3.org/1998/Math/MathML'>
78     <mrow>
79      <mi>X</mi><mi>x</mi>
80      <mi mathvariant='bold-script'>Cal</mi>
81      <mi> Rm </mi>
82      <mi mathvariant='bold'>Bf</mi>
83      <mi mathvariant='bold-sans-serif'>Sf</mi>
84      <mi mathvariant='monospace'>Tt</mi>
85      <mi mathvariant='monospace'>x</mi>
86      <mi mathvariant='monospace'>y</mi>
87      <mo>+</mo>
88      <mn mathvariant='monospace'>1</mn>
89      <mi>N</mi><mi>o</mi>
90      <mi mathvariant='bold-italic'>It</mi>
91      <mi mathvariant='bold-fraktur'>Fr</mi>
92     </mrow>
93    </math>
94 </formula>

If you set the \@nomathml counter to -1, the result is something like

95 <texmath type='inline'>
96    Xx\mml@font@script Cal\mml@font@upright Rm\mml@font@bold Bf
97    \mml@font@sansserif Sf\mml@font@monospace Tt\mml@font@monospace x
98    \mml@font@monospace y+1\mml@font@normal No\mml@font@normal
99    \mml@font@italic It\mml@font@fraktur Fr
100 </texmath>

Note that, if you say \rm\bf, the first font command is useless, hence is not indicated. This is done so because the math list of the previous example looks like this (we have only shown the start and the end):

101 $Xx\mml@font@boldscript Cal\mml@font@normal\mml@font@upright Rm...
102 ...\mml@font@normal\mml@font@boldfraktur Fr\mml@font@normal$

If you do not like the names above, you can change them in the configuration file. For instance

103 mml_font_normal = "Nr"
104 mml_font_upright = "Up"
105 mml_font_bold = "Bo"
106 mml_font_italic = "It"
107 mml_font_bold_italic = "Bi"
108 mml_font_script = "Sc"
109 mml_font_bold_script = "Bs"
110 mml_font_fraktur = "Fr"
111 mml_font_doublestruck = "Ds"
112 mml_font_bold_fraktur  = "Bf"
113 mml_font_sansserif = "Ss"
114 mml_font_bold_sansserif = "Bs"
115 mml_font_sansserif_italic = "Si"
116 mml_font_sansserif_bold_italic = "Sbi"
117 mml_font_monospace = "Mn"

In this case, the translation of

118 \csname@nomathml\endcsname=-1
119 $\frac{\mathit{\mathbf{foo}}}{\mathrm{bar}+1}=3$
120 \def\F#1{\csname mml@font@#1\endcsname}
121 \def\A{A}
122 $  \F{normal} \A  \F{upright} \A  \F{bold} \A    \F{italic} \A
123    \F{bolditalic} \A  \F{script} \A \F{boldscript} \A \F{fraktur} \A
124    \F{doublestruck} \A \F{boldfraktur} \A \F{sansserif} \A \F{boldsansserif} \A
125    \F{sansserifitalic} \A \F{sansserifbolditalic} \A \F{monospace} \A $

becomes

126 <p><Texmath type='inline'>\frac{\Bo foo}{\Up bar\Nr +1}=3</Texmath>
127 <Texmath type='inline'> \Nr  A\Up  A\Bo  A\It  A\Bi  A\Sc  A\Bs  A\Fr  A\Ds
128 A\Bf  A\Ss  A\Bs  A\Si  A\Sbi  A\Mn  A</Texmath>

Finally, we have two commands, \mathfontproperty and \setmathchar, that can be used as follows.

129 \mathfontproperty2=3 $\mathbf{x}$
130 \the\mathfontproperty\mml@font@bold
131 \the\setmathchar\mathbf`x
132 \mathfontproperty\mathbf=0
133 \setmathchar\mathbf`x={\&\#x1d431;} % this is not what you want
134 $\mathbf{x}$
135 \setmathchar\mathbf`x={\xmllatex{\&\#x1d431;}{}}
136 $\mathbf{x}$

These commands have to be followed by a mathfont identifier; this is an integer between 0 and 14; it can be a math font like \mml@font@fraktur, or a TeX fontname like \mathbf; the value of \mathfontproperty is a boolean value; this means that any non-zero value is internally stored as one. The effect of the option -nomathvariant to the program is to set all bits to zero; the effect of -mathvariant is to set all bits to one; the mechanism shown here allows the user to change the behaviour of some of the fonts, or to change it temporarily. If the boolean is true, translation of a letter in the font is a <mi> element with a mathvariant attribute.

The command \setmathchar takes a second argument which is a character value (an integer between 0 and 127); this command sets a value; in the example given above, it says that the translation of character x, in the boldface font, if no mathvariant attribute is used, is a <mi> element containing 𝐱. This value is stored as a character string; assignment is global; using \the, you can get the value stored in the table, as a list of character tokens, (of category code 12, as usual).

137 <formula type='inline'>
138   <math xmlns='http://www.w3.org/1998/Math/MathML'>
139     <mi mathvariant='bold'>x</mi></math></formula>
140 1&amp;#x1D431;<formula type='inline'>
141   <math xmlns='http://www.w3.org/1998/Math/MathML'>
142     <mi>&amp;#x1d431;</mi></math></formula><formula type='inline'>
143   <math xmlns='http://www.w3.org/1998/Math/MathML'>
144     <mi>&#x1d431;</mi></math></formula>

10.4. Text font in math, 2007/04/09

In Tralics2.9.4, translation of \hbox and variants was a <mrow> element containing in general some <mtext> or <mspace> elements, and maybe some other math formulas. The enclosing <mrow> has been removed. Moreover, font changes are honoured. Example:

145 $\hbox{toto} \it \text{titi\bf tata} {\tt \text{x = y$^2$}}$

The translation contains two <mrow> elements; the inner <mrow> is a consequence of the group started just before \tt. The translation of the first \text command is a sequence of two <mtext> elements, because of the font change.

146 <formula type='inline'>
147   <math xmlns='http://www.w3.org/1998/Math/MathML'>
148     <mrow>
149       <mtext>toto</mtext>
150       <mtext mathvariant='italic'>titi</mtext>
151       <mtext mathvariant='bold'>tata</mtext>
152       <mrow>
153         <mtext mathvariant='monospace'>x</mtext>
154         <mspace width='3.33333pt'/>
155         <mtext mathvariant='monospace'>=</mtext>
156         <mspace width='3.33333pt'/>
157         <mtext mathvariant='monospace'>y</mtext>
158         <msup> <mrow/> <mn mathvariant='monospace'>2</mn> </msup>
159       </mrow>
160     </mrow>
161   </math>
162 </formula>

10.5. Math extensions, 2007/02/15

The result of the translation of \xbox{foo}{bar} is an XML element named `foo´, containing `bar´. Such a construction is illegal in math mode. You must use \mathbox instead. In this case, the content is formed of three math characters (<mi> elements).

There are eight commands that generalize the \xbox command in math mode. First, we have \mathmi, \mathmo and \mathmn. These were added in 2004, and produce a <mi>, <mo> and <mn> element. For instance, both formulas

1 $x=2$ $\mathmi{x}\mathmo{=}\mathmn{2}$

translate to the same quantity. The argument should contain only letters. Since version 2.9.4, the commands \mathci, \mathcn and \mathcsymbol can be used to produce elements <ci>, <cn> and <csymbol>. The MathML recommendation has the following example:

2 <formula type='inline'>
3   <math xmlns='http://www.w3.org/1998/Math/MathML'>
4     <cn type='complex-cartesian'>3<sep/>4</cn>
5   </math>
6 </formula>

It cannot be produced by the commands shown above because there are non-letters in the <cn> element. For this reason, we have introduced \mathbox, a command that takes two arguments: an element name and the content. The previous example can be produced by using `cn´ as element name; the content is formed of three parts A, B, and C. Here B is a \mathbox, named `sep´, with empty content, while A and C contain only 3 and 4. They are obtained by \mathcnothing, which has the same syntax as \mathmi, but produces an element with an empty name (this means that you will see only its content, a sequence of characters). The example can be produced via the following code:

7 $\mathbox{cn}{\mathcnothing{3}\mathbox{sep}{}\mathcnothing{4}}
8 \mathattribute{type}{complex-cartesian}$

Note that \mathattribute is a command, available only in math mode, that adds an attribute to the last element created. In some cases, the creation order is unclear. In the example above, the main token list has two elements, the \mathbox with its arguments and the \mathattribute with its arguments; they are processed in order. In the case of \mathbox, its arguments are processed in order, and then an element is created. This differs from the case of \xbox, where a box is created, then the arguments are evaluated, and these arguments can add attributes to the current box.

We solve the problem in the following way: the \mathbox command, as well as the seven other ones, take attributes as optional arguments. The previous example could be written like this:

9 %% $\mathbox{cn}{...}[type=complex-cartesian]$  % this does not work

In fact, the attribute comes before the content:

10 %% $\mathbox{cn}[type=complex-cartesian]{...}$  % this does not work

In order to make parsing easier, two pairs of brackets are needed, as in

11 $\mathbox{cn}[type][complex-cartesian]{...}$  % this works

As many attribute pairs as desired can be given. Thus, the easiest method is the following

12 \def\mmlAtype#1{[type][#1]}
13 $\mathbox{cn}\mmlAtype{complex-cartesian}
14   {\mathcnothing{3}\mathbox{sep}{}\mathcnothing{4}}$
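The bracket syntax described above (each attribute is given as two consecutive bracketed groups, name then value) can be illustrated by a small Python sketch. This is only a model of the surface syntax: the function name `read_attribute_pairs` is hypothetical, and the sketch does not perform the macro expansion or local grouping that Tralics applies when it scans optional arguments.

```python
import re

# One attribute is written as two consecutive bracket groups: [name][value].
PAIR = re.compile(r"\[([^\]]*)\]\[([^\]]*)\]")

def read_attribute_pairs(text):
    """Read leading [name][value] pairs; return (attributes, remaining text).

    Hypothetical helper: it models the syntax only, not Tralics itself.
    """
    attrs = {}
    pos = 0
    while True:
        match = PAIR.match(text, pos)
        if match is None:
            break
        attrs[match.group(1)] = match.group(2)
        pos = match.end()
    return attrs, text[pos:]

if __name__ == "__main__":
    print(read_attribute_pairs("[type][complex-cartesian]{...}"))
    # A single [name=value] group is not a pair, so nothing is consumed.
    print(read_attribute_pairs("[type=complex-cartesian]{...}"))
```

The second call shows why the `[type=complex-cartesian]` form is rejected in this model: a lone bracket group does not form a name/value pair.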

There is a non-trivial point here: when Tralics tests for an optional argument, it expands the next token. If this is an opening bracket, all tokens up to the next closing bracket are read. These tokens are then read again, in a local group, as in the case of the \frac command. Example

15 \def\foo#1{[q#1][\let\x\relax\gee]}\def\gee{10}
16 $\mathbox{a}\foo1\foo2{0}$

In verbose mode, the transcript file contains the following lines. On line 45, you will see the list of tokens seen by the math parser. Some of these tokens, for instance \let are evaluated and are not part of the math formula. In a future version, they might disappear from the trace.

17 [53] $\mathbox{a}\foo1\foo2{0}$
18 {math shift character}
19 +stack: level + 2 for math entered on line 53
20 \foo #1->[q#1][\let \x \relax \gee ]
21 #1<-1
22 +stack: level + 3 for math entered on line 53
23 +stack: level - 3 for math from line 53
24 +stack: level + 3 for math entered on line 53
25 {\let \x \relax}
26 {changing \x=undefined}
27 {into \x = \relax}
28 \gee ->10
29 +stack: killing \x
30 +stack: level - 3 for math from line 53
31 \foo #1->[q#1][\let \x \relax \gee ]
32 #1<-2
33 +stack: level + 3 for math entered on line 53
34 +stack: level - 3 for math from line 53
35 +stack: level + 3 for math entered on line 53
36 {\let \x \relax}
37 {changing \x=undefined}
38 {into \x = \relax}
39 \gee ->10
40 +stack: killing \x
41 +stack: level - 3 for math from line 53
42 +stack: level + 3 for math entered on line 53
43 +stack: level - 3 for math from line 53
44 +stack: level - 2 for math from line 53
45 Math: $\mathbox{a}{q1}{\let10}{q2}{\let10}{0}$

Here is a final example. We assume that \mmlentity{foo} produces &#xfoo;, where the first character is an ampersand of category code letter. It will be output verbatim.

46 \def\test{\mathbox{a}[foo][bar][foo1][\mmlentity{a0}bar\&\#xa0;+\_1]
47 {\mathcn{1}\mathci{2}}}
48 $\test$
49 \csname@nomathml\endcsname=-1
50 $\test$

This is the translation. In no mathml mode, there are some redundant backslashes.

51 <formula type='inline'>
52   <math xmlns='http://www.w3.org/1998/Math/MathML'>
53      <a foo1='&#xa0;bar&amp;#xa0;+_1' foo='bar'>
54         <cn>1</cn><ci>2</ci></a>
55   </math>
56 </formula>
57 <texmath type='inline'>
58   \mathbox{a}[foo='bar'][foo1='&#xa0;bar\&amp;\#xa0;+\_1']
59    {\mathcn{1}\mathco{2}}
60 </texmath>

10.6. Missing minus signs, 2007/02/24

The MathML interpreter of Firefox 1.5 on Macintosh does not display minus signs. This is a bit annoying; for this reason, the option -bad_minus was introduced in version 2.9.5. Its effect is to replace a minus sign by an en-dash. Thus the translation of $x-y$ is

61 <formula type='inline'>
62   <math xmlns='http://www.w3.org/1998/Math/MathML'>
63     <mrow><mi>x</mi><mo>&#x2013;</mo><mi>y</mi></mrow>
64   </math>
65 </formula>

10.7. Math attributes and other commands, 2007/03/24

10.7.1. Attributes for arrays

This section describes \rowattribute and \cellattribute, and why these commands are needed. When you use \mathattribute, this adds an attribute to the last element that was created; in the case of a table, the order of creation is different, and this command cannot be used. Thus, \rowattribute and \cellattribute are commands that take two arguments and add an attribute pair to the current row and the current cell, respectively.

In some cases, you also want to add an attribute to a math formula. You can use \formulaattribute, and \thismathattribute. We give here an example using all these four commands.

1 \begin{align}
2 \formulaattribute{tag}{8-2-3}
3 \thismathattribute{background}{white}
4 \rowattribute{mathvariant}{bold} x^2 + y^2+100 &=  z^2 \\
5 \cellattribute{columnalign}{left}  x^3 + y^3+1 &<  z^3
6 \end{align}

Translation. For simplicity, we have replaced the translation of x^2+y^2 by XX.

7 <formula type='display' tag='8-2-3'>
8  <math mode='display' xmlns='http://www.w3.org/1998/Math/MathML'
9     background='white'>
10   <mtable>
11     <mtr mathvariant='bold'>
12       <mtd columnalign='right'>
13         <mrow>XX<mo>+</mo><mn>100</mn></mrow></mtd>
14       <mtd columnalign='left'>
15         <mrow><mo>=</mo><msup><mi>z</mi> <mn>2</mn> </msup></mrow></mtd>
16     </mtr>
17     <mtr>
18       <mtd columnalign='left'>
19         <mrow>XX<mo>+</mo><mn>1</mn></mrow></mtd>
20       <mtd columnalign='left'>
21         <mrow><mo>&lt;</mo><msup><mi>z</mi> <mn>3</mn> </msup></mrow></mtd>
22     </mtr>
23   </mtable>
24  </math>
25 </formula>

10.7.2. Explicit equation numbers

The command \@y@tag takes one argument, say `foo´, and adds it as value of the attribute `tag´ of the current math formula; the command \@x@tag is similarly defined, but parentheses are added. The command \x@tag takes one argument and puts it in parentheses, with some space before; the command \y@tag is the same, without parentheses. The commands are defined in amsmath.plt as

26 \def\@x@tag#1{\formulaattribute{tag}{(#1)}}
27 \def\@y@tag#1{\formulaattribute{tag}{#1}}
28 \def\x@tag#1{\qquad(#1)}
29 \def\y@tag#1{\qquad#1}

The command \@xtag takes one argument, say foo, and pushes \@xtag{foo} to the end of the current math list. If the command is called twice, with arguments foo and bar, the result will be \@xtag{foo,bar}. The command \@ytag is similar. If you use both \@xtag and \@ytag, the result will be \@ytag; in other words, \tag*{a} \tag{b} is the same as \tag*{a,b}.

The command \tagatcurpos redefines \@xtag to be \x@tag. This means that \tag{$*$} is the same as \qquad(*). The command \tagatendofformula defines \@xtag and \x@tag as explained above (this is the default behavior). This means that \tag{$*$} is the same as \qquad(*), but pushed to the right end of the formula. The command \tagasattribute defines \@xtag as explained above, and \x@tag to be \@x@tag. This means that \tag{$*$} puts (*) on the attribute list of the formula. Example

30 \[ a \tag{b} c \tag{*}\]\par
31 \tagasattribute
32 \[ a \tag{b} c \tag{*}\]\par
33 \tagatcurpos
34 \[ a \tag{b} c \tag{*}\]\par

35 <formula type='display'>
36   <math mode='display' xmlns='http://www.w3.org/1998/Math/MathML'>
37     <mrow>
38       <mi>a</mi><mi>c</mi>
39       <mspace width='2.em'/><mo>(</mo><mi>b</mi><mo>,</mo><mo>*</mo><mo>)</mo>
40     </mrow>
41   </math>
42 </formula>
43 <formula type='display' tag='(b,*)'>
44   <math mode='display' xmlns='http://www.w3.org/1998/Math/MathML'>
45     <mrow><mi>a</mi><mi>c</mi></mrow>
46   </math>
47 </formula>
48 <formula type='display'>
49   <math mode='display' xmlns='http://www.w3.org/1998/Math/MathML'>
50     <mrow>
51       <mi>a</mi>
52       <mspace width='2.em'/><mo>(</mo><mi>b</mi><mo>)</mo>
53       <mi>c</mi>
54       <mspace width='2.em'/><mo>(</mo><mo>*</mo><mo>)</mo>
55     </mrow>
56   </math>
57 </formula>
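The three tag placement modes shown above can be summarised by the following Python sketch. It is only a model of the observable behaviour in this example, not the Tralics implementation; the function name `place_tags` and the mode names `endofformula`, `attribute` and `curpos` are hypothetical labels for \tagatendofformula, \tagasattribute and \tagatcurpos.

```python
def place_tags(items, mode):
    """Model where tags end up for a list of (kind, text) items.

    kind is "math" or "tag"; mode is "endofformula" (the default),
    "attribute" or "curpos". Returns (formula body, tag attribute or None).
    """
    body, tags = [], []
    for kind, text in items:
        if kind == "tag" and mode == "curpos":
            # The tag is typeset where it appears.
            body.append("\\qquad(" + text + ")")
        elif kind == "tag":
            # Tags are collected and merged, separated by commas.
            tags.append(text)
        else:
            body.append(text)
    attribute = None
    if tags:
        joined = "(" + ",".join(tags) + ")"
        if mode == "attribute":
            attribute = joined
        else:
            body.append("\\qquad" + joined)
    return " ".join(body), attribute

if __name__ == "__main__":
    formula = [("math", "a"), ("tag", "b"), ("math", "c"), ("tag", "*")]
    for mode in ("endofformula", "attribute", "curpos"):
        print(mode, place_tags(formula, mode))
```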

10.7.3. Infinite horizontal glue

Since version 2.9.5, commands \hfill and \hfil are recognised in some trivial cases. Thus, the translation of

58 $\frac{\hfil1}{2\hfill}$

59 <formula type='inline'>
60   <math xmlns='http://www.w3.org/1998/Math/MathML'>
61     <mfrac denomalign='left' numalign='right'>
62       <mn>1</mn> <mn>2</mn></mfrac>
63   </math>
64 </formula>

Another example, from the TeXbook:

65 \[\text{The confluent image of}\quad
66 \begin{Bmatrix}\text{an arc}\hfill\\\text{a circle}\hfill\\\text{a fan}\hfill
67 \end{Bmatrix}
68 \quad\text{is}\quad
69 \begin{Bmatrix}\text{an arc}\hfill\\\text{an arc or a circle}\hfill\\
70 \text{a fan or an arc}\hfill\end{Bmatrix}.\]

Commands of the form \hfil, \hfill, \hfilneg, \hss are allowed, in first or last position, in arguments of commands like \overline, or when scanning a cell in a table. They are ignored, unless the result is a fraction (see the example above, the result being a numalign or denomalign attribute), or a cell in a table (attribute halign). If there is an \hfill command on the left, on the right, or on both sides, then the alignment is right, left and centered respectively; otherwise, if there is an \hfil command on the left, on the right, or on both sides, then the alignment is right, left and centered respectively; otherwise the default alignment is used. Note: if the default alignment is not centered (as in the case of an array), you need two \hfill commands.
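The alignment rule stated in this paragraph can be written as a short Python sketch. The function name `cell_alignment` and the string encoding of the glue commands are assumptions made for illustration; \hfilneg and \hss are not modelled.

```python
def cell_alignment(left, right, default):
    """Return the alignment implied by glue at the start and end of a cell.

    left and right are None, "hfil" or "hfill" (the glue found in first and
    last position); default is the alignment used when no rule applies.
    """
    # \hfill takes precedence over \hfil, as in the description above.
    for glue in ("hfill", "hfil"):
        on_left = left == glue
        on_right = right == glue
        if on_left and on_right:
            return "center"
        if on_left:
            return "right"
        if on_right:
            return "left"
    return default

if __name__ == "__main__":
    # Numerator \hfil1 and denominator 2\hfill of the fraction example.
    print(cell_alignment("hfil", None, "center"))   # numalign
    print(cell_alignment(None, "hfill", "center"))  # denomalign
```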

10.8. Operators, limits, fences, 2007/03/20

Consider the following math formulas.

1 $\bigl(\frac{3}{4}\big)^{-1}$
2 $\big<\big(\frac 12 \big)\big>$

The TeX translation of \bigl( is given by

3 \def\bigl#1{%
4 \mathopen{\hbox{$\left#1\vbox to8.5pt{}\right.
5   \nulldelimiterspace0pt \mathsurround0pt$}}}

The purpose of the last line is to make sure that no unwanted space is added after the operator. The Tralics translation could have been

6 \def\bigl#1{\mathmo[mathsize][8.5pt]{#1}}

a definition that ignores the \mathopen attribute. The non-trivial question here is: why use 8.5pt and not some other random number? What if the user changes the document size from 10pt to 12pt, or uses \large? This problem is solved in amsmath as follows: there is an empty box, named \Mathstrutbox, that has the height and depth of a parenthesis and is updated whenever needed, and \big@size contains 1.2 times the total height and depth of the box. This quantity is used by \big and friends.

The current Tralics solution is the following: \bigl marks the token that follows as big+open (\big marks it as big+type, where the type is one of open, close, or middle, depending on the operator). After that, a pair big+open, big+close is converted into a \left, \right pair. If there is more than one such operator, Tralics considers the first closing operator that is preceded by an opening one, takes the last opening one before it, converts the sub-formula, and tries again. If the formula is aLbLcRdReLfRg, the sub-expressions LcR and LfR are converted; the new expression is aLbCdReFg, and LbCdR is converted. In the case of aLb, or aRb, Tralics forgets about spurious delimiters (in some versions, it discards some material, this being obviously a bug). Before version 2.9.5, an expression like \big)^2 was converted into an XML expression before bigl/bigr was converted into left/right. In such a case, the \big prefix is useless. As a consequence, Tralics 2.9.4 replaces the first expression by an empty one. For the second expression, the less-than and greater-than signs were considered as relations, and the prefix ignored. Tralics 2.9.5 gives the following:

7 <formula type='inline'>
8   <math xmlns='http://www.w3.org/1998/Math/MathML'>
9     <msup>
10       <mfenced open='(' close=')'>
11         <mfrac><mn>3</mn> <mn>4</mn></mfrac>
12       </mfenced>
13       <mrow><mo>-</mo><mn>1</mn></mrow>
14     </msup>
15   </math>
16 </formula>
17 <formula type='inline'>
18   <math xmlns='http://www.w3.org/1998/Math/MathML'>
19     <mfenced open='&langle;' close='&rangle;'>
20       <mfenced open='(' close=')'>
21         <mfrac><mn>1</mn> <mn>2</mn></mfrac>
22       </mfenced>
23     </mfenced>
24   </math>
25 </formula>
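The pairing rule for big delimiters (take the first closing operator that is preceded by an opening one, match it with the last opening one before it, convert, and try again) can be sketched in Python. In this model, L and R stand for big+open and big+close tokens, a converted sub-formula is shown in square brackets, and spurious delimiters are dropped. The function name `pair_big_delimiters` is hypothetical, and the order in which independent pairs are converted may differ from what Tralics does.

```python
def pair_big_delimiters(tokens):
    """Pair L (big+open) and R (big+close) markers in a string of tokens.

    Each converted sub-formula is replaced by a single item written
    "[...]"; unmatched L or R markers are dropped at the end.
    """
    items = list(tokens)
    while True:
        # First closing marker that has an opening marker before it.
        close = None
        for i, item in enumerate(items):
            if item == "R" and "L" in items[:i]:
                close = i
                break
        if close is None:
            break
        # Last opening marker before that closing marker.
        start = max(j for j in range(close) if items[j] == "L")
        group = "[" + "".join(items[start + 1:close]) + "]"
        items[start:close + 1] = [group]
    return "".join(item for item in items if item not in ("L", "R"))

if __name__ == "__main__":
    print(pair_big_delimiters("aLbLcRdReLfRg"))
    print(pair_big_delimiters("aLb"), pair_big_delimiters("aRb"))
```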

Consider now

26 \[\sum_1\mathop x_2 \]
27 $\sum_3\mathop x\limits_4$
28 $\lim_x \lim\limits_x$

Translation of the first two formulas is, until version 2.9.4:

29 <formula type='display'><math xmlns='http://www.w3.org/1998/Math/MathML'>
30   <mrow><msub><mo>&sum;</mo> <mn>1</mn> </msub><msub><mi>x</mi> <mn>2</mn>
31   </msub></mrow></math></formula>
32 <formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'>
33   <mrow><msub><mo>&sum;</mo> <mn>3</mn> </msub><msub><mi>x</mi> <mn>4</mn>
34   </msub></mrow></math></formula>

This does not put the number 1 below the sum sign. For this reason, Tralics 2.9.5 uses <munder> in such a case. However, the sum operator has implicit movable limits in MathML, and the default mode seems to be inline. For this reason mode = `display´ is added. The second line shows that \mathop has a \displaylimits by default.

35 <formula type='display'>
36   <math mode='display' xmlns='http://www.w3.org/1998/Math/MathML'>
37    <mrow>
38      <munder><mo>&sum;</mo> <mn>1</mn> </munder>
39      <munder><mi>x</mi> <mn>2</mn> </munder>
40    </mrow>
41   </math>
42 </formula>
43 <formula type='inline'>
44   <math xmlns='http://www.w3.org/1998/Math/MathML'>
45     <mrow>
46      <msub><mo>&sum;</mo> <mn>3</mn> </msub>
47      <munder><mi>x</mi> <mn>4</mn> </munder>
48     </mrow>
49   </math>
50 </formula>

Translation of the third line. As you can see, Tralics sets movablelimits to false in non-display mode if limits are wanted, and uses a <msub> if no limits are wanted (in this case, the value of the attribute is ignored).

51 <formula type='inline'>
52  <math xmlns='http://www.w3.org/1998/Math/MathML'>
53   <mrow>
54    <msub><mo movablelimits='true' form='prefix'>lim</mo> <mi>x</mi> </msub>
55    <munder><mo movablelimits='false' form='prefix'>lim</mo><mi>x</mi></munder>
56   </mrow>
57  </math>
58 </formula>
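The choice between <msub> and <munder> in the examples above follows the usual \displaylimits convention, which the following Python sketch makes explicit. It is a simplified model for an operator with a subscript only; the function name `script_element` is hypothetical, and the movablelimits attribute is not modelled.

```python
from typing import Optional

def script_element(display: bool, limits: Optional[bool] = None) -> str:
    """Pick the MathML element for an operator that carries a subscript.

    limits is True for \\limits, False for \\nolimits, and None when no
    explicit command is given (\\displaylimits: limits only in display mode).
    """
    if limits is None:
        limits = display
    return "munder" if limits else "msub"

if __name__ == "__main__":
    print(script_element(display=True))                 # \[\sum_1\]
    print(script_element(display=False))                # $\sum_3$
    print(script_element(display=False, limits=True))   # $\lim\limits_x$
```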

10.9. More math fonts, 2007/05/04

One question is how to translate the following line

1 $\mathbf{\let\P\S xy}\P$

In version 2.9.4, this is the same as

2 $\bf\let\P\S xy\normalfont\P$

where complicated names are used instead of \bf. If the counter \nomathml is negative, the translation is

3 <texmath type='inline'>\mml@font@bold xy\mml@font@normal W</texmath>

where the funny symbol has been replaced by W. Assume that you want something more readable; you can redefine \mathbf to be \string \mathbf. Since version 2.10, you can say \let\mathbf\relax; in this case, the translation is void, but the token remains in the tree, and you will see its name:

4 <texmath type='inline'>\mathbf{xy}W</texmath>

Here W is some funny symbol, but not the same as above. This is a bit annoying. Hence, we changed the meaning of the initial line; it is

5 $\bf{\let\P\S xy}\normalfont\P$

Note that this is similar to the LaTeX behavior: a font change command induces a local group. However, consider now

6 $\mathbf{xy0}z$

With the definition above, a font change is a group, and is translated into a <mrow> (if the group has more than one token). It happens that $x y$ is a single <mi> element if a non-default font is used, and $x y 0$ consists of two tokens, an identifier and a number. For this reason, we changed the method again: the current translation is equivalent to

7 $\bf\begingroup\let\P\S xy\endgroup\normalfont\P$

In this case, there are no braces, hence no sub-tree is constructed, and no <mrow> element is constructed. On the other hand, we have a group that limits the scope of the \let, and the formula contains a pilcrow sign, not a section mark.

The following formula

8 $\mathbf{x\relax y}$

contains two <mi> elements, because the \relax command is still present when letters are converted into identifiers. There is one case where the \relax does not appear in the tree: when it is the first item in a math formula. The reason is that \ensuremath inserts such a token; an expression like

9 \ensuremath{\alpha}

contains a single token, and is considered as a trivial expression, so that the translation is α if Tralics is called with option -notrivialmath.

10.10. New raweb DTD, 2007/07/29

We changed the DTD for the Raweb, in order to accommodate the new specifications. We indicate here the changes, with regard to line numbers given in section 9.1. The new DTD file is called raweb7.dtd.

Originally, translation of \ier was &ier;, and the entity was defined in the DTD; the translation is now independent of the DTD, so that lines 12 to 19 have been removed.

We removed the html attribute of a module (this attribute is not used anymore; the HTML file name associated to a module depends on its id) (see lines 68-70). The topic attribute has also been removed, and the definition of the <accueil> element has been modified by removing the possibility of using <topic> elements (line 224). Of course the <topic> element has been removed (lines 271-273).

On lines 300-343 we removed the attribute bname of all bibliography elements.

In the original version, we defined the name of a section (for instance `fondements´) in the DTD; this can be `Scientific Foundations´. In the 2007 version, this name is now in the configuration file. Thus, for all sections (lines 178 to 223), the declaration of the titre attribute has been changed from `#FIXED´ to `#IMPLIED´. The value of the numero attribute (for instance 3 for `fondements´) has been removed. This number is the index of the `fondements´ in the list defined by the configuration file. We removed the attribute from the DTD because it was not used by the Raweb (see section 7.8, template calculateNumberSection). However the number is used by the file raweb3fo.xsl, that converts the XML file into Pdf, without using an intermediary XML file. The number is obtained by evaluating the element in `xref´ mode. We show here how the number of the `fondements´ section is obtained, and how the sec.num template is modified (replace question marks by the section containing the current element, see original code, lines 1307-1311).

<xsl:template match='fondements' mode='xref'>3</xsl:template>
<xsl:template name="sec.num">
   <xsl:apply-templates mode='xref' select = ?? />
  <xsl:text>.</xsl:text>
</xsl:template>

We removed the attributes url and nom of elements like <Rocquencourt>. The idea was to put this information in a single location. However, when converting from the old DTD to the new one, all these attributes are lost, so that the value Rhône-Alpes exists in four different files (the configuration file, the DTD, the two style sheets). In 2007, we removed it from the DTD, but had to add it to raweb3fo.xsl. Lines 1818-1819 in section 7.14 were modified by evaluating the element in `intro´ mode, and adding definitions like

<xsl:template match = "URSophia" mode ="intro">
  <fo:basic-link external-destination
     ="http://www.inria.fr/inria/organigramme/fiche_ur-sop.en.html"
    >Sophia Antipolis</fo:basic-link>
</xsl:template>

Compare this with the template UR in Chapter 6, line 1242.

10.11. New module specification, 2007/08/02

We removed the preprocessor for the Raweb. As a consequence, the module environment that the translator sees is the same as that of the LaTeX source file. For compatibility reasons, a 2006 module is translated as a modulex. We removed the No-title hacks. The command line option -hacknotitle is still recognised but not used any more. This means that nothing special happens when a division has an empty title. Exception is for a module: a special mechanism was used in 2006. It is an error if a module has no title; the default title is `Overall Objectives´ for the second module, and `Introduction´ otherwise. The title of the first module is ignored, and can be empty. This is a typical example.

1 \begin{module}{presentation}{presentation}{}
2 ...
3 \end{module}

Translation is

1 <presentation id='uid3'>
2 <module id='uid4' html='module2'><head>(Sans Titre)</head>
3 ...

The first argument of the environment is the section title. If this is not the same as for the previous module, a new section is started (in particular the previous section is ended). The title of the section comes from the configuration file, see above. There is no html attribute any more for the modules. Translation (after filling the module title) is hence:

1 <presentation titre='Overall Objectives' id='uid3'>
2 <module id='uid4'><head>Overall Objectives</head>
3 ...

10.12. Input encoding, 2007/11/12

The input encoding mechanism has changed in Tralics 2.10.8. There are two types of files: with fixed or variable encoding. Configuration files, tcf files, bibliography data files, and TeX files opened by \openin use a fixed encoding; other source files use a variable encoding.

In the current version of Tralics, there are 34 possible encodings and the inputenc.plt defines 23. Encoding number 0 is UTF8, encoding number 1 is latin1 (also known as iso-8859-1). These are constant, remaining encodings are defined at runtime (initially, they are the same as latin1).

Whenever a file is opened, its initial encoding is computed. If the file has a fixed encoding, then all lines are immediately converted; otherwise lines are converted when needed. If the first line of the file contains the string utf8-encoded, then encoding 0 is assumed; if the line contains iso-8859-1, then encoding 1 is assumed; and if the line contains tralics-encoding:NN, where NN is a sequence of one or two digits forming a number less than 34, then encoding NN is assumed. There are other heuristics. For instance, if %&TEX encoding = UTF-8 appears near the start of the file, then encoding 0 is assumed. In all other cases, the default encoding is assumed.
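The first-line rules listed in this paragraph can be sketched in Python as follows. This is a minimal model, not the Tralics code: the function name `initial_encoding` is hypothetical, the rules are tried in the order in which the text gives them, and the other heuristics (such as the %&TEX line) are left out.

```python
import re

def initial_encoding(first_line, default=1):
    """Guess the initial encoding number from the first line of a file.

    Only the three rules stated in the text are modelled; anything else
    falls back to the default encoding.
    """
    if "utf8-encoded" in first_line:
        return 0
    # A plain substring test, so a line naming iso-8859-15 also matches here;
    # how Tralics handles that case is not stated in the text.
    if "iso-8859-1" in first_line:
        return 1
    match = re.search(r"tralics-encoding:(\d{1,2})", first_line)
    if match and int(match.group(1)) < 34:
        return int(match.group(1))
    return default

if __name__ == "__main__":
    print(initial_encoding("%% this file is utf8-encoded"))
    print(initial_encoding("%% tralics-encoding:12"))
    print(initial_encoding("\\documentclass{article}"))
```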

The default encoding is stored in \input@encoding@default. The default value is one, but can be changed via an option to the program (utf8 or latin1 select encoding 0 or 1 respectively).

The current encoding is stored in \input@encoding. This is an attribute of the current input file, it can be changed at any time. The new encoding is used when Tralics needs to read a new line in order to fetch the next token. Nothing special is done in the case of \read.

Each input line is converted into a sequence of Unicode characters. This conversion depends on the encoding. If the encoding is 0, then UTF-8 encoding is assumed. In this case a character is represented by one or more bytes. Not every sequence of bytes defines a valid character, so that errors may be signaled. In all other cases each byte is converted into a character according to a table (there is no support for UTF-16 yet). If the encoding is one, each character maps to itself (this is the default). If the encoding is greater than one, a lookup table is needed. This table is the identity at start-up, and can be changed by packages like inputenc, via \input@encoding@val; this is a command that reads an encoding, a byte and a value. In the example that follows we change the encoding number 2 so that \FOO is read as \foo:

1 \input@encoding@val 2 `O =`o
2 \input@encoding@val 2 `F =`f
3 \let\foo\bar
4 \showthe\input@encoding@val 2 `O
5 \input@encoding=2
6 \show\FOO
7 \showthe\input@encoding@val 2 `O
8 \showthe\input@encoding
9 \input@encoding@default=0
10 \showthe\input@encoding@default
11 \input@encoding=1

This example shows three commands in read or write mode: when the command is prefixed by \showthe, it reads a value from memory and prints it on the terminal; otherwise a number is scanned and written in memory. The equals sign before the number is optional. No less than 13 integers are scanned; some are given as an explicit integer, some as a character code. We assume that, for encoding 2, all characters map to themselves. Since \FOO is read as \foo, the \show command should print \bar. On lines 4 and 7 you see the value stored in encoding 2 for the character O (first upper case, then lower case); this is twice 111. The other values printed are 2 and 0.
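The byte-to-character conversion and the effect of the two \input@encoding@val assignments can be modelled by the following Python sketch. The names `identity_table` and `convert_line` are hypothetical, and the tables are plain Python lists of 256 code points; this is an illustration of the mechanism, not the Tralics implementation.

```python
def identity_table():
    """A 256-entry table in which every byte maps to itself."""
    return list(range(256))

def convert_line(raw, encoding, tables):
    """Convert the bytes of one input line into a string of characters."""
    if encoding == 0:
        # UTF-8: invalid byte sequences raise UnicodeDecodeError.
        return raw.decode("utf-8")
    if encoding == 1:
        # latin1: every byte maps to the character with the same code.
        return raw.decode("latin-1")
    table = tables[encoding]
    return "".join(chr(table[byte]) for byte in raw)

if __name__ == "__main__":
    tables = {2: identity_table()}
    tables[2][ord("O")] = ord("o")   # \input@encoding@val 2 `O =`o
    tables[2][ord("F")] = ord("f")   # \input@encoding@val 2 `F =`f
    print(convert_line(b"\\show\\FOO", 2, tables))
```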

The inputenc package contains

12 \edef\io@enc{\encoding@value{latin9}}
13 \DeclareInputText{164}{"20AC}
14 \DeclareInputText{166}{"160}
15 \DeclareInputText{168}{"161}
16 \DeclareInputText{180}{"17D}
17 \DeclareInputText{184}{"17E}
18 \DeclareInputText{188}{"152}
19 \DeclareInputText{189}{"153}
20 \DeclareInputText{190}{"178}

On line 13 and following we have used a macro with LaTeX syntax and two arguments; it assumes that \io@enc contains the index of the encoding to modify. The code above defines the latin9 (iso-8859-15) encoding. It is very similar to latin1, but defines the Euro sign at position 164. We have also

21 \input@encoding@val \encoding@value{latin2} -96 160
22 160 "104 "306 "141 164 "13D "15A 167

As explained above, the command at the start of the line reads 3 integers: an encoding value (here, the encoding of latin2), a byte position and a character value. The byte position is a number between 0 and 255, the value a non-negative number less than $2^{16}$. Here the byte position is illegal: this is an extension of the syntax. If a negative number minus N has been read, followed by A such that the sum of A and N is at most 256, then N values will be read, and stored at position A and following (here N is 96, and we have shown only the first eight values).
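The extended syntax can be modelled in Python as follows. The function name `apply_encoding_val` is hypothetical, the encoding number is assumed to have been read already, and the remaining numbers are given as a list; the block form is shown with N equal to 4 and the first four values of the latin2 listing above.

```python
def apply_encoding_val(table, numbers):
    """Store values in a 256-entry encoding table.

    numbers is [position, value] for the basic form, or
    [-N, A, v1, ..., vN] for the extended form that fills N
    consecutive positions starting at A.
    """
    first = numbers[0]
    if first >= 0:
        position, value = numbers[0], numbers[1]
        table[position] = value
        return
    count, start = -first, numbers[1]
    if start + count > 256:
        raise ValueError("block does not fit in the table")
    values = numbers[2:2 + count]
    if len(values) != count:
        raise ValueError("not enough values")
    table[start:start + count] = values

if __name__ == "__main__":
    latin2 = list(range(256))
    apply_encoding_val(latin2, [-4, 160, 160, 0x104, 0x306, 0x141])
    print([hex(code) for code in latin2[160:165]])
```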

If you call the inputenc package with options cp1250 and utf8, the following actions are undertaken. First, if a non-trivial encoding is used (other than utf8, ascii, latin1 and latin9) then the whole file is read and all tables for all encodings are loaded. You should not rely on that: the only guarantee is that the encodings in the list will be installed. The command \encoding@value can be applied to an encoding name; it returns the encoding number. The last argument becomes the default and current encoding. By “current encoding” we do not mean the encoding of the current file (the style file is ASCII 7 bits) but of the main file. This is done by evaluating the following two lines

23   \input@encoding@default\encoding@value{\inputencodingname}%
24   \AtBeginDocument{\inputencoding{\inputencodingname}}  %% See below

The command \inputencodingname holds the current input encoding name, and the command \inputencoding can be used to change the encoding. It is defined as:

25 \def\inputencoding#1{%
26   \the\inpenc@prehook
27   \edef\inputencodingname{#1}%
28   \input@encoding=\encoding@value{\inputencodingname}%
29   \the\inpenc@posthook}

There are two hooks that do nothing. Note that the input encoding name is changed after the hook is called, so that you can say something like

30 \inpenc@prehook{\typeout {current encoding \inputencodingname}}
31 \inpenc@posthook{\typeout {changed to  \inputencodingname}}

and this should print something like: current encoding foo changed to bar. If you look at the package, you can see that line 24 is only an approximation. In fact, at the start of the document, the value of \inputencodingname is \relax, and this can be tested by the hook.

10.13. Glossary and other indexes, 2007/12/28

Tralics implements some features of the index package. The command \newindex takes an optional argument A, an optional star, a unique tag B, two arguments C, D and a last argument E. You should refer to the documentation of the package for explanations of A, C, D, and the star. It calls \@newindex with arguments B and E. The main index has tag default, the glossary has tag glossary, with titles Index and Glossary. Nothing happens if you try to redefine an existing index; the main index will be used if you try to use an undeclared index. In the example below, we define two indexes, A and B, but use only A.

The \index command takes an optional star (ignored) and an optional argument, which is the tag of an index. There is no difference between \glossary{foo} and \index[glossary]{foo}; in the same fashion, \index{foo} is the same as \index[default]{foo}. The command \addattributetoindex takes three arguments (the first one being optional, and specifying an index). It adds an attribute pair to the index. The title attribute of an index is the title described above (Index for the main index), but you can overwrite it using this command. For instance, we redefine the title of the glossary and the main index.

The commands \makeindex and \makeglossary have no effect. The commands \printindex and \printglossary can be used to say where the index is to be put. By default the end of the document is considered, and the glossary is put after all other indexes. Example.

\newindex{A}{}{}{Second Index}
\newindex{B}{}{}{Third index}
\addattributetoindex{title}{First Index}
\addattributetoindex[A]{head}{Second Index}
\addattributetoindex[glossary]{title}{A Glossary}
These words are in the glossary
\glossary{G1}1\glossary{G2}2
\glossary{G1}3\index[glossary]{G2}4
These are in the second index
\index[A]{G1}1\index[A]{G2}2
\index[A]{G1}3\index[A]{G2!G3}4

Translation

<theindex head='Second Index' title='Second Index'>
<index target='uid24 uid26' level='1'>G1</index>
<index target='uid25' level='1'>G2</index>
<index target='uid27' level='2'>G3</index>
</theindex>
<theglossary title='A glossary'>
<index target='uid20 uid22' level='1'>G1</index>
<index target='uid21 uid23' level='1'>G2</index>
</theglossary>
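The way entries of the second index are merged in this translation (one element per key, all targets collected, and a level that follows the `!´ separator) can be modelled by a short Python sketch. The function name `build_index` is hypothetical, the uid values are copied from the translation above, and sorting the keys alphabetically is an assumption; the actual ordering rules of Tralics are not described here.

```python
def build_index(entries):
    """Merge (key, target) pairs into (label, level, targets) rows.

    A key such as "G2!G3" denotes the sub-entry G3 of G2: its level is 2,
    and the parent entry G2 is created if it does not exist yet.
    """
    index = {}
    for key, target in entries:
        path = tuple(key.split("!"))
        for depth in range(1, len(path)):
            index.setdefault(path[:depth], [])
        index.setdefault(path, []).append(target)
    # Assumption: rows are sorted by key path.
    return [(path[-1], len(path), " ".join(targets))
            for path, targets in sorted(index.items())]

if __name__ == "__main__":
    second_index = [("G1", "uid24"), ("G2", "uid25"),
                    ("G1", "uid26"), ("G2!G3", "uid27")]
    for label, level, targets in build_index(second_index):
        print("<index target='%s' level='%d'>%s</index>" % (targets, level, label))
```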

10.14. Additional Commands, 2007/12/31

A lot of commands defined by the LaTeX kernel have been added to Tralics.

Translation of the following commands is an empty element whose name is the same as that of the command.

1 \clearpage \cleardoublepage \newpage \hrulefill \dotfill \samepage

These commands are ignored

2 \offinterlineskip \nointerlineskip \frenchspacing \nonfrenchspacing
3 \showoverfull \loggingoutput \showoutput \nofiles \sloppy \fussy
4 \onecolumn \twocolumn \flushbottom \raggedbottom \normalmarginpar
5 \reversemarginpar \normalbaselines \removelastskip

The following commands take an argument and construct a box. The name of the box is <line>. The box has an attribute rend that is respectively left, center, right, llap and rlap. The first three commands are ignored in a figure or table, and start a new paragraph if they appear in vertical mode. In the case of \marginpar, the box has the same name as the command.

6 \leftline \centerline \rightline \llap \rlap \marginpar

The following commands take one argument, and do nothing else.

7 \showhyphens \includeonly

The following commands are references to glue (rubber length). Unless indicated otherwise, the glue is initialised to zero.

8 \topsep \partopsep \@tempskipa \@tempskipb \@flushglue \listparindent
9 \hideskip \z@skip \skip@ \normalbaselineskip \normallineskip \smallskipamount
10 \medskipamount \bigskipamount \floatsep \textfloatsep \intextsep
11 \dblfloatsep \dbltextfloatsep
12  
13 \@flushglue = 0pt plus 1fil
14 \hideskip =-1000pt plus 1fill
15 \smallskipamount=3pt plus 1pt minus 1pt
16 \medskipamount=6pt plus 2pt minus 2pt
17 \bigskipamount=12pt plus 4pt minus 4pt

The following definitions are used for the float placement algorithm.

18  \def\textfraction{.2}
19  \def\floatpagefraction{.5}
20  \def\dblfloatpagefraction{.5}
21  \def\bottomfraction{.3}
22  \def\dbltopfraction{.7}
23  \def\topfraction{.7}

The following commands are equivalent to \relax in Tralics. They are used by LaTeX to separate figures from text, and could be redefined as zero-height rules.

24 \topfigrule \botfigrule \dblfigrule

The following commands are references to dimensions. Unless specified otherwise, the value is zero.

25 \paperheight \paperwidth \headheight \headsep \jot
26 \footskip \marginparwidth \marginparsep \marginparpush
27 \tabcolsep\arraycolsep\footnotesep\doublerulesep\arrayrulewidth
28 \@tempdima \@tempdimb \@tempdimc\topmargin\dimen@i\dimen@ii
29  
30 \paperheight=297mm \paperwidth=210mm \jot=3pt \maxdimen=16383.99999pt

The following commands are references to counters. Unless specified otherwise, the value is zero.

31 bottomnumber topnumber dbltopnumber  totalnumber
32 \@tempcnta \@tempcntb \interfootnotelinepenalty \interdisplaylinepenalty

The following commands are references to box registers. The first box should remain empty.

33 \voidb@x \@tempboxa

The command \fmtname holds the current format name. It is Tralics for Tralics.

The two commands \sbox and \savebox read a box number and a box content, and fill the box. The \savebox command takes some optional arguments.

34 \setlength{\unitlength}{1pt}
35 \sbox0{1A\bf b}
36 \savebox1{2A\bf b}
37 \savebox2(3,4){3A\bf b}
38 \savebox3(3,4)[c]{4A\bf b}
39 \savebox4[40pt]{5A\bf b}
40 \savebox{5}[40pt][c]{6A\bf b}
41 \newsavebox\Nsbox
42 \savebox\Nsbox[40pt][c]{7A\bf b}
43 \box0\box1\box2\box3\box4\box5\usebox\Nsbox

Translation, where bold face font is shown as BF.

44 <mbox>1A<BF>b</BF></mbox>
45 <mbox>2A<BF>b</BF></mbox>
46 <pic-framebox width='3' height='4'>3A<BF>b</BF></pic-framebox>
47 <pic-framebox width='3' height='4' position='c'>4A<BF>b</BF></pic-framebox>
48 <mbox width='40.0pt'>5A<BF>b</BF></mbox>
49 <mbox width='40.0pt' position='c'>6A<BF>b</BF></mbox>
50 <mbox width='40.0pt' position='c'>7A<BF>b</BF></mbox>
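As a usage sketch in ordinary LaTeX, a box can be filled once and reused several times. The box name \mybox is hypothetical, and the comments describe the behaviour of LaTeX; the corresponding Tralics translation has not been checked here.

```latex
\documentclass{article}
\newsavebox{\mybox}                 % hypothetical name: allocates a box register
\begin{document}
\sbox{\mybox}{A \textbf{b}}         % fill the box at its natural width
\usebox{\mybox} and \usebox{\mybox} % insert two copies; the box is not emptied
\savebox{\mybox}[40pt][c]{wide}     % refill: 40pt wide, content centred
\usebox{\mybox}
\end{document}
```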

There are different ways to manipulate lists of tokens; some of them use Lisp names, like those described here. The \@nil command is undefined, it is used as end-of-list marker. The \@nnil command is a macro without argument whose expansion is \@nil; it is in general used in comparisons. Both commands \@car and \@cdr read a token list terminated by \@nil, they return the first token, or the remaining ones.

In Lisp, cons(A,B) produces a list whose car is A and whose cdr is B; in other words, it adds a list element A in front of a list B. In TeX most functions add material at the end of the list, for instance \addto@hook; this is a command that takes as first argument a reference to a token list register, it appends the second argument to the end of the list. The command \g@addto@macro behaves the same, but the first argument is the name of a command without argument. The g in the command name means that the command is globally modified. The command \@cons behaves in a similar fashion, but the result is fully expanded, moreover, the \@elt token is added at the end of the initial list.

51 \def\test#1{\def\res{#1}\ifx\foo\res\else \ERROR\fi}
52 \edef\foo{\@car 123\@nil} \test{1}
53 \edef\foo{\@car {1}23\@nil} \test{1}
54 \edef\foo{\@car {123}{456}{7}\@nil} \test{123}
55 \edef\foo{\@cdr 123\@nil} \test{23}
56 \edef\foo{\@cdr {134}{x}\@nil} \test{x}
57 \edef\foo{\@cdr {134}{{x}}\@nil} \test{{x}}
58 \edef\foo{\@carcube1234567\@nil}\test{123}
59 \def\foo{\@nil} \ifx\foo\@nnil\else \ERROR\fi
60 \toks@={abc\foo}\addto@hook\toks@{x\bar}
61 \expandafter\def\foo\expandafter{\the\toks@} \test{abc\foo x\bar}
62 \g@addto@macro\foo{y\gee} \test{abc\foo x\bar y\gee}
63 \def\xx{456}
64 \def\foo{123}\@cons\foo{\xx78}\test{123\@elt45678}
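The following sketch shows a typical use of \g@addto@macro in ordinary LaTeX: items are accumulated in a macro and typeset later. The macro name \mylist is hypothetical.

```latex
\documentclass{article}
\makeatletter
\newcommand\mylist{}                 % hypothetical macro, initially empty
\g@addto@macro\mylist{\item one}     % append tokens globally, without expansion
\g@addto@macro\mylist{\item two}
\makeatother
\begin{document}
\begin{itemize}\mylist\end{itemize}  % typesets the two items in order
\end{document}
```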

The command \@removeelement takes three arguments A, B and C. The last argument must be a command name or an active character. The second argument is a comma-separated list of items; A is removed from B, and the result is put in C. If the list B is x, y, z, you must take into account that the second item in the list is not y, but space+y; this means that spaces around commas should be removed first, for instance using \zap@space. As the example below shows, \zap@space removes all spaces until finding a space followed by \@empty. The \strip@prefix command strips the prefix produced by \meaning for a macro; in other words, all tokens up to a greater-than sign. The expansion of \@expandtwoargs {\foo} {\bar} {\gee} is \foo {barval} {geeval} (the last two arguments are fully expanded).

65 \def\RM#1#2{\@expandtwoargs\@removeelement{#1}{#2}#2}
66 \def\testfoo#1{\def\xfoo{#1} \ifx\foo\xfoo\else bad \fi} %% test function
67  
68 \edef\foo{\zap@space 1 2 345 \@empty 6 7\strip@prefix 1134>89}
69              \testfoo{123456 789}
70 \def\foo{A,B C,D,E F}
71 \RM{D}\foo   \testfoo{A,B C,E F}
72 \RM{D}\foo   \testfoo{A,B C,E F}
73 \RM{B}\foo   \testfoo{A,B C,E F}
74 \RM{B C}\foo \testfoo{A,E F}
75 \RM{A}\foo   \testfoo{E F}
76 \RM{E F}\foo \testfoo{}
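The advice about spaces can be sketched as follows, assuming the LaTeX kernel definitions of \zap@space and \@removeelement. The helper \zapall and the macros \raw and \clean are hypothetical names introduced for this illustration.

```latex
\documentclass{article}
\makeatletter
\def\zapall#1{\zap@space#1 \@empty}   % hypothetical helper: remove all spaces in #1
\def\raw{x, y, z}                     % list with spaces after the commas
\edef\clean{\expandafter\zapall\expandafter{\raw}}
\typeout{clean=\clean}                % expected: clean=x,y,z
\@expandtwoargs\@removeelement{y}{\clean}\clean
\typeout{clean=\clean}                % expected: clean=x,z
\makeatother
\begin{document}\end{document}
```

Without the call to \zapall, the item to remove would have to be written as space+y, as explained above.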

Look at the HTML documentation if you do not understand why braces are in the first list but not the second (the \do command is generally \@makeother).

77 \def\@makeother#1{\catcode`#1=12\relax}
78 \def\dospecials{\do\ \do\\\do\{\do\}\do\$\do\&\do\#\do\^\do\_\do\%\do\~}
79 \def\@sanitize{\@makeother\ \@makeother\\\@makeother\$\@makeother\&%
80    \@makeother\#\@makeother\^\@makeother\_\@makeother\%\@makeother\~}

Assume that \foo is a command that takes an optional argument and a mandatory one and calls another command defined like \def\fooaux[#1]#2{...}; you can say \def \foo {\@testopt\fooaux{val}} if val is the default value of the optional argument. The command \@testopt reads two arguments A and B, and checks whether a bracket follows (and for this reason is not robust); if there is one, the result is A, otherwise A[B].
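A minimal sketch of this pattern in ordinary LaTeX follows; the names \greet and \greet@aux are hypothetical.

```latex
\documentclass{article}
\makeatletter
\def\greet{\@testopt\greet@aux{Hello}}  % hypothetical command; default value is Hello
\def\greet@aux[#1]#2{#1, #2!}
\makeatother
\begin{document}
\greet{world}       % no bracket follows: becomes \greet@aux[Hello]{world}

\greet[Bye]{world}  % a bracket follows: becomes \greet@aux[Bye]{world}
\end{document}
```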

The commands \pagestyle, \thispagestyle and \pagenumbering are not interpreted by Tralics; they take an argument and construct a <pagestyle> element. Example

81 \pagenumbering{arabic} \pagestyle{mypagestyle}\thispagestyle{plain}

Translation

82 <pagestyle numbering='arabic'/>
83 <pagestyle style='mypagestyle'/>
84 <pagestyle this-style='plain'/>

The \@typeset@protect command is \relax; this is the value of \protect when typesetting text. The \@ident command is another name for \@firstofone, it takes an argument and returns it. The \on@line command can be used when signaling error; its expansion could be on input line 17. There is no difference between \reset@font and \normalfont. The \@thirdofthree command takes three arguments, expansion is the third.

The \usebox command takes an argument, that should expand to a box number, the effect is to leave vertical mode and insert a copy of that box.

The two commands \brack and \brace behave like \over; their usage is deprecated. See the HTML documentation for details.

The command \two@digits reads a number N. Its expansion is 0N, if N is less than ten and N otherwise. The translation of the first line below is 14:03. The command is not overly robust: on the second line the space before the digit 4 is gobbled as end marker of the number N. Note that LaTeX scans the number twice, translation is 034 and 0234, while Tralics scans the number once, and translation is 034 and 234.

85 \day=14 \month=3 \two@digits{\the\day}:\two@digits{\the\month}.
86 \two@digits{3} 4 and \two@digits{2}3 4
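A typical use of \two@digits is padding the fields of a date. The sketch below is ordinary LaTeX; the command name \isodate is hypothetical.

```latex
\documentclass{article}
\makeatletter
% hypothetical command: current date in the form YYYY-MM-DD
\newcommand\isodate{\the\year-\two@digits{\the\month}-\two@digits{\the\day}}
\makeatother
\begin{document}
\isodate  % for instance 2007-03-14 when \year=2007, \month=3 and \day=14
\end{document}
```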

If you say \@addtoreset{footnote}{chapter}, then the footnote counter is reset whenever the chapter counter is incremented. The inverse command \@removefromreset is provided by the remreset package. The effect of the \listfiles command is to remember the information gathered by \ProvidesXXX and print it at the end of the run to the transcript file and the terminal, for instance as

87  *File List*
88  article.clt   2006/08/19 v1.0 article document class for Tralics
89      std.clt   2006/08/19 v1.0 Standard LaTeX document class, for Tralics
90    comma.plt   2007/12/29 v1.0 Insert commas every three digits (DPC)
91 checkend.plt   2007/12/14 v1.0 Checks for end environments
92   bbding.plt   2007/12/14 v1.0 Dingbats symbols
93 abstract.plt   2007/12/09 v1.1 configurable abstracts
94   keyval.plt   2007/12/08 v1.1 key=value parser for Tralics (DPC)
95     html.plt   2007/12/05 v1.0 Hypertext commands for latex2html
96 nopageno.plt   2007/12/31 v1.0 no page numbers
97    dummy.txt   2007/12/23 v1.0 Dummy file for Tralics
98  ***********
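The following sketch combines \listfiles and \@addtoreset in an ordinary LaTeX document. It uses the equation and section counters rather than footnote and chapter, so that it works with the article class.

```latex
\listfiles                           % request the file list at the end of the run
\documentclass{article}
\makeatletter
\@addtoreset{equation}{section}      % equation restarts whenever section is stepped
\makeatother
\renewcommand\theequation{\thesection.\arabic{equation}}
\begin{document}
\section{One} \begin{equation} a=b \end{equation}  % numbered (1.1)
\section{Two} \begin{equation} c=d \end{equation}  % numbered (2.1)
\end{document}
```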

Assume that you want to use a command conditionally; you can write \ifnum0=0 \foo \else \bar \fi. This works only if the command takes no argument; otherwise you must use something more complicated, like inserting a number of \expandafter tokens. The two commands \@afterelsefi and \@afterfi can be placed at the start of the then-part or else-part; the effect is to read all relevant tokens (until \else or \fi), discard the unwanted ones (those between \else and \fi, if the condition is true), terminate the condition, and re-insert the tokens. Example

99 \def\xfoo#1#2{\def\testa{x#1#2}}
100 \def\yfoo#1#2{\def\testb{y#1#2}}
101 \def\test#1{\ifnum0=#1 \@afterelsefi\xfoo u \else\@afterfi\yfoo v\fi}
102  
103 \test0a \test1b
104  
105 \def\testA{xua}\def\testB{yvb}
106 \ifx\testa\testA\else\bad\fi
107 \ifx\testb\testB\else\bad\fi
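Helpers of this kind are commonly written as delimited macros. The sketch below shows one such formulation under hypothetical names (\my@afterelsefi, \my@afterfi, \pick); it is an illustration of the technique and makes no claim about how Tralics implements the two commands.

```latex
\documentclass{article}
\makeatletter
% hypothetical stand-ins, not the Tralics implementation
\long\def\my@afterelsefi#1\else#2\fi{\fi#1}  % keep the then-part, drop the else-part
\long\def\my@afterfi#1\fi{\fi#1}             % keep the tokens up to \fi
\def\pick#1{\ifnum0=#1 \my@afterelsefi\@firstoftwo
            \else\my@afterfi\@secondoftwo\fi}
\makeatother
\begin{document}
\pick0{yes}{no} \pick1{yes}{no}  % typesets: yes no
\end{document}
```

Because the conditional is closed before the kept tokens are re-inserted, \@firstoftwo and \@secondoftwo can read their arguments from the text that follows \pick. Note that this simple form does not handle a nested conditional inside the kept tokens.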

10.15. LaTeX font support, 2007/12/31

We describe in this section a lot of commands defined by the LaTeX kernel, concerning fonts. A great number of them provoke an error.

We start with some easy commands: the name is misleading, the value of \@vpt is the number 5, not the dimension 5pt.

108 \def\@vpt{5} \def\@vipt{6} \def\@viipt{7} \def\@viiipt{8} \def\@ixpt{9}
109 \def\@xpt{10} \def\@xipt{10.95} \def\@xiipt{12} \def\@xivpt{14.4}
110 \def\@xviipt{17.28} \def\@xxpt{20.74} \def\@xxvpt{24.88}

In LaTeX, a font is characterised by five parameters: encoding, family, series, shape and size. A call of the form \fontsize \@xpt \@xiipt says to use a ten point font with 12pt as baselineskip. The command \fontsize is implemented in Tralics to ignore its two arguments; you should use commands of the form \large if you want to change the font size. The four commands \fontencoding, \fontfamily, \fontseries, \fontshape take one argument that evaluates to a character string. The encoding could be T1, OT1, etc.; it is ignored by Tralics. The font families recognised are cmr, ptm, cmss, phv, cmtt, and pcr (cm stands for computer modern, p for Adobe Postscript); they correspond to \rmfamily, \sffamily and \ttfamily. Recognised shapes are n, it, sl, and sc (normal, italic, slanted and small caps). Recognised series are m, b, bx, sb and c; they correspond to medium, bold, bold extended, semi bold, and condensed. The commands described here store the values somewhere. They will be used if you call \selectfont, either directly, or indirectly via commands like \itshape. The command \usefont takes four arguments, encoding, family, series, shape, and selects the font. Example

111 {\fontsize{10pt}{12pt}
112  \usefont{T1}{phv}{bx}{it} B
113  \fontseries{sb} C \selectfont D
114  \fontshape{sc}\selectfont E
115  \fontfamily{cmtt}\fontencoding{OT1}\selectfont F }

Translation, using a configuration file where font attributes are packed, via xml_pack_font_att=“true”.

116 <p><hi rend='it,sansserif,boldextended'>B
117  C </hi><hi rend='it,sansserif,semibold'>D
118 </hi><hi rend='sc,sansserif,semibold'>E
119 </hi><hi rend='sc,tt,semibold'>F </hi></p>
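For comparison, here is a sketch of the same font selected with low-level and with high-level commands. In standard LaTeX with the Computer Modern defaults both lines select the same font; since the families listed above correspond to \rmfamily, \sffamily and \ttfamily, one would expect Tralics to produce the same rend attributes for both, but this has not been checked here.

```latex
\documentclass{article}
\begin{document}
{\fontfamily{cmss}\fontseries{bx}\fontshape{n}\selectfont low-level selection}

{\sffamily\bfseries\upshape high-level selection}
\end{document}
```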

The following commands define the default font.

120 \def\encodingdefault{T1}
121 \def\familydefault{cmr}
122 \def\seriesdefault{m}
123 \def\shapedefault{n}

Some shorthands: there is no difference between \symbol{48} and \char48\relax. In the same way \newfont {\foo} {bar} is the same as \font \foo= bar\relax.

All commands given here are defined in Tralics, with the number of arguments shown, but provoke an error.

124 \TextSymbolUnavailable\texteuro
125 \DeclareMathVersion{normal}
126 \DeclareMathDelimiter{\bracevert}
127 \DeclareTextCommandDefault{\textasciitilde}{\~{}}
128 \ProvideTextCommandDefault{\textflorin}{\textit{f}}
129 \DeclareTextSymbolDefault{\textmu}{TS1}
130 \UseTextSymbol{TS1}{\tc@fake@euro}
131 \UndeclareTextCommand{\textsterling}{OT1}
132 \DeclareFontEncodingDefaults{\relax}{\def\accentclass@{7}}
133 \DeclareSizeFunction{sgenb}{\genb@sfcnt\@font@info}
134 \DeclareSymbolFontAlphabet{\mathrm}{operators}
135 \DeclareTextFontCommand{\textrm}{\rmfamily}
136 \DeclareTextAccent{\capitalcircumflex}{TS1}{2}
137 \DeclareTextSymbol{\textflorin}{TS1}{140}
138 \DeclareFontFamily{T1}{lcmtt}{\hyphenchar\font\m@ne}
139 \DeclareFontEncoding{U}{}{\noaccents@}
140 \DeclareOldFontCommand{\bf}{\normalfont\bfseries}{\mathbf}
141 \DeclareTextCompositeCommand{\^}{OT1}{i}{\^\i}
142 \DeclareTextComposite{\^}{T1}{i}{238}
143 \DeclareFontSubstitution{OML}{cmm}{m}{it}
144 \DeclareMathAccent{\breve}{\mathalpha}{operators}{"15}
145 \DeclareMathSymbol\Join    {\mathrel}{lasy}{"31}
146 \DeclarePreloadSizes{OT1}{cmr}{m}{n}{5,7,10}
147 \DeclareMathSizes{34.4}{34.4}{28.66}{23.89}
148 \DeclareErrorFont{OT1}{cmr}{m}{n}{10}
149 \DeclareSymbolFont{lasy}{U}{lasy}{m}{n}
150 \DeclareMathAlphabet{\mathbf}{OT1}{cmr}{bx}{n}
151 \DeclareMathRadical{\sqrtsign}{symbols}{"70}{largesymbols}{"70}
152 \DeclareFontShape{OT1}{cmr}{bx}{n}
153    {%
154       <5><6><7><8><9>gen*cmbx%
155       <10><10.95>cmbx10%
156       <12><14.4><17.28><20.74><24.88>cmbx12%
157       }{}
158 \DeclareFixedFont{\svtnsy}{OMS}{cmsy}{m}{n}{\@xviipt}
159 \SetSymbolFont{lasy}{bold}{U}{lasy}{b}{n}
160 \SetMathAlphabet\mathsf{bold}{OT1}{cmss}{bx}{n}
161 \UseTextAccent{OT1}{\"}{i}
162 \@setfontsize\footnotesize\@xpt{12.3}%

10.16. Key-val, 2008/01/27

This section describes some commands provided by the keyval and xkeyval packages. Most commands are written in C++, those specific to xkeyval are entered in the hash table (i.e. can be used) via the use of \tralics@boot@keyval.

The command \tralics@split takes four arguments, say P, A, B and L. The last argument is a list of key-value pairs, for instance u=v,w. Spaces are ignored around commas and equal signs. For each pair the command A is applied if a value is given and B otherwise; the token list P is added before the key. In the example that follows the \ifx test is true; in other words, the expansion of \tralics@split on line 2 is shown on line 3.

1 \def\Edef#1{\expandafter\def\expandafter#1\expandafter}
2 \Edef\fooa{\tralics@split{L@}\A\B{u=v,w,, U = V}}
3 \def\foob{\A {L@u}{v}\B {L@w}\A {L@U}{V}}
4 \ifx\fooa\foob\else BUG \fi
5 \def\setkeys#1{\tralics@split{KV@#1@}\KV@normal\KV@default}

The keyval package provides four commands; one of these is called in case of error. A second one is \setkeys, as defined above. This command takes two arguments, and the following lines are equivalent

6 \setkeys{fam}{u=v,w}
7 \KV@normal{KV@fam@u}{v}\KV@default{KV@fam@w}

The command \KV@normal takes two arguments, a command name and an argument list, it applies the command to the list if defined, and provokes an error otherwise. The command \KV@default takes a single argument, a command name, adds @default, and calls this command if it exists. If we assume all commands defined, line 7 is the same as

8 \KV@fam@u{v}\KV@fam@w@default

The command \define@key can be used to define the two commands needed above; after the following two definitions, the code on line 7 will print In u, value=v and In w, value=None.

9 \define@key{fam}{u}{\typeout{In u, value=#1}}
10 \define@key{fam}{w}[None]{\typeout{In w, value=#1}}
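As a sketch of how these pieces fit together in ordinary LaTeX, the following document defines a command with a key-value optional argument. The family mybox, the command \mybox and the macros \mb@width and \mb@frame are hypothetical names.

```latex
\documentclass{article}
\usepackage{keyval}
\makeatletter
\define@key{mybox}{width}{\def\mb@width{#1}}        % key without default value
\define@key{mybox}{frame}[true]{\def\mb@frame{#1}}  % key with default value true
\newcommand\mybox[2][]{%
  \setkeys{mybox}{width=2cm,frame=false,#1}%  initial values, then the user's keys
  \typeout{width=\mb@width, frame=\mb@frame}#2}
\makeatother
\begin{document}
\mybox[width=5cm,frame]{text}  % transcript: width=5cm, frame=true
\end{document}
```

Setting the initial values in the same \setkeys call keeps the command self-contained: keys given by the user come later in the list and therefore override them.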

If you load the xkeyval package, the commands described above are redefined (it is hence unwise to use both packages). You should read the xkeyval documentation for additional information. The package adds some flexibility to the command name; in the case of \KV@fam@u you can change the prefix KV, the family name fam and the key name u; you can also omit the prefix or the family. The following code shows what commands are defined:

11 \define@key{fam}{keyA}{}\ifcsname KV@fam@keyA\endcsname\else \bad\fi
12 \define@key[]{xx}{keyA}{}\ifcsname xx@keyA\endcsname\else \bad\fi
13 \define@key{}{keyA}{}\ifcsname KV@keyA\endcsname\else \bad\fi
14 \define@key[]{}{keyA}{}\ifcsname keyA\endcsname\else \bad\fi
15 \define@key[my]{fam}{keyA}[]{}\ifcsname my@fam@keyA\endcsname\else \bad\fi
16     \ifcsname my@fam@keyA@default\endcsname\else \bad\fi
17 \define@key[my]{}{keyA}{}\ifcsname my@keyA\endcsname\else \bad\fi
18 \ifcsname my@keyA@default\endcsname\bad\fi

This piece of code shows how the commands are defined.

19 \define@key{family}{keyA}{The input is #1}
20 \define@key{family}{keyB}[none]{The input is #1}
21 \def\foo#1{The input is #1} % \foo==\KV@family@keyA
22 \def\bar{\KV@family@keyB{none}} % \bar==\KV@family@keyB@default

You can say \define@cmdkey. This defines a key that saves the value in a command, and may perform some additional action. We give here two examples, first with a default prefix, then with MP@. In each case, the effect is the same as the two lines that follow. The default value (x or y) is optional, if omitted the default command is not created.

23 \define@cmdkey[xKV]{fam}{keyA}[x]{code #1}
24 %\def\xKV@fam@keyA#1{\def \cmdxKV@fam@keyA {#1}code #1}
25 %\def\xKV@fam@keyA@default{\xKV@fam@keyA {x}}
26 \define@cmdkey[xKV]{fam}[MP@]{keyB}[y]{code #1}
27 %\def\xKV@fam@keyB#1{\def \MP@keyB {#1}code #1}
28 %\def\xKV@fam@keyB@default{\xKV@fam@keyB {y}}
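A short usage sketch with the xkeyval package in ordinary LaTeX follows; the family plot, the key color and the macro prefix pl@ are hypothetical.

```latex
\documentclass{article}
\usepackage{xkeyval}
\makeatletter
\define@cmdkey{plot}[pl@]{color}[black]{}  % value is stored in \pl@color, no extra code
\setkeys{plot}{color=red}
\typeout{color=\pl@color}                  % expected: color=red
\setkeys{plot}{color}                      % no value given: the default is used
\typeout{color=\pl@color}                  % expected: color=black
\makeatother
\begin{document}\end{document}
```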

You can use \define@cmdkeys. This defines a sequence of keys that saves the value; there is no additional action, but the syntax is otherwise the same (all three arguments in brackets are optional).

29 \define@cmdkeys[xKV]{fam}[MP@]{keyD,keyE}[z]
30 %\def\xKV@fam@keyD#1{\def \MP@keyD {#1}}
31 %\def\xKV@fam@keyE#1{\def \MP@keyE {#1}}
32 %\def\xKV@fam@keyD@default{\xKV@fam@keyD {z}}
33 %\def\xKV@fam@keyE@default{\xKV@fam@keyE {z}}

You can use \define@choicekey. This defines a choice key. The syntax is the following: the first three arguments define the key. Then comes an optional argument formed of zero, one or two arguments, followed by the list of allowed values, followed by the optional default value followed by the code.

34 \define@choicekey*[KV]{fam}{keyC}[\val\nr]{a,b}[w]{#1}
35 %\def\KV@fam@keyC#1{\XKV@cc*[\val \nr ]{#1}{a,b}{#1}}
36 %\def\foo{\KV@fam@keyC {w}}\isfoo\KV@fam@keyC@default
37 \define@choicekey*+[KV]{fam}{keyC}[\val\nr]{a,b}{#1}{=#1}
38 %\def\KV@fam@keyC#1{\XKV@cc*+[\val \nr ]{#1}{a,b}{#1}{=#1}}

The magic command is \XKV@cc. It takes four or five arguments, bin (optional), value (the value of the key), allowed (a comma-separated list of tokens), code, and maybe badcode; there are two prefixes, plus and star, and the plus prefix says how many arguments are read. If the star prefix is used, then the argument and allowed values are converted to lower case letters. If the key value is not in the list, an error is signaled, unless the plus prefix is used, in which case badcode is executed. Otherwise code is executed. If the bin is not empty, it should contain one or two definable commands; the value of the key is stored in the first command (possibly after conversion into lower case); its index is stored in the second command if possible. In other words, if the key value is a in the example above, then \nr will hold 0; if the key value is b, it will hold 1.
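The behaviour of the bin argument can be sketched with the xkeyval package in ordinary LaTeX. The family fig and the key align are hypothetical; \val and \nr play the same role as in the example above.

```latex
\documentclass{article}
\usepackage{xkeyval}
\makeatletter
\define@choicekey*{fig}{align}[\val\nr]{left,center,right}{%
  \typeout{align=\val, index=\nr}}
\setkeys{fig}{align=Center}   % expected: align=center, index=1
\makeatother
\begin{document}\end{document}
```

The star prefix makes the comparison case-insensitive, and the index counts from zero, so center is number 1 in the list.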

You can use \define@boolkey. This is like a choice key, with two choices, true and false. The star prefix is implied: a lower case version of the key is always used. In the code below, you can see the \csname command. It starts with some name (here KV@fam@shadow, but it is my@frame for the example on the last line). Call this foo; the boolean \iffoo is constructed. The \csname sets the boolean, by calling \footrue or \foofalse; the user-defined code can use it.

The first line is the same as the two other ones.

39 \define@boolkey+{fam}{shadow}{B#1}{C#1}
40 %\def\KV@fam@shadow  #1{\XKV@cc*+[\XKV@resa ]{#1}{true,false}
41 % {\csname KV@fam@shadow\XKV@resa \endcsname B#1}{C#1}}
42 %% \define@boolkey{fam}[my@]{frame}{A#1}
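A usage sketch with the xkeyval package in ordinary LaTeX follows; the family fig, the key frame and the prefix fig@ are hypothetical.

```latex
\documentclass{article}
\usepackage{xkeyval}
\makeatletter
\define@boolkey{fig}[fig@]{frame}[true]{}    % creates the boolean \iffig@frame
\setkeys{fig}{frame=False}                   % the value is lower-cased before checking
\typeout{\iffig@frame framed\else plain\fi}  % expected: plain
\setkeys{fig}{frame}                         % no value given: the default true is used
\typeout{\iffig@frame framed\else plain\fi}  % expected: framed
\makeatother
\begin{document}\end{document}
```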

You can use \define@boolkeys. It defines more than one key. The plus prefix is forbidden and no code can be given: the effect of setting the key is just to set the boolean. In the example that follows, no error should be signaled.

43 \define@boolkey{fam}{A}{\xdef\foo{\ifKV@fam@A Atrue\else Afalse\fi}}
44 \define@boolkeys{fam}{B,C}
45 \def\Test{Atrue}
46  
47 \setkeys{fam}{A=true,B=false,C=True}
48 \ifx\foo\Test \ifKV@fam@B\else \ifKV@fam@C \let\bad\relax\fi\fi\fi
49 \bad

It is possible to disable a key via \disable@keys; the example below will disable the keys keya, keyb and keyc in the family fam (with prefix my); it is an error to disable an undefined key; otherwise this redefines the key to produce a warning when used. It is possible to check via \key@ifundefined that a key exists in a list of families. The next example should print `key defined´ if the key is defined in one of the families, and `key undefined´ otherwise. The command \XKV@tfam holds the last family checked; this is the first family in which the key is defined in case of success, the last element of the family list otherwise; in the special case where the family list is empty, the key is undefined and the macro is empty.

50 \disable@keys[my]{fam}{keya,keyb,keyc}
51 \key@ifundefined[my]{familya,familyb}{keya}
52    {\typeout{key undefined}}{\typeout{key defined}}

The command \setkeys sets a sequence of keys. Arguments are an optional prefix (default is KV), followed by a list of families and a list of key-value pairs. For each pair, all families are looked at, and the definition of the first family is considered. In the example that follows, the first \setkeys produces aAabBb, and signals an undefined key error for keyd. In the second case, an error is signaled because keyb has no default value, but CV is used as default value for keyc. The third line shows nesting, it gives: `caa and bacb and cb´. We show two more examples where a star is after the command name; in this case no error is signaled if a key is not found in the list; instead \XKV@rm will contain the list of undefined keys. Finally, we show that the command can have an additional parameter, that is a list of keys to ignore. The command \setrmkeys is like \setkeys but it sets the keys from \XKV@rm. In the example, it is assumed to set keye and keyf in family cc; this will fail, and since the starred version is used, the result is stored back in \XKV@rm. On the second try, we use the same command to set all these keys, with the exception of keyg. Note that \setkeys and \setrmkeys accept a plus option (to be put after the star, if you want both options); this says that if a key is found in more than one family, it should be defined in all families.

53 \define@key[X]{familya}{keya}{a#1a}
54 \define@key[X]{familyb}{keyb}{b#1b}
55 \define@key[X]{familyb}{keyc}[CV]{c#1c}
56 \define@key[X]{familyc}{keye}{c#1e}
57 \define@key[X]{familyc}{keyf}{c#1f}
58  
59 \setkeys[X]{familya,familyb}{keya=A,keyb=B,keyd=D}
60 \setkeys[X]{familyb}{keyb,keyc}
61 \setkeys[X]{familyb}{keyc=a\setkeys[X]{familya}{keya=~and b},keyb=~and c}
62 \setkeys*[X]{familyb}{keyc,keyd,keye}
63 % \XKV@rm == {keyd,keye}
64 \setkeys*[X]{familya,familyb}[keya,keyd]{keyc,keyd,keye=1, keyf=2,keyg=3}
65 % \XKV@rm == {keye=1,keyf=2,keyg=3}
66 \setrmkeys*[X]{familycc}
67 \setrmkeys+[X]{familyc}[keyg]

When executing a key macro, six commands are defined; \XKV@prefix contains the prefix, \XKV@fams contains the list of families to search, \XKV@tfam contains the current family, \XKV@header contains the header which is a combination of the prefix and the current family, \XKV@tkey contains the current key name and \XKV@na contains the list of keys that should not be set. For technical reasons, the @ character has category code 11. Example:

68 \define@key[X]{familya}{keyc}{%
69 \edef\vars{prefix=\XKV@prefix, fams=\XKV@fams, this fam=\XKV@tfam,
70 header=\XKV@header,this key=\XKV@tkey, na=\XKV@na}}
71 \setkeys*[X]{familya,familyb}[keya,keyd]{keyc=x,keyd,keye=1, keyf= 2, keyg=3}
72 \show\vars
73     \vars=macro: ->prefix=X@, fams=familya,familyb,
74     this fam=familya, header=X@familya@,this key=keyc, na=keya,keyd.

The package provides a mechanism to save the value of a key in a variable. In the example below, we show the name of the variable; remember that the prefix my is optional, default value is KV. The difference between \savevalue and \gsavevalue is that the latter saves the value globally.

75 {
76 \setkeys[my]{familya}{\savevalue{keya}=test1}
77 % \XKV@my@familya@keya@value is test1
78 \setkeys[my]{familya}{\gsavevalue{keya}=test2}
79 }
80 % \XKV@my@familya@keya@value is test2

The six functions described now take an optional prefix as argument, and a family, and optionally a key list. In the example, they work on the macro \XKV@my@familya@save; if the command starts with the letter g, the macro is globally modified, otherwise locally. This macro contains the list of the keys that should be automatically saved; this means that \savevalue is implicitly added; after execution of the first line the two lines that follow are identical; in the case of keyc, \gsavevalue is used instead. Line four adds keyb to the macro, as well as keyc (the old value of keyc with the global flag is discarded). The command \savekeys (or \gsavekeys) adds the lists of keys to the macro (unless already present), the command \delsavekeys (or \gdelsavekeys) removes the keys when present, and \unsavekeys (or \gunsavekeys) clears the macro.

81 \savekeys[my]{familya}{keya,\global{keyc}}
82 % \setkeys[my]{familya}{\savevalue{keya}=test5}
83 % \setkeys[my]{familya}{keya=test5}
84 \gsavekeys[my]{familya}{keyb,keyc}
85 \delsavekeys[my]{familya}{keyb}
86 \gdelsavekeys[my]{familya}{keyw}
87 \unsavekeys[my]{familya}
88 \gunsavekeys[my]{familya}

You can use a saved value by using the macro that holds the value; a simpler method consists in using \usevalue; this works only if the family is the same and the command is not hidden in braces. In the example that follows, the value of keyc in familya is ayz. We then give an example where the default value of a key uses a saved value.

89 \setkeys[my]{familya}{\savevalue{keya}=y}
90 \setkeys[my]{familya}{\savevalue{keyb}=\usevalue{keya}}
91 \setkeys[my]{familya}{keyc=a\usevalue{keyb}z}
92  
93 \define@key{fam}{keya}{keya: #1}
94 \define@key{fam}{keyb}[\usevalue{keya}Q]{keyb: #1}
95 \define@key{fam}{keyc}[\usevalue{keyb}R]{keyc: #1}
96 \setkeys{fam}{\savevalue{keya}=test}
97 \setkeys{fam}{\savevalue{keyb}}
98 \setkeys{fam}{keyc}

The command \presetkeys works the same as \savekeys with two exceptions. It takes two key lists instead of one, and these lists may contain key=value pairs. In the example the two macros \XKV@pre@fama@preseth and \XKV@pre@fama@presett are modified.

99 \presetkeys[pre]{fama}{keya, keyb=c}{Keya, Keyb=c, \savevalue{Keyc}}
100 \gpresetkeys[pre]{fama}{keya=1}{Keya=2}
101 \delpresetkeys[pre]{fama}{keya}{Keya}
102 \gdelpresetkeys[pre]{fama}{keya}{Keya}
103 \unpresetkeys[pre]{fama}
104 \gunpresetkeys[pre]{fama}

This is an example of presetting keys. We tell the system to set keya before the user keys, and keyb after that; these settings are skipped if the user specifies a key. The order of evaluation is important in this example because keyb uses a value saved by keya.

105 \define@key[my]{familya}{keya}{\typeout{keya: #1}}
106 \define@key[my]{familya}{keyb}{\typeout{keyb: #1}}
107 \define@key[my]{familya}{keyc}{\typeout{keyc: #1}}
108 \savekeys[my]{familya}{keya}
109 \presetkeys[my]{familya}{keya=blue}{keyb=\usevalue{keya}}
110 \setkeys[my]{familya}{keya=red}
111 \setkeys[my]{familya}{keyc=green}

The commands shown on the first three lines below can appear in a package or class file. When you declare an option with \DeclareOptionX (in package or class foo), you really declare a key in family foo.cls or foo.sty; the example shows the strange syntax to use if you want the family to be foo.bar. If no default value is given, an empty one is provided. The command \ExecuteOptionsX behaves like \setkeys (the same algorithm is used to get the family). This command is provided by the package writer in order to initialise the variables in the package; as a consequence, there are no presets, no list of keys to ignore, and no error should happen. The command \ProcessOptionsX sets the keys passed as arguments to the package or class. In the current version of Tralics a list of strings (the keys) is maintained for use with the commands without extension X. Mixing these two methods is not provided in version 2.11.5. This means that \ProcessOptionsX has no access to global class options, and if used in a class, does not pass these options to packages. Moreover an optional star is ignored.

112 \DeclareOptionX{opA}[def-val]{\def\opA{#1}}
113 \ExecuteOptionsX{keya,keyb=1}
114 \ProcessOptionsX \relax
115 % \DeclareOptionX[my]<foo.bar>{landscape}{\landscapetrue}
116 % \usepackage[opA,opB=C,opC=\foo,opE]{testkeyval}
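A complete package file using these commands could look as follows in ordinary LaTeX. This is a sketch: the package name testkeyval is taken from the last line above, but the file contents, the option opB and its initial value are assumptions made for the illustration.

```latex
% file testkeyval.sty (hypothetical contents)
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{testkeyval}[2008/01/27 v0.1 option demo]
\RequirePackage{xkeyval}
\DeclareOptionX{opA}[def-val]{\def\opA{#1}}   % key opA in family testkeyval.sty
\DeclareOptionX{opB}{\def\opB{#1}}            % key opB, empty default value
\ExecuteOptionsX{opA,opB=none}                % initial values
\ProcessOptionsX\relax                        % then apply the user's options
\endinput
```

With \usepackage[opB=C]{testkeyval} in a document, \opA would hold def-val (set by \ExecuteOptionsX) and \opB would hold C (set by \ProcessOptionsX).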

10.17. New file IO, 2008/02/10

The following options have been added. They all take an argument F; the dash in the name of the option is optional.

-input_file: This tells Tralics to consider F as the input file. The .tex suffix is added if not present. A .xml suffix is removed.
-output_file: This is the same as -o.
-o: This tells Tralics to put the resulting XML document in file F. The .xml suffix is added if not present.
-log_file: This tells Tralics to put tracing information in F. The .log suffix is added if not present.
-input_dir: Same as -input_path.
-input_path: The argument F is a sequence of directories, separated by colons; the current directory is marked by a dot, or an empty slot. If the current directory is not in the list, it will be added at the end.
-output_dir: This specifies where to put all resulting files (XML result, transcript file, etc.)

If the input file has the form foo/bar, and no input directory is given, the input path is set to foo followed by the current directory; if no output directory is given, then foo will be used instead.
