Tralics, a LaTeX to XML translator; Part I

6. Running Tralics

6.1. Introduction

There are a number of ways to alter the translation of your TeX document. One solution is to use a ult file: this is a TeX file that Tralics loads automatically before the main source. The file has the same name as the main source, with a different extension, and is in the same directory.

All other configuration files are searched for in a list of directories (the default being confdir). There are four such types. Files with extensions clt and plt are TeX files that contain code associated with classes and packages (the u in ult stands for user, the other letters are for LaTeX and Tralics).

The file .tralics_rc is known as the default configuration file; its use is considered obsolete. Configuration files of this kind consist of a sequence of subfiles, and a rule for choosing a Type, that is, either a subfile or an external file, for instance ra2007.tcf. The suffix tcf stands for Tralics configuration file; the structure and use of these files are explained below. The default value for the Type is the current document class. In the description of command line arguments below, some options are marked `Raweb only´; this means that they are meaningful only when the Type (after removal of trailing digits) is ra.

The tcf file defines the DOCTYPE: this is the second line of the XML output; if the doctype is foo+bar.dtd, this means that the dtd file is bar.dtd and the root element is <foo>. The DOCTYPE can also be given as a command line argument or in the TeX source using a special syntax.

The tcf file may contain a sequence of assignments. Some of them control the attributes of the root element, but in general they alter the names of XML elements and attributes. These names can also be given as command line arguments, or in the TeX source.

The tcf file may contain some TeX code. In fact, the file ra.tcf contains code that ought to be in ra.clt, and exists only for historical reasons.

Finally, a tcf file can contain a TitlePage block: this is a description of how commands like \maketitle can be translated using meta-data (title, author, keywords, etc) defined earlier.

6.2. The command line arguments

If you call Tralics without arguments, you will see something like

This is tralics 2.11.7, a LaTeX to XML translator, running on macarthur
Copyright INRIA/MIAOU/APICS 2002-2008, Jos\'e Grimm
Licensed under the CeCILL Free Software Licensing Agreement
Say tralics --help to get some help

In any case, the first three lines are printed. The version number may vary; we shall describe here the behavior of version 2.12 (released in April 2008). Command line arguments are read and interpreted from left to right. If an argument does not start with a hyphen, it is the name of the source file (only one input file is accepted); otherwise it is called an option. Some option names are shown with a hyphen; this hyphen is optional (in fact, dashes and underscores are ignored in option names), so that `-help´ and `--help´ are synonyms. Some options take no argument, for instance -version (whose effect is to print the version number and quit); others, for instance -input-file, take an argument. The argument is the character string that follows, preceded by an optional equals sign. Example:

tralics -foo = bar gee     #1
tralics -foo= bar gee
tralics -foo =bar gee
tralics -foo=bar gee       #4
tralics -foo = "bar gee"   #5
tralics -foo = bar\ gee
tralics -foo  bar\ gee
tralics -foo = " bar gee"  #8
tralics -foo = \ bar\ gee
tralics -foo  \ bar\ gee
tralics "-foo = bar gee"   #11
tralics -foo\ =\ bar\ gee\

We assume here that a command line interpreter (usually called a shell) reads the line you type, converts it into character strings, finds the executable program associated with the first string, and calls it with all these strings as arguments. There are five arguments on the first line (the first argument is the name of the program; it is currently ignored). We assume here that spaces can be inserted into an argument by either enclosing the string in quotes, or by escaping the space with a backslash, and that characters after a sharp sign are ignored. Assume that -foo is a Tralics option that takes a value; then the previous lines are interpreted as follows.

The first four examples differ only in the spaces around the equals sign. Cases 1 and 4 are equivalent: the argument of -foo is bar, and gee is a second command line argument. In case 2, the argument of -foo is empty, and bar and gee are two further command line arguments. In case 3, the optional equals sign is omitted, hence the argument is =bar, and gee is again a second command line argument. Thus you should put either no space at all or a space on each side of the equals sign.

The remaining examples show what happens if you put spaces in the argument. In cases 5, 6 and 7, the argument is bar-space-gee. In cases 8, 9 and 10, it is space-bar-space-gee. Lines 11 and 12 are the same, except for the trailing space. Since Tralics removes spaces before and after the equals sign, the argument is bar-space-gee (plus a trailing space in the last case).
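The rules above can be illustrated by a small Python sketch. It is not the code used by Tralics; the function names read_option and normalize are hypothetical, and the input is assumed to be the list of strings produced by the shell, without the program name.

```python
def normalize(name):
    """Drop dashes and underscores, which are ignored in option names."""
    return name.replace("-", "").replace("_", "")


def read_option(args):
    """Split `args` (shell words, the first one being an option that takes
    a value) into (option name, value, remaining words)."""
    head, rest = args[0], args[1:]
    if "=" in head:
        # Equals sign inside the word: spaces around it are removed,
        # a trailing space in the value is kept (cases 2, 4, 11, 12).
        name, value = head.split("=", 1)
        return normalize(name.strip()), value.lstrip(" "), rest
    if rest and rest[0] == "=":
        # A lone equals sign is skipped (cases 1, 5, 6, 8, 9).
        rest = rest[1:]
    if not rest:
        raise ValueError("missing value for option " + head)
    # Otherwise the next word is the value, even if it starts with '=' (case 3).
    return normalize(head), rest[0], rest[1:]


print(read_option(["-foo", "=", "bar", "gee"]))   # case 1
print(read_option(["-foo", "=bar", "gee"]))       # case 3
print(read_option(["-foo = bar gee"]))            # case 11
```

Each call returns the normalized option name, its value, and the words that are left for further processing.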

Here is the list of all options, in alphabetical order.

Example. Assume that we have a file, named xii.tex, containing

Fjfi71PAVVFjbigskipRPWGAUU71727374 75,76Fjpar71727375Djifx
RrhC?yLRurtKFeLPFovPgaTLtReRomL;PABB71 72,73:Fjif.73.jelse
B73:jfiXF71PU71 72,73:PWs;AMM71F71diPAJJFRdriPAQQFRsreLPAI
I71Fo71dPA!!FRgiePBt'el@ lTLqdrYmu.Q.,Ke;vz vzLqpip.Q.,tz;
;Lql.IrsZ.eap,qn.i. i.eLlMaesLdRcna,;!;h htLqm.MRasZ.ilk,%
s$;z zLqs'.ansZ.Ymi,/sx ;LYegseZRyal,@i;@ TLRlogdLrDsW,@;G
LcYlaDLbJsW,SWXJW ree @rzchLhzsW,;WERcesInW qt.'oL.Rtrul;e
doTsW,Wk;Rri@stW aHAHHFndZPpqar.tridgeLinZpe.LtYer.W,:jbye

If you call Tralics with the option `find-words´, you can see that the XML file contains once drumming and drummers, twice piping and pipers, 3 times leaping and lords, 4 times dancing and ladies, 5 times milking and maids, 6 times swimming and swans, 7 times laying and geese, 8 times rings and gold, 9 times calling and birds, 10 times hens and french, 11 times doves, turtle and `and´, 12 times tree, pear, in, partridge, me, to, gave, love, true, my, christmas, of, day, the, on. There are 45 words with a single letter. The words twelve, eleven, ten, nine, eight, seven, six, five, four, three, two appear x times, where 13-x is the value of the word. The words first, second, third, fourth, fifth, sixth, seventh, eighth, ninth, tenth, eleventh, twelfth appear once. Amazing, isn´t it? The file was written by D. Carlisle; it is available on CTAN. This is not really a LaTeX file, so that some features cannot be applied (for instance, there is no at-begin-document hook). The \bye command was implemented in Tralics so that this example compiles without error.

6.3. Configuration files

A configuration file is a way to alter the translation, using a special syntax. It contains some rules that define a Type and some tcf blocks, where a tcf block is identical to the content of a tcf file, and a Type is the name of a tcf file (tcf stands for Tralics configuration file). The Type can be given as a command line argument, or in the main source, provided that the following magic line appears near the beginning of the document (the tcf file name is between quotes):

% Tralics configuration file 'test0.tcf'

A tcf file may contain some blocks: for instance, a TitlePage block, described later, or a Command block, that contains LaTeX commands inserted at the start of the document; it also contains assignments of various types. In particular, it contains the Document Type used in the XML output. As already mentioned, the Document Type information can be given as a command line argument; it may also be given in the main source file, if a magic line like the following appears near the start of the document (the DTD is classes.dtd, with <book> as root element):

% Tralics DOCTYPE = book classes.dtd
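As an illustration, the following Python sketch looks for the two magic lines in the first lines of a source file. It is only a model of the idea: the function name scan_magic_lines and the limit max_lines are hypothetical (the text only says `near the beginning´), and the regular expressions accept exactly the two forms shown above.

```python
import re

# Patterns modeled on the two examples in the text (assumption: no other
# spelling is accepted).
CONFIG_LINE = re.compile(r"%\s*Tralics configuration file\s+'([^']*)'")
DOCTYPE_LINE = re.compile(r"%\s*Tralics DOCTYPE\s*=\s*(\S+)\s+(\S+)")


def scan_magic_lines(source, max_lines=10):
    """Return a dict with the tcf file name and the (root, dtd) pair, if any.

    Only the first `max_lines` lines are inspected; this limit is a
    hypothetical stand-in for `near the beginning of the document'.
    """
    found = {}
    for line in source.splitlines()[:max_lines]:
        match = CONFIG_LINE.match(line)
        if match and "config" not in found:
            found["config"] = match.group(1)
        match = DOCTYPE_LINE.match(line)
        if match and "doctype" not in found:
            found["doctype"] = (match.group(1), match.group(2))
    return found


sample = (
    "% Tralics configuration file 'test0.tcf'\n"
    "% Tralics DOCTYPE = book classes.dtd\n"
    "\\documentclass{book}\n"
)
print(scan_magic_lines(sample))
```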

We explain here the default configuration file (that has little use anymore), the old default configuration file (in use before 2006), the tcf file for the Raweb, a tcf and plt file for Research Reports (we will show how the same document can be compiled in two different ways).

6.3.1. The standard configuration file

We give here the content of the standard configuration file. As you can see, there are lots of comments. There is one assignment, this is a rule that says that the Type to use is the document class of the input file. There is a block that says that if this Type is report, book, article, minimal, and if no tcf file is found for this Type, then std.tcf should be used instead; this block says also that torture1 and torture2 are aliases for torture (used only for debugging). Finally, there is a block defining std.tcf: it says that the Doctype to use is classes.dtd, and the root element is <std>.

Lines 2 and 3 were modified: we added the letter `x´ after the dollar sign, for otherwise RCS would replace the identification of the original file by the identification of the LaTeX file.

# This is a configuration file for tralics.
# $xId: tralics_rc,v 2.24 2006/07/24 08:23:17 grimm Exp $
## tralics ident rc=standard $xRevision: 2.24 $
# Copyright Inria. Jos\'e Grimm. Apics. 2004/2005 2006
# This file is part of Tralics.
% Some comments: comments start with % or #
% this means: take the documentclass value as type name
Type = \documentclass
## First type defined is the default. Since version 2.8,  there is only
## one type.
BeginType std#      standard latex classes
  DocType = std classes.dtd
  torture torture1 torture2
  std report book article minimal

Comments: If no configuration file is found, default rules apply. In particular, the default Type is the document class, and if no tcf file is found, the DocType to use is unknown from unknown.dtd, unless it is a standard LaTeX class, in which case std from classes.dtd is used. This means that the standard configuration file has become useless.
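The default rule can be summarized by a few lines of Python. This is a sketch of the rule as stated, not of the Tralics source; the names STANDARD_CLASSES and default_doctype are hypothetical.

```python
# The standard LaTeX classes listed in the configuration file above.
STANDARD_CLASSES = ("report", "book", "article", "minimal")


def default_doctype(document_class):
    """Return (root element, dtd file) used when no tcf file is found."""
    if document_class in STANDARD_CLASSES:
        return ("std", "classes.dtd")
    return ("unknown", "unknown.dtd")


print(default_doctype("article"))
print(default_doctype("myclass"))
```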

6.3.2. The old configuration file

We shall describe here the old configuration file, used before the notion of tcf files was invented.

Lines starting with a sharp sign or percent sign are comment lines (ignored). Some lines start with `Begin´, and others with `End´. To each Begin, there should be an associated End. Blocks can be nested. Characters after `End´ are ignored, so that you can say `BeginFoo´ followed by `EndBar´, although it is not recommended. All other lines should be comment lines, empty, or indented.
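The block structure can be checked by a short Python sketch like the following one. It only models the rules just stated (comment lines, Begin and End lines, nesting); the function name check_blocks is hypothetical, and lines that are neither comments nor Begin/End lines are not inspected.

```python
def check_blocks(lines):
    """Check that every Begin line has an associated End line.

    Returns True if the structure is balanced, raises ValueError otherwise.
    """
    open_blocks = []  # kinds of the blocks currently open, innermost last
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line[0] in "#%":
            continue  # empty line or comment line
        if line.startswith("Begin"):
            # "BeginType RA" opens a block of kind "Type".
            open_blocks.append(line.split()[0][len("Begin"):])
        elif line.startswith("End"):
            if not open_blocks:
                raise ValueError(f"line {number}: End without Begin")
            # Characters after End are ignored: EndBar may close BeginFoo.
            open_blocks.pop()
        # Any other line is left to the real parser.
    if open_blocks:
        raise ValueError("unclosed block: Begin" + open_blocks[-1])
    return True


sample = [
    "# a comment",
    "BeginType RA     % Case RA",
    "  DocType = raweb raweb3.dtd",
    "BeginCommands",
    " \\let\\maketitle\\relax",
    "EndCommands",
    "End",
]
print(check_blocks(sample))
```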

Note the `x´ after the dollar sign; it does not appear in the source file (see the comment in the previous subsection). The third line is a bit special: when Tralics loads the file, it prints the revision number on the terminal.

1 # This is a configuration file for tralics.
2 # $xId: tralics_rc,v 2.15 2005/08/02 09:22:56 grimm Exp $
3 ## tralics ident rc=standard $xRevision: 2.15 $

This is the Copyright notice. In the current version, the semantics of the RA is in the ra.tcf file (described later).

4 # Copyright Inria. José Grimm. Apics. 2004/2005
5 # This file is part of Tralics.
6 # (See the file COPYING in the Tralics main directory)
7 # If you modify this file, by changing the semantics of type RA,
8 # please remove the `standard' on the `tralics ident' line above,
9 # or replace it by `non-standard'.
11 % Some comments: comments start with % or #

A configuration file is split into main sections, one for each type. We start with the RA, or raweb.

12 ## configuration for the RA (Inria's Activity Report)
13 ## First type defined is the default
15 BeginType RA     % Case RA
16   DocType = raweb raweb3.dtd
17   DocAttrib = year \specialyear
18   DocAttrib = creator \tralics

This comment explains how to parametrize some element or attribute names that were built-in in a previous version of Tralics. We shall see later how Language can be used (default value is `language´), the same for lang_en and lang_fr that have `english´ and `french´ as default values. Translation of a \caption produces an element <caption>, whose name will be changed to <head> by the post-processor of figures (it will be left unchanged if the caption is not in a figure or a table). The variable xml_caption_name can be used to change the first name, and xml_scaption_name can be used to change the second name. The title of a `topics´ (defined by \declaretopics) is in a <t_title> element; the name can be changed by xml_topic_title. A reference to a topic uses the num attribute; this attribute name can be changed by att_topic_num. The identification of an Inria Team is in <accueil>; this can be changed via xml_accueil_name. It is formed of a long name in <projetdeveloppe> and a short name in <projet>; the names of these elements can be changed via xml_project_name or the `expanded´ version. The section with the composition of the team is <composition>; its name can be changed by xml_composition_ra_name.

19   #(new)
20 #  Language = "xml:lang"
21 #  lang_en =  "en"
22 #  xml_scaption_name= "caption"
23 # xml_topic_title=""
24 #  xml_project_name = "title"
25 #  xml_expanded_project_name = "longtitle"
26 #  xml_accueil_name = "identification"
27 #  xml_composition_ra_name = "team"
28 #  att_topic_num = "id"

Processing of the Raweb needs converting the XML output of Tralics into XSL/Format, HTML, etc., via some external commands like `xsltproc´, `latex´, etc. Originally, Tralics was in charge of these commands, and the configuration file explains how to call these tools. These lines are not needed anymore.

29   makefo="xsltproc --catalogs -o %B %C";
30   makehtml = "xsltproc --catalogs  %B %C";
31   call_lint = "xmllint --catalogs --valid  --noout %C"
32   makepdf = "pdflatex -interaction=nonstopmode %w"
33   %makedvi = "latex -interaction=nonstopmode %w"
34   % makedvi et dvips pour marie-pierre
35   %dvitops = "dvips %w.dvi -o"
36   %makedvi = "latex -interaction=nonstopmode %w"
37   generatedvi = "latex -interaction=nonstopmode %t"
38   % old latex: "latex \\nonstopmode\\input{%t}"
39   generateps = "dvips %t.dvi -o"

This defines the list of valid Raweb sections, themes and URs (research units). If you change these lines, please: a) remove the `standard´ on line 3, or b) make sure that it matches the official list, or c) make sure that this remains a private copy. A star after a section name says that topics are not allowed.

40 #these are new in version 2.0
41   theme_vals = "com cog num sym bio"
42   section_vals = "composition*/presentation*/fondements/domaine/logiciels/"
43   section_vals = "+resultats/contrats*/international*/diffusion*/"
44   ur_vals = "Rocquencourt//Sophia/Sophia Antipolis/Rennes//Lorraine//";
45   ur_vals = "+RhoneAlpes/Rhône-Alpes/Futurs//"

Due to some inertia, people continue using the obsolete environments. We make sure an error is signaled.

46 BeginCommands
47  \newenvironment{body}{\obsoleteEnvBody The body environment is %
48      obsolete since 2003}
49    {End of obsolete environment body}
50  \newenvironment{abstract}{\obsoleteEnvAbstract The abstract %
51     environment is obsolete since 2003}
52   {End of obsolete environment abstract}
53 EndCommands
55 End

This is an example of titlepage environment; it will be discussed later. In fact, we shall give below the content of the RR.tcf file, it is identical.

56 ## configuration for the RR (Research Report of Inria)
57 ## not yet complete
59 BeginType RR#      Case RR
60 ...
89 EndType

A short definition for standard classes.

90 BeginType std#      standard latex classes
91   DocType = std classes.dtd
92   xml_biblio = "bibliography"
93 End

Some aliases.

94 # (types Article and slides are not defined, hence this is useless)
96 BeginAlias
97   Article report
98   slides inriaslides foiltex
99 End

This command has to be outside any block.

100 % this means: take the documentclass value as type name
101 Type = \documentclass

More aliases. Note that toto matches RR (first in list) and report matches std (because `unknown´ is undefined).

105 BeginAlias
106   RR toto# ra2001
107   RA ra toto ra2001x%etc
108   torture torture1 torture2
109   unknown report
110   std report book article minimal
111 End
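The way an alias is chosen can be modeled by the following Python sketch. It is an assumption based on the remark above (the first matching line wins, and a line whose type is undefined is skipped), not a description of the Tralics source; the names strip_comment and resolve_alias are hypothetical.

```python
def strip_comment(line):
    """Remove everything after a sharp sign or a percent sign."""
    for mark in "#%":
        line = line.split(mark, 1)[0]
    return line


def resolve_alias(potential, defined_types, alias_lines):
    """Return the first defined type that lists `potential` as an alias."""
    for line in alias_lines:
        words = strip_comment(line).split()
        if not words:
            continue
        target, names = words[0], words[1:]
        if target in defined_types and potential in names:
            return target
    return None  # no alias applies


# Types defined by BeginType blocks in the old configuration file.
defined = {"RA", "RR", "std", "MP", "torture", "rabib"}
aliases = [
    "  RR toto# ra2001",
    "  RA ra toto ra2001x%etc",
    "  torture torture1 torture2",
    "  unknown report",
    "  std report book article minimal",
]
print(resolve_alias("toto", defined, aliases))    # first in list
print(resolve_alias("report", defined, aliases))  # `unknown' is undefined
```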

For fun.

112 ## an empty type
113 BeginType MP
114 EndType

This is used for testing Tralics.

115 BeginType torture
116   DocAttrib = creator \tralics
117   DocType  = ramain raweb.dtd
118   on package loaded calc CALC = "true"
119   on package loaded foo/bar FOO1 = "true"
120   on package loaded *foo/bar FOO2 = "true"
121   on package loaded foo/*bar FOO3 = "true"
122   on package loaded *foo/*bar FOO4 = "true"
123   url_font = "\large "
124   no_footnote_hack = "false"
125   on class loaded calc CALC="true"
127   use_font_elt = "true"
128   xml_font_small = "font-small"

A bunch of declarations omitted here. The list of all options is given later, in test.tcf.

154   xml_underline_name = "font-underline"
156 BeginCommands
157   % These commands are inserted verbatim in the file
158   \def\recurse{\recurse\recurse}
159 EndCommands
160 EndType

This may be used for typesetting a bibliography, exactly like the Raweb.

161 BeginType rabib     % Case RA
162   DocType = raweb raweb3.dtd
163   DocAttrib = year \specialyear
164   DocAttrib = creator \tralics
166 BeginCommands
167   % These commands are inserted verbatim in the file
168   \newcommand\usebib[2]{\bibliography{#1#2,#1_foot#2+foot,#1_refer#2+refer}}
169 EndCommands
170 EndType

6.3.3. The ra.tcf file

This is the tcf file used for the Raweb2006. Read carefully the copyright notice.

1 # This is a configuration file for tralics, for the Raweb
2 # $xId: ra.tcf,v 2.3 2006/07/25 16:29:39 grimm Exp $
3 ## tralics ident rc=standard-ra $xRevision: 2.3 $
6 # This file is part of Tralics.
7 # Copyright Inria. Jos\'e Grimm. Apics. 2004/2005, 2006
8 # (See the file COPYING in the Tralics main directory)
9 # If you modify this file, by changing the semantics of the RA,
10 # please remove the `standard-ra' on the `tralics ident' line above,
11 # or replace it by `non-standard'.

A tcf file is like the configuration file, but it applies to a single type of document; for this reason, there is no need to explain how the type is computed (no `Type´ declaration), nor to what type a block applies (there is no `BeginType´ block). These three lines are the same as before. Note that the 2007 DTD is raweb7.dtd.

12   DocType = raweb raweb3.dtd
13   DocAttrib = year \specialyear
14   DocAttrib = creator \tralics

These lines are as before, without the commented-out lines.

15   makefo="xsltproc --catalogs -o %B %C";
16   makehtml = "xsltproc --catalogs  %B %C";
17   call_lint = "xmllint --catalogs --valid  --noout %C"
18   makepdf = "pdflatex -interaction=nonstopmode %w"
19   generatedvi = "latex -interaction=nonstopmode %t"
20   generateps = "dvips %t.dvi -o"

These values are the same as those shown above.

21   theme_vals = "com cog num sym bio"
22   ur_vals = "Rocquencourt//Sophia/Sophia Antipolis/Rennes//Lorraine//";
23   ur_vals = "+RhoneAlpes/Rh\^one-Alpes/Futurs//"

In 2006, section_vals has the same value as shown above; in 2007 it is replaced by the following lines.

24   fullsection_vals = "/composition/Team/presentation/Overall Objectives/\
25      fondements/Scientific Foundations/domaine/Application Domains/\
26      logiciels/Software/resultats/New Results/\
27      contrats/Contracts and Grants with Industry/\
28      international/Other Grants and Activities/diffusion/Dissemination"

New in 2006 are the two lists affiliation_vals and profession_vals. The syntax is the same as for other lists. The value given here is an example; the real names are in French.

29   affiliation_vals ="Inria//Cnrs//University//ForeignUniversity//"
30   affiliation_vals ="+Public//Other//"
31   profession_vals = "Scientist//Assistant//Technical//PHD//"
32   profession_vals = "+PostDoc//StudentIntern//Other//"

We have the same obsolete environments as before. Moreover, we declare that \keywords is the same as \motscle; this is needed because we removed the \keywords command (for the Raweb, this is an environment, using it as a command will fail in a very strange manner).

33 BeginCommands
34  \let\keywords\motscle
35  \newenvironment{body}{\obsoleteEnvBody The body environment is %
36      obsolete since 2003}
37    {End of obsolete environment body}
38  \newenvironment{abstract}{\obsoleteEnvAbstract The abstract %
39      environment is obsolete since 2003}
40   {End of obsolete environment abstract}
41 EndCommands

This is the command block for the ra2007. The last line does not appear in the file, but is automatically added in Raweb mode; the command uses the values saved by \theme, \UR and its aliases, \project and its alias, \isproject. Some are defined as doing nothing (like \maketitle, \loadbiblio, \declaretopic, \TeamHasHdr). The \module command is redefined: if the last argument is empty, a default value is used instead.

42 BeginCommands
43  \makeatletter
44  \def\declaretopic#1#2{} %% obsolete in 2007
45  \def\TeamHasHdr#1{} %% temporary
46  \def\theme#1{\def\ra@theme{#1}}
47  \def\UR#1{\def\ra@UR{#1}}
48  \def\isproject#1{\def\ra@isproject{#1}}
49  \let\ResearchCenterList\UR
50  \let\ResearchCentreList\UR
51  \def\projet#1#2#3{\def\ra@proj@a{#1}\def\ra@proj@b{#2}\def\ra@proj@c{#3}}
52  \let\project\projet
53  \def\moduleref#1#2#3{\ref{mod:#3}}
54  \let\oldmodule\module %% Compatibility
55  \renewcommand\module[4][]{\oldmodule{#2}{#3}{\@ifbempty{#4}{(Sans Titre)}{#4}}}
56 \let\htmladdnormallinkfoot\@@href@foot
57 \let\htmladdnormallink\@@href
58  \makeatother
59  \let\maketitle\relax
60  \let\loadbiblio\relax
61  \let\keywords\motscle
62 %%% \AtBeginDocument{\rawebstartdocument} %%% pseudo line
63 EndCommands

New in 2008 is the following list. The argument of the catperso environment must be one of XX, YY, ZZ, interpreted as xx, YY and zz. If the declaration is omitted, there is no restriction on the argument. Whether or not there will be such a restriction in the file ra2008.tcf is still undecided.

64 catperso_vals = "XX/xx/YY//ZZ/zz"
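The reading of such a list can be sketched in Python as follows. The sketch assumes that the value is a slash-separated sequence of pairs (name, interpretation), an empty interpretation meaning that the name is used as is; this matches the example above, but applying it to other lists such as ur_vals is an assumption. The function name parse_pairs is hypothetical.

```python
def parse_pairs(value):
    """Turn "XX/xx/YY//ZZ/zz" into {"XX": "xx", "YY": "YY", "ZZ": "zz"}."""
    fields = value.split("/")
    if len(fields) % 2:
        fields.append("")  # a missing last interpretation counts as empty
    pairs = {}
    for name, shown in zip(fields[0::2], fields[1::2]):
        if name:  # skip the empty pair produced by a trailing "//"
            pairs[name] = shown or name
    return pairs


print(parse_pairs("XX/xx/YY//ZZ/zz"))
```

An argument of catperso that is not a key of the resulting table would then be rejected.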

6.3.4. The RR.tcf file

We indicate here the content of the RR.tcf file; it defines commands for the title page of a Research Report.

1 ## tralics ident rc=RR.tcf $Revision: 1.29 $
2 ## configuration for the RR (Research Report of Inria)
5   DocType = rr raweb.dtd
6 BeginTitlePage
7   \makeRR <RRstart> "" "type = 'RR'"
8   alias \makeRT "" "type = 'RT'"
10   <UR> -
11   \URSophia ?+<UR>
12   \URRocquencourt ?+<UR>
13   alias \URRocq
14   \Paris ?<UR> <Rocquencourt>
15   \URRhoneAlpes ?+<UR>
16   \URRennes ?+<UR>
17   \URLorraine ?+<UR>
18   \URFuturs ?+<UR>
20   \RRtitle q<title> "pas de titre"
21   \RRetitle q<etitle>  "no title"
22   \RRprojet <projet> "pas de projet"
23   \motcle <motcle> "pas de motcle"
24   \keyword <keyword>  "no keywords"
25   \RRresume p<resume> "pas de resume"
26   \RRabstract p<abstract> "no abstract"
27   \RRauthor + <author> <auth> "Pas d'auteurs"
28   \RRdate <date> "\monthyearvalfr"
29   \RRNo <RRnumber> "????"
31   \RRtheme <>  +"pas de theme" % CES
32   <Theme> -                    % E
33   \THNum ?+<Theme>             % CE
34   \THCog ?+<Theme>             % CE
35   \THCom ?+<Theme>             % CE
36   \THBio ?+<Theme>             % CE
37   \THSym ?+<Theme>             % CE
39 %%  \myself \RRauthor "grimm"  % CCS
40 %%  \cmdp <cmdp> +"nodefault"  % CES
41 %%  \cmda <cmdA> A"\cmdAval"   % CES
42 %%  \cmdb <cmdB> B"\cmdBval"   % CES
43 %%  \cmdc <cmdC> C"\cmdCval"   % CES
45 End
47 BeginCommands
48   \let\RRstyisuseful\relax
49 End

6.3.5. The RR.plt file

We indicate here the content of the RR.plt file; it also defines commands for the title page of a Research Report. This is a TeX file, loaded whenever the package `RR´ is used. Note that, if the RR.tcf file is loaded, line 48 above defines a command that is checked on line 4 below, so that the file is ignored. We shall explain later how these two files can be used.

1 % -*- latex -*-
2 \ProvidesPackage{RR}[2006/10/03 v1.1  Inria RR for Tralics]
4 \ifx\RRstyisuseful\relax\endinput\fi
6 \newcommand\RRtitle[1]{{\let\\\ \xbox{ftitle}{#1}}}
7 \newcommand\RRetitle[1]{{\let\\\ \xbox{title}{#1}}}
8 \newcommand\RRauthor[1]{\xbox{author}{#1}}
9 \newcommand\RRprojet[1]{\xbox{inria-team}{#1}}
10 \newcommand\RRdate[1]{\xbox{date}{#1}}
11 \newcommand\RRNo[1]{\xbox{rrnumber}{#1}}
12 \newcommand\RRtheme[1]{\xbox{theme}{#1}}
13 \newcommand\keyword[1]{\xbox{keyword}{#1}}
14 \newcommand\motcle[1]{\xbox{motcle}{#1}}
15 \newcommand\THNum{THnum}
16 \newcommand\THCom{THcom}
17 \newcommand\THCog{THcog}
18 \newcommand\THSym{THsym}
19 \newcommand\THBio{THbio}
20 \newcommand\URSophia{\xbox{location}{Sophia Antipolis}}
21 \newcommand\URLorraine{\xbox{location}{Lorraine}}
22 \newcommand\URRennes{\xbox{location}{Rennes}}
23 \newcommand\URRhoneAlpes{\xbox{location}{Rhône-Alpes}}
24 \newcommand\URRocq{\xbox{location}{Rocquencourt}}
25 \newcommand\URFuturs{\xbox{location}{Futurs}}
26 \newcommand\RRresume[1]{\begin{xmlelement*}{resume}#1\end{xmlelement*}}
27 \newcommand\RRabstract[1]{\begin{xmlelement*}{abstract}#1\end{xmlelement*}}
29 \let\makeRT\relax
30 \let\makeRR\relax

6.3.6. Sample files

The Tralics distribution comes with a bunch of test files. There are two directories: the Test directory contains source files, and the Modele directory contains the translation. In particular, the file tpa2.tex explains how to use a tcf file to change the names of most XML elements.

6.4. The action before translation

As explained at the start of the Chapter, Tralics first reads all options. Some of them are marked `Raweb only´; this means that they are not used, unless the Type is ra (i.e. you are translating the Raweb, see next section); this section describes how the Type is computed.

Unless Tralics is called with option interactive-math, an input file name is required. The program is aborted if more than one input name is given. It must be the name of a TeX file: an extension tex is added if needed, so that foo and foo.tex are the same. As an exception foo.xml is also equivalent to foo.tex. We consider two examples, the xii.tex shown above, and the following LaTeX file hello1.tex:

Hello, world!

6.4.1. Files and Paths

The standard way to use Tralics is to type `tralics filename´ in a terminal, example:

grimm@macarthur$ tralics hello1
This is tralics 2.12, a LaTeX to XML translator, running on macarthur
Output written on hello1.xml (179 bytes).
No error found.
(For more information, see transcript file hello1.log)
grimm@macarthur$ ls hello1*
hello1.log      hello1.tex      hello1.xml

The ls command shows the source, the result of the translation and the transcript file. If the file hello1.ult were present, it would have been read by Tralics, and if the source were a bit more complicated the files hello1.img and hello1_.bbl might have been created. All these files are in the same directory, and this paragraph explains what you can do if input or output files are elsewhere.

Consider now a graphical interface to Tralics, where you drag and drop the TeX source; in such a case there is no shell anymore, hence no current directory; what Tralics gets is an absolute path name (that may be of the form /users/somebody/somewhere). In early versions, such an absolute path was a fatal error. Currently, only Unix-like pathnames are implemented.

Consider now a system, like the Raweb, where the XML file produced by Tralics is converted to another XML file (with a different DTD), and further processed. Thus a great number of files are created, and managing all of them becomes difficult. As the example below shows, you can ask Tralics to put the files it creates in another directory, you can choose the name of the XML output (so that Tralics can create foo-t.xml from foo.tex, and this file can be processed again into foo.xml), and you can also choose the name of the transcript file.

grimm@macarthur$ tralics hello1 -o h2 -logfile=h3 -output_dir=../Test
This is tralics 2.12, a LaTeX to XML translator, running on macarthur
Output written on ../Test/h2.xml (179 bytes).
No error found.
(For more information, see transcript file ../Test/h3.log)

The input path is a colon-separated list of directories. For instance `../foo/A:/bar/B/::gee:´ contains five elements, two of them being empty. The empty slot represents the current directory; it will be added at the end if omitted. The current directory may also be given as a single dot. A final slash is silently removed. In this example, the path means: search in subdirectory A of the sibling directory foo, then in the subdirectory B of the directory bar that is at the root, then in the current directory, then in the subdirectory gee of the current directory, then in the current directory again; this rule does not apply if a file name starts with a dot or a slash.
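The interpretation of the input path can be illustrated by a short Python sketch. It is a model of the rules stated above, not the Tralics implementation; the function name parse_input_path is hypothetical, and the current directory is represented by a single dot.

```python
def parse_input_path(path):
    """Split a colon-separated input path into a list of directories."""
    directories = []
    for slot in path.split(":"):
        if slot in ("", "."):
            directories.append(".")         # empty slot: current directory
        elif len(slot) > 1 and slot.endswith("/"):
            directories.append(slot[:-1])   # final slash silently removed
        else:
            directories.append(slot)
    if "." not in directories:
        directories.append(".")             # added at the end if omitted
    return directories


print(parse_input_path("../foo/A:/bar/B/::gee:"))
```

For the example of the text, the result has five elements, the third and the fifth being the current directory.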

A special case is when the main input file name starts with a dot or a slash, for instance /usr/grimm/home/hello or ./Test/hello.tex. In this case, the name is split into pieces. One piece is the entry name, say hello, another one is the directory name (everything before the final slash), and the last part is the extension (here .tex). If no output directory is given on the command line, the directory of the input file is used. In the same fashion

You can also specify the name of the transcript file; by default, this is the entry name. If for instance you use /foo/bar, then the input file will be /foo/bar.tex and the transcript file will be /foo/bar.log; you may change the name of the transcript file, so as to get /foo/myfile.log; you may change the directory of the transcript file, so as to get /mydir/bar.log; you may change both.
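The following Python sketch shows how the names of the input file and of the transcript file could be derived from the main file name. It is only a model of the rules given in this section; the function names and the parameters output_dir and logfile are hypothetical stand-ins for the corresponding command line options, and the treatment of a suffix other than tex or xml is an assumption.

```python
import posixpath  # only Unix-like path names are considered


def split_main_input(name):
    """Split a main file name into (directory, entry name, extension)."""
    directory, base = posixpath.split(name)
    entry, extension = posixpath.splitext(base)
    if extension not in (".tex", ".xml"):
        entry = base  # assumption: any other suffix is part of the entry name
    # foo, foo.tex and foo.xml all designate foo.tex
    return directory, entry, ".tex"


def default_files(name, output_dir=None, logfile=None):
    """Return the names of the input file and of the transcript file."""
    directory, entry, extension = split_main_input(name)
    source = posixpath.join(directory, entry + extension)
    # Without an explicit output directory, that of the input file is used.
    log_dir = directory if output_dir is None else output_dir
    log = posixpath.join(log_dir, (logfile or entry) + ".log")
    return source, log


print(default_files("/foo/bar"))
print(default_files("/foo/bar", logfile="myfile"))
print(default_files("/foo/bar", output_dir="/mydir"))
```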

Consider again the case where the input is /foo/bar. If no input path is given, then Tralics behaves as if the file was bar, and the input path was `foo:´. As a consequence, if bar inputs another file, say bar1, it is first searched for in the same directory as bar, and then in the current directory. Moreover, if no output directory is specified, files written by bar are put in this directory, and thus can be read again. If the user gives an input path, it will be left unchanged, and the input path is not used to find the main file. Example: Directories foo and foo1 contain files bar and bar1; bar inputs bar1, and the input path contains foo1. If the main file is /foo/bar, it will input /foo1/bar1. If the input path contains both foo and foo1, and the main file is bar, you will get either /foo/bar and /foo/bar1 or /foo1/bar and /foo1/bar1, depending on the order.

6.4.2. Finding the configuration

There are some options that tell Tralics not to produce an XML file; we shall not explain them. Thus, after parsing all arguments, Tralics reads the complete source (main input file). It opens the transcript file, and prints on the terminal a line like Starting translation of file hello1.tex. The transcript file will contain a bit more information, namely

Transcript file of tralics 2.12 for file hello1.tex
Copyright INRIA/MIAOU/APICS 2002-2008, Jos\'e Grimm
Tralics is licensed under the CeCILL Free Software Licensing Agreement
Start compilation: 2008/04/19 18:27:18
OS: Apple, machine macarthur
Starting translation of file hello1.tex.
Using iso-8859-1 encoding (idem transcript).
Left quote is ` right quote is '
Input path (../FO:../Test:)
++ Input encoding is 1 for ../Test/hello1.tex

After that, Tralics reads the configuration file. You can use the -noconfig option, this inhibits reading a configuration file. In this case the transcript file contains

No configuration file.
No type in configuration file
Seen \documentclass article
Potential type is article
Using some default type
dtd is std from classes.dtd (standard mode)
OK with the configuration file, dealing with the TeX file...

The first line says that no configuration file is considered, so that an empty one will be used instead. The TeX source is scanned for a document class. If this is a standard one (book, article, report, minimal), the DTD is std from classes.dtd, otherwise unknown from unknown.dtd. Consider now the same file, without the -noconfig option. We get

++ file .tralics_rc does not exist.
++ file ../confdir/.tralics_rc exists.
Configuration file identification: standard $ Revision: 2.24 $
Read configuration file ../confdir/.tralics_rc.
Configuration file has type \documentclass
Seen \documentclass article
Potential type is article
Defined type: std
++ file article.tcf does not exist.
++ file ../confdir/article.tcf does not exist.
Alias torture does not match article
Potential type article aliased to std
Using type std
dtd is std from classes.dtd (standard mode)

Some lines start with a double plus sign. Whenever Tralics checks whether a file exists, it prints such a line in the transcript file. The first two lines that do not start with a double plus are also printed on the terminal (this is an easy way to check that the right configuration file has been seen). The standard configuration file says that the Type is the document class (here article). This is a true type, provided that it is defined, and the configuration file does not define it. It could be defined in article.tcf, but you can see that there is no such file. As a consequence, the behavior is the same as if no configuration file had been given.
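The fallback used when no type is defined can be summarized as a small lookup. The sketch below only restates the rule given for the -noconfig case (standard classes map to std from classes.dtd, anything else to unknown from unknown); the function name fallback_doctype is an assumption made for illustration.

```python
# Classes listed in the text as standard ones.
STANDARD_CLASSES = {"book", "article", "report", "minimal"}

def fallback_doctype(document_class):
    """Return (root element, DTD name) when no type is defined."""
    if document_class in STANDARD_CLASSES:
        return ("std", "classes.dtd")
    # Wording of the text: "unknown from unknown".
    return ("unknown", "unknown")

print(fallback_doctype("article"))
print(fallback_doctype("myclass"))
```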

This is what happens if the option config=rabib is given

Trying config file from user specs: rabib.tcf
++ file ../confdir/rabib.tcf exists.
Configuration file identification: rabib.tcf $ Revision: 2.2 $
Read configuration file ../confdir/rabib.tcf.
Using tcf type rabib
dtd is raweb from raweb3.dtd (standard mode)

You can notice that a tcf file is being searched in the confdir directory. If the name starts with a slash or a dot, no extension is added, and the file is not searched in the configuration path. Assume that the source file contains a line of the form

% Tralics configuration file 'test0.tcf'

and you neither specify a configuration file, nor inhibit loading one. Then you will get

Trying config file from source file `test0'
++ file test0.tcf does not exist.
++ file ../confdir/test0.tcf exists.
Read configuration file ../confdir/test0.tcf.
Using tcf type test
catperso_vals: AA -> BB
catperso_vals: CC -> CC
catperso_vals: XX -> xx
dtd is unknown from unknown.dtd (standard mode)

As you can see, the tcf extension is added, and the file is searched in the current directory first, then in the configuration path.
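The following Python sketch mimics this lookup: it extracts the file name from the special comment and lists the candidates in the order shown in the transcript. The regular expression, the function names and the default directory ../confdir are assumptions chosen to match the example; the exact matching rules used by Tralics (spacing, case) are not specified here.

```python
import re

# Assumed shape of the special comment; the first letter may be upper or lower case.
CONFIG_LINE = re.compile(r"^%\s*[Tt]ralics configuration file\s+'([^']+)'")

def config_name_from_source(lines):
    """Return the name given in the first matching comment, or None."""
    for line in lines:
        match = CONFIG_LINE.match(line)
        if match:
            return match.group(1)
    return None

def tcf_candidates(name, config_dirs=("../confdir",)):
    """Current directory first, then each directory of the configuration path."""
    if not name.endswith(".tcf"):
        name += ".tcf"
    return [name] + [f"{directory}/{name}" for directory in config_dirs]

source = [
    "% Tralics configuration file 'test0.tcf'",
    "\\documentclass{article}",
]
print(tcf_candidates(config_name_from_source(source)))
```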

You can call Tralics with option type=rabib. This just says that the name of the tcf file should be rabib, instead of the document class; it is thus useless if the name of the tcf file to use has been given as shown above. It can be useful in the case of a plain TeX file, that has no document class. In the example that follows, we say that the type is ra12.

++ file .tralics_rc does not exist.
++ file ../confdir/.tralics_rc exists.
Configuration file identification: standard $ Revision: 2.24 $
Read configuration file ../confdir/.tralics_rc.
Configuration file has type \documentclass
Seen \documentclass article
Potential type is ra12
++ file ra12.tcf does not exist.
++ file ../confdir/ra12.tcf does not exist.
++ file ra.tcf does not exist.
++ file ../confdir/ra.tcf exists.
Configuration file identification: standard-ra $ Revision: 2.3 $
Read tcf file ../confdir/ra.tcf
Using type ra
dtd is raweb from raweb3.dtd (mode RAWEB2007)

Note that no file ra12.tcf was found, and ra.tcf was searched for instead. As a consequence, the effective type is ra, and Raweb mode is entered; this is an error, since the current file is not a ra file. In fact, you will get Fatal error: Input file name must be team name followed by 2007. Note that you can compile a file named foo2006 in Raweb mode, as long as this matches the year option (if used) and the document class is ra2006.
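The order in which the files are tried can be reproduced with a short Python sketch. It assumes, as the transcript suggests, that the type is tried first as given and then with its trailing digits removed, each time in the current directory and then in the configuration path. The function name tcf_search_order and the default ../confdir are illustrative.

```python
def tcf_search_order(type_name, config_dirs=("../confdir",)):
    """List the tcf files tried for a type, in transcript order."""
    names = [type_name]
    stripped = type_name.rstrip("0123456789")
    if stripped and stripped != type_name:
        # Second attempt: the type without its trailing digits.
        names.append(stripped)
    order = []
    for name in names:
        order.append(f"{name}.tcf")
        order.extend(f"{directory}/{name}.tcf" for directory in config_dirs)
    return order

for path in tcf_search_order("ra12"):
    print("++ file", path)
```

For the type article, which has no trailing digits, only article.tcf and ../confdir/article.tcf are listed, as in the earlier transcript.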

6.4.3. Old behaviour

The algorithm is the following.

  1. If you say tralics -noconfig, then no configuration file is read at all.

  2. If you say tralics -configfile=foo, then Tralics will print Trying config file from user specs, and try to use this file.

  3. If you say tralics -configfile=foo.tcf, then Tralics will print the same as above; it will also search the file in the `confdir´ directory.

  4. If the source file contains `% tralics configuration file 'foobar'´, then Tralics will print Trying config file from source file, and try to use this file. In case of failure, and if the name `foobar´ contains no dot, the suffix .tcf is added, and the next rule is applied.

  5. If the source file contains `% tralics configuration file 'foobar.tcf'´, then Tralics will print the same as above; it will also search the file in the `confdir´ directory.

  6. The default configuration file is named .tralics_rc (or tralics_rc on Windows). The current directory is looked at first, then the tralicsdir, finally the home directory.

  7. If you say tralics -dir TOTO, then TOTO/src/.tralics_rc is the second try.

  8. The home directory, or its src subdirectory, is searched next. (Depending on the operating system, this can fail, because there is no standard way of defining the home directory of the user).

  9. If you set the shell variable TRALICSDIR to somedir, or RAWEBDIR to somedir, then somedir/src/.tralics_rc is the last try. If neither variable is set, then some default location will be used.

In the current version, rules 4, 7, 8 and 9 have been removed.

6.4.4. Preparing the translation

Let´s consider again file hello1, compiled with option type=rabib. The transcript file contains the following lines.

OK with the configuration file, dealing with the TeX file...
There are 4 lines
Starting translation
{\countdef \count@=\count255}

After that, there is a bunch of lines of the form `countdef x=y´, and in verbose mode, the bootstrap code, as explained later. The meaning of the last line shown here is: all bootstrap lines have been correctly read.

{changing \countref395332=0 into \countref395332=1}
[1] %% Begin bootstrap commands for latex
[2] \@flushglue = 0pt plus 1fil
[47] %% End bootstrap commands for latex
++ Input stack empty at end of file

Our configuration file contains a block of TeX code. The transcript file shows it:

[19]   % These commands are inserted verbatim in the file
[20] \newcommand\usebib[2]{\bibliography{#1#2,#1_foot#2+foot,#1_refer#2+refer}}

Our configuration file also contains

DocAttrib = variable "va'&quot;lue"
DocAttrib =Foo \World
DocAttrib =A \specialyear
DocAttrib =B  \tralics
DocAttrib =C  \today

The effect is to add an attribute to the main element. The normal syntax is: DocAttrib = foo "bar". The attribute name must contain only ASCII letters; the value can consist of any character. An apostrophe is replaced by &apos;; double quotes must be given as &quot;, and the same holds for some other special characters. Using a command name instead of a string means that the value of the command should be used. The value \tralics is replaced by a string of the form `Tralics version 2.9´, and \specialyear is replaced by the year as used by the Raweb (the current year, in general). The command \World is undefined, and an error is signaled.
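As an illustration of the string form, here is a Python sketch that checks the attribute name and replaces apostrophes in the value. It covers only what is stated above; the command form (\today and the like) is not modelled, and writing the result between single quotes is an assumption based on the XML output shown elsewhere in this document. The name doc_attrib is hypothetical.

```python
def doc_attrib(name, value):
    """Format one attribute of the root element from a DocAttrib line."""
    if not (name.isascii() and name.isalpha()):
        raise ValueError("attribute name must contain only ASCII letters")
    # An apostrophe is replaced by &apos;; a double quote is expected
    # to be given as &quot; already, so it is left untouched.
    return f"{name}='{value.replace(chr(39), '&apos;')}'"

print(doc_attrib("variable", "va'&quot;lue"))
```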

Before translating the document, the ult file is checked first. Here, the star says that the @ character should be of category letter while loading the file, and the plus sign says that the file should be searched in the same directory as the main file, and not elsewhere. We finish by showing how the class file is found.

[1] \InputIfFileExists*+{hello1.ult}{}{}
++ file hello1.ult does not exist.
[1] \documentclass{article}
[2] \begin{document}
++ file article.clt does not exist.
++ file ../confdir/article.clt exists.
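The name of the ult file follows directly from the name of the main source: same directory, same base name, extension ult. A minimal Python sketch of this rule (the function name ult_file_for is an assumption) is:

```python
from pathlib import PurePosixPath

def ult_file_for(main_source):
    """Same directory and base name as the main source, extension ult."""
    return str(PurePosixPath(main_source).with_suffix(".ult"))

print(ult_file_for("../Test/hello1.tex"))
```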

6.5. Translating the Raweb

Raweb mode is entered if a configuration file is found that says that the type to use is `RA´ or `ra´. The document class should be ra97, ra98, or, for later years, ra1999. The example has ra2003. This must match the name of the input file, which is miaou2003. The document can be translated in one of three versions: first, you may try latex, this gives miaou2003.dvi; then we have a mode in which miaou2003.tex is converted into miaou.tex, and latex can produce miaou.dvi. Finally, Tralics may produce miaou.xml, and this can be compiled into wmiaou.dvi.
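The naming constraint can be written as a small check. This Python sketch assumes a four-digit year in the class name (the older ra97 and ra98 classes are not handled) and a hypothetical function name raweb_base_name; it returns the name without the year, which is the base name of the XML output.

```python
import re

def raweb_base_name(input_name, document_class):
    """Return the team name if the file name matches the class year."""
    match = re.fullmatch(r"ra(\d{4})", document_class)
    if match is None:
        raise ValueError("only classes of the form ra<four-digit year> are handled")
    year = match.group(1)
    if not input_name.endswith(year) or len(input_name) == len(year):
        raise ValueError("Input file name must be team name followed by " + year)
    return input_name[: -len(year)]

print(raweb_base_name("miaou2003", "ra2003") + ".xml")
```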

Historically, we had a Perl script for the conversion; this was extended to a translator, then rewritten in C++. You could edit the script and change it (for instance, if a non-standard name for the LaTeX executable was needed). Since Tralics is nowadays a binary file, you cannot edit it. For this reason the configuration file contains some lines (see old configuration file, lines 29 to 39) that can be modified. These are copied into a parameter file and, after Tralics has produced an XML file, it calls an external program (defined by the `externalprog´ switch). If the current year (2003 in the example below) is 2007 or more, simplified ra mode is entered, no postprocessor is called, and no such file is created. Here is an example:

$::makefo='xsltproc --catalogs -o %B %C';
$::makehtml='xsltproc --catalogs  %B %C';
$::checkxml='xmllint --catalogs --valid  --noout %C';
$::makepdf='pdflatex -interaction=nonstopmode %w';
$::makedvi='';
$::dvitops='';
$::generate_dvi='latex -interaction=nonstopmode %t';
$::generate_ps='dvips %t.dvi -o';
$::raweb_dir='/user/grimm/home/cvs/raweb';
$::raweb_dir_src='/user/grimm/home/cvs/raweb/src/';
$::ra_year='2003';
$::no_year='miaou';
$::tex_file='miaou';
$::todo_fo=0;
$::todo_html=0;
$::todo_tex=0;
$::todo_lint=0;
$::todo_ps=0;
$::todo_xml=1;
1;

Here is an example of a source file, valid in 2003.

\documentclass{ra2003}
\theme{Num}
\isproject{YES} % \isproject{OUI} works also
\projet{MIAOU}{Miaou}{Mathématiques et Informatique de
   l'Automatique et de l'Optimisation pour l'Utilisateur}
\def\foo{bar}
\UR{\URSophia\URFuturs}
\declaretopic{abc}{Topic abc}
\declaretopic{def}{Topic def}
\begin{document}
\maketitle
...
\begin{module}{composition}{en-tete}{}
\begin{catperso}{Head of project team}
\pers{Laurent}{Baratchart}[DR INRIA]
\end{catperso}
\end{module}
\begin{module}{diffusion}{dif-conf}{Conferences and workshops}
\begin{glossaire}\glo{A}{B\par C}\glo{A1}{B1\par C1}\end{glossaire}
\begin{participants}
\pers{Laurent}{Baratchart},
\pers{José}{Grimm}
\end{participants}
\begin{motscle}
meromorphic approximation, frequency-domain identification,
extremal problems
\end{motscle}
\end{module}
\loadbiblio
\end{document}

This is what Tralics prints, for the full miaou2003 document, in verbose mode

This is tralics 2.5 (pl7), a LaTeX to XML translator
Copyright INRIA/MIAOU/APICS 2002-2005, Jos\'e Grimm
Licensed under the CeCILL Free Software Licensing Agreement
Starting xml processing for miaou2003.
Configuration file identification: standard $ Revision: 2.14 $
Read configuration file /user/grimm/home/cvs/tralics/.tralics_rc.

The lines that follow show the assignments from the configuration file. Note that the year in the mode reflects the compilation date, not the year in the source file.

makefo=xsltproc --catalogs -o %B %C
makehtml=xsltproc --catalogs  %B %C
makepdf=pdflatex -interaction=nonstopmode %w
generatedvi=latex -interaction=nonstopmode %t
generateps=dvips %t.dvi -o
theme_vals=com cog num sym bio
dtd is raweb from raweb3.dtd (mode RAWEB2005)

The following lines are specific to the Raweb. You can see a summary of all the tests done by the program that converts miaou2003.tex to miaou.tex. The statistics (number of environments, keywords, etc) are computed by a preprocessor that was removed in 2007.

Ok with the config file, dealing with the TeX file...
Activity report for MIAOU (Miaou)
Mathématiques et Informatique de l'Automatique et de l'Optimisation pour l'Utilisateur
There are 138 environments
Checked 15 keyword env with 60 keywords (52 unique)
Checked 8 catperso and 31 participant(es) envs with 146 \pers
There were 2 topics
Sections (and # of modules): 1(1) 2(1) 3(2) 4(6) 5(5) 6(13) 7(4) 8(5) 9(3).

Whenever a section or a chapter is translated, a line is printed on the terminal. There is a complaint at the end, about a lonely module without title. A title is invented, namely `(Sans Titre)´. A non-trivial task for the post-processor is to remove it (it should not appear on the HTML pages). In 2007, this has become an error.

Translating section composition
Translating section presentation
Translating section fondements
Translating section domaine
Translating section logiciels
Translating section resultats
Translating section contrats
Translating section international
Translating section diffusion
Bib stats: seen 57 entries
Seen 64 bibliographic entries
(SansTitre) Only one module seen in the section
Problem with sans titre 1
There was 1 NoTitle not handled

Tralics now prints statistics.

Used 1756 commands
Math stats : formulas 503, non trivial kernels 299, cells 10227,
   special 1 trivial 149, \mbox 5 large 0 small 118.
List stats: short 0 inc 10 alloc 43456
Buffer realloc 41 string 15750 size 610086; merge 7
Macros created 80 deleted 0
Save stack +1582 -1582
Attribute list search 7539(1494) found 3154 in 5616 elements (1401 after boot)
Number of ref 92, of used labels 36, of defined labels 73, of ext. ref. 19
Modules with 24, without 16, sections with 9, without 15
There were 6 images.
Output written on miaou.xml (250758 bytes).
No error found.
(For more information, see transcript file miaou2003.log)

Here you can see the call to the post-processor.

v2.12, (C) 2004 INRIA, José Grimm, projet APICS
Postprocessor did nothing

Since 2006, the syntax of the \pers command in a `catperso´ environment has changed. Example:

\begin{catperso}{Category test}
\pers{Jean}{Durant}{PHD}{ForeignUniversity}[with a T]
\pers{Jean}{Dumas}{PostDoc}{Public}[with a S]
\pers{Jean}{Dumat}{StudentIntern}{Other}[bla bla ][no]
\pers{Jean}{Dumont}{ Other }{Other}[bla bla ][no]
\end{catperso}

Here are the commands specific to the Raweb:

More information is available on the Web page.

The following commands can be used in any document, but they are specific to the Raweb.

6.6. Tracing commands

In some cases, TeX or Tralics produce wrong results, incomprehensible error messages, and so on. In these cases, you must use specialized commands to see what happens. Since the internal structure of TeX is not the same as that of Tralics, the results in the transcript file may be different.

We have explained the command \show: it prints the meaning of a command (useful for a user defined command) and \showthe (this shows the value of a variable, counter, a token list, etc). We have also mentioned that \showbox prints the content of a TeX box or XML element. There is a command \showlists; its effect is to indicate the global context; this is not implemented in Tralics. The typical example is from the TeXbook. Given the test file:


This is the result of the \showlists command.

### display math mode entered at line 5
.\fam1 x
### internal vertical mode entered at line 4
prevdepth ignored
### math mode entered at line 3
### restricted horizontal mode entered at line 2
\glue 3.33333 plus 1.66666 minus 1.11111
spacefactor 1000
### vertical mode entered at line 0
prevdepth ignored

This example does not compile in Tralics: you cannot put a \vbox in a math formula. You cannot put a display math formula in a formula.

TeX provides 9 commands of the form \tracingXXXX described earlier. Each of them is an integer variable (in general, positive means verbose). There is a command \tracingall that turns everything on. In Tralics, it sets \tracingmacros, \tracingoutput, \tracingcommands and \tracingrestores to 1. Only these variables are useful in Tralics (the command \tracingmath is new in version 2.11; it controls the math printing). For instance, \tracingonline controls whether or not anything is printed on the terminal; for Tralics, debugging information is only printed on the transcript file. Variables like \tracingparagraphs and \tracingpages show line-break and page-break calculations, performed by TeX but not by Tralics. The command \tracingoutput shows boxes when they are shipped out (in Tralics, the whole XML tree is printed at the end; if the variable is positive, input lines are printed whenever they are used by the scanner), and \tracinglostchars indicates all characters not found in the fonts (Tralics never looks at font properties).

The command \tracingstats indicates that TeX should gather all statistical information available; in Tralics, statistics are always computed; if you call it with the `silent´ switch, statistics are not printed on the terminal. Note that the `verbose´ switch calls \tracingall.

There are three remaining commands: \tracingmacros is used whenever a user command is expanded, \tracingrestores whenever things are popped from the save stack, and finally, \tracingcommands for all other commands. Let´s start with the example given on page 2.1. This is what you see if \tracingoutput is positive:

[4] \def\foo#1{\xbar#1}
[5] \def\xbar#1{{\itshape #1}}
[6] \foo{12}

It shows the input. This is what you see if \tracingmacros is positive:

\foo #1->\xbar #1
\xbar #1->{\itshape #1}

This is what you see if \tracingrestores is positive:

+stack: level + 2 for brace
{Push p 1}
{font restore }
+stack: level - 2 for brace

This is what you see if \tracingcommands is positive. Some commands produce more than one line in the transcript file; for instance, a line is printed for \def when the command is seen, and another one when the whole definition is read.

{\def \foo #1->\xbar #1}
{\def \xbar #1->{\itshape #1}}
{begin-group character {}
{font change \itshape}
Character sequence: 1.
{end-group character }}
Character sequence: 2 .

This is the start of the trace on page 2.2:

[61] \begin{x}a b c \end{x}
{\begin x}
+stack: level + 3 for environment

What you can see is that \begin produces three lines; the second line holds the name of the environment; the last line explains that the stack pointer was changed from 2 to 3. The system remembers that the change comes from an environment, so that a closing brace or a \endgroup command is illegal. Later on, the trace says:

Character sequence: ZbAY c .
{Text:ZbAY c }
{\end x}
\endx ->by\end {y}ay

In TeX, instead of the first line, you would have seen:

{the letter Z}
{blank space  }
{the letter c}
{blank space  }

Tralics shows all characters it translates; it puts them on a single line. The character sequence is printed on the transcript file, because the command \end wants to be logged. After that, we have a line that contains `Text´ in braces. The text is added to the current XML element; a line is printed whenever the buffer is flushed. The buffer is flushed here because it might be used by the internal routine that scans the argument of \end. The transcript file also contains:

Character sequence: ay.
{\endgroup (for env)}
+stack: ending environment x; resuming document.
+stack: level - 3 for environment
Character sequence:  .

Normally, each line of the form `level +3´, is followed by a line `level -3´, after that the current level is 2. The last line contains Character sequence, followed by a colon, a space, some characters, a period. Instead of `some characters´, you see only a space, it could be any character token with the category code of a space. In our case, it is the new line character that marks the end of the line. If the example is followed by an empty line, you will see:


What you see here is: open brace, Text, colon, some text, close brace. Here `some text´ is the space above, shown as a new line character. What the \par command does is: a) flush the buffer, so that the text is printed, b) remove space at end of paragraph, c) terminate the element and unwind the XML stack. The next line shows this `pop´. The integer 1 is the number of elements on the stack after the pop; you see the content of the stack, just before it is popped, in case the topstack is wrong. After the underscore, there is a suffix that indicates the mode (here p_v means that vertical mode will be entered after the pop).

{Pop 1: document_v p_v}
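The pairing of `level +N´ and `level -N´ lines described earlier in this section can be checked mechanically on a transcript. The Python sketch below is a reading aid, not part of Tralics: the regular expression is inferred from the trace lines shown in this section, and the function name levels_are_balanced is an assumption.

```python
import re

# Shape inferred from lines such as "+stack: level + 3 for environment".
LEVEL = re.compile(r"^\+stack: level ([+-]) (\d+) for (\w+)$")

def levels_are_balanced(lines):
    """Check that every 'level + N' line is closed by a matching 'level - N'."""
    open_levels = []
    for line in lines:
        match = LEVEL.match(line)
        if match is None:
            continue  # not a level line (text, pops, other stack messages)
        sign, level, kind = match.group(1), int(match.group(2)), match.group(3)
        if sign == "+":
            open_levels.append((level, kind))
        elif not open_levels or open_levels.pop() != (level, kind):
            return False
    return not open_levels

trace = [
    "+stack: level + 3 for environment",
    "Character sequence: ay.",
    "{\\endgroup (for env)}",
    "+stack: ending environment x; resuming document.",
    "+stack: level - 3 for environment",
]
print(levels_are_balanced(trace))
```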

This is an example from page 2.3:

\E ->\expandafter
{\expandafter \E \E}
\E ->\expandafter
\E ->\expandafter
{\expandafter \expandafter \def}
{\expandafter \def \toto}
\toto ->\titi !
{\def \titi !->7}

This shows that the trace of \expandafter\foo\bar contains all three tokens. This can be interesting if these tokens come from the expansion of other tokens. For instance, in a case like this


it is interesting to know that \y is expanded before \textit.

The next one is from page 2.6.

[346] \skip\count0=2pt plus \parindent \relax
+scanint for \count->0
+scanint for \skip->1
+scanint for \skip->2
+scandimen for \skip->2.0pt
+scandimen for \skip->3.0pt
{scanglue 2.0pt plus 3.0pt}

In TeX, there is a big recursive function that converts characters into integers, dimensions and glue. The interesting point is the following: we have two commands, \skip and \relax. The purpose of \relax is to stop scanning the glue, because an optional `minus´ term could follow. You will not see \parindent nor \count. The TeX output, in this case, consists of two lines. Tralics offers 6 more lines. The last line holds the glue that is effectively read and put in the register. There are three calls to the internal function `scanint´: the first is the number of the count register, the second is the number of the skip register, the last is the integer part of the dimension. There are two calls to the internal routine `scandimen´, one for each component of the glue (the shrink component is omitted, hence not read).

The next example comes from page 2.6.

1 [3506] \count0=2\ifnum\count0=\count13\fi4
2 {\count}
3 +scanint for \count->0
4 +\ifnum3532
5 +scanint for \count->0
6 +scanint for \ifnum->7
7 +\fi3532
8 +scanint for \count->13
9 +scanint for \ifnum->7
10 +iftest3532 true
11 +scanint for \count->2
12 {\relax}
13 +\fi3532
14 Character sequence: 4 .

We have already explained that `scanint´ is used to read something in case of assignment; as you can see, the procedure is also called in the case of a conditional. This example is a bit strange. Let´s explain what happens. A line of characters is read (see line L1), tokens are constructed, expanded and evaluated. The evaluator sees a first token, printed on line L2. This matches the rule: <simple assignment>, in fact, the first clause, which is <variable assignment>, that is defined as <integer variable> <equals> <number>, and the first term is \count<8-bit number>. There are two calls to `scanint´, the first with a range check. In order to make things easier to understand, we have given an index to each call, like S1, S2, etc.

The job of S1 is easy: there is one digit, printed on L3. The equals character is the first unread character. It is an <equals>. After the equals sign an integer is read, via S2. This sees the digit 2, then the conditional I3532. This number was computed by Tralics; it is printed on line L4 to make debugging easier. The \ifnum command reads two numbers, and a character between them, and compares the numbers. The first number is read via S3. In fact, S3 sees \count and calls S4. Procedure S4 sees the number 0, followed by an equals sign. It prints that value on line L5. Now S3 knows that its value is in \count0; this is 7, printed on line L6. After that, I3532 has a first number 7, sees the equals sign, and reads the second number via S5. This sees \count and reads a number via S6. This reads 13. Then comes \fi. The \fi command prints line L7 on the transcript file. This should terminate I3532, but that is not possible: our conditional is still reading the second number. As a consequence, two tokens are pushed back, a \fi and a \relax, in this order: the \relax is read again first. This terminates expansion of the \fi.

Our procedure S6 is programmed to fully expand tokens, and read them as long as digits are seen (even if the result overflows). It is unaware of the fate of the \fi token. All it knows is that the first unexpandable token after the digit 3 is \relax. Thus, S6 has finished its job: the value is 13, printed on line L8. Then S5 knows its result: the value is \count13, hence 7, printed on line L9. After that, I3532 has two numbers, they are the same, the test is true, as can be seen on line L10. Normal expansion resumes; however the condition stack has a marker that tells that the next \fi or \else matches I3532. If the test had been false, the expansion of the test would have read, at high speed, all tokens up to the next \fi or \else. There are four unread tokens: a \relax, a \fi, the digit 4, and the newline character. Remember S2: this is a procedure that has read a digit, and wonders what follows. It does not care how complicated the task of `expand´ may be. It just wants a non-expandable token. In fact, the current token is \relax. Thus, S2 knows it has read all digits; it prints the result on the transcript file (line L11). As a consequence, 2 is stored in \count0. The \relax token does nothing (let´s hope nobody has redefined it), see line L12. The conditional is terminated because of the inserted \fi token, line L13. After the last character on the line is translated, a new line is read, and a line of the form L1 will be printed; before a line is added to the transcript file, the internal buffer is flushed, this explains line L14. Note that the following sequence provokes an error in TeX (if both counters are equal)


It is accepted by Tralics. As a consequence, the special \relax token inserted by TeX always behaves like \relax.

When Tralics sees a \else in a true condition, it reads everything at high speed, until finding the matching \fi; the example below shows the trace in such a case.

[1] \iftrue \else \ifcat 11\ifx ab \else \fi \ifnum1=2 \else \fi \fi\fi
+\iftrue1
+iftest1 true
+\else1
+\ifcat1(+1)
+\ifx1(+2)
+\else1(+2)
+\fi1(+2)
+\ifnum1(+2)
+\else1(+2)
+\fi1(+2)
+\fi1(+1)
+\fi1
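The nesting depths shown in parentheses in this trace can be reproduced by a simple skipping loop. The Python sketch below is an illustration only: tokens are plain strings, every token whose name starts with \if is treated as a conditional (a simplification, since TeX looks at the meaning of the token and not at its name), and the function name skip_to_matching_fi is an assumption.

```python
def skip_to_matching_fi(tokens):
    """Skip tokens until the \\fi that closes the current conditional.

    Returns (position of that \\fi, list of (token, depth) trace entries).
    """
    depth = 0
    trace = []
    for position, token in enumerate(tokens):
        if token.startswith("\\if"):
            depth += 1                      # a nested conditional opens
            trace.append((token, depth))
        elif token == "\\else" and depth > 0:
            trace.append((token, depth))    # \else of a nested conditional
        elif token == "\\fi":
            trace.append((token, depth))
            if depth == 0:
                return position, trace      # this \fi closes the outer test
            depth -= 1
    raise ValueError("no matching \\fi")

# Tokens that follow the \else of \iftrue in the example above.
skipped = ["\\ifcat", "1", "1", "\\ifx", "a", "b", "\\else", "\\fi",
           "\\ifnum", "1", "=", "2", "\\else", "\\fi", "\\fi", "\\fi"]
position, trace = skip_to_matching_fi(skipped)
for token, depth in trace:
    print(token, f"(+{depth})" if depth else "")
```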

6.7. Pictures and friends

We explain here the translation of some commands related to the picture environment. The syntax is unusual. In some cases, a pair of integers or a pair of real numbers are read. These numbers are multiplied by the value of the current unit of length, and the XML file contains these values, in pt, without the unit. The default value of \unitlength is 1pt. For instance


translates as <pic-put xpos=´30´ ypos=´30.59999´>x</pic-put>. As the example shows, arithmetic on scaled integers is exact, but `10.2´ cannot be represented exactly. In some cases, arguments are converted to attributes, and errors can be signaled. In the case `\put(1.2.3,4) {}´, you will see Missing unit (replaced by pt) {Character . of catcode 12}, followed by three other errors. In the case of `\makebox(1,2)[$\alpha$]{x}´ the error is unexpected element formula. Without the dollar signs, an error is signaled, the math formula is discarded, a second error is signaled with unexpected element error. If you invoke Tralics with `-noxmlerror´, the first error produces no <error> element, so that there is only one error.
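The value 30.59999 can be explained with fixed-point arithmetic. The Python sketch below assumes that Tralics follows TeX's own routines (round_decimals and print_scaled in tex.web), where a length is an integer number of scaled points and 1pt is 65536sp. It further assumes a unit length of 3pt and a coordinate of 10.2; both values are chosen here only because their product reproduces the number shown above.

```python
UNITY = 65536  # scaled points per point

def scan_decimal(text):
    """Convert a decimal constant to scaled points, as round_decimals does."""
    whole, _, fraction = text.partition(".")
    acc = 0
    for digit in reversed(fraction[:17]):
        acc = (acc + int(digit) * 2 * UNITY) // 10
    return int(whole or "0") * UNITY + (acc + 1) // 2

def print_scaled(sp):
    """Shortest decimal that reads back as the same value, as print_scaled does."""
    sign = "-" if sp < 0 else ""
    sp = abs(sp)
    digits = []
    s = 10 * (sp % UNITY) + 5
    delta = 10
    while True:
        if delta > UNITY:
            s += 0x8000 - delta // 2  # round the last digit
        digits.append(str(s // UNITY))
        s = 10 * (s % UNITY)
        delta *= 10
        if s <= delta:
            break
    return f"{sign}{sp // UNITY}." + "".join(digits)

y = scan_decimal("10.2")          # 10.2 is rounded down to 668467sp
print(print_scaled(3 * y))        # exact integer product, prints 30.59999
print(print_scaled(scan_decimal("30.6")))
```

The multiplication by 3 is exact; the loss comes from the conversion of 10.2 itself, which is why the result differs from the conversion of 30.6.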

6.8. The title page

Consider the following document fragment. Note the spellings of the commands \keyword and \motcle: they are not the same as for the environments of the Raweb.

\documentclass[a4paper]{report}
\usepackage{RR}
\providecommand\Tralics{\xbox{Tralics}{}}
\def\XML{XML}
\RRtitle{Tralics, a \LaTeX\ to XML translator\\Partie I}
\RRetitle{Tralics, a \LaTeX\ to XML translator\\Part I}
\RRauthor{José Grimm\thanks{Email:}}
\RRprojet{Apics}
\RRtheme{\THNum}
\RRresume{
Dans cet article\par nous décrivons le logiciel \Tralics,\par...}
\RRabstract{
In this paper we describe \Tralics, a \LaTeX\ to \XML\ translator.}
\RRdate{Aout 2005}
\URSophia
\motcle{Latex, XML, HTML, MathML, Perl, PostScript, Pdf}
\keyword{Latex, XML, HTML, MathML, Perl, PostScript, Pdf}
\begin{document}
\makeRR
text

This is translated by Tralics as follows.

<?xml version='1.0' encoding='iso-8859-1'?>
<!DOCTYPE std SYSTEM 'classes.dtd'>
<!-- Translated from latex by tralics 2.9, date: 2006/10/03-->
<std chapters='true'>
<ftitle>Tralics, a <LaTeX/> to XML translator Partie I</ftitle>
<title>Tralics, a <LaTeX/> to XML translator Part I</title>
<author>José Grimm<note id='uid1' place='foot'>Email:</note></author><inria-team>Apics</inria-team>
<theme>THnum</theme><resume><p>Dans cet article</p>
<p>nous décrivons le logiciel <Tralics/>,</p>
<abstract><p>In this paper we describe <Tralics/>, a <LaTeX/> to XML
<date>Aout 2005</date><location>Sophia Antipolis</location>
<motcle>Latex, XML, HTML, MathML, Perl, PostScript,
Pdf</motcle><keyword>Latex, XML, HTML,  MathML, Perl, PostScript,

The document you are reading is a technical report: it has \makeRT instead of \makeRR, but it uses the same commands starting with `RR´. They are defined in the file RR.sty, and Tralics (since version 2.9) uses some equivalents from the file RR.plt. This was translated to XML then HTML (maybe you are reading the HTML version).

Since 2006, Inria´s research reports are to be put on HAL; before that, they were stored on Inria´s Web server. The meta-data were generated automatically: the author sends to the “gescap” mailing list the beginning of the document, up to the magic command \makeRR; this is translated by Tralics (that does not care about missing \end{document}); a post-processor extracts the <RRstart> element from the XML result, and converts it to HTML.

Let´s compile the same file as above with the command tralics tptest.tex -type RR.

<?xml version='1.0' encoding='iso-8859-1'?>
<!DOCTYPE rr SYSTEM 'raweb.dtd'>
<!-- Translated from latex by tralics 2.9, date: 2006/10/03-->
<rr type='RR' chapters='true'>
<title>Tralics, a <LaTeX/> to XML translator Partie I</title>
<etitle>Tralics, a <LaTeX/> to XML translator Part I</etitle>
<motcle>Latex, XML, HTML, MathML, Perl, PostScript, Pdf</motcle>
<keyword>Latex, XML, HTML, MathML, Perl, PostScript, Pdf</keyword>
Dans cet article</p>
<p>nous décrivons le logiciel <Tralics/>,</p>
In this paper we describe <Tralics/>, a <LaTeX/> to XML translator.</p></abstract>
<author><auth>José Grimm<note id='uid1' place='foot'>
<date>Aout 2005</date>

We have shown above the content of the files RR.plt and RR.tcf. One file ends with \let \RRstyisuseful \relax and the other starts with \ifx \RRstyisuseful \relax \endinput \fi. As a result, if you load the tcf file, the content of the plt file is discarded.

The syntax in the TitlePage part of a configuration file is the following: each line has some fields, that can be of type A (the word `alias´, or `action´ or `execute´) or type C (a command, a backslash followed by some letters), or E (an element name, delimited by less-than and greater-than) or S (a string, delimited by a double quote). Before each field, you can put one or two modifiers. Only the second and the third fields can have a modifier. More details can be found on the web page. The following combinations are recognized.


CESS, as in \makeRR <RRstart> "" "type = 'RR'". This declaration has to be the first in the list. It can be given only once. No modifiers are allowed. It defines a command \makeRR, that can be used only once in the document, after \begin{document}.

The effect is to insert the <RRstart> element into the XML tree, after some checks (that may produce an error). In what follows, we shall call it the TPA element. This element is formed of other elements defined by the titlepage info; the names of these elements are statically defined, their content is dynamic (i.e., the names depend on the configuration file, the content on the TeX document). The first string is a list of attributes added to the TPA element and the second string is a list of attributes added to the document element. In our example, the first string is empty.

In the case where one of the attributes of the second string has the value `only title page´, then \endinput is evaluated just after the Titlepage command. This means that everything after the titlepage command is ignored. This is useful if you want to extract the titlepage information from a document, without converting the whole document.


AESS: as in alias "" "type = 'RT'". This declaration is valid only after a CESS declaration (or after another AESS declaration). It defines a command \makeRT that can be used instead of \makeRR (only one of these commands can be used). The result is the same; however, it can use different attributes. (The remark above about special attribute values in the second string applies here as well.) In what follows, the \TPA command means one of the commands defined by this rule or the preceding one.


CEES: as in \RRauthor + <author> <auth> "Pas d'auteurs". Note that the plus sign is required before the <author> element. As a side effect of this declaration, the TPA element will contain an <author> element, formed of a number of <auth> elements. Initially there is only one, initialized with `Pas d'auteurs´.

The declaration has another effect: it defines a command \RRauthor, that has to be used before the \TPA command. It takes one argument, and creates an <auth> element whose content is the translation of the argument. This element is added to the end of the <author> element. The command can be used more than once, in case there are multiple authors. Note that the default value is removed as soon as at least one value is given.


CCS: as in \myself \RRauthor "JG". The effect is the same as \def\myself{\RRauthor{JG}}. However, the string argument is not translated, it is taken verbatim.


AC: as in alias \URRocq. This makes \URRocq an alias for the command defined on the previous line. Aliasing is achieved via \let.


E: as in <UR> -. The dash after the element is required. Another example is <sUR fr='unité de recherche' en='research unit'> -. In this second example, we have an element named <sUR>, that has two attributes. The effect is to put this element (with its attributes) in the XML result; its content is a list of items declared in the configuration file (the list can be empty).


CE: as in \URSophia ?+<UR> or \URFuturs ?+<UR d='true'>. This defines a command, here \URSophia or \URFuturs, that takes no argument and inserts into the element <UR> (which must be defined by a previous rule) an empty element named after the command (here <URSophia> or <URFuturs>), with the attributes given with <UR> (here d='true' in the second case).


CEE: as in \Paris ?<UR> <Rocquencourt>. The effect is to define a command \Paris, that behaves like \URSophia, but the element created is <Rocquencourt> instead of one named <Paris>.


S: a character string is inserted verbatim. Note that less-than signs are not converted to entities like &lt;.


AC: as in execute \foo or action \foo. The expression \xbox{}{\foo} is translated. The resulting box is added to the XML tree.


CES: as in \RRtitle <title> "pas de titre". This is the generic command. The element can have the modifiers p, q, e or E, and the value can have the modifiers +ABC. The effect is to define a command \RRtitle (or an environment `RRtitle´ if the e or E modifier has been given), that can be used only before the \TPA command. The argument of the command (or the content of the environment) is translated, put in a <title> element, and added to the TPA element.

If no modifier is given for the element, paragraphs are forbidden in the argument. If you want to use paragraphs (either \par or \\) you must use the p modifier (lower-case letter). In the same fashion, a lower-case e means environment without paragraphs, an upper-case E means environment with paragraphs. If the q modifier is given, paragraphs are forbidden, but you can use \\, which is ignored. (In fact, the command reads an optional star and an optional argument, and the result is replaced by a space.) Note that, in this document, there is a \\ in the title, that appears as a space in the page headings. This is done by redefining \\ to \space, so that optional arguments are not taken into account. There is a dirty hack in Tralics.

If no modifier is given for the value, then <title>pas de titre</title> is added to the TPA element in case the command is never used.

Near the end of the titlepage example, we define \cmdp, \cmdA, \cmdB, and \cmdC in a similar fashion, but add a modifier before the value. None of these commands is used in the TeX file; if you uncomment them, you can observe the following facts.

There is a special trick for the case where the name of the element associated to the command is empty. Assume that the configuration file contains \RRtheme <> +"pas de theme". In the case where the user does not use \RRtheme, an error will be signaled, and the text will appear in the resulting XML. If the user says \RRtheme{foo}, then Tralics remembers the use and issues no complaint. Moreover, it reads the argument, and pushes foo\par in the input stream (the reason why \par is executed is to make sure that Tralics remains in vertical mode).
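As a rough illustration of how the commands declared in the TitlePage block are meant to be used, here is a minimal sketch of a TeX source. It assumes a configuration file that declares \makeRR, \RRtitle and \RRauthor as in the examples above; the title and author names are placeholders, and the file is meant for Tralics, not for LaTeX.

```latex
\documentclass{report}
% Assumption: the TitlePage block of the configuration file declares
% \makeRR, \RRtitle and \RRauthor as described in this section.
\RRtitle{A sample title}     % must be given before \makeRR
\RRauthor{First Author}      % each use adds one <auth> element
\RRauthor{Second Author}     % the default value is dropped once a value is given
\begin{document}
\makeRR                      % allowed only once, after \begin{document}
Text of the report.
\end{document}
```

If the second string of the first declaration contained an attribute with the value `only title page´, everything after \makeRR would be ignored.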

6.9. Array and Tables

We describe here the implementation of the arrays in Tralics. One has to distinguish between `table´ which is an environment in which you can put some objects (in general tables) with a caption; like the `figure´ environment, this generates a floating object. On the other hand, the `array´ and `tabular´ environments can be used to create a table: the first one is designed for math only, the second for non-math material. Math tables are described in the chapter about mathematics. There is currently no difference between `figure´, `figure*´ and `wrapfigure´.


\begin{tabular}{c} x \\y \end{tabular}
\caption{My caption}
\begin{tabular}{c} \ref{tl} \end{tabular}

The translation is as follows. As you can see, both objects have the same name. If the table contains a tabular, only one XML object is created.

<table id='uid1'>
  <head>My caption</head>
  <row><cell halign='center'>x</cell></row>
  <row><cell halign='center'>y</cell></row>
<p><table rend='inline'><row><cell halign='center'><ref target='uid1'/></cell>

6.9.1. The tabular environment

You can say \begin{tabular} [pos] {cols} ...\end{tabular} or \begin{tabular*}{width} [pos] {cols} ...\end{tabular*}. In both cases, the result is a <table> element. This element has a vpos attribute whose value is t, b or c, provided that the optional [pos] argument is one of [t], [b] or [c]. The element has a width attribute with value xx, provided that the `tabular*´ environment has been used and the first argument evaluates to xx as a dimension. The resulting element consists of some <row> elements, each of which contains some <cell> elements. A more complicated example:


The translation here shows that the names of elements and attributes can be changed.

<Table VPos='b' TableWidth='120.0pt' rend='inline'>
  <Row SpaceAfter='2.0pt' TopBorder='true'>
     <Cell Align='Cleft'>a</Cell>
     <Cell Align='Cright'>b</Cell>
     <Cell Align='Ccenter'>c</Cell>
  <Row BottomBorder='true'>
     <Cell Align='Cleft' Cols='1'>A</Cell>
     <Cell Align='Cright'>B</Cell>
     <Cell Align='Ccenter'>C</Cell>

6.9.2. Interpreting the preamble

This is an example of \halign from the TeXbook.

The preamble of the array is the quantity marked `{cols}´ in the description above. It is a specification of how the columns should be formatted. In standard LaTeX, you cannot use more columns than specified; in Tralics, this is not relevant. The TeX primitive is called \halign, and LaTeX has to construct a preamble that matches the requirements of TeX; it is very difficult to implement the TeX algorithm, so we make no attempt to implement the commands. Here is an example:

\strut \quad\hfil#\quad\cr
height2pt &\omit&&\omit&\cr
&Year\hfil&&World Population&\cr
height2pt &\omit&&\omit&\cr
\noalign{\vskip 2pt}
height2pt &\omit&&\omit&\cr
height2pt &\omit&&\omit&\cr}\hrule}

The table has 5 columns, of the form ABABA, because the preamble has the form &A&B\cr, the first & marks repetition. Both templates A and B are formed of a 〈u〉 part, then #, then a 〈v〉 part. In the table, the # means: “stick the text of each column entry in this place”. In the case of A, this is almost always empty. In some cases, it is `height2pt´, case where the values of B are \omit. Here \omit says that 〈u〉 and 〈v〉 should be omitted; the important point is that the \strut be omitted. This gives additional vertical space, of exactly 2pt; other rows have (at least) the vertical size of a \strut. Very often rows are too narrow; LaTeX has a command \arraystretch that controls this. Note that A is \vrule# and \vrule is a command that accepts an optional argument. Note how horizontal rules are inserted in the table.
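The mechanism described above can be tried on a smaller table. The following is only a sketch, for TeX or LaTeX rather than Tralics; it uses the same repeated preamble &A&B but invented cell contents (Left, Right).

```latex
\documentclass{article}
\begin{document}
\vbox{\offinterlineskip
  \hrule
  % Preamble &A&B: A is \vrule#, B is \strut\quad\hfil#\quad.
  \halign{&\vrule#&\strut\quad\hfil#\quad\cr
    height2pt&\omit&&\omit&\cr   % 2pt of space: \omit drops the \strut
    &Left\hfil&&Right&\cr        % empty A entries give a plain \vrule
    height2pt&\omit&&\omit&\cr
  }\hrule}
\end{document}
```

Each row supplies five entries (ABABA); the A entries are either empty or `height2pt´, which \vrule reads as its optional height specification.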

As the previous example shows, there are three standard column types: c, l and r (centered, left-aligned, right-aligned). A TeX preamble like \quad\hfil#\quad corresponds to `r´ (instead of \quad, LaTeX uses some default intercolumn space that can be modified). You can also say p{dim}. This should typeset the column in a \parbox[t]{dim}. This feature is not implemented: the argument is ignored, and p is replaced by c. Note: \parbox currently ignores its argument in Tralics.

The array.sty package adds two options that take a dimension as argument: `m´ and `b´. The `b´ option is like the `p´ option, but bottom-aligned. The `m´ option should be used only in math mode (i.e. for the array environment, and not tabular). In Tralics, there is no difference between `b´, `m´ and `p´.

There is a @{text} option. It inserts `text´ in every row, where `text´ is processed in math mode in the `array´ environment and in LR mode in the `tabular´ and `tabular*´ environments. Such an expression suppresses the space that LaTeX normally inserts between columns. For instance, an array specification like {l@{\hspace{1cm}}l} says that the two columns of text should be separated by exactly one centimeter. A specification like {@{}c@{}} says that no additional space should be added, either on the left or on the right of the column. An \extracolsep{wd} command can be used inside such an expression. It causes an extra space to appear to the left of all subsequent columns. Note that \extracolsep expands to \tabskip; this TeX primitive is not implemented in Tralics. In fact, Tralics ignores an `@´ and its argument.
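The two specifications mentioned above can be tested with the following small LaTeX document (a sketch; the cell contents are arbitrary). Remember that Tralics ignores the `@´ expressions, so the difference is visible only in the LaTeX output.

```latex
\documentclass{article}
\begin{document}
% Two left-aligned columns separated by exactly 1cm.
\begin{tabular}{l@{\hspace{1cm}}l}
  left & right \\
\end{tabular}

% No extra space on either side of the single centered column.
\begin{tabular}{@{}c@{}}
  tight \\
\end{tabular}
\end{document}
```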

You can use a | for specifying a vertical rule. However, in Tralics you cannot use double or triple rules. Sorry. There is also a !{...} option that is not implemented.

Every specification (`l´, `r´, `c´, `p´, `b´, `m´) can be preceded by a >{xx} declaration, and followed by a <{yy} declaration. In case of multiple declarations, the last will be executed first. Said otherwise, >{3}>{b}c<{a}<{z} is the same as >{b3}c<{za}. The effect is to insert `b3´ before the cell in the current position, and `za´ after the cell. See the last tabular in table 2. This corresponds to 〈u〉 and 〈v〉 parts of a TeX array. Note that the cell is finished when a token is sensed that indicates either a new cell, a new row or the end of the array. Technically, this means a &, a \\, or an \end (the end of the environment). A special marker is pushed back after the `za´. This is a special endtemplate token in the case of a cell, and a \cr in the case of \\. You should not use \cr or \crcr outside an array defined by \halign (this is not yet implemented). You must be careful that the `za´ (more generally, the 〈v〉 part) does not contain something that reads the special end marker. For instance \def\x#1{}\halign{#\x&#\cr 1&2\cr} is an error. Finally, *{N}{text} can be used instead of N occurrences of `text´.
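The ordering rule for multiple declarations can be checked with the following sketch, which assumes the array package; the cell content X is arbitrary. According to the rule stated above, both tables should produce the same cell.

```latex
\documentclass{article}
\usepackage{array}
\begin{document}
% With several > or < declarations, the last one is executed first,
% so the two preambles below are expected to be equivalent.
\begin{tabular}{>{3}>{b}c<{a}<{z}}
  X \\   % b3 before the cell, za after it
\end{tabular}
\begin{tabular}{>{b3}c<{za}}
  X \\
\end{tabular}
\end{document}
```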

Note. At the end of Chapter 22 of the TeXbook, Knuth gives an example of a table where the preamble is \centerline{#}. Such a construction cannot be done in Tralics, since a specification of the form >{\centerline}c<{} would transform into \centerline?#? and question marks cannot be replaced by braces; you could try >{\expandafter\centerline?} and replace the question mark by something that expands to an open brace but contains as many open braces as closing ones, for instance \expandafter {\iffalse}\fi. However, it is not possible to put in the <{?} part something that the parser considers as a closing brace followed by some other text (otherwise, this closing brace would terminate parsing of the <{?} part).

Knuth says that an entry of the form a}b{c is legitimate, with respect to this template. This cannot be the case in Tralics, but it would be valid for a template like >{\bgroup\bf}c<{\egroup}. This justifies that a table has to be terminated by \cr or \crcr. In the case of Tralics, this is not needed.

6.9.3. New column types

You can add new column types to the list of existing ones, using \newcolumntype, with a letter as argument. For instance:


In this case, the transcript file will contain (line breaks added before `r´)

{Push tabular 2}
array preamble at start: |c||c||c|d{23}X
array preamble after X: |c||c||c|d{23}CLR
array preamble after d: |c||c||c|>{\rightdots {23}}
                              r<{\endrightdots }CLR
array preamble after C: |c||c||c|>{\rightdots {23}}
                              r<{\endrightdots }>{$}c<{$}LR
array preamble after L: |c||c||c|>{\rightdots {23}}
                              r<{\endrightdots }>{$}c<{$}>{$}l<{$}R
array preamble after R: |c||c||c|>{\rightdots {23}}
                              r<{\endrightdots }>{$}c<{$}>{$}l<{$}>{$}r<{$}
array preamble parse: | c | | c | | c | >>{}
                              r <<{} >>{} c <<{} >>{} l <<{} >>{} r <<{}

Whenever a tabular is seen, optional arguments are read, and then the first argument is handled. In a first pass, * is evaluated. This gives the line marked `at start´. After that, the preamble contains, at toplevel (outside braces), two characters `d´ and `X´ that are defined to be new column types. These are evaluated one after the other (the order is irrelevant; here character-code order is used, so that X is expanded before d). Since the expansion was non trivial, a second try is made. Note that only a finite number of tries is made. In case of recursion, strange things can happen. Note how you can use commands with arguments (here `d´ takes one argument, it is `23´).

The table is empty on purpose: there are two undefined macros and, moreover, in the current version of Tralics, dollar signs have to be explicit, and not hidden in a >{}...<{} construction.
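For readers who want to reproduce a transcript of this kind, a set of declarations consistent with the preamble lines above might look as follows. This is a reconstruction from the transcript, not the original source; in particular the exact form of the *{3}{...} argument is an assumption.

```latex
% Reconstructed from the transcript; meant for Tralics.
\newcolumntype{C}{>{$}c<{$}}
\newcolumntype{L}{>{$}l<{$}}
\newcolumntype{R}{>{$}r<{$}}
\newcolumntype{X}{CLR}
% \rightdots and \endrightdots are left undefined, as in the text.
\newcolumntype{d}[1]{>{\rightdots{#1}}r<{\endrightdots}}
\begin{tabular}{*{3}{|c|}d{23}X}
\end{tabular}
```

Evaluating *{3}{|c|} gives |c||c||c|, which matches the line marked `at start´; X then expands to CLR, and each of C, L and R expands on the second pass.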

6.9.4. Another example

We consider here the following new column types. As you can see, one of them is the character +, another is the character _. The fact that these characters have special catcodes is irrelevant (they cannot be of catcode 1 and 2, because this would interfere with brace matching, and they cannot be of catcode 10, because space characters should be ignored in the preamble).

\newcolumntype{L} {>{\large\bfseries 2}l <{y}|}
\newcolumntype{+} {>{B}l <{D}|}

Consider the four following tables

a1&a2&a3&a4  & b1&b2&b3&b4 & c1&c2&c3&c4& d1&d2&d3&d4\\
Wa1&Wa2&Wa3&Wa4  & Wb1&Wb2&Wb3&Wb4 & Wc1&Wc2&Wc3&Wc4& Wd1&Wd2&Wd3&Wd4\\
\hline a&b&c&d&e&f\\
aaa&bbb&ccc&ddd  &eee&fff\\\hline
\begin{tabular} {| >{\large 1}c <{x}| L > {\large\itshape 3}c <{z}|}
\hline A&B&C\\\hline 100&10 &1\\\hline
\begin{tabular} {| >{\large 1}c <{x}| L > {\large\itshape 3}x <{z}|}
\hline A&B&C\\\hline 100&10 &1

Table 2. Some LaTeX tables

a1 a2 a3x a4 b1 b2x b3 b4 c1 c2x c3 c4 d1 d2x d3 d4
Wa1 Wa2 Wa3x Wa4 Wb1 Wb2x Wb3 Wb4 Wc1 Wc2x Wc3 Wc4 Wd1 Wd2x Wd3 Wd4
a b c d e f
aaa bbb ccc ddd eee fff
ab c d e f
aaa bbb ccc ddd eee fff
1Ax 2By 3Cz
1100x 210y 31z
1Ax 2By b3Cza
1100x 210y b31za

You can see the LaTeX result in table 2. We had to change the \tabcolsep of the first table to 0, otherwise it is too wide. Specifying 0pt as width gives the following warning: Overfull \hbox (339.38422pt too wide) in alignment; from this we deduced that 339pt could be a good value (specifying 340 gives an underfull box). The XML translation can be found on the Web page. In order to explain what happens, we consider an example:

a&\omit b&c\\

We explain now the translation. Line numbers refer to the transcript file given below. We do not show the start of the job (initialization). The command \par is redefined to do nothing. It is restored on line 107.

This is the transcript file.

1 [7] \hline
2 [8] a&b&c\\[2pt]
3 {Push row 3}
4 {Push cell 4}
5 +stack: level + 3 for cell
6 Character sequence: a.
7 {alignment tab character &}
8 {Text:a}
9 {\endtemplate}
10 {Pop 4: document_v p_v tabular*_h row_a cell_a}
11 +stack: level - 3 for cell
12 {Push cell 4}
13 +stack: level + 3 for cell
14 Character sequence: xb.
15 {alignment tab character &}
16 {Text:xb}
17 Character sequence: y.
18 {\endtemplate}
19 {Text:y}
20 {Pop 4: document_v p_v tabular*_h row_a cell_a}
21 +stack: level - 3 for cell
22 {Push cell 4}
23 +stack: level + 3 for cell
24 Character sequence: c.
25 {\\}
26 {Text:c}
27 +scanint for \\->2
28 +scandimen for \\->2.0pt
29 {scanglue 2.0pt}
30 {\cr withargs}
31 +scanint for \cr withargs->1703
32 {Pop 4: document_v p_v tabular*_h row_a cell_a}
33 +stack: level - 3 for cell
34 {Pop 3: document_v p_v tabular*_h row_a}
35 [9] a&\omit b&c\\
36 {Push row 3}
37 {Push cell 4}
38 +stack: level + 3 for cell
39 Character sequence: a.
40 {alignment tab character &}
41 {Text:a}
42 {\endtemplate}
43 {Pop 4: document_v p_v tabular*_h row_a cell_a}
44 +stack: level - 3 for cell
45 {Push cell 4}
46 +stack: level + 3 for cell
47 Character sequence: b.
48 {alignment tab character &}
49 {Text:b}
50 {\endtemplate}
51 {Pop 4: document_v p_v tabular*_h row_a cell_a}
52 +stack: level - 3 for cell
53 {Push cell 4}
54 +stack: level + 3 for cell
55 Character sequence: c.
56 {\\}
57 {Text:c}
58 [10] \multicolumn{1}{l}{A}&B&C\\\hline
59 {\cr}
60 {Pop 4: document_v p_v tabular*_h row_a cell_a}
61 +stack: level - 3 for cell
62 {Pop 3: document_v p_v tabular*_h row_a}
63 {Push row 3}
64 {Push cell 4}
65 +stack: level + 3 for cell
66 {Push argument 5}
67 Character sequence: 1.
68 {Text:1}
69 {Pop 5: document_v p_v tabular*_h row_a cell_a argument_a}
70 array preamble at start: l
71 array preamble parse: l
72 {begin-group character {}
73 +stack: level + 4 for brace
74 Character sequence: A.
75 {end-group character }}
76 +stack: level - 4 for brace
77 {alignment tab character &}
78 {Text:A}
79 {\endtemplate}
80 {Pop 4: document_v p_v tabular*_h row_a cell_a}
81 +stack: level - 3 for cell
82 {Push cell 4}
83 +stack: level + 3 for cell
84 Character sequence: xB.
85 {alignment tab character &}
86 {Text:xB}
87 Character sequence: y.
88 {\endtemplate}
89 {Text:y}
90 {Pop 4: document_v p_v tabular*_h row_a cell_a}
91 +stack: level - 3 for cell
92 {Push cell 4}
93 +stack: level + 3 for cell
94 Character sequence: C.
95 {\\}
96 {Text:C}
97 {\cr}
98 {Pop 4: document_v p_v tabular*_h row_a cell_a}
99 +stack: level - 3 for cell
100 {Pop 3: document_v p_v tabular*_h row_a}
101 [11] \end{tabular*}
102 {\end}
103 {\end tabular*}
104 {\endtabular*}
105 {Pop 2: document_v p_v tabular*_h}
106 {\endgroup (for env)}
107 +stack: restoring \par=\par
108 +stack: ending environment tabular*; resuming document.
109 +stack: level - 2 for environment
110 Character sequence:  .

Alternate version, where the final \\\hline is commented out

111 [11] \end{tabular*}
112 {\end}
113 {Text:C}
114 {\end tabular*}
115 {\cr}
116 {Pop 4: document_v p_v tabular*_h row_a cell_a}
117 +stack: level - 3 for cell
118 {Pop 3: document_v p_v tabular*_h row_a}
119 {\end}
120 {\end tabular*}
121 {\endtabular*}

6.10. Actions declared in the configuration file

An action is defined by a name, an equals sign, and a value. Optional spaces can be used. The syntax of DocType and DocAttrib is special. In all other cases, double quotes must delimit the value. All names contain only letters, digits, and underscores. If the name has the form att_foo, this changes the value of attribute `foo´. If the name has the form xml_foo, it changes the value of element `foo´. In a previous version you had to give the full name, xml_foo_name. Names of elements and attributes can be dynamically changed: when you say


this changes the name of element `item´ and attributes `rend´ and `quote´.

In the examples that follow, we shall assume that the file defined in section 6.3.6 is loaded.

6.11. Trace of titlepage

This is a part of the transcript file for the titlepage command.

1 Defining \makeRR as \TitlePageCmd 0
2    main <RRstart  -- type = 'RR'/>
3 Defining \makeRT as \TitlePageCmd 1
4    main <RRstart  -- type = 'RT'/>
5 Defining \UR as \TitlePageCmd 2
6    ur_list <UR/>
7 Defining \URSophia as \TitlePageCmd 3
8    ur <URSophia/>
9 Defining \URRocquencourt as \TitlePageCmd 4
10    ur <URRocquencourt/>
11 Defining \URRocq as alias to \URRocquencourt
12 Defining \Paris as \TitlePageCmd 6
13    ur <Rocquencourt/>
14 Defining \URRhoneAlpes as \TitlePageCmd 7
15    ur <URRhoneAlpes/>
16 Defining \URRennes as \TitlePageCmd 8
17    ur <URRennes/>
18 Defining \URLorraine as \TitlePageCmd 9
19    ur <URLorraine/>
20 Defining \URFuturs as \TitlePageCmd 10
21    ur <URFuturs d='true'/>
22 Defining \RRtitle as \TitlePageCmd 11
23    usual <title/> (flags -par)
24 Defining \RRetitle as \TitlePageCmd 12
25    usual <etitle/> (flags -par)
26 Defining \RRprojet as \TitlePageCmd 13
27    usual <projet/>
28 Defining \motcle as \TitlePageCmd 14
29    usual <motcle/>
30 Defining \keyword as \TitlePageCmd 15
31    usual <keyword/>
32 Defining \RRresume as \TitlePageCmd 16
33    usual <resume/> (flags +par)
34 Defining \RRabstract as \TitlePageCmd 17
35    usual <abstract/> (flags +par)
36 Defining \RRauthor as \TitlePageCmd 18
37    list <author/> and <auth/>
38 Defining \RRdate as \TitlePageCmd 19
39    usual <date/>
40 Defining \RRNo as \TitlePageCmd 20
41    usual <RRnumber/>
42 Defining \RRtheme as \TitlePageCmd 21
43    usual </> (flags +list)
44 Defining \Theme as \TitlePageCmd 22
45    ur_list <Theme/>
46 Defining \THNum as \TitlePageCmd 23
47    ur <THNum/>
48 Defining \THCog as \TitlePageCmd 24
49    ur <THCog/>
50 Defining \THCom as \TitlePageCmd 25
51    ur <THCom/>
52 Defining \THBio as \TitlePageCmd 26
53    ur <THBio/>
54 Defining \THSym as \TitlePageCmd 27
55    ur <THSym/>
56 Defining \myself as \TitlePageCmd 28
57    list? <grimm/> and <auth/>
58 Defining \cmdp as \TitlePageCmd 29
59    usual <cmdp/> (flags +list)
60 Defining \cmda as \TitlePageCmd 30
61    usual <cmdA/> (flags +A)
62 Defining \cmdb as \TitlePageCmd 31
63    usual <cmdB/> (flags +B)
64 [1] \cmdb{\cmdBval}
65 ++ End of virtual file.
66 Defining \cmdc as \TitlePageCmd 32
67    usual <cmdC/> (flags +C)
68 [1] \documentclass[a4paper]{report}
69 ...
70 [1] \cmda{\cmdAval}
71 {(Unknown)}
72 {\titlepage 30}
73 {\titlepage 30=\cmda}
74 {Push cmdA 1}
75 Error signaled at line 1 of file tptest.tex:
76 Undefined command \cmdAval.
77 ...
78 [8] \RRetitle{Tralics, a \LaTeX\ to XML translator\\Part I}
79 {(Unknown)}
80 {\titlepage 12}
81 {\titlepage 12=\RRetitle}
82 {Push etitle 1}
83 ...
84 [12] \RRtheme{\THNum}
85 {(Unknown)}
86 {\titlepage 21}
87 {\titlepage 21=\RRtheme}
88 {(Unknown)}
89 {\titlepage 23}
90 {\titlepage 23=\THNum}
91 {\par}
92 [13]
93 ...
94 [24] \begin{document}
95 {\begin}
96 {\begin document}
97 +stack: level + 2 for environment
98 {\document}
99 +stack: ending environment document; resuming document.
100 +stack: level - 2 for environment
101 +stack: level set to 1
102 [1] \let\do\noexpand\ignorespaces
103 ++ End of virtual file.
104 atbegindocumenthook= \cmdb {\cmdBval }\let \AtBeginDocument \@notprerr \let \do
105 \noexpand \ignorespaces
106 {(Unknown)}
107 {\titlepage 31}
108 {\titlepage 31=\cmdb}
109 {Push cmdB 1}
110 Error signaled at line 24 of file tptest.tex:
111 Undefined command \cmdBval.
112 {Pop 1: document_v cmdB_t}
113 {\let}
114 {\let \AtBeginDocument \@notprerr}
115 {\let}
116 {\let \do \noexpand}
117 {\ignorespaces}
118 [25] \makeRR
119 {(Unknown)}
120 {\titlepage 0}
121 {\titlepage 0=\makeRR}
122 Error signaled at line 25 of file tptest.tex:
123 No value given for command \cmdp.
124 [1] \cmdc{\cmdCval}
125 ++ End of virtual file.
126 {(Unknown)}
127 {\titlepage 32}
128 {\titlepage 32=\cmdc}
129 {Push cmdC 1}
130 Error signaled at line 25 of file tptest.tex:
131 Undefined command \cmdCval.
132 {Pop 1: document_v cmdC_t}
133 {Push p 1}
134 [26] text
135 ...
136 Output written on tptest.xml (1059 bytes).
137 There were 4 errors.
138 (For more information, see transcript file tptest.log)

6.12. Extensions

We describe here the extensions defined by ϵ-TeX, and how they are implemented in Tralics.

Tracing and Diagnostics. When \tracingcommands has a value of 3 or more, the commands following a prefix (like \global) are shown by ϵ-TeX; this is the standard behaviour of Tralics. When \tracinglostchars has a value of 2 or more, missing characters are displayed on the terminal; no character is lost by Tralics. When \tracingassigns has a value of 1 or more, all assignments subject to TeX´s grouping mechanism are traced; it is set to 1 by \tracingall. When \tracingifs has a value of 1 or more, all conditionals are traced, together with the starting line and nesting level; this is not implemented in Tralics, but it is easy to find the \if associated to a \fi because each of them has a serial number. When \tracinggroups has a value of 1 or more, the start and end of each save group is traced, together with the starting line and grouping level; this is not implemented in Tralics, but since version 2.9, you will see line numbers when a group is started (for instance +stack: level + 2 for brace entered on line 9) or terminated (as in +stack: level - 2 for brace from line 9). When \tracingnesting has a value of 1 or more, unclosed conditionals are printed in the transcript file; this is not implemented in Tralics. When \tracingscantokens has a value of 1 or more, the opening and closing of pseudo files is recorded as for any other file.

Example. Given the following input

\global\count66=12
{\count66=12  \count66=13}
\catcode200=12
\parindent=3pt\parindent=1\parindent\global\parindent=2\parindent
\parskip=\parindent plus 2pt\parskip=\parskip
\setbox0\null
\setbox0\xbox{foo}{bar}
\everyhbox{abc}\everyhbox{abc}\everyhbox{c}
\let\foo\relax
\newcommand*\foo{\relax}\renewcommand\foo{\relax}
\let\bar\foo

We get the following lines in the transcript file.

1 [8] \global\count66=12
2 {\global}
3 {\global\count}
4 +scanint for \count->66
5 +scanint for \count->12
6 {globally changing \count66=0 into \count66=12}
7 [9] {\count66=12  \count66=13}
8 {begin-group character}
9 +stack: level + 2 for brace entered on line 9
10 {\count}
11 +scanint for \count->66
12 +scanint for \count->12
13 {reassigning \count66=12}
14 {\count}
15 +scanint for \count->66
16 +scanint for \count->13
17 {changing \count66=12 into \count66=13}
18 {end-group character}
19 +stack: restoring integer value 12 for \count66
20 +stack: level - 2 for brace from line 9
21 [10] \catcode200=12
22 {\catcode}
23 +scanint for \catcode->200
24 +scanint for \catcode->12
25 {reassigning \catcode200=12}
26 [11] \parindent=3pt\parindent=1\parindent\global\parindent=2\parindent
27 {\parindent}
28 +scanint for \parindent->3
29 +scandimen for \parindent->3.0pt
30 {changing \parindent=0.0pt into \parindent=3.0pt}
31 {\parindent}
32 +scanint for \parindent->1
33 +scandimen for \parindent->3.0pt
34 {reassigning \parindent=3.0pt}
35 {\global}
36 {\global\parindent}
37 +scanint for \parindent->2
38 +scandimen for \parindent->6.0pt
39 {globally changing \parindent=3.0pt into \parindent=6.0pt}
40 [12] \parskip=\parindent plus 2pt\parskip=\parskip
41 {\parskip}
42 +scanint for \parskip->2
43 +scandimen for \parskip->2.0pt
44 {scanglue 6.0pt plus 2.0pt}
45 {changing \parskip=0.0pt into \parskip=6.0pt plus 2.0pt}
46 {\parskip}
47 {reassigning \parskip=6.0pt plus 2.0pt}

Transcript for boxes and token lists:

1 [13] \setbox0\null
2 {\setbox}
3 \null ->\hbox {}
4 +scanint for \setbox->0
5 {Constructing an anonymous box}
6 +stack: level + 2 for brace entered on line 13
7 {Push hbox 1}
8 {end-group character}
9 +stack: finish a box of type 0
10 {Pop 1: document_v hbox_v}
11 +stack: level - 2 for brace from line 13
12 {changing \box0=</> into \box0=}
13 [14] \setbox0\xbox{foo}{bar}
14 {\setbox}
15 +scanint for \setbox->0
16 {Push argument 1}
17 Character sequence: foo.
18 {Text:foo}
19 {Pop 1: document_v argument_v}
20 {Constructing a box named foo}
21 +stack: level + 2 for brace entered on line 14
22 {Push hbox 1}
23 Character sequence: bar.
24 {end-group character}
25 +stack: finish a box of type 0
26 {Text:bar}
27 {Pop 1: document_v hbox_v}
28 +stack: level - 2 for brace from line 14
29 {changing \box0= into \box0=<foo>bar</foo>}
30 [15] \everyhbox{abc}\everyhbox{abc}\everyhbox{c}
31 {\everyhbox}
32 {changing \everyhbox= into \everyhbox=abc}
33 {\everyhbox}
34 {reassigning \everyhbox=abc}
35 {\everyhbox}
36 {changing \everyhbox=abc into \everyhbox=c}

This is the transcript file for the case of \def and friends. Here, two lines are printed by \tracingassigns.

1 [16] \let\foo\relax
2 {\let}
3 {\let \foo \relax}
4 {changing \foo=undefined}
5 {into \foo = \relax}
6 [17] \newcommand*\foo{\relax}\renewcommand\foo{\relax}
7 {\newcommand}
8 {\newcommand* \foo}
9 {changing \foo=\relax}
10 {into \foo= macro:->\relax }
11 {\renewcommand}
12 {\newcommand \foo}
13 {changing \foo=macro:->\relax }
14 {into \foo=\long macro:->\relax }
15 [18] \let\bar\foo
16 {\let}
17 {\let \bar \foo}
18 {changing \bar=macro:->\mathaccent "7016\relax }
19 {into \bar = \long macro:->\relax }

In order to debug conditionals, the variables \currentiflevel, \currentifbranch and \currentiftype can be consulted. The level is the number of currently active conditionals; the branch is 1 if the `then branch´ is taken, -1 if the `else branch´ is taken, 0 otherwise (condition not yet evaluated, or out of condition). The type is given in the following table (in the case of \unless, the opposite of this number is returned).

 1 \if        8 \ifmmode   15 \iftrue
 2 \ifcat     9 \ifinner   16 \iffalse
 3 \ifnum    10 \ifvoid    17 \ifcase
 4 \ifdim    11 \ifhbox    18 \ifdefined
 5 \ifodd    12 \ifvbox    19 \ifcsname
 6 \ifvmode  13 \ifx       20 \iffontchar
 7 \ifhmode  14 \ifeof

The \unless command is an extension of ϵ-TeX; the behavior of \unless\iftrue is the same as that of \iffalse. More precisely: the command is expandable; it reads a token, which must be a conditional other than \ifcase. The conditional computes a truth value, which is then negated. The expansion of the command is the same as that of the conditional: reading continues with the next token if the negated test is true, and otherwise with what follows the \else or \fi.


\typeout{type \the\currentiftype,
level \the\currentiflevel,
branch \the\currentifbranch.}}
\showif \fi\fi\fi

The following is printed on the terminal.

type -16, level 1, branch 1.
type 16, level 2, branch -1.
type 3, level 3, branch 1.
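A complete file that exercises the three variables could look like the sketch below. The nesting of conditionals is chosen to match the three output lines shown above; the particular test \ifnum 0=0 is an assumption, any true \ifnum test would do.

```latex
\documentclass{article}
\begin{document}
\def\showif{\typeout{type \the\currentiftype,
  level \the\currentiflevel,
  branch \the\currentifbranch.}}
\unless\iffalse \showif       % negated \iffalse, `then branch'
  \iffalse \else \showif      % plain \iffalse, `else branch'
    \ifnum 0=0 \showif        % \ifnum has type 3, `then branch'
    \fi
  \fi
\fi
\end{document}
```

The first call reports a negative type because of \unless, and each nested conditional increases the level by one.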

The command \ifdefined is a conditional; it reads a token and its truth value is true if this token is defined. The command \ifcsname reads and expands all tokens as \csname does, until finding \endcsname. The condition is true if the token exists and is defined. If the token does not exist, it will not be created. In LaTeX, \@ifundefined calls \csname, which has the side effect that the resulting token is never undefined afterwards (an undefined token becomes \relax).

The command \iffontchar is a conditional; it reads a font identifier, and a character position, and evaluates to true in the case where this character is defined in the font. Since Tralics does not read font metric files, nothing special happens, we pretend that the character exists, unless the font is the null font. Thus

\ifdefined \undefined \bad\fi
\ifdefined \par\else\bad \fi
\ifcsname foo bar\endcsname\bad\fi
\ifcsname bar\endcsname\else\bad\fi
\csname foo bar\endcsname\ifcsname foo bar\endcsname\else\bad\fi
\unless\iffontchar\font 32 \bad\fi
\iffontchar\nullfont 32 \bad\fi

The command \protected is a prefix, like \long, that applies to a macro definition. A protected macro is not expanded when building an expanded token list (for instance in \edef).
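The difference between an ordinary and a protected macro can be observed with \edef. The following sketch uses hypothetical macro names (\plain, \guarded, \result) and needs an ϵ-TeX based engine.

```latex
\documentclass{article}
\begin{document}
\def\plain{expanded}
\protected\def\guarded{expanded}
% \plain is expanded by \edef, \guarded is kept as a token.
\edef\result{\plain/\guarded}
\typeout{\meaning\result}
\end{document}
```

The meaning shown on the terminal should contain the text `expanded/´ followed by the unexpanded token \guarded.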

The command \scantokens absorbs a list of unexpanded tokens and converts it into a character string; this string is treated as if it were an external file, and reading starts from this pseudo file. Every newline character is interpreted as the start of a new line.

The command \showgroups shows the current grouping structure. The read-only integer \currentgrouplevel returns the current save group level, and \currentgrouptype returns a number representing the type of the current group. This gives a number between 0 and 16, see the ϵ-TeX documentation. The values used by Tralics are: 0 is bottom level (no group), 1 is simple group, 9 is math group, 14 is semi simple group, 18 is environment, 19 is a cell in a table, 20 is a local group (corresponds to 4 and 5 in ϵ-TeX), 21 is a title-page group, 17 is impossible.

Example. If we have a file with


The following is printed by ϵ-TeX

### simple group (level 10) entered at line 13 ({)
### align group (level 9) entered at line 12 (align entry)
### align group (level 8) entered at line 12 (\halign{)
### vcenter group (level 7) entered at line 12 (\vcenter{)
### math shift group (level 6) entered at line 12 ($)
### hbox group (level 5) entered at line 12 (\hbox{)
### semi simple group (level 4) entered at line 12 (\begingroup)
### simple group (level 3) entered at line 11 ({)
### semi simple group (level 2) entered at line 10 (\begingroup)
### semi simple group (level 1) entered at line 9 (\begingroup)
### bottom level

or by Tralics

### cell group (level 5) entered at line 13
### environment group (level 4) entered at line 12
### brace group (level 3) entered at line 11
### \begingroup group (level 2) entered at line 10
### environment group (level 1) entered at line 9
### bottom level

You may wonder why ϵ-TeX uses twice as many stack levels as Tralics. This is because tables are implemented in a different way. For instance, using math mode for tables is very strange; one might wonder what the current mode is when \tracinggroups is seen; the answer is restricted horizontal mode, but why? If we insert a \showlists command, we see

### restricted horizontal mode entered at line 13 []
spacefactor 3000
### restricted horizontal mode entered at line 13 []
spacefactor 0
### internal vertical mode entered at line 12
prevdepth ignored
### internal vertical mode entered at line 12
prevdepth ignored
### math mode entered at line 12
### restricted horizontal mode entered at line 12
spacefactor 1000
### horizontal mode entered at line 12 []
spacefactor 1000
### vertical mode entered at line 0
### current page: []

In the same situation, you will see the following lines in the Tralics transcript file. Here, the level is an index in the XML stack, and you can see that the current mode is `a´ (for array).

level 0 entered at line 0, type document, mode_v:
<table rend='inline'><row><cell/></row></table></std>
level 1 entered at line 12, type tabular, mode_v:
<table rend='inline'><row><cell/></row></table>
level 2 entered at line 13, type row, mode_a:
level 3 entered at line 13, type cell, mode_a:

If you translate the following line in verbose mode

{\the\currentgrouplevel}

you will see the following in the transcript file. Internally, the current level outside the group is one, so that you see it increase to 2; but ϵ-TeX shows the outer level as zero, so that the \the inside the group expands to one. The reason for this offset is that a quantity defined at level zero is considered as never defined; since the internal level is never zero, the \the never returns a negative value.

[9] {\the\currentgrouplevel}
{begin-group character}
+stack: level + 2 for brace entered on line 9
{\the \currentgrouplevel}
{end-group character}
+stack: level - 2 for brace from line 9

The command \eTeXrevision expands to a token list containing the current ϵ-TeX revision. The counter \eTeXversion returns ϵ-TeX´s major version number. Thus


prints something like `2.0´.
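
A line of the following kind gives such an output. This is a sketch: the use of \typeout is an assumption, since the original line is not reproduced here.

\typeout{\the\eTeXversion\eTeXrevision}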

The commands \gluestretchorder, \glueshrinkorder, \gluestretch and \glueshrink can be used when some internal quantity is scanned, for instance after \the. They read some glue and return one part of it: the stretch order or the shrink order (an integer between 0 and 3), or the stretch or shrink value (as a dimension). The commands \gluetomu and \mutoglue read and return some glue. The ϵ-TeX manual says: glue is converted into muglue and vice versa by simply equating 1pt with 1mu. Example: we define here a command, whose value will be used later.

\muskip0 = 18mu plus 36mu minus 1 fill
\skip0 = 10pt plus 20pt minus 1 fil
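
As a sketch of what these commands return for the two registers set above (the \typeout lines are added for illustration and are not part of the original example):

\typeout{\the\gluestretch\skip0}        % stretch value: 20.0pt
\typeout{\the\glueshrink\skip0}         % shrink value: 1.0pt
\typeout{\the\gluestretchorder\skip0}   % 0, finite stretch
\typeout{\the\glueshrinkorder\skip0}    % 1, fil
\typeout{\the\mutoglue\muskip0}         % 18.0pt plus 36.0pt minus 1.0fill
\typeout{\the\gluetomu\skip0}           % 10.0mu plus 20.0mu minus 1.0fil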

The commands \detokenize and \unexpanded read a token list. The second command returns the list unchanged; the first one detokenizes it, in other words the token list is converted into a list of characters of category code 12 (except for space). These commands behave like \the, in that the resulting token list is not expanded, even in an \edef or \write. This example shows how \unexpanded works.
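
A minimal sketch of the \unexpanded case, with macro names (\one, \two, \got, \want) invented for the illustration; the \bad macro is not called.

\def\one{A}\def\two{B}
\edef\got{\one\unexpanded{\one\two}\two}  % only the outer \one and \two expand
\def\want{A\one\two B}
\ifx\got\want\else\bad\fi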


The command \uselater, defined above, should compare equal to \xoo defined here.

{\let\GDEF\gdef\let\XDEF\xdef\def\S{ }
 \catcode`m=12 \catcode`u=12 \catcode`p=12 \catcode`f=12
 \catcode`i=12  \catcode`l=12 \catcode`n=12 \catcode`i=12 \catcode`s=12
\XDEF\xoo{18.0\MU\S \PLUS\S 36.0\MU\S \MINUS\S 1.0\FILL,%
18.0\PT\S \PLUS\S 36.0\PT\S \MINUS\S 1.0\FILL,%
10.0\PT\S \PLUS\S 20.0\PT\S \MINUS\S 1.0\FIL,%
10.0\MU\S \PLUS\S 20.0\MU\S \MINUS\S 1.0\FIL,%
10.0\PT\S \PLUS\S 20.0\PT\S \MINUS\S 1.0\FIL,%

Using \detokenize makes life easier. The test should be true here.

\edef\yoo{\detokenize{18.0mu plus 36.0mu minus 1.0fill,%
18.0pt plus 36.0pt minus 1.0fill,%
10.0pt plus 20.0pt minus 1.0fil,%
10.0mu plus 20.0mu minus 1.0fil,%
10.0pt plus 20.0pt minus 1.0fil,%
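
The same idea on a single glue value, as a self-contained sketch (\got and \want are names chosen for the illustration): \the yields characters of category code 12 and spaces, exactly what \detokenize produces, so the \bad macro is not called.

\skip0 = 10pt plus 20pt minus 1fil
\edef\got{\the\skip0}
\edef\want{\detokenize{10.0pt plus 20.0pt minus 1.0fil}}
\ifx\got\want\else\bad\fi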

As in the case of ϵ-TeX, Tralics provides the notion of expressions of type number, dimen, glue or muglue, that can be used whenever a quantity of that type is needed. Such an expression is read by the scanning mechanism; basically scanint and friends are used to read a quantity, and \multiply and friends are used to perform operations. The four commands that can be used are \numexpr, \dimexpr, \glueexpr and \muexpr. They determine a type t, the type of the result, and read an expression, that is followed by an optional \relax (that will be read). When scanning for an operator or the end of an expression, spaces are discarded. An expression consists of one or more terms of type t, that are added or subtracted. A term of type t consists of an initial factor of type t, multiplied or divided by a numeric (integer) factor. Finally, a factor is either a quantity of type t, or a parenthesized expression. Example.

\ifdim \dimexpr(2pt-5pt) *\numexpr 3-3*13/5\relax + 34pt/2=32pt

Here the \relax terminates the \numexpr. This is the trace. You will see expr so far when a term is converted into an expression (prefix `=´), or after an addition or subtraction (prefix `+´ or `-´). You will see term so far after a multiplication or division (prefix `*´ or `/´) or a scaling (prefix backslash). In the case of a*b/c, a 64-bit intermediate product is computed.

[8] \ifdim \dimexpr(2pt-5pt) *\numexpr 3-3*13/5\relax + 34pt/2=32pt
+scanint for \dimexpr->2
+scandimen for \dimexpr->2.0pt
+expr so far for \dimexpr= 2.0pt
+scanint for \dimexpr->5
+scandimen for \dimexpr->5.0pt
+expr so far for \dimexpr- -3.0pt
+scanint for \numexpr->3
+expr so far for \numexpr= 3
+scanint for \numexpr->3
+scanint for \numexpr->13
+scanint for \numexpr->5
+term so far for \numexpr\ 8
+expr so far for \numexpr- -5
+scan for \numexpr= -5
+scanint for \dimexpr->-5
+term so far for \dimexpr* 15.0pt
+expr so far for \dimexpr= 15.0pt
+scanint for \dimexpr->34
+scandimen for \dimexpr->34.0pt
+scanint for \dimexpr->2
+term so far for \dimexpr/ 17.0pt
+expr so far for \dimexpr+ 32.0pt
+scan for \dimexpr= 32.0pt
+scandimen for \ifdim->32.0pt
+scanint for \ifdim->32
+scandimen for \ifdim->32.0pt
+iftest1 true

Note that 3*13/5 is 8-1/5, and this is rounded to 8. In the case of \divide, the result is truncated. All intermediate expressions are checked for overflow, the limit being 2^31 for an integer, and 2^30 otherwise (in magnitude). This means that dimensions and components of glue must be less than 2^14 in units of pt, mu or fil.
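
The difference between rounding and truncation can be checked as follows (a sketch; \count255 is used as a scratch register): the \bad macro is not called.

\ifnum\numexpr 3*13/5\relax = 8 \else\bad\fi   % 39/5 = 7.8, rounded to 8
\count255=39 \divide\count255 by 5
\ifnum\count255 = 7 \else\bad\fi               % 39/5 truncated to 7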

One important point is that these operations have no side effects, hence can be used inside an \edef. If used out of context, you can see error messages like You can´t use `\numexpr´ in horizontal mode (the message depends on the current mode); in Tralics, the error is Read only variable \numexpr, because these operations are implemented as the value of a read only variable. Example:

  \ifnum#1<#2, %
\def\xBar{7, 8, 9, 10, 11, 12, 13}
\ifx\Bar\xBar\else \bad\fi
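
The fragment above appears to be part of a recursive macro expanded inside an \edef. A complete version can be sketched as follows; the macro name \countup is an assumption made here, and the construction follows the usual pattern of computing the incremented argument with \numexpr before the recursive call.

\def\countup#1#2{\number#1
  \ifnum#1<#2, %
    \expandafter\countup\expandafter{\number\numexpr#1+1\expandafter}%
    \expandafter{\number#2\expandafter}%
  \fi}
\edef\Bar{\countup{7}{13}}
\def\xBar{7, 8, 9, 10, 11, 12, 13}
\ifx\Bar\xBar\else \bad\fi

The chain of \expandafter removes the \fi before the recursive call is made, so that no conditional remains open and the whole expansion can take place inside the \edef.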

The integer \lastnodetype returns a number indicating the type of the last node, if any, on the current list. This is not implemented in Tralics, and the value is always zero.

The \interactionmode command allows you to get or set the current interaction mode, an integer between 0 and 3. Setting it is a no-op in Tralics (no error is signaled); the value is always zero (this is batchmode in TeX, which is more or less the only mode of interaction of Tralics).

The commands \fontcharwd, \fontcharht, \fontchardp and \fontcharic can be used to get some information about characters; do not use them to set a value. Each command reads a font identifier and a character position; if the character does not exist, the value is zero, otherwise it is the width, height, depth or italic correction. In the following example, Tralics shows 0 for the interaction mode, and 0.0pt for the other values; in ϵ-TeX, only the italic correction is zero.


The commands \parshapelength and \parshapeindent read an integer n; they return the length or indentation of line n in the current parshape. The command \parshapedimen reads an integer of the form 2n-1 or 2n, and returns the indentation or the length of line n, respectively. Example: in the following code, the \bad macro is not called.

\parshape 3 1pt 2pt 3pt 4pt 5pt 6pt
\ifnum\parshape         = 3 \else\bad\fi
\ifdim\parshapelength 1 = 2.0pt\else\bad\fi
\ifdim\parshapeindent 2 = 3.0pt\else\bad\fi
\ifdim\parshapedimen 4  = 4.0pt\else\bad\fi
\ifdim\parshapedimen 5  = 5.0pt\else\bad\fi
\ifdim\parshapedimen 6  = 6.0pt\else\bad\fi
\ifdim\parshapedimen 7  = 5.0pt\else\bad\fi
\ifdim\parshapedimen 0  = 0.0pt\else\bad\fi
\parshape 0
\ifdim\parshapedimen 1  = 0.0pt\else\bad\fi

The four commands \interlinepenalties, \clubpenalties, \widowpenalties and \displaywidowpenalties can be used to get or set penalties. The values are read, but not used by Tralics. The syntax is the following. In a set context, an optional equals sign is read, followed by an integer n. If the integer is positive, then n integer values are read and stored, otherwise the table is cleared. In a get context, an integer n is read, and the result is an integer: if n is negative, this is zero; if n is zero, it is the length of the table; if n is positive, it is the value found in the table (or the last value if n is too big). Example: in the following code, the \bad macro is not called.

\interlinepenalties=3 1 2 3
\clubpenalties=3 11 12 13
\widowpenalties=3 101 102 103
\displaywidowpenalties=3 1001 1002 1003
\widowpenalties= -1
\the\interlinepenalties 1
\the\displaywidowpenalties -1
\the\displaywidowpenalties 0
\the\displaywidowpenalties 4
\the\widowpenalties 0}
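
A self-contained variant of this test on a single table can be sketched as follows (an illustration based on the rules stated above, not the original example); the \bad macro is not called.

\interlinepenalties=3 1 2 3
\ifnum\interlinepenalties  0 = 3 \else\bad\fi  % length of the table
\ifnum\interlinepenalties  2 = 2 \else\bad\fi  % second value
\ifnum\interlinepenalties  5 = 3 \else\bad\fi  % too big: last value
\ifnum\interlinepenalties -1 = 0 \else\bad\fi  % negative: zero
\interlinepenalties=0                          % clears the table
\ifnum\interlinepenalties  0 = 0 \else\bad\fi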

The command \showtokens reads a token list, and prints it on the terminal and transcript file. As the example below shows, the start of the token list is obtained by expanding tokens and ignoring \relax until a brace (implicit or explicit) is found.

\showtokens \expandafter{\jobname}
\showtokens \expandafter{\tralicsversion }

The command \readline has the same syntax as \read, it is followed by a channel number, a to keyword, and a definable command. It reads a line from a file, and puts it in the command. The difference is that all characters are assumed of category code 12, except space that has its standard category code; only one line is read, since the result is always properly nested.

The command \everyeof holds a token list, like \everypar, that is inserted at the end of every file, or virtual file.

The counter \lastlinefit contains an integer that is used by ϵ-TeX to set the glue in the last line of a paragraph. For each line, the actual space used by a glue item of the form 10pt plus 4pt minus 3pt is 10+4f or 10-3f, depending on whether the natural width is too small or too big. The last line of a paragraph is generally terminated by an infinitely stretchable glue, so that the glue factor f for normal glue is zero. In ϵ-TeX, you can use the same factor as the previous line, or an interpolation (l/1000 times the factor of the previous line), where l is the value of \lastlinefit, or 0 if negative, or 1000 if greater than 1000. Not used in Tralics.

The integer quantity \savinghyphcodes, when positive, tells ϵ-TeX to store the current \lccode table together with the hyphenation table of the current language. See ϵ-TeX documentation for why it is useful to store such a table; not used in Tralics.

When TeX´s page builder transfers material from the `recent contribution´ to the `page so far´, it discards discardable items preceding the first box or rule on the page. When ϵ-TeX´s parameter \savingdiscards is positive, these discarded items are stored in a special list; the command \pagediscards reinserts these items (and clears the list). The same holds for \vsplit, the command is then \splitdiscards.

The \middle command is not implemented in Tralics. This is a math-only command that reads a delimiter, and should be placed between \left and \right; more than one such command can be used. The height of the delimiters of \left, \right and \middle is the height of the formula, from \left to \right.
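
For reference, under ϵ-TeX (not Tralics) a typical use looks as follows; the formula itself is only an illustration.

$\left\{ x \middle| x > 0 \right\}$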

The six commands \marks, \firstmarks, \topmarks, \botmarks, \splitfirstmarks, and \splitbotmarks generalise commands like \mark: they read an integer N, and set a mark at position N, or get the mark at position N. No error is signalled if N is out of range.

In ϵ-TeX, text can be typeset from left to right or from right to left. This is not implemented in Tralics. The use of these features is controlled by the integer \TeXXeTstate.

The integer \predisplaydirection contains the text direction preceding a display. The commands \beginL, \beginR, \endL, \endR mark the start and end of a left-to-right or right-to-left region.

6.13. Bootstrap code

The transcript file of Tralics contains a line for each use of \dimendef or friends. Here is the list of all standard definitions (for Tralics version 2.12). Note that \count@ is counter 255, while \dimen@ is dimension 0. The names (and values) of the commands \z@, \@ne, \tw@, \thr@@, \sixt@@n, \@cclv, \@cclvi, \m@ne, \@m, \@M, \@Mi, \@Mii, \@Miii, \@Miv, \@MM, were chosen by Knuth (maybe Lamport). The values are 0, 1, 2, 3, 16, 255, 256, -1, 1000, 10000, 10001, 10002, 10003, 10004, and 20000. These numbers were typeset via \number\@MM.

Remember that the internal name of the LaTeX counter `enumi´ is \c@enumi.

{\countdef \count@=\count255}
{\countdef \c@page=\count0}
{\dimendef \dimen@=\dimen0}
{\dimendef \dimen@i=\dimen1}
{\dimendef \dimen@ii=\dimen2}
{\dimendef \epsfxsize=\dimen11}
{\dimendef \epsfysize=\dimen12}
{\chardef \@ne=\char1}
{\chardef \tw@=\char2}
{\chardef \thr@@=\char3}
{\chardef \sixt@@n=\char16}
{\chardef \@cclv=\char255}
{\mathchardef \@cclvi=\mathchar256}
{\mathchardef \@m=\mathchar1000}
{\mathchardef \@M=\mathchar10000}
{\mathchardef \@Mi=\mathchar10001}
{\mathchardef \@Mii=\mathchar10002}
{\mathchardef \@Miii=\mathchar10003}
{\mathchardef \@Miv=\mathchar10004}
{\mathchardef \@MM=\mathchar20000}
{\chardef \active=\char13}
{\toksdef \toks@=\toks0}
{\skipdef \skip@=\skip0}
{\dimendef \z@=\dimen13}
{\dimendef \p@=\dimen14}
{\dimendef \oddsidemargin=\dimen15}
{\dimendef \evensidemargin=\dimen16}
{\dimendef \leftmargin=\dimen17}
{\dimendef \rightmargin=\dimen18}
{\dimendef \leftmargini=\dimen19}
{\dimendef \leftmarginii=\dimen20}
{\dimendef \leftmarginiii=\dimen21}
{\dimendef \leftmarginiv=\dimen22}
{\dimendef \leftmarginv=\dimen23}
{\dimendef \leftmarginvi=\dimen24}
{\dimendef \itemindent=\dimen25}
{\dimendef \labelwidth=\dimen26}
{\dimendef \fboxsep=\dimen27}
{\dimendef \fboxrule=\dimen28}
{\dimendef \arraycolsep=\dimen29}
{\dimendef \tabcolsep=\dimen30}
{\dimendef \arrayrulewidth=\dimen31}
{\dimendef \doublerulesep=\dimen32}
{\dimendef \@tempdima=\dimen33}
{\dimendef \@tempdimb=\dimen34}
{\dimendef \@tempdimc=\dimen35}
{\dimendef \footnotesep=\dimen36}
{\dimendef \topmargin=\dimen37}
{\dimendef \headheight=\dimen38}
{\dimendef \headsep=\dimen39}
{\dimendef \footskip=\dimen40}
{\dimendef \columnsep=\dimen41}
{\dimendef \columnseprule=\dimen42}
{\dimendef \marginparwidth=\dimen43}
{\dimendef \marginparsep=\dimen44}
{\dimendef \marginparpush=\dimen45}
{\dimendef \maxdimen=\dimen46}
{\dimendef \normallineskiplimit=\dimen47}
{\dimendef \jot=\dimen48}
{\dimendef \paperheight=\dimen49}
{\dimendef \paperwidth=\dimen50}
{\skipdef \topsep=\skip11}
{\skipdef \partopsep=\skip12}
{\skipdef \itemsep=\skip13}
{\skipdef \labelsep=\skip14}
{\skipdef \parsep=\skip15}
{\skipdef \fill=\skip16}
{\skipdef \@tempskipa=\skip17}
{\skipdef \@tempskipb=\skip18}
{\skipdef \@flushglue=\skip19}
{\skipdef \listparindent=\skip20}
{\skipdef \hideskip=\skip21}
{\skipdef \z@skip=\skip22}
{\skipdef \normalbaselineskip=\skip23}
{\skipdef \normallineskip=\skip24}
{\skipdef \smallskipamount=\skip25}
{\skipdef \medskipamount=\skip26}
{\skipdef \bigskipamount=\skip27}
{\skipdef \floatsep=\skip28}
{\skipdef \textfloatsep=\skip29}
{\skipdef \intextsep=\skip30}
{\skipdef \dblfloatsep=\skip31}
{\skipdef \dbltextfloatsep=\skip32}
{\countdef \m@ne=\count20}
{\countdef \c@FancyVerbLine=\count21}
{\countdef \c@enumi=\count22}
{\countdef \c@enumii=\count23}
{\countdef \c@enumiii=\count24}
{\countdef \c@enumiv=\count25}
{\countdef \c@footnote=\count26}
{\countdef \c@part=\count27}
{\countdef \c@chapter=\count28}
{\countdef \c@section=\count29}
{\countdef \c@subsection=\count30}
{\countdef \c@subsubsection=\count31}
{\countdef \c@paragraph=\count32}
{\countdef \c@subparagraph=\count33}
{\countdef \c@mpfootnote=\count34}
{\countdef \c@bottomnumber=\count35}
{\countdef \c@topnumber=\count36}
{\countdef \@tempcnta=\count37}
{\countdef \@tempcntb=\count38}
{\countdef \c@totalnumber=\count39}
{\countdef \c@dbltopnumber=\count40}
{\countdef \interfootnotelinepenalty=\count41}
{\countdef \interdisplaylinepenalty=\count42}
{\toksdef \@temptokena=\toks11}
{\chardef \@tempboxa=\char11}
{\chardef \voidb@x=\char12}

At the start of the run, some commands are created; we start with commands that take no argument, whose expansion is formed of characters only.

We continue with more complicated commands:

We show here the meaning of \@sanitize, \@nnil and \do.

\@sanitize=macro: ->\@makeother \ \@makeother \\\@makeother \$\@makeother
   \&\@makeother \#\@makeother \^\@makeother \_\@makeother \%\@makeother \~.
\dospecials=macro: ->\do \ \do \\\do \$\do \&\do \#\do \^\do \_\do \%\do
   \~\do \{\do \}.
\@nnil=macro: ->\@nil .

The page counter is \count0. We first define \c@page, via \countdef, then define \cl@page to be empty, set \c@page to one, and make \thepage use arabic numbers; the \pagenumbering command redefines \thepage. The command \p@page is not defined, because \p@foo is used only for printing the label associated with the counter (if you want the page number of reference `foo´, use \pageref). Note that in Tralics the page counter is never modified, and \pageref is the same as \ref. Consider a reference to the page containing the start of the verbatim environment we are commenting. In LaTeX, the value of \@currentlabel is considered; in Tralics, the current anchor is used instead; this works well for \ref. The current label is defined here: 6.13, at the start of the section. For the HTML version of the document, we have cheated a bit: we have added an \index command, which inserts an anchor after the word `bootstrap´; the label follows the \index, and the \pageref command uses this label. Note that the \anchor command adds an anchor, but it is not defined by LaTeX. The commands defined here produce a <pagestyle> element.


We now show the remainder of the bootstrap code. We start by filling some registers.

[1] %% Begin bootstrap commands for latex
[2] \@flushglue = 0pt plus 1fil
[3] \hideskip =-1000pt plus 1fill
[4] \smallskipamount=3pt plus 1pt minus 1pt
[5] \medskipamount=6pt plus 2pt minus 2pt
[6] \bigskipamount=12pt plus 4pt minus 4pt
[7] \z@=0pt\p@=1pt\m@ne=-1 \fboxsep = 3pt %
[8] \c@page=1 \fill = 0pt plus 1fill
[9] \paperheight=297mm\paperwidth=210mm
[10] \jot=3pt\maxdimen=16383.99999pt

The two commands defined here take as argument either a character or a one character command.

[33] \def\@makeother#1{\catcode`#1=12\relax}
[34] \def\@makeactive#1{\catcode`#1=13\relax}

Other commands

[11] \def\newfont#1#2{\font#1=#2\relax}
[12] \def\symbol#1{\char #1\relax}
[16] \newenvironment{cases}{\left\{\begin{array}{ll}}{\end{array}\right.}%
[19] \def\stretch#1{\z@ \@plus #1fill\relax}
[20] \theoremstyle{plain}\theoremheaderfont{\bfseries}
[21] \def\@namedef#1{\expandafter\def\csname #1\endcsname}
[22] \def\@nameuse#1{\csname #1\endcsname}
[23] \def\@arabic#1{\number #1}
[24] \def\@roman#1{\romannumeral#1}
[25] \def\@Roman#1{\Romannumeral#1}
[30] \def\LaTeXe{\LaTeX2$\epsilon$}
[32] \def\enspace{\kern.5em }
[35] \def\root#1\of{\@root{#1}}
[40] \def\eqref#1{(\ref{#1})}
[43] \def\on@line{ on input line \the\inputlineno}
[47] %% End bootstrap commands for latex
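
As a usage sketch of \@namedef and \@nameuse (the name `my key´ and the macros \got and \want are invented for the illustration, and \makeatletter is assumed so that @ is a letter), the following code does not call \bad.

\makeatletter
\@namedef{my key}{some text}     % defines the command named `my key´
\edef\got{\@nameuse{my key}}     % expands to its replacement text
\def\want{some text}
\ifx\got\want\else\bad\fi
\makeatother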

6.14. Standard packages

Version 1 of this document described the status of standard packages for Tralics 2.9. It has been withdrawn. The list of all packages, with documentation, can be found on the web.

6.15. Images

We give here some examples of the \includegraphics command. We consider a file with the following content. Notice that the clip attribute is set to true if `clip´ appears in the list, whether or not a value has been given. Colons and underscores in a file name are never interpreted. The extension is always removed:

{\language=1 a:c
\IC[angle=0, =foo,,width=3cm,scale=1,scale=2,clip]{../../a_b:c}

We continue with an example of \epsfbox.


Tralics pretends that there are 4 different images. The translation is:

<figure rend='inline' clip='true' width='3cm' file='Logo-INRIA-couleur'/>
<figure rend='inline' height='' width='7.5cm'
     angle='20' file='Logo-INRIA-couleur'/>
<figure rend='inline' height='' width=''
a :c
<figure rend='inline' clip='true' scale='2' width='3cm' file='../../a_b:c'/>
<figure framed='true' rend='inline' file='x_'/></p>
<figure height='60.0pt' width='50.0pt' rend='inline' file='x'/>
<figure height='70.0pt' rend='inline' file='x'/>
<figure rend='inline' file='x'/>

The file `\jobname.img´ contains the following. The last number is the number of times the image was included. The second number explains in which format the file has been found. Whether or not the image file is found is irrelevant. The information given in the file is for information only.

# images info, 1=ps, 2=eps, 4=epsi, 8=epsf, 16=pdf, 32=png, 64=gif

6.16. The puzzle

The only requirements for the xii file are that ~ is an active character, \ has category code 0, % is a comment character, and the end of line character is as usual. The file modifies the category code of 7, F, j and P, in such a way that `jdefjx71F71P´ is the same as `\def\x#1{#1}´. This is one way of making the code incomprehensible; the other is to use commands so that `six´, `geese´ and `laying´ are replaced by `/sx´, `Yegse´ and `RyalD´. The program makes the letter H active and defines it via `AHHFLP´. For those who want to write puzzles like this one: is it possible to avoid doubling the H? Without using all these strange commands, the file could be written as

\let~\catcode ~`A13 \defA#1{~`#113\def}
AZZ{}APP{\par}AXX#1{\bigskip On the #1 day of Christmas my true love gave to me}
ABB{PZAZZ{and }a partridge in a pear tree.}
ACC{Ptwo turtle doves}
ADD{Pthree french hens}
AEE{Pfour calling birds}
AFF{Pfive gold rings}
AGG{Psix geese a laying}
AHH{Pseven swans a swimming}
AII{Peight maids a milking}
AJJ{Pnine ladies dancing}
AKK{Pten lords a leaping}
ALL{Peleven pipers piping}
AMM{Ptwelve drummers drumming}
ATT#1 #2,#3:{\if.#3.\elseT#3:\fiX{#1}\U#1 #2,#3:}
\def\U#1#2#3#4 #5,#6{\F#1#2#3#5\if:#6\elseV\U#6\fi}
Ttwelfth M,eleventh L,tenth K,ninth J,eighth I,seventh H,sixth G,fifth
F,fourth E,third D,second C,first B,:\bye

The size of the file is 698 characters (compare to the 767 of the xii file). Note how the double loop is constructed. The xii file is made obscure by replacing B, C, D, etc., by expressions that have the same expansion, and by using, instead of \F, a command that uses some characters (because `nine´ and `ninth´ start with the same letters, etc.). An interesting point is that we can write a smaller file, with all loops unrolled, replacing the last 5 lines by the following (this makes a total of 642 characters):
