Tralics, a LaTeX to XML translator; Part I

# 2. Expansion

One part of the work of TeX is to replace all user defined tokens by primitives; this is the main objective of the expansion process. In this respect, there is little difference between TeX and Tralics. In this chapter, we review some constructions.

## 2.1. Defining new commands

A definition is typically of the form

\def\fooi{foo}
\def\fooii#1#2{#2#1}
\def\fooiii+#1.#2=#3#{Seen#1#2#3.}


You may wonder why the commands are not called \foo1, \foo2 and \foo3. The reason is that, if digits have standard category codes, they are not of type letter, so that \2foo is the command \2, followed by the letters foo (the tokens are \2 f_11 o_11 o_11, where the subscript is the category code) and \foo2 is the command \foo followed by the digit 2 (the tokens are \foo 2_12). It is possible to create the token \foo2 via \csname foo2\endcsname, and it is also possible to change the category code of 2. This is in general a bad idea: if you say \setlength{\parindent}{\foo2+2cm}, it is impossible to design the \setlength command so that \foo2 is read as a command and 2cm as a dimension. On the other hand, if you say \def\foo2#1#2{#2#1}, TeX expects, after the second #, the character 2 with category code 12; if not it complains with: Parameters must be numbered consecutively. In Tralics, the message is a bit different, it says: Error while scanning definition of \foo2 expecting #2; got #{Character 2 of catcode 11}. Note how 2_11 is printed.

Before \def, you can put a prefix: it can be \long, indicating that the command accepts whole paragraphs as arguments; it can be \outer, indicating that the command cannot be the argument of another command; it can be \protected, indicating that the command should not be expanded in an \edef (this is an ϵ-TeX extension); it can be \global. This last prefix can be put before any assignment, it says that the assignment is global (unless \globaldefs is non-zero). More than one prefix can be used, the order is irrelevant. After the \def comes the object to define (this is either an active character, or a command name), then what TeX calls <parameter text>, and this is followed by the body. The body starts with the first opening brace (any character of category code 1) and ends with the first closing brace (any character with category code 2) that makes the body balanced against braces. These braces are not part of the body. The parameter text is an arbitrary sequence of tokens, but cannot contain braces. If it contains a # (in fact, any character of category code 6), it has to be the final character of the sequence, or be followed by the digits 1, 2, 3, up to 9, in order. If there is some text between #3 and #4 (or between #3 and the start of the body), this imposes a constraint on the third argument. If there is some text before #1, this imposes a constraint on the command itself. In the body you can use ##, this will be replaced by a #; you can also use #1, #2, etc., this will be replaced by the value of the first, second, etc., argument. As above, the # is any character of category 6, the digits are of category 12, you cannot access the second argument if only one is available. If you define \foo2 as above, TeX will signal a second error: Illegal parameter number in definition of \foo2.
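
As a small illustration of the rules above, here is a sketch in plain TeX syntax; the names \keep, \makepair and \pair are hypothetical and are not part of TeX or Tralics.

```tex
% \long allows \par inside the argument; \global makes the
% definition survive the end of the enclosing group.
{\long\global\def\keep#1{[#1]}}
\keep{first paragraph\par second paragraph}

% In a body, ## stands for a single #, so that a command can
% define another command that takes arguments.
\def\makepair#1{\def#1##1##2{(##1,##2)}}
\makepair\pair   % same as \def\pair#1#2{(#1,#2)}
\pair ab         % expands to (a,b)
```

Without the \long prefix, the \par in the argument of \keep would be an error; without \global, \keep would be undefined after the closing brace of the group.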

Once you have defined the commands, you can use them. We give here an example, and the translation by Tralics:

\fooi\fooii12\fooiii+ok. {\itshape 3} =xyz{}!

foo21Seenok <hi rend='it'>3</hi> xyz.!


and also by LaTeX: foo21Seenok 3 xyz.! Some explanations. The first command takes no argument, thus is easy to understand. The second command takes two arguments, its body is #2#1, so that the expansion is the token list formed by the tokens of the second argument followed by the tokens of the first argument. In the case of \fooii12, the arguments are '1' and '2' (each a list of length one). In the case of \fooii {AB} {CD}, the arguments are 'AB' and 'CD' (each a list of length two). This is because TeX ignores initial spaces when reading undelimited arguments; in any case, an argument is well-balanced against braces (same definition as above for the body of a command). The shortest possible sequence of tokens is read (in the case of an undelimited argument, this sequence is never empty). If the result starts with an opening brace and ends with a closing brace, these are removed, provided that the remaining token list is well-balanced; for instance, in the case \fooii{}a, the first argument is empty. If the command is not \long, then \par tokens are forbidden in the argument. In any case tokens that are defined to be \outer are forbidden in an argument.

In the case of \fooiii, fetching the arguments is more involved than in the general case. The specification is: plus sign, argument, dot, argument, equals sign, argument, sharp sign. Note first that the + sign is not part of the command name, but is required after it whenever the command is used. The first argument here is the shortest sequence (possibly empty) of tokens that is a balanced list and is followed by the required token list (here, a single dot). The second argument is found in the same way, with the equals sign as delimiter; here it is ␣{\itshape␣3}␣ (a pair of initial and final braces disappears, if possible; this is not the case here, because of the spaces). The #{ after #3 says that the third argument is delimited by an opening brace. This brace is left unread. Such a construction is rare: it occurs only four times in the LaTeX sources; two examples will be given later in section 2.10.

Consider the following example: \def\opt[#1]{}. If you say \opt[foo] or \opt[{foo}], the argument is foo. If you say \opt[{[foo]}], it is [foo]. It is important to know that braces are required if you want a closing bracket in the argument. In the case of \item[{\it foo}], the braces are useless; the scope of the \it command is limited to foo because an additional pair of braces is added somewhere in the body of the \item command. The following example is non-trivial:

\def\@car#1#2\@nil{#1}
\def\@cdr#1#2\@nil{#2}
\if b\expandafter\@car\f@series\@nil\boldmath\fi


Both commands \@car and \@cdr read a normal (undelimited) argument, and a second argument delimited by \@nil, and return one of these. These commands are implemented in Tralics in the C++ kernel for efficiency. The third line shows a use of \@car, where the arguments are the expansion of \f@series; the main assumption is that this token list does not contain the \@nil token, which is a reserved command. The caller of the macro must also ensure that the list is not empty, for otherwise the first argument would be \@nil, and the end of the second argument would never be seen if the \@nil does not appear in the document text. Note that an error is signaled and scanning stops at the first \par token (or empty line) because the command is not \long.

Let's assume that \f@series expands to a non-empty list, for instance mc (this means that the current font has medium weight and is condensed). Then \@car mc\@nil expands to m. The third line of our example uses \@car to get the first character of \f@series, and compares it to b (the result is true if the current font is bold, extra bold, bold condensed, etc). This code is used for typesetting the LaTeX2ϵ logo in its bold version. The commands \if and \expandafter will be explained later. Note that \if fully expands what follows the letter b. This means that you are in trouble if \f@series expands to an empty list, or if the first token is a command whose expansion may cause problems (perhaps because it has delimited arguments and \@car gobbled the delimiter), or is empty, or is a list that starts with the letter b.
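
One way to protect such a test against an empty list is to append a \relax before the delimiter. The following is only a sketch, to be used in a LaTeX context where \@car is defined as above; the names \firstof, \series@a and \series@b are hypothetical.

```tex
\makeatletter
% \firstof\cmd: first token of the one-level expansion of \cmd.
% The extra \relax guarantees that \@car always finds a first argument.
\def\firstof#1{\expandafter\@car#1\relax\@nil}
\def\series@a{bx}
\def\series@b{}
\if b\firstof\series@a yes\else no\fi   % yes
\if b\firstof\series@b yes\else no\fi   % no: \if compares b with \relax
\makeatother
```

In the second test, the first argument of \@car is the \relax token, which cannot be expanded and is never equal to a character in an \if test.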

The following example is from the TeXbook:

\def\cs AB#1#2C$#3\$ {#3ab#1 c##\x #2}
\cs AB{\Look}C${And\$ }{look}\$ 5


If you feed this to Tralics, you will get three errors (one because of the ##, and two undefined commands). In verbose mode, the transcript file of Tralics will contain the following:

\cs AB#1#2C$#3\$->#3ab#1 c##\x #2
#1<-\Look
#2<-
#3<-{And\$ }{look}



If you want to put 7 in the category code of the character defined by the command \A, you should say \catcode\A=7~. It is possible to make \A a reference to the character number 25, by using \chardef. Thus you can say \chardef\A25~ and \catcode\A7~. Note that, in the context of routines like scanint, a character number is a valid number; so that \A can be used as the number 25, wherever a number is required. In the sources of LaTeX you can see \chardef\active=13. You will also see \mathchardef\@cclvi=256; there is no difference between \chardef and \mathchardef, except that a character is in the range 0-255, while a math char can take larger values (less than $2^{15}$). You can use \countdef\B26 (this makes \B a reference to count register number 26), \dimendef\C27 (this makes \C a reference to dimension register number 27), \skipdef\D28 (this makes \D a reference to skip register number 28), \muskipdef\E29 (this makes \E a reference to muskip register number 29), and \toksdef\F30 (this makes \F a reference to token register number 30). There is no \boxdef. The reason is that, if you want to copy the value of counter 1 into counter 0, you say \count0=\count1. If you say \count@=\B this will put the value of the counter 26 into \count@ (this is the counter 255). However, you say \setbox0=\copy1 if you want to copy the content of box 1 into box 0: the syntax is not the same. Note that \setbox0=\box1 copies and clears the box number one. When you use a command like \chardef, a line will be added to the transcript file, even in non-verbose mode, see section 6.13.
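
The following lines sketch how such references behave where a number is expected. The scratch register 4 is used here instead of 26 (an arbitrary choice made for this illustration), so that no allocated register is disturbed.

```tex
\chardef\A=25        % \A stands for the character number 25
\countdef\B=4        % \B is a reference to \count4
\B=\A                % \count4 is now 25
\advance\B by \A     % \count4 is now 50
\catcode\A=12        % changes the category code of character 25
\showthe\B           % shows 50
\showthe\count4      % shows 50 as well
```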

Commands can be defined via \let´. You say \let\A=\B, where \A is a token that can be defined (active characters or commands; TeX does not care if the token is defined or not). It is followed by <equals><one optional space>. This means that TeX reads all space tokens; if the first unread token is an equals sign, it is read as well as the next token, provided that it is a space. If the equals sign is followed by two space tokens, only one is read. Instead of \B, you can put any token. After that, the current meaning of \A will be the current meaning of \B. For instance, if you say \let\foo\bar\show\foo you will get \foo=macro:->\mathaccent "7016\relax. In plain TeX, you would see a space instead of \relax (both a space and a \relax indicate the end of the number). In Tralics, you would see \foo=\bar, this is because \bar is a primitive, instead of a user defined command. If you say \let\A=+, then \A will behave like a + character (of category 12). In fact, this is called an implicit character, and sometimes an explicit character is required. For instance in the case \parindent=-3.4pt, the minus sign, the digits, the dot, and the two letters pt must be explicit characters. However, after

\let\bgroup={  \let\egroup=} \let\sp=^ \let\sb=_


there is no difference between $x\sp\bgroup a\sb b\egroup$ and $x^{a_b}$. The assignments shown here are made by Tralics when bootstrapping, and the commands so defined should be considered primitives. A token list has to be well balanced against explicit braces. For instance

\def\foo{{\catcode`}=0\egroup}


satisfies the requirements. The body of the command consists of {_1 \catcode `_12 }_2 =_12 0_12 \egroup (subscripts are category codes). If you evaluate \foo, the \catcode command will read the four tokens that follow; it will modify the category code of the closing brace. All this happens inside a group opened by {_1 and closed by \egroup, so that this is harmless. One use of \let is the following:

\def\fooA{a very long command}
\def\fooB{another very long command}
\def\xbar#1{\ifx 0#1\let\foo\fooA \else \let\foo\fooB\fi}


Here we use the fact that \let just moves a pointer. This is faster than copying a list. In particular, consider

\def\xbar#1{\ifx 0#1\fooA \else \fooB\fi}
\def\xbar#1{\ifx 0#1\let\foo\fooA \else \let\foo\fooB\fi\foo}


The first line executes conditionally one of \fooA and \fooB. However, this command cannot read an argument (because \fooA is followed by \else and \fooB by \fi). In the second case, we define \foo conditionally, and it can read its arguments without problem.
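
To make the difference concrete, here is a short sketch in which the two commands take one argument each (these one-argument definitions of \fooA and \fooB are chosen for this illustration only):

```tex
\def\fooA#1{<A:#1>}
\def\fooB#1{<B:#1>}
\def\xbar#1{\ifx 0#1\let\foo\fooA \else \let\foo\fooB\fi\foo}
\xbar 0{x}   % <A:x>
\xbar 1{y}   % <B:y>
```

The conditional is completely finished when \foo is evaluated, so that \foo finds {x} or {y} as its argument, and not \else or \fi.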

You can use the following construct

\def\addtofoo#1{\let\oldfoo\foo\def\foo{#1\oldfoo}}
% example of use
\def\foo{A}\foo \addtofoo{B}\foo


This typesets as ABA. Beware: the \addtofoo command can be used only once (the old value of \oldfoo has to be saved...). We shall see later how to replace in the definition above the \oldfoo by its value, using either token lists or \edef, using a method where \oldfoo is a temporary. This is another example:

\def\double#1#2{\let#1#2\def#2{#1#1}}
% example
\def\B{\C}\def\C{to}\double\tmp\B


Here \B typesets as toto. In fact \B is defined as \tmp\tmp, where \tmp is the old definition of \B, namely a command that expands to \C. If you say \def\C{ti}\B, you will get titi. If in \double the \let is replaced by a \def as \def#1{#2}, the expansion of \tmp would have been \B, and \B would have been the same as \B\B. You see the problem? This could provoke a stack overflow, a parameter stack overflow, or even a program crash.

Let's mention the existence of \futurelet\A\B\C. It is the same as \let\A\C\B\C. The usefulness of such a construct will be explained later.
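
A typical use of \futurelet is to look at the next token before deciding how to continue, for instance in order to detect an optional argument. The following is only a sketch; the names \opt, \optA, \optB, \optC and \cmd are hypothetical, and \next is used as a scratch name.

```tex
\def\opt{\futurelet\next\optA}
% When \optA is evaluated, \next has the meaning of the token
% that follows it, and that token is still unread.
\def\optA{\ifx\next[\let\cmd\optB \else \let\cmd\optC \fi \cmd}
\def\optB[#1]{(with #1)}
\def\optC{(without)}
\opt[x]   % (with x)
\opt y    % (without)y
```

The choice is recorded via \let, and \cmd is evaluated after the \fi, so that \optB can read its bracketed argument.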

You can say \expandafter\A\B. In such a case, TeX reads the first token, saves it somewhere, calls expand on the second token if possible, and re-inserts the saved token. Nothing special happens if the second token (here \B) cannot be expanded, because it is a non-active character, or a command like \par or \relax. But assume that \A is a command that uses one argument (for instance \textit) and \B expands to foo. If you use \expandafter, only the first letter will be in italics. Assume that \foo expands to a dollar sign. Then $\foo is an empty math formula because \foo is not expanded, but \expandafter$\foo. is a display math formula with a dot. The main reason why tokens are not expanded after a dollar sign (when TeX looks for another dollar sign) is that a test $\ifmmode true\fi$ should evaluate to true. You can use \expandafter if you want the test to be executed outside math mode. Note: if a table contains a template of the form $#$, and if the cell starts with \ifmmode, then the test is expanded (i.e. evaluated) before math mode is entered, because TeX is looking for an \omit token. As a consequence you should always put \relax before a test (this is not needed if a command is made “Robust”).
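
The effect on argument grabbing can be sketched as follows; \first is a hypothetical one-argument command that puts its argument in brackets.

```tex
\def\B{foo}
\def\first#1{[#1]}
\first\B                             % [foo]: the argument is the token \B
\expandafter\first\B                 % [f]oo: \B is expanded, the argument is f
\expandafter\first\expandafter{\B}   % [foo]: the argument is {foo}
```

In the last line, the second \expandafter jumps over the opening brace, so that \B is expanded before \first looks for its argument.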

Look carefully at the following lines:

1 \def\toto{\titi!}\def\titi{\tata}\def\tata{\tutu}
2 \expandafter\expandafter\expandafter\def\toto{5}
3 \let\E\expandafter \E\E\E\def\toto{6}
4 \def\E{\expandafter} \E\E\E\def\toto{7}
5 \expandafter\def\toto{8}


On the first line we define three commands \toto, \titi and \tata. As we shall see, lines 2, 3 and 4 do not change the meaning of \toto, so that, on line 5, the expansion of \toto is \titi!. In this case, the effect of the \expandafter is to replace \toto by \titi!. Hence, line 5 defines a macro \titi, that has to be followed by an exclamation point, takes no argument, and expands to 8. Consider now line 2. The first \expandafter sets aside the second \expandafter token; it expands the next token, which is \expandafter, and the expansion of this is: read the token that follows (here \def), and expand the token that follows. This is \toto, that expands to \titi!. If we pop back the two tokens, line 2 is equivalent to \expandafter\def\titi!{5}. This looks like line 5, so that it is the same as \def\tata!{5}. There is no difference between lines 2 and 3: the \E command behaves exactly like \expandafter. Consider now line 4. What TeX does is expand the first token. It is \E, it expands to \expandafter. Since the token can be expanded, it will. Thus TeX reads and remembers the token that follows. It expands the next token (the third \E). Its expansion is \expandafter. Hence, line 4 is equivalent to \E\expandafter\def\toto{7}. Now, the \E in this list has as effect to try to expand the second token; it is \def, which cannot be expanded. Hence this \E is useless. Line 4 is equivalent to \expandafter\def\toto{7}. And this defines \titi. We give here the trace of Tralics (it is a bit more complete than the trace of TeX):

\E ->\expandafter
{\expandafter \E \E}
\E ->\expandafter
\E ->\expandafter
{\expandafter \expandafter \def}
{\expandafter \def \toto}
\toto ->\titi !
{\def}
{\def \titi !->7}


A question is: how many commands with two characters can be defined in Tralics? The answer is 255 squared (all characters but the null character are allowed). Of course, if you say \def\++{}, this defines the \+ command, not \++. You could imagine changing category codes (but, in a construction like \def\{}{}, it is impossible to give a different role to the first and second opening brace). The solution is given by \csname, you can use it like this: \csname1+1=2\endcsname. Note that this typesets nothing: when \csname manufactures a new control sequence name, it defines it as being \relax (the control sequence will exist, until the end of the job). You can hide the \csname command, like this

\def\nameuse#1{\csname #1\endcsname}
\nameuse{1+1=2}


If you want to define such a beast, you must use \expandafter.

\def\namedef#1{\expandafter\def\csname #1\endcsname}
\namedef{1+1=2}{true}


The two commands \@namedef and \@nameuse are defined by LaTeX and Tralics like \namedef and \nameuse.

You can also say \namedef{++}#1{#1+#1} followed by \nameuse{++}{3}. This should give 3+3. If you want a macro named \{}, you can say \nameuse{\string\{\string\}}, provided that \escapechar=-1. If you do not like this setting of \escapechar, you can define a command, say \Lbra, that expands to {_12 (an opening brace character of category code 12, hence inactive) using whatever method seems best. For instance

{\escapechar=-1 \xdef\Lbra{\string\{}\xdef\Rbra{\string\}}}
\namedef{\Lbra\Rbra}{Hey}


We explained above what happens when three \expandafter come in a row. Thus, it should not surprise you that the following command defines \foo.

\expandafter\expandafter\expandafter\def\nameuse{foo}{12}


A more realistic example of \csname is

\def\allocate#1{....}
\def\newcount#1{\allocate{ctr}\countdef#1\allocationnumber}
\def\newcounter#1{\expandafter\newcount\csname c@#1\endcsname}


There are ten such commands in LaTeX; \newcount, \newtoks, \newbox, \newdimen, \newskip, \newmuskip, \newread, \newwrite and \newlanguage are implemented in Tralics. The equivalent of \allocate takes as argument a type (for counters, dimensions, skip registers, muskip registers, box registers, token registers, input registers, output registers, math families, language codes, insertions, etc) and allocates a unique number depending on the type, and puts it in \allocationnumber. Count registers between 10 and 19 are used for this purpose, and the user should not modify them. Command \new@mathgroup is not implemented because math groups are unused. Note that \newsavebox and \newlength are the same as \newbox and \newskip since Tralics does not check redefinition of the command; the command \newinsert is not implemented (this requires a box register, a count register, a dimen register and a skip register; each unprocessed float in LaTeX uses an insert, this may trigger a 'too many unprocessed floats' error). The command \newhelp is not implemented in Tralics, it allocates no counter.

For instance, if you say \newcount\Foo, the allocated number could be 110; if you say \newskip\Bar, the number could be 46. In the first case, the result is as if you had said \countdef\Foo110. In the case of \newcounter{foo}, the result is as if you had said \countdef\c@foo111. Note that there are only 256 count registers available in TeX. You can use registers zero to nine as scratch registers (do not forget that \count0 contains the current page number); LaTeX uses registers 10 to 21 for its allocation mechanism. In the current version, the first free counter is 79. Some other counters are allocated by the class, and the packages (in the transcript file, one line is printed for every call to \allocate, for instance: \c@chapter=\count80; in Tralics, the line looks like {\countdef \c@foo=\count43}).

A very important point is that all tokens between \csname and \endcsname are fully expanded. It is an error if a non-character token remains. Thus it is important to know which commands are expanded, and which cannot be expanded. The exact rules are in the TeXbook, chapter 20. As a rule of thumb, commands that do no typesetting and modify no internal table can be expanded. More precisely: user defined commands, conversions like \string, \number, conditionals like \fi, marks, and some special commands like \csname, \expandafter, \the can be expanded. A construction like \csname\char`A\endcsname is invalid (\char cannot be expanded).
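
Since expansion is complete, the name can be computed from macros and registers. A small sketch follows; the name \prefix and the use of scratch register 2 are choices made for this illustration.

```tex
\def\prefix{item}
\count2=3
% \prefix expands to the letters item, \number\count2 to the digit 3;
% the unexpandable \endcsname ends the scanning of the register number.
\expandafter\def\csname\prefix\number\count2\endcsname{third}
\csname item3\endcsname   % third
```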

If you say \noexpand\foo, the result is \foo, unexpanded. Example:

1 \def\FOO{12}
3 \edef\xbar{\noexpand\FOO}
4 \noexpand\FOO
5 \expandafter\textit\FOO
6 \expandafter\textit\noexpand\FOO
7 \count0=1\FOO
8 \count0=1\noexpand\FOO


Line two is an error: the no-expanded \FOO is not a character. On line 3, the body of \xbar is \FOO´, it will be expanded later. The translation of line 4 is empty (the command \FOO is temporarily seen as \relax, and \relax does nothing). Because of the \expandafter, the argument of \textit on line 5 is 1, on line 6 it is 12. On line 7, 112 is put in \count0, because \FOO is expanded. On line 8, 1 is put in the register, and 12 is typeset. On lines 8 and 6, \FOO is expanded twice, the first expansion being inhibited by the \noexpand.

Some quantities are never expanded, for instance \lowercase (this is black magic), \def (more generally all assignments), \relax (it does nothing, but stops scanning integers, dimensions, glue, etc), \hbox, \par, \left, etc. There are cases when an expandable token is not expanded: ten cases in TeX, and four additional cases in ϵ-TeX, these are described in section 6.12. Be careful with constructs like \csnameé\endcsname: LaTeX may signal an error involving \unhbox.

A command can be defined via \edef instead of \def (\xdef is the same as \edef, with an implicit \global prefix). All tokens, unless defined with \protected, in the body of the definition are expanded. Example:

\def\A{\B\C} \def\C{1}
\def\Bgroup{{\iffalse}\fi}\def\Egroup{\iffalse{\fi}}
{\let\B\relax \global\edef\D\bgroup{\A\noexpand\C\egroup}}
{\let\B\relax \global\edef\E\Bgroup{\A\noexpand\C\Egroup}


In this example, we consider two groups, that define (locally) a command \B and (globally) two commands \D and \E. The difference between these two commands is that \bgroup is an implicit character: when evaluated, it behaves like an opening brace, but it cannot be expanded. On the other hand, \Bgroup expands to an opening brace. The \edef expands tokens following an explicit opening brace. It stops reading after having found an explicit closing brace (resulting from the expansion of \Egroup, not \egroup). The expansion of \A is \B\C, this is expanded again. Since \B is \relax, it cannot be expanded, and is left unchanged. The expansion of \C is 1, so that the full expansion of \A is \B1. The expansion of \noexpand\C is \C. Thus, the example is equivalent to

\global\def\D\bgroup{\B1\C\egroup}
\global\def\E\Bgroup{\B1\C}


You can put three \noexpand in a row followed by some token X. After the first expansion, the result is \noexpand followed by X, after the second expansion, the result is X. In the example that follows, the value of \B is \xbar.

\def\xbar{xbar}
\edef\A{\noexpand\noexpand\noexpand\xbar}
\edef\B{\A}


Consider a realistic example like this

\def\add#1#2{\edef#1{#1\do{#2}}}
\def\cons#1#2{\begingroup\let\@elt\relax\xdef#1{#1\@elt #2}\endgroup}


We can say something like

\def\A{}\def\B{}  %init
\let\do\relax% just in case
\add\A{x}, \add\A{y}, \add\A{z}.
\cons\B{ab}, \cons\B{cd}, \cons\B{ef}.
\show\A\show\B


This gives two ways to add some tokens to a list. Because both commands use \edef, full expansion is in use; you have to be very careful if the tokens contain macros that can be expanded. For the case of \add, we assume that \do does nothing; for the case of \cons, the command resets \@elt to \relax. The body of \A will be \do{x}\do{y}\do{z} and the body of \B will be \@elt ab\@elt cd\@elt ef. Note the absence of braces: if you really need them, you should add them to the argument of the \cons command. The built-in command \@cons

The major problem with \edef is that it is not aware of assignments. Assume that \def\@A\B{}, and \def\C{B \let\@A\D}, \def\E{\C} have been somehow evaluated. Consider now an \edef containing \E. This implies expansion of \C, hence of \let\@A\D. The \let command cannot be expanded. Hence \@A is expanded, and you get the following error: Use of \@A doesn't match its definition from inside \C. You have never heard of this command \@A, and never used \C! For this reason some commands are made robust: for instance \hspace expands to \protect\hspace␣ (the second command here has a space at the end of its name), and \protect is defined to be \relax, or \noexpand, and sometimes \string. This mechanism works only if you use \protected@edef instead of \edef. (Note: \protect behaves like \string inside \protected@write, which is a variant of \write).

## 2.4. Variables in TeX

By variable, we mean everything that the user can modify or watch changing. For instance, the current hash table usage is not a variable (it varies, of course, but the value is available only at the end of the run, in the transcript file). The current vertical list is updated whenever a paragraph is split into lines; you cannot access this list, however the \output routine gets the part of it that should be typeset on the current page in the box register 255. There are general purpose variables, and specialised ones: for instance \spacefactor makes sense only in horizontal mode, and the height of the material on current page (\pagetotal) can be used only between paragraphs (in fact, it is updated by TeX whenever a line is added to the page; you can consult, and even modify, this value at any time). There are variables that you cannot modify (the version number, for instance) or only once (the magnification), or in the preamble (i.e., LaTeX reads some variables at begin-document, changes done later to these variables are ignored).

Variables can be classified into two categories depending on their use: in some cases you need to put a prefix before \foo if you want to use it, in other cases the prefix is required for modification. For instance, if \foo is a user-defined command, you say \let\foo, or \def\foo, if you want to change the value, and simply \foo if you want to use it. In the same fashion \font\tenrm defines a font, and \tenrm is a use. On the other hand, if you say \pageno=3, this will set the current page number to 3 (this is plain TeX syntax, the LaTeX syntax will be explained later). If you say something like \hskip-\fontdimen2\font, the \hskip command is a prefix that says that the variable that follows will be used. In this case, this is some dimension from a font. Note that \fontdimen is a prefix so that \font does not define a new font, but refers to the current font. The meaning of the above piece of code is: insert horizontal space, whose amount is the opposite of the second parameter of the current font (i.e., normal interword space).

According to the TeXbook, a <font> can be a command like \tenrm defined by \font \tenrm =somefont, or the null font \nullfont, or the current font \font, or a family member (\textfont, \scriptfont, or \scriptscriptfont, followed by a 4-bit integer). In the case of \hyphenchar or \skewchar, a <font> follows the command. This gives a reference to an integer, the hyphenchar or skewchar of the font (if this integer is not a valid character, the font has no hyphenchar or skewchar). In the case of \fontdimen, there is an integer P, a font, and this defines a reference to a dimension. The integer P must be positive and not greater than the number M of parameters in the font (initialised by TeX to the number of parameters in the font metric file, 7 for a normal font, 13 for math extension, 22 for math symbols, see TeXbook, appendix F). You can get an error: Font somefont has only 7 fontdimen parameters. In Tralics, the value is zero if P is out-of-range. In TeX, the table of the last loaded font can be dynamically increased: if you assign a value at position $P>M$, this will increase M. In Tralics, this is possible for all fonts, if $P<10^{5}$.

The value of a variable can be

• an integer (32bit, signed, with magnitude less than ${2}^{31}$). The value can be restricted in some cases (to 0-255 if it is an index in a table of registers, to 0-255 if it is a character, etc).

• a dimension, often expressed in pt, (an integer number of times the small unit sp). Normally, the maximum value is ${2}^{14}$pt, but TeX does not always check for overflow.

• a glue. Called rubber length in LaTeX. It is like a dimension with a stretch part, and a shrink part.

• a muglue. Like glue, but only one unit of measure is allowed: mu (math unit).

• a token list. This is a list of tokens (as always, well balanced against explicit braces).

• a font.

• a box (a box contains characters, rules, boxes, penalties, glue, whatsit, etc, but no commands). In Tralics a box contains XML stuff.
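
As an illustration of the list above, each kind of value can be stored with a primitive assignment. This is a sketch that uses scratch register number 2 for each type; the font name \myfont is hypothetical, and the \font line assumes that the metric file cmr10 is available.

```tex
\count2=-5                      % an integer
\dimen2=1.5pt                   % a dimension
\skip2=3pt plus 1pt minus 2pt   % a glue
\muskip2=3mu plus 1mu           % a muglue
\toks2={a {balanced} list}      % a token list
\font\myfont=cmr10 at 12pt      % a font
\setbox2=\hbox{some text}       % a box
```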

You can say \afterassignment\foo\count0=3; in this case, the command \foo is pushed on a special stack, and popped after assignment is complete. There is only room for one token on this special stack. For instance, if you write the following:

\def\fooA{\relax}\def\fooB{\relax}\def\fooC{\relax}\def\fooD{\relax}
\afterassignment \fooA\afterassignment\fooB
\fooC\count0=1\fooD


the transcript file of Tralics will contain (in verbose mode)

[9] \afterassignment \fooA\afterassignment\fooB
{\afterassignment}
{\afterassignment: \fooA}
{\afterassignment}
{\afterassignment: \fooB}


At this point, the after assignment stack contains \fooB. The order of evaluation is now the following: \fooD is expanded; this gives \relax, which terminates scanning of the number; it will be read again, after evaluation of \fooB:

[10] \fooC\count0=1\fooD
\fooC ->\relax
{\relax}
{\count}
+scanint for \count->0
\fooD ->\relax
+scanint for \count->1
{after assignment: \fooB}
\fooB ->\relax
{\relax}
{\relax}
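
A common use of \afterassignment is to let a command read a value with the usual assignment syntax, and act once the value is known. The following is a sketch only; the names \report and \setcount are hypothetical, and scratch register 2 is used.

```tex
\def\report{[\the\count2]}
% The assignment \count2=... is completed with what follows \setcount;
% the saved token \report is inserted just after it.
\def\setcount{\afterassignment\report\count2=}
\setcount 17 % typesets [17]
```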


You can use the \showbox command for displaying the content of a box. This is a little example. It uses \everyhbox and \afterassignment. Note the order in which these tokens are inserted.

\everyhbox{3}
\def\foo{12}
\afterassignment\foo\setbox0=\hbox{4}
\showbox0


This is what TeX prints in the log file:

> \box0=
\hbox(6.4151+0.0)x19.99512
.\T1/cmr/m/n/10 1
.\T1/cmr/m/n/10 2
.\T1/cmr/m/n/10 3
.\T1/cmr/m/n/10 4


The first line of the trace starts with \hbox or \vbox, followed by the dimensions (height, depth, width; the unit is pt by default), optionally followed by 'shifted 27.1' if the box is shifted, and by 'glue set 0.19' if the glue has to be stretched or shrunk. After that, you will see the content of the box, one line per item (no more than \showboxbreadth lines are printed per box), each item is preceded by a context (a sequence of N dots at depth N, tokens at depth greater than \showboxdepth are not shown). In the box, you can see things like '\penalty -51' or '\kern 28.45274' or '\glue 3.0 plus 1.0' or '\glue(\baselineskip) 2.28015' (this last glue is inserted automatically by TeX, it knows where it comes from, so that the name can be printed), \special{...}, \write4{\indexentry...}. The interesting point in the last object is that we have a list of tokens that will be evaluated later (when the page is shipped out). Tralics puts neither \kern, nor \penalty, nor \glue in a box. The \special command is not implemented; finally \write is never delayed. In our example, the box contains four items, which are characters (TeX shows a command that contains the name of the font; in our example, the font is something like ecrm1000).

In Tralics, you would see the same characters, but no font and no size. On the other hand, you can say something like

\everyxbox{Test}
\setbox0=\xbox{foo}{1\xbox{bar}{2} %
\showbox0


and you will see

<foo y='2'>Test1<bar x='1'>Test2</bar> 3</foo>


Note the two commands that were used to add attributes to the current XML element and to the last constructed one. We have added another command, \XMLaddatt, that takes as optional argument the id of the element to which the attribute-value pair should be added. This is an integer; if omitted, the current element is used. You can use \XMLlastid or \XMLcurrentid (these are references to variables; you must use \the if you want the value). If you want to overwrite an existing attribute pair, you must use a star. The previous example can be written like this:

\everyxbox{Test}
\setbox0=\xbox{foo}{1\xbox{bar}{2} %
\showbox0
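As a rough sketch of the \XMLaddatt calls described above: the argument order (attribute name, then value) and the position of the star and of the optional id are assumptions made for illustration, not taken from the text.

```latex
% Hypothetical usage sketch; argument order and star placement are assumed.
\setbox0=\xbox{foo}{1\xbox{bar}{2}%
  \XMLaddatt[\the\XMLlastid]{x}{1}% id given: the last constructed element
  \XMLaddatt{y}{2}%                 no id: the current element
  \XMLaddatt*{y}{3}}%               star: overwrite the existing value of y
\showbox0
```

Note the use of \the: \XMLlastid is a reference to a variable, so its value has to be extracted before it can serve as the optional argument.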


If \foo is any command then \show\foo will show its value. Here are some examples

\def\Bar#1#{#1} \show\Bar
\let\foo\par \show\foo
\renewcommand\foo[2][toto]{#1#2} \show\foo
\let\foo=1 \show\foo
\let\foo=_ \show\foo
\let\foo=\undef \show\foo
\show\bgroup


This is what Tralics prints (it differs slightly from the LaTeX output)

\Bar=macro: #1#->#1.
\foo=\par.
\foo=opt \long macro: toto#2->#1#2
\foo=the character 1.
\foo=subscript character _.
\foo=undefined.
\bgroup=begin-group character {.


In the case of a variable, you can say \the\foo, the result is a token list that represents the value of \foo (if \foo is a token list, \the\foo is the value of \foo, otherwise, it is a list of characters). The command \showthe will show the value, i.e. print on the terminal the token list returned by \the. Example

\def\Show#1{\the#1\showthe#1}
\widowpenalty=3 \Show\widowpenalty
\parindent1.5pt \Show\parindent
\leftskip = 1pt plus 2fil minus 4fill \Show\leftskip
\thinmuskip = 3mu plus -2fil minus 4fill \Show\thinmuskip
\count0=17 \Show{\count0}
\dimen0=17pt \Show{\dimen0}
\skip0=17pt plus 1 pt minus 2pt \Show{\skip0}
\muskip0=17mu plus 1 mu minus 2mu \Show{\muskip0}
\Show{\catcode`\A}
\Show{\lccode`\B}
\Show\inputlineno
\font\xa=cmr10 at 11truept
\fontdimen6\xa = 11pt \hyphenchar\xa=`\-
\Show{\fontdimen6\xa}
\Show{\hyphenchar\xa}
\chardef\foo25
\Show\foo
\Show\xa
\toks0={\foo = \foo} \def\foo{foo}
\Show{\toks0}


This is what Tralics prints on the screen.

\show: 3
\show: 1.5pt
\show: 1.0pt plus 2.0fil minus 4.0fill
\show: 3.0mu plus -2.0fil minus 4.0fill
\show: 17
\show: 17.0pt
\show: 17.0pt plus 1.0pt minus 2.0pt
\show: 17.0mu plus 1.0mu minus 2.0mu
\show: 11
\show: 98
\show: 79
\show: 11.0pt
\show: 45
\show: 25
\show: cmr10
\show: \foo= \foo


The typeset result is: 31.5pt0.0pt0.0mu1717.0pt17.0pt plus 1.0pt minus 2.0pt17.0mu plus 1.0mu minus 2.0mu11987911.0pt 45 25cmr10 foo= foo.

In the case of \the\foo, \showthe\foo, \advance\foo, \multiply\foo, \divide\foo, the token that follows the first command is fully expanded.
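A minimal illustration of this expansion rule, using a hypothetical macro \mycount that hides a register:

```latex
\count0=17
\def\mycount{\count0 }% hypothetical macro; the space ends the number 0
\advance\mycount by 3 % \mycount is expanded, so \count0 is advanced
\showthe\mycount      % \mycount is expanded again; TeX shows 20
```

Because the token after \advance and \showthe is expanded, the macro can be used wherever the register itself could be used.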

## 2.5. All the variables

All variables (exceptions will be given later) are in the table of equivalents: this table contains the current meaning of quantities that are saved/restored by the grouping mechanism of TeX. In TeX this table is divided into six parts; in Tralics, the layout is slightly different, for instance because TeX makes heavy use of glue (each space character produces a glue item), while Tralics ignores glue completely. This big table contains the following objects:

1. the current equivalent of single character control sequences (for ~ as well as \~);

2. the hash table (in Tralics, there are two such tables: if the command \foo produces <bar gee='true'>, the three strings 'bar', 'gee' and 'true' are in a special table).

3. all glue parameters.

4. all quantities that fit on 16 bits.

5. all integers.

6. all dimensions.
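The save/restore behaviour of the quantities in this table can be sketched as follows (plain TeX; \count1 is used only as an example register):

```latex
\count1=5
{\count1=7 \showthe\count1 }% inside the group: 7
\showthe\count1 % after the group: 5, the old value was restored
{\global\count1=9 }% a global assignment survives the group
\showthe\count1 % 9
```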

The glue parameters are the following (unused by Tralics and initialised to 0, unless stated otherwise):

• \lineskip: interline glue if \baselineskip is infeasible.

• \baselineskip: desired glue between baselines.

• \parskip: extra glue just above a paragraph.

• \abovedisplayskip: extra glue just above displayed math.

• \belowdisplayskip: extra glue just below displayed math.

• \abovedisplayshortskip: glue above displayed math following short lines.

• \belowdisplayshortskip: glue below displayed math following short lines.

• \leftskip: glue at left of justified lines.

• \rightskip: glue at right of justified lines. LaTeX uses \leftskip and \rightskip for commands like \centering, \raggedright etc. Unused by Tralics. On the other hand there is \nocentering, whose effect is the same as setting both leftskip and rightskip to zero.

• \topskip: glue at top of main pages.

• \splittopskip: glue at top of split pages.

• \tabskip: glue between aligned entries.

• \spaceskip: glue between words.

• \xspaceskip: glue after sentences.

• \parfillskip: glue on last line of paragraph.

• \thinmuskip: thin space in math formula.

• \medmuskip: medium space in math formula.

• \thickmuskip: thick space in math formula.

• \itemsep: defined by LaTeX. Rubber space between successive items in a list.

• \labelsep: defined by LaTeX. The space between the end of the label box and the text of the item in a list.

• \parsep: defined by LaTeX. Rubber space between paragraphs within an item.

• \fill: defined by LaTeX. Holds 0pt plus 1fill. You should not modify it.

• \smallskipamount, \medskipamount, \bigskipamount. Quantities defined by Tralics in the same way as LaTeX, but unused.

• \floatsep, \textfloatsep, \intextsep, \dblfloatsep, \dbltextfloatsep: glue inserted by LaTeX between float and other material.

• \hideskip. Holds -1000pt plus 1fill. This is used in TeX to implement \hidewidth, a construction not implemented in Tralics.

• \z@skip. You should leave this quantity unchanged.

• \normalbaselineskip, \normallineskip. Used by LaTeX for font switching.

• \listparindent, \topsep, \partopsep. Some parameters used by LaTeX.

• \@tempskipa, \@tempskipb: temporary skip

The token parameters are the following (initially empty; unused by Tralics unless stated otherwise):

• \parshape: for funny paragraphs (not really a token list).

• \output: user defined output routine.

• \everypar: tokens inserted by TeX at start of every paragraph.

• \everymath: tokens inserted by TeX and Tralics at the start of every non-display math formula.

• \everydisplay: tokens inserted by TeX and Tralics at the start of every display math formula.

• \everyhbox: tokens inserted by TeX and Tralics at the start of every \hbox.

• \everyvbox: tokens inserted by TeX and Tralics at the start of every \vbox.

• \everyxbox: tokens inserted by Tralics at the start of every \xbox.

• \everyjob: tokens inserted by TeX and Tralics at the start of every job. This must be defined by the format (for TeX), otherwise it is useless; in Tralics, you must put everyjob="\something{like this}" in the configuration file.

• \everycr: tokens inserted by TeX after every \cr or non redundant \crcr.

• \errhelp: tokens that will be printed by TeX in case of user-error.

• \everyeof: tokens inserted by ϵ-TeX at each end-of-file.

• \@temptokena: scratch register.
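As an illustration of a token parameter, here is a small sketch for \everymath. It shows the TeX side only; what Tralics does with \displaystyle in the translation is not stated here.

```latex
\everymath{\displaystyle}% inserted at the start of each non-display formula
$\sum_{i=1}^n i$         % read as $\displaystyle\sum_{i=1}^n i$
\everymath{}             % back to the initial (empty) value
```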

The integer parameters are the following. These parameters are zero, unless stated otherwise.

• \pretolerance: badness tolerance before hyphenation (initialised to 100 by Tralics).

• \tolerance: badness tolerance after hyphenation (initialised to 200 by Tralics).

• \linepenalty: amount added to the badness of every line in a paragraph.

• \hyphenpenalty: penalty for break after discretionary hyphen.

• \exhyphenpenalty: penalty for break after explicit hyphen.

• \clubpenalty: penalty for creating a club line at the bottom of a page.

• \widowpenalty: penalty for creating a widow line at top of page.

• \displaywidowpenalty: ditto, just before a display.

• \brokenpenalty: penalty for breaking a page at a broken line.

• \binoppenalty: penalty for breaking after a binary operation in a math formula.

• \relpenalty: penalty for breaking after a relation in a math formula.

• \predisplaypenalty: penalty for breaking just before a displayed formula.

• \postdisplaypenalty: penalty for breaking just after a displayed formula.

• \interlinepenalty: additional penalty for a page break between lines.

• \doublehyphendemerits: demerits for consecutive broken lines.

• \finalhyphendemerits: demerits for a penultimate broken line.

• \mag: magnification ratio, times 1000.

• \delimiterfactor: ratio for variable-size delimiters.

• \looseness: change to the number of lines in a paragraph.

• \time: current time of day. Number of minutes since midnight, computed by Tralics at start of run.

• \day: current day of the month (between 1 and 31).

• \month: current month of the year (between 1 and 12).

• \year: current year of our Lord. The initial values of \time, \day, \month, \year, are printed in the transcript file by Tralics in the following format: 2006/10/24 10:18:10

• \showboxbreadth: maximum items per level when boxes are shown (when Tralics shows the content of a box, it always shows everything).

• \showboxdepth: maximum level when boxes are shown (when Tralics shows the content of a box, it always shows everything).

• \pausing: pause after each line is read from a file. In Tralics there is no interaction with the user.

• \tracingonline: show diagnostic output on terminal. In verbose mode, this variable, and some other ones are set to a non-zero value, as explained in section 6.6.

• \tracingmacros: show macros as they are being expanded.

• \tracingstats: show memory usage if TeX knows it.

• \tracingparagraphs: show line-break calculations.

• \tracingpages: show page-break calculations.

• \tracingoutput: show boxes when they are shipped out.

• \tracinglostchars: show characters that aren´t in the font.

• \tracingcommands: show command codes.

• \tracingrestores: show equivalents when they are restored.

• \uchyph: hyphenate words beginning with a capital letter.

• \outputpenalty: penalty found at current page break.

• \insertpenalties. Is the sum of all penalties for split insertions on the current page.

• \spacefactor: According to the TeXbook: “the exact amount of glue inserted by a space depends on \spacefactor, the current font, and the \spaceskip and \xspaceskip parameters as described in Chapter 12.”

• \hangafter: hanging indentation changes after this many lines.

• \floatingpenalty: penalty for insertions held over after a split.

• \globaldefs: override \global specifications.

• \fam: current family.

• \escapechar: escape character for token output. Initialised by Tralics to backslash.

• \defaulthyphenchar: value of \hyphenchar when a font is loaded.

• \defaultskewchar: value of \skewchar when a font is loaded.

• \endlinechar: character placed at the right end of the buffer when reading a new line. Initialised by Tralics to CR (ascii 13).

• \newlinechar: character that prints as a LF. Initialised by Tralics to LF, but not used.

• \language: the current set of hyphenation rules. For Tralics, 0 means English, 1 means French, 2 means German, and 3 stands for any other language.

• \lefthyphenmin: minimum left hyphenation fragment size.

• \righthyphenmin: minimum right hyphenation fragment size.

• \holdinginserts: do not remove insertion nodes from \box255.

• \errorcontextlines: maximum intermediate line pairs shown. In Tralics, the context of an error is not shown.

• \tracingassigns,\tracinggroups,\tracingifs,\tracingscantokens, \tracingnesting. These are extensions of ϵ-TeX, that control what is printed in the transcript file.

• \tracingmath: controls what is printed when Tralics interprets a math formula.

• \predisplaydirection, \lastlinefit, \savingdiscards, \savinghyphcodes. These are other ϵ-TeX extensions, described in section 6.12.

• \FPseed. This is defined only when the FP package is loaded.

• \TeXXeTstate controls ϵ-TeX bidirectional typesetting; unused by Tralics.

• \@nomathml. If this number is non-zero, math formulas are not converted into MathML expressions.

• \notrivialmath. This controls how some trivial math formulas should be translated as text.

• \hyphenchar, \skewchar. The command should be followed by a font reference, and this is a reference to the hyphenchar or skewchar of the font.

• \interlinepenalties, \clubpenalties, \widowpenalties, and \displaywidowpenalties. Commands defined by ϵ-TeX, described in section 6.12. Like \parshape, the command reads an integer n and returns the corresponding value in the slot; it can also read n integers and store them.

• \@mathversion. If the value of the counter is positive, then a bold variant is used for math characters, if possible. The command \mathversion reads an argument; it sets the counter to one if the argument is bold´, to zero otherwise.

• \@tempcnta, \@tempcntb: scratch counters.

• \interfootnotelinepenalty contains penalty added by LaTeX in footnotes.

• \interdisplaylinepenalty contains penalty inserted by LaTeX between lines of equations.
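Since \time counts minutes since midnight, hours and minutes have to be computed by hand. A small sketch, where \hours and \minutes are hypothetical counters allocated for the occasion:

```latex
\newcount\hours \newcount\minutes
\hours=\time \divide\hours by 60 % whole hours since midnight
\minutes=\hours \multiply\minutes by -60
\advance\minutes by \time        % minutes = time - 60*hours
\message{\the\hours:\the\minutes}
```

No leading zero is added, so a time like 10:05 is shown as 10:5; padding would need an additional test.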

The following quantities are read only variables. They are integers, unless stated otherwise.

• \lastpenalty returns the value of the last item on the current list, if it is a penalty (always zero in Tralics).

• \lastkern returns the value of the last item on the current list, if it is a kern. This is a dimension, 0pt in Tralics.

• \lastskip returns the value of the last item on the current list, if it is glue. This is a dimension, 0pt in Tralics.

• \lastnodetype is an ϵ-TeX extension containing the type of the last item on the current list; always zero in Tralics.

• \inputlineno contains the current input line number.

• \XMLlastid contains the unique identifier of the most recently created XML element. Defined only by Tralics.

• \XMLcurrentid contains the unique identifier of the current XML element. Defined only by Tralics.

• \currentgrouplevel contains the current grouping level (index in the semantic stack); it is an ϵ-TeX extension.

• \currentgrouptype contains the type of the current semantic group; it is an ϵ-TeX extension, explained in section 6.12.

• \currentiflevel, \currentiftype, \currentifbranch are ϵ-TeX extensions described in section 6.12. The variables contain information about the condition stack.

• \eTeXversion is defined by ϵ-TeX, contains its revision number.

• \fontcharwd, \fontcharht, \fontchardp,\fontcharic. These ϵ-TeX extension commands read a font identifier, and an integer (character position). They return a property of the character in the font, always 0 in Tralics.

• \parshapelength, \parshapeindent, \parshapedimen. Commands that read an integer and give properties of the current paragraph shape.

• \numexpr, \dimexpr, \glueexpr, \muexpr: these commands read an expression, that can be a number, a dimension, a glue or a math dimension, with an extended syntax. They are ϵ-TeX extensions described in section 6.12.

• \gluestretchorder, \glueshrinkorder, \gluestretch, \glueshrink. These commands read some glue, and extract a part of it. They are ϵ-TeX extensions described in section 6.12.

• \gluetomu, \mutoglue: These commands read some glue, or math glue, and convert. They are ϵ-TeX extensions described in section 6.12.
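A short sketch of the ϵ-TeX expression and glue commands listed above; the values in the comments are those given by ϵ-TeX, and the registers are scratch registers chosen for the example:

```latex
\count255=\numexpr 3*(4+5)\relax \showthe\count255 % 27
\dimen0=\dimexpr 1pt*3/4\relax   \showthe\dimen0   % 0.75pt
\skip0=1pt plus 2fil minus 3pt
\showthe\gluestretch\skip0      % 2.0pt (the unit fil is replaced by pt)
\showthe\gluestretchorder\skip0 % 1 (0 is finite, 1 is fil, 2 is fill)
\showthe\glueshrink\skip0       % 3.0pt
```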

The counters defined in Tralics are the following. The counters are not used unless specified otherwise, but you can say \renewcommand\thepage{...}, this is not an error.

• page. This is \count0. Tralics initialises it to 1.

• enumi, enumii, enumiii, enumiv: for enumerations. Unless specified otherwise, the enumeration counter is updated, but the value is not used.

• part, for parts of a book.

• chapter, section, subsection, subsubsection, paragraph, subparagraph. Each counter depends on the preceding one.

• FancyVerbLine, this is the default counter used by the verbatim environment for counting lines.

• footnote. Each call to the \footnote command increments this counter, but the value is not used.

• mpfootnote. Minipage footnotes.

• bottomnumber, topnumber, totalnumber, dbltopnumber: used by LaTeX for float placement.
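These counters are manipulated with the usual LaTeX commands. A minimal sketch (standard LaTeX behaviour; whether Tralics uses the value is a separate matter, as explained above):

```latex
\setcounter{page}{4}
\stepcounter{page}                 % page is now 5
\renewcommand\thepage{\roman{page}}% not an error, as said above
\thepage                           % in LaTeX this yields v
```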

The dimension parameters are the following:

• \parindent: indentation of paragraphs.

• \mathsurround: space around math in text.

• \lineskiplimit: threshold where \baselineskip switches to \lineskip.

• \hsize: line width in horizontal mode.

• \vsize: page height in vertical mode.

• \maxdepth: maximum depth of boxes on main pages.

• \splitmaxdepth: maximum depth of boxes on split pages.

• \boxmaxdepth: maximum depth of explicit vboxes.

• \hfuzz: tolerance for overfull hbox messages.

• \vfuzz: tolerance for overfull vbox messages.

• \delimitershortfall: maximum amount uncovered by variable delimiters.

• \nulldelimiterspace: blank space in null delimiters.

• \scriptspace: extra space after subscript or superscript.

• \predisplaysize: length of text preceding a display.

• \displaywidth: length of line for displayed equation.

• \displayindent: indentation of line for displayed equation.

• \overfullrule: width of rule that identifies overfull hboxes.

• \hangindent: amount of hanging indentation.

• \hoffset: amount of horizontal offset when shipping pages out.

• \voffset: amount of vertical offset when shipping pages out.

• \emergencystretch: reduces badnesses on final pass of line-breaking.

• \z@: You should forget that this is a variable, and use it only as the constant zero.

• \p@: contains 1pt.

• \evensidemargin: defined by LaTeX, left margin for even pages.

• \oddsidemargin: defined by LaTeX, left margin for odd pages.

• \leftmargin: defined by LaTeX. Space between the left margin of the page and the left margin of the text. Depends on the list level.

• \rightmargin: defined by LaTeX. Similar to \leftmargin, but for the right margin.

• \leftmargini: defined by LaTeX. Value of left margin for a list at level one.

• \leftmarginii: defined by LaTeX. Value of left margin for a list at level two.

• \leftmarginiii: defined by LaTeX. Value of left margin for a list at level three.

• \leftmarginiv: defined by LaTeX. Value of left margin for a list at level four.

• \leftmarginv: defined by LaTeX. Value of left margin for a list at level five.

• \leftmarginvi: defined by LaTeX. Value of left margin for a list at level six.

• \itemindent: defined by LaTeX. Extra indentation added to the horizontal indentation of the text part of the first line of an item in a list.

• \labelwidth: defined by LaTeX. Nominal width of the box containing the label of an item.

• \fboxsep: defined by LaTeX. Space left between the edge of the box and its content produced by \fbox or \framebox.

• \fboxrule: defined by LaTeX. Width of line produced by \fbox or \framebox.

• \arraycolsep: defined by LaTeX, contains half of the width of the default horizontal space in a math array.

• \tabcolsep: defined by LaTeX, contains half of the width of the default horizontal space in a text array.

• \arrayrulewidth: defined by LaTeX, contains the width of vertical rules in an array.

• \doublerulewidth: defined by LaTeX, contains the distance between two vertical rules in an array.

• \normallineskiplimit: default value of \lineskiplimit.

• \epsfxsize. Used for including images.

• \epsfysize. Used for including images.

• \unitlength. Used for the picture environment. Initialised by Tralics to 1pt.

• \textwidth. Width of the text. Initialised by Tralics to 427pt. More or less 15cm.

• \textheight. Height of the text. Initialised by Tralics to 570pt. More or less 20cm.

• \columnwidth: this is defined by LaTeX as containing the current line width; in a two-column document this is half of (\textwidth minus \columnsep), otherwise it is \textwidth. LaTeX copies this value into \hsize in some cases (for instance, when switching between one column and two columns).

• \marginparwidth, \marginparsep, \marginparpush: three parameters defined by LaTeX for placement of marginal notes.

• \topmargin, \headheight, \headsep: these three dimensions are used by LaTeX for controlling the header of a page.

• \footskip: this dimension (not glue) is used by LaTeX for controlling the footer of a page.

• \columnsep, \columnseprule: this is defined by LaTeX as containing the distance between columns and width of the vertical rule between columns in a multi-column document.

• \linewidth: this is defined by LaTeX as containing the current line width (typically, this is \columnwidth minus the margins introduced by list environment). In Tralics, these two commands are undefined, except that you can use it as a unit of measure inside the optional argument of \includegraphics for 'height' and 'width': the value is 15cm.

• \@tempdima, \@tempdimb, \@tempdimc: three scratch registers.

• \paperheight, \paperwidth: contain the size of the paper on which the document will be printed; this is A4 by default in Tralics, namely 297mm and 210mm.

• \jot: a small quantity, in fact 3pt.

• \maxdimen: a large quantity, in fact the largest dimension that can be stored, namely $2^{14}$pt minus one sp.
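The value of \maxdimen can be checked in TeX by converting it to scaled points (1pt is $2^{16}$sp, so $2^{14}$pt is $2^{30}$sp):

```latex
\dimen0=\maxdimen
\showthe\dimen0   % 16383.99998pt
\count255=\dimen0 % a dimension used as an integer is a number of sp
\showthe\count255 % 1073741823, that is 2^30 - 1
```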

The registers are the following:

• \count xxx: table of 256 “count” registers.

• \dimen xxx: table of 256 “dimen” registers.

• \skip xxx: table of 256 “skip” registers.

• \muskip xxx: table of 256 “muskip” registers.

• \toks xxx: table of 256 token lists.

• \box xxx: table of 256 box registers.

• \wd xxx: Width of box N. If you ask for the value of the width, you will get zero. If you modify the width, nothing happens.

• \ht xxx: Height of box N.

• \dp xxx: Depth of box N.

• \delcode xxx: table of 256 delimiter code mappings. Unused by Tralics.
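A small sketch of register access, using the registers numbered 255 that plain TeX conventionally leaves free for scratch use (box 255 excepted):

```latex
\count255=42 \dimen255=1.5pt \toks255={some tokens}
\showthe\count255 % 42
\showthe\dimen255 % 1.5pt
\showthe\toks255  % some tokens
\setbox0=\hbox{abc}
\showthe\wd0 % natural width in TeX; zero in Tralics, as said above
```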



Translation is

<p>a</p>
<vfil/><vfill/><vfilneg/><vss/>
<p>b<hfil/><hfill/><hfilneg/><hss/>c</p>
<p spacebefore='12.0pt'>d</p>
<p spacebefore='3.0pt'>e</p>
<p spacebefore='6.0pt'>f
<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'>
<mrow><mspace width='1.66656pt'/><mspace width='1.111pt'/>
<mspace width='10.0pt'/></mrow></math>
</formula>
</p>


In TeX, there is no command that starts a paragraph. The \leavevmode command is implemented as \unhbox\voidb@x, where \unhbox starts a new paragraph if needed, and produces nothing, provided that its argument is the void box; the paragraph may contain the current indentation and the value of \everypar. This is a primitive in Tralics, the value of \everypar is unused. Both commands \indent and \noindent make sure the current mode is horizontal, the first one inserts the current indentation (an empty box with the width of \parindent). In TeX, you can use \indent anywhere in a paragraph. In Tralics, the translation of

a\noindent b \indent c
{\centering a\noindent b \indent c\par d}
{\raggedright a\noindent b \par\indent c\par d}


is

<p>a</p>
<p noindent='true'>b</p>
<p rend='center' noindent='false'>c
a</p>
<p rend='center'>b</p>
<p rend='center'>c</p>
<p rend='center'>d
a</p>
<p noindent='true' rend='flushed-left'>b</p>
<p noindent='false' rend='flushed-left'>c</p>
<p rend='flushed-left'>d</p>


The rules are the following: if \indent or \noindent appear in an empty paragraph, that is not centered, and that has no noindent attribute, one is set. Otherwise a new paragraph is started. It will have a noindent attribute, unless the paragraph is centered. The value of \parindent is never considered.

The translation of \par is a bit complicated. Nothing happens inside a \hbox, in \term, or if the current mode is not horizontal. The current XML element should be a <p>. A final space is removed from it. It will be popped from the stack. This restores the mode to the value of the previous mode. It restores the current XML element to the parent of the <p>. A newline character is added to it. There is an exception: in cases like \noindent\par, or \bigskip\par, or \\\par, the \par command was ignored until version 2.5 (pl7). The behavior is now: if the paragraph is empty, but there are attributes, then the <p> is removed, and attributes are added to the next <p> element.

The translation of \\ depends on the context. The command can be followed by an optional star, and an optional dimension. Inside a cell, this indicates the end of the cell as well as the end of the row. You can say \newline, this is like \\ without optional argument and array test. In vertical mode, LaTeX complains with 'There's no line here to end', but Tralics ignores the command. Inside a title, the command is ignored. Otherwise, the behavior is like \noindent; if an optional argument is given, it behaves like \vskip. For instance, the translation of


An extension of ϵ-TeX is \ifdefined. This reads a token, and yields true unless it is a macro (or active character) that is undefined. The command \ifcsname reads all characters up to \endcsname and constructs a character string in the same way as \csname. The value is true if a command with that name exists (possibly undefined); it is false otherwise (the important point is that the command is not created). In the example that follows, assuming \foo and \FOO undefined, you will see aBc (or abc, in case someone defined \undefined). You will also see DEF, because the LaTeX command \@ifundefined creates the token if it does not exist, and sets it to \relax.

\makeatletter
\ifcsname foo\endcsname A\else a\fi
\ifx\foo\undefined  B\else b\fi
\ifdefined\foo  C\else c\fi
\@ifundefined{FOO}{D}{d}
\ifcsname FOO\endcsname E\else e\fi
\ifdefined\FOO F\else f\fi


The command \iffontchar is another extension; it reads a font identifier (for instance \font denotes the current font) and an integer (a character position); it yields true if the font specifies a character at that position.
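For instance, the following sketch tests whether the current font has a character at the position of the letter A (the messages are arbitrary):

```latex
\iffontchar\font`A % the space ends the character number
  \message{A is present}%
\else
  \message{A is absent}%
\fi
```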

The last conditional to explain is \ifx. This reads two tokens and compares them. Two tokens are equal if they are character tokens (implicit or explicit) with same character value and category code, or two TeX primitives with the same meaning, or two user-defined commands with the same value (same arguments, same body, same \long and \outer flags).
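These rules can be checked with a few macros; the names \mA to \mE are hypothetical and chosen only for this sketch:

```latex
\def\mA{x} \def\mB{x} \long\def\mC{x} \def\mD{\mA} \let\mE=\mA
\message{\ifx\mA\mB same\else different\fi}% same: equal bodies and flags
\message{\ifx\mA\mC same\else different\fi}% different: \mC is \long
\message{\ifx\mA\mD same\else different\fi}% different: \ifx does not expand
\message{\ifx\mA\mE same\else different\fi}% same: \mE has the value of \mA
```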

### 2.11.2. Examples of conditional commands

Using \ifx we can code our \Color command properly, like that

\def\Color#1#2{%
\def\crouge{rouge}\def\cvert{vert}\def\cc{#1}%
\ifx\cc\crouge\enrouge{#2}\else\ifx\cc\cvert\envert{#2}\else#2\fi\fi}


It is possible to avoid these assignments in the \Color macro, provided that they are hidden elsewhere. For instance

\def\ifstringeq#1#2#3#4{%
\def\tempa{#1}\def\tempb{#2}%
\ifx\tempa\tempb#3\else#4\fi}

\def\Color#1#2{%
\ifstringeq{#1}{rouge}{\enrouge{#2}}
{\ifstringeq{#1}{vert}{\envert{#2}}{#2}}}


Note that the ifthen package provides the \equal command as helper for such a situation: you could say \ifthenelse{\equal{A}{B}}{X}{Y} instead of \ifstringeq {A}{B}{X}{Y}. Caveat: the \equal command fully expands its two arguments, our version expands nothing.
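The caveat can be made concrete with a macro as first argument. This sketch assumes the \ifstringeq command defined above and the ifthen package; \name is a hypothetical macro:

```latex
% requires \usepackage{ifthen} and \ifstringeq as defined above
\def\name{rouge}
\ifstringeq{\name}{rouge}{yes}{no}% no: the token \name is not expanded
\ifthenelse{\equal{\name}{rouge}}{yes}{no}% yes: \equal expands \name
```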

In any computer language, you would define a command that compares two strings and returns true or false; this is not possible in TeX because commands return no value. All you can do is modify some variable (a command, a register, a token list, etc). This assignment can be done by the caller or the callee. Here is a solution where the token \next is set by the caller:

\def\Color#1{%
\ifstringeq{#1}{rouge}{\let\next\enrouge}
{\ifstringeq{#1}{vert}{\let\next\envert}{\let\next\relax}}%
\next}


Note that, if \envert accepts an optional argument, for instance if \envert[clair]{text} typesets the text using light green, you can say \Color{vert}[clair]{text}. We consider now a case where the assignment is done by the callee (via \equaltrue or \equalfalse; there is a variant that uses \setboolean).

\newif\ifequal
\def\streq#1#2{%
\def\tempa{#1}\def\tempb{#2}%
%%variant: \setboolean{equal}{\ifx\tempa\tempb true\else false\fi}
\ifx\tempa\tempb\equaltrue\else\equalfalse\fi}

\def\Color#1{%
\streq{#1}{rouge}%
\ifequal\let\next\enrouge\else
\streq{#1}{vert}%
\ifequal\let\next\envert\else \let\next\relax\fi\fi
\next}


A subtlety of TeX is that tokens are read only when needed. Said otherwise, if you say '\if AB C\else D\fi', TeX will evaluate the test; it will remember that a new conditional has started. If the test is false, it will skip at high speed until the \else, and resume normal evaluation; but if the test is true, it will resume normal evaluation right now. It is only when TeX sees an \else token (and this can be another one) that it will read all tokens at high speed until the \fi. And, when TeX sees the \fi, it will pop the conditional stack. Consider the following example:

\def\ifstringeq#1#2#3#4{%
\def\tempa{#1}\def\tempb{#2}%
\ifx\tempa\tempb\aux{#3}\else\aux{#4}\fi}
\def\aux#1#2\fi{\fi#1}
\def\color#1{%
\ifstringeq{#1}{rouge}{\enrouge}{\ifstringeq{#1}{vert}{\envert}{\relax}}}


Assume that the test is true. Then \aux reads all tokens up to '\fi', provides a \fi to finish the conditional now, then expands to its first argument (which is argument 3 of \ifstringeq). In the case where the test is false, the same thing happens, with argument 4. This is nicer than the solution that consists in defining \next conditionally and evaluating it after the \fi: it avoids an assignment.
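A commonly used alternative, shown here as a sketch and not as the method of the text, closes the conditional with \expandafter and lets the LaTeX kernel commands \@firstoftwo and \@secondoftwo pick the branch. The name \ifstreq is hypothetical.

```latex
\makeatletter
\def\ifstreq#1#2{% the two branches are read later, as arguments
  \def\tempa{#1}\def\tempb{#2}%
  \ifx\tempa\tempb
    \expandafter\@firstoftwo  % \expandafter expands \else: skip to \fi
  \else
    \expandafter\@secondoftwo % \expandafter expands \fi
  \fi}
\makeatother
```

A call like \ifstreq{A}{B}{X}{Y} then leaves \@firstoftwo{X}{Y} or \@secondoftwo{X}{Y} in the input stream, with the conditional already finished.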

### 2.11.3. Testing the next token

Let's consider now a variant of the color problem. We want to write a command with three arguments A, B and C; it is assumed to read a token, compare it with A, and expand to B or C. We need an auxiliary command that reads the token. Thus the solution:

\def\ifnextchar#1#2#3{%
\let\tempa=#1\def\tempb{#2}\def\tempc{#3}%
\ifaux
}
\def\ifaux#1{%
\let\lettoken=#1%
\ifx\lettoken\tempa\let\tempd\tempb\else\let\tempd\tempc\fi
\tempd
}


Note that we have put an equals sign after '\let\tempa' and '\let\lettoken' for the case where the token to match is an equals sign. If you want to catch spaces, a bit more complicated machinery must be used. There is a problem with this command, because, if the argument of \ifaux is not a single token, say 'foobar', then only 'f' will be put in \lettoken and 'oobar' will be typeset. On the other hand, if the argument is empty, then '\ifx' will be put in \lettoken; after that \lettoken will be expanded. Since this is \ifx, the following tokens will be compared (said otherwise '\tempa' and '\let'); this is not exactly what is required. In order to solve this problem, we first modify slightly our code:

\def\ifnextchar#1#2#3{%
\let\tempa=#1\def\tempb{#2}\def\tempc{#3}%
\ifaux
}
\def\ifaux#1{\let\lettoken=#1\ifnch}
\def\ifnch{%
\ifx\lettoken\tempa\let\tempd\tempb\else\let\tempd\tempc\fi
\tempd
}


The \ifnch command given above looks like the LaTeX version of the beast. In fact, spaces are ignored in LaTeX, so that there is an additional test. Moreover, some variables have a different name, nevertheless, here is the code:

\def\@ifnch{%
\ifx\@let@token\@sptoken
\let\reserved@c\@xifnch
\else
\ifx\@let@token\reserved@d
\let\reserved@c\reserved@a
\else
\let\reserved@c\reserved@b
\fi
\fi
\reserved@c}


The problem is the \ifaux command. The question is: can we rewrite it in such a way as to read a single token before calling \ifnch? Recall that we want to distinguish between '{x}' and 'x'. A very interesting question is the following: if we read the opening brace, how can we put it back in the input stream? We cannot do so by just expanding a macro (because the body is always well balanced). You could try something like {\ifnum0=`}\fi (that leaves an unmatched brace after expansion), or something like '{\iffalse}\fi'. Our solution is much simpler. There is a TeX primitive that gets the token without reading it. To be precise, \futurelet reads a token A, that has to be a command name or an active character, then a second token B, then a third token C. The value of C is put in A, using the equivalent of \let, then C and B are pushed back in the input stream (in this order, so that the token to be read first is B). The code of \ifnextchar is hence the following:

\def\ifnextchar#1#2#3{%
\let\tempa=#1\def\tempb{#2}\def\tempc{#3}%
\futurelet\lettoken\ifnch}


What '\futurelet\lettoken\ifnch' does is read a token. This could be a space character, an equals sign, an opening brace, a closing brace, whatever. It puts it back in the input stream. It also puts it in \lettoken. After that, it evaluates \ifnch (which is a command that should take no argument, of course; it should consult \lettoken and depending on the value, call a command that, maybe, reads the token). There are some variants. For instance amsmath has a version that omits the comparison with \@sptoken. The xkeyval package provides a version where the category codes of the character to test and the actual token may be different.
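The behaviour of \futurelet can be observed with a small test; the names \peek, \nexttok and \showit are hypothetical:

```latex
\def\peek{\futurelet\nexttok\showit}
\def\showit{\show\nexttok}
\peek{abc}% TeX shows: \nexttok=begin-group character {.
```

The opening brace is still in the input stream when \showit is evaluated, so that {abc} is processed as usual after the \show.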

We consider in this paragraph the following problem: is it possible to define a command \sptoken that behaves like a space character inside \ifx? One problem with the current version of Tralics is that, as has been mentioned earlier, a newline character in the source file produces a new line character in the XML file; it thus has a different representation from a normal space. Thus, there are two different space tokens N and S (they have the same category code, but a different value, 13 or 32). If a macro requires an argument delimited by a space, both these characters can be used. When comparing token lists, these tokens are considered equal. However, when using \ifx, these two tokens compare unequal. Our purpose is to create \sptoken that compares equal to S; it is trivial to create the N token, and compare them.

We give here three solutions. The first one uses \futurelet. If the arguments are A, B and C, where A is the command to define, and C the space, then B has to be a command (if it is a character, it will be typeset); this cannot be \foo, since spaces after \foo disappear, it has to be something like '\;'. This command must read the space, otherwise it appears in the output. We provide two solutions: a command that is delimited by a space, and a command that takes an argument (remember that spaces disappear before undelimited arguments):

\def\; {}\futurelet\SPtoken\; % comment required
\def\;#1{}\futurelet\SPtoken\; 0


In both cases, the command \; cannot be used for typesetting (in the LaTeX kernel, it is used for computing the \SPtoken, and correctly redefined after that). We give here an example, where the redefinition is temporary, inside the box. We can discard the content of the box.

\setbox0\hbox{\def\;{}\global\futurelet\SPtoken\; }


We now give a solution using \let. Remember the syntax: after \let and \sptoken (the token to be assigned) come <equals>, <one optional space> and <token>, where the last token is our space token. Since <equals> reads an arbitrary number of spaces and an optional equals sign, an equals sign is required. Our optional space cannot be optional. So we must produce a double space. This is not completely trivial. We give here two solutions (the comment is necessary):

\def\makesptoken#1{\let\sptoken= #1}\makesptoken{ }
\def\:{\let\Sptoken= } \:  % this makes \Sptoken a space token
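
A quick way to check the result is to ask TeX for the meaning of the new token. This is a small sketch, assuming the first of the two definitions above; the name \othersp is hypothetical and the brackets only make the output visible.

```latex
\def\makesptoken#1{\let\sptoken= #1}\makesptoken{ }
% The meaning of an implicit space starts with the words "blank space".
\message{[\meaning\sptoken]}
% A second token \let to \sptoken compares equal to it under \ifx.
\let\othersp=\sptoken
\ifx\sptoken\othersp \message{same}\else \message{different}\fi
```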


And now, how can we define \@xifnch? This command is assumed to read a space, discard it, and check again for the next character. Thus the question is to design a macro that reads a space. This cannot be done via \def\@xifnch#1..., since spaces are ignored before undelimited arguments; we cannot use the technique of the command '\;' above, because we cannot read what follows the space; the solution consists in a command that takes no argument, and that starts with a compulsory token, like \def\foo\bar{etc}. The non-trivial point is that we want \bar to be replaced by a space token, but spaces disappear after \foo. We give here two solutions.

\expandafter\def\expandafter\foo\space{etc}
\def\:{\Foo}\expandafter\def\: {etc}
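
Applied to the problem at hand, the second technique gives a space-skipping macro in one line. This is a sketch using the names of this section (\lettoken, \ifnch); the name \skipspace is hypothetical and plays the role of \@xifnch. Note that it overwrites '\:', which must be redefined afterwards if it is needed.

```latex
% \: expands to \skipspace; since \: is a control symbol, the space
% that follows it is kept and becomes the parameter text of \skipspace.
\def\:{\skipspace}\expandafter\def\: {\futurelet\lettoken\ifnch}
% \skipspace now removes exactly one space token and peeks again.
```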


### 2.11.5. Variants of the Map problem

Let's consider the following variant of the \Map command. If we have \do{A}\do{B}\do{C}, we want to separate arguments with a comma, and put a period after the last argument; we might as well do something with the argument, say, typeset it in italics. This is not always possible. In one of the style sheets used by the Raweb, a Perl postprocessor is used for replacing some commas by a period. We assume here that we know where the list ends. For instance, we assume that we can put a '\endl' token at the end of the list. Then we can write something like

\def\foo#1#2\endl{\textit{#1}\ifx#2\endl\endl.\else, \foo#2\endl\fi}


Then '\foo{A}{B and C}{D}\endl' produces 'A, B and C, D.' as expected. Let's analyze the code and try to see why it is wrong. We assume that you never say \foo\endl, because the list is assumed non-empty. We also assume that the list does not contain the \endl token (in LaTeX, you should use the special marker '\@nil' only as list delimiter). In our case, the first argument is 'A', the second is '{B and C}{D}'. In the case where the second argument is empty, the test is true, because \endl is compared against itself. In our case, the test is false because the brace is compared with the letter B. If we put the second argument in a pair of braces, we get an error: Too many }'s, because the test is true, and a part of '#2\endl\endl' has been evaluated. This means that our test is wrong. The only safe way to check whether #2 is empty is to put it in a command, and check whether this is the same as \empty. We shall give a second version of the code where the test is replaced by \ifx\endl#2\endl. In the case where #2 is empty, the test evaluates to true, and if #2 evaluates to some token list that does not start with \endl, the test will be false; this is better.

Note that, when \foo is called again, it compares 'D' with '\endl'. Does this surprise you? In fact, if you say '\foo{A}{XY}{UV}\endl', you get 'A, XY, U, V.'. The trouble is the following: when TeX reads the arguments of a command, a pair of braces disappears, when possible. Thus arguments are 'A' (without braces) and '{XY}{UV}' (it is not possible to remove the braces). When \foo is called again, arguments are 'XY' and 'UV', without braces. This explains why the test compares U and V (by the way, if 'UV' is replaced by 'UUVV', the test will be true, yielding an Undefined control sequence error). When \foo is called again, arguments are now 'U' and 'V', an unwanted result. There is a simple way to avoid disappearance of braces: it suffices to put a token before each item, for instance like this

\def\foo\do#1#2\endl{\textit{#1}\ifx\endl#2\endl.\else, \foo#2\endl\fi}
\foo\do{A}\do{B}\do{C}\endl


The right way to test that the argument is empty is to use \@iftempty, which has a different syntax:

\def\foo\do#1#2\endl{\textit{#1}\@iftempty{#2}{.}{, \foo#2\endl}}
\foo\do{A}\do{B}\do{C}\endl
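
Where \@iftempty is not available, the test suggested earlier (store the argument in a command and compare it with \empty) can be written by hand. This is a minimal sketch; \rest is a hypothetical scratch macro.

```latex
\def\foo\do#1#2\endl{\textit{#1}%
  \def\rest{#2}%                  tail of the list, possibly empty
  \ifx\rest\empty .%
  \else, \expandafter\foo\rest\endl
  \fi}
\foo\do{A}\do{B}\do{C}\endl
```

The recursive call is made before its \fi is reached, so the conditionals are closed only when the list ends; this is harmless for short lists.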


A more elegant solution: notice that #2 starts with \do, unless it is empty. There is no need to read the argument for seeing this, we can use the \ifnextchar command. With the solution proposed here, the token that marks the end of the list is evaluated: we use \relax, because this is harmless.

\def\foo{\def\do##1{\textit{##1}\@ifnextchar{\do}{, }{.}}}
\foo\do{A}\do{B}\do{C}\relax


Note that we can replace \relax by something more useful, for instance a period:

\def\foo{\def\do##1{\textit{##1}\@ifnextchar{\do}{, }{}}}
\foo\do{A}\do{B}\do{C}.


An alternate solution could use '\ifprevchar' instead of '\ifnextchar'. There is no such command in LaTeX, but the idea is the following: instead of putting a comma after each argument but the last, we can put a comma before each argument but the first. All we need to do is to know if this argument is the first. In one application, we have coded this as: apply \do-first on the first argument, and map \do-other on the rest of the list. If side effects are allowed, we can use a piece of code like this (note how the final period is typeset):

\newif\iffirst
\def\do#1{\iffirst\firstfalse\else , \fi\textit{#1}}
\firsttrue
\do{A}\do{B}\do{C}.


In fact, there is no need to use an auxiliary command, it suffices to modify \do itself:

\def\foo{\def\do##1{\textit{##1}\def\do####1{, \textit{####1}}}}
\foo\do{A}\do{B}\do{C}.


If you think that there are too many sharp signs, you can try

\newcommand\normaldo[1]{, \textit{#1}}
\newcommand\firstdo[1]{\textit{#1}\let\do\normaldo}
\newcommand\foo{\let\do\firstdo}
\foo\do{A}\do{B}\do{C}.


There are other possibilities involving conditional commands. We shall see later how to define a comment environment that ignores its content. It is as if you said

\newenvironment{comment}{\iffalse}{\fi}


One can make the following strange construct {\ifnum0=`}\fi. In this case, we compare two numbers, zero and the internal code of the brace (which is in general non-zero). The result of the test is false, but who cares? The body of the conditional as well as the else part is empty. Hence, the result is like \bgroup. There are some differences because TeX has two brace counters, the balance counter and the master counter, whereas there is only one counter in Tralics. For details, see the TeXbook and its appendix D, where it is said “If you understand [...] you'll agree that the present appendix deserves its name.” (the name of the appendix is 'Dirty Tricks').

A piece of code like this causes trouble to Tralics:

\def\foo#1{%
\sbox\tempboxa{#1}%
\ifdim \wd\tempboxa >\hsize
#1\par
\else \hbox to \hsize{\hfil\box\tempboxa\hfil}%
\fi}


It is a simplification of the \@makecaption command of the article class. The idea is to center the caption of an image if it fits on a line (centering is achieved via \hfil). The argument is typeset in a temporary box, and the width of the box is compared against \hsize. Captions in the Raweb are always centered, but this is not aesthetic.

### 2.11.6. More examples

Consider again the following example

\def\ifnch{%
\ifx\lettoken\tempa\let\tempd\tempb\else\let\tempd\tempc\fi
\tempd
}


It would be much simpler to write:

\def\ifnch{%
\ifx\lettoken\tempa\tempb\else\tempc\fi
}


The problem here is that the commands \tempb and \tempc may take an argument, that would be \else or \fi. The remedy is

\def\ifnch{%
\ifx\lettoken\tempa\expandafter\tempb\else\expandafter\tempc\fi
}


In general, you need an \expandafter before each token between \else and \fi. The command \@afterfi can be used to simplify such definitions. Its effect is simple: it reads all tokens up to the \fi token, evaluates \fi, then the other tokens. Such a command is provided by the following packages: typehtml, grabhedr, gmutils, gmverb, morehelp, splitbib, babel, and maybe others. Example:

\def\test#1{
\ifnum\count0=#1
somecode
\else\@afterfi\fct v\fi}


If the test is true, then somecode is evaluated, then everything between \else and \fi is discarded. But if the test is false, the else part is interpreted as if it were \fi\fct v. The command \@afterelsefi is to be used in the true part (all tokens between \else and \fi are discarded). In the example that follows, \fct is called with two arguments, the first one is u or v, the second is 2.

\def\test#1{%
\ifnum\count0=#1 %
\@afterelsefi \fct u
\else\@afterfi\fct v\fi}
\def\fct#1#2{} \test32
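
A common way to obtain this behaviour is with delimited arguments. The following definitions are a sketch; the packages listed above may define these commands differently, and in Tralics they are built in.

```latex
\makeatletter
% Grab everything up to \fi, close the conditional, then release the tokens.
\long\def\@afterfi#1\fi{\fi#1}
% Grab the true branch, skip the false branch, close, then release.
\long\def\@afterelsefi#1\else#2\fi{\fi#1}
\makeatother
```

Because the arguments are delimited by the first \else or \fi that is seen, the grabbed tokens must not themselves contain an unbraced conditional.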


The piece of code that follows computes the factorial of a number, using only expandable commands (it requires \numexpr, an extension provided by ϵ-TeX).

\def\JGfactorial#1{%
\ifnum\numexpr#1>1
\number \numexpr#1*\JGfactorial{(#1-1)}\relax
\else 1\fi}

\def\factorial#1{%
\ifnum\numexpr#1>1
\number \numexpr#1*\factorial{(#1-1)}\expandafter\relax
\else
\expandafter1\fi}

\def\Factorial#1{%
\number\ifnum\numexpr#1>1
\numexpr#1*\Factorial{(#1-1)}\expandafter\relax
\else
1\expandafter\space
\fi
}

\def\UDfactorial#1{%
\number\ifnum\numexpr#1>1
\numexpr#1*\UDfactorial{(#1-1)}\expandafter\relax
\else
\numexpr\ifnum\numexpr#1<0 0\else1\fi\expandafter\relax
\fi
}%


Ulrich Diez wrote versions 3 and 4. Version 3 uses a space character instead of \space using one of the techniques shown above; he then produced version 4, which gives a different value for the factorial of a negative number, and in which the space after the digit 1 is not needed anymore. In fact, if the argument is zero or one (the case where the first \ifnum is false), versions 1 and 2 return the character 1, while versions 3 and 4 return the digits of the number 1, computed by \number. In case 3, an optional space is read after the integer constant; in case 4, the \relax token is an end marker for \numexpr, and no optional space is needed after it (I guess that the purpose of this \numexpr is to avoid any problems if \space is redefined; the first \numexpr is needed for the product, and the two other calls are needed if the command calls itself). The difference between versions 2 and 3 is the placement of \number. I put it just before \numexpr, because \numexpr can be used only in a context where a number is seen. Ulrich puts it before the \ifnum. Does this make any difference? If you want to compute the factorial of a number, no. What about the following code:

\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\factorialresult
\expandafter\expandafter\expandafter{\JGfactorial{12}}


The effect is the following. The command \JGfactorial is expanded twice, and the result is put in a command; evaluating this command yields the desired result. The same can be applied to \UDfactorial. In any case, the first expansion gives the body of the macro. The second expansion expands the \ifnum and \number respectively. In one case you get lines two and three of \JGfactorial. This is something like

\def\factorialresult{...\else...\fi}


If you do not use this command, TeX will signal an unterminated \if. If you call it twice, you will get an extra \else error. On the other hand, if you consider \UDfactorial, the one-level expansion of \number implies expansion of the \ifnum, then the \numexpr of the body; expansion of the command means considering all tokens up to the final \relax, and since this \relax is preceded by \expandafter, everything up to the final \fi is taken into account. Thus, the one-level expansion of the body is a number, the desired result.
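
The same construction applied to \UDfactorial can be tried as follows. This is a sketch that assumes ϵ-TeX and the definition of \UDfactorial given above; the name \UDresult is hypothetical.

```latex
% Expand \UDfactorial{5} twice, then store the result in \UDresult.
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\UDresult
\expandafter\expandafter\expandafter{\UDfactorial{5}}
% According to the analysis above, the stored body is already the number
% (120 for the argument 5), with no \else or \fi left over.
\message{\meaning\UDresult}
```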

### 2.11.7. Producing N asterisks in a row

In appendix D of the TeXbook, there are some examples of how to produce N asterisks in a row. The question is: can we produce this using pure expansion? This is a solution given by D. Kastrup:

\def\nlines#1{\expandafter\nlineii\romannumeral\number\number #1 000\relax}
\def\nlineii#1{\if#1m\expandafter\theline\expandafter\nlineii\fi}
\def\theline{A}
\nlines{5}


This produces 'AAAAA'. The idea is the following: '\romannumeral3000' expands to 'mmm'. It is then rather easy to convert this sequence of m into a sequence of A. The argument of the command can be '\count0'; the '\number' has the effect of converting the value of this counter into a number, and it gobbles a space. The argument of the command can be '\count1␣'; the second '\number' will gobble the second space (I don't know if there is some other reason for these two \number commands). Here is the same idea, without tests:

\def\recur#1{\csname rn#1\recur}
\def\rn#1{}
\def\rnm#1{\endcsname{#1}#1}
\def\replicate#1{\csname rn\expandafter\recur
\romannumeral\number\number#1 000\endcsname\endcsname}

\dimen0=4sp \replicate{\dimen0}{P}


You may wonder how this works. Here is the transcript file of Tralics.

[216] \replicate{\dimen0}{P}
\replicate #1->\csname rn\expandafter \recur \romannumeral
   \number \number #1 000\endcsname \endcsname
#1<-\dimen 0
{\csname}
{\expandafter \recur \romannumeral}
+scanint for \dimen->0
+scanint for \number->4
+scanint for \number->4000
+scanint for \romannumeral->4000
\recur #1->\csname rn#1\recur
#1<-m
{\csname}
\recur #1->\csname rn#1\recur
#1<-m
{\csname}
\recur #1->\csname rn#1\recur
#1<-m
{\csname}
\recur #1->\csname rn#1\recur
#1<-m
{\csname}
\recur #1->\csname rn#1\recur
#1<-\endcsname
{\csname}
{\csname->\rn}
\rn #1->
#1<-\recur
{\csname->\rnm}
\rnm #1->\endcsname {#1}#1
#1<-P
{\csname->\rnm}
\rnm #1->\endcsname {#1}#1
#1<-P
{\csname->\rnm}
\rnm #1->\endcsname {#1}#1
#1<-P
{\csname->\rnm}
\rnm #1->\endcsname {#1}#1
#1<-P
{\csname->\rn}
\rn #1->
#1<-P
Character sequence: PPPP .


This is now something else, it is part of a command defined in the RR style file:

\bgroup
\edef\foo{\ifnum 0<0#1x\else y\fi}\def\xbar{x}%
\ifx\foo\xbar
\global\compteurtheme=#1
\else \global\compteurtheme=0 \@latex@error{Pas un thème #1}\@eha\fi
\egroup


Assume that #1 contains a positive number, for instance 25. In this case, the test will be true, \foo will be defined as 'x', and will be equal to \xbar. In this case, our command puts 25 in \compteurtheme. Some other tests (not shown here) are done; for instance, the value should be a number between 1 and 4, or a number with two digits, each one being between 1 and 4. Assume that the argument is not a number, say it is 'gee'; then \ifnum will compare 0 and 0, the test will be false, \foo will be defined as 'y' hence is not equal to \xbar. Assume that the argument is '3a'; this is not a theme, but a theme and a subtheme. In this case, the test is true (\ifnum reads the number 03 and stops at the letter), but \foo expands to 'ax', and this is not equal to \xbar. Nowadays, themes are 'com', 'cog', etc, and this piece of code has become useless. It is replaced by something different, see end of section 6.9.
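
The test can be isolated and tried on the cases discussed above. This is a sketch only: \checknum is a hypothetical wrapper, and the messages stand in for the assignment to \compteurtheme and the error call.

```latex
\def\checknum#1{%
  \begingroup
  % \foo is exactly `x' only if #1 is a string of digits with positive value.
  \edef\foo{\ifnum 0<0#1x\else y\fi}\def\xbar{x}%
  \ifx\foo\xbar \message{#1: accepted}%
  \else \message{#1: rejected}\fi
  \endgroup}
\checknum{25}  % accepted
\checknum{gee} % rejected: 0<0 is false, \foo is y
\checknum{3a}  % rejected: the test is true, but \foo is ax
\checknum{0}   % rejected: 0<00 is false
```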

## 2.12. A nontrivial command \verb

The code that follows is a simplified version of a LaTeX command:

1 \def\verb{%
2   \bgroup
3     \let\do\@makeother \dospecials
4     \verbatim@font\@noligs
5     \@vobeyspaces \frenchspacing\@sverb}
6
7 \def\verb@egroup{\global\let\VBG\@empty\egroup}
8 \let\VBG\@empty
9
10 \def\@sverb#1{%
11   \catcode`#1\active
12   \lccode`\~`#1%
13   \gdef\VBG{\verb@egroup\error{...}}%
14   \aftergroup\VBG
15   \lowercase{\let~\verb@egroup}}


Note first that this code contains two empty lines, each of which is read by TeX as a \par token (it is ignored, provided that the definition is read in vertical mode). Lines 5, 7, and 15 are terminated by a brace and the end of line character produces a space token, that is ignored for the same reasons. Lines 1, 10, 12, and 13 are terminated by a % character, since otherwise, the end of line would produce a space character (ignored in case the command is executed in vertical mode, and that is not always the case). In the case of lines 2, 3, 4, etc., the end of line is converted into a space character that disappears because it follows a command name.

This code defines a command \verb that starts a group via \bgroup. At line 3, \dospecials is executed, after redefining \do. This changes the category code of all special characters (including all characters made active by packages like babel). Line 4 changes the current font to a typewriter one, and it executes a piece of code that inhibits ligatures (for instance the one that converts a double dash into an en-dash). Note that this document contains a great number of verbatim examples, either inline or as environments. In some cases, we use a smaller font; it is hence important to allow the user to parameterize commands like these. Line 5 contains three commands: the first makes the space character active (so that each space in the argument is kept), the second enters so-called french spacing mode (a mode where the width of a space is constant), and the last command \@sverb will be explained later. The 's' in the name of this command comes from the 'starred' version of \verb: if you say '\verb*+ +', you will get '␣'. We have omitted the test with the star character.

On lines 7 and 8, we define a command \VBG that does nothing (i.e. expands to the empty list) and a command that evaluates to \egroup preceded by a global assignment of \VBG to nothing. On line 13, \VBG is defined as calling \verb@egroup plus some error, whose text is not shown here. Thus \VBG is a command that 1) resets \VBG to a harmless command, 2) closes the current group, 3) signals an error.

Let's consider lines 11 and 12. We assume that the argument of \@sverb is some character c. (If you say \def\foo{\verb\foo=\foo} then \foo, you will get an error Improper alphabetic constant, and after that, you're really in trouble. In the usual case, the character that follows \verb is read with category code 11 or 12, because of the code on line 3.) Line 11 makes the character c active (of category 13); the category code will recover its old value at the end of the group, and line 12 changes the lc-code of the tilde character (the lc-code will recover its value at the end of the group). The lc-code of a character will be used for hyphenation, as well as conversion from upper case to lower case. We assume here, for the sake of simplicity, that hyphenation is inhibited by the use of a verbatim font. Note that Tralics does not care about subtleties like hyphenation. For this reason, when you say \verb+foo+, it will execute \verbprefix {\verbatimfont foo}. You can redefine both commands (the prefix is empty, the font defaults to \tt). Notice that Tralics grabs the argument, contrarily to LaTeX.

Line 14 contains the special command \aftergroup. This reads a token, saves it on a stack, and re-inserts it at the end of the current group.
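
The order of events can be seen on a two-line example. This is a sketch independent of the \verb code; \after is a hypothetical command.

```latex
\def\after{\message{after the group}}
% \aftergroup stores the token \after; it is re-inserted when } is seen.
{\aftergroup\after \message{inside the group}}
% The log shows "inside the group" first, then "after the group".
```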

Let's come back to the LaTeX implementation of \verb. So far, we have read a character, changed its category code, changed the lc-code of the tilde character, changed the font and other tables, redefined \VBG, aftergrouped it (code on line 14: the token is popped at the end of the current group, that was opened on line 2, and normally closed on line 7). Line 15 is a kludge: what \lowercase does is replace in its argument every character by its lower case equivalent (using the lc-code table). The result is evaluated again. Here the argument is formed of three tokens: \let, the tilde and \verb@egroup. Since ~ is a character that has a lower-case equivalent, it will be replaced by that, namely the character c. Note: category codes are left unchanged by this procedure. It is hence important that ~ be an active character (because \let modifies the value of ~) and that c be active (otherwise, there is no meaning in changing the value of c).
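
The same kludge is often used on its own to define an active character without changing category codes at definition time. The following is a sketch of that idiom, with the plus sign and the replacement text '[plus]' chosen arbitrarily; it is not taken from the LaTeX source.

```latex
\begingroup
  \lccode`\~=`\+            % the lower-case equivalent of ~ is now +
  \lowercase{\endgroup\def~}{[plus]}% defines the active character +
% The definition is only used where + has category code 13:
{\catcode`\+=\active \message{+}}
```

The \endgroup inside the argument of \lowercase restores the lc-code of the tilde before the definition is made, so the definition itself survives the group.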

Consider the case of \verb+\toto+. Here the character c is the plus sign. After line 15 has been executed, the situation is the following: all characters are of category other, ligatures are disabled, french spacing is active, current font is typewriter, a group is opened, and a token is waiting for the group to terminate. In such a situation, you cannot go outside LaTeX properly. In fact, the carriage return has been made active in order to help error recovery (this is not shown here), and the +´ sign has been made active: this will help us. TeX sees now the following tokens \12 t11 o11 t11 o11 +13. The first five tokens are added to the current horizontal list as characters in the current font, while the last one is expanded. The expansion is that of \verb@egroup, see line 7. This defines globally \VBG, then closes the group, restoring everything. It does not restore \VBG (because the last assignment was global). After the group, the after-grouped token \VBG is evaluated but it does nothing.

So far, so good: the translation of '\verb+\foo+' is the same as '\texttt{\char`\\foo}'. Note that the author could have entered the previous expression as '\verb-\verb+\foo+-', or using the fancyvrb package as '|\verb+\toto+|', but he used \quoted{\BS verb+\BS foo+}, because, in the HTML file produced by Tralics, different colors are used for verbatim material; this is explained in the second part of this document.

Consider now the following example:

\def\duplicate#1{#1#1} \duplicate{\verb+x+}++'


You would expect 'xx++' but you get x+x+ in LaTeX, and an error in Tralics. Explanations: the expansion of \duplicate is verb +12 x11 +12 verb +12 x11 +12 +? +?. The last two plus signs have not been read, and their category code is still unassigned. The first \verb command reads the +12 via \@sverb and changes the category code of the plus sign. The second \verb does the same. It then reads the first +? as a +13, which finishes evaluation of the second \verb; the other +? finishes the first \verb in the same way. In the case where you replace ++ by --, the \verb command will see an end of line character before a plus character and complain with LaTeX Error: \verb ended by end of line.

Consider now the following example:

\def\braceme#1{{#1}} \braceme{\verb+x+}++'


You get the following error: LaTeX Error: \verb illegal in command argument. Let's try to see how this is done. The expansion of \braceme produces the following tokens: {1 verb +12 x11 +12 }2. After \@sverb has finished, the first token that is not simply typeset is }2, and this closes the current group. Hence, as above, this restores category codes, fonts, lc-codes, etc. It does not restore \VBG because the assignment is global (\gdef at line 13 is like \global\def). The trick is now that a \VBG token is popped from the aftergroup stack. This one calls \verb@egroup and signals an error. What \verb@egroup does is close a group (the one opened by \braceme), and reset \VBG to something harmless. Note that TeX is in a clean mode when the error is signaled. Tralics has no such error handling mechanism (however, no category codes are changed when scanning for the end of the command, so nothing harmful can be done). What this example shows is that error recovery is not completely trivial; nevertheless nice things can be done.

Note the following special cases:

\verb test
\verb+test+
\verb^^abtest^^ab


In the first case, the delimiter is a space character; the first line is terminated by a space and you would expect it to be interpreted in the same way as the second line. The trouble is that TeX removes all space characters at the end of the line (regardless of category codes). The last line also has a problem: the delimiter is character 171 (double hat mechanism), and once \verb has changed category codes, the double hat sequence is not seen any more as such, and an error is signaled.

There is a variant to \verb, it is the 'verbatim' environment. The classical exercise is: write a command that reads everything up to \end{verbatim} (backslash and braces are of category 12 in this token list). There are different packages that solve this problem; fancyvrb is one of them. A solution is also given in the first chapter. It does not allow an optional space after '\end'.

We give here the LaTeX implementation of the \end command.

\def\end#1{%
\csname end#1\endcsname\@checkend{#1}%
\expandafter\endgroup\if@endpe\@doendpe\fi
\if@ignore\@ignorefalse\ignorespaces\fi}


As you can see, if you say \end{foo}, then \endfoo is executed first. After that the current environment in \@currenvir is compared with the argument, in case of error the variable \on@line contains the start line of the environment. After that, the group is terminated, and we have two tests. The first uses \expandafter, this means that the command \@doendpe is executed outside the environment in the case where the variable \if@endpe is true inside the environment. This command is very complicated (it redefines \par and modifies \everypar), and not implemented in Tralics; the effect is to suppress the indentation of the following paragraph. On the other hand, the two commands \@ignoretrue and \@ignorefalse redefine \if@ignore globally, so that no \expandafter is needed for this one.

This is an example of \aftergroup.

\def\lrbox#1{%
\edef\reserved@a{%
\endgroup
\setbox#1\hbox{\begingroup\aftergroup}%
\def\noexpand\@currenvir{\@currenvir}%
\def\noexpand\@currenvline{\on@line}%
}%
\reserved@a
\@endpefalse \color@setgroup \ignorespaces}
\def\endlrbox{\unskip\color@endgroup}


The effect of the \edef command is to replace the previous definition by the following (where '17' is to be replaced by the current line number). One important point here is that implementing colors in LaTeX is non-trivial, and for this reason, there are two hooks (the commands with 'color' in their name, that do nothing if the package is not loaded). Colors are not implemented in Tralics.

\def\lrbox#1{%
\endgroup
\setbox#1\hbox{\begingroup\aftergroup}%
\def\@currenvir{lrbox}%
\def\@currenvline{ on input line 17}%
\@endpefalse \color@setgroup \ignorespaces}


The order of evaluation is the following. Assume that the current environment is X. The \begin command opens a group via \begingroup and changes the environment name to 'lrbox'. The command starts with \endgroup, closing this group. After that, we put something in the box whose number is the argument of the environment; the content is a hbox, whose start is defined by the brace (and this brace is a group); we start a group with \begingroup, and call \aftergroup. This pushes a brace on the stack; this brace indicates the end of the hbox, but it will be evaluated later. After that, we change again the name of the current environment (it was restored to X by the \endgroup, but we made a copy of it in the \edef). When the end of the environment is reached, the following happens. First, the end-code is executed (this removes space at the end of the box), and \endgroup is executed. As a side-effect this restores the current environment name to X. It also pops the after group stack, namely the closing brace that terminates the \hbox. One important point here is that the \setbox assignment is done outside the environment (it could be done inside, with a \global prefix). Such a piece of code is illegal. The lrbox environment is not implemented in Tralics version 2.10.
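
In LaTeX (not in Tralics 2.10, as noted), the benefit of this construction is that the content is read as ordinary input rather than as a macro argument, so verbatim material can be stored in a box. A typical use, with \mybox as a hypothetical box name:

```latex
\newsavebox\mybox
% The content is typeset into the box; nothing appears at this point.
\begin{lrbox}{\mybox}\verb+\foo+\end{lrbox}
% The saved box can then be used wherever \verb itself is not allowed.
Saved text: \usebox\mybox
```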

## 2.13. Expandable tokens

Assume that \err is an undefined command. The following code

\ifnum1=0\err1 \err1 \fi


will signal two errors: when TeX reads the second number, it expands the undefined command (hence a first error), and continues scanning until it finds the space; the test is true, hence the second error.

We give here the list of all tokens that can be expanded.

• All user defined commands.

• All undefined commands.

• \noexpand: inhibits expansion of the next token.

• \expandafter: changes order of expansion.

• \csname. This manufactures a token. Note that \endcsname marks the end of the list, is not expandable, is an error elsewhere.

• \number, \@arabic, \romannumeral, \Romannumeral: these convert numbers.

• \string, \meaning, \fontname, \jobname, \tralicsversion: These commands convert some internal quantities into tokens; for instance \jobname is the name of the file that is translated (without an extension '.tex'), and \tralicsversion is the version number of Tralics; it could be '2.5 (pl7)', or more likely '2.9'.

• \the. You say \the\foo, if you want to typeset the value of a variable \foo.

• \input, \include, \endinput: Special macros for files.

• \if, \ifcat, \ifnum, \ifdim, \ifodd, \ifvmode, \ifhmode, \ifmmode, \ifinner, \ifvoid, \ifhbox, \ifvbox, \ifx, \ifeof, \iftrue, \iffalse, \ifcase. These start a conditional.

• \fi, \or, \else. These modify the conditional stack.

• \topmark, \firstmark, \botmark, \splitfirstmark, \splitbotmark. Marks are not yet implemented. There are ϵ-TeX extensions, with the same name but for a final s, that manage a stack of marks.

• A number of commands defined by LaTeX, are implemented as expandable commands in Tralics.

• \a, \', \`, \", \^, \~, \k, \H, \v, \b, \d, \u, \C, \f, \c, \., \=, \r, \T, \V, \D, \h: these commands produce an accented character.

• Commands of the form \textunderscore, \AA, \texteuro expand to a Unicode character.

• \@firstofone, \@firstoftwo, \@secondoftwo expand to the first or second argument.

• \@car, \@cdr: these give access to first or rest of a list terminated by \@nil.

• \@gobble, \@gobbletwo, \@gobblefour: these commands read and ignore 1, 2 or 4 arguments.

• \zap@space removes unwanted spaces.

• \strip@prefix removes the prefix inserted by \meaning before the body of a macro.

• \ensuremath inserts dollars signs outside math mode.

• \hexnumber@, \multispan, are defined as in LaTeX.

• \@afterfi, \@afterelsefi: These commands can be used in a if-then-else structure, they terminate the condition and re-insert the interesting tokens.

• \[, \], \(, \): commands for math mode.

• \UseVerb: restores a quantity saved by \SaveVerb.

• \@stpelt, \stepcounter, \refstepcounter, \addtocounter, \setcounter, \value. These are user-defined commands in LaTeX. The expansion in Tralics can depend on the loading of the calc package.

• \setlength, \addtolength. These are user-defined commands in LaTeX. The expansion in Tralics can depend on the loading of the calc package.

• \arabic, \roman, \Roman, \alph, \Alph, \fnsymbol, \@alph, \@Alph, \@fnsymbol: commands to typeset a LaTeX counter.

• \loop, \@whilenum, \@whiledim, \@whilesw: For loops.

• \xspace. This adds a space if needed.
