Tralics, a LaTeX to XML translator; Part II

5. Converting XML to XML

This chapter, and the next one, describe three types of style sheets: they convert XML to XML, to XSL/Format, and to HTML. Originally, in 2002 and 2003, the XML files created by Tralics were used directly to produce the HTML and PDF versions, but in 2004 a new DTD was designed for the Raweb. The name of this new DTD was unclear for a long time; it is now `raweb2.dtd´ and the old one is named `raweb3.dtd´ (the 3 here is for 2003). We shall explain here the style sheets that convert from the old DTD to the new one, and from this to HTML; we shall also explain the style sheets that convert from the old DTD to XSL/Format (those for the new one are similar).

The style sheets for converting into XSL/Format are adaptations by José Grimm of the TEI code (by Sebastian Rahtz). These are part of the Tralics distribution. Other files were written by J. Grimm (conversion to HTML) or Tahia Benhaj Abdellatif (conversion to XML) and maintained by Marie-Pierre Durollet and Bruno Marmol. The Tralics files have a Copyright notice that looks like this:

<!-- Copyright Inria 2003-2004 Jose Grimm. This file is an adaptation of
files from the TEI distribution. See original Copyright notice below.

The “original Copyright notice” is given here:

 Copyright 1999-2001 Sebastian Rahtz/Oxford University
 Permission is hereby granted, free of charge, to any person obtaining
 a copy of this software and any associated documentation files (the
 ``Software''), to deal in the Software without restriction, including
 without limitation the rights to use, copy, modify, merge, publish,
 distribute, sublicense, and/or sell copies of the Software, and to
 permit persons to whom the Software is furnished to do so, subject to
 the following conditions:
 The above copyright notice and this permission notice shall be included
 in all copies or substantial portions of the Software.

Let´s consider an example. This is the start of a document created by Tralics:

<?xml version='1.0' encoding='iso-8859-1'?>
<!DOCTYPE raweb SYSTEM 'raweb3.dtd'>
<!-- translated from latex by tralics 2.4-->
<raweb language='english' creator='Tralics version2.4' year='2004'>
<accueil isproject='false' html='apics'>

Two years later, the team has become a project, and the header is:

<!-- translated from latex by tralics 2.8.1-->
<raweb language='english' creator='Tralics version 2.8.1' year='2006'>
<accueil isproject='true' html='apics'>

This is the start of the translation to the new DTD:

<?xml version="1.0" encoding="iso-8859-1"?>
<!--translated from old xml 2003 by with 2XMLvalideDTD2.xsl-->
<!DOCTYPE raweb PUBLIC "-//INRIA//DTD Raweb 2" "raweb2.dtd">
<raweb xmlns:html="" xml:lang="en" year="2006">
  <identification isproject="true" id="apics">

There is a second style sheet that adds ids to all elements. The resulting file starts like this:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE raweb SYSTEM "raweb.dtd">
<raweb xmlns:html=""
       xmlns:xlink="" id="id2243496"
       xml:lang="en" year="2004">
  <identification id="apics" isproject="false">
    <shortname id="id2267539">apics</shortname>

In 2006, `SYSTEM´ was replaced by `PUBLIC´, and ids are added only when needed.

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE raweb PUBLIC "-//INRIA//DTD Raweb 2" "raweb2.dtd">
<raweb xmlns:html=""
       xml:lang="en" year="2006">
  <identification id="apics" isproject="true">

Note the following details: Tralics uses single quotes when it outputs its tree; the XSLT processor used by the Raweb uses double quotes; in the style files random values are used. In this example the output of the XSLT processor is `indented´; this is an option that depends on the style sheet. Tralics never indents.

5.1. Converting the XML to the new DTD

All files start like this, we shall not repeat this line.

1 <?xml version="1.0" encoding="iso-8859-1" ?>

The root element here is <xsl:transform>; in some other files it is <xsl:stylesheet>, which is a synonym. We declare the namespaces. The `html´ namespace is not used here.

2 <xsl:transform
3         xmlns:xsl="" version="1.0"
4         xmlns:xlink=""
5         xmlns:html=""
6         exclude-result-prefixes="xlink">

We use an auxiliary file for the bibliography; this will be defined in the next section.

7 <xsl:import href="2XMLvalideDTD2-biblio.xsl"/>

The style sheet contains this line, but it is useless since we will use <xsl:document>.

8 <xsl:output method='xml' doctype-system='raweb2.dtd' indent='yes'
9     encoding='iso-8859-1'/>

Spaces are removed in some elements, listed here. The list contains a single name, <UR>.

10 <xsl:strip-space elements="UR"/>

The root of the Raweb is the <raweb> element. Its children are <accueil> (that gives some information about the team; this is used to produce a title page), <moreinfo> (this is optional; it is a short piece of text, annexed to a section or to the whole document), <composition> (this describes the members of the team), <presentation>, <fondements>, <domaine>, <logiciels>, <resultats>, <contrats>, <international>, <diffusion> (these eight elements are called `sections´; they have the same structure and contain text that can be divided into subsections, etc.), and <biblio> (this is the bibliography).

The table of contents of the Raweb has ten entries: Members, Overall Objectives, Scientific Foundations, Application Domains, Software, New Results, Contracts and Grants with Industry, Other Grants and Activities, Dissemination, Bibliography. Each entry comes from one of these elements, except <accueil> and <moreinfo>, that play a special role. In particular, the html attribute of <accueil> is the team´s name in lower case 7-bit ASCII. This is also the name of the source file, and a prefix for output files. We store this name in $LeProjet.

11 <xsl:variable name="LeProjet" select="/raweb/accueil/@html"/>

The <raweb> element has two important attributes, language and year. We put in the variable $year this quantity with a default value of 2004.

12 <xsl:variable name="year">
13  <xsl:choose>
14   <xsl:when test='/raweb/@year'><xsl:value-of select="/raweb/@year"/></xsl:when>
15   <xsl:otherwise>2004</xsl:otherwise>
16  </xsl:choose>
17 </xsl:variable>

This is the main rule. The translation of <raweb> is the file apics-dtd2.xml (assuming that $LeProjet is `apics´), its root element is <raweb>. This element has some attributes, namely namespaces, year (from the $year variable) and xml:lang from the language attribute. The content is formed by the transformation of the eight standard sections, followed by the bibliography, preceded by the transformation of <accueil> and <composition>.

18 <xsl:template match="/raweb">
19  <xsl:document href="{$LeProjet}-dtd2.xml" method="xml"
20     doctype-public="-//INRIA//DTD Raweb 2"  doctype-system="raweb2.dtd"
21     indent='yes' encoding='iso-8859-1'>
22  <xsl:comment>translated from old xml 2003 by with 2XMLvalideDTD2.xsl</xsl:comment>
23  <raweb
24         xmlns:xlink=""
25         xmlns:html="">
26    <xsl:attribute name="lang" namespace="">
27       <xsl:choose>
28          <xsl:when test="@language='english'">en</xsl:when>
29          <xsl:when test="@language='french'">fr</xsl:when>
30          <xsl:otherwise><xsl:value-of select="@language"/></xsl:otherwise>
31       </xsl:choose>
32    </xsl:attribute>
33    <xsl:attribute name="year"><xsl:value-of select="$year"/></xsl:attribute>
34    <xsl:apply-templates select="accueil"/>
35    <xsl:call-template name="topic"/>
36    <xsl:apply-templates select="presentation |
37                         fondements | domaine | logiciels | resultats |
38                         contrats | international | diffusion"/>
39    <xsl:apply-templates select="biblio"/>
40   </raweb>
41  </xsl:document>
42 </xsl:template>

A topic is an attribute of a module. Modules can be ordered by section or by topic; for this reason there are two style sheets that convert the XML into HTML. The table of contents of the HTML version has a button that switches from one view to the other; there is only one style sheet for the PDF, and it ignores these topic attributes. A topic attribute is a reference to a topic declaration like <topic num='1'><t_titre> High-level modeling</t_titre></topic>. The value of the num attribute is computed by Tralics; specifications say that it should be an integer. In the new DTD, it should be an ID, so that we transform it to t_1 (the IDs generated by Tralics are of the form uid125 or bid125, so that this transformation does not conflict with already existing IDs). The <t_titre> element is useless here; it was removed in the new DTD. The topic declarations are moved from inside <accueil> to after <identification>.

43 <xsl:template name="topic">
44  <xsl:for-each select="accueil/child::topic">
45   <xsl:element name="topic">
46    <xsl:attribute name="id">t_<xsl:value-of select="@num"/></xsl:attribute>
47      <xsl:value-of select="./t_titre"/>
48    </xsl:element>
49  </xsl:for-each>
50 </xsl:template>
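
To make the effect of this rule concrete, here is a minimal, self-contained sketch that should run with any XSLT 1.0 processor. The sample input is made up for the illustration, and the <topics> wrapper is hypothetical: it exists only so that the result has a single root element. The namespace URI is the standard one for XSLT.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Sketch only. Hypothetical input:
       <raweb><accueil html='apics'>
         <topic num='1'><t_titre>High-level modeling</t_titre></topic>
         <topic num='2'><t_titre>Control</t_titre></topic>
       </accueil></raweb>
     Expected result, up to white space:
       <topics><topic id="t_1">High-level modeling</topic>
               <topic id="t_2">Control</topic></topics>
-->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 <xsl:output method="xml" omit-xml-declaration="yes"/>
 <xsl:template match="/raweb">
  <topics>  <!-- hypothetical wrapper, not part of the Raweb DTD -->
   <xsl:for-each select="accueil/topic">
    <topic>
     <!-- the integer num becomes an ID of the form t_N -->
     <xsl:attribute name="id">t_<xsl:value-of select="@num"/></xsl:attribute>
     <!-- only the text of t_titre is kept, the element itself disappears -->
     <xsl:value-of select="t_titre"/>
    </topic>
   </xsl:for-each>
  </topics>
 </xsl:template>
</xsl:transform>
```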

This piece of code is a copy from the TEI. The idea is to leave math formulas unchanged. It will be used in all style sheets.

51 <xsl:template match="*|@*|comment()|processing-instruction()|text()" mode="math">
52  <xsl:copy>
53   <xsl:apply-templates mode="math" select="*|@*|processing-instruction()|text()"/>
54  </xsl:copy>
55 </xsl:template>

A <formula> is a wrapper for a <math> expression. The action here is to copy the attributes and the content.

56 <xsl:template match="formula">
57     <xsl:element name="formula">
58        <xsl:copy-of select="@*"/><xsl:apply-templates mode="math"/>
59     </xsl:element>
60 </xsl:template>

If a formula has the attribute type = `display´, it is a display math formula, outside any paragraph. We put it in a <p> element.

61 <xsl:template match="formula[@type='display']">
62  <p>
63   <xsl:element name="formula">
64      <xsl:copy-of select="@*"/><xsl:apply-templates mode="math"/>
65   </xsl:element>
66  </p>
67 </xsl:template>

We convert <accueil> into <identification>. We copy the isproject attribute (this is `true´ in the case where the team is a project, `false´ otherwise). We copy the html attribute, renaming it id. After that, we add the transformations of the following elements: <projet>, <projetdeveloppe>, <theme>, <composition> (this is a sibling), <UR>, and <moreinfo> (this is optional).

68 <xsl:template match="accueil">
69  <xsl:element name="identification">
70   <xsl:attribute name="isproject"><xsl:value-of select="@isproject"/></xsl:attribute>
71   <xsl:attribute name="id"><xsl:value-of select="@html"/></xsl:attribute>
72   <xsl:apply-templates select="projet"/>
73   <xsl:apply-templates select="projetdeveloppe"/>
74   <xsl:apply-templates select="theme"/>
75   <xsl:apply-templates select="../composition"/>
76   <xsl:apply-templates select="UR"/>
77   <xsl:if test="/raweb/moreinfo">
78     <xsl:apply-templates select="/raweb/moreinfo"/>
79   </xsl:if>
80  </xsl:element>
81 </xsl:template>

We copy the <theme> element, replacing lowercase letters by uppercase ones.

82 <xsl:template match="accueil/theme">
83  <xsl:element name="theme">
84    <xsl:value-of select="translate(.,'abcdefghijklmnopqrstuvwxyz',
85                                      'ABCDEFGHIJKLMNOPQRSTUVWXYZ')"/>
86  </xsl:element>
87 </xsl:template>

We copy the <projet> element, renaming it <shortname>. Note that, for the LaTeX team, this could be <LaTeX>, so that a simple copy is not enough.

88 <xsl:template match="accueil/projet">
89  <xsl:element name="shortname"> <xsl:apply-templates />  </xsl:element>
90 </xsl:template>

We copy the <projetdeveloppe> element, renaming it <projectName>. Note that this element can have font changes, hence we must process the content.

91 <xsl:template match="accueil/projetdeveloppe">
92  <xsl:element name="projectName"> <xsl:apply-templates /> </xsl:element>
93 </xsl:template>

In the case of <UR>, we consider only the content; it should be a sequence of elements of the form <URxxx>, these elements are listed below.

94 <xsl:template match="UR">
95    <xsl:apply-templates />
96 </xsl:template>

We replace <URRocquencourt>, <URRhoneAlpes>, <URRennes>, <URLorraine>, <URFuturs>, and <URSophia> by <UR name='Rocquencourt'/>, etc. These elements are empty; each one represents one of INRIA´s six research units. We use here a literal result element; the actual code uses <xsl:element> and <xsl:attribute>.

97 <xsl:template match="URRocquencourt">
98    <UR name="Rocquencourt" />
99 </xsl:template>
100 <xsl:template match="URRhoneAlpes">
101    <UR name="RhoneAlpes"/>
102 </xsl:template>
103 <xsl:template match="URRennes">
104    <UR name="Rennes"/>
105 </xsl:template>
106 <xsl:template match="URLorraine">
107    <UR name="Lorraine"/>
108 </xsl:template>
109 <xsl:template match="URFuturs">
110    <UR name="Futurs" />
111 </xsl:template>
112 <xsl:template match="URSophia">
113    <UR name="Sophia" />
114 </xsl:template>
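
The six rules above differ only by the name of the unit. As a sketch of an alternative (this is not what the actual style sheet does), a single rule could derive the name from the element name, using <xsl:element> and <xsl:attribute> as the real code is said to do. Unlike the six explicit rules, it accepts any child of <UR> whose name starts with `UR´, so it is less strict.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Sketch only. Hypothetical input:  <UR><URRocquencourt/><URSophia/></UR>
     The result consists of two elements,
       <UR name="Rocquencourt"/> and <UR name="Sophia"/> -->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 <!-- As in the original: only the content of UR is considered -->
 <xsl:template match="UR">
  <xsl:apply-templates/>
 </xsl:template>
 <!-- One rule for all children whose name starts with UR -->
 <xsl:template match="UR/*[starts-with(name(), 'UR')]">
  <xsl:element name="UR">
   <xsl:attribute name="name">
    <!-- substring-after uses the first occurrence of UR, i.e. the prefix -->
    <xsl:value-of select="substring-after(name(), 'UR')"/>
   </xsl:attribute>
  </xsl:element>
 </xsl:template>
</xsl:transform>
```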

Translation of <presentation>, <fondements>, <domaine>, <logiciels>, <resultats>, <contrats>, <international>, and <diffusion>. The result is an element of the same name. The only difference is that we convert the titre attribute into a <bodyTitle> element. This attribute is constant, defined in the DTD, see page 9.1.2 and following, lines 181, 186, 191, 196, 201, 206, 211, 216, and 221.

115 <xsl:template match="presentation | fondements | domaine | logiciels |
116     resultats | contrats | international | diffusion">
117  <xsl:variable name="nodename" select="name()"/>
118  <xsl:element name="{$nodename}">
119    <xsl:attribute name="id">  <xsl:value-of select="@id"/>  </xsl:attribute>
120    <xsl:element name="bodyTitle">
121        <xsl:value-of select="@titre"/>
122    </xsl:element>
123    <xsl:apply-templates/>
124  </xsl:element>
125 </xsl:template>

We simplified a bit the code above by assuming that the id attribute is present. The following code should be used instead of line 119.

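    <xsl:choose>
       <xsl:when test="@id">
          <xsl:attribute name="id"><xsl:value-of select="@id"/></xsl:attribute>
       </xsl:when>
       <xsl:when test="@num">
          <xsl:attribute name="id"><xsl:value-of select="@num"/></xsl:attribute>
       </xsl:when>
       <xsl:otherwise>
          <xsl:attribute name="id"><xsl:value-of select="position()"/></xsl:attribute>
       </xsl:otherwise>
    </xsl:choose>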

Translation of <module>. The result is a <subsection>. We convert, in order, <head> (this is the title of the module), <participant>, <participante>, <participants>, <participantes> (four variants that indicate the participants in the action described in the module), <keywords> (the keywords), <moreinfo> (the `moreinfo´ data structure; we grab the first element, its transformation will read the other siblings), and finally everything else. A module has two attributes, id and topic, that are copied. If the topic is present, we have to convert the value as on line 46 above (`12´ replaced by `t_12´).

126 <xsl:template match="module">
127  <xsl:element name="subsection">
128    <xsl:if test="@topic and @topic!=''">
129      <xsl:attribute name="topic">t_<xsl:value-of select="@topic"/>
130      </xsl:attribute>
131    </xsl:if>
132    <xsl:call-template name="id"/>
133    <xsl:apply-templates select="head" mode="caption"/>
134    <xsl:apply-templates
135       select="participants | participant | participantes | participante"/>
136    <xsl:apply-templates select="keywords"/>
137    <xsl:apply-templates select="moreinfo[position()=1]"/>
138    <xsl:apply-templates select="node()[local-name() != 'moreinfo'
139        and local-name()!='keywords' and local-name()!='head'
140        and local-name()!='participants' and local-name()!='participant'
141        and local-name()!='participante' and local-name()!='participantes' ]"/>
142  </xsl:element>
143 </xsl:template>

Translation of <div0>, <div1>, <div2>, <div3>, and <div4>. The result is a <subsection>, the code is the same as for a module, except that these elements have no topic attribute.

144 <xsl:template match="div0 | div1 | div2 | div3 | div4">
145  <xsl:element name="subsection">
146    <xsl:call-template name="id"/>
147    <xsl:apply-templates select="head" mode="caption"/>
148    <xsl:apply-templates
149         select="participants | participant | participantes | participante"/>
150    <xsl:apply-templates select="keywords"/>
151    <xsl:apply-templates select="moreinfo[position()=1]"/>
152    <xsl:apply-templates select="node()[...]" />         <!-- as above l. 138-141 -->
153  </xsl:element>
154 </xsl:template>

Transformation of <moreinfo>. The result is a <moreinfo> that contains the content of the element and all the following siblings.

155 <xsl:template match="moreinfo">
156  <xsl:element name="moreinfo">
157    <xsl:apply-templates/>
158    <xsl:for-each select="following-sibling::moreinfo">
159      <xsl:apply-templates/>
160    </xsl:for-each>
161  </xsl:element>
162 </xsl:template>
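
The following self-contained sketch illustrates why the callers select only the first <moreinfo>: the rule itself pulls in the content of all following siblings, so applying it to each <moreinfo> in turn would output the later ones several times. The input and the trivial rule for <p> are made up for the illustration.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Sketch only. Hypothetical input:
       <module><moreinfo><p>A</p></moreinfo><p>text</p><moreinfo><p>B</p></moreinfo></module>
     Expected result:
       <moreinfo><p>A</p><p>B</p></moreinfo> -->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 <xsl:output method="xml" omit-xml-declaration="yes"/>
 <!-- The caller grabs the first moreinfo only -->
 <xsl:template match="module">
  <xsl:apply-templates select="moreinfo[position()=1]"/>
 </xsl:template>
 <!-- Same logic as the rule above: own content, then that of the siblings -->
 <xsl:template match="moreinfo">
  <moreinfo>
   <xsl:apply-templates/>
   <xsl:for-each select="following-sibling::moreinfo">
    <xsl:apply-templates/>
   </xsl:for-each>
  </moreinfo>
 </xsl:template>
 <!-- Hypothetical stand-in for the real paragraph rule -->
 <xsl:template match="p"><xsl:copy-of select="."/></xsl:template>
</xsl:transform>
```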

Transformation of <composition>. The result is a <team> element, containing the <catperso> children and an optional <moreinfo> (let´s hope there is only one, because of the code line 158).

163 <xsl:template match="composition">
164  <xsl:element name="team">
165    <xsl:call-template name="id"/>
166    <xsl:call-template name="catperso"/>
167    <xsl:apply-templates select="moreinfo"/>
168  </xsl:element>
169 </xsl:template>

The transformation of <catperso><head>foo</head>etc</catperso> is a <participants> element with an attribute category = `foo´, with spaces replaced by underscores, and whose content is the translation of all <pers> elements it contains (the semantics is: a <catperso> contains a title in <head>, that could be `Ph.D. Students´, followed by some <pers> elements, all the students of the team).

170 <xsl:template name="catperso">
171  <xsl:for-each select="catperso">
172    <xsl:element name="participants">
173      <xsl:attribute name="category">
174        <xsl:value-of select="translate(./head, ' ', '_')"/>
175      </xsl:attribute>
176      <xsl:apply-templates select="pers"/>
177    </xsl:element>
178  </xsl:for-each>
179 </xsl:template>
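
As an illustration, the sketch below applies the same translate() call to a made-up team. It uses an attribute value template instead of <xsl:attribute>, and it copies the <pers> elements unchanged instead of transforming them; both are simplifications for the example, not the behavior of the real style sheet. Note that translate() replaces only the space character here; a tab or a newline in the <head> would be left as is.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Sketch only. Hypothetical input:
       <composition>
         <catperso><head>Ph.D. Students</head><pers prenom='Jane' nom='Doe'/></catperso>
       </composition>
     Expected result, up to white space and attribute order:
       <team><participants category="Ph.D._Students">
         <pers prenom="Jane" nom="Doe"/></participants></team> -->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 <xsl:output method="xml" omit-xml-declaration="yes"/>
 <xsl:template match="composition">
  <team>
   <xsl:for-each select="catperso">
    <!-- spaces in the title become underscores in the category -->
    <participants category="{translate(head, ' ', '_')}">
     <xsl:copy-of select="pers"/>
    </participants>
   </xsl:for-each>
  </team>
 </xsl:template>
</xsl:transform>
```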

The transformation of <participants> is also a <participants> element, where the category attribute has value `None´. Originally, we had four elements, this one and <participant>, <participante>, <participantes>. This was simplified: the difference between masculine and feminine does not appear in English; the final s is removed, it will be added later if the list contains more than one element.

180 <xsl:template match="participants | participant | participantes | participante">
181   <xsl:element name="participants">
182     <xsl:attribute name="category">None</xsl:attribute>
183     <xsl:apply-templates/>
184   </xsl:element>
185 </xsl:template>

The transformation of <pers prenom=`Donald´ nom=`Knuth´>Author of <TeX/> </pers> is a <person> element, with three children, the first is <firstname>, the second is <lastname>, they contain the prenom and nom, and the last one is a <moreinfo> element that contains the content of this element; it is optional. The test is strange because later on (see lines 1133 and 1147 in the next chapter) we test again for emptiness, but white space is normalised there, not here. The code changed in 2006, because two required attributes, affiliation and profession, were added. Moreover, the LaTeX command has two optional arguments, producing the value of the hdr attribute and the content of the <pers> element. If only one optional argument is given, it is the content of the element; this piece of code allows the case where one optional argument is given, with value `habilite´, which is handled as if there were two optional arguments: empty content, non-empty attribute.

186 <xsl:template match="pers">
187  <xsl:element name="person">
188    <xsl:call-template name="id"/>
189    <xsl:element name="firstname"><xsl:value-of select="./@prenom"/></xsl:element>
190    <xsl:element name="lastname"><xsl:value-of select="./@nom"/> </xsl:element>
191    <xsl:element name="affiliation">
192        <xsl:value-of select="./@affiliation"/>
193    </xsl:element>
194    <xsl:element name="categoryPro">
195        <xsl:value-of select="./@profession"/>
196    </xsl:element>
197    <xsl:if test="string-length(.) > 0 and .!='habilite'">
198      <xsl:element name="moreinfo">
199         <xsl:apply-templates/>
200      </xsl:element>
201    </xsl:if>
202    <xsl:if test="string-length(./@hdr) > 0 or .='habilite'">
203      <xsl:element name="hdr">
204        <xsl:text>oui</xsl:text>
205      </xsl:element>
206    </xsl:if>
207  </xsl:element>
208 </xsl:template>
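
The two tests at the end of this rule can be tried in isolation with the reduced sketch below. It keeps only <firstname>, <lastname> and the two conditions; the call to the `id´ template and the 2006 additions are left out, and the sample persons are made up.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Sketch only. Two hypothetical inputs and the expected results:
       <pers prenom='Donald' nom='Knuth'>Author of TeX</pers>
         gives firstname, lastname, and <moreinfo>Author of TeX</moreinfo>, no hdr
       <pers prenom='Jane' nom='Doe'>habilite</pers>
         gives firstname, lastname, and <hdr>oui</hdr>, no moreinfo -->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 <xsl:output method="xml" omit-xml-declaration="yes"/>
 <xsl:template match="pers">
  <person>
   <firstname><xsl:value-of select="@prenom"/></firstname>
   <lastname><xsl:value-of select="@nom"/></lastname>
   <!-- non-empty content, unless it is the special word habilite -->
   <xsl:if test="string-length(.) > 0 and . != 'habilite'">
    <moreinfo><xsl:apply-templates/></moreinfo>
   </xsl:if>
   <!-- hdr given as an attribute, or via the special content -->
   <xsl:if test="string-length(@hdr) > 0 or . = 'habilite'">
    <hdr>oui</hdr>
   </xsl:if>
  </person>
 </xsl:template>
</xsl:transform>
```

Since string-length(.) does not normalise white space, a <pers> whose content is only blank characters would, in a processor that keeps such text nodes, still produce a <moreinfo>.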

The element <refperson> is not defined in the old DTD. We can leave it unchanged.

209 <xsl:template match="refperson"> <xsl:copy-of select="."/> </xsl:template>

Transformation of <hi rend=XX>text</hi>. The result depends on the value of the attribute. If the attribute is `sup´, we construct a <sup> element; if the attribute is `sub´, we construct a <sub> element; if the attribute is `bold´, we construct a <b> element, with a hack: if you use the obsolete environments body and abstract, Tralics inserts a warning in the document, this is removed here; if the attribute is `small´, we construct a <small> element; if the attribute is `large´, we construct a <big> element; if the attribute is `tt´, we construct a <tt> element; if the attribute is `sc´, we construct a <span> element; if the attribute is `center´, we construct a <span> element; if the attribute is `underline´, we construct an <em> element; otherwise, the result is an <i> element.

210 <xsl:template match="hi">
211   <xsl:choose>
212    <xsl:when test="@rend = 'sup'"> <sup><xsl:apply-templates/></sup></xsl:when>
213    <xsl:when test="@rend = 'sub'"> <sub><xsl:apply-templates/></sub></xsl:when>
214    <xsl:when test="@rend = 'bold'">
215     <xsl:if test=".!='Body (obsolete)'">
216        <xsl:if test=".!='Abstrat (obsolete)'">
217          <b><xsl:apply-templates/></b>
218        </xsl:if>
219     </xsl:if>
220    </xsl:when>
221    <xsl:when test="@rend = 'small'">
222      <small><xsl:apply-templates/></small> </xsl:when>
223    <xsl:when test="@rend = 'sc'">
224      <span class="smallcap"> <xsl:value-of select="."/> </span>
225    </xsl:when>
226    <xsl:when test="@rend = 'large'"><big><xsl:apply-templates/></big> </xsl:when>
227    <xsl:when test="@rend = 'center'">
228       <span align="center"><xsl:apply-templates/></span>
229    </xsl:when>
230    <xsl:when test="@rend = 'underline'">
231       <em style="UNDERLINE"><xsl:apply-templates/></em>
232    </xsl:when>
233    <xsl:when test="@rend = 'tt'"> <tt><xsl:apply-templates/></tt> </xsl:when>
234    <xsl:otherwise> <i><xsl:apply-templates/></i> </xsl:otherwise>
235   </xsl:choose>
236 </xsl:template>

Transformation of <keywords>. We consider only the <term> children, changing the name to <keyword> (there should be no other children). Note: in 2006, a test was added: if the value of the term is empty, nothing happens.

237 <xsl:template match="keywords">
238   <xsl:for-each select="term">
239     <xsl:if test="string-length(.)>0">
240       <xsl:element name="keyword">
241         <xsl:call-template name="id"/>
242         <xsl:value-of select="."/>
243       </xsl:element>
244     </xsl:if>
245   </xsl:for-each>
246 </xsl:template>

Transformation of <code>. This is trivial.

247 <xsl:template match="code">
248    <xsl:element name="code"> <xsl:apply-templates/> </xsl:element>
249 </xsl:template>

Transformation of <ref>. The result is an element of the same name. It has the same id (does anybody reference a reference?). The target attribute is replaced by an xlink:href attribute, with the same value preceded by a #. There is also a location attribute whose value is `intern´, except when the parent is a <cit>, in which case `biblio´ is used.

250 <xsl:template match="ref">
251    <xsl:element name="ref">
252       <xsl:call-template name="id"/>
253       <xsl:attribute name="xlink:href" namespace="">
254          <xsl:value-of select="concat('#', @target)"/>
255       </xsl:attribute>
256       <xsl:attribute name="location">
257          <xsl:choose>
258             <xsl:when test="parent::cit">biblio</xsl:when>
259             <xsl:otherwise>intern</xsl:otherwise>
260          </xsl:choose>
261       </xsl:attribute>
262       <xsl:apply-templates/>
263    </xsl:element>
264 </xsl:template>
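
Here is a reduced sketch of this rule, together with the rule for <cit>, that can be run on its own. The call to the `id´ template is omitted, the input is made up, and the URI bound to the xlink prefix is the standard XLink one, which is an assumption about what the actual style sheet declares.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Sketch only. Hypothetical input:
       <p>See <ref target='uid12'/> and <cit><ref target='bid3'/></cit></p>
     The result is the text of the paragraph with two ref elements:
       the first has xlink:href="#uid12" and location="intern",
       the second has xlink:href="#bid3" and location="biblio" -->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
               xmlns:xlink="http://www.w3.org/1999/xlink">
 <xsl:output method="xml" omit-xml-declaration="yes"/>
 <xsl:template match="ref">
  <!-- the target becomes a local link: a # is put in front -->
  <ref xlink:href="#{@target}">
   <xsl:attribute name="location">
    <xsl:choose>
     <xsl:when test="parent::cit">biblio</xsl:when>
     <xsl:otherwise>intern</xsl:otherwise>
    </xsl:choose>
   </xsl:attribute>
   <xsl:apply-templates/>
  </ref>
 </xsl:template>
 <!-- cit disappears, only its content is transformed -->
 <xsl:template match="cit"><xsl:apply-templates/></xsl:template>
</xsl:transform>
```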

Transformation of <cit>. We transform only the content, which should be a single <ref>. It is possible to know that the <ref> comes from a <cit> because of its location attribute.

265 <xsl:template match="cit">  <xsl:apply-templates/> </xsl:template>

Transformation of <xref>. We simplified the code by removing a test that was wrong, hence always false. The result is the same as <ref>, but location is always external, and the link is unchanged. The test that was removed is: if the name of the ancestor of the element is `citation´, then generate the same code, but in <biblScope> element.

266 <xsl:template match="xref">
267      <xsl:element name="ref">
268       <xsl:call-template name="id"/>
269       <xsl:attribute name="xlink:href" namespace="">
270          <xsl:value-of select="@url"/>
271       </xsl:attribute>
272       <xsl:attribute name="location">extern</xsl:attribute>
273       <xsl:apply-templates/>
274      </xsl:element>
275 </xsl:template>

The <ident> element is unused. It should contain only text.

276 <xsl:template match="ident"> <xsl:copy/> </xsl:template>

Transformation of <note>; the result is <footnote>.

277 <xsl:template match="note">
278    <xsl:element name="footnote"><xsl:copy-of select="@*"/>
279       <xsl:call-template name="id"/>
280       <xsl:apply-templates/>
281    </xsl:element>
282 </xsl:template>

Transformation of <p>. Trivial. Note: we shall see later that there is a second rule for this element.

283 <xsl:template match="p">
284    <xsl:element name="p">
285       <xsl:copy-of select="@*" />
286       <xsl:apply-templates/>
287    </xsl:element>
288 </xsl:template>

Transformation of <list>. The result is <descriptionlist>, <glosslist>, <orderedlist>, <simplelist>, depending on the value of the type attribute.

289 <xsl:template match="list">
290   <xsl:choose>
291    <xsl:when test="@type='description'">
292      <xsl:element name="descriptionlist">
293         <xsl:call-template name="id"/>  <xsl:apply-templates/>
294      </xsl:element>
295    </xsl:when>
296    <xsl:when test="@type='gloss'">
297      <xsl:element name="glosslist">
298        <xsl:call-template name="id"/> <xsl:apply-templates/>
299      </xsl:element>
300    </xsl:when>
301    <xsl:when test="@type='ordered'">
302      <xsl:element name="orderedlist">
303        <xsl:call-template name="id"/> <xsl:apply-templates/>
304      </xsl:element>
305    </xsl:when>
306    <xsl:otherwise>
307      <xsl:element name="simplelist">
308        <xsl:call-template name="id"/> <xsl:apply-templates/>
309      </xsl:element>
310    </xsl:otherwise>
311   </xsl:choose>
312 </xsl:template>

Transformation of <item>: the result is <li>.

313 <xsl:template match="item">
314    <xsl:element name="li">
315       <xsl:call-template name="id"/> <xsl:apply-templates/>
316    </xsl:element>
317 </xsl:template>

Transformation of <label>: the result is <label>.

318 <xsl:template match="label">
319    <xsl:element name="label">
320       <xsl:call-template name="id"/>  <xsl:apply-templates/>
321    </xsl:element>
322 </xsl:template>

Transformation of <table>. The result is a <table>. We set the attribute border to `solid´ in case one of the cells in the table has a bottom-border attribute that is true. The rend attribute is copied. We copy all <row> children, followed by the <caption>, if there is one (normally, there is none), followed by <head>, renamed to <caption>.

323 <xsl:template match="table">
324  <xsl:element name="table">
325    <xsl:if test="./row/cell/@bottom-border='true'">
326      <xsl:attribute name="border">solid</xsl:attribute>
327    </xsl:if>
328    <xsl:if test="./@rend">
329     <xsl:attribute name="rend"><xsl:value-of select="./@rend" /></xsl:attribute>
330    </xsl:if>
331    <xsl:call-template name="id"/>
332    <xsl:apply-templates select="row" />
333    <xsl:apply-templates select="caption"/>
334    <xsl:element name="caption"> <xsl:value-of select="head"/> </xsl:element>
335   </xsl:element>
336 </xsl:template>

Transformation of <row>. The test here is strange. In the case where the test is false, the result is a <tr>, with the same content as the row. There is one attribute, style, obtained from the right-border, top-border, left-border, and bottom-border attributes.

337 <xsl:template match="row">
338   <xsl:choose>
339     <xsl:when test="normalize-space(.) = '' and not(cell/child::*)">
341     </xsl:when>
342     <xsl:otherwise>
343      <xsl:element name="tr">
344        <xsl:attribute name="style">
345         <xsl:if test="@right-border='true'"
346                 >border-right-style:solid;border-right-width:1px;</xsl:if>
347         <xsl:if test="@top-border='true'"
348                 >border-top-style:solid;border-top-width:1px;</xsl:if>
349         <xsl:if test="@left-border='true'"
350                 >border-left-style:solid;border-left-width:1px;</xsl:if>
351         <xsl:if test="@bottom-border='true'"
352                 >border-bottom-style:solid; border-bottom-width:1px;</xsl:if>
353        </xsl:attribute >
354        <xsl:apply-templates/>
355      </xsl:element>
356     </xsl:otherwise>
357   </xsl:choose>
358 </xsl:template>
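
The effect of the test can be seen on a small made-up table: a row is dropped only when it contains no text other than white space and none of its cells has a child element. In the sketch below the cells are copied unchanged (a simplification) and the style attribute is left out.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Sketch only. Hypothetical input:
       <table>
         <row><cell/><cell> </cell></row>
         <row><cell>a</cell></row>
         <row><cell><figure file='x'/></cell></row>
       </table>
     The first row is dropped (blank text, no element in any cell).
     The second row is kept (it has text).
     The third row is kept (no text, but a cell has a child element).
     The result is a table with two tr elements. -->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 <xsl:output method="xml" omit-xml-declaration="yes"/>
 <xsl:template match="table">
  <table><xsl:apply-templates select="row"/></table>
 </xsl:template>
 <xsl:template match="row">
  <xsl:choose>
   <!-- same test as above: an empty branch means no output -->
   <xsl:when test="normalize-space(.) = '' and not(cell/child::*)"/>
   <xsl:otherwise>
    <tr><xsl:copy-of select="cell"/></tr>
   </xsl:otherwise>
  </xsl:choose>
 </xsl:template>
</xsl:transform>
```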

The transformation of <cell> is <td>, with the same content. There is one attribute style obtained from the halign, right-border, top-border, left-border, and bottom-border attributes. Attributes rows and cols are copied if the value is greater than one (this is the row span or column span of the cell).

359 <xsl:template match="cell">
360  <xsl:element name="td">
361    <xsl:attribute name="style">
362     <xsl:if test="@halign"
363        >text-align:<xsl:value-of select="@halign"/>;</xsl:if>
364     <xsl:if test="@right-border='true'"
365        >border-right-style:solid;border-right-width:1px;</xsl:if>
366     <xsl:if test="@top-border='true'"
367        >border-top-style:solid;border-top-width:1px;</xsl:if>
368     <xsl:if test="@left-border='true'"
369        >border-left-style:solid;border-left-width:1px;</xsl:if>
370     <xsl:if test="@bottom-border='true'"
371        >border-bottom-style:solid; border-bottom-width:1px;</xsl:if>
372    </xsl:attribute >
373    <xsl:if test="./@cols>1">
374      <xsl:attribute name="cols"><xsl:value-of select="./@cols" /></xsl:attribute>
375    </xsl:if>
376    <xsl:if test="./@rows>1">
377      <xsl:attribute name="rows"><xsl:value-of select="./@rows" /></xsl:attribute>
378    </xsl:if>
379   <xsl:apply-templates/>
380  </xsl:element>
381 </xsl:template>
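
The four border tests appear twice, once for <row> and once for <cell>. As a design sketch only (the template name `borders´ is hypothetical and not part of the Raweb style sheets), they could be shared through a named template that loops over the border attributes. Since the order of attribute nodes is left to the processor, the CSS declarations may come out in a different order than in the original, which does not change their meaning.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Sketch only. Hypothetical input:
       <cell halign='center' bottom-border='true'>x</cell>
     Expected result:
       <td style="text-align:center;border-bottom-style:solid;border-bottom-width:1px;">x</td> -->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 <xsl:output method="xml" omit-xml-declaration="yes"/>
 <!-- Hypothetical helper: emits the CSS for each side whose attribute is true -->
 <xsl:template name="borders">
  <xsl:for-each select="@right-border | @top-border | @left-border | @bottom-border">
   <xsl:if test=". = 'true'">
    <!-- bottom-border gives bottom, and so on -->
    <xsl:variable name="side" select="substring-before(name(), '-border')"/>
    <xsl:value-of select="concat('border-', $side, '-style:solid;',
                                 'border-', $side, '-width:1px;')"/>
   </xsl:if>
  </xsl:for-each>
 </xsl:template>
 <xsl:template match="cell">
  <td>
   <xsl:attribute name="style">
    <xsl:if test="@halign">text-align:<xsl:value-of select="@halign"/>;</xsl:if>
    <xsl:call-template name="borders"/>
   </xsl:attribute>
   <xsl:apply-templates/>
  </td>
 </xsl:template>
</xsl:transform>
```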

Attributes halign are always copied. This should be explained, because, a priori, all these attributes were converted to a style attribute.

382 <xsl:template match="@halign">
383    <xsl:attribute name="halign"><xsl:value-of select="." /></xsl:attribute>
384 </xsl:template>

This converts a <figure> element into a <ressource> element. The rend attribute is copied under the name type. Other attributes like width, height, scale, angle, and framed are just copied.

385 <xsl:template name="ressource">
386   <ressource xlink:href="{@file}">
387     <xsl:if test="@rend">
388       <xsl:attribute name="type"><xsl:value-of select="@rend"/></xsl:attribute>
389     </xsl:if>
390     <xsl:if test="@width"> <xsl:copy-of select="@width"/> </xsl:if>
391     <xsl:if test="@height"> <xsl:copy-of select="@height"/> </xsl:if>
392     <xsl:if test="@scale">  <xsl:copy-of select="@scale"/> </xsl:if>
393     <xsl:if test="@angle"> <xsl:copy-of select="@angle"/>  </xsl:if>
394     <xsl:if test="@framed"> <xsl:copy-of select="@framed"/> </xsl:if>
395     <xsl:if test="head and ((ancestor::figure) or not(@file))">
396       <xsl:apply-templates select="head"  mode="caption"/>
397     </xsl:if>
398   </ressource>
399 </xsl:template>

In the case where a <figure> is in a <table> which is in a <figure>, and if it has a file attribute, then the result is a `ressource´.

400 <xsl:template match="figure//table//figure[@file]" priority="5">
401   <xsl:call-template name="ressource"/>
402 </xsl:template>

In the case where a <figure> has a file attribute, is below a figure, but does not match the rule above, then the result is a `ressource´, as above, but in a <td>.

403 <xsl:template match="figure[(ancestor::figure) and @file]">
404   <td><xsl:call-template name="ressource"/></td>
405 </xsl:template>

This is the last rule for a <figure>; it applies to a figure that is not inside another figure. Let´s hope no case is forgotten. The result is an <object> that contains a <table>. If there is a file attribute, the element is empty, and the translation is a <table> with a single <tr> holding a single <td> that contains the ressource. If there is a <p> that contains a table, we consider only these tables (let´s hope for the best). Otherwise, we add a <table>, and each <p> will produce a row. A caption is put at the end.

406 <xsl:template match="figure[not(ancestor::figure)]">
407    <xsl:element name="object">
408       <xsl:call-template name="id"/>
409       <xsl:choose>
410          <xsl:when test="@file">
411             <table>
412               <tr><td>
413               <xsl:call-template name="ressource"/>
414               </td></tr>
415             </table>
416          </xsl:when>
417          <xsl:when test="p/table">
418             <xsl:apply-templates select="p/table" />
419          </xsl:when>
420          <xsl:otherwise>
421             <table> <xsl:apply-templates /> </table>
422          </xsl:otherwise>
423       </xsl:choose>
424       <xsl:apply-templates select="head"  mode="caption"/>
425    </xsl:element>
426 </xsl:template>
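As a rough illustration of the first branch, here is what the template would produce for a simple figure with a file attribute. The input is hypothetical (the id, file name, rend value and caption are invented), the output indentation is arbitrary, and the declaration of the xlink prefix is assumed to be made elsewhere in the style sheet.

```xml
<!-- Hypothetical input, old DTD -->
<figure id="uid12" file="archi" rend="inline" width="10cm">
  <head>System architecture</head>
</figure>

<!-- Shape of the result (xlink namespace declaration omitted) -->
<object id="uid12">
  <table>
    <tr><td><ressource xlink:href="archi" type="inline" width="10cm"/></td></tr>
  </table>
  <caption>System architecture</caption>
</object>
```

The caption does not appear inside the <ressource> here, because the figure has a file attribute and no <figure> ancestor; it is added once, at the end of the <object>.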

This code is applied only in the `otherwise´ case of the previous template. For 2005, the best solution would be to modify Tralics so that this style sheet can be made more robust.

427 <xsl:template match="figure/p">
428    <tr><xsl:apply-templates/></tr>
429 </xsl:template>
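The interplay between the last rules can be sketched on a figure that holds two sub-figures in one paragraph. The input below is hypothetical (ids, file names and captions are invented); the output is what the templates shown above would give, indentation aside, with the xlink namespace declaration omitted.

```xml
<!-- Hypothetical input: two sub-figures in a single <p> -->
<figure id="uid20">
  <p>
    <figure file="left"><head>Before</head></figure>
    <figure file="right"><head>After</head></figure>
  </p>
  <head>Comparison</head>
</figure>

<!-- Shape of the result: the outer figure takes the `otherwise' branch,
     the <p> becomes a row, each inner figure becomes a cell -->
<object id="uid20">
  <table>
    <tr>
      <td><ressource xlink:href="left"><caption>Before</caption></ressource></td>
      <td><ressource xlink:href="right"><caption>After</caption></ressource></td>
    </tr>
  </table>
  <caption>Comparison</caption>
</object>
```

The inner captions end up inside each <ressource> because these figures have a <figure> ancestor; the outer <head> is ignored in default mode and is emitted only once, as the final <caption>.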

This copies the id attribute if present.

430 <xsl:template name="id">
431  <xsl:if test="./@id">
432    <xsl:attribute name="id"><xsl:value-of select="./@id"/> </xsl:attribute>
433  </xsl:if>
434 </xsl:template>

This interprets <head>. If the parent is <list>, we put the content of the element in the title attribute of the current element. If the parent is <figure> or <table>, we put the content in a <caption> element. Otherwise, we put it in a <bodyTitle> element. Note that Tralics replaces some empty titles by `(Sans Titre)´. The code on line 447 was changed in 2005: if the section has a single module, the name of the section is used; otherwise `Introduction´ is used. You will see Xsl instead of xsl twice; in both cases, the code contained a <xsl:text></xsl:text> that is not shown here (it seems useless to me).

435 <xsl:template match="head" mode="caption">
436   <xsl:choose>
437     <xsl:when test="parent::list" >
438       <xsl:attribute name="title"> <xsl:apply-templates/> </xsl:attribute>
439     </xsl:when>
440     <xsl:when test="parent::figure | parent::table">
441       <xsl:element name="caption"> <xsl:apply-templates/> </xsl:element>
442     </xsl:when>
443     <xsl:otherwise>
444       <xsl:element name="bodyTitle">
445         <xsl:choose>
446           <xsl:when test=".='(Sans Titre)'">
447             <xsl:choose>
448               <xsl:when test="count(../../module)=1">
449                 <xsl:value-of select="../../@titre"/>
450               </xsl:when>
451               <xsl:otherwise> <Xsl:text>Introduction</xsl:text></xsl:otherwise>
452             </xsl:choose>
453           </xsl:when>
454           <xsl:otherwise> <Xsl:apply-templates/> </xsl:otherwise>
455         </xsl:choose>
456       </xsl:element>
457     </xsl:otherwise>
458   </xsl:choose>
459 </xsl:template>

We do nothing with <head>, because this element should be handled by the routines given above.

460 <xsl:template match="head"></xsl:template>

We leave the <LaTeX> element unchanged.

461 <xsl:template match="LaTeX"> <LaTeX/></xsl:template>

We leave the <TeX> element unchanged.

462 <xsl:template match="TeX"> <TeX/> </xsl:template>

This is the end of the file.

463 </xsl:transform>

5.2. Adding Ids

There is a style sheet that adds some Ids. The original version (denoted by V1 hereafter) is in add-id.xsl, the revised one (denoted by V2) in add-idDTD2.xsl. The original contains a comment that says: “Some Ids are missing in some XML files (that were not generated by Tralics) these are required in the bibliography, we add them everywhere”. The revised version has “they are required for the bibliography and subsection, we add them where needed”. As we shall see, Ids are added only for subsections. We propose additional simplifications, see comments below; this gives V3.

This is the header of the file. It declares a namespace (it binds xmlns:m to the MathML namespace), but this declaration is not used.

501 <xsl:transform
502   xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
503   xmlns:m="http://www.w3.org/1998/Math/MathML"
504   exclude-result-prefixes="m">

The result is an XML file conforming to the Raweb DTD, version 2.

505 <xsl:output method='xml'
506      doctype-public="-//INRIA//DTD Raweb 2" doctype-system='raweb2.dtd'
507      indent='yes' encoding='iso-8859-1'/>
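With these settings, the start of the result file should look roughly as follows. This is only a sketch: the exact form of the XML declaration depends on the XSLT processor, and the name of the root element (`raweb´ here) is an assumption, since the DOCTYPE name is taken from the first element of the output.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE raweb PUBLIC "-//INRIA//DTD Raweb 2" "raweb2.dtd">
<raweb>
  <!-- content of the report, indented by the processor -->
</raweb>
```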

We do not want to add an ID to each character of a math formula; for this reason we copy the formula recursively. We could remove this code; however, a `diff´ between the original XML and the resulting file shows that this code does not perform a simple copy: each <math> element in the formula has a useless xmlns:xlink attribute, which is removed here. Moreover, the DTD specifies some attributes (like TEIform) with a default value; whenever the attribute is missing, the default value is added. For instance, in the case of the `apics´ file, the size changes from 377517 to 406132 bytes.

508 <xsl:template match="formula">
509   <formula>
510     <xsl:copy-of select="@*"/>
511     <xsl:apply-templates mode="math"/>
512   </formula>
513 </xsl:template>

This is a copy of a rule defined elsewhere.

514 <xsl:template match="*|@*|comment()|processing-instruction()|text()" mode="math">
515  <xsl:copy>
516   <xsl:apply-templates mode="math" select="*|@*|processing-instruction()|text()"/>
517  </xsl:copy>
518 </xsl:template>

The whole document is converted.

519 <xsl:template match="/">
520   <xsl:apply-templates />
521 </xsl:template>

We copy all attributes, and text nodes.

522 <xsl:template match="@*">
523   <xsl:copy />
524 </xsl:template>
526 <xsl:template match="text()">
527   <xsl:copy />
528 </xsl:template>

The default template rule says to copy everything.

529 <xsl:template match="*">
530   <xsl:copy>
531      <xsl:apply-templates select="node()|@*" />
532   </xsl:copy>
533 </xsl:template>

In the case of <subsection>, we add an id attribute, if there is none, before copying.

534 <xsl:template match="subsection">
535   <xsl:copy>
536     <xsl:if test="not(./@id)">
537       <xsl:attribute name="id">
538         <xsl:value-of select="generate-id()" />
539       </xsl:attribute>
540     </xsl:if>
541     <xsl:apply-templates select="node()|@*" />
542   </xsl:copy>
543 </xsl:template>
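The effect of this rule can be illustrated on a hypothetical fragment. The element content is invented, and `d0e12´ merely stands for whatever string generate-id() returns, which depends on the XSLT processor; attributes defaulted by the DTD may also show up in a real run.

```xml
<!-- Hypothetical input: the first subsection has no id, the second has one -->
<subsection><bodyTitle>Goals</bodyTitle></subsection>
<subsection id="uid7"><bodyTitle>Methods</bodyTitle></subsection>

<!-- Possible output: an id is generated only where none was present -->
<subsection id="d0e12"><bodyTitle>Goals</bodyTitle></subsection>
<subsection id="uid7"><bodyTitle>Methods</bodyTitle></subsection>
```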

This is the end of the file.

544 </xsl:transform>

The code that follows appears in one of the two style sheets distributed with the Raweb2006; we modified them in order to make the code shorter.

The following rule appears in the style sheets V1 and V2, and is removed in V3. Its effect is that the id attribute is not copied; since the purpose of the file is to add missing ids while keeping existing ones, V1 and V2 have to add another rule that restores it. According to [5], “Duplicate attribute nodes are removed. If several attributes in the sequence have the same name, all but the last are discarded”. Thus we can safely remove this rule, as well as the additional rules.

545 <xsl:template match="@id">
546 </xsl:template>
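The replacement behaviour quoted above can be checked with a small style sheet such as the following. This is an illustrative sketch, not part of add-id.xsl: it creates a generated id unconditionally and then copies the existing attributes, relying on the fact that an attribute added later replaces an earlier one with the same name.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only; not one of the distributed style sheets -->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Copy every node and attribute unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="subsection">
    <xsl:copy>
      <!-- First, always create a generated id -->
      <xsl:attribute name="id"><xsl:value-of select="generate-id()"/></xsl:attribute>
      <!-- Then copy the existing attributes: an existing id, added later,
           replaces the generated one -->
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:transform>
```

A <subsection> that already carries an id keeps it, and one without an id receives the generated value, so no empty rule for @id is needed.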

This is the code of V1. The effect is to add an id to every node; but this is overkill.

547 <xsl:template match="*">
548   <xsl:copy use-attribute-sets="ID">
549     <xsl:apply-templates select="node()|@*" />
550   </xsl:copy>
551 </xsl:template>

This is the code of V2. The effect is a simple copy of the element and its attributes. Because of the empty rule for @id shown above (lines 545 and 546), the id attribute is not copied by default, hence an explicit copy is needed.

552 <xsl:template match="*">
553    <xsl:copy>
554      <xsl:if test="./@id">
555        <xsl:attribute name="id"><xsl:value-of select="./@id" /></xsl:attribute>
556      </xsl:if>
557      <xsl:apply-templates select="node()|@*" />
558    </xsl:copy>
559 </xsl:template>

This rule is added in V2; it effectively adds an id to each <subsection>. This is equivalent to the V3 code shown above (but V3 does not use the use-attribute-sets feature).

560 <xsl:template match="subsection">
561   <xsl:copy use-attribute-sets="ADDID">
562     <xsl:apply-templates select="node()|@*" />
563   </xsl:copy>
564 </xsl:template>

This attribute set was named `ID´ in V1 (used on line 548), and renamed `ADDID´ in V2 (used on line 561). It uses the `generate-id´ function in order to create a unique id, in the case where there is none.

565 <xsl:attribute-set name="ADDID">
566   <xsl:attribute name="id">
567     <xsl:choose>
568       <xsl:when test="./@id"> <xsl:value-of select="./@id" />  </xsl:when>
569       <xsl:otherwise> <xsl:value-of select="generate-id()" />  </xsl:otherwise>
570     </xsl:choose>
571   </xsl:attribute>
572 </xsl:attribute-set>

Since putting Ids everywhere is overkill, some templates were added in V2 to inhibit this.

573 <xsl:template match="LaTeX"> <LaTeX/> </xsl:template>
574 <xsl:template match="TeX">  <TeX/> </xsl:template>

These two variables were used to compute the name of the output file.

575 <xsl:variable name="LeProjet"/>
576 <xsl:variable name="year"/>