1 | Bobrow, D. et al. Common lisp object system specification. Special issue, Sigplan Notices 23, Sep 1988. http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/ai-repository/ai/html/cltl/cltl2.html |
2 | Flatt, M. and Barzilay, E. and Findler, R. B. Scribble: closing the book on ad hoc documentation tools. ICFP '09: Proceedings of the 14th ACM SIGPLAN International Conference on Functional Programming, Edinburgh, Scotland, 2009, 109--120. http://www.cs.utah.edu/plt/publications/icfp09-fbf.pdf |
3 | Gallesio, E. and Serrano, M. Skribe: a Functional Authoring Language. Journal of Functional Programming, 2005. http://www.inria.fr/mimosa/Manuel.Serrano/publi/jfp05/article.html |
4 | Greene, A. BASIX -- An Interpreter Written in TeX. TUGboat 11(3), 1990, 381--392. http://www.tug.org/TUGboat/Articles/tb11-3/tb29greene.pdf |
5 | Hosoya, H. and Pierce, B. XDuce: a Typed XML Processing Language. In Proc. of Workshop on the Web and Data Bases (WebDB), 2000, 226--244. |
6 | ISO/IEC, Information technology, Processing Languages, Document Style Semantics and Specification Languages (DSSSL). 10179:1996(E), ISO, 1996. http://www.jclark.com/dsssl/ |
7 | Kelsey, R. and Clinger, W. and Rees, J. The Revised(5) Report on the Algorithmic Language Scheme. Higher-Order and Symbolic Computation 11(1), Sep 1998. http://www.inria.fr/mimosa/fp/Bigloo/doc/r5rs.html |
8 | Knuth, D. The TEXbook. Addison-Wesley, Reading, Massachusetts, USA, 1986. |
9 | Lamport, L. LaTeX - a Document Preparation System. Addison-Wesley, Reading, Massachusetts, USA, 1986. |
10 | Loitsch, F. and Serrano, M. Hop Client-Side Compilation. Trends in Functional Programming, Seton Hall University, Intellect Bristol (ed. Morazán, M. T.), UK/Chicago, USA, 2008, 141--158. |
11 | Maranget, L. Hevea, un traducteur de LaTeX vers HTML en Caml. Actes des 10e Journées francophones des langages applicatifs, 1999. |
12 | Serrano, M. The HOP Development Kit. Proceedings of the Seventh ACM Sigplan Workshop on Scheme and Functional Programming, Portland, Oregon, USA, Sep 2006. http://www.inria.fr/mimosa/Manuel.Serrano/publi/sfp06/article.html |
13 | Serrano, M. HSS: a Compiler for Cascading Style Sheets. 10th ACM Sigplan Int'l Conference on Principles and Practice of Declarative Programming (PPDP), Hagenberg, Austria, Jul 2010. |
14 | Walsh, N. and Muellner, L. DocBook: The Definitive Guide. O'Reilly, Oct 1999. |
15 | World Wide Web Consortium, XQuery 1.0: An XML Query Language. REC-xquery-20070123, W3C Recommendation, Jan 2007. http://www.w3.org/TR/xquery/ |
16 | World Wide Web Consortium, Cascading Style Sheets level 2 Revision 1, CSS2.1 Specification. CR-CSS2-20090423, W3C Recommendation, Apr 2009. http://www.w3.org/TR/2009/CR-CSS2-20090423/ |
@Misc{ serrano:hoptex11, author = {Serrano, M.}, title = {HopTeX - Compiling HTML to LaTeX with CSS}, category = {web programming}, year = 2011, month = jan, url = {http://hop.inria.fr/hop/weblets/homepage?weblet=hoptex&file=hoptex.pdf} }
This article presents HopTeX, a new application for authoring Html and LaTeX documents. The content of a document is expressed either in Html or in a blend of Html and a dedicated wiki syntax, for the sake of conciseness and readability. The rendering of the document is expressed by a set of Css rules. The main originality of HopTeX is to consider LaTeX as a new media type for Html and to express the compilation from Html to LaTeX by means of dedicated style sheet rules.
HopTeX can then be used to generate high quality documents in both printed and electronic versions.
HopTeX is implemented in Hop, a multi-tier programming language for the Web 2.0. This implementation extensively relies on two facilities generally only available on the client-side that Hop also supports on the server-side of the application: DOM manipulations and Css server-side resolutions.
Many scientific publications, in particular in academia, are authored with TeX or LaTeX [8, 9]. These are batch systems where documents are actually disguised programs that, when executed, produce various output formats including DVI or PDF.
Although the TeX programming language is Turing-complete, it is almost exclusively used as a purely declarative authoring language. Being more than forty years old, it lacks most modern features of programming languages: its syntax is difficult to parse, it supports no object-oriented features, and it offers a limited set of functions for interacting with the operating system. Consequently, programming in TeX requires a strong expertise that repels many, although a small community of aficionados is able to use it beyond expectations (see for instance [4]). On the other hand, TeX is still widely used because its rendering engine, coupled with the MetaFont tool, delivers high quality documents that hardly any contemporary typesetting system matches.
The most striking shortcoming of TeX/LaTeX is its inability to produce Html. Since publishing on the web is nowadays mandatory, translators from LaTeX to Html such as Latex2html or Hevea [11] have emerged. These tools are limited because they offer few facilities for controlling the graphical rendering of the generated documents. This limitation comes from their inability to use Css with the generated Html documents, which lack Html classes and identifiers.
Other tools such as Skribe [3] and Scribble [2] follow the symmetrical path, which consists in considering LaTeX as a target and no longer as a source. They attempt to improve on LaTeX by providing a sane programming language used to generate the texts. They offer an ad hoc syntax that combines algorithmic constructs and text-oriented markups. A program can generate LaTeX as well as Html. These two systems are agnostic with respect to the generated format. As a consequence of this design choice, they adopt abstractions reflecting a least common denominator of their target formats. That design choice also makes them difficult to use when fine-grained tuning of the generated document is needed. This characteristic is shared by systems such as Texinfo or DocBook [14] that represent texts using a neutral syntax that can be compiled to Html, LaTeX, and even other formats.
Accommodating Html as a regular data type in a programming language is not new. DSSSL [6], the pioneer, and LAML are two examples based on the Scheme programming language [7]. Other languages such as XDuce [5] or XQuery [15] extend this to XML. These languages are well suited for manipulating XML documents but they offer no particular support for authoring documents.
HopTeX is a new system for authoring articles, reports, documentation, and books that follows yet another approach. It accepts as input either Html or a compact wiki syntax that can be complemented by expressions of the Hop programming language. It produces either web pages or LaTeX files. HopTeX aims at combining the best of the two worlds: it generates Html for using the modern interactive features of web browsers and it generates LaTeX for producing high quality paper output. This approach enables HopTeX to generate online documents that embed arbitrary Html fragments such as videos, canvas, pictures, or interactive Ajax elements. It also enables HopTeX to generate paper documents that rely on pre-existing LaTeX styles. HopTeX generates regular LaTeX files, so it is up to the user to include the proper statements in the document source. For instance, to accommodate the ACM style required by the conference, the head of the present paper contains the following:
<tex:verbatim>
\documentclass[nocopyrightspace]{sigplanconf}
\usepackage{amsmath}
\usepackage{graphicx}
\usepackage{color}
\setlength{\pdfpagewidth}{8.5in}
\setlength{\pdfpageheight}{11in}
...
\maketitle
</tex:verbatim>
HopTeX is implemented in Hop [12], a multi-tier programming language for the web. Hop offers features that dramatically simplify the implementation of HopTeX. In particular, it constructs a server-side DOM for the HTML documents and it supports a server-side CSS resolver. These two features are extensively used to compile to LaTeX.
This presentation of HopTeX is organized as follows. First, to let users unfamiliar with the Hop programming language understand this paper without consulting previous articles, the language is briefly presented in Section 2. Section 3 presents the main functionalities of HopTeX. Section 4 shows how LaTeX is generated out of the initial Html document. Section 5 shows the benefit HopTeX users can expect from resorting to a full-fledged web programming language.
Hop is a multi-tier programming language for the web which shares many characteristics with JavaScript. It belongs to the functional languages family. It relies on a garbage collector for automatically reclaiming unused allocated memory. It supports type annotations that let the compiler partially check types at compile-time. Types that cannot be inferred are checked dynamically at runtime. It is fully polymorphic (i.e., the universal identity function can be implemented). Hop also differs from JavaScript in several ways, the most striking one being its parenthetical syntax, closer to Html than to C-like languages. Hop is a full-fledged programming language so it offers an extensive set of libraries. It advocates CLOS-like object-oriented programming [1]. Its main characteristic is that it fosters a programming model where a web application is conceived as a whole. For that, it relies on a single formalism that embraces simultaneously the server-side and the client-side of the applications. Both sides communicate by means of function calls and signal notifications. Server-side parts are compiled to either bytecode or native code and client-side parts are compiled to JavaScript [10]. In the source code, a syntactic mark instructs the compiler about the location where the expression is to be evaluated.
When a URL is intercepted by a Hop server for the first time, the server automatically loads the associated program and the libraries it depends on. Programs first authenticate the user on whose behalf they are executed and check that user's permissions. In order to load or install the program on the client side, the server elaborates an abstract syntax tree (AST) and compiles it on the fly to generate an Html/JavaScript document that is sent to the client. Here is an example of a simple Hop program that is started by browsing the URL http://localhost/hop/hello.
(define-service (hello)
   (<HTML>
      (<DIV> :onclick ~(alert "world!")
         "Hello")))
Contrary to Html, Hop's markups (i.e., <HTML> and <DIV>) are node constructors. That is, the service hello elaborates an AST whose compilation into Html is delayed until the result of the request is transmitted to the client. This two-phase evaluation process is strongly different from embedded scripting languages such as PHP. The AST representing the GUI exists on the client as well as on the server. This brings flexibility because it gives the server opportunities to deploy optimized strategies for building and manipulating the ASTs, as it lets DOM computations take place on the server-side of the application. This characteristic is extensively used for implementing HopTeX.
Since this article is not a HopTeX user manual, only its prominent features are presented. HopTeX documents are expressed in Html. However, because the concrete syntax of Html is verbose, it is cumbersome for the user to manipulate. HopTeX therefore proposes an alternative wiki syntax that can be used in conjunction with Html. It is expected that this syntax will be preferred by users, so it is presented first in this section. Secondly, it is shown how the wiki syntax and the full-fledged Html syntax can be blended inside documents.
HopTeX syntax is stratified: the surface syntax is used to typeset input texts, while the deep syntax, which coincides with the syntax of the Hop expressions, is used to embed complex Html trees in the document. The surface syntax is inspired by the most popular wiki syntaxes and in particular by MediaWiki and CreoleWiki. It allows authors to express a subset of Html in a concise and visual way. The tags for strong and emphasize are ** and //, which are considered by some more intuitive and more compact than the corresponding Html tags. For instance, the following HopTeX input text:
HOP wiki supports **strong**, //emphasize//, __underline__, and ++mono space++. These can be **__combined__** **//anyhow//**.
is rendered as:
The surface syntax supports sections (==), paragraphs (~~), verbatim texts (lines beginning with two white spaces), tables (lines beginning with either ^ or |), lists (lines beginning with two white spaces followed by either a * or a - character), and other classical block constructs, whose entries are separated from one another by two blank lines. For instance, the following table:
| This | is ^ a table ^
produces the following result:
| This | is | a table |
The delimiter ^ introduces table head cells while the delimiter | introduces regular table cells. This explains why the words a table are rendered with a bold font in the example above.
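The role of the two delimiters can be illustrated with a minimal Python sketch (Python is used here only for illustration; HopTeX itself is written in Hop). The function names parse_table_row and row_to_html are hypothetical and the sketch assumes that empty cells can be ignored, which the actual HopTeX parser may not do.

```python
import re

def parse_table_row(line):
    """Split one wiki table line into (kind, text) cells.

    '^' opens a head cell ('th') and '|' opens a regular cell ('td').
    The last delimiter on the line only closes the final cell.
    """
    cells = []
    # Each cell is one delimiter followed by the text up to the next delimiter.
    for delim, text in re.findall(r"([|^])([^|^]*)", line.strip()):
        text = text.strip()
        if text:  # assumption of this sketch: empty cells are dropped
            cells.append(("th" if delim == "^" else "td", text))
    return cells

def row_to_html(line):
    """Render the parsed cells as one Html table row."""
    cells = parse_table_row(line)
    return "<tr>" + "".join(f"<{k}>{t}</{k}>" for k, t in cells) + "</tr>"

print(row_to_html("| This | is ^ a table ^"))
# prints <tr><td>This</td><td>is</td><th>a table</th></tr>
```

Only the last cell becomes a th element, which is why it is the only one shown in bold.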
HopTeX supports mathematical expressions which are introduced by the $$ delimiter. Inside this delimiter HopTeX borrows the syntax of TeX whose syntax for mathematics is deemed expressive and compact. Mathematical expressions are compiled to MathML on the fly. For instance:
  * $$\prod_n^m \lim_{n \rightarrow \infty} x = 0$$
  * $$\overbrace{\overline{x}^{2} + 1}$$
  * $$(n+1)^2\quad \sqrt{1-x^2}\quad\overline{w+\bar z} \quad p^{e_1}_1$$
produces:
Links and anchors are syntactically similar to those of MediaWiki but extended to support citations, references, and footnotes that are introduced by using a dedicated protocol (bib: for citations, section: for sections, ...). For instance:
Links refer to URLs such as ++[[http://www.inria.fr]]++. They may also refer to sections or bibliographic entries such as: HopTex is described in Section [[section://HopTeX]].
produces:
The surface syntax trades completeness for compactness. That is, not all Html trees can be represented using the surface syntax. For such trees, the deep syntax is used. The escape sequence of the deep syntax is ,(. When the HopTeX parser reads such a prefix, it reads the rest of the expression using the regular Hop parser, evaluates the expression, and inserts the result in the tree. For instance:
The //deep// escape sequence is ,(<TT> ",("). It can be used to insert HTML trees such as ,(<KBD> "C-x s"). The ++<WIKI>++ markup is used to ,(<SPAN> :style "color: darkblue" (<WIKI> [enter the //surface// syntax from the //deep// syntax])).
produces:
Wiki syntaxes such as the HopTeX surface syntax are designed to express a subset of Html concisely. As such, they are easy to translate into Html. They are far less obviously translated into TeX. This translation is described in this section.
Observation 1: TeX/LaTeX (henceforth LaTeX) and Html are not isomorphic. Html is more flexible and more compositional. For instance, an Html TABLE might contain PRE elements while LaTeX refuses verbatim environments inside a tabular. Consequently not all Html documents, and thus not all HopTeX documents, can be automatically compiled into LaTeX.
Facing this problem, two obvious solutions emerge: either reduce the expressiveness of HopTeX to the least common denominator of Html and LaTeX, or treat Html parts that have no LaTeX equivalent specially. We have considered the intersection of the two languages too small, so we have adopted the latter solution. In consequence, from time to time, HopTeX users have to specify explicitly how to compile some part of the text into LaTeX. However, we have worked hard to minimize the number of occurrences of such situations and we have worked even harder to provide convenient means for expressing these ad hoc compilation schemas.
Observation 2: Cascading Style Sheets (henceforth Css) [16] effectively separate the structure of a document from its rendering. If compiling Html into LaTeX is roughly equivalent to rendering Html into LaTeX, then Css could probably be used for that compilation.
Consider our previous example using bold-face fonts and italic and consider what happens if we ask a web browser to render them using the following Css rules:
strong:before { content: "{\\textbf{"; }
em:before { content: "{\\emph{"; }
strong:after, em:after { content: "}}"; }
The browser will display the following document
HOP wiki supports {\textbf{bold}}, {\emph{italic}}...
which is almost a LaTeX compilation.
The HopTeX compilation relies on Css in a principled manner where the compilation rules are expressed as Css rules. In addition to simplicity, using Css also brings flexibility because it lets users provide their own compilation rules in their own Css files that can override the default compilation strategy.
The browser cannot be used to implement the compilation as a simple Html rendering for two reasons. First, the browser cannot save the rendered text. Second, some compilation rules are more complex than merely adding a prefix and a suffix. For instance, in Html, pre elements are regular blocks that only differ from paragraphs by not collapsing white spaces, by breaking lines at newline character positions, and by using a dedicated font. LaTeX has nothing similar. The verbatim environment has the same behavior for justification and line breaks but considers markups as plain texts. Extensions such as alltt approach pre but all have incompatibilities. In consequence, Html pre elements have to be treated specially when compiled to LaTeX.
HopTeX relies on server-side Css processing. It resorts to the Hss [13] compiler which is included in the Hop development environment [12]. Amongst other features, Hss contains a parser that builds abstract syntax trees and a resolver that matches rules against HTML elements.
When a HopTeX input text is to be compiled into LaTeX, the surface syntax is first parsed to produce a full-fledged server-side DOM representation of the Html document. The elements of this tree are matched against Css rules which govern the compilation into LaTeX. The extra tex keyword can be used in Css @media rules to specify rules that are only applicable to the LaTeX compilation.
The rest of this section presents the details of the compilation. The algorithm is expressed by 4 Hop functions. We deem the Hop language sufficiently high level to be used as an abstract notation for describing these algorithms. Readers unfamiliar with functional programming will probably find some details of the implementation obscure. We hope they will still be able to grasp the general intuition of the algorithms.
The service hoptex/tex implements the entry point of the compiler. It accepts two parameters, the URL of the source file to be compiled and the name of the target file. The service first builds a server-side DOM for the document (using the library function wiki-file->dom). Then it loads the Css style sheets imported in the DOM tree and invokes the xml->tex function.
(define-service (hoptex/tex url dest)
   (let* ((doc (wiki-file->dom url))
          (hd (dom-get-elements-by-tag-name doc "head"))
          (css (map tex-load-hss (links-of-head hd))))
      (call-with-output-file dest
         (lambda (op)
            (xml->tex doc css op)))))
The function xml->tex is in charge of compiling one node of the DOM tree into one LaTeX element. The parameter node is the node to be compiled, the parameter css is the opaque data structure representing the Css rules, and the last parameter p is the output port where the result of the compilation is written. Numbers are written in the target file without modification; strings are escaped, that is, all special LaTeX characters are protected against interpretation (the function tex-string is in charge of this task); lists are recursively processed; and XML nodes are treated specially by the function xml-element->tex which is given in Figure 1.
(define (xml->tex node::obj css::obj p::output-port)
   (cond
      ((string? node)
       (display (tex-string node) p))
      ((number? node)
       (display node p))
      ((list? node)
       (for-each (lambda (o) (xml->tex o css p)) node))
      ((xml-element? node)
       (xml-element->tex node css p))))
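The article does not list tex-string. The following Python sketch shows one plausible way to protect the special LaTeX characters and to dispatch on strings, numbers, and lists. The escape table and the names tex_string and to_tex are assumptions made for illustration, not the HopTeX implementation, and XML elements are deliberately left out.

```python
# Hypothetical escape table: one replacement per special LaTeX character.
TEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "{": r"\{", "}": r"\}",
    "#": r"\#", "$": r"\$", "%": r"\%", "&": r"\&", "_": r"\_",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}

def tex_string(s):
    """Escape s character by character, so that the braces introduced by
    a replacement are never escaped a second time."""
    return "".join(TEX_SPECIALS.get(c, c) for c in s)

def to_tex(node, out):
    """Append the LaTeX text for node to the list out."""
    if isinstance(node, str):
        out.append(tex_string(node))   # strings are escaped
    elif isinstance(node, (int, float)):
        out.append(str(node))          # numbers are written as is
    elif isinstance(node, list):
        for child in node:             # lists are processed recursively
            to_tex(child, out)
    else:
        raise TypeError(f"unsupported node: {node!r}")

out = []
to_tex(["50% of ", 3, [" items & more"]], out)
print("".join(out))  # prints 50\% of 3 items \& more
```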
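Compiling an XML element is decomposed into 7 steps.
1. The style of the element is computed with css-get-computed-style. If no style is found then the compilation simply compiles recursively the children of the node.
2. The visibility of the element is checked, for instance against display: none. Invisible elements are ignored by the compiler.
3. The TeX prelude, which is computed from the style of the element, is emitted.
4. The before attribute is a string of characters that has to be inserted before the current element. It is handled by the function xml-style->tex. For instance, the default before attribute of the Html em nodes is the string {\em{. The before attribute can be customized by users while the prelude is hardwired in HopTeX.
5. The body of the element is compiled, either by the dedicated compilation function attached to the style or by a simple recursive descent.
6. The after attribute is symmetrical to the before attribute. It closes the LaTeX environment opened in the before attribute.
7. The TeX postlude is emitted. It closes what the prelude has opened: when the prelude emits {\small{, the postlude emits }}.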
(define (xml-element->tex node::xml-element css p)
   ;; step 1: compute the style
   (let ((style (css-get-computed-style css node)))
      (if (css-style? style)
          ;; step 2: check visibility
          (when (css-visible? style)
             (xml-element-visible->tex node css p style))
          ;; step 1b: plain recursive compilation
          (xml->tex (xml-element-body node) css p))))

(define (xml-element-visible->tex n css p style)
   (with-access::css-style style (after before)
      (let ((texc (style->tex (xml-element-tag n) style))
            (css-proc (css-style-get-attribute style 'proc)))
         ;; step 3: tex prelude
         (for-each (lambda (t) (display (car t) p)) texc)
         ;; step 4: style :before
         (when (css-style? before)
            (xml-style->tex before css p))
         ;; step 5: body compilation
         (if (procedure? css-proc)
             ;; step 5b: a dedicated compiler is used
             (css-proc n css p)
             ;; step 5c: a simple recursive descent is used
             (xml->tex (xml-element-body n) css p))
         ;; step 6: style :after
         (when (css-style? after)
            (xml-style->tex after css p))
         ;; step 7: tex postlude
         (for-each (lambda (t) (display (cdr t) p)) texc))))
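For readers unfamiliar with Hop, the same seven steps can be sketched in Python. This is only an illustration under strong simplifying assumptions: styles are looked up by tag name in a plain dictionary instead of being resolved from Css selectors, the prelude and postlude are stored as precomputed pairs under a hypothetical "tex" key, and string escaping is omitted. None of these names belong to HopTeX.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    tag: str
    body: list = field(default_factory=list)

# Hypothetical computed styles, keyed by tag name only.
STYLES = {
    "strong": {"before": "{\\textbf{", "after": "}}"},
    "em": {"before": "{\\emph{", "after": "}}"},
    "script": {"display": "none"},
    "small": {"tex": [("{\\small{", "}}")]},
}

def compile_node(node, styles, out):
    """Dispatch on the kind of node (escaping of strings is omitted)."""
    if isinstance(node, str):
        out.append(node)
    elif isinstance(node, list):
        for child in node:
            compile_node(child, styles, out)
    else:
        compile_element(node, styles, out)

def compile_element(el, styles, out):
    style = styles.get(el.tag)                 # step 1: compute the style
    if style is None:
        compile_node(el.body, styles, out)     # step 1b: plain recursion
        return
    if style.get("display") == "none":         # step 2: check visibility
        return
    pairs = style.get("tex", [])
    for prelude, _ in pairs:                   # step 3: tex prelude
        out.append(prelude)
    out.append(style.get("before", ""))        # step 4: style :before
    proc = style.get("proc")
    if callable(proc):
        proc(el, styles, out)                  # step 5b: dedicated compiler
    else:
        compile_node(el.body, styles, out)     # step 5c: recursive descent
    out.append(style.get("after", ""))         # step 6: style :after
    for _, postlude in pairs:                  # step 7: tex postlude
        out.append(postlude)

doc = Element("div", ["A ", Element("strong", ["strong ",
              Element("em", ["and emphasized"])]), " text"])
out = []
compile_node(doc, STYLES, out)
print("".join(out))
```

A user-supplied rule would correspond to updating an entry of STYLES before the compilation starts.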
The function xml-style->tex, not given here, is a trimmed down version of xml-element->tex that is in charge of processing the content strings of the before or after attributes.
In this section we present a few examples of compilation and we show how users can change the generated LaTeX rendering by providing additional Css rules.
Assuming the Css rules given in Section 4, let us study the compilation of the following text:
A **strong //and emphasized//** text
First, the server parses the text and translates it into Html. During this process, it builds a DOM representation of the following tree:
<DIV> A <STRONG>strong <EM>and emphasized</EM></STRONG> text</DIV>
The compiler has to compile the DIV element which has three children: the string A, the STRONG element containing the EM element, and the string text. Since the DIV element has no style attached to it, its compilation consists in a simple traversal of the tree. The first string is written as is. Then comes the compilation of the STRONG and EM elements. These have styles that specify before and after strings that are inserted in the generated LaTeX output. The result of the compilation is:
A {\textbf{strong{\emph{and emphasized}}}} text
A user wanting to emphasize even more the texts that are under both a STRONG and an EM element could use a Css rule such as the following (remember that the > CSS operator selects the direct descendants of a node):
strong > em { text-decoration: underline; }
This changes the compilation of the EM nodes whose parents are STRONG nodes. It adds the declaration text-decoration: underline to the style computed by css-get-computed-style, which enriches the default compilation of EM elements. The generated LaTeX code becomes:
A {\textbf{strong{\emph{\underline{and emphasized}}}}} text
As with Html, Css rules for HopTeX can be used to change the compilation of individual nodes. A simple way to achieve this is to assign identifiers to nodes and use these identifiers in the rules. Wiki tags used by HopTeX accept identifier and class declarations. They are given by suffixing the tag with :id@class. For instance, one may write:
~~:p1@note This is a note.
which defines a paragraph named p1 that belongs to the class note. Identifiers and classes can be used in rules such as:
p.note:before {
  content: "Note:";
  font-style: italic;
}
@media tex {
  #p1 { font-size: 70%; }
}
The p.note:before rule applies to all rendering engines, so in particular to the LaTeX code generator, which adds the italicized content before the paragraph. The #p1 rule only applies to the LaTeX compilation because it is protected by a @media tex rule. It instructs the code generator to use a tiny font for the paragraph #p1 that will be compiled as:
As mentioned in Section 4, resorting to the before and after attributes of Css styles suffices to compile most Html elements. However, for a few of them, inserting a prefix and a suffix is not enough. The current HopTeX version makes a special case for exactly 4 elements, namely IMG, PRE, TABLE, and A. We present the compilation of the first three in this section. The compilation of A is delayed to Section 5.1.
When Css prefixes are not enough, an ad hoc compilation function can be defined. These functions are declared in the rules as the value of the HopTeX specific proc property. They are Hop functions that HopTeX calls with three parameters: the node to be compiled, the current css rule set, and the output port where the result should be written. Let us illustrate these compilation functions with three examples.
Images are inserted in the text with either the regular IMG markup or with the wiki syntax {{...}} as in:
{{screenshot.png|a screenshot}}
Images are compiled in LaTeX into an includegraphics command in which image resizing is expressed as a ratio of the line width. The HopTeX function xml->tex-img is in charge of this translation. It computes the LaTeX size of the image. If no width is specified for an image, the generated LaTeX image spans the whole line. If a width is given as a percentage, the percentage string is converted into a floating point value in the range [0, 1], which is concatenated to the string \linewidth.
(define (xml->tex-img node::xml-img css p)
   (fprintf p "\\includegraphics[width=~a]{~a}"
      (let ((w (node-computed-style node :width css)))
         (if (string? w)
             (let ((m (pregexp-match "^([0-9]+)%$" w)))
                (if (not m)
                    ;; a string such as "10em"
                    w
                    ;; a percentage, converted to a fraction of the line
                    ;; width (dividing by 100 also handles 5% and 100%)
                    (format "~a\\linewidth"
                       (/ (string->number (cadr m)) 100.))))
             "\\linewidth"))
      (dom-get-attribute node "src")))
The default HopTeX rule for compiling images is:
img { width: 80%; proc: $xml->tex-img; }
The dollar sign before xml->tex-img is a syntactic annotation that tells the Css parser that the following expression is not a literal but a value of the Hop language. The compilation of the image given above using the previous Css is:
\includegraphics[width=0.8\linewidth]{screenshot.png}
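The width computation can be checked in isolation with a small Python sketch. The function names tex_image_width and includegraphics are hypothetical. The sketch divides the percentage by 100, so that 5% yields 0.05 and 100% yields 1, and it passes any other width string (such as 10em) through unchanged.

```python
import re

def tex_image_width(width=None):
    """Translate a Css width into a LaTeX length for \\includegraphics."""
    if width is None:
        return "\\linewidth"          # no width: span the whole line
    m = re.fullmatch(r"([0-9]+)%", width)
    if m is None:
        return width                  # a string such as "10em"
    fraction = int(m.group(1)) / 100  # a percentage becomes a ratio
    return f"{fraction:g}\\linewidth"

def includegraphics(src, width=None):
    return f"\\includegraphics[width={tex_image_width(width)}]{{{src}}}"

print(includegraphics("screenshot.png", "80%"))
# prints \includegraphics[width=0.8\linewidth]{screenshot.png}
```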
As noted in Section 4, Html PRE elements have no direct LaTeX counterpart. To compile them, HopTeX generates a full-line-wide tabular nested in a texttt environment, and it replaces all white spaces with the explicit command "\ " (a backslash followed by a space) that forces LaTeX to introduce plain blank characters. The implementation of this function is as follows:
(define (xml->tex-pre node::xml-pre css p)
   (with-access::xml-pre node (body)
      (display "\\noindent\\texttt{" p)
      (display "\\begin{tabular*}{\\linewidth}" p)
      (display "{l@{\\extracolsep{\\fill}}}\n" p)
      (let loop ((b body))
         (cond
            ((string? b)
             (let ((s (tex-string b)))
                (display (string-substitute s " \n" "\\ " "\\\\\n") p)))
            ((pair? b)
             (for-each loop b))
            (else
             (xml->tex b css p))))
      (display "\\end{tabular*}}\n" p)))
The Css rule that accommodates this compilation scheme is:
pre { font-size: small; proc: $xml->tex-pre; }
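The effect of the two substitutions can be reproduced with a short Python sketch. The names tex_pre_body and tex_pre are hypothetical, the input is assumed to be a single string whose special LaTeX characters are already escaped, and nested markups are not handled.

```python
def tex_pre_body(text):
    """Replace every space with an explicit TeX space and end every
    source line with a tabular row separator."""
    # str.replace does not rescan its own output, so the space contained
    # in the replacement "\ " is not substituted again.
    return text.replace(" ", "\\ ").replace("\n", "\\\\\n")

def tex_pre(text):
    """Wrap the body in a line-wide tabular* nested in \\texttt."""
    return ("\\noindent\\texttt{"
            "\\begin{tabular*}{\\linewidth}"
            "{l@{\\extracolsep{\\fill}}}\n"
            + tex_pre_body(text)
            + "\\end{tabular*}}\n")

print(tex_pre("(define x\n  1)\n"))
```

Each source line becomes one row of the tabular, which is what preserves the line breaks of the PRE element.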
Html tables and LaTeX tabulars have nearly orthogonal designs. Html table tunings are expressed on a per-cell basis while LaTeX tables are configured on a per-column/per-row basis. In consequence, compiling Html tables into LaTeX tabulars is inherently ad hoc. The default HopTeX compilation flushes cells left and includes no rule at all. The function xml->tex-table first counts the number of columns in order to generate the LaTeX columns declaration. Then, each row of the table is compiled with the function xml->tex-tr that separates each element with the & sign and that inserts an end of line delimiter after each row.
(define (xml->tex-table el::xml-table css p)
   (define (count-columns obj)
      (define (tr-count-columns obj)
         (length (xml-element-body obj)))
      (apply max (map tr-count-columns (xml-element-body obj))))
   (fprintf p "\\begin{tabular}{~a}\n"
      (make-string (count-columns el) #\l))
   (xml->tex (xml-element-body el) css p)
   (display "\\end{tabular}\n" p))

(define (xml->tex-tr el::xml-tr css p)
   (with-access::xml-element el (body)
      (if (null? body)
          (display "\\\\\n" p)
          (let loop ((body body))
             (xml->tex (car body) css p)
             (if (null? (cdr body))
                 (display " \\\\\n" p)
                 (begin
                    (display " & " p)
                    (loop (cdr body))))))))
The default Css rules for table are as follows:
table { proc: $xml->tex-table; }
tr { proc: $xml->tex-tr; }
th:before { content: "{\\textbf{\\textsf{"; }
th:after { content: "}}}"; }
In addition to connecting the two functions above to the TABLE and TR elements, these rules also configure TH elements to mimic their Html default appearance. Provided with these declarations, the table example given in Section 3.1 is compiled as:
\begin{tabular}{lll}
This & is & {\textbf{\textsf{a table}}} \\
\end{tabular}
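The table compilation scheme can be mimicked by the following Python sketch. The name tex_table is hypothetical, the rows are assumed to be given as lists of already compiled cell strings, and the table is assumed to contain at least one row.

```python
def tex_table(rows):
    """Compile rows (lists of compiled cell strings) into a tabular."""
    # The column declaration has one left-flushed column per cell of
    # the widest row.
    ncols = max(len(row) for row in rows)
    lines = ["\\begin{tabular}{" + "l" * ncols + "}"]
    for row in rows:
        # Cells are separated by "&" and each row ends with "\\".
        lines.append(" & ".join(row) + " \\\\" if row else "\\\\")
    lines.append("\\end{tabular}")
    return "\n".join(lines) + "\n"

print(tex_table([["This", "is", "{\\textbf{\\textsf{a table}}}"]]))
```

Applied to the single row above, the sketch prints the same three lines as the compilation result shown in the text.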
In this section we illustrate the benefits of using a full-fledged programming language in HopTeX by presenting two extensions. We show how to manage bibliographic references and how to delegate the placement of floating elements to Css rules.
Bibliography citations are treated by HopTeX as a special kind of external hyperlinks. Consistently, the wiki syntax is augmented with the new bib:// protocol that accommodates citations which then look like:
[[bib://knuth:tex86 lamport:latex86]]
Because the BibTeX format is widely used, it has been found appropriate to make it directly usable in HopTeX. For that, a full BibTeX parser has been implemented in HopTeX. When a document is to be processed, the BibTeX bibliography database is parsed and stored in a hash table. Then the DOM is traversed and all citations are adjusted. For the sake of the example, here is the code in charge of this traversal:
(define (citation? e)
   (when (xml-element? e)
      (with-access::xml-element e (attributes)
         (let ((href (xml-get-attribute :href attributes)))
            (and href
                 (string? (xml-attribute-value href))
                 (string-prefix? "bib://" (xml-attribute-value href)))))))

(define (dom-get-citations expr)
   (filter citation? (dom-get-elements-by-tag-name expr "a")))
It uses regular DOM functions, which in Hop are also available on the server-side of the applications, to retrieve all the link elements (A Html elements) whose links are prefixed by the bib:// string.
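The same filtering can be sketched with the Html parser of the Python standard library. The class CitationCollector and the function citation_keys are hypothetical names. Splitting the link on white spaces to obtain several keys is an assumption based on the citation example above, not a documented HopTeX behavior.

```python
from html.parser import HTMLParser

class CitationCollector(HTMLParser):
    """Collect the bibliographic keys of all bib:// links."""

    def __init__(self):
        super().__init__()
        self.keys = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Only links using the bib:// protocol are citations.
        if isinstance(href, str) and href.startswith("bib://"):
            self.keys.extend(href[len("bib://"):].split())

def citation_keys(html):
    collector = CitationCollector()
    collector.feed(html)
    return collector.keys

page = ('<p>See <a href="bib://knuth:tex86 lamport:latex86">refs</a> '
        'and <a href="http://www.inria.fr">Inria</a>.</p>')
print(citation_keys(page))  # prints ['knuth:tex86', 'lamport:latex86']
```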
Placing floating elements with LaTeX is a nightmare that we have all lived once. Directives such as htbp are supposed to instruct the layout algorithm but they constantly fail. Stricter directives such as !H have been added but in practice they show similar results. The only effective solution to trick the internal TeX algorithms consists in moving the floating elements back and forth in the source text. In addition to being painful and error prone, this idiosyncratic behavior has an important drawback when a single source is used to generate both LaTeX and Html documents. Since the web browser does not move float elements, the figures moved for LaTeX appear as randomly placed in the Html version.
Because HopTeX generates LaTeX documents from Html specifications, we have an opportunity to improve over the previously described solution. Instead of moving the floating elements in the source text, HopTeX moves them only in the generated LaTeX target, according to configurations expressed in Css rules. For instance, one may write:
@media tex {
  #float1 {
    width: 100%;
    column-count: 2;
    float: -350;
  }
}
which means that the float element named float1
has to be moved 350 elements upward in the DOM tree.
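To make the reading of such a rule concrete, the following Python sketch extracts the integer float offsets declared inside @media tex blocks. It is only an approximation under stated assumptions: HopTeX relies on Hop's server-side Css resolution, whereas this sketch uses naive regular expressions that handle only id selectors and flat rule bodies. The function name tex_float_offsets is hypothetical.

```python
import re

# Naive patterns: they assume no braces nested inside a rule body.
MEDIA_TEX = re.compile(r"@media\s+tex\s*\{((?:[^{}]*\{[^{}]*\})*)\s*\}")
ID_RULE = re.compile(r"#([\w-]+)\s*\{([^{}]*)\}")
FLOAT_DECL = re.compile(r"(?:^|;)\s*float\s*:\s*(-?\d+)\s*(?:;|$)")

def tex_float_offsets(css):
    """Map element ids to the integer float offset given in @media tex."""
    offsets = {}
    for media in MEDIA_TEX.finditer(css):
        for rule in ID_RULE.finditer(media.group(1)):
            decl = FLOAT_DECL.search(rule.group(2))
            if decl:
                # Negative values mean backward, positive values forward.
                offsets[rule.group(1)] = int(decl.group(1))
    return offsets

if __name__ == "__main__":
    css = "@media tex { #float1 { width: 100%; column-count: 2; float: -350; } }"
    print(tex_float_offsets(css))
```

Rules outside @media tex, and float declarations with keyword values such as left, are ignored, so regular browser styling is left untouched by this reading.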
Prior to generating LaTeX code the DOM tree is thus traversed to
inspect all floating elements that have a float
style attribute
attached. Such elements are moved backward when the value is negative
and forward when positive. The source code for moving a node in the
tree is traditional DOM programming. It is given in Figure
2.
   (define (move-float-backward! node offset)
      ;; doc is not bound in this excerpt; it presumably denotes the
      ;; enclosing document
      (let loop ((o offset)
                 (prev node))
         (if (= o 0)
             (dom-insert-before! (dom-parent-node prev) node prev)
             (loop (- o 1) (dom-previous-node prev doc)))))

   ;; the parent when node has no previous sibling, otherwise the
   ;; last descendant of that sibling
   (define (dom-previous-node node doc)
      (let ((sibling (dom-previous-sibling node)))
         (if (not sibling)
             (dom-parent-node node)
             (dom-last-node sibling))))

   ;; the deepest last descendant of node
   (define (dom-last-node node)
      (let ((l (dom-child-nodes node)))
         (if (pair? l)
             (let ((n (car (last-pair l))))
                (if (xml-text-element? n)
                    n
                    (dom-last-node n)))
             node)))
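The same backward move can be sketched in Python with xml.dom.minidom, assuming, as in Figure 2, that an offset counts steps in reverse document order. The names last_node, previous_node and move_backward are hypothetical, and this version adds a guard that stops below the root element, which the excerpt above does not show.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

def last_node(node):
    """Deepest last descendant of node (node itself when it is a leaf)."""
    while node.childNodes:
        node = node.lastChild
    return node

def previous_node(node):
    """Parent when there is no previous sibling, else that sibling's last node."""
    sibling = node.previousSibling
    return node.parentNode if sibling is None else last_node(sibling)

def move_backward(node, offset):
    """Re-insert node before the node found `offset` steps earlier."""
    target = node
    for _ in range(offset):
        prev = previous_node(target)
        # Guard (an assumption of this sketch): never climb to the root
        # element or above it, so the document keeps a single root.
        if (prev is None or prev.parentNode is None
                or prev.parentNode.nodeType == Node.DOCUMENT_NODE):
            break
        target = prev
    if target is not node:
        # insertBefore first detaches node from its current parent.
        target.parentNode.insertBefore(node, target)

if __name__ == "__main__":
    doc = parseString(
        '<body><p id="a">one</p><p id="b">two</p><div id="f">fig</div></body>'
    )
    move_backward(doc.documentElement.lastChild, 2)
    print(doc.documentElement.toxml())
```

With an offset of 2 the walk passes through the text node of the second paragraph and then the paragraph itself, so the figure ends up between the two paragraphs. Text nodes count as steps, which is why offsets expressed this way can be large.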
HopTeX is an operational system. It has already been used to write a couple of articles in addition to the present one. The whole implementation amounts to fewer than 4 KLOC of Hop code and 1 KLOC of Css rules. Such compactness is possible only because it makes extensive use of the features offered by the Hop programming language: high-level abstractions supported by functional values, object-oriented support, full polymorphism, server-side DOM manipulation, server-side Css resolution, and built-in parsing facilities. HopTeX is free software released under the GPL. It is available from the Hop web page.