This chapter describes the Bigloo API for processing texts.
bibtex obj | Bigloo Text function |
bibtex-port input-port | Bigloo Text function |
bibtex-file file-name | Bigloo Text function |
bibtex-string string | Bigloo Text function |
These functions parse BibTeX sources. The argument obj of bibtex can be
either an input port or a string that denotes a file name. Each function
returns a list of BibTeX entries.
The functions bibtex-port, bibtex-file, and bibtex-string are mere
wrappers that invoke bibtex.
Example:
(bibtex (open-input-string "@book{ as:sicp,
  author    = {Abelson, H. and Sussman, G.},
  title     = {Structure and Interpretation of Computer Programs},
  year      = 1985,
  publisher = {MIT Press},
  address   = {Cambridge, Mass., USA},
}"))
   => (("as:sicp" BOOK
        (author ("Abelson" "H.") ("Sussman" "G."))
        (title . "Structure and Interpretation of Computer Programs")
        (year . "1985")
        (publisher . "MIT Press")
        (address . "Cambridge, Mass., USA")))
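Judging from the result shown in the example, each entry appears to be a list made of a key string, an entry-type symbol, and an association list of fields. The following is a minimal sketch of how such an entry could be queried. The helpers entry-key, entry-type, and entry-field are hypothetical names introduced here for illustration; they are not part of the Bigloo API, and the sample entry is copied by hand from the example rather than produced by calling bibtex.

```scheme
;; Hand-written copy of one entry, in the shape printed by the example.
(define sample-entry
  '("as:sicp" BOOK
    (author ("Abelson" "H.") ("Sussman" "G."))
    (title . "Structure and Interpretation of Computer Programs")
    (year . "1985")))

;; Hypothetical accessors (not provided by Bigloo).
(define (entry-key entry) (car entry))
(define (entry-type entry) (cadr entry))

;; Return the value of the field NAME, or #f when the entry has no such
;; field. Assumes fields are keyed by symbols, as in the printed result.
(define (entry-field entry name)
  (let ((cell (assq name (cddr entry))))
    (and cell (cdr cell))))

(entry-field sample-entry 'year)    ; the string "1985"
(entry-field sample-entry 'author)  ; the list of parsed authors
(entry-field sample-entry 'isbn)    ; #f, the field is absent
```

Returning #f for a missing field keeps the lookup usable inside and/or expressions; code that must distinguish a missing field from an empty one would need a different convention.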
bibtex-parse-authors string | Bigloo Text function |
This function parses the author field of a BibTeX entry. It returns a list
with one element per author, each element being a list of name strings.
Example:
(bibtex-parse-authors "Abelson, H. and Sussman, G.")
=> (("Abelson" "H.") ("Sussman" "G."))
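As an illustration of how the parsed form might be consumed, the sketch below turns each author back into a display string. It assumes the shape shown in the example, where the first string is the part written before the comma; the helper author->string is a hypothetical name, not a Bigloo function, and the input is written by hand rather than obtained from bibtex-parse-authors.

```scheme
;; Hypothetical helper: format one parsed author for display.
;; ("Abelson" "H.") becomes "H. Abelson"; a one-element list is
;; returned unchanged.
(define (author->string author)
  (if (null? (cdr author))
      (car author)
      (string-append (cadr author) " " (car author))))

(map author->string '(("Abelson" "H.") ("Sussman" "G.")))
;; => ("H. Abelson" "G. Sussman")
```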
hyphenate word hyphens | Bigloo Text function |
The function hyphenate takes a single word and returns a list of
subwords, split at the places where hyphenation is allowed. The argument
hyphens is an opaque data structure obtained by calling the function
load-hyphens or make-hyphens.
Example:
(hyphenate "software" (load-hyphens 'en)) => ("soft" "ware")
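Since hyphenate returns the subwords as a list, a caller typically has to reassemble them. The sketch below shows one way to do that with a visible separator. The helper join-subwords is a hypothetical name, not part of the Bigloo API, and the input list is written by hand to match the example result.

```scheme
;; Hypothetical helper: concatenate SUBWORDS, inserting SEP between
;; consecutive elements. An empty list yields the empty string.
(define (join-subwords subwords sep)
  (if (null? subwords)
      ""
      (let loop ((rest (cdr subwords))
                 (acc (car subwords)))
        (if (null? rest)
            acc
            (loop (cdr rest)
                  (string-append acc sep (car rest)))))))

(join-subwords '("soft" "ware") "-")   ; => "soft-ware"
```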
load-hyphens obj | Bigloo Text function |
Loads a hyphens table and returns a data structure suitable for
hyphenate. The argument obj can be either the name of a file
containing a hyphens table or a symbol denoting a pre-defined hyphens
table. Currently, Bigloo supports two pre-defined tables: en for an
English table and fr for a French table. The procedure load-hyphens
invokes make-hyphens to build the hyphens table.
Example:
(define (hyphenate-text text lang)
   (let ((table (with-handler
                   (lambda (e)
                      (unless (&io-file-not-found-error? e)
                         (raise e)))
                   (load-hyphens lang)))
         (words (string-split text " ")))
      (if table
          (append-map (lambda (w) (hyphenate w table)) words)
          words)))
The procedure hyphenate-text hyphenates the words of the text according
to the rules for the language denoted by its code lang, if there is a
file lang-hyphens.sch. If there is no such file, the text remains
un-hyphenated.
make-hyphens [:language] [:exceptions] [:patterns] | Bigloo Text function |
Creates a hyphens table from the arguments exceptions and patterns.
The table built by make-hyphens closely follows Frank Liang's algorithm,
as published in his doctoral dissertation Word Hy-phen-a-tion By
Com-pu-ter, available on the TeX Users Group site at
http://www.tug.org/docs/liang/. The table is a trie (see
http://en.wikipedia.org/wiki/Trie for a definition and an explanation).
Most of this implementation is borrowed from Phil Bewig's work, which is
available, along with his paper describing the program, at
http://sites.google.com/site/schemephil/.
exceptions must be a non-empty list of explicitly hyphenated words.
Explicitly hyphenated words are like the following:
"as-so-ciate", "as-so-ciates", "dec-li-na-tion",
where the hyphens indicate the places where hyphenation is allowed. The
words in exceptions are used to generate hyphenation patterns,
which are added to patterns (see next paragraph).
patterns must be a non-empty list of hyphenation patterns.
Hyphenation patterns are strings of the form ".anti5s", where a
period denotes the beginning or the end of a word, an odd number denotes
a place where hyphenation is allowed, and an even number denotes a place
where hyphenation is forbidden. This notation is part of Frank Liang's
algorithm, created for Donald Knuth's TeX typesetting system.
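To make the pattern notation concrete, the sketch below splits a pattern string into its letters and a list of numbers, one for each position between letters (a position without a digit gets 0). It is only an illustration of how the notation can be read, not the code used by make-hyphens. The names parse-pattern and break-allowed? are hypothetical, and single-digit numbers are assumed.

```scheme
;; Hypothetical helper: split PATTERN into (letters . weights).
;; LETTERS is the pattern without its digits; WEIGHTS holds one number
;; per position, so it is one element longer than LETTERS.
(define (parse-pattern pattern)
  (let loop ((chars (string->list pattern))
             (letters '())
             (weights '())
             (pending 0))           ; digit seen just before the next letter
    (cond
      ((null? chars)
       (cons (list->string (reverse letters))
             (reverse (cons pending weights))))
      ((char-numeric? (car chars))
       (loop (cdr chars) letters weights
             (- (char->integer (car chars)) (char->integer #\0))))
      (else
       (loop (cdr chars)
             (cons (car chars) letters)
             (cons pending weights)
             0)))))

;; An odd number allows a break; an even number (including 0) does not.
(define (break-allowed? weight) (odd? weight))

(parse-pattern ".anti5s")
;; => (".antis" 0 0 0 0 0 5 0)
```

Here the 5 sits between the i and the s, so this pattern allows a break after "anti" at the beginning of a word. In Liang's scheme, when several patterns match the same position of a word, the highest number wins.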
gb2312->ucs2 string | Bigloo Text function |
Converts a GB2312 (aka cp936) encoded 8-bit string into a UCS2 string.