The Mathemagix glue mechanism

1.Introduction

New functionality can be added to Mathemagix using the “glue” mechanism. There are two typical types of glue:

The user has written a new C++ class or a new C++ template class, and would like to use this class from within the interpreter.
The user has written an interface to an external software which comes with its own language and type system. In this case, one would like to have an automatic mechanism for making the functionality of the external software available from within Mathemagix.

In both cases, the user has to write a “glue definition routine”, which declares the functionality provided by the glue, using a standard API. This routine can then be called from some central point (at startup or when loading a dynamically linked glue library) in order to make the glue functionality available from within the current evaluator.

The Mathemagix glue mechanism comes in two layers. On the one hand, we provide a low-level C++ API (see section 2 below), which allows you to manually add new types, routines and converters. On top of this layer, the Mathemagix language also provides the keyword foreign cpp, which allows for a higher-level description of the glue that can also be used by the Mathemagix compiler. This second, preferred mechanism is not yet documented.

2.The C++ glue API

2.1.A simple example

Let us first consider writing a glue definition routine for a simple C++ class named color. Assume also that the color class comes with two routines which we wish to export to Mathemagix:

color named_color (const string& name);
color invert_color (const color& c);

Then the glue definition routine for color typically looks as follows:

void
define_color () {
  define_type<color> ("Color");
  define ("named_color", named_color);
  define ("invert", invert_color);
}

The define_type instruction exports the type color to Mathemagix under the name Color. The next two lines export the routines named_color and invert_color under the names named_color and invert, respectively. The types of these routines are inferred automatically by the gluing mechanism. The define routines also accept a condition as a template parameter, as in define<always> (...); this parameter is not really important for our simple example, but it can be used for the conditional definition of glue in the case of template types.

Whenever a new type is exported to Mathemagix, a few standard routines are required to be implemented for this type. In the case of our class color, the following routines must be provided:

nat hash (const color& c);
generic flatten (const color& c);
bool operator == (const color& c1, const color& c2);
bool operator != (const color& c1, const color& c2);

The first routine computes a hash value for c, where nat stands for unsigned int. The second flattening routine converts c to a generic expression; this routine will be used when printing a color. The last two routines implement equality testing. Notice that there are several types of equality in Mathemagix. In addition to the == and != operators, you may wish to define

bool hard_eq (const color& c1, const color& c2);
bool hard_neq (const color& c1, const color& c2);
bool eq (const color& c1, const color& c2);
bool neq (const color& c1, const color& c2);

The routine hard_eq is a fast test whether c1 and c2 are represented in the same way in memory. In the case of pointer objects like vectors, one typically tests whether the pointers match; in particular, two identical vectors which are stored at different locations would not be “hard equal”. The routine eq tests whether c1 and c2 are syntactically equal. For instance, a rational number may be equal to an integer with respect to ==, without the two being syntactically equal.
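The difference between == and hard_eq can be illustrated by a minimal, self-contained sketch. It assumes a hypothetical pointer-based representation color_rep for color (the text does not say how color is implemented), and it omits flatten, since the generic type is only available from the Mathemagix headers.

```cpp
#include <cassert>
#include <memory>

typedef unsigned int nat;  // the text states that nat stands for unsigned int

// Hypothetical representation: an RGB triple stored behind a shared pointer.
struct color_rep { unsigned char r, g, b; };

class color {
public:
  std::shared_ptr<color_rep> rep;
  color (unsigned char r, unsigned char g, unsigned char b):
    rep (std::make_shared<color_rep> (color_rep {r, g, b})) {}
};

// Hash value computed from the three components.
nat hash (const color& c) {
  return ((nat) c.rep->r << 16) | ((nat) c.rep->g << 8) | (nat) c.rep->b;
}

// Equality of the values, wherever they are stored.
bool operator == (const color& c1, const color& c2) {
  return c1.rep->r == c2.rep->r && c1.rep->g == c2.rep->g
      && c1.rep->b == c2.rep->b;
}
bool operator != (const color& c1, const color& c2) { return !(c1 == c2); }

// Hard equality: same representation in memory (pointer comparison).
bool hard_eq (const color& c1, const color& c2) { return c1.rep == c2.rep; }
bool hard_neq (const color& c1, const color& c2) { return !hard_eq (c1, c2); }
```

With these definitions, two colors built separately from the same components are equal but not hard equal, whereas a copy of a color is hard equal to the original.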

2.2.A short reference guide

The main routines for defining new types and routines are specified in glue.hpp. Here follows a short description:

void define_type<C> (const generic& name)

This routine declares a new C++ type C and exports it as name. Whenever a new type is defined, a few other derived types are defined as well (see section 2.3 below).

void define_constant<C> (const generic& name, const C& x)

This routine exports the constant x of type C as name.

void define_constructor<C> (generic (*fun) (const C&))

This routine exports a constructor fun for instances of type C.

void define<D> (const generic& name, D (*fun) ())

void define<D,S1> (const generic& name, D (*fun) (const S1&))
void define<D,S1,S2> (const generic& name, D (*fun) (const S1&, const S2&))

...

This routine exports a function fun as name.

void define_converter<D,S> (const generic& name, D (*fun) (const S&), nat pen)

This routine exports a type conversion routine fun: S->D (which defaults to the default converter from S to D) with a given penalty pen. In cases of ambiguity, conversion chains with the least penalty are preferred. The name of the converter should be one among "convert", "upgrade" and "downgrade", depending on the nature and transitivity properties of the converter. Upgraders are used for automatic constructors (such as integer->rational) and downgraders for type inheritance (such as circle->shape). Plain converters cannot be composed with other converters.

void define_primitive<Cond> (const generic& name, generic (*f) (const generic&))

This routine exports a language primitive f. The argument to the primitive is a tuple which is not evaluated before f is called. The primitive should take care of the possible evaluation of its arguments itself.
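How the result and argument types of define can be inferred from a function pointer may be illustrated by ordinary C++ template argument deduction. The following is only a toy sketch: toy_define, toy_signature and toy_registry are hypothetical names that record signatures in a map, and they are not the implementation found in glue.hpp.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <vector>

// Toy record of an exported routine: its result type and argument types.
struct toy_signature {
  std::type_index result;
  std::vector<std::type_index> args;
};

// Toy registry (hypothetical); it only remembers the signatures.
std::map<std::string, toy_signature> toy_registry;

// Routine without arguments: D is deduced from the function pointer.
template<typename D> void
toy_define (const std::string& name, D (*fun) ()) {
  (void) fun;  // the pointer itself is not stored in this sketch
  toy_signature sig = { std::type_index (typeid (D)),
                        std::vector<std::type_index> () };
  toy_registry.insert (std::make_pair (name, sig));
}

// Routine with one argument: both D and S1 are deduced.
template<typename D, typename S1> void
toy_define (const std::string& name, D (*fun) (const S1&)) {
  (void) fun;
  toy_signature sig = { std::type_index (typeid (D)),
                        std::vector<std::type_index> () };
  sig.args.push_back (std::type_index (typeid (S1)));
  toy_registry.insert (std::make_pair (name, sig));
}

// Two sample routines to export.
int toy_answer () { return 42; }
int toy_length (const std::string& s) { return (int) s.size (); }
```

Calling toy_define ("length", toy_length) selects the one-argument overload and records int as the result type and std::string as the argument type, without any explicit template arguments.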

2.3.Implicitly glued classes

Whenever a new type C is defined by the user, a few other related types are added automatically. More precisely, Mathemagix will automatically define the following types:

alias<C>

This type is used for aliases to instances of type C (see alias.hpp). The alias<C> type plays a role similar to that of the C++ reference type C&, but there are some subtle differences. In Mathemagix, aliases are much slower, but more functional and powerful in nature. More precisely, alias<C> is an abstract class with two promises for read-access and write-access. In particular, an alias need not correspond to a physical location in memory. To understand one major difference, consider the following session:

Mmx]	v: Vector Generic := vector (1, 2, 3, 4, 5)

Mmx]	x: Alias Generic == v[4];

Mmx]	v := vector (6, 7, 8, 9, 10)

Mmx]	x

In C++, the second line would be incorrect. However, in Mathemagix, a read-access for v[4] is only performed at the last line, when computing x. Notice also that Mathemagix performs only a read-access for v[4], whereas C++ would typically perform a write-access (see remark 1 below).

tuple<C>

This type stands for a tuple of elements of type C. If you define a vector constructor using

define<Cond> ("vector", make_vector<C>);

where

template<typename C> vector<C> make_vector (const tuple<C>& t);

this will allow you to enter vectors in the Mathemagix interpreter using

vector(a,b,c,d,e,f)

alias<tuple<C> >

This type is also added, for coherence.
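The idea of an alias as an abstract class with separate read-access and write-access, where the access is only performed on demand, can be sketched in plain C++ as follows. The names toy_alias and toy_entry_alias are hypothetical and do not reproduce the contents of alias.hpp; indices follow the C++ convention and start at 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Abstract alias: one routine for read-access and one for write-access.
template<typename C>
struct toy_alias {
  virtual ~toy_alias () {}
  virtual C get () const = 0;
  virtual void set (const C& x) = 0;
};

// Alias to the entry with a given index of a vector variable. The entry is
// only looked up when get or set is called, not when the alias is created.
template<typename C>
struct toy_entry_alias: public toy_alias<C> {
  std::vector<C>& var;
  std::size_t index;
  toy_entry_alias (std::vector<C>& v, std::size_t i): var (v), index (i) {}
  C get () const { return var.at (index); }
  void set (const C& x) { var.at (index) = x; }
};
```

If such an alias is created for a vector variable and the variable is assigned a new vector afterwards, a later call to get reads the entry of the new vector, since no access took place when the alias was created.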

Remark 1. Concerning C++ reference types, one should notice another subtlety when writing accessors for compound data types. For instance, consider the methods

template<typename C,typename T> C
table<C,T>::operator [] (const T& key) const;
template<typename C,typename T> C&
table<C,T>::operator [] (const T& key);

The overloading allows for both read-access and write-access using the same syntax. However, it sometimes happens that you have a (non-constant) table which corresponds to a global environment. In that case, any access to this table will be a write-access, regardless of whether you actually modify the table. This may lead to subtle errors if you only wanted to perform a read-access, since a write-access might actually modify the table (by allocating a nonexistent key-value pair, for instance). This subtlety does not occur for the Mathemagix Alias<C> type.
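The same pitfall can be observed with the standard container std::map, which is used here only as a stand-in for the table class of the remark (std::map has no const operator [], but its non-const operator [] allocates missing entries in the same way). The helper names careless_read and lookup_or_default are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<std::string, int> toy_table;

// Intended as a read-access, but operator [] on a non-constant table
// silently allocates an entry when the key is missing.
int careless_read (toy_table& t, const std::string& key) {
  return t[key];
}

// Genuine read-access through a constant reference: never modifies t.
int lookup_or_default (const toy_table& t, const std::string& key, int def) {
  toy_table::const_iterator it = t.find (key);
  return it == t.end () ? def : it->second;
}
```

After a call to careless_read with a missing key, the table contains one more entry, whereas lookup_or_default leaves the table unchanged.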

2.4.Implicit conversions

The routine define_converter allows the user to define automatic converters between different types. This facility is quite powerful, but has to be used with care: since automatic converters are applied in a very systematic way, they may even be applied in situations which the user did not foresee.

First of all, the user has to carefully select between the three different types of converters: upgraders, downgraders and plain converters. These types differ in the way they may be composed: plain converters are neither left nor right composable, upgraders are left composable, and downgraders are right composable. Given a left composable converter B->C and an arbitrary converter A->B, Mathemagix automatically adds a converter A->C (which is left composable if A->B is left composable and right composable if B->C is right composable). Similarly, if A->B is right composable and B->C is arbitrary, then we add a converter A->C.
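The composition rules above can be written down as a small sketch. The struct toy_converter and the routine compose are hypothetical and only model the two composability flags; the sketch also assumes that the rule for the flags of the composed converter, which the text states for the first case, applies to the second case as well.

```cpp
#include <cassert>
#include <string>

// Toy model of a converter from -> to with its composability flags.
struct toy_converter {
  std::string from, to;
  bool left_composable;   // true for upgraders
  bool right_composable;  // true for downgraders
};

toy_converter make_upgrader (const std::string& from, const std::string& to) {
  toy_converter c = { from, to, true, false };
  return c;
}
toy_converter make_downgrader (const std::string& from, const std::string& to) {
  toy_converter c = { from, to, false, true };
  return c;
}
toy_converter make_plain (const std::string& from, const std::string& to) {
  toy_converter c = { from, to, false, false };
  return c;
}

// Tries to compose first: A->B with second: B->C. On success, stores the
// converter A->C in result and returns true.
bool compose (const toy_converter& first, const toy_converter& second,
              toy_converter& result) {
  if (first.to != second.from) return false;
  // Either B->C must be left composable or A->B must be right composable.
  if (!second.left_composable && !first.right_composable) return false;
  result.from = first.from;
  result.to = second.to;
  result.left_composable = first.left_composable;
  result.right_composable = second.right_composable;
  return true;
}
```

For instance, composing the upgraders Integer -> Rational and Rational -> Matrix(Rational) yields an upgrader Integer -> Matrix(Rational), whereas two plain converters are never composed.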

Typically, upgraders correspond to type constructors, such as Integer -> Rational or C -> Matrix(C). Similarly, downgraders correspond to type inheritance, such as Rectangle -> Shape, Sum_series(C) -> Series(C), etc. Plain converters are often used for converters which may involve some loss of data, such as Integer->Double.

A second important property of a converter is the corresponding penalty: when several conversion schemes can be used in order to apply a function to some arguments, the scheme with the lowest penalty will be preferred (here we notice that the penalty of a conversion (A,B)->(C,D) is the maximum of the penalties of the conversions A->C and B->D). Among all possible schemes with the lowest penalty, the most specialized function will be chosen (a type T is strictly more specialized than U if there exists a converter T->U but no converter U->T). If no conversion schemes can be found to apply the function, then it will be applied symbolically.

Currently, the following penalties are provided:

PENALTY_NONE: This corresponds to an exact match.
PENALTY_AUTOMATIC: This corresponds to the penalty for automatic language-related conversions, such as Alias(C)->C.
PENALTY_INCLUSION: This penalty should be used for conversions T->U, where T may be viewed as a mathematical subset of U. Example: Integer->Rational. This penalty is the default one for conversions T->U when T is different from Generic.
PENALTY_HOMOMORPHISM: This penalty should be used for conversions T->U which can be viewed as mathematical homomorphisms, but not as inclusions. Examples: Integer->Int or, more generally, Integer->Modular(p).
PENALTY_VARIANT: Sometimes, two distinct libraries implement the same or a similar type. In that case, it may be interesting to provide automatic converters between these types (in both ways). Using the higher penalty PENALTY_VARIANT for this kind of conversions will ensure that operations are performed in the library of the types of the arguments, unless an implementation is only available in the other library.
PENALTY_CAST: This penalty should be used for all conversions which, even though convenient for the user, entail some loss of information. Example: Integer->Double.
PENALTY_FALL_BACK: For any type T, this is the penalty of the conversion T->Generic. Hence, if the user provides a generic fall back implementation of an operation, then this will be the penalty for the application of the fall back method.
PENALTY_PROMOTE_GENERIC: Many composite types, such as Complexify(C), come with an inclusion C->Complexify(C). Although it is generally correct to use PENALTY_INCLUSION for the corresponding penalty, many unwanted conversions may arise when C=Generic. For this reason, the default penalty for conversions of the kind Generic->T is the maximal penalty PENALTY_PROMOTE_GENERIC.
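The way penalties are combined and compared can be sketched as follows. The enumeration toy_penalty is hypothetical: it assumes that the penalties increase in the order of the list above, which the text only confirms for PENALTY_PROMOTE_GENERIC being maximal and larger than PENALTY_FALL_BACK; the actual numerical values used by Mathemagix are not given here.

```cpp
#include <algorithm>
#include <cassert>

// Assumed ordering: from the lowest penalty to the highest one,
// following the order of the list in the text.
enum toy_penalty {
  TOY_PENALTY_NONE,
  TOY_PENALTY_AUTOMATIC,
  TOY_PENALTY_INCLUSION,
  TOY_PENALTY_HOMOMORPHISM,
  TOY_PENALTY_VARIANT,
  TOY_PENALTY_CAST,
  TOY_PENALTY_FALL_BACK,
  TOY_PENALTY_PROMOTE_GENERIC
};

// The penalty of (A,B)->(C,D) is the maximum of the penalties
// of the conversions A->C and B->D.
inline toy_penalty combine (toy_penalty p1, toy_penalty p2) {
  return std::max (p1, p2);
}

// A conversion scheme with penalty p is preferred over one with penalty q
// when p is strictly lower.
inline bool preferred (toy_penalty p, toy_penalty q) {
  return p < q;
}
```

Under these assumptions, a scheme which leaves one argument unchanged and applies an inclusion to the other one has penalty TOY_PENALTY_INCLUSION, and it is preferred over any scheme which involves a cast.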

Some common sources of bugs when using overloading and automatic conversions are the following:

You specified an upgrader Generic->Polynomial(Generic), but addition on generic polynomials cannot be applied so as to add one to a polynomial. The point here is that the penalty of the conversion Generic->Polynomial(Generic) should be the maximal penalty PENALTY_PROMOTE_GENERIC, which is larger than the penalty PENALTY_FALL_BACK for the symbolic addition of expressions. The solution to this problem is to implement the following two specialized additions:
```
    +: (Polynomial (Generic), Generic) -> Polynomial (Generic)
    +: (Generic, Polynomial (Generic)) -> Polynomial (Generic)
```
Actually, these operations may be useful for other coefficient types than Generic, since they can usually be implemented in a particularly efficient way.
You forgot to define a generic fall back method for some operation. Sometimes, your implementation may rely on the assumption that a given operation foo has no implementation, so that it will be applied symbolically. When providing an implementation of foo for some type, this assumption may suddenly be violated and provoke infinite loops or incorrect results. In that case, you should provide a default symbolic implementation for foo.

3.Dynamically linked libraries

A package with a collection of types, routines and glue definition routines should be compiled into a dynamic library, which can then be loaded on the fly into the interpreter. Names of Mathemagix glue libraries should be of the form libmmxname.la or libmmxname.so and the principal glue definition routine for the library should carry the name define_name. Notice that C++ names are mangled, so you may need to declare define_name as a global variable of type void (*) () which points to the actual glue definition routine.

In the subdirectories examples/fibonacci and examples/gaussian, you may find two simple examples on how to glue new code to Mathemagix. For larger projects, we refer to the documentation on coding conventions and how to add new packages.

3.1.Computing Fibonacci numbers

Assume that we want to add a routine fibonacci for computing Fibonacci numbers to the glue. The file fibonacci.cpp with the routine fibonacci and the corresponding glue definition routine would typically look as follows:

#include "glue.hpp"
using namespace mmx;

int
fibonacci (const int& n) {
  if (n <= 1) return 1;
  return fibonacci (n-1) + fibonacci (n-2);
}

void
define_standard_fibonacci () {
  define<always> ("fibonacci", fibonacci);
}

void (*define_fibonacci) () = define_standard_fibonacci;

The very last line is added to prevent name mangling of define_fibonacci. We may now compile the file fibonacci.cpp into a shared library libmmxfibonacci.so:

g++ --shared `basix-config --cppflags --libs` \
    fibonacci.cpp -o libmmxfibonacci.so

After putting the directory which contains libmmxfibonacci.so in your LD_LIBRARY_PATH, you may now use the library from within Mathemagix:

Mmx]	use "fibonacci"

Mmx]	fibonacci(37)

3.2.Complexified numbers

To be completed.

4.Exception handling

It is possible to catch exceptions occurring in glued C++ routines within the Mathemagix interpreter. In order to make this work, you should first configure Mathemagix using the (default) option --enable-exceptions. Next, your glue code may raise exceptions of the type mmx::exception.

In fact, it is better not to raise exceptions directly yourself, but rather to use the convenience macros defined in basix.hpp. Indeed, Mathemagix distinguishes two main kinds of exceptions, whose default behaviour can be configured:

Normal exceptions are the typical exceptions that you want to make visible within the interpreter. Since the interpreter is much slower than the C++ code anyway, it is good practice to test heavily for erroneous exceptional cases in the glue routines. For instance, when making an array access, one should typically test the bounds. Normal exceptions are enabled using the default --enable-exceptions configuration option. When disabled, they will be replaced by assertions inside the code.
Low level exceptions are additional checks that you may wish to add in critical parts of the C++ code. For instance, low level C++ array access is performance-critical, since it might be used intensively by other C++ routines. When using the --enable-verify configuration option, you may add additional checks for such low level routines. Since this may slow down the global performance of the system, this option is disabled by default, and should mainly be used for debugging purposes. Instead of directly gluing low level routines to the interpreter, we rather recommend writing a small wrapper with the necessary checks for the routine.

The two above types of exceptions both come with their corresponding macros. The following macros should be used for raising exceptions:

ERROR(message)

Raises the error message message.

ASSERT(condition,message)

Raises the error message message if the condition is not satisfied.

VERIFY(condition,message)

When compiling using --enable-verify, this macro raises the error message message if the condition is not satisfied.
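The recommended pattern of a checked wrapper around a low level routine can be sketched with stand-in macros. The macros TOY_ERROR, TOY_ASSERT and TOY_VERIFY below are hypothetical imitations written for this example; they are not the definitions from basix.hpp, they throw std::runtime_error instead of mmx::exception, and the symbol TOY_ENABLE_VERIFY merely plays the role of the --enable-verify configuration option.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-ins for ERROR and ASSERT (hypothetical definitions).
#define TOY_ERROR(message) throw std::runtime_error (message)
#define TOY_ASSERT(condition, message) \
  do { if (!(condition)) TOY_ERROR (message); } while (false)

// Stand-in for VERIFY: only active when TOY_ENABLE_VERIFY is defined.
#ifdef TOY_ENABLE_VERIFY
#define TOY_VERIFY(condition, message) TOY_ASSERT (condition, message)
#else
#define TOY_VERIFY(condition, message) do {} while (false)
#endif

// Low level access: the bounds are only checked in verification mode.
inline int raw_get (const std::vector<int>& v, std::size_t i) {
  TOY_VERIFY (i < v.size (), "index out of range");
  return v[i];
}

// Wrapper intended for the glue: the bounds are always checked.
inline int checked_get (const std::vector<int>& v, std::size_t i) {
  TOY_ASSERT (i < v.size (), "index out of range");
  return raw_get (v, i);
}
```

In this sketch, only checked_get would be exported to the interpreter, so that an out of range index raises an error there, while C++ code that calls raw_get directly pays for the check only in verification mode.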

5.The Mathemagix module system

Still to be written and documented.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU General Public License. If you don't have this file, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.