This document presents the ASN.1 compiler, MAVROS, which is composed of the MAVCOD preprocessor and the MAVROS code generator.

The MAVCOD preprocessor:

The MAVROS generator:

The environment provided by MAVROS makes the development of OSI applications very easy.

The presentation of MAVROS follows this plan:

ASN.1 and compilers


The first transmission protocols used only relatively simple encoding techniques. The transmission control layers, such as "HDLC" or "IP", used only a few bytes of headers, where a limited number of protocol elements were coded in precisely located bit fields. The application control layers, such as "TELNET" or "TELETEX", transmitted mainly textual data, or at least text-encoded commands.

With the advent of more complex applications, the need arose to pass structured information, suitable for program manipulation: the current programming languages such as "C", "PL-1", "PASCAL" or "ADA" all include facilities to describe complex types of data, e.g. structures, records or arrays.

In order to describe complex messages containing these types of data, Interface Notation Languages have been developed, in particular to enable the use of Remote Procedure Calls.

ASN.1 is the acronym of "Abstract Syntax Notation 1", the interface notation language standardized by ISO and CCITT for applications developed in the "Open Systems Interconnection" (OSI) framework.

The relevant ISO standards are:

ASN.1 and Presentation

Following the OSI seven layers model, the conversion of message format between the local representation used for computing and the binary format carried over the transmission wires is performed inside the Presentation layer. The presentation service uses the notion of "abstract syntax" and "transfer syntax":
An abstract syntax is a description of the message from a "logical" point of view, in terms of records and arrays of various types of elements, e.g. integers or strings. ASN.1 is the language used to describe the "abstract syntaxes".

Two applications can only communicate if they have a common knowledge of the "abstract syntax" of the messages that they will exchange. The OSI model does not constrain them to use the same binary representation, as this would probably imply using the same hardware platform as well as the same programming languages and the same compilers. In order to communicate, the messages must be encoded in a binary format agreed upon by both parties.

According to the presentation protocol, the binary format is defined by the "transfer syntax". A transfer syntax is a set of coding rules, explaining for example how to encode basic elements such as integers and strings as well as the composition rules used to encode structures or arrays. The combination of a "transfer syntax" with an "abstract syntax" specifies the binary syntax of the messages that will be exchanged.
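As an illustration of what such coding rules look like, the following is a minimal sketch in C of how an INTEGER could be encoded under the basic encoding rules: one tag octet, one length octet, then the shortest two's complement content. The function name `ber_encode_integer` is hypothetical and is not part of MAVROS or of its support library, and only the short form of the length is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a MAVROS routine: encode "value" as a BER
 * INTEGER (tag 0x02, short-form length, two's complement content).
 * Returns the number of octets written, or 0 if "cap" is too small. */
size_t ber_encode_integer(long value, unsigned char *out, size_t cap)
{
    unsigned char content[sizeof(long)];
    unsigned long u = (unsigned long)value;
    size_t n = sizeof(long);
    size_t first = 0;
    size_t i;

    assert(out != NULL);

    /* Big-endian two's complement image of the value. */
    for (i = 0; i < sizeof(long); i++)
        content[i] = (unsigned char)(u >> (8 * (sizeof(long) - 1 - i)));

    /* Drop leading octets that carry no information. */
    while (n > 1 &&
           ((content[first] == 0x00 && (content[first + 1] & 0x80) == 0) ||
            (content[first] == 0xFF && (content[first + 1] & 0x80) != 0))) {
        first++;
        n--;
    }

    if (cap < n + 2)
        return 0;
    out[0] = 0x02;                 /* universal tag: INTEGER */
    out[1] = (unsigned char)n;     /* n <= sizeof(long), so short form */
    memcpy(out + 2, content + first, n);
    return n + 2;
}
```

Under these assumptions, the value 1200 is encoded as the four octets 02 02 04 B0.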

The presentation protocol allows the presentation entities to negotiate the transfer syntax that will be used on a given connection. Although the "basic encoding rules" defined in ISO 8825 are the only transfer syntax which has already been standardized, work is ongoing at ISO and in various research groups in order to define:

ASN.1 is not a classic language

ASN.1 is designed for telecommunications:

Direct mapping to "C" or "ADA" structures is almost impossible. One has to resort to explicit coding and decoding functions.

The need of ASN.1 Compilers

The coding and decoding functions can be implemented in several ways. One could in particular:

Hand coding is not the best possible idea:
|                                    | Compiled      | Manual                |
| Develop an application             | Days          | Months                |
| Update a protocol                  | Hours         | Weeks                 |
| Achieve conformance                | Correct input | Test and fail         |
| Reliability of code                | Good          | Dubious               |
| Debugging tools                    | Yes           | ???                   |
| Speed of decoding                  | Good          | Excellent... or poor  |
| Respect of layering                | Natural       | Depends on discipline |
| Introduce faster transfer syntaxes | Natural       | Awkward               |
In fact, we have here another instance of the old debate, i.e. should one use "High Level Languages" or "Assembly language".

The principles of ASN.1 compilers

Almost all ASN.1 compilers provide:

However, not all ASN.1 compilers are born equal. Different ASN.1 compilers will present different:

The MAVROS compiler incorporates features which result from more than 5 years of experimentation and research. MAVROS contains three parts:

A fully compliant preprocessor

ASN.1 modules

The basic element of the ASN.1 language is the "module", i.e. a set of definitions like:

      IMPORTS	GraphicString FROM graphic;

      EXPORTS	Modem ModemData	defaultModem;

      Modem ::= SEQUENCE {
	    name	 GraphicString,
	    data	 ModemData
			 DEFAULT	 defaultModem }

      ModemData ::= SEQUENCE {
		standard GraphicString,
		bps	 INTEGER }

      defaultModem ModemData ::= {
		standard "V.23",
		bps 1200 }

A module contains a set of type and value declarations. In the example, we have defined the types "Modem" and "ModemData", as well as the value "defaultModem". These types and values are defined by reference either to basic types defined in the ASN.1 language, like "INTEGER" in our example, or to types and values defined in the module, or to types and values imported from another module, like "GraphicString" in our example.

A typical feature of ASN.1, different from most other programming languages, is the very frequent use of "top down" notations: it is quite natural, when writing specifications, to specify first the "upper level" elements, like "Modem" in our example, and then to specify the "components" included in the box. While this feature makes the specification more readable, it certainly does not simplify the task of the compiler...

Special values can be defined for every type, however complex. These values can be used in the documentation of communication protocols; they can also be used in the "DEFAULT" clause, to specify that when optional elements of a message are "absent" on the transmission wires the recipient should act as if it had received the "default" value. This implies the need to detect the default values in order not to encode them, and to be able to copy them into the received messages.
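The two obligations described above, detecting the default on the sending side and copying it on the receiving side, can be sketched in C as follows. The structure and the function names are hypothetical illustrations for the "ModemData" example; they are not the code that MAVROS generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C image of the "ModemData" type. */
struct modem_data {
    const char *standard;
    long bps;
};

/* C image of the value "defaultModem" from the example module. */
static const struct modem_data default_modem = { "V.23", 1200 };

/* Comparison routine: returns 1 when both values are equal. */
int modem_data_equal(const struct modem_data *a, const struct modem_data *b)
{
    return a->bps == b->bps && strcmp(a->standard, b->standard) == 0;
}

/* Sender side: returns 1 if the "data" component has to be encoded,
 * 0 if it equals the default and must be left out of the message. */
int modem_data_must_encode(const struct modem_data *d)
{
    assert(d != NULL);
    return !modem_data_equal(d, &default_modem);
}

/* Receiver side: when the component was absent from the message
 * ("present" is 0), copy the default value into the result. */
void modem_data_fill(struct modem_data *dst,
                     const struct modem_data *decoded, int present)
{
    assert(dst != NULL);
    if (present)
        *dst = *decoded;
    else
        *dst = default_modem;
}
```

The same comparison routine serves both for the "DEFAULT" test and for any application code that needs to compare two values of the type.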

MAVCOD supports:

In short, "MAVCOD" fully implements the ASN.1 language, without any restriction. ASN.1 modules found in ISO and CCITT standards, or developed for specific applications, just need to be typed into a text file and fed into the compiler to obtain the corresponding encoding and decoding routines.

Soft typing

In addition to the "IMPORTS" clause, which enables users to refer explicitly to types defined in other modules, ASN.1 includes a "type extension facility", used to define "generic" messages where some particular components will be specified at "run time". An example of this feature is the notation for object attributes found in standards like X.500 or CMIP:
	Attribute ::= SEQUENCE {
		value ANY DEFINED BY type}
When receiving a message containing this component, a presentation entity should in theory look in an "attribute catalog", using the specific value of the "type" as a key. The attribute catalog will, among other things, contain the ASN.1 specification of the attribute values.

In order to support this feature, MAVCOD requires the application programmer to provide specific tables, or more precisely to provide a search function which, given a key of the specified type, will return a pointer to a "table of functions" used for encoding and decoding the type. Entering the specification mentioned above will cause MAVCOD to issue a warning message:

	You will have to define	the procedure Attribute_value()!
This procedure will be called by the decoding functions, with the value of the "type" as argument. It will most probably use a table linking the type and the attributes. In order to generate these tables, the programmer can indeed use MAVCOD to specify the values used for keying, and the types of the attributes. For example, one could enter the following ASN.1 module:
	attributes DEFINITIONS ::=

	EXPORTS	oid1, oid2, oid3, xtype1, xtype2, xtype3;

	-- this	is indeed only an example
	-- real	life oid values	shall be allocated conforming
	-- to standard practices.

	basicOid OBJECT	IDENTIFIER ::= { 1 2 3 4 5 }

	-- Definition of the first attribute,
	-- of type integer

	oid1 OBJECT IDENTIFIER ::= { basicOid 1	}

	xtype1 ::= OPAQUE INTEGER

	-- Definition of the second attribute,
	-- a binary octet string

	oid2 OBJECT IDENTIFIER ::= { basicOid 2	}


	-- Definition of the third attribute,
	-- a printable string.

	oid3 OBJECT IDENTIFIER ::= { basicOid 3	}

	xtype3 ::= OPAQUE CHOICE {
		T61String }
MAVCOD will initialize the C values corresponding to the various object identifiers, and will generate an annotated ASN.1 file so that MAVROS can generate the encoding routines for the various data types. Note the insertion here of the "OPAQUE" keyword. This is a MAVCOD-specific extension of the ASN.1 language, which one could describe as a built-in ASN.1 "macro". The effect of this keyword is to instruct the compiler that the type should be externally opaque, i.e. typed in C as a "void *". This is necessary in our case, as the value corresponding to the "ANY DEFINED BY" element will be typed as a "void *".

Using the output of MAVCOD and MAVROS, the procedure needed to "solve" the "soft typing" can be easily entered as:

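	#include <string.h>
	#include "attributes.h"

	extern asn1_type_desc asn1_absurd_desc;

	/* Table linking each attribute type (an object identifier)
	   to the descriptor of the corresponding value type. */
	struct table_entry {
		asn1_oid * type;
		asn1_type_desc * value;
	} table[] = {
		{&oid1,	&xtype1_desc },
		{&oid2,	&xtype2_desc },
		{&oid3,	&xtype3_desc }};
	int table_size = sizeof(table)/sizeof(struct table_entry);

	asn1_type_desc * Attribute_value(x)
	asn1_oid x;
	{	register int i;

		/* Look for a table entry whose identifier equals x. */
		for (i=0; i<table_size; i++)
			if (x.l == table[i].type->l &&
			    memcmp(x.v, table[i].type->v, table[i].type->l) == 0)
				return(table[i].value);
		/* Unknown type: reference the "absurd" pseudo type. */
		return(&asn1_absurd_desc);
	}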
The values "oid1", "oid2" and "oid3" have been initialized by MAVCOD from the value definitions present in the ASN.1 file. The procedure tables "xtype1_desc", "xtype2_desc" and "xtype3_desc" are created by MAVROS from the definition of the types "xtype1", "xtype2" and "xtype3".

Note that in case of "failure" we do not return a "nil" pointer, but rather reference the predefined type "asn1_absurd". This pseudo type is defined in the ASN.1 support library. Using it will guarantee that the coding functions terminate properly, and that the decoding functions return a decoding error.

A natural solution for generating these tables is to use the MACRO facility of the ASN.1 language, which will be explained in the following sections.

ASN.1 macros

The ASN.1 language includes a syntax extension facility, the possibility to define "Macros". A macro can be included in any ASN.1 module, and can also be imported from other modules, just like a type or a value definition. The following module, copied from an example in the ASN.1 standard, includes the definition of a macro called "PAIR".


		"TYPEX"	"=" type (Type-1)
		"TYPEY"	"=" type (Type-2)
		"X" "="	value(value-1 Type-1) ","
		"Y" "="	value(value-2 Type-2)
		< VALUE	SEQUENCE {Type-1, Type-2}
		      ::= { value-1, value-2 } > ")"


	V1 T1 ::= ( X =	3, Y = TRUE )

	      ::= ( X =	3, Y= (X=4, Y=FALSE))

A macro definition consists of a set of "production rules", which are described in a sort of "Backus-Naur" form. The production rules list several alternatives separated by the "|" symbol. Each alternative is described by a list of "tokens", which can be:

A type is designated by the keyword "type", which can optionally be followed by a local type identifier in parentheses, as in "type (Type-1)". A value is designated by the keyword "value", which is followed by a parenthesized group containing an optional local value identifier and a reference to the type of the value. That reference can be either a direct reference to an ASN.1 type, or the local identifier attached to a type reference within the macro.

A production rule may also contain "embedded definitions", i.e. definition of types and values enclosed between angle brackets. These definitions may use either ASN.1 types defined outside the macro, or local references defined within the production rules.

A macro definition must always contain at least two production rules, called "TYPE NOTATION" and "VALUE NOTATION". Once the macro has been defined, the "TYPE NOTATION", preceded by the MACRO name, can be used wherever a "type definition" is expected within the ASN.1 language, e.g. in order to define a new type, or as member of a SET, a SEQUENCE or a CHOICE. The module "pair" shown above includes for example the definition of the type "T1" and the implicit definition of the type associated with "v2" by use of the macro "PAIR".

Whenever one has to define a value for a type defined by a macro, one must follow the syntax defined in the "VALUE NOTATION". This can indeed happen when a value is defined at the "module" level, but also when a component of a complex structure is defined by means of the macro notation. This is the case in our example for the value "v1", of type "T1", for the value "v2" whose type is defined by the macro "PAIR", but also for the "Y" component of "v2", which is defined by the type "T1".

A value notation must always include the definition of a value identified by the reserved identifier "VALUE". This identifier denotes the definition of the value associated with the value notation, as opposed to the other values, which merely play the role of "building blocks". As a result, the definition of a type by use of a macro is equivalent to a reference to the type associated with this special value.

MAVCOD fully supports the ASN.1 macros:

MAVCOD will parse the macro instantiations present in the ASN.1 modules, and replace the type and value notations by the corresponding ASN.1 definitions. This process is normally not visible to the user. MAVCOD can also be used to provide a "demacro-ized" version of the ASN.1 module, which could for example be used by less capable ASN.1 compilers. This is obtained by calling the program with a special command line option, as in:
	mavcod pair.asn1 -umac pair.bis
Running this option on the example quoted above yields the following result:


	   T1-Type-1 INTEGER,
	   T1-Type-2 BOOLEAN}

	v1 T1 ::= {
	   v1-value-1  ,
	   v1-value-2  }

	v1-value-1 INTEGER ::= 3

	v1-value-2 BOOLEAN ::= TRUE

	v2 v2-type ::= {
	   v2-value-1  ,
	   v2-value-2  }

	v2-type	::= SEQUENCE{
	   v2-type-Type-1 INTEGER,
	   v2-type-Type-2 T1}

	v2-type-Type-2 ::= T1

	v2-value-1 INTEGER ::= 3

	v2-value-2 v2-type-Type-2 ::= {
	   v2-value-2-value-1  ,
	   v2-value-2-value-2  }

	v2-value-2-value-1 INTEGER ::= 4

	v2-value-2-value-2 BOOLEAN ::= FALSE

This usage of macros corresponds to a full implementation of the ASN.1 language, and does not require any modification to the ASN.1 input. However, the usage of macros in international standards is not limited to the definition of types and values, but may well include the provision of "knowledge". In his "Open Book", Dr. Marshall Rose goes as far as saying that a "disastrous characteristic" of the macro facility is that it has "buried semantics", and that "the macro facility introduces tremendous difficulty for automatic tools." In order to solve this difficult problem, an extension to the MACRO syntax definition has been incorporated in MAVCOD.

Extended support of macros

Let us consider an ASN.1 macro that would be used to describe "Attribute syntaxes", as can for example be found in the X.500 standard:

	TYPE NOTATION ::= Syntax MatchTypes |


	Syntax ::= type(syntax)

	MatchTypes ::= "MATCHES	FOR" Matches
		| empty
	Matches	::= Match Matches | Match
	Match ::= "EQUALITY"
Examples of instantiation of this macro are:
	       ::= {syntax 1}

	printatt ATTRIBUTE PrintableString
	       ::= {syntax 2}

	unknown	ATTRIBUTE ::= {syntax 3}

	       ::= {syntax 4}
If we merely state these definitions in an ASN.1 module, the value notations will be expanded, and we will get the following output:
	intatt intatt-type ::= {
	   syntax  1  }

	intatt-type ::=	OBJECT IDENTIFIER

	printatt printatt-type ::= {
	   syntax  2  }

	printatt-type ::= OBJECT IDENTIFIER

	unknown	unknown-type ::= {
	   syntax  3  }

	unknown-type ::= OBJECT	IDENTIFIER

	anyatt anyatt-type ::= {
	   syntax  4  }

	anyatt-type ::=	OBJECT IDENTIFIER
This is certainly correct from a "language" point of view, but falls somewhat short of the needs of the application programmer. In order to implement the "soft-typing" facility described above, he would certainly have liked to find something like:
	intatt OBJECT IDENTIFIER ::= {syntax  1	}
	intatt-syntax ::= OPAQUE INTEGER

	printatt OBJECT	IDENTIFIER ::= {syntax 2 }
	printatt-syntax	::= OPAQUE PrintableString

	unknown	OBJECT IDENTIFIER ::= {syntax 3	}

	anyatt OBJECT IDENTIFIER ::= { syntax  4 }
	anyatt-syntax ::= OPAQUE OCTET STRING
Moreover, he would also have liked to use the knowledge of the macro in order to build "automatically" the table of attributes. This table is probably not only used for the coding and decoding routines, but perhaps also by some parts of the application program, e.g. to check whether this or that comparison operation is allowed for the attribute.

As implied by Dr. Marshall Rose, there is probably no general solution to the problem. In order to solve at least the case of the "most used" macros, MAVCOD provides the following extensions to the ASN.1 macro syntax:

Using this facility, we can annotate the ATTRIBUTE macro:

	TYPE NOTATION ::= Syntax MatchTypes |


	Syntax ::= type(OPAQUE syntax) <
			EXPORTS	syntax;
			FILE ::= type(syntax);>

	MatchTypes ::= "MATCHES	FOR"
		| empty	
	Matches	::= Match Matches | Match
	Match ::= "EQUALITY" 
An example of use of the "OPAQUE" keyword is found in the production rule named "Syntax"; it is followed by an example of the EXPORTS clause within an embedded definition.

The example contains the three allowed forms of the FILE command:

When a macro definition contains a "FILE" specification, each definition of a macro value will result in a line in the "macro output" file, which will contain:

There is no need to change the macro instantiations themselves: the extensions are all concentrated in the macro definition. As a consequence, we can use the same input as above, which, after modification of the macro definition, will result in the following ASN.1 definitions:
	intatt intatt-type ::= {
	   syntax  1  }

	intatt-type ::=	OBJECT IDENTIFIER

	intatt-type-syntax ::= OPAQUE INTEGER

	printatt printatt-type ::= {
	   syntax  2  }

	printatt-type ::= OBJECT IDENTIFIER

	printatt-type-syntax ::= OPAQUE	PrintableString

	unknown	unknown-type ::= {
	   syntax  3  }

	unknown-type ::= OBJECT	IDENTIFIER

	anyatt anyatt-type ::= {
	   syntax  4  }

	anyatt-type ::=	OBJECT IDENTIFIER

	anyatt-type-syntax ::= OPAQUE OCTET STRING
The macro definition file will contain the following lines:
	ATTRIBUTE intatt-type-syntax { 1 3 } intatt
	ATTRIBUTE printatt-type-syntax { 1 2 } printatt
	ATTRIBUTE ? { 1	} unknown
	ATTRIBUTE anyatt-type-syntax { 1 } anyatt
It can easily be parsed by a user provided program, in order to produce a definition of the attribute table.
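A minimal sketch of such a user-provided program is given below, assuming that each line holds four fields (macro name, type name or "?", a braced group, value name) as in the listing above. The function name `attribute_table_entry` is hypothetical, the replacement of hyphens by underscores is an assumption about the C names, and the use of "asn1_absurd_desc" for an attribute without syntax is only one possible choice. The braced group is parsed but not interpreted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Replace the hyphens of an ASN.1 name by underscores (assumed C naming). */
static void to_c_identifier(char *s)
{
    for (; *s != '\0'; s++)
        if (*s == '-')
            *s = '_';
}

/* Hypothetical helper: turn one line of the macro output file into one
 * initializer of the attribute table.  Returns 1 on success, 0 if the
 * line is not an ATTRIBUTE line or if the result does not fit in "out". */
int attribute_table_entry(const char *line, char *out, size_t cap)
{
    char macro[64], type[64], group[64], value[64];
    int n;

    assert(line != NULL && out != NULL);

    /* Fields: macro name, type name (or "?"), braced group, value name. */
    if (sscanf(line, "%63s %63s { %63[^}] } %63s",
               macro, type, group, value) != 4)
        return 0;
    if (strcmp(macro, "ATTRIBUTE") != 0)
        return 0;

    to_c_identifier(value);
    if (strcmp(type, "?") == 0) {
        /* No syntax given: fall back on the "absurd" descriptor. */
        n = snprintf(out, cap, "{&%s, &asn1_absurd_desc},", value);
    } else {
        to_c_identifier(type);
        n = snprintf(out, cap, "{&%s, &%s_desc},", value, type);
    }
    return n >= 0 && (size_t)n < cap;
}
```

Calling this function on every line of the file and printing the results between the braces of an array initializer would give a table of the same shape as the hand-written one used for "Attribute_value()".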

An efficient code generator

Transfer syntaxes

The prime function of the MAVROS compiler is to generate coding and decoding routines from the annotated ASN.1 text produced by MAVCOD:
		    Annotated ASN.1,
		    C type description,
			     |
			     v
			 | MAVROS|
			     |
			     v
		    Coding routines (C),
		    Header file,
		    Make file.
The "C" output does include coding routines for the ASN.1 "Basic Encoding Rules", as with most ASN.1 compilers. However, the MAVROS compiler also generates other routines:

The X.509 standard defined a "unique encoding of ASN.1 data", which is used in conjunction with digital signature algorithms. The routines generated by MAVROS can optionally generate this unique encoding. The work started in X.509 has been followed by the definition by ISO of a new transfer syntax, called "distinguished encoding rules"; support for this new syntax will be provided in the new version of MAVROS.

The "light weight" syntax has been developed in a research project at INRIA. It can be negotiated by knowledgeable presentation layers, and results in a significant reduction of the coding and decoding time between compatible machines, in particular when the syntax is oriented towards "computing" applications, with many integers or floating point numbers.

The programming and debugging support provided by MAVROS is designed to ease the development of complex applications.


MAVROS does not only generate coding and decoding routines, but also, for each data type:

Routines are generated. The "copy" and "comparison" routines are used for the handling of the "DEFAULT" clause of ASN.1. The "hash coding" routine has been developed for the X.500 applications: it helps to make quick accesses to tables indexed by complex attribute types. Each routine can be superseded by user-provided code. This is in particular useful when the type definitions include some "hidden characteristics", as in:
	Case-Independant-String	::= CHOICE {
		T61String }
Together with other features of MAVROS, like the handling of "OPAQUE" types, these facilities can be used to program in "object oriented" style.
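As a sketch of what such user-provided code could look like for a case-independent string, the two C functions below replace the comparison and the hash coding. The names and the signatures are hypothetical, since the interface expected by MAVROS is not described here, and only the characters handled by the standard `tolower()` function are folded, which ignores the non-ASCII part of the T.61 repertoire.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical user-provided comparison: returns 0 when the two strings
 * are equal up to letter case, and 1 otherwise. */
int case_independent_cmp(const unsigned char *a, size_t la,
                         const unsigned char *b, size_t lb)
{
    size_t i;

    assert(a != NULL && b != NULL);
    if (la != lb)
        return 1;
    for (i = 0; i < la; i++)
        if (tolower(a[i]) != tolower(b[i]))
            return 1;
    return 0;
}

/* Hypothetical user-provided hash: it folds the case before hashing, so
 * that two strings which compare equal always get the same hash code. */
unsigned long case_independent_hash(const unsigned char *s, size_t l)
{
    unsigned long h = 5381;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < l; i++)
        h = h * 33 + (unsigned long)tolower(s[i]);
    return h;
}
```

The point of the design is that the two routines must agree: a table indexed by the hash code would otherwise miss entries that differ only by case.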

Debugging tools

MAVROS can generate: for each type. For example, if a data type has been defined as:
	Modem ::= SEQUENCE {
		name	 GraphicString,
		data	 ModemData
		DEFAULT	 defaultModem }

	ModemData ::= SEQUENCE {
		standard GraphicString,
		bps	 INTEGER }
MAVROS will enable users to enter values of the type:
	<	name = "Modem 1";
		data = <
			standard = "V.21";
			bps = 2400 >>
The structure of the text output closely follows the ASN.1 syntax of the data. The text routines are not necessarily adequate for sophisticated user interfaces, but they can be used to quickly develop debugging programs for new applications, following a simple structure:
  1. Read "text" message,
  2. Encode as ASN.1 BER,
  3. Send to server,
  4. Wait response,
  5. Decode the ASN.1 BER,
  6. Print "text" output.
The text versions can also be used in logging files. In fact, one can very easily use text from a logging file, edit it to include particular difficulties, and obtain a "test suite" for a particular program.

A well tested system

The development of the MAVROS compiler started at INRIA in 1987, and took advantage of the experience acquired since 1985 in the development of X.400 systems. As a result, the MAVROS compiler is highly portable, has been thoroughly tested and includes several performance optimizations.

High portability

The initial development of MAVROS took place in the ESPRIT project "THORN", with the requirement that the compiler should work on the systems used by several "industrial" members of the project like BULL, GEC, ICL, Olivetti or Siemens. It was then used by Siemens in its submission to the "DCE" request for technology of the OSF, and was subsequently ported to the machines of the members of the OSF consortium.

As a result, the MAVROS compiler is very portable:

The portability problems are solved by a number of configuration-specific flags set in the "asn1.h" header file, which describes all the data types and procedure interfaces of the support library. These flags are used to parametrize:

This file is included in:

As soon as the main "header file" has been modified to take into account the hardware, system and compiler specificities of a particular platform, the compilation becomes automatic. Both the ASN.1 "support library" and the coding routines generated by MAVROS are portable, and it is possible to generate the routines on one platform and to compile them on another.

Qualification tests

One of the results of the porting exercises carried out over the years for the MAVROS compiler has been to stress the need for a "test suite". The MAVROS delivery includes a test suite, which consists of a syntax notation in annotated ASN.1 and a test program exercising the syntax.

The "test" syntax:

The test syntax has been extended after each port, in order to reflect all particular cases which posed problems on the different platforms, e.g. in order to mimic the situations which caused "bug reports" from the developers of application protocols like X.400, X.500, CMIP, SNMP or Kerberos.

The testing program:

The testing program also includes "resistance" tests. A valid ASN.1 encoding is perturbed by randomly changing some of the octets that it contains. After each perturbation, the decoding routines are tested: they should detect an error and terminate properly. This mimics the situation of a network connection where data can be incorrectly copied in a relay, or improperly generated by a remote application; such incorrect behaviour should not be allowed to cause a fatal termination of a network server, and will be detected by the consistency checks of the decoding routines.
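The principle of such a test can be sketched in C with a toy decoder. Both functions below are hypothetical and much simpler than the MAVROS test program: the decoder only accepts a single INTEGER with a short-form length, and the test only counts how many perturbed encodings it rejects. A change in a content octet can still produce a valid encoding of another value, so in this sketch the requirement is that the decoder returns cleanly in every case, not that every perturbation is rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy decoder for a BER INTEGER with short-form length.
 * Returns 0 and stores the value on success, -1 on any inconsistency. */
int toy_decode_integer(const unsigned char *buf, size_t len, long *value)
{
    const unsigned long sign_bit = ~(~0UL >> 1);
    unsigned long u;
    size_t n, i;

    assert(buf != NULL && value != NULL);
    if (len < 3 || buf[0] != 0x02)          /* too short, or wrong tag */
        return -1;
    n = buf[1];
    if (n == 0 || n > sizeof(long) || n != len - 2)
        return -1;                          /* length does not fit */
    u = (buf[2] & 0x80) ? ~0UL : 0UL;       /* sign extension */
    for (i = 0; i < n; i++)
        u = (u << 8) | buf[2 + i];
    *value = (u & sign_bit) ? -(long)(~u) - 1 : (long)u;
    return 0;
}

/* Perturb one random octet of a valid encoding, "trials" times, and
 * return the number of perturbed encodings rejected by the decoder
 * (-1 if the arguments are unusable). */
int resistance_test(const unsigned char *valid, size_t len,
                    int trials, unsigned seed)
{
    unsigned char copy[16];
    int rejected = 0;
    long v;
    int t;

    if (len == 0 || len > sizeof copy)
        return -1;
    srand(seed);
    for (t = 0; t < trials; t++) {
        memcpy(copy, valid, len);
        /* XOR with a non-zero octet, so that the octet really changes. */
        copy[rand() % len] ^= (unsigned char)(1 + rand() % 255);
        if (toy_decode_integer(copy, len, &v) != 0)
            rejected++;
    }
    return rejected;
}
```

A decoder that reads past the end of its buffer, or loops forever on a corrupted length, would be caught by running this kind of loop under a memory checker or with a time limit.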

The test suite is run:

It is, in that case, used as a "non-regression test".

Performance measurement

Performance tests are currently conducted at INRIA, in relation with research projects on the performance of the presentation layer. The goals of the studies are to:

These tests have revealed some possible performance gains, as shown in the following table:
| us / function |   optim |      v2 |      v1 |      v0 |
| cpy           |  264989 |  340819 |  359818 |  386817 |
| cod           |   51664 |         |         |   62330 |
| dec           |  187325 |  269989 |  307154 |  291821 |
| ftcd          |   35998 |         |         |   37998 |
| ftdc          |  144827 |  221157 |  247656 |  243490 |
| out           |  233823 |         |         |  221157 |
| in            | 1481440 | 2539231 | 2576230 | 2480234 |

The test case shown in this table consisted of a batch of 20 "P1" messages, and was run on a SUN 3-60. The times in the table indicate the number of microseconds necessary to encode or decode these 20 messages. The different columns show the coding or decoding time for successive versions of the compiler, and show a gain in performance of about 30% between the version "v0" of January 91 and the current version.
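The gain can be checked row by row from the table, taking the "v0" column as the old time and the "optim" column as the new one. The small C function below is only a convenience for this arithmetic; its name is hypothetical.

```c
#include <assert.h>

/* Relative gain, in percent, of a new timing over an old one.
 * A negative result means that the new version is slower. */
double gain_percent(double old_us, double new_us)
{
    assert(old_us > 0.0);
    return 100.0 * (old_us - new_us) / old_us;
}
```

With the figures of the table this gives roughly 31% for "cpy" (386817 against 264989) and roughly 36% for "dec" (291821 against 187325), while the "out" row is slightly slower in the current version.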

The important difference between coding and decoding is explained by:

Further optimization is going on, and new results are expected before the end of 91.