The MAVCOD preprocessor:
The presentation of MAVROS follows this plan:
With the advent of more complex applications, the need arose to pass structured information, suitable for program manipulation: the current programming languages such as "C", "PL-1", "PASCAL" or "ADA" all include facilities to describe complex types of data, e.g. structures, records or arrays.
In order to describe complex messages containing these types of data, Interface Notation Languages have been developed, in particular in order to enable the usage of Remote Procedure Calls.
ASN.1 is the acronym of "Abstract Syntax Notation One", the interface notation language standardized by ISO and CCITT for applications developed in the "Open Systems Interconnection" (OSI) framework.
The relevant ISO standards are:
An abstract syntax is a description of the message from a "logical" point of view, in terms of records and arrays of various types of elements, e.g. integers or strings. ASN.1 is the language used to describe the "abstract syntaxes". Two applications can only communicate if they have a common knowledge of the "abstract syntax" of the messages that they will exchange. The OSI model does not constrain them to use the same binary representation, as this would probably imply using the same hardware platform as well as the same programming languages and the same compilers. In order to communicate, the messages must be encoded in a binary format agreed upon by both parties.
According to the presentation protocol, the binary format is defined by the "transfer syntax". A transfer syntax is a set of coding rules, explaining for example how to encode basic elements such as integers and strings as well as the composition rules used to encode structures or arrays. The combination of a "transfer syntax" with an "abstract syntax" specifies the binary syntax of the messages that will be exchanged.
The presentation protocol allows the presentation entities to negotiate the transfer syntax that will be used on a given connection. Although the "basic encoding rules" defined in ISO 8825 are the only transfer syntax that has already been standardized, work is ongoing at ISO and in various research groups in order to define:
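To make the idea of a coding rule concrete, here is a minimal sketch of how the basic encoding rules represent an INTEGER: a tag octet, a length octet, and the shortest two's-complement content. The function name ber_encode_int32 is hypothetical and is not a routine generated by MAVROS; only 32-bit values and short-form lengths are handled.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode a 32-bit signed integer as a BER INTEGER.
 * Returns the number of octets written, or 0 if the buffer is too small.
 * Hypothetical helper for illustration only.
 */
size_t ber_encode_int32(int32_t value, unsigned char *out, size_t cap)
{
    unsigned char content[4];
    uint32_t bits = (uint32_t)value;
    size_t start = 0;
    size_t len, i;

    /* Big-endian two's-complement image of the value. */
    content[0] = (unsigned char)(bits >> 24);
    content[1] = (unsigned char)(bits >> 16);
    content[2] = (unsigned char)(bits >> 8);
    content[3] = (unsigned char)bits;

    /* Drop a leading octet while it only repeats the sign bit of the next one. */
    while (start < 3 &&
           ((content[start] == 0x00 && (content[start + 1] & 0x80) == 0) ||
            (content[start] == 0xFF && (content[start + 1] & 0x80) != 0)))
        start++;

    len = 4 - start;
    if (cap < len + 2)
        return 0;

    out[0] = 0x02;                 /* universal tag for INTEGER */
    out[1] = (unsigned char)len;   /* short-form length, 1 to 4 */
    for (i = 0; i < len; i++)
        out[2 + i] = content[start + i];
    return len + 2;
}
```

With this sketch, the value 1200 is encoded as the four octets 02 02 04 B0, and the value -1 as the three octets 02 01 FF.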
     _________________________________________________________
    |                         | Compiled   | Manual          |
    |_________________________|____________|_________________|
    | Develop an application  | Days       | Months          |
    | Update a protocol       | Hours      | Weeks           |
    | Achieve conformance     | Correct    | Test and        |
    |                         | input      | fail            |
    |_________________________|____________|_________________|
    | Reliability of code     | Good       | Dubious         |
    | Debugging tools         | Yes        | ???             |
    |_________________________|____________|_________________|
    | Speed of decoding       | Good       | Excellent...    |
    |                         |            | ...or poor      |
    | Respect of layering     | Natural    | Depends         |
    |                         |            | on discipline   |
    | Introduce faster        |            |                 |
    | transfer syntaxes       | Natural    | Awkward         |
    |_________________________|____________|_________________|

In fact, we have here another instance of the old debate, i.e. should one use "High Level Languages" or "Assembly language".
    modem DEFINITIONS ::= BEGIN
    IMPORTS GraphicString FROM graphic;
    EXPORTS Modem, ModemData, defaultModem;
    Modem ::= SEQUENCE {
        name GraphicString,
        data ModemData DEFAULT defaultModem }
    ModemData ::= SEQUENCE {
        standard GraphicString,
        bps INTEGER }
    defaultModem ModemData ::= { standard "V.23", bps 1200 }
    END

A module contains a set of type and value declarations. In the example, we have defined the types "Modem" and "ModemData", as well as the value "defaultModem". These types and values are defined by reference either to basic types defined in the ASN.1 language, like "INTEGER" in our example, or to types and values defined in the module, or to types and values imported from another module, like "GraphicString" in our example.
A typical feature of ASN.1, different from most other programming languages, is the very frequent use of "top down" notations: it is quite natural, when writing specifications, to specify first the "upper level" elements, like "Modem" in our example, and then to specify the "components" included in the box. While this feature makes the specification more readable, it certainly does not simplify the task of the compiler...
Special values can be defined for every type, however complex. These values can be used in the documentation of communication protocols; they can also be used in the "DEFAULT" clause, to specify that when optional elements of a message are "absent" on the transmission wires the recipient should act as if it had received the "default" value. This implies the necessity to detect the default values in order not to encode them, and to be able to copy them into the received messages.
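The two obligations stated above (detect a default before encoding, copy it after decoding) can be sketched in C as follows. The structures and function names are assumptions made for this illustration; the C types actually produced by MAVROS for "Modem" and "ModemData" may look different.

```c
#include <string.h>

/* Hypothetical C image of the ASN.1 type ModemData. */
struct modem_data {
    const char *standard;
    long bps;
};

/* Hypothetical C image of the ASN.1 type Modem. */
struct modem {
    const char *name;
    struct modem_data data;
};

/* C image of the ASN.1 value defaultModem. */
static const struct modem_data default_modem = { "V.23", 1200 };

/* Sender side: returns 1 when the component equals the default,
 * so that the encoder can leave it out of the message. */
int modem_data_is_default(const struct modem_data *d)
{
    return d->bps == default_modem.bps &&
           strcmp(d->standard, default_modem.standard) == 0;
}

/* Receiver side: when the component was absent on the wire,
 * copy the default into the decoded structure. */
void modem_fill_default(struct modem *m, int data_was_present)
{
    if (!data_was_present)
        m->data = default_modem;
}
```

The comparison has to be made on values, not on pointers, since an application may well build a structure that happens to be equal to the default.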
MAVCOD supports:
    mavcod modem.asn1 -load /graphic/graphic.asn1

The next version will also use a "search path" in order to search for these files automatically.

    Attribute ::= SEQUENCE {
        type OBJECT IDENTIFIER,
        value ANY DEFINED BY type}

When receiving a message containing this component, a presentation entity should in theory look in an "attribute catalog", using the specific value of the "type" as a key. The attribute catalog will, among other things, contain the ASN.1 specification of the attribute values.
In order to support this feature, MAVCOD requires the application programmer to provide specific tables, or more precisely to provide a search function which, given a key of the specified type, will return a pointer to a "table of functions" used for encoding and decoding the type. Entering the specification mentioned above will cause MAVCOD to issue a warning message:
    You will have to define the procedure Attribute_value()!

This procedure will be called by the decoding functions, with the value of the "type" as argument. It will most probably use a table linking the type and the attributes. In order to generate these tables, the programmer can indeed use MAVCOD to specify the values used for keying, and the types of the attributes. For example, one could enter the following ASN.1 module:

    attributes DEFINITIONS ::= BEGIN
    EXPORTS oid1, oid2, oid3, xtype1, xtype2, xtype3;

    -- this is indeed only an example
    -- real life oid values shall be allocated conforming
    -- to standard practices.
    basicOid OBJECT IDENTIFIER ::= { 1 2 3 4 5 }

    -- Definition of the first attribute,
    -- of type integer
    oid1 OBJECT IDENTIFIER ::= { basicOid 1 }
    xtype1 ::= OPAQUE INTEGER

    -- Definition of the second attribute,
    -- a binary octet string
    oid2 OBJECT IDENTIFIER ::= { basicOid 2 }
    xtype2 ::= OPAQUE OCTET STRING

    -- Definition of the third attribute,
    -- a printable string.
    oid3 OBJECT IDENTIFIER ::= { basicOid 3 }
    xtype3 ::= OPAQUE CHOICE { PrintableString, T61String }
    END

MAVCOD will initialize the C values corresponding to the various object identifiers, and will generate an annotated ASN.1 file so that MAVROS can generate the encoding routines for the various data types. Note the insertion here of the "OPAQUE" keyword. This is a MAVCOD specific extension of the ASN.1 language, that one could describe as a built-in ASN.1 "macro". The effect of this keyword is to instruct the compiler that the type should be externally opaque, i.e. typed in C as a "void *". This is necessary in our case, as the value corresponding to the "ANY DEFINED BY" element will be typed as a "void *".
Using the output of MAVCOD and MAVROS, the procedure needed to "solve" the "soft typing" can be easily entered as:
    #include "attributes.h"

    extern asn1_type_desc asn1_absurd_desc;

    /* One entry per known attribute: its object identifier and
       the table of coding functions for its value. */
    struct table_entry {
        asn1_oid * type;
        asn1_type_desc * value;
    } table[] = {
        {&oid1, &xtype1_desc },
        {&oid2, &xtype2_desc },
        {&oid3, &xtype3_desc }};

    int table_size = sizeof(table)/sizeof(struct table_entry);

    asn1_type_desc * Attribute_value(x)
    asn1_oid x;
    {
        register int i;

        for (i=0; i<table_size; i++){
            /* The start of the comparison call is missing from the
               source text; only its last two arguments survive. */
            if ([...] table[i].type->v, table[i].type->l) == 0)
                return(table[i].value);
        }
        return(&asn1_absurd_desc);
    }

The values "oid1", "oid2" and "oid3" have been initialized by MAVCOD from the value definitions present in the ASN.1 file. The procedure tables "xtype1_desc", "xtype2_desc" and "xtype3_desc" are created by MAVROS from the definition of the types "xtype1", "xtype2" and "xtype3".
Note that in case of "failure" we do not return a "nil" pointer, but rather reference the predefined type "asn1_absurd". This pseudo type is defined in the ASN.1 support library. Using it will guarantee that the coding functions terminate properly, and that the decoding functions return a decoding error.
A natural solution for generating these tables is to use the MACRO facility of the ASN.1 language, which will be explained in the following sections.

    pair DEFINITIONS ::= BEGIN
    EXPORTS PAIR, T1, V1, V2;
    PAIR MACRO ::= BEGIN
    TYPE NOTATION ::=
        "TYPEX" "=" type (Type-1)
        "TYPEY" "=" type (Type-2)
    VALUE NOTATION ::=
        "(" "X" "=" value(value-1 Type-1) ","
            "Y" "=" value(value-2 Type-2)
        < VALUE SEQUENCE {Type-1, Type-2} ::= { value-1, value-2 } >
        ")"
    END
    T1 ::= PAIR TYPEX = INTEGER TYPEY = BOOLEAN
    V1 T1 ::= ( X = 3, Y = TRUE )
    V2 PAIR TYPEX = INTEGER TYPEY = T1 ::= ( X = 3, Y= (X=4, Y=FALSE))
    END

A macro definition consists of a set of "production rules", which are described in a sort of "Backus-Naur" form. The production rules list several alternatives separated by the "|" symbol. Each alternative is described by a list of "tokens", which can be:
A production rule may also contain "embedded definitions", i.e. definition of types and values enclosed between angle brackets. These definitions may use either ASN.1 types defined outside the macro, or local references defined within the production rules.
A macro definition must always contain at least two production rules, called "TYPE NOTATION" and "VALUE NOTATION". Once the macro has been defined, the "TYPE NOTATION", preceded by the MACRO name, can be used wherever a "type definition" is expected within the ASN.1 language, e.g. in order to define a new type, or as a member of a SET, a SEQUENCE or a CHOICE. The module "pair" shown above includes for example the definition of the type "T1" and the implicit definition of the type associated with "v2" by use of the macro "PAIR".
Whenever one has to define a value for a type defined by a macro, one must follow the syntax defined in the "VALUE NOTATION". This can indeed happen when a value is defined at the "module" level, but also when a component of a complex structure is defined by means of the macro notation. This is the case in our example for the value "v1", of type "T1", for the value "v2" whose type is defined by the macro "PAIR", but also for the "Y" component of "v2", which is defined by the type "T1".
A value notation must always include the definition of a value identified by the reserved identifier "VALUE". This identifier denotes the definition of the value associated with the value notation, as opposed to the other values, which merely play the role of "building blocks". As a result, the definition of a type by use of a macro is equivalent to a reference to the type associated with this special value.
MAVCOD fully supports the ASN.1 macros:
    mavcod pair.asn1 -umac pair.bis

Running this option on the example quoted above yields the following result:

    pair DEFINITIONS ::= BEGIN
    EXPORTS T1, v1, v2;
    T1 ::= SEQUENCE{
        T1-Type-1 INTEGER,
        T1-Type-2 BOOLEAN}
    v1 T1 ::= { v1-value-1 , v1-value-2 }
    v1-value-1 INTEGER ::= 3
    v1-value-2 BOOLEAN ::= TRUE
    v2 v2-type ::= { v2-value-1 , v2-value-2 }
    v2-type ::= SEQUENCE{
        v2-type-Type-1 INTEGER,
        v2-type-Type-2 T1}
    v2-type-Type-2 ::= T1
    v2-value-1 INTEGER ::= 3
    v2-value-2 v2-type-Type-2 ::= { v2-value-2-value-1 , v2-value-2-value-2 }
    v2-value-2-value-1 INTEGER ::= 4
    v2-value-2-value-2 BOOLEAN ::= FALSE
    END

This usage of the macros corresponds to a full implementation of the ASN.1 language, and does not require any modification to the ASN.1 input. However, the usage of macros in international standards is not limited to the definition of types and values, but may well include the provision of "knowledge". In his "Open Book", Dr. Marshall Rose goes as far as saying that a "disastrous characteristic of the macro facility" is that it has "buried semantics" and that "the macro facility introduces tremendous difficulty for automatic tools." In order to solve this difficult problem, an extension to the MACRO syntax definition has been incorporated in MAVCOD.

    ATTRIBUTE MACRO ::= BEGIN
    TYPE NOTATION ::= Syntax MatchTypes | empty
    VALUE NOTATION ::= value (VALUE OBJECT IDENTIFIER)
    Syntax ::= type(syntax)
    MatchTypes ::= "MATCHES FOR" Matches | empty
    Matches ::= Match Matches | Match
    Match ::= "EQUALITY" | "SUBSTRING" | "ORDERING"
    END

Examples of instantiation of this macro are:

    intatt ATTRIBUTE INTEGER MATCHES FOR EQUALITY ORDERING ::= {syntax 1}
    printatt ATTRIBUTE PrintableString MATCHES FOR EQUALITY SUBSTRING ::= {syntax 2}
    unknown ATTRIBUTE ::= {syntax 3}
    anyatt ATTRIBUTE OCTET STRING ::= {syntax 4}

If we merely state these definitions in an ASN.1 module, the value notations will be expanded, and we will get the following output:

    intatt intatt-type ::= { syntax 1 }
    intatt-type ::= OBJECT IDENTIFIER
    printatt printatt-type ::= { syntax 2 }
    printatt-type ::= OBJECT IDENTIFIER
    unknown unknown-type ::= { syntax 3 }
    unknown-type ::= OBJECT IDENTIFIER
    anyatt anyatt-type ::= { syntax 4 }
    anyatt-type ::= OBJECT IDENTIFIER

This is certainly correct from a "language" point of view, but falls somewhat short of the needs of the application programmer. In order to implement the "soft-typing" facility described above, he would certainly have liked to find something like:

    intatt OBJECT IDENTIFIER ::= {syntax 1 }
    intatt-syntax ::= OPAQUE INTEGER
    printatt OBJECT IDENTIFIER ::= {syntax 2 }
    printatt-syntax ::= OPAQUE PrintableString
    unknown OBJECT IDENTIFIER ::= {syntax 3 }
    anyatt OBJECT IDENTIFIER ::= { syntax 4 }
    anyatt-syntax ::= OPAQUE OCTET STRING

Moreover, he would also have liked to use the knowledge of the macro in order to build "automatically" the table of attributes. This table is probably not only used for the coding and decoding routines, but perhaps also by some parts of the application program, e.g. to check whether this or that comparison operation is allowed for the attribute.
As implied by Dr. Marshall Rose, there is probably no general solution to the problem. In order to solve at least the case of the "most used" macros, MAVCOD provides the following extensions to the ASN.1 macro syntax:

    ATTRIBUTE MACRO ::= BEGIN
    TYPE NOTATION ::= Syntax MatchTypes | empty
    VALUE NOTATION ::= value (VALUE OBJECT IDENTIFIER)
    Syntax ::= type(OPAQUE syntax) < EXPORTS syntax; FILE ::= type(syntax);>
    MatchTypes ::= "MATCHES FOR" Matches | empty
    Matches ::= Match Matches | Match
    Match ::= "EQUALITY" | "SUBSTRING" | "ORDERING"
    END

An example of use of the "OPAQUE" keyword is found in the production rule named "Syntax"; it is followed by an example of the EXPORTS clause within an embedded definition.
The example contains the three allowed forms of the FILE command:
FILE ::= "1";
FILE ::= type(syntax);
FILE ::= value(VALUE)
    intatt intatt-type ::= { syntax 1 }
    intatt-type ::= OBJECT IDENTIFIER
    intatt-type-syntax ::= OPAQUE INTEGER
    printatt printatt-type ::= { syntax 2 }
    printatt-type ::= OBJECT IDENTIFIER
    printatt-type-syntax ::= OPAQUE PrintableString
    unknown unknown-type ::= { syntax 3 }
    unknown-type ::= OBJECT IDENTIFIER
    anyatt anyatt-type ::= { syntax 4 }
    anyatt-type ::= OBJECT IDENTIFIER
    anyatt-type-syntax ::= OPAQUE OCTET STRING

The macro definition file will contain the following lines:

    ATTRIBUTE intatt-type-syntax { 1 3 } intatt
    ATTRIBUTE printatt-type-syntax { 1 2 } printatt
    ATTRIBUTE ? { 1 } unknown
    ATTRIBUTE anyatt-type-syntax { 1 } anyatt

It can easily be parsed by a user-provided program, in order to produce a definition of the attribute table.
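As an illustration of such a user-provided program, the following sketch parses one line of the macro definition file. It assumes that every record has the layout seen in the four lines above (macro name, type name or "?", a non-empty list of numbers between braces, value name); the structure, the field names and the function parse_macro_record are hypothetical, and the meaning of the numbers is left to the application.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_CODES 8

/* One record of the macro definition file (assumed layout). */
struct macro_record {
    char macro[32];        /* macro name, e.g. "ATTRIBUTE"            */
    char type[64];         /* generated type name, or "?" when absent */
    char value[64];        /* name of the value, e.g. "intatt"        */
    int  codes[MAX_CODES]; /* numbers found between the braces        */
    int  ncodes;
};

/* Parse one line; returns 1 on success, 0 if the line does not match. */
int parse_macro_record(const char *line, struct macro_record *r)
{
    char list[64];
    char *p, *end;

    if (sscanf(line, "%31s %63s { %63[^}] } %63s",
               r->macro, r->type, list, r->value) != 4)
        return 0;

    /* Split the text found between the braces into numbers. */
    r->ncodes = 0;
    for (p = list; r->ncodes < MAX_CODES; p = end) {
        long n = strtol(p, &end, 10);
        if (end == p)          /* nothing left to convert */
            break;
        r->codes[r->ncodes++] = (int)n;
    }
    return 1;
}
```

A table generator would call this function on each line read with fgets(), and emit one entry of the attribute table per record, skipping or flagging the records whose type is "?".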
    Annotated ASN.1,
    C type description,
            |
            V
        ________
       | MAVROS |
       |________|
            |
            V
    Coding routines (C),
    Header file,
    Make file.

The "C" output does include coding routines for the ASN.1 "Basic Encoding Rules", as do most ASN.1 compilers. However, the MAVROS compiler also generates other routines:
The "light weight" syntax has been developed in a research project at INRIA. It can be negotiated by knowledgeable presentation layers, and results in a significant reduction of the coding and decoding time between compatible machines, in particular when the syntax is oriented towards "computing" applications, with many integers or floating point numbers.
The programming and debugging support provided by MAVROS is designed to ease the development of complex applications.

    Case-Independant-String ::= CHOICE { PrintableString, T61String }

Together with other features of MAVROS, like the handling of "OPAQUE" types, these facilities can be used to program in "object oriented" style.

    Modem ::= SEQUENCE {
        name GraphicString,
        data ModemData DEFAULT defaultModem }
    ModemData ::= SEQUENCE {
        standard GraphicString,
        bps INTEGER }

MAVROS will enable users to enter values of the type:

    < name = "Modem 1"; data = < standard = "V.21"; bps = 2400 >>

The structure of the text output closely follows the ASN.1 syntax of the data. The text routines are not necessarily adequate for sophisticated user interfaces, but they can be used to quickly develop debugging programs for new applications, following a simple structure:
As a result, the MAVROS compiler is very portable:
The "test" syntax:
The testing program:
The test suite is run:
     _____________________________________________________________
    |us / function |  optim   |    v2     |    v1     |    v0     |
    |______________|__________|___________|___________|___________|
    | cpy          |  264989  |  340819   |  359818   |  386817   |
    |______________|__________|___________|___________|___________|
    | cod          |   51664  |           |           |   62330   |
    | dec          |  187325  |  269989   |  307154   |  291821   |
    |______________|__________|___________|___________|___________|
    | ftcd         |   35998  |           |           |   37998   |
    | ftdc         |  144827  |  221157   |  247656   |  243490   |
    |______________|__________|___________|___________|___________|
    | out          |  233823  |           |           |  221157   |
    | in           | 1481440  | 2539231   | 2576230   | 2480234   |
    |______________|__________|___________|___________|___________|

The test case shown in this table consisted of a batch of 20 "P1" messages, and was run on a SUN 3-60. The times given in the table indicate the number of microseconds necessary to encode or decode these 20 messages. The different columns show the coding or decoding time for successive versions of the compiler, and show a gain in performance of about 30% between the version "v0" of January 91 and the current version.
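As a rough check of the "about 30%" figure, and assuming that the column "optim" corresponds to the current version (the table does not say so explicitly), the relative gains for the "cpy" and "dec" rows work out as:

```latex
\[
\text{gain}_{\text{cpy}} = \frac{386817 - 264989}{386817} \approx 0.315,
\qquad
\text{gain}_{\text{dec}} = \frac{291821 - 187325}{291821} \approx 0.358
\]
```

The gain varies from one row to another, so the 30% figure is best read as an order of magnitude rather than as a uniform improvement.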
The important difference between coding and decoding is explained by: