
The state of the art

High speed operation of networking applications is an important goal for many research projects. As networks move to higher speeds, the performance bottleneck is shifting from the bandwidth of the transmission media to the processing time needed to execute the higher layer protocols. In particular, there is concern that the existing transport and presentation protocols are the most processing-intensive parts of the stack. The existing standard protocols were defined in the seventies, when communication lines had low bandwidth and poor quality, shortcomings that were compensated for by elaborate protocol functions. The emergence of high speed networks such as FDDI and DQDB, and in the near future B-ISDN, has changed the situation for these protocols: they would not be designed the same way today.

At the same time, the application environment is changing. Distributed multimedia applications require high bit rates as well as real time delivery of data, requirements the older protocols were not designed for.

The performance of workstations has increased with the advent of modern RISC architectures, but not at the same pace as network bandwidth in recent years. Furthermore, access to primary memory is costly compared to cache and registers, and the discrepancy between processor and memory performance is expected to get worse. Many workstations today are multiprocessor systems with a small number of processors, yet current protocol implementations do not exploit this potential for parallel processing.

Protocol processing can be divided into two parts: control functions and data manipulation functions. In the data manipulation part, the actual data of a PDU is read from memory, manipulated and possibly written back to memory. Examples of data manipulation functions are presentation encoding, checksumming, encryption and compression. The control part comprises header and connection state processing. Jacobson et al. have demonstrated that, with appropriate implementations, the control part can match gigabit network performance for the most common PDU sizes [2]. The data manipulation functions, however, remain a bottleneck [3], [4]. They consist of two or three phases: a read phase, a manipulation phase and, for some functions, a write-back phase. For very simple functions, such as checksumming and byte alignment, the time to read and write memory dominates the processing time. For others, like encryption and some presentation encodings, the manipulation time dominates.
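As an illustration of a read-dominated data manipulation function, the following sketch computes a 16-bit one's-complement sum in the style of the Internet checksum. The function name and the per-word carry folding are our own illustrative choices, not taken from the cited work.

#include <stdint.h>
#include <stddef.h>

/* Minimal sketch of a memory-bound data manipulation function: a 16-bit
 * one's-complement sum over a PDU.  Each word is read from memory exactly
 * once and only a few cheap ALU operations follow, so on a RISC
 * workstation the memory reads dominate the processing time.            */
static uint16_t ones_complement_sum(const uint16_t *data, size_t nwords)
{
    uint32_t sum = 0;

    while (nwords-- > 0) {
        sum += *data++;                     /* read phase: one load per word   */
        sum = (sum & 0xffff) + (sum >> 16); /* manipulation phase: fold carry  */
    }
    return (uint16_t)~sum;                  /* no write-back phase is needed   */
}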

The data manipulation functions are spread over different layers. In a naive implementation of a protocol suite, the layers are mapped onto distinct software or hardware entities that behave as atomic units: the functions of each layer are carried out completely before the protocol data unit is passed to the next layer. This means that each layer has to be optimized separately. Such ordering constraints conflict with efficient implementation of the data manipulation functions [46].

One could blame the OSI hierarchical layered model for this conflict. However, it is important to distinguish between the architecture of a protocol suite and its implementation in a specific end system or relay node. Nothing a priori requires the implementation to follow the architectural decomposition. An engineering principle called Integrated Layer Processing (ILP) [3], [4] has been suggested to address this problem. By integration we mean that several layers are implemented within the same module; this does not mean that the implementation is unstructured. The basic idea behind ILP is to perform all the manipulation steps in one or two processing loops, instead of performing them serially as is most often done today. By one loop we mean one read from memory, followed by all manipulations and one write back. Time-consuming memory references are thereby reduced. To facilitate ILP, a protocol architecture should be organized so that the interactions between the control and data manipulation functions do not interfere with their integration.
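The following C sketch contrasts the two styles for two simple data manipulation steps, byte-order conversion and a one's-complement sum. The choice of functions and their names are assumptions made for illustration only, not a fragment of any of the cited implementations.

#include <stdint.h>
#include <stddef.h>

/* Layered style: each function makes its own pass over the PDU, so the
 * data is traversed, and brought through the memory system, twice.      */
static uint32_t serial(uint16_t *dst, const uint16_t *src, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)                  /* pass 1: convert   */
        dst[i] = (uint16_t)((src[i] >> 8) | (src[i] << 8));
    for (size_t i = 0; i < n; i++)                  /* pass 2: checksum  */
        sum += dst[i];
    return sum;
}

/* ILP style: one loop performs one read, all manipulation steps and one
 * write back, reducing the number of memory references per word.        */
static uint32_t integrated(uint16_t *dst, const uint16_t *src, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        uint16_t w = src[i];                        /* single read          */
        w = (uint16_t)((w >> 8) | (w << 8));        /* manipulation step 1  */
        sum += w;                                   /* manipulation step 2  */
        dst[i] = w;                                 /* single write back    */
    }
    return sum;
}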

The layered protocol architecture may unnecessarily reduce the engineering alternatives available to an implementor. Clark and Tennenhouse have proposed Application Level Framing (ALF) as a key architectural principle for the design of a new generation of protocols [3]. According to this principle, applications should break their data into frames, or Application Data Units (ADUs), that are meaningful to the application. It is also desirable that the presentation and transport layers preserve the frame boundaries as they process the data. This is in line with the widespread view that multiplexing of application data streams should be done only once in the protocol suite. The sending and receiving applications should define what data goes into an ADU so that ADUs can be processed out of order. The ADU then becomes the unit of data manipulation, which simplifies the processing. Thus, ALF supports the ILP principle.
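As a purely illustrative sketch of application level framing in code, the structure below names the information an ADU header could carry so that a receiver can place and process each ADU independently of arrival order. All field names and sizes are hypothetical and are not drawn from [3].

#include <stdint.h>

/* Hypothetical ADU header in the ALF style: the application chooses the
 * framing, and the header identifies the ADU within the application
 * object so the receiver can handle it out of order.                    */
struct adu_header {
    uint32_t object_id;   /* application object this ADU belongs to        */
    uint32_t adu_number;  /* position of the ADU within the object         */
    uint32_t adu_length;  /* length of the application data that follows   */
};

/* With such self-describing framing the receiver can compute, from the
 * header alone, where the data belongs and hand it to the application
 * immediately, whatever the arrival order.  (Assumes fixed-size ADUs
 * except possibly the last one; again purely illustrative.)              */
static uint64_t adu_byte_offset(const struct adu_header *h, uint32_t adu_size)
{
    return (uint64_t)h->adu_number * adu_size;
}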

ALF and ILP were proposed in 1990 and very few implementations built according to these concepts have been reported. Previous work, including our own (INRIA and SICS) [4] [7] [13], has demonstrated that ILP yields a performance benefit. The results show substantially reduced processing time per PDU, by as much as a factor of five when six simple data manipulation functions were integrated. These results came from experiments that were isolated from the rest of the protocol stack, and hand-coded assembler routines were used in order to control register allocation and cache behavior. Abbott and Peterson [6] use a language-based approach to integrate functions, with a smaller speed-up. In [13] only two functions are integrated, data copying and checksum calculation, but it is a real, operational implementation of UDP. Experience with an implementation of the XTP protocol from the OSI95 project [7] showed significant performance improvements when ILP was used. This implementation runs in user address space, which also demonstrates that such implementations can perform as well as tuned kernel implementations. In this project we will go one step further and integrate several of the data manipulation functions in a complete, operational stack in order to understand the architectural implications and the achievable speedup.

For the encoding/decoding data manipulation function, our research work (INRIA) has centered on the optimization of the ASN.1 Basic Encoding Rules (BER). The cost of the coding and decoding routines is attributed to the heavy Type-Length-Value oriented coding of ASN.1 BER. This motivated the work on ``lightweight'', XDR-like transfer syntaxes [48] based on three design principles:

In parallel, the BER implementation functions were optimized, resulting in a drastic improvement in the speed of the coding and decoding routines on high performance RISC workstations. However, even if BER encoding can be implemented at reasonable cost on high performance workstations, the need for a global architecture for the overall communication system is a more general problem that still needs to be addressed. An optimized version of the encoding and decoding functions should lead to implementing the presentation layer as a filter, in order to achieve ``streamlined'' encoding and transmission in networked applications. INRIA contributed to OSI-95 by building an enhanced version of the MAVROS compiler with improved performance of the generated coding and decoding routines.
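To make the Type-Length-Value overhead mentioned above concrete, the following sketch encodes a small non-negative INTEGER in BER definite short form: even this trivial case needs per-field tag and length bookkeeping and data-dependent branching. It is a minimal illustration under those restrictions, not a fragment of MAVROS or of any production encoder.

#include <stdint.h>
#include <stddef.h>

/* Encode a small non-negative INTEGER as a BER Type-Length-Value triple.
 * Handles only the definite short form; returns the encoded length.     */
static size_t ber_encode_small_int(uint8_t *out, uint32_t value)
{
    uint8_t content[5];
    size_t len = 0;

    /* Build the content octets, least significant first.                */
    do {
        content[len++] = (uint8_t)(value & 0xff);
        value >>= 8;
    } while (value != 0);

    /* A leading 1 bit would flag a negative number, so pad with 0x00.   */
    if (content[len - 1] & 0x80)
        content[len++] = 0x00;

    out[0] = 0x02;                          /* T: universal tag INTEGER  */
    out[1] = (uint8_t)len;                  /* L: short-form length      */
    for (size_t i = 0; i < len; i++)        /* V: content, MSB first     */
        out[2 + i] = content[len - 1 - i];
    return 2 + len;
}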

Many Formal Description Techniques (FDTs) have been developed to describe protocols. Most of these techniques are based on communicating extended state machines, Petri nets and their extensions [37], or on the CCS [43], CSP [41] and ACP [24] calculi. Two techniques, Estelle [29][40] and LOTOS [28][42], have been developed within the International Organization for Standardization and are both International Standards. In addition, languages not specifically designed for formal description are now used in this context, for example Esterel [18], VHDL [17], and Ada [16].

Taking into account the maturity of these FDTs as well as the availability of several software platforms to support them, it is possible to apply some of these techniques to the formal specification and verification of the protocol architecture.

The increasing prevalence and importance of distributed systems has placed great stress on the reliability and performance of application software for distributed environments. Reliability is often taken to mean fault tolerance, but in modern distributed systems it can cover a number of aspects of system behavior, such as correctness, the ability to satisfy real-time deadlines, the ability to reconfigure automatically as components come online and go offline, load balancing, security and trust, and even ease of maintenance. Unfortunately, most existing distributed software is far from reliable in any of these respects. What is lacking is an integrated, scientific approach to achieving the kinds of reliability guarantees in distributed systems that we routinely expect in other areas of engineering.

Distributed applications, with their decentralization of computational activities, need software structuring tools oriented towards groups of participating entities. Several research projects have demonstrated how building blocks for group activity can be provided through non-traditional operating system layers. Since group programming was poorly understood when distributed systems first emerged, it is not surprising that the designers of such layers failed to anticipate the structure that group-programming tools would take. Nonetheless, design decisions made over the past decades have proved to be obstacles to group programming over the standard layers. One assumption taken for granted was that the communication support was slow, error prone and subject to partitions. This project offers an opportunity to revisit that scenario in anticipation of the technology changes that next generation high performance networking will bring, and, in essence, to study mechanisms, paradigms and protocols that allow the networking infrastructure to provide adequate support for groups.


