Project Description

1. Technical and scientific description of the activities

1.1 Rationale.

Informatics is presently engaged in a major evolution which could be called the globalization of parallelism. Parallelism is potentially everywhere. A full spectrum is covered: at one extreme, parallelism is present in the deep structure of processors (multicore architectures). At the other extreme, parallelism is present in the Web, for instance in games involving many remote players. In between these two extremes, parallelism is present in the machines themselves (mutiprocessors), as well as in networks or interconnected machines (distributed and grid computing). The second major evolution informatics is faced with is related to the Web. The Web is potentially everywhere. With its latest evolutions (increase of the network bandwidth, wide accessibility of the network, ubiquitous Web browsers) the Web constitutes a new infrastructure upon which new innovative applications are likely to be built. These will execute in parallel and on various heterogeneous computers and electronic devices, such as PC's, PDA's and mobile phones. Programming, which is situated at the heart of informatics, has to adapt to these two major evolutions. The challenge is clearly identified and is a real one: Intel announces that the standard processor, presently composed of two to four cores, will reach 80 cores within five years. Telephone sets are planned to contain one hundred cores in 2015. At the opposite extreme of the spectrum, programming the Web requires one to tame the heterogeneity of the various devices as well as the synchronization of the communications. The goal of the project PARTOUT is, from a programming language point of view, to study the impact on programming of the globalization of parallelism which now covers all the spectrum of informatics, ranging from multicore architectures, through multiprocessors and distributed systems, up to applications deployed on the Web. In summary, the issue is: How to program the underlying parallelism present as well in multicores as in the Web? Our opinion is that the software solutions we propose will be fruitful for concurrent programming techniques, for parallel and distributed system techniques, as well as for Web programming.

1.2 Background, state of the art, issues and hypotheses.

Multicore architectures are particular cases of multiprocessor machines which have existed for many years but which were previously devoted to the domain of intensive computing (scientific computing, simulations). Tools for the deployment of distributed applications have also existed for some time (RPC, CORBA, Java). Tools have also been proposed for the deployment of grids made of large numbers of machines. Concerning the programming of multicore machines, most of the existing proposals are based on sequential code parallelization techniques, with the main objective to process arrays in parallel. The standard approach is thus to parallelize sequential code in the most possible automatized way. It is however clear that many common applications do not fit in this frame and thus cannot fully benefit from multicore architectures. On the other hand, concurrent programming appears more and more as a mandatory tool for modularity, and in this framework, the issue is raised of linking the concurrency defined at user-level with the parallelism offered by the underlying architecture. Thread-based programming languages (Pthreads, Java) most often use preemptive scheduling techniques which require the use of low-level constructs for protecting data (locks). Concurrent programming based on the notion of process (CML, Erlang) proposes, on the other hand, asynchronous communication means based on message passing techniques. In all cases, asynchrony is a source of complexity at the semantic level. Standard parallel programming techniques raise deep issues at the semantic level. Synchronous languages (e.g. Esterel, Lustre) are a response to these problems in the static case, that is, when parallel components cannot be added during program execution. The Reactive Programming approach extends the paradigm of synchronous languages to the dynamic case (technically, by forbidding immediate reaction to the absence of signals). The proposed project is firmly based on this approach. Parallel programming (as well as concurrent programming and distributed programming) raises major safety and security issues. Here, safety refers to the absence of run-time errors, while security means mainly information flow security (confidentiality, integrity of data). Actually, parallelism raises safety/security issues at all levels. For multicore machines specifically, the programming language should provide high-level means to control accesses to the shared memory, in order to avoid violations of atomicity (data-races). In distributed programming, data have to be protected during their transmission, and it should be possible to control the way they are received and propagated on remote sites (access and flow control). Security concerns are particularly clear concerning the Web (leading for example to solutions consisting in limiting browser capacities). Parallelism is often presented as a miracle solution to increase the efficiency of applications. At the programming level however, many important issues have to be solved to be able to fully benefit from parallelism. In particular, the issue of determining what is the correct level of parallelism granularity needed for a specific application. The issue of the communication between parallel components is also raised: too much communication can cancel the benefit of parallel execution. Actually, the present technologies for parallel programming are far from being the panacea, in particular at the efficiency level, and their use remains complex. In this context, the main objective of the project is to design tools for safe and efficient parallel programming, adapted to multicore architectures, to multiprocessors, to distributed architectures, and to the Web.

1.3 Specific aims of the proposal, highlighting the originality and the novelty

The scientific focus of the project is on foundations of parallel programming, with an emphasis on three aspects: language expressivity, efficiency of implementations, and safety. The goal of this basic research project is to produce results on each of these three aspects, and to get a better understanding of the links which exist between them. The main expected scientific outcome of the project is one or several prototypes of concurrent programming languages, based on a formal definition and leading to a safe and efficient programming of multicore, multiprocessor, and distributed architectures. The choices characterising the project are: non-preemptive concurrent programming based on the synchronous-reactive paradigm; formal approach of the semantics; efficient compiling techniques; mechanisms for programming dynamic systems (as opposed to static systems); static analyses to insure strong safety properties and resource control. There exist at present very few research projects covering all the themes considered in our proposal: non-preemptive parallelism, multicore, static analyses for safety, memory separation, resource control, and Web programming. Work centered on the ML language (in particular, the one related to Caml), and aiming at extending it to parallelism, is generally based on standard preemptive concurrency and does not cover multicore aspects. Work centered on the parallelization of sequential code is generally not aiming at the design of parallel languages, but instead proposes to extend standard sequential languages with pragmas or primitives dedicated to parallel array processing. Moreover, the application domain is restricted to algorithms that can be « naturally » parallelized. Work on Java is quasi-exclusively based on the standard Java preemptive multitasking. The originality of our project is to target at the same time parallelism, safety, efficient implementation techniques, multicore, and the Web.

1.4 Progress beyond the state of the art and relevance to the call for proposals

The project falls in the Axis 1 « Algorithmes, Langages, Architectures », more specifically in the part « Langages et modèles », of the Request for Proposals « Domaines Emergents », by ANR. Compared to standard techniques based on preemptive parallelism (even in the distributed case), our project promotes an alternative approach, efficient and semantically founded, based on synchronous parallelism. The transaction-based approach (transactional memory) offers an alternative to the use of locks in preemptive programming. It is however not very appropriate when the concurrent entities are highly communicating. The transactional approach seems thus to offer only a partial solution to the issues raised by concurrency and parallelism. We propose (Section 1.5, theme 1) to compare, at a deeper level, our approach with the transactional one. Compared to the work on Synchronous languages, our projet puts the focus on dynamicity of programming and on the use of multicore architectures. As regards the safety and security issues, our project highlights static techniques, especially types systems, allowing run-time checks to be avoided. Previous work on reactive programming aimed at responding to the question: How to conceive modular code based on a parallel paradigm, and how to execute it efficiently on mono-CPU architectures? Today, the same question is raised in a multithreaded, multicore, multiprocessor, and distributed context.

1.5 Detailed description of the work.

The main themes of the project are the following: Efficient programming; Distributed programming; Safe programming; Dynamic aspects. Let us consider these themes in turn through a list of related questions.

Theme 1: Efficient Programming.

1. How to use multicore machines efficiently, in the context of a unique application? What are the syntactic primitives to propose? The issue is to define a syntax and a semantics for primitives especially designed to match the architectural orientations that are emerging today at the hardware level (hyper-threading, multicore, multiprocessor, etc.), in order to optimise the use of these new and heterogeneous resources (processors, DSP, GPU, etc.). The notion of scheduler in FunLoft is a first response to this question. In this context, an interesting question will be to compare the synchronous-reactive approach with an approach based on transactions.

2. How can the presence of instants (issued from the synchronous-reactive approach) gives a way to simplify the implementation of an automatic memory garbage collector (GC) in a multiprocessing (in a broad sense) framework?

3. How could just-in-time compiling techniques be used in order to improve efficiency in the context of reactive systems? Usually, a compiler tries to statically optimise the code through partial evaluation techniques and it does so at an early stage, when the execution context is not completely known. In many cases, delaying this process until execution allows better optimisations, especially for dynamic and reconfigurable systems. In the case of reactive formalisms, one can build efficient mechanisms to optimise code at run-time, based on the knowledge of the signals handled by the system at each instant. One can imagine annotating reactive code in order to help a just-in-time compiler to find deeper optimisations.

Theme 2: Distributed Programming.

1. How should distributed activities, expressed in a formalism based on synchronous parallelism, be synchronised? This point is especially important for the Web (synchronisation of distributed HOP servers and for networked multi-players games. We shall start from proposals existing in FunLoft (synchronised schedulers) and in SugarCubes (distributed reactive machines).

2. How should the synchronous/asynchronous interaction, which is unavoidable in a distributed context, be taken in account? We shall start from two existing proposals: FairThreads which is the model on which FunLoft is based, and the multi-clocks model recently introduced in SugarCubes.

Theme 3: Safe Programming.

1. How to insure a programming without data-races which preserves the natural atomicity of user programs within a multicore and distributed framework? The starting point is the FunLoft proposal and we shall study the way it can be extended to more general cases.

2. How can we insure the absence of information leaks related to parallelism or distribution (non-interference problems), for systems with multiple security levels? How could it be guaranteed that sensitive data are not corrupted or disclosed during the interaction between a program and the site which executes it?

3. How to conciliate run-time migration with the preservation of strong safety/security properties?

Theme 4: Dynamic Aspects.

1. How to introduce dynamic aspects (for example, scripts) in the programming frameworks we consider (at least in those which do not have any of them)? We shall start from the initial proposal of SugarCubes of reactive scripts, and from a more recent one in ReactiveML.

Tasks and sub-tasks will be the following:

Task 1: Language Design

Task 1.1 A first sub-task is concerned by programming primitives. Starting from the primitives that are proposed by FunLoft and SugarCubes, we shall study extensions in several directions (distribution, resource control, safety, scripting, migration). In order to take into account dynamic aspects, we will study particular forms of programming primitives, based on dynamic linking. For example, when an autonomous component (an agent) arrives on a remote site, it should have the capacity to link to some resources available on the site. The presence of instants and the existence of broadcast signals are certainly an advantage for that purpose. A proposal of a mechanism for dynamic linking based on the use of hashtables made previously in SugarCubes will be the starting point for this study. We will also consider the work by G. Boudol on ULM [B04]. One way to get efficient optimisations is to delay the linking to resources (memory cells, communication channels, etc.) until either the program deployment, or the loading in the system, or the activation by a thread. We also plan to study the application of these techniques.

 FunLoft is a new language developed at INRIA/Mimosa and a first experimental version is available. Our objective is the design of a more realistic language, obtained by improving several aspects of the present version, as for example error messages (this is a crucial question for a language using type inference, in presence of effects). We will also address the question of dynamic creation of schedulers (which is presently impossible), using techniques based on regions in a way similar to the one used for threads. The issue of resource control is also raised. The ideal would be to be able to compute explicit and realistic bounds, but this is a very difficult problem and we are far from having a solution for it. The extension of FunLoft to distribution is also a goal. In this context, the question is to define safe marschalling/unmarshalling techniques. The primitive migration proposed by FunLoft should be extended to the network, in a way inspired from SugarCubes. Dynamic linking mechanisms should be proposed for that purpose. Communication through valued broadcast signals, which is at the basis of the FairThreads model, should facilitate the way this problem could be solved. One starting point is the Acute language [SLWNAHV05]. The proposal of synchronised schedulers is to be analysed in more detail. It has to be tested on standard examples and compared with other techniques of synchronisation through the network.

 ReactiveML is developed at LRI. Our goal is to extend the language with new constructs to program applications distributed over several machines. In particular, we want to study how to mix the reactive synchronous model with an asynchronous model as the join-calculus. In this context, several questions arise. For example, we will have to understand which kind of values can be emitted on an asynchronous communication channel. In ReactiveML, channels are first class values. Hence, synchronous channels can a priori be emitted on an asynchronous one. If such communications are allowed, it would be possible to create « synchronous » communications between asynchronous sites. If we want to reject such programs or to apply particular treatments to synchronous channels in this case, we will have to design new static analyses.

 SugarCubes (in the Lite version) is presently developed by CNAM/CEDRIC. One goal of the project is to extend SugarCubes to a complete programming language, which means to give it the ability to represent data and their manipulations (and not to push away the issue to Java, as it is the case presently). This work should continue the one on the model of reactive objects, called Cubes. The benefit from designing not only a Domain Specific Language (DSL) but a full language is to open the way to the control of atomic instructions in order to insure some critical properties (termination, bounds on the execution time). The DSL approach used up to now does not offer these guarantees in Java. Similar ideas are considered in FunLoft and ReactiveML. One interesting question is also to compare the reactive-synchronous approach to the one based on transactions (as it appears for example in the Work of Martin Abadi et al. [ABHI08] on the AME model). As far as possible, we will formalise the various aspects considered (for example, dynamic linking). For that purpose, the main techniques that we envisage to use are operational semantics and static analyses based on type systems.

Task 1.2 We plan to study the question of information flow security in the context of the new languages proposed in the project, and to extend to these languages the results already obtained for more classical concurrent formalisms in [MBC07,C07]. The question, which arises in systems with multiple security levels, is that of guaranteeing the confidentiality and the integrity of sensitive data, while allowing them to be used by programs during their execution. It is well-known that access control is not sufficient for this purpose, since it does not prevent the propagation of sensitive data once they have been accessed by authorised parties. The issue of secure information flow becomes all the more crucial in a global network such as the Web, where computations are by essence distributed and mobile, and in general no « trust relation » may be assumed between a program and its executing environment, nor between concurrent programs accessing common data. We shall adopt here the so-called « language-based security » approach [SM03], which consists in restricting the security analysis to programs of a given language, and to adapt well established static analysis techniques, such as type (and effect) systems, to ensure the security properties of interest. Outcomes of the two sub-tasks Task 1.1 and Task 1.2 should be reused in the development of our initial languages.

Task 2: Implementations

Members of the project have a strong experience in software development. In particular, they have been committed to developing open-source softwares for many years. Each of these softwares differs from the others. FunLoft (F. Boussinot of the Inria Mimosa team, with F. Dabrowski) is a new language designed for programming concurrent, secure, and safe programs. SugarCubes (J-F. Susini of the CNAM) is a Java API designed for programming reactive animations and games. HOP (M. Serrano of the Inria Mimosa team) is a new SDK for programming interactive applications on the Web. ReactiveML (L. Mandel and M. Pouzet of LRI) is a ML-based language which has been used in several contexts such as video games and simulators. These four platforms are complementary. We associate a sub-task with each of them. These four sub-tasks are however strongly connected. We are basically studying how the various platforms can be blended for either improving their implementation or proposing new features and facilities. For instance, the FunLoft compiler could be retargeted to SugarCubes for re-using the reactive constructions proposed by the latter. SugarCubes could be integrated in the HOP platform allowing distributed and reactive programs to roam and execute over the Web.

Task 2.1 The sub-task on FunLoft will study the possibility to use SugarCubes as a target implementation language. An advantage of such an approach would be to give a direct access to the GC of Java. Another advantage would be to reuse the static controls of FunLoft in SugarCubes/Java. This approach would also make the distribution layer of SugarCubes, in particular the distributed reactive machines, usable for implementing the extension of FunLoft to distribution.

Task 2.2 The sub-task on ReactiveML will start with a new implementation of the language to make it fully compatible with Objective Caml. This new version will provide objects, functors, labels, etc. Then, two main topics will be considered:

● The first topic is the implementation of ReactiveML on parallel architectures. This point is both about (1) parallel execution of a ReactiveML program on a multi-core architecture and (2) the programming of distributed applications communicating through asynchronous channels. To achieve this last concern, links with the JoCaml language [LCL02] will be specially interesting to study.

● The second topic is the implementation of static analyses developed in the project, as that insuring absence of data-races. It will be challenging to integrate such analyses in the context of an ML language with polymorphism, higher order, etc. Finally, for the dissemination of synchronous reactive programming, it seems important to provide tools to understand and to design programs. Creating a debugger is an effort to supply in this way. An interactive mode which allows a program to be executed step by step has already been implemented. This mode provides a way to define semantic break points rather than syntactic ones, by means of synchronous observers. Some work needs still to be done to transform this interactive mode into a real debugger.

Task 2.3 The sub-task on SchemeBigloo will study the extension of the Bigloo threading model (a limited form of FairThreads) according to the outcomes of Task 1, and according to the needs issued from sub-task Task 3.1 concerning HOP.

Task 2.4 The sub-task on SugarCubes will first consider the issue of memory management. Several problems of a GC (triggering threshold, depth of exploration, etc.) can find an original solution in the reactive-synchronous approach, through the notion of instant. We plan to address these problems, in the perspective of a distributed GC. A second point is related to the reactive engine (reactive virtual machine, RVM), with the objective to give a very flexible means to execute reactive programs, and to dynamically add new programs, with a reduced cost in terms of CPU and memory footprint. Can SugarCubes, or more precisely RVM, be considered as a privileged execution platform for higher-level languages such as FunLoft or ReactiveML? This is another question that we would like to consider. RVM is a good means to study the delayed linking mechanisms that could be used in conjunction with the techniques of just-in-time compilation and of dynamic linking. The work on dynamic specialization should be considered in this context. The third point consists in searching for a deeper insight of the relations between the synchronous parallel model and the hardware parallelism (hyperthreading, multicore, etc.). The idea is, as in FunLoft, to propose solutions to minimize the use of the synchronization mechanisms available in asynchronous models (locks, monitors, etc.). The goal is to generate code which could fully exploit the various granularity levels of the underlying physical parallelism (pipelining, superscalar architectures, multicore, etc.). A possible way to do that is to exploit the structure of the high level programs. The objective is to optimize the cost of the concurrent accesses to the shared resources (CPU, memory). We also plan to consider the impact of these optimisations on the electric consumption.

Task 3: Applications

Two types of applications will be considered: distributed Web servers and networked games.

Task 3.1 The servers we are going to consider are HOP servers. We will study the way to program and implement the synchronisation and the communication of several HOP servers distributed on the Web. Indeed, the new forthcoming Web applications will need the simultaneous use of several peripherals, each of them embedding a Web server. It will then be a task of the global application to insure the correct synchronisation and communication of these embedded servers. The implementation of such systems should a priori use the outcomes of Task 2, concerning the full implementation of the FairThreads model in SchemeBigloo.

Task 3.2 The second type of applications that we plan to study are networked games. Several partial results have been obtained in the past, using SugarCubes, in the european project PING, led by France-Telecom. Our objective is to continue to explore this track, in particular with the security issue in mind. We intend to apply the project outcomes to game consoles, more specifically in the domain of game engines. Another target are mobile telephones in which a form of heterogeneous multiprocessing is presently emerging, which needs the cooperation of several very different processing units (example: video cards, FPGA, processors on telephone sets, etc.)

Task 4: Dissemination

Task 4 A special effort will be made to broadly disseminate the project outcomes. The existence of a specific task devoted to this topic is justified by the importance we give to this aspect. All partners should be involved in this task. In this task, we plan in particular to release new versions of our software products, under a GNU GPL license, as it is already the case for FunLoft, HOP, ReactiveML, SchemeBigloo, and SugarCubes.

1.6 Expected results and potential impact.

The main expected outcome of the project is a deeper comprehension of issues raised by concurrent, parallel, and distributed programming, in particular when applied to the Web. We hope to be able to show that the reactive-synchronous approach is a good framework for expressing parallelism, leading to a safe and flexible programming, as a consequence of its capacity to deal with dynamic aspects. Finally, we hope to show that our approach is extremely efficient, in particular because it allows programmers to fully benefit from parallel architectures, especially from multicore ones. The results we expect to obtain in the domain of software production are multiple. The goal is to design a general programming framework in which:

1. The user may program in a natural, concurrent, and dynamic way, based on a clear and precise semantics;

2. Static analyses entail strong properties, such as the absence of data-races;

3. Applications can be efficiently deployed, especially on multicore machines.

Partner References

[B05] Boussinot, F. -- FairThreads: Mixing Cooperative and Preemptive Threads in C -- Concurrency and Computation: Practice and Experience, 18, 2005, pp. 445-469.

[BD08] Boussinot, F. and Dabrowski, F. -- Safe Reactive Programming: the FunLoft Proposal -- Proc. of MULTIPROG -- First Workshop on Programmability Issues for Multi- Core Computers. Göteborg, January, 2008.

[BM05] Farid Benbadis, L. Mandel -- Simulation of Mobile Ad hoc Network Protocols in ReactiveML -- Synchronous Languages, Applications, and Programming, 2005.

[BS00] Boussinot, F. and Susini, J-F. -- Java Threads and SugarCubes -- Software Practice and Experience, 30(14), 2000, pp. 545—566.

[C07] Castellani, I. -- State-oriented noninterference for CCS -- Proc. SecCo'07, ENTCS 194(1): 39-60, 2007.

[MBC07] A. Almeida Matos, G. Boudol and I. Castellani -- Typing noninterference for reactive programs -- Journal of Logic and Algebraic Programming 72: 124-156, 2007.

[MP05] Mandel, L. and Pouzet, M. -- ReactiveML: a reactive extension to ML -- Proceedings of the 7th ACM SIGPLAN international Conference on Principles and Practice of Declarative Programming (Lisbon, Portugal), 2005.

[P06] Pouzet, M. -- Lucid Synchrone, version 3. Tutorial and reference manual -- Université Paris-Sud, LRI, April 2006. Distribution available at: www.lri.fr/~pouzet/lucid-synchrone.

[S06] Susini, J. -- The reactive programming approach on top of Java/J2ME -- Proceedings of the 4th international Workshop on Java Technologies For Real-Time and Embedded Systems (Paris, France), 2006.

[S07] Serrano, M. -- Programming Web Multimedia Applications with Hop -- Proceedings of the ACM Sigmm and ACM Siggraph conference on Multimedia, Best Open Source Software, Augsburg, Germany, Sep, 2007.

[SBS04] Serrano, M. and Boussinot, F. and Serpette, B. -- Scheme Fair Threads -- 6th sigplan International Conference on Principles and Practice of Declarative Programming (PPDP), Verona, Italy, Aug, 2004, pp. 203—214.

[UCP07] P. Urien, H. Chabanne, C. Pépin, M. Bouet, D. de Oliveira Cunha, V. Guyot, G. Pujolle, P. Paradinas, E. Gressier et J.F. Susini -- HIP-based RFID Networking Architecture -- International Conference on Wireless and Optical Communications Networks, 2007.

Other References

[ABHI08] M. Abadi, A. Birrell, T. Harris, and Mi. Isard -- Semantics of Transactional Memory and Automatic Mutual Exclusion -- POPL, pages 63–74, 2008.

[B04] Boudol, G. -- ULM, a core programming model for global computing --, Proceedings of the ESOP'04 Conference, Lecture Notes in Computer Science (LNCS) Vol 2986, 2004.

[D07] F. Dabrowski. -- Programmation Reactive Synchrone, Langage et Controle des Ressources -- PhD, University Paris 7, Department of Computer Science, 2007.

[LCL02] F. Le Fessant, C. Fournet, and L. Maranget -- JoCaml: a Language for Concurrent Distributed and Mobile Programming -- Proceedings of the 4th Summer School on Advanced Functional Programming, Oxford, 19-24, August 2002.

[SLWNAHV05] Sewell, P., Leifer, J. J., Wansbrough, K., Nardelli, F. Z., Allen-Williams, M., Habouzit, P., and Vafeiadis, V. 2005. -- Acute: high-level programming language design for distributed computation -- In Proceedings of the Tenth ACM SIGPLAN international Conference on Functional Programming (Tallinn, Estonia), 2005.

[SM03] A. Sabelfeld and A. Myers. -- Language-based information flow security -- IEEE Journal on Selected Areas in Communications, 21(1), 2003.