7. Implementations

Sections

7.1 Implementation in FairThreads
  7.1.1 Translation in FairThreads
  7.1.2 Modules
  7.1.3 Efficiency
7.2 Implementation of REWRITE
  7.2.1 Instructions
  7.2.2 Modules
  7.2.3 Threads
  7.2.4 Schedulers
7.3 Implementation of REPLACE
  7.3.1 Cooperate
  7.3.2 Sequence
  7.3.3 While
7.4 Implementation of OPTIM
  7.4.1 Await
  7.4.2 Limited Await
  7.4.3 Awake
7.5 Direct Implementation
  7.5.1 Event Queues
  7.5.2 Translation in C
  7.5.3 Main Function
  7.5.4 Access to the Next Thread
7.6 Conclusion

One considers now the question of implementing Loft. Actually, five implementations are described:
  • The basic implementation translates Loft programs into FairThreads. It is a complete implementation which can deal with non-native modules as well as with native ones.
  • Three implementations correspond to the previous Rewrite, Replace and Optim semantics.
  • Finally, the last implementation is suited for embedded systems. Actually, it can be seen as a mix of Optim with the automata-based part of the translation in FairThreads.
7.1 Implementation in FairThreads

The basic implementation of Loft uses FairThreads in C, which is very close to it (let us recall that Loft means Language Over Fair Threads) and which was introduced in FairThreads Overview. FairThreads is supposed to be executed in the context of a preemptive OS. In this context, unlinked fair threads are run by dedicated native threads, at the kernel level. This is exactly what is needed to implement the native modules of Loft. The implementation described in this section is complete, as opposed to the implementations considered later, which deal only with the cooperative part of Loft.

7.1.1 Translation in FairThreads

The implementation of Loft basically consists in a translation of modules into FairThreads. The native modules are translated into standard fair threads, while the non-native ones are translated into automata. The translation is rather direct:
  • Types event_t, scheduler_t, thread_t of Loft are directly mapped to the corresponding types ft_event_t, ft_scheduler_t, ft_thread_t of FairThreads.
  • Orders of Loft are directly translated into corresponding orders of FairThreads. For example, stop (t) is translated into ft_scheduler_stop (t).
  • The generate instruction of Loft translates into ft_thread_generate or into ft_scheduler_broadcast, depending on the linking of the executing thread.
  • The instruction cooperate is translated into ft_thread_cooperate when a native module is considered, and into GOTO_NEXT when it is not native.
  • Instruction await of Loft translates into ft_thread_await or into STATE_AWAIT. Other non-atomic instructions are translated into their equivalent statements of FairThreads.
However, some points are specific to the implementation. They basically concern schedulers, deallocation, and the main module.
  • Schedulers are always run as standard autonomous pthreads, using the function ft_scheduler_start. Actually, the scheduler_react function of Loft becomes useless and is not implemented.
  • The main module must always be defined. Thus, the function loft_initialize becomes useless.
  • Deallocation functions are not implemented. It is left to the programmer to define means to deallocate objects created in the heap.

7.1.2 Modules

Each instance of a module is mapped onto a fair thread of type ft_thread_t. This thread is either a standard fair thread in case of a native module, or an automaton in case of a non-native one. Thus, only instances of non-native modules are run autonomously as standard pthreads. The translation will be described through the example of a module. The example considered is the following module that waits for the event go and then prints a message 100 times:

module native m (char *s)
local int i;
await (go);
{local(i) = 0;}
repeat (100) do
   printf ("instant %d: %s\n",local(i)++,local(s));
   cooperate;
end
exit (0);
end module
One first describes the translation of m, then one considers the changes when m is a non-native module. In both cases, the module translates into a data structure for local variables, a function that describes the module behavior, and two functions to create new instances of the module.

7.1.2.1 Native Modules

The translation of the module m first defines the datatype structure m_data_ to store the parameters and local variables of m.

typedef struct m_data_
{
   char* s;
   int i;
} *m_data_;
Second, a function m_finalize is defined to process the finalizing atom, which in this case is absent.

void m_finalize (void *_args)
{
   m_data_ _local_vars = _args;
}
Third, a function m_fun is built directly from the module body (one has manually indented the code in order to make it more readable).

void m_fun (void *_args)
{
   thread_t _self = ft_thread_self ();
   m_data_ _local_vars = (m_data_)_args;
   _set_thread_return_code (_self,ft_thread_await (go));
   {(_local_vars->i )  = 0;}
   {
       int _i0 = 100;
       for(;_i0>0;_i0--){
          {
             printf ("instant %d: %s\n",
                      (_local_vars-> i ) ++,
                      (_local_vars-> s ) );
          }
          ft_thread_cooperate ();
       }
   }
   {exit (0);}
}
Several points can be noticed in the translation:
  • The executing thread returned by ft_thread_self is assigned to the variable _self in order to be used later, in the function body.
  • The local data of the thread is obtained from the argument _args, which is thus cast to the correct type. The macro local used in the module body is expanded into an access to the _local_vars variable (a possible definition of this macro is sketched after this list).
  • The result of the call of ft_thread_await is passed to the special function _set_thread_return_code, which makes it available through the return_code function.
  • The repeat loop is translated into a standard for loop controlled by an automatic variable (produced by the compiler with a unique name).
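The local macro itself does not appear in the code above; a plausible definition, given here as an assumption and not as the actual generated code, is simply:

#define local(x) (_local_vars->x)

With such a definition, local(i)++ in the module body expands to (_local_vars->i)++, which is what appears in m_fun.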
Fourth, the m_create_in function is defined. It first allocates a structure _locals for the parameters and local variables of the returned thread _thread. A special allocation function is used, called mymalloc. Actually, it just adds some tracing possibilities to the standard malloc function of C. The parameters are copied into _locals. Then, a fair thread is created using ft_thread_create. Finally, the new pthread is detached (using the pthread_detach function of Pthread), in order to avoid the limitation on the number of pthreads that can be attached in an application. Actually, the join primitive of FairThreads is not implemented using pthread_join, which supposes that threads are attached.

thread_t m_create_in (scheduler_t _sched ,char* s)
{
   thread_t _thread;
   m_data_ _locals = mymalloc (sizeof (struct m_data_));
   _locals->s = s;
   _thread = ft_thread_create (_sched,m_fun,m_finalize,_locals);
   pthread_detach (ft_pthread (_thread));
   return _thread;
}
Finally, the m_create function is defined as a call to m_create_in with the implicit scheduler as parameter.

thread_t m_create (char* s)
{
   return m_create_in (implicit_scheduler (),s);
}

7.1.2.2 Non-native modules

One now considers non-native modules and changes the previous module m into a non-native one. Syntactically, this just means suppressing the native keyword in the definition of m. Non-native modules are naturally translated into automata, as they do not need the full power of a dedicated pthread to execute. As auxiliary variables used in the translation, such as the previous variable _i0, can no longer be implemented as automatic variables, they are introduced in the local data structure produced. The m_data_ structure becomes (_i0 is renamed to _loc0):

typedef struct m_data_
{
    int _loc0;
    char* s;
    int i;
} *m_data_;
The automaton produced by the translation of m has 9 states and is defined by a function which, as previously, is called m_fun. The first state is used to await the event go. State 1 contains the initialization of the local variable i. States 2, 3, and 6 implement the repeat loop, using the local variable _loc0, which is initialized in state 2, tested in 3, and decremented in 6. State 4 contains the atomic printing action. State 5 corresponds to the cooperate instruction. State 7 is an auxiliary state (which should probably be removed). Finally, state 8 contains the call to the exit function.

DEFINE_AUTOMATON(m_fun)
{
m_data_ _local_vars = (m_data_)_get_thread_args (_self);
BEGIN_AUTOMATON
STATE_AWAIT (0,go) {}
STATE (1) {local(i) = 0;}
STATE (2) {local(_loc0) = 100;}
STATE (3) {if (local(_loc0)<=0) IMMEDIATE(7);}
STATE (4) {printf ("instant %d: %s\n",local(i)++,local(s));}
STATE (5) {GOTO_NEXT;}
STATE (6) {local(_loc0)--; IMMEDIATE (3); }
STATE (7) {}
STATE (8) {exit (0);}
END_AUTOMATON
}
The m_create_in function creates an automaton using ft_automaton_create:

thread_t m_create_in (scheduler_t sched ,char* s)
{
   thread_t _thread;
   m_data_ locals = mymalloc (sizeof (struct m_data_));
   locals->s = s;
   _thread = ft_automaton_create (sched,m_fun,m_finalize,locals);
   pthread_detach (ft_pthread (_thread));
   return _thread;
}

7.1.2.3 Main Module

The main function creates the implicit scheduler, then creates the main thread in it, and finally starts the implicit scheduler.

static scheduler_t _implicit;

int main (int n,char **args)
{
   _implicit = scheduler_create ();
   _main_create_in (_implicit,n,args);
   ft_scheduler_start (_implicit);
   ft_exit ();
   return 0;   
}
The pthread running the main function exits, using the ft_exit function, thus leaving the implicit scheduler running autonomously.

7.1.3 Efficiency

Efficiency of the implementation is basically based on two points:
  • There is no busy-waiting on absent events because fair threads waiting for an event are placed into a list associated to it, and then become transparent for the scheduler. Transparent fair threads return to the visible state as soon as the event they are waiting for is generated. Thus, one gets an implementation using queues to store waiting threads, which actually implements the method described in section OPTIM Semantics.
  • The use of automata for non-native threads is very efficient. Actually, an automaton is implemented as a switch instruction which dispatches the control to the appropriate state. Moreover, there is no limit on the number of automata that can be created, which is not true for native threads.
However, the implementation in FairThreads has a major drawback: it absolutely needs the presence of pthreads (even if they are not started, with non-native modules) and thus suffers from the problems related to pthreads. Among them, one can cite the need for a private stack for each pthread, which is memory consuming, and the presence of uncontrolled context switches, which is inherent to the preemptive basis of pthreads. This last characteristic is clearly time consuming. Because pthreads are mandatory, the described implementation cannot be used for embedded systems with limited resources; this is the reason why the implementation described in section Direct Implementation is proposed.
7.2 Implementation of REWRITE

One now switches to implementations of the cooperative part of Loft. In this section, one directly implements the semantics rules of section Instructions in REWRITE. Like the semantics, this implementation is called Rewrite to emphasize the fact that it mimics the execution of the rewriting rules. The main purpose of Rewrite is to serve as a reference for the other implementations.

7.2.1 Instructions

Instructions are of type instruction_t. Here is the definition of the type:

typedef struct _instruction_t
{
   char type;
} *instruction_t;
The type instruction_t is used to define the types of the specific instructions. They all contain a first field parent, of type struct _instruction_t, which stores the actual type of the instruction. For example, the type of sequences (to simplify, one only considers binary sequences), sequence_t, is defined by:

typedef struct _sequence_t
{
   struct _instruction_t parent;
   instruction_t left;
   instruction_t right;
}
*sequence_t;
The make_sequence function returns new sequences:

instruction_t make_sequence (instruction_t left,instruction_t right)
{
   sequence_t res = mymalloc (sizeof (struct _sequence_t));
   res->parent.type = SEQ;
   res->left = left;
   res->right = right;
   return (instruction_t)res;
}
Types corresponding to other instructions are built in the same way, and for simplicity are not considered here. The scheduler and the environment which are transformed by the rewriting rules are considered as global variables in the implementation. The prototype of the rewriting function is:
rewrite_result_t rewrite (instruction_t); 
The result of a rewriting (of type rewrite_result_t) is a structure made of two components: the status (TERM, CONT, COOP, or RETURN) and the residual instruction. For example, the implementation of the rewriting of a sequence is:

rewrite_result_t sequence_rewrite (instruction_t inst)
{
   sequence_t s = (sequence_t)inst;
   rewrite_result_t res = rewrite (s->left);
   if (res->status != TERM) {
      return
          make_rewrite_result (res->status,make_sequence(res->inst,s->right));
   }
   return rewrite(s->right);
}
Note that a new sequence is systematically built by make_sequence when the left branch is not terminated.
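The type rewrite_result_t and the function make_rewrite_result are not detailed above. A possible sketch, with field names chosen to match their use in sequence_rewrite (these are assumptions, not the actual sources), is:

typedef struct _rewrite_result_t
{
   char          status;   /* TERM, CONT, COOP, or RETURN */
   instruction_t inst;     /* the residual instruction    */
} *rewrite_result_t;

rewrite_result_t make_rewrite_result (char status,instruction_t inst)
{
   rewrite_result_t res = mymalloc (sizeof (struct _rewrite_result_t));
   res->status = status;
   res->inst   = inst;
   return res;
}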

7.2.2 Modules

The implementation of modules is close to the one presented in section Implementation in FairThreads, except that pthreads are no longer used: they are replaced by elements of type instruction_t. The translation of a module m creates several items:
  • A structure m_data_ which defines the local data associated to the module (parameters and local variables). The structure m_data_ is constructed exactly as in Implementation in FairThreads.
  • The function m_selector which contains all the atomic instructions called in the module.
  • The function m which returns the instruction of type instruction_t built from the module body.
  • The two functions m_create_in and m_create which return new instances of m. m_create is the same as in Implementation in FairThreads.
As an example, consider the following (non-native) module m (also considered in Implementation in FairThreads):

module m (char *s)
local int i;
await (go);
{local(i) = 0;}
repeat (100) do
   printf ("instant %d: %s\n",local(i)++,local(s));
   cooperate;
end
exit (0);
end module
The function m_selector lists all the atoms and executes them according to an integer parameter:

static void *m_selector (int _n,m_data_ _local_vars)
{
    switch (_n) {        
      case 0: {void *res = (void*)(go); return res;}
      case 1: {local(i) = 0;} return NULL;
      case 2: {printf("instant %d: %s\n",local(i)++,local(s));} return NULL;
      case 3: {void *res = (void*)(100); return res;}
      case 4: {exit(0);} return NULL;
    }
    return NULL;
}
The m function returns an instruction which actually corresponds to the syntax tree of the module:

instruction_t m (m_data_ _local_vars)
{
    return 
      make_sequence (
        make_sequence (
          make_sequence (
             make_await (make_wrapper (m_selector,0)),
             make_atom (m_selector,1)),
          make_repeat (make_wrapper (m_selector,3),
             make_sequence (make_atom (m_selector,2),make_cooperate ()))),
        make_atom (m_selector,4));
}
The function make_wrapper returns a wrapper, which is a structure containing a selector, such as the previous m_selector, and an index. The execution of an atom made from a wrapper returns the evaluation of the case corresponding to the index of the selector. Thus, the selector is evaluated at execution time and not when the thread is constructed (a sketch of the wrapper structure is given after the code below). Finally, the function m_create_in which creates threads is defined by:

thread_t m_create_in (scheduler_t _sched, char* s)
{
   thread_t _thread;
   m_data_ _locals = mymalloc (sizeof (struct m_data_));
   _locals->s = s;
   _thread = make_thread (_sched,m (_locals),m_finalize,_locals);
   add_to_link (_sched,_thread);
   return _thread;
}
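The wrapper structure itself is not shown above. A minimal sketch following the previous description (the names wrapper_t, selector_t and evaluate_wrapper are assumptions introduced for illustration) could be:

/* selectors such as m_selector are used through this type, up to a cast */
typedef void *(*selector_t) (int,void*);

typedef struct _wrapper_t
{
   selector_t selector;   /* selector function, e.g. m_selector */
   int        index;      /* case of the selector to evaluate   */
} *wrapper_t;

wrapper_t make_wrapper (selector_t selector,int index)
{
   wrapper_t res = mymalloc (sizeof (struct _wrapper_t));
   res->selector = selector;
   res->index = index;
   return res;
}

/* Evaluation is deferred until execution time: the selector is called with
   the stored index and with the local data of the executing thread. */
void *evaluate_wrapper (wrapper_t w,void *local_vars)
{
   return w->selector (w->index,local_vars);
}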

7.2.3 Threads

Threads are structures defined in Semantics Basics. The type thread_t is implemented by:

typedef struct _thread_t
{
   instruction_t           run;
   finalizer_t             finalizer;
   void                   *locals;
   char                    state;
   char                    suspended;
   event_t                 signal;
   thread_t                called;
   char                    code;
   scheduler_t             scheduler;
}*thread_t;
The field run is the instruction executed by the thread. locals points to the thread local variables. The state of the thread is contained in state. The thread is suspended when suspended is different from 0. The event signal is generated when the thread terminates its execution. The field called is set while a thread is under execution through a run instruction. The field code holds the return code of some instructions (namely, await, join, and get_value). Finally, scheduler denotes the scheduler in which the thread is currently executed. The make_thread function returns a new thread in which the termination signal is generated at the end of the executed instruction:

thread_t make_thread (scheduler_t sched,
                      instruction_t inst,
                      finalizer_t final,
                      void *locals)

{
   thread_t res      = mymalloc (sizeof (struct _thread_t));
   res->signal       = event_create_in (sched);
   res->run =
      make_sequence (
         inst,
         make_generate (make_const_expression (res->signal))
      );
   res->finalizer    = final;
   res->state        = _CONT;
   res->suspended    = 0;
   res->called       = NULL;
   res->code         = OK;
   res->locals       = locals;
   res->scheduler    = sched;
   return res;
}
The reaction of a thread consists in a rewriting of the instruction embedded in it:

void fire_thread (thread_t thread)
{
   rewrite_result_t res;
   thread->scheduler->current = thread;
   res = rewrite (thread->run);
   return_processing (thread,res->status);
   thread->run = res->inst;
}
The return_processing function basically replaces the RETURN status by TERM.
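A minimal sketch of return_processing could be the following; the exact handling of the thread state is an assumption:

static void return_processing (thread_t thread,char status)
{
   /* a return simply terminates the thread body */
   if (status == RETURN) status = TERM;
   thread->state = status;
}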

7.2.4 Schedulers

Schedulers are structures defined in Semantics Basics and implemented by:

typedef struct _scheduler_t
{
   item_list_t          linked;
   thread_t             current;
   item_list_t          events;
   item_list_t          orders;
   item_list_t          to_broadcast;
   char                 eoi;
   char                 move;
}*scheduler_t;
One just describes the way the scheduler reacts, corresponding to the instant function:

void instant (scheduler_t sched)
{
   start_instant (sched);
   while (1) {
      fire_all_threads (sched);
      if (stable (sched)) break;
      make_decision (sched);
   }
}
The start_instant function called at the beginning of each new instant is defined by:

static void start_instant (scheduler_t sched)
{
   sched->move = sched->eoi = 0;
   transfer_threads       (sched);
   transfer_all_events    (sched);
   thread_changes         (sched);
   reset_scheduler        (sched);
}
During a phase, the threads in linked are all fired in turn:

static void fire_all_threads (scheduler_t sched)
{
   item_cell_t cell = sched->linked->first;
   while (cell) {
      thread_t th = cell->item;      
      if (fireable (th)) fire_thread (th);
      cell = cell->next;
   }
}
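The fireable predicate is not detailed here. A plausible definition, assuming that a thread can be fired when it is not suspended and its state is _CONT (the state set by make_thread), is:

static int fireable (thread_t th)
{
   return !th->suspended && th->state == _CONT;
}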
The stability of the scheduler is tested by the function stable defined by:

static int stable (scheduler_t sched)
{
   item_cell_t cell = sched->linked->first;
   while (cell) {
      if (fireable ((thread_t)cell->item)) return 0;
      cell = cell->next;
   }
   return 1;
}
Finally, the end of the current instant is decided when the make_decision function is called while the move flag is false:

static void make_decision (scheduler_t sched)
{
   if (sched->move) sched->move = 0; else sched->eoi = 1;
}
7.3 Implementation of REPLACE

One now considers the variant Replace of Rewrite. The basic idea is to introduce states in instructions and to change these states during the rewriting process, in order to limit the creation of new instructions as much as possible. One just gives here the main changes from the previous Rewrite implementation, concerning cooperate, sequences, and loops.

7.3.1 Cooperate

The cooperate instruction has a state which is set to indicate that the instruction has been executed:

char cooperate_rewrite (instruction_t inst)
{
   cooperate_t coop = (cooperate_t) inst;
   if (coop->state) return TERM;
   coop->state = 1;
   return COOP;
}
A function is defined which resets the state of the instruction:

void cooperate_reset (instruction_t inst)
{
   cooperate_t coop = (cooperate_t) inst;
   coop->state = 0;
}

7.3.2 Sequence

A state is also introduced in sequences (only binary ones are considered) which is set when the left branch of the sequence is terminated:

char sequence_rewrite (instruction_t inst)
{
   sequence_t s = (sequence_t)inst;
   if (0 == s->state) {   
      int res = rewrite (s->left);
      if (res != TERM) return res;
      s->state = 1;
   }
   return rewrite(s->right);
}
Note that the rewrite function directly returns the execution status because, unlike in Rewrite, no new instruction is created. The reset function for sequences resets the state and recursively resets the branches of the sequence:

void sequence_reset (instruction_t inst)
{
   sequence_t s = (sequence_t)inst;
   s->state = 0;
   reset_instruction (s->left);
   reset_instruction (s->right);   
}

7.3.3 While

The while loop has a state which is set to 1 when the expression is evaluated. When the loop body terminates, the body is reset and the state is set back to 0 to force the expression to be re-evaluated:

char while_rewrite (instruction_t inst)
{
   while_t w = (while_t)inst;
   char r;
   while (1) {
      if (0 == w->state) {
         int b = (int)evaluate (w->exp);
         if (!b) return TERM;
         w->state = 1;
      }
      r = rewrite (w->body);
      if (r != TERM) return r;
      reset_instruction (w->body);
      w->state = 0;
   }
}
7.4 Implementation of OPTIM

The semantics Optim of OPTIM Semantics can now be implemented as a variant of the previous implementation Replace, to avoid busy-waiting on events. One has made a small change with respect to the semantics of OPTIM Semantics: the global table waiting of the scheduler is implemented by a list included in each event. Two functions are defined to register threads returning the WAIT status: register_in_event (e) registers the executing thread (actually self ()) in the waiting list associated to the event e, and register_in_timer (s) registers the executing thread in the timer list of the scheduler s. One now considers the implementation of the await instruction (in both the limited and the unlimited forms).
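A minimal sketch of these two registration functions could be the following; the add_to_item_list helper and the timer field of the scheduler are assumptions made for illustration:

void register_in_event (event_t event)
{
   /* the executing thread is put in the waiting list of the event */
   add_to_item_list (event->waiting,self ());
}

void register_in_timer (scheduler_t sched)
{
   /* the executing thread is put in the timer list of the scheduler */
   add_to_item_list (sched->timer,self ());
}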

7.4.1 Await

If the awaited event is not present, then the executing thread is registered in the waiting queue of the event and WAIT is returned.

char await_rewrite (instruction_t inst)
{
   await_t a = (await_t)inst;
   if (0 == a->state) {
      a->event = evaluate (a->exp);
      a->state = 1;
   }

   if (is_present (a->event)) return TERM;
   register_in_event (a->event);
   return WAIT;
}
Note that the statuses CONT and COOP are no longer returned: either the instruction terminates because the event is present (and the returned status is TERM), or the executing thread is registered in the event.

7.4.2 Limited Await

For a limited await, the executing thread is registered both in the waiting queue of the event and in the scheduler timer, before WAIT is returned.

char limited_await_rewrite (instruction_t inst)
{
   limited_await_t a = (limited_await_t)inst;
   scheduler_t sched = current_scheduler ();
   if (0 == a->state) {
      a->event = evaluate (a->evt_exp);
      a->limit = (int)evaluate (a->limit_exp);
      self ()->deadline = a->limit + sched->instant;
      a->state = 1;
   }
   if (self ()->deadline <= sched->instant) {
      self ()->code = ETIMEOUT;
      return TERM;
   }
   if (is_present (a->event)) {
      self ()->code = OK;
      return TERM;
   }
   register_in_event (a->event);
   register_in_timer (sched);
   return WAIT;
}

7.4.3 Awake

The function generate now calls the awake function in order to awake the threads waiting for the generated event. Awaking a thread makes it once again executable by the scheduler, by setting its state to CONT:

static void awake (event_t event)
{
   item_cell_t cell = event->waiting->first;

   while (cell) {
      thread_t th = cell->item;
      if (!th->suspended && th->state == WAIT) th->state = CONT;
      cell = cell->next;
   }
   reset_item_list (event->waiting);
}
The processing of deadlines uses an auxiliary function deadline_reached which returns 1 if the deadline is reached and 0 otherwise:

static int deadline_reached (void *item)
{
   thread_t thread = item;
   if (thread->deadline > current_scheduler ()->instant) return 0;
   if (thread->state == WAIT) thread->state = CONT;
   return 1;
}
The awaking of threads whose deadline is reached is done at the beginning of each instant, and the start_instant function becomes:

static void start_instant (scheduler_t sched)
{
   sched->instant++;
   sched->move = sched->eoi = 0;
   awake_deadline_reached (sched);
   transfer_threads       (sched);
   transfer_all_events    (sched);
   thread_changes         (sched);
}
where the awake_deadline_reached function calls deadline_reached on all the elements in timer, removing them when 1 is returned.
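A possible sketch of awake_deadline_reached is given below; the timer list of the scheduler and the remove_cell helper are assumptions made for illustration:

static void awake_deadline_reached (scheduler_t sched)
{
   item_cell_t cell = sched->timer->first;
   while (cell) {
      item_cell_t next = cell->next;
      /* awake the thread and remove it from timer when its deadline is reached */
      if (deadline_reached (cell->item)) remove_cell (sched->timer,cell);
      cell = next;
   }
}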
7.5 Direct Implementation

One considers now an implementation of Loft which mixes the previous implementation of the Optim semantics described in Implementation of OPTIM and the automata-based approach of the implementation in FairThreads considered in Implementation in FairThreads, but without needing pthreads anymore. One calls this implementation Direct as it consists in a rather direct translation into C, without the need of any particular library. Moreover, Direct uses an efficient algorithm to access the next thread that needs execution. One thus gets an efficient implementation with low memory consumption. This implementation is suited for using Loft on embedded systems with limited resources. An experiment described in section Prey-Predator Demo has been made with the GBA (Game Boy Advance) platform of Nintendo.

7.5.1 Event Queues

Event queues are used to avoid busy-waiting not only during one instant, but also across several instants. A thread which awaits an absent event registers itself in a queue associated to the event. From this point on, the thread falls into a new WAIT state, and will not be considered for execution by the scheduler to which it is linked, until the event is generated. When an event is generated, all the elements of the queue associated to it return to the normal execution state. Note that the order in which the threads are executed is not perturbed by this mechanism, which is actually just a way to suspend the execution of threads awaiting non-generated events. In this case, suspension and resumption are immediate, i.e. they become effective in the current instant and not in the next one as is the case with the suspend/resume instructions.

7.5.2 Translation in C

Let us consider once again the module of section Modules. The translation makes use of the following macros, which are very similar to the ones of FairThreads:

#define BEGIN_AUTOMATON\
      while (1) {\
         switch (_local_vars->_state)\
         {	    

#define STATE(n)\
            case n: SET_STATE(n);

#define END_AUTOMATON\
            default: THE_END\
         }\
      }

#define SET_STATE(n) _local_vars->_state=n
#define GOTO_NEXT    {SET_STATE(_local_vars->_state+1); return COOP;}
#define GOTO(n)      {SET_STATE(n); return COOP;}
#define IMMEDIATE(n) {SET_STATE(n); break;}
#define THE_END      {SET_STATE(-1); generate (self()->signal); return TERM;}
The code is actually very close to the one obtained with the translation in FairThreads. However, now the state of the automaton (_state) becomes explicit and is considered as a local variable. The local data structure becomes:

typedef struct m_data_
{
    event_t _loc0;
    int _loc1;
    char* s;
    int i;
    int _state;
} *m_data_;
The module body is translated into the automaton m_automaton which is the function:

char m_automaton (m_data_ _local_vars)
{
BEGIN_AUTOMATON
STATE(0) {local(_loc0)=go;}
STATE(1) {
   event_t event = local(_loc0);
   if (event->scheduler != current_scheduler ()) {
      self()->code = EBADLINK;
   } else if (!is_present (event)) {
      register_in_event (event);
      return WAIT;
   }
}
STATE(2) {local(i) = 0;}
STATE(3) {local(_loc1) = 100;}

STATE(4) {if (local(_loc1)<=0) IMMEDIATE(8);}
STATE(5) {printf ("instant %d: %s\n",local(i)++,local(s));}
STATE(6) {GOTO_NEXT;}
STATE(7) {local(_loc1)--; IMMEDIATE (4);}
STATE(8)
STATE(9) {exit (0);};
END_AUTOMATON
}
The code of state 1 implements the await (go) instruction. If go is not present, then the executing thread (returned by self) is registered in the event and the status returned is WAIT. Thus, the thread will not be considered for execution until go is generated. When this happens (if it ever does), the thread state returns to CONT and the execution resumes. An important point to note is that the order in which threads are executed always remains the same and is not changed by awaiting events. The function which creates a thread is:

thread_t m_create_in (scheduler_t _sched,char* s)
{
   thread_t _thread;
   m_data_ _locals = mymalloc (sizeof (struct m_data_));
   _locals->s = s;
   _thread = make_thread (_sched,m_automaton,m_finalize,_locals);
   _locals->_state = 0;
   add_to_link (_sched,_thread);
   return _thread;
}
Note the initialization of the automaton state. The function make_thread returns a thread executing the automaton function passed as parameter, with the finalizer and the local data as additional parameters. The new thread is added to the scheduler by the call to add_to_link.
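The make_thread function of Direct is not detailed here. A plausible sketch, assuming a thread structure holding an automaton field of a function-pointer type automaton_t (these names and fields are assumptions made for illustration, and the local data type is used up to a cast), could be:

typedef char (*automaton_t) (void*);

thread_t make_thread (scheduler_t sched,
                      automaton_t automaton,
                      finalizer_t final,
                      void *locals)
{
   thread_t res   = mymalloc (sizeof (struct _thread_t));
   res->automaton = automaton;   /* the function implementing the module body */
   res->finalizer = final;
   res->locals    = locals;
   res->state     = CONT;
   res->suspended = 0;
   res->code      = OK;
   res->scheduler = sched;
   return res;
}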

7.5.3 Main Function

The default main function is defined by:

int main (int argc,char **argv)
{
   loft_initialize ();
   _main_create_in (implicit_scheduler (),argc,argv);
   while (1) scheduler_react (implicit_scheduler ());
   return 0;
}
This function has to be adapted if one does not want all the CPU to be consumed by it.
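For example, a minimal adaptation could put the program to sleep between two instants; the call to usleep and the chosen delay are only given for illustration:

#include <unistd.h>   /* usleep */

int main (int argc,char **argv)
{
   loft_initialize ();
   _main_create_in (implicit_scheduler (),argc,argv);
   while (1) {
      scheduler_react (implicit_scheduler ());
      usleep (10000);   /* yield the CPU for 10 ms between two instants */
   }
   return 0;
}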

7.5.4 Access to the Next Thread

The threads placed in linked are considered in turn for execution. When a thread is in the state WAIT, it should not be considered for execution and the scheduler should skip it without any other action to perform. Of course, the state can later return to CONT when the thread is awaken, and in this case the scheduler will have to consider it again.

To avoid considering threads that are in the state WAIT, one uses a masking technique (see note 4): threads in the state WAIT are masked and disappear from linked. However, they are not definitively extracted from the list and can be reintroduced in it at the same place. This, of course, happens when a masked thread is awaken. In this way, the efficiency of the scheduler no longer depends on the threads that are registered in events or in the scheduler timer.

To implement masking, linked is implemented as a list of cells with two series of pointers. The first series describes the structure of the list, which does not change except when a thread terminates or when new threads are linked to the scheduler; note that this can only happen in-between instants. The second series of pointers gives the next and previous unmasked cells. These pointers are changed when a thread returns WAIT and when it is awaken. Using the structure pointers allows a thread to be reinserted at its previous position when awaken.

The cost of finding the next unmasked thread to execute is constant. The cost of a masking action is constant. However, the cost of unmasking depends on the number of threads (masked and unmasked) present in the list. This is basically because threads are reintroduced in linked at their initial place. One thus gets a very efficient mechanism that is especially useful when a large number of threads is in use while, at each instant, few of them are really executed, the other ones waiting for absent events.
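A minimal sketch of this masking technique is given below; the cell and list types, the field names and the mask/unmask functions are assumptions made for illustration, not the actual sources:

typedef struct _cell_t
{
   thread_t        thread;
   char            masked;
   struct _cell_t *next, *prev;        /* structural links, kept stable      */
   struct _cell_t *next_on, *prev_on;  /* links restricted to unmasked cells */
} *cell_t;

typedef struct _masked_list_t
{
   cell_t first;      /* first cell of the structural list */
   cell_t first_on;   /* first unmasked cell               */
} *masked_list_t;

/* Masking a cell is constant time: only the unmasked links are changed. */
static void mask (masked_list_t l,cell_t c)
{
   c->masked = 1;
   if (c->prev_on) c->prev_on->next_on = c->next_on; else l->first_on = c->next_on;
   if (c->next_on) c->next_on->prev_on = c->prev_on;
}

/* Unmasking reinserts the cell at its structural place; finding the nearest
   unmasked predecessor costs a walk along the structural links. */
static void unmask (masked_list_t l,cell_t c)
{
   cell_t p = c->prev;
   while (p && p->masked) p = p->prev;
   c->masked  = 0;
   c->prev_on = p;
   c->next_on = p ? p->next_on : l->first_on;
   if (p) p->next_on = c; else l->first_on = c;
   if (c->next_on) c->next_on->prev_on = c;
}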
7.6 Conclusion

One has considered five implementations of Loft.

The first implementation consists in a direct translation into FairThreads, which is an API based on a computing model very close to Loft. From this point of view, Loft can be seen as a means to replace an API by a language, in order to ease the programming of fair threads. The implementation is complete, as native modules are taken into account. Actually, instances of native modules are implemented as standard kernel threads, which are provided by the pthread library.

Then, implementations of the three proposed semantics have been considered. Rewrite is a direct implementation of the basic rewriting rules defining the semantics of Loft. This direct coding is possible because the semantics is deterministic and structurally defined. In Rewrite, new terms are built at each micro-step, which is clearly an unsatisfactory solution. Moreover, the memory is never released (there is no GC!). In the variant Replace of Rewrite, the construction of new terms is limited to the minimum. Basically, in Replace, new allocation occurs only when new threads are added in the system. Thus, the memory remains stable while no new thread is created. Both Rewrite and Replace use a polling strategy for awaiting events: a thread executing await (e) tests for the presence of e at each micro-step and during all instants, until e is generated (if it ever is). This is clearly inefficient. The Optim implementation avoids busy-waiting for absent events, using queues in which waiting threads are placed.

Finally, the fifth implementation, Direct, is suited for embedded systems. Like Optim, it uses queues to store threads awaiting an event, instead of busy-waiting. Like the implementation in FairThreads, it translates non-native modules into automata. However, no kernel thread library is needed in this case. Moreover, Direct uses a masking technique to get direct access to the next thread to be executed. One thus gets a very efficient implementation of the cooperative part of Loft, with low memory consumption.


4: This mechanism is inspired by the one of Olivier Parra, defined in the context of reactive programming.
