State of the problem and our solution

As software complexity increases, so does software lifespan. It is not uncommon in the HPC world to see applications that started their life cycle more than 20 years ago, and are still being actively maintained. On the other hand, the hardware and software environment of these applications is evolving rapidly, and with every new hardware generation or software paradigm shift, applications have to be partially rewritten to take advantage of these changes. An example of such migration path would be an application vectorized in the early 80's, then parallelized for distributed memory systems with PVM ten year later, now migrated to MPI... While some of the changes involved in such software transformations require a deep understanding of the application and considerable technical expertise, a large part of the changes consist in straightforward code updates, easy but cumbersome to do apply manually. A first level of user assistance in this task is provided by simple textual tools, like the search/replace tool available in all interactive text editor, or batch tools like sed or awk. While this may be enough for very simple transformations, even very simple transformation like swapping array indices for a given common variable requires some understanding of language syntax and typing rules. The solution described here is a language independent transformation system based on syntactic rewriting rules working on abstract syntax trees, with semantic constraints.

TrfL scope and applications

Our aims, with the TrfL language is twofold: 1) promote the expressiveness of the language, with a powerful pattern-matching mechanism with access to contextual information and 2) give an interactive environment to build or modify transformations by clicking for both users and experts (in programming transformations). TrfL is a generic rule-based Transformation Language. TrfL is generic, it is not dedicated to one language but we focus on the advantage of using this language in an FORTRAN environment. TrfL intends to be used by both FORTRAN programmers and experts (who know TrfL, AST, FORTRAN, ...). A user can specify simple transformations with few effort, transformations like adding/removing a parameter in function definitions/calls, tests on array indexes don't require a great level of knowledge. More complex transformations like moving from a programming convention to another one, from FORTRAN77 to FORTRAN90, ... have to be specified by an expert. A TrfL specification is a set of transformations rules. Roughly, a transformation consists (Fig. 1) of a source pattern, a target pattern and an application condition. A source or target pattern is an abstract syntax tree (AST) with meta-variables (place holders that represent arbitrary pieces of source code). An abstract syntax tree is a structured representation of a program that omit all unnecessary information like keywords. The application condition is a predicate that must be true in order for the rule to be fired. The target pattern is a replacement for the source code when the source pattern has matched the source code, the meta-variables have been instantiated and the application condition is verified. The patterns represent the syntactic part of the rule and the application condition can refer to non local and semantic informations (symbol table, control flow graph,...) in order to restrict to applicability of the rule. Compared to others transformation languages which generally allow very restricted form of pattern matching, TrfL enhances this mechanism. In these transformation languages, a pattern can only match near the root of a tree term or it can only select fixed number of items in list tree. TrfL patterns offer more than only abstract syntax tree with meta-variables, patterns can match at arbitrary distance from the root of the whole term and anywhere in list tree. The user only describe the shape of the subtree he wants to match, and not where and how such a subtree should be found. Thus it becomes easier to describe a pattern that find nested loops for example. Others features of the language are transformation composition (combine several transformations to make one), if, case structures and post actions (that can be used to update contextual information, or to emit a warning message to the user for example).

Figure 1: A sample transformation

The TrfL system (Fig. 2) consists of an engine and of an environment. The engine takes a source program, searches for pieces of code that match a transformation, checks the application condition of that rule to validate its applicability. If the condition is valid, the selected code is replaced with the instantiated target pattern. If the condition is not valid, the current transformation rule is not used and the engine continues its job (find a new transformation rule for the same piece of code, look for a new piece of code,..). The engine has several decisions to make during the transformation process:

How the source code should be searched?
Which rule should be fired if more than one is applicable?
Should rules be reapplied?
...

The environment supports two functions:

it embeds the engine. One may want to perform the transformation process in batch mode without interaction, in this case some parameters (to control the points listed above) must be set before running the process and the engine can work stand alone. In another way the user can interact on the process, 1) by guiding or controlling the engine, 2) by choosing a piece of code and then the engine gives the list of valid transformations, 3) by choosing a transformation and then the engine shows where this transformation can be applied.
it provides a framework to build or modify transformations. Patterns are abstract syntax trees, but a user is more familiar with the concrete syntax than with the abstract syntax and then building transformations could be a hard task. This can be solved with an interactive building of a transformation from a template code and pattern from a source code, by selecting on the piece of code that ``looks like'' the desired pattern and then by refining it with meta-variables. A template is a transformation rule with holes (in fact meta-variables) that the user fills in selecting trees or predefined constructs.

Figure 2: The transformation system

Transforming FORTRAN programs

The need of transformations in FORTRAN is a well-known subject, and tools that automate (fully or partially) the process are welcome and useful. A list of tasks that a transformation system can support includes :

parallelization,
instrumentation,
restructuration,
optimization.

A text editor with a string regular expression ``query-replace'' function or a script (Perl, awk, ...) is often not enough and is somehow error prone. The simplicity of regular expressions often does not allow a programmer to express the desired query and does not exploit the underlying tree structure of the program (on the other side, processing comments or non syntactically correct code is easier). The integration of TrfL in Foresys is a powerful environment for FORTRAN re-enginneering and analysis. Foresys comes with a full Fortran90/95 environment, including parser, several analysers, and a pretty-printer. The integration of TrfL in Foresys is to instantiate the TrfL system for FORTRAN, that is provide an API to access the structures that analyzer tools produce (symbol table, typing information, control flow graph, ...). The system comprised a predifined bunch of transformations, and users can defined their own transformations by modifying, refining the predefined ones or writing them from scratch. Interactions between the user and the expert are needed for complex transformations, the first explaining the problem and the second specifying a solution.

Examples

This section presents three transformations from the simpliest to more elaborated ones.

Equality test

This simple transformation replaces an equality test where the two operands are of float type with an approximation equality test. This transformation is mostly syntaxic, although typing information is needed to check the applicability of the rule

   -- the source pattern: (X.eq.Y)
   eq(X, Y)
    provided #:fsys:isFloatingType(X) & #:fsys:isFloatingType(Y)
    ->
   -- the target pattern: (call abs(X - Y) .lt. (10.0 * call spacing(X)))
   lt(func_call(name "abs", l_exp(sub(X, Y))) , mul(real_cst "10.0", func_call(name "spacing", l_exp(X))))

Nest of if statements

Here, the intent is to transform successive if elseif constructs in select case statement. This transformation is part of the migration from FORTRAN77 to FORTRAN90. There are several points to check:

the condition must be an equality test or successive equality tests between or operators,
the condition must be a constant expression,
the variable that occurs in the condition must be the same trough all the successive if elsif.

Only the main transformation rule is shown.

    -- the root pattern: if (COND) then THEN ELSE
    struct_if(OPT, COND, THEN, ELSE)
     provided
      -- verifies that the condition has all the required properties
      condi := CheckExp_(COND)
       in  --                                elseif _ then _ _           
      #:fsys:neq('failed, condi) & ELSE in { struct_elseif(_, _, _, _) }
     ->
    let
      var := #:fsys:car(condi);
      exp := #:fsys:cadr(condi);
      firstcase := makeCase_(exp::THEN); -- builds the first element
      LCSE := CollectCase_(<var, l_case()>::ELSE); -- builds the following elements
      LCASE := AddFirst_(firstcase::LCSE) -- adds the first element
      in
     -- the target pattern, adds a comment to point out the transformation
     -- select case (var)
     --   LCASE (the list of case statements)
     select_case(OPT, var, LCASE, none())^ prefix(comment_s(comment " struct_if converted in select_case"))
    end let;

    ...

Conversion from PVM to MPI

There are two well-known communication libraries for parallel message-passing applications. While PVM and MPI are not entirely interchangeable, both are sufficiently general for most applications and migration from one to another is possible.

Initialization:
    call pvmfmytid( mytid )
->
    call MPI_Init( info )
    call MPI_Comm_rank(MPI_COMM_WORLD,
        rank, info)
    call MPI_Comm_size(MPI_COMM_WORLD,
        size, info)

Easing migration of FORTRAN programs
Student paper - Extended abstract

State of the problem and our solution

TrfL scope and applications

Transforming FORTRAN programs

Examples

Equality test

Nest of if statements

Conversion from PVM to MPI

Evaluation

Easing migration of FORTRAN programs Student paper - Extended abstract

State of the problem and our solution

TrfL scope and applications

Transforming FORTRAN programs

Examples

Equality test

Nest of if statements

Conversion from PVM to MPI

Evaluation

Easing migration of FORTRAN programs
Student paper - Extended abstract