The TAPENADE Tutorial
TAPENADE is an Automatic Differentiation Engine developed at
INRIA Sophia-Antipolis
by the
TROPICS team. TAPENADE can be used as a server (JAVA servlet),
which runs at INRIA Sophia-Antipolis. The current address of this
TAPENADE server is
http://tapenade.inria.fr:8080/tapenade/index.jsp.
TAPENADE can also be installed locally as a set of JAVA classes
(JAR archive). In that case it is run by a simple command line,
which can be included into a Makefile. It also provides you
with a user interface to visualize the results in an XHTML browser.
Throughout this tutorial, we will work through one illustrative example.
All you have to do to run this example is to follow the
instructions given in the paragraphs in italics, like this one.
Decide on the sub-program to differentiate
Specify the differentiation request
Actual Differentiation
Alternative invocation from the command line
Examine the differentiated program
Download the differentiated program
Modify the differentiation request
Examine the result of the Reverse Mode
A more complex example of the Reverse Mode
Diagnostics pages after failure
Want to install TAPENADE locally on your system?
Decide on the sub-program to differentiate:

TAPENADE takes as input a computer source program,
plus a request for differentiation.
TAPENADE builds and returns the differentiated source program,
that evaluates the required derivatives.
The first step in using TAPENADE is to identify and load
the input source program. In normal use, you know which
routine (i.e. a subroutine or function)
implements the mathematical function that
you want to differentiate. Let us call
it the top differentiation routine.
Notice that it is probably not the main program.
In order to get an efficient
differentiated program, the top differentiation routine,
plus all the routines that can be recursively
called through it, must all be given to TAPENADE.
If some source routine cannot be given to the tool,
("black box" routine, for example an external or library routine),
TAPENADE will make some conservative assumptions about it, which may
degrade the execution time and memory consumption of the
differentiated program. Of course this is a problem only if the
top differentiation routine (recursively) calls this
black box routine (TAPENADE provides a way to treat these
"black box" routines correctly and efficiently).
On the other hand, you may also give
TAPENADE some routines that are not called
through the top differentiation routine. TAPENADE
will analyze them and may issue some diagnostics
about them, but they will not be differentiated.
Notice also that if you do not specify the name of the
top differentiation routine, TAPENADE will select one for you.
It will select any one of the topmost routines in the call graph
defined by the files you have given. Except in very simple cases, this
might not be the choice you want, so use this default
mechanism with caution.
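To see why the default choice can be surprising, here is a minimal Python sketch of what "topmost routines in the call graph" means. It is only an illustration, not TAPENADE's actual selection code. The call graph is hypothetical, except that "sub1" calls "f" and "g" as in the tutorial example; "main" and "unused_helper" are invented names.

```python
def topmost_routines(call_graph):
    """Return the routines that no other routine in the graph calls."""
    called = {callee for callees in call_graph.values() for callee in callees}
    return sorted(name for name in call_graph if name not in called)


# Hypothetical call graph: routine name -> routines it calls.
call_graph = {
    "main": ["sub1", "sub2"],   # assumed main program
    "sub1": ["f", "g"],         # as in the tutorial example
    "sub2": [],
    "f": [],
    "g": [],
    "unused_helper": ["f"],     # assumed routine that nobody calls
}

print(topmost_routines(call_graph))
```

In this hypothetical graph, neither topmost candidate is "sub1", so relying on the default would not differentiate the routine you had in mind.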
The input source program must be written in Fortran77, Fortran95, or C.
The differentiated source program will be returned using the same language as the input.
Some advanced features of Fortran95, and some features of C, are not properly
differentiated yet. TAPENADE will issue a warning message if
such a feature is encountered. We are currently working on supporting
the missing features. More information can be found in the FAQ.
In TAPENADE, the internal representation of programs is largely
independent from the actual program's language. This should
make it easier for us to adapt to a new language later.
The next action in this tutorial is to build a
Fortran77 source program, that will be used
as an input for TAPENADE.
Of course in normal use, you already have your source program,
and TAPENADE normally does not require you to modify it.
But for this example, please cut and paste
the following
piece of source into some file that you
will call, say, mainpart.f, somewhere on your disk.
Similarly, put
this piece into, say,
utilities.f, and lastly
this piece of include text into an include file,
globals.inc. The file names do not matter much,
except of course for the include file.
Now give this program as an input for the TAPENADE server,
by repeatedly uploading all the source files as follows:
- Either type in the complete file path for mainpart.f,
or use the "Browse..." button to reach the file.
In that second case,
don't forget to finish by double-clicking on it or by
clicking on the "OK" button.
The complete file path should now appear in
the "file path" field.
- Actually upload it into the TAPENADE server, by clicking
on the "as a source" button. This declares that the file
contains a syntactically correct Fortran77 source.
Alternatively, the "as an include" button declares that
the file is an include file, i.e. is syntactically correct only in the context
of an enclosing Fortran77 source.
Repeat these two steps for utilities.f. Finally upload
globals.inc, but this time
using the "as an include"
button. All uploaded files should now be displayed by the
TAPENADE server window. In any case, you may remove all or some of the already loaded files,
using the two new buttons that appeared in the page:
"Retry with new files" and
"Remove selected files".
Specify the differentiation request:

The next step is to fill in the request for differentiation.
The request for differentiation consists essentially of four parts:
- the name of the top differentiation routine
- the dependent output variables whose derivatives are
required,
- the independent input variables with respect to
which differentiation must be made, and
- the required mode of differentiation.
Notice that the present version of TAPENADE requires that the function you
want to differentiate corresponds exactly to a program routine.
Otherwise, if you want to differentiate only the function defined by
a routine fragment, you must first split this routine,
so that this fragment becomes a new separate routine, which will be
your top differentiation routine.
To go on with our example, please type "sub1" into
the "Name of the top routine" field.
Now we must select the names of the dependent output variables and the
independent input variables, i.e. specify which derivatives are wanted,
of which results with respect to which entries.
Notice that, for TAPENADE,
these two sets of variables must belong to the visible parameters of the
top differentiation routine. This means either formal
parameters, or declared globals such as variables in Fortran COMMON blocks.
In contrast, local variables or constants are not visible parameters.
Moreover, visible parameters must all belong to a type
for which differentiation is defined, i.e. REAL, DOUBLE PRECISION, or COMPLEX in FORTRAN.
Leaving blank the "dependent output" field selects all visible
outputs. Similarly, leaving blank the "independent input"
field selects all visible inputs.
Finally, notice that if there is no dependency between the
selected dependent and independent variables, the derivatives are certainly all
zero. TAPENADE detects this degenerate situation statically, and this results
in an empty differentiated program.
This may happen for example when dependent variables depend
on inputs that are all absent from the independent list.
Refer to the FAQ for a more complete discussion on this.
In particular, if the derivative of a variable is really required after the differentiated
procedure, then add it into the dependent list, even if it is not actually modified
by the procedure.
In our example, we can see that
"sub1" takes 5
formal parameters, plus two variables
"x" and
"y"
in a common. Variables "x"
and "y" are not used.
All 5 formal parameters are outputs, and only the first 3
parameters, namely "first",
"other",
and "third",
are inputs. Suppose we want the derivatives
with respect to "first"
and "other", of all
possibly dependent outputs. To this end, leave the
"Dependent output variables" field blank, and type
"first other"
into the "Independent input variables"
field.
The last step is to choose between the proposed differentiation modes.
In short, the "tangent mode" will build a program that,
given some small variations of the independent variables, computes the resulting
(1st order) variations of the dependent variables. In other words,
if we call "J" the Jacobian matrix of the partial derivatives
of each dependent "Yj" with respect to each independent
variable "Xi", the tangent mode
gives a program that computes "dY=J.dX" for each given "dX".
The "tangent multidirectional mode" also computes the variations of the output variables,
but simultaneously for several directions in the input space. For n directions, it is
therefore equivalent to running the tangent mode n times, but this is done in a single call,
and is therefore more efficient because the original function is evaluated only once.
Conversely, the "reverse mode" gives a program that computes the
transposed Jacobian product "J*.dY" for each
given "dY", where "J*" denotes the transpose of "J". In other words, given a weighting "dY" of the
dependent output variables, the generated program computes
the gradient of the correspondingly weighted sum of outputs with respect to the independent inputs.
You may refer to section "what is A.D."
for further details.
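As an illustration of the two modes, here is a small hand-written Python sketch. It is not TAPENADE output: the function "f" and its derivative routines are hypothetical and were differentiated by hand. The tangent routine computes "dY=J.dX", the reverse routine computes the transposed product, and the final check verifies that both agree through the usual dot-product identity.

```python
import math


def f(x1, x2):
    """Hypothetical original function: two inputs, two outputs."""
    return x1 * x2, math.sin(x1)


def f_tangent(x1, x2, x1d, x2d):
    """Tangent mode: returns dY = J.dX for the direction (x1d, x2d)."""
    y1d = x1d * x2 + x1 * x2d
    y2d = math.cos(x1) * x1d
    return y1d, y2d


def f_reverse(x1, x2, y1b, y2b):
    """Reverse mode: returns the transposed product J*.Yb for weights (y1b, y2b)."""
    x1b = x2 * y1b + math.cos(x1) * y2b
    x2b = x1 * y1b
    return x1b, x2b


x = (0.7, -1.3)    # point of differentiation
dx = (0.2, 0.5)    # input direction for the tangent mode
yb = (1.5, -0.4)   # output weights for the reverse mode

dy = f_tangent(*x, *dx)
xb = f_reverse(*x, *yb)

# Dot-product identity: Yb . (J.dX) must equal (J*.Yb) . dX
lhs = yb[0] * dy[0] + yb[1] * dy[1]
rhs = xb[0] * dx[0] + xb[1] * dx[1]
print(abs(lhs - rhs) < 1e-12)
```

One tangent call gives one column combination of the Jacobian, one reverse call gives one row combination. This is why the reverse mode is the usual choice when there are many inputs and few outputs.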
Let us first take a look at the tangent mode: click on the
"Tangent Mode" button. A new page appears,
that displays the differentiated program...
Actual Differentiation:

Clicking on one of the buttons "Tangent Mode", "Tangent Multidirectional Mode"
and "Reverse Mode" triggers actual analysis and
differentiation of the uploaded files. For large files, it may take
some time before our server builds the differentiated files.
Times vary with the program, but for example,
it takes 1 minute and 40 seconds for a 67-thousand-line FORTRAN file,
and about a dozen seconds for a 10-thousand-line file.
However, to protect our server, we have limited the size of the files
you may differentiate with the TAPENADE server, to 100 thousand characters.
There is no such limitation in the case of TAPENADE installed locally on your
system.
If all goes well, a new page appears in your browser, displaying
the differentiation result. If something went wrong, you get a special
diagnostics page, which is discussed later.
Alternative invocation from the command line:

In the above sections of this tutorial, the illustrative example is based
on the TAPENADE web server. The same behavior can be achieved with
TAPENADE installed locally on your system. In that case, differentiation
is called from the command line instead of through a Web interface.
In the case of an invocation from the command line, the discussion
of the above sections is still valid. Only the illustration example
must be executed differently.
The command name is "tapenade", followed by options and arguments.
Of course your "PATH" variable must refer to the place where
tapenade is installed, which is the subdirectory
"bin"
of the TAPENADE installation root.
You may type "tapenade -help" to get the list of options.
The most important options are:
- -head, to specify the top differentiation routine,
- -outvars, for the dependent output variables,
as a white-space separated list enclosed by double quotes,
- -vars, for the independent input variables,
as a white-space separated list enclosed by double quotes,
- -tangent or -d, to differentiate in the tangent mode,
- -reverse or -b, to differentiate in the reverse mode,
- -output, to specify the name of a single file that will contain
all differentiated routines.
- -O, to specify the directory in which generated files will be placed.
- -inputlanguage, to specify the language of the input, if Tapenade
cannot tell from the extension.
- -outputlanguage, to specify the language of the output.
If it is not the same as the input language, some strange things may happen!
- -html, to create an additional HTML output, that can be
displayed in your browser.
The command line must also contain the names of the files that contain
the original program. Order of options and file names should not matter...
There is a refined way to use the -head option, to specify several
differentiation heads at the same time. If pi are procedures,
xi are independent inputs and yi are dependent outputs
of their respective procedures, then you may type e.g.:
-head "p1(x1 x3)\(y2 y5) p2(x7)\(y8 y9 y10) p4(x2 x7)\()"
or equivalently, with slashes in the other direction:
-head "p1(y2 y5)/(x1 x3) p2(y8 y9 y10)/(x7) p4()/(x2 x7)"
As can be expected, an empty list of variables means all possible dependent or independent
arguments of the current procedure.
The big advantage of this complex call is that deep procedures, called by
several of the pi head procedures, will be differentiated only once,
in a context which is the "envelope" of all possible individual calls.
Suppose you have downloaded the small example, as
described in the first section. Therefore you have
put the three files "mainpart.f", "utilities.f",
and "globals.inc" into one directory. Go to this directory.
Instead of the interface manipulations described above,
and to request the same differentiation, just type:
$> tapenade -tangent -head sub1 -vars "first other" mainpart.f utilities.f -html
New files are created, that hold the result of differentiation.
Since the "-html"
option was given,
an HTML display of the results is prepared, just as with the
web server. With some browsers, a new web page pops up automatically.
Otherwise, load into your browser the page that
you can find under the current directory in
"tapenadedir/tapenade.html"
You are now ready to examine the results of differentiation,
with the same interface as the web server, as explained below.
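If you drive TAPENADE from a script rather than a Makefile, the command line above can be assembled programmatically. The following Python sketch is only a convenience wrapper written for this tutorial. The function name "tapenade_command" and its parameters are hypothetical; the only options used are the ones listed above. It builds the argument list without running anything.

```python
import shlex


def tapenade_command(files, head, mode="tangent", invars=(), outvars=(),
                     outdir=None, html=False):
    """Assemble a tapenade command line as a list of arguments."""
    if mode not in ("tangent", "reverse"):
        raise ValueError("mode must be 'tangent' or 'reverse'")
    cmd = ["tapenade", "-" + mode, "-head", head]
    if invars:
        # independent inputs: one white-space separated argument
        cmd += ["-vars", " ".join(invars)]
    if outvars:
        # dependent outputs: one white-space separated argument
        cmd += ["-outvars", " ".join(outvars)]
    if outdir:
        cmd += ["-O", outdir]
    cmd += list(files)
    if html:
        cmd.append("-html")
    return cmd


cmd = tapenade_command(["mainpart.f", "utilities.f"], head="sub1",
                       invars=["first", "other"], html=True)
print(" ".join(shlex.quote(arg) for arg in cmd))

# To actually run it, assuming "tapenade" is on your PATH:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the list directly to subprocess.run avoids shell quoting problems with the white-space separated variable lists.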
Notice that if you just type "tapenade" with no options,
you get a graphical input interface very much like the web server input interface,
that lets you specify progressively your differentiation request.
Examine the differentiated program:

The page that displays the result of differentiation is composed of
several frames. The meaning of these frames is the following:
- the top left frame displays the
call graph of the original
program. Starting from top-level, each routine is displayed,
and under it are displayed the routines it calls, recursively
indented. System (intrinsic) routines are not shown.
When the call graph is large, TAPENADE avoids
duplicate display of identical sub-graphs. External routines
are displayed in smaller type. Each routine name is in fact
an HTML link, that triggers display of the corresponding
routine in the frame below.
- the top right frame displays the
call graph of the differentiated routines. It may very well
happen that some differentiated routine calls a non-differentiated
routine. However, for clarity, these non-differentiated
routines are not displayed here.
Here also, each routine name is in fact
an HTML link, that triggers display of the corresponding
differentiated routine in the frame below.
- the middle left frame displays the text of some
routine of the original program. Colors are used to highlight
syntactic structure, e.g. keywords, data types, function names,
labels or comments. Notice that this may differ from the actual
contents of the original file, because the routine has been
preprocessed, and this results in normalized declarations or
indentations. Each line of the subroutine is in fact
an HTML link, that triggers display of the corresponding
line of the differentiated routine, if any, in the frame on the right.
- the middle right frame displays the text of some
generated differentiated routine. The syntactic structure
is highlighted as usual, and each line of the subroutine is in fact
an HTML link, that triggers display of the corresponding
line of its original routine, in the frame on the left.
- the bottom frame displays some
error or warning messages
that may have been emitted during the analysis of the original program.
Typical messages deal with interprocedural type-checking,
dimension checking, uninitialized variables, aliasing...
These messages are described in detail in the
"Messages issued by TAPENADE"
section of the Reference Manual.
In fact, each message in this bottom frame has an HTML link to
the section of the reference manual that describes it.
Do not overlook these messages: even if your program
compiles well, they may indicate serious problems with the differentiated
program. In many cases, you will neglect these messages in the end,
but only after making sure none of these serious problems occur in
this particular case.
Each error message is an HTML link
to the corresponding place in the original or differentiated programs, in the middle
left or right frame. Conversely, an
icon in the displayed program indicates the presence of an error or warning message,
which can be viewed by clicking on the icon.
Let us use this interface to examine the differentiated program (tangent mode).
You may refer to section
"Direct differentiation model"
for a more complete description of the tangent mode of differentiation.
Look at the original call graph. Notice that "sub1" only calls
"f" and "g". Therefore it is natural that the
Differentiated call graph shows only the derivative routines
"sub1_d", "f_d",
and "g_d". The "d"
suffix designates files and variables differentiated
in tangent mode. It is a reminder of the dot above, which is
the conventional sign used to denote derivatives in the tangent mode.
Some people also read it as "direct".
Now click on "sub1" in the original call graph. The
text of subroutine sub1 appears below. If you now click on
the last instruction, the middle right frame displays the differentiated
routine "sub1_d", and if this frame is small, it automatically
scrolls to display the differentiated instruction which is:
otherd = G_D(out, outd, other)
If you look more closely at "sub1_d", you notice that
one instruction has been added before each original instruction,
to deal with the derivatives. The derivative of a variable "v",
by convention, is put into a variable called "vd".
Declarations of differentiated variables are inserted automatically,
and differentiated visible variables appear as additional
formal parameters or commons. Arithmetical operations and
system numerical subroutines are differentiated as could be expected.
For example the first instruction:
out = first * other + 3.0 * third
is differentiated as
outd = firstd * other + first * otherd
because of the well-known differentiation of a product, and
noticing that at that time in the program, variable "thirdd"
is certainly zero because variable "third" was not selected as
an independent input variable.
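This tangent statement can be checked numerically. The Python sketch below is a hand transcription of the two Fortran statements (not generated by TAPENADE) and compares "outd" with a one-sided finite difference. The numerical values are arbitrary test data.

```python
def statement(first, other, third):
    """Original statement: out = first * other + 3.0 * third."""
    return first * other + 3.0 * third


def statement_d(first, firstd, other, otherd, third):
    """Tangent statement followed by the original one.

    "third" is not an independent input, so its derivative is zero
    and the term 3.0 * thirdd does not appear.
    """
    outd = firstd * other + first * otherd
    out = first * other + 3.0 * third
    return out, outd


first, other, third = 2.0, -1.5, 4.0   # arbitrary point
firstd, otherd = 0.3, 0.8              # arbitrary input direction

out, outd = statement_d(first, firstd, other, otherd, third)

# One-sided finite difference along the same direction
eps = 1e-6
fd = (statement(first + eps * firstd, other + eps * otherd, third)
      - statement(first, other, third)) / eps
print(abs(fd - outd) < 1e-5)
```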
Finally, calls to user routines are simply replaced by calls
to the differentiated subroutine, with suitable values for the
additional parameters that hold the derivatives. For example
instruction:
other = G(out)
becomes in the differentiated routine
otherd = G_D(out, outd, other)
where the argument outd is inserted after the original
argument out, the original
result other has been moved
to the arguments list (but is still an output!), and the derivative
of the result is returned in the new result otherd.
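The calling convention can be mimicked in Python as follows. This is a sketch under assumptions: the body of "g" here is hypothetical (the real G is in the tutorial's source files), and since Python has no output arguments, "g_d" returns the pair (derivative, original result) instead of writing the original result into an argument.

```python
def g(x):
    """Hypothetical body standing in for the tutorial's function G."""
    return x * x + 1.0


def g_d(x, xd):
    """Tangent version of g.

    Returns (resultd, result): the derivative of the result along xd,
    and the original result, which in the Fortran version is moved
    to the argument list.
    """
    resultd = 2.0 * x * xd
    result = x * x + 1.0
    return resultd, result


out, outd = 0.4, 1.0

# Fortran: otherd = G_D(out, outd, other)
otherd, other = g_d(out, outd)
print(other == g(out), otherd)
```

The point of the convention is that one call to the differentiated routine returns both the original value and its derivative, so the original call does not need to be kept next to it.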
Take a look at routines G
and G_D,
by clicking on their names in the call graphs.
The interface also shows some warning and error messages, that were detected during the
preliminary analyses required by TAPENADE. These messages should be examined, because some of
the problems they show might lead to an incorrect differentiation.
The end-user can often find out that a warning message can be neglected, but bear in mind that
this is not always the case. For example,
incompatible array sizes between actual and formal arguments may result in insufficient
storage of intermediate values in the reverse mode. Similarly, aliasing
problems (see FAQ) can lead to bad differentiation.
Look at routine G. There is a warning
icon in the equality test. It signals a warning message.
If you click on it, the bottom frame displays the corresponding warning message,
that tells you to avoid equality tests on REAL values.
There is another message in the bottom frame. Click on it.
You see that subroutine SUB4 is called
with an array element, while it expects an array.
Again, it is very hard for a source transformation tool to handle this
correctly, even if it is a widespread practice.
At least, you should probably check if
TAPENADE has handled this well, and rebuilt correct differentiated code.
Download the differentiated program:

This applies only to the web server. When called from the command line,
TAPENADE automatically creates the differentiated files.
Using the Web server,
if you are satisfied with the differentiated program, and you want to
use it, you may cut-and-paste it. However, since this is tedious on large
files, it is better to download it using the "Download differentiated file"
button in the top right frame. This will download a single
gzip'ed file
that contains all differentiated routines. You may then include these
differentiated routines in your application, feeding it with values
for the derivatives of the independent input variables, and using
the returned derivatives of the dependent output variables.
Notice that, since differentiated
routines may call original routines, this differentiated program may
require linking to the original routines before running.
Click on the "Download differentiated file" button.
Your browser
should pop a file selector, in which you specify where the
differentiated file should be downloaded on your system.
The name of this file is not important. Some browsers
may lose the ".gz" suffix. In that case, add
this suffix manually. Then unzip the file using
the "gunzip"
command. You are ready to use the differentiated file in your
application.
Modify the differentiation request:

This applies only to the web server. When you use the command line,
you just need to type a new "tapenade" command, with different arguments.
Using the Web server,
you may go back to the differentiation request page by clicking on
the button "Retry with the same files"
in the top left frame.
In the differentiation request page, you may remove and add uploaded
files, and even remove them all with the "Retry with the new files"
button. This should be used with
caution, because uploading again is tedious. But don't worry:
in any case, nothing is erased on your local system!
You may very well want to modify some of your original files, for example
to fix an error or to improve differentiation result. After doing so, don't
forget to use the "Remove selected files" button to remove the
uploaded copy of the file, then upload the new version as indicated above.
You may also specify a different top routine, or different dependent or
independent variables, and also use another mode of differentiation.
Click on the "Retry with the same files" button.
You should be
back to the differentiation request page, with the three files
still uploaded. Then type "sub2"
as the name of the
top differentiation routine. Leave blank the fields for
the dependent and independent variables. Then click on the
"Reverse Mode" button.
After a few seconds, the differentiated program is displayed...
Examine the result of the Reverse Mode:

The page that displays the differentiated program is identical
for the reverse mode and the tangent mode. But of course the
differentiated programs differ completely. You may refer to section
"Reverse differentiation model"
for a more complete description of the reverse mode.
In the call graphs, click on routines "sub2"
on the left,
and on "sub2_b" on the right.
The "b"
suffix designates files and variables differentiated
in reverse mode. It is a reminder of the bar above, which is
the conventional sign used to denote derivatives in the reverse mode.
Some people also read it as "backward".
Routine "sub2"
makes no call to other user routines.
Therefore the differentiated program consists of only
"sub2_b".
The structure of "sub2" is simple:
one DO loop containing a
multiple conditional with a GOTO,
followed by one WHILE loop.
As expected, the reverse
subroutine is built of two successive parts:
- the first part ("forward sweep")
replicates the original program, with the
same instructions embedded in the same control. Only some
instructions are inserted to memorize intermediate values
and some control flow information.
- the second part ("reverse sweep")
reverses the original control flow,
from the last instruction to the first instruction,
and each instruction is replaced by its reverse derivative
instructions.
Notice the memorizations of the control, through the
PUSHINTEGER4 and
POPINTEGER4 routines. These
ensure that the reverse sweep exactly reproduces the
execution order of the forward sweep, but reversed.
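The two-sweep structure can be sketched in Python on a toy routine. This is a hand-written illustration, not TAPENADE output: the routine "product_like" is hypothetical, and a plain Python list stands in for the stack handled by the PUSH... and POP... routines.

```python
def product_like(x):
    """Hypothetical original routine: a loop whose body depends on a test."""
    y = 1.0
    for xi in x:
        if xi > 0.0:
            y = y * xi
        else:
            y = y + 2.0 * xi
    return y


def product_like_b(x, yb):
    """Reverse-mode version: returns y and the gradient weights xb."""
    stack = []
    n = len(x)

    # Forward sweep: original instructions, plus memorizations.
    y = 1.0
    for i in range(n):
        if x[i] > 0.0:
            stack.append(y)      # value about to be overwritten, needed later
            y = y * x[i]
            stack.append(1)      # control: first branch was taken
        else:
            y = y + 2.0 * x[i]
            stack.append(0)      # control: second branch was taken

    # Reverse sweep: same control flow, last iteration first.
    xb = [0.0] * n
    for i in reversed(range(n)):
        branch = stack.pop()
        if branch == 1:
            y_old = stack.pop()
            xb[i] += y_old * yb  # reverse of y = y * x(i), with respect to x(i)
            yb = x[i] * yb       # reverse of y = y * x(i), with respect to y
        else:
            xb[i] += 2.0 * yb    # reverse of y = y + 2.0 * x(i); yb is unchanged
    return y, xb


y, xb = product_like_b([2.0, -0.5, 3.0], 1.0)
print(y, xb)
```

For this input the routine computes y = (x0 + 2*x1) * x2, so the gradient returned in xb can be checked by hand against (x2, 2*x2, x0 + 2*x1).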
Click on this instruction in the original routine
y(i) = LOG(z(i))
The middle right frame scrolls to display the corresponding
differentiated instructions, which are:
zb(i) = zb(i) + yb(i) / z(i)
yb(i) = 0.0
It may take pencil and paper to check that this is correct,
but it is indeed!
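For readers who prefer a check to pencil and paper, here is the reasoning in a few lines of Python. This is a hand transcription of the two reverse statements, with arbitrary test values. Since the derivative of LOG(z) is 1/z, the weight yb(i) flows back to zb(i) divided by z(i). The previous value of y(i) is overwritten by the assignment, so it cannot influence any output and its weight is reset to zero.

```python
def log_statement_b(z_i, zb_i, yb_i):
    """Reverse of the statement y(i) = LOG(z(i))."""
    zb_i = zb_i + yb_i / z_i   # d(log z)/dz = 1/z
    yb_i = 0.0                 # the overwritten y(i) has no further influence
    return zb_i, yb_i


# Arbitrary values: z(i) = 2.0, incoming weights zb(i) = 0.25 and yb(i) = 3.0
zb_i, yb_i = log_statement_b(2.0, 0.25, 3.0)
print(zb_i, yb_i)
```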
Sometimes, TAPENADE performs some optimizations, that yield a better
code, but somewhat harder to understand. For example, instruction
a = 0.5 * x(20)
corresponds to no differentiated instructions. This is because a
is a local variable which plays no role in the differentiation of the outputs with
respect to the inputs. We say that a is not "active",
and thus need not be differentiated.
Again, you may download the differentiated subroutines using the
"Download differentiated file" button. In the reverse mode,
you need to link the differentiated program to a separate
library (actually two files, one Fortran: adBuffer.f
and one C: adStack.c),
that defines all the PUSH... and
POP... routines.
You need to compile both adBuffer.f and adStack.c
and link them both to your final reverse-differentiated executable.
You will find these files adBuffer.f and adStack.c
in the ADFirstAidKit, which you may download by clicking on
the "Download PUSH/POP" button in the differentiation result page, or else
Be careful when linking Fortran and C. This is highly dependent on your local
system/architecture. On some systems, like ours (Linux+Gnu compilers or Solaris+Sun compilers),
this is rather simple.
We compile the Fortran files with:
$> f77 -c fortranfilename.f
$> f77 -c adBuffer.f
We compile the C file with:
$> cc -c adStack.c
And we do the final link by:
$> f77 fortranfilename.o adBuffer.o adStack.o -o execname
On other systems, there may be additional work. For example (thanks to Frode Martinsen, NTNU, Norway) under Windows2000, with VisualStudio
and Compaq VisualFortran, you may have to add the following directives into the Fortran files that use C
routines. For example if Fortran calls PUSHINTEGER4:
EXTERNAL pushinteger4_
!DEC$ ATTRIBUTES C,REFERENCE :: pushinteger4_
In any case, read the documentation of your compilers, or otherwise
ask your system engineer.
A more complex example of the Reverse Mode:

... To be continued ...
Diagnostics pages after failure:

... To be continued ...
Want to install TAPENADE locally on your system?

If you are satisfied with the programs returned by the TAPENADE server,
you may stick to this manner of differentiating programs.
There is no algorithmic difference between the TAPENADE server
and a locally installed TAPENADE, and the differentiated programs are the same.
There are some advantages in sticking to the server. First, you don't
need to install, and second, you don't need to re-install to get
updates of the tool.
However, there are several reasons why you might want to
install TAPENADE locally on your system:
- You may not want your source files to travel over the Internet. We will
neither use nor redistribute them, but we cannot guarantee that no one can spy on the web!
- You may want to use the tapenade command in Makefiles
- You may have run into our limitation to 100 thousand characters. Notice however that
this limitation can probably be lifted when our server becomes robust enough.
- You may find it tedious to upload a large number of separate files into the server.
This last point should be improved soon, when the server
accepts tar files.
In that case, download TAPENADE from
our FTP server.
Last modified: Wed Jan 27 15:31:57 CET 2010
by
the developers of Tapenade