
APPENDIX A
USEFUL NOTES

  1. The partitions shown in Table 22 were made with the version of the CEMEF mesh partitioner created by Lanrivain (CMP_Lanrivain) during his 2003 internship in the SMASH project directed by Herve Guillard. This version, CMP_Lanrivain, writes the partitions and the flu.glob files in the format readable by AERO-F and AEDIF, and includes both homogeneous and heterogeneous (optimized) options. The CEMEF mesh partitioner is a very sophisticated code written in C++. Each system upgrade on the cluster has made the code dramatically harder to compile, even with the aid of the developers. Recent attempts by Patrick Nivet and the author to compile the mesh partitioner were unsuccessful; therefore, as this report is written, the mesh partitioner is not available to partition meshes.

  2. The maximum number of partitions shown in Table 22 is 64. Partitioning into 96 partitions was tried, but the iteration scheme in the code had not converged after 24 hours using 32 nina processors. The developer, Hugues Digonnet, suggested limiting the number of iterations to a maximum of 30 rather than the default 3000. This change was made but could not be tested because of the compilation problem described in Note 1, which explains why there are no partitions > 64.

  3. The Makefile used by the CMP_Lanrivain version to create heterogeneous partitions needs updating.

  4. The statements that write the partitions and the flu.glob files in the format readable by AERO-F and AEDIF have been transferred to a more recent version of the mesh partitioner by Hugues Digonnet and Youssef Mesri (2004 SMASH intern); however, that code has not been validated.

APPENDIX B
HOW TO MAKE AND EXECUTE AEDIF

AEDIF can be obtained by contacting Herve Guillard at INRIA Sophia Antipolis.

mkdir AEDIF
cd AEDIF
cp source, makefiles, ... to AEDIF

cat README
Paral3D.h
Param3D.h

In case the *.h files get deleted:
cp -p Paral3D.h.sav Paral3D.h
cp -p Param3D.h.sav Param3D.h

ls Mak* shows
Makefile_aedif
Makefile_aedif_GLOBUS
Makefile_aedif_parameters
Makefile_aero_Interface

1) Create a working directory (WD) and move into it
mkdir WD
cd WD
2) cp or link the flu-* and flu.glob files to the WD

3) cp -p AEDIF/Mak* .

4) Make new .h files based on the flu.glob file for the actual case
make -f Makefile_aedif_parameters
This needs to be done only once.

5) Create the executables (a consolidated sketch of steps 1-5 follows the commands below):
make -f Makefile_aedif_GLOBUS clean      GLOBUS case
make -f Makefile_aedif_GLOBUS aero       GLOBUS case
globusrun -f runscript.rsl
or
make -f Makefile_aedif clean             non-GLOBUS case
make -f Makefile_aedif aero              non-GLOBUS case
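
The steps above can be collected into a single shell script. The following is a minimal sketch for the non-GLOBUS case; the locations $HOME/AEDIF and $HOME/meshes/case1 are assumptions standing in for the actual source and mesh directories.

#!/bin/sh
# Sketch only: set up a working directory and build the non-GLOBUS AEDIF executable.
AEDIF_DIR=$HOME/AEDIF            # assumed location of the AEDIF sources and makefiles
FLU_DIR=$HOME/meshes/case1       # hypothetical location of the flu-* and flu.glob files
mkdir -p WD && cd WD
ln -s $FLU_DIR/flu-* $FLU_DIR/flu.glob .    # step 2: link the partition files
cp -p $AEDIF_DIR/Mak* .                     # step 3: copy the makefiles
make -f Makefile_aedif_parameters           # step 4: build the .h files (once per case)
make -f Makefile_aedif clean                # step 5: build the executable (non-GLOBUS)
make -f Makefile_aedif aero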

./nina_mv.lsf ncpus

nina_mv.lsf: non-GLOBUS run script
bsub -J aerodia16 -o test06.out -e test06.err -m "linux-nina linux-pf"  \
     -f "test06.hostnames      < hostnames.out" \
     -f "test06.flu.glbcpu     < flu.glbcpu" \
     -f "test06.flu.lclcpu     < flu.lclcpu" \
     -f "test06.cpu_times.dat  < times.dat" \
     -f "test06.rsitg.data     < rsitg.data"\
     -n $1 mpijob aerodia.x

The non-GLOBUS run script saves the files indicated. The globusrun script
has a FILE_STAGE_OUT option that could be used to save the files. Unless
FILE_STAGE_OUT is used, the user must save the files manually before submitting
another job; otherwise they will be overwritten.
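
If FILE_STAGE_OUT is not used, a short script run between jobs avoids losing the previous results. The following is only a sketch: the script name save_run.sh and the tag argument are illustrative, and the file list mirrors the one staged out by nina_mv.lsf.

#!/bin/sh
# Sketch: save the output files of the previous run under a tag before resubmitting.
# Usage example: ./save_run.sh test06
tag=$1
for f in hostnames.out flu.glbcpu flu.lclcpu times.dat rsitg.data; do
    [ -f "$f" ] && cp -p "$f" "$tag.$f"
done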

APPENDIX C
GRAPHICAL INTERFACE

The AEDIF graphical interface can be found at

dauphine.inria.fr: /net/home/swornom/AEDIF_Graphics.tar

AEDIF_Graphics contains these files:

Aerodia2d_to_Graphics.f
write_variables_VTK2d.c
Makefile_aedif2dg
aedif2dg.inp

Aerodia3d_to_Graphics_v2.f
write_variables_VTK.c
Makefile_aedif3dg
aedif3dg.inp

To create the executables:

make -f Makefile_aedif2dg (interface for the Murrone-Guillard two-dimensional code)
make -f Makefile_aedif3dg (interface for AEDIF)

cp either aedif2dg.inp and Makefile_aedif2dg (or aedif3dg.inp and Makefile_aedif3dg) to the directory containing the data.

The interface is executed by:

./aedif2dg.x or
./aedif3dg.x

and interactively answering the questions posed. The aedif2dg.inp or aedif3dg.inp file should be modified in advance if necessary. The user has the option to write the graphic data either for the Medit code of INRIA (http://www-rocq1.inria.fr/gamma/fra.htm) or for the ParaView graphics code (http://www.kitware.com). The advantage of ParaView over Medit is that multiple time step data can be processed in batch mode. This is very useful for data sets of 50-100 time steps used to create video animations. Several animations created during this study can be found at http://www-sop.inria.fr/smash/personnel/Stephen.Wornom/Stephen.Wornom-english.html.
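
When the answers to the interactive questions are known in advance, the interface can also be driven from a script so that a long series of time steps is processed without user input. The sketch below is purely illustrative: the prompts, the answers ($step and 2), and the range of time steps are assumptions, not part of the AEDIF distribution.

#!/bin/sh
# Sketch only: run the 3D interface over time steps 1-100 in batch mode.
for step in $(seq 1 100); do
    ./aedif3dg.x <<EOF
$step
2
EOF
done
# "$step" and "2" stand for the answers the interface would normally ask for
# interactively (e.g. the time step index and the output format); the real
# prompts must be checked before using such a loop.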

APPENDIX D
BATCH MESH PARTITIONER

AEDIF contains two directories
1) MeshPartitioner_Batch_script/
2) libs/

cd MeshPartitioner_Batch_script/
Follow the instructions in the README file.

APPENDIX E
PROCESSORS HOSTNAMES

The flu.data input file for the AEDIF code contains a hostname option (yes=1). Table 25 shows the processors and their hostnames as printed in the hostname.out file when this option is selected (a short script for summarizing this file is sketched after the table).

Table 25: Processor hostname information
Processor Hostname
0 nina08.inria.fr
1 nina08.inria.fr
2 nina06.inria.fr
3 nina06.inria.fr
4 node9.clustal.com
5 node8.clustal.com
6 node7.clustal.com
7 node6.clustal.com
8 node24.cemef
9 node23.cemef
10 node22.cemef
11 node20.cemef
12 pf8.inria.fr
13 pf8.inria.fr
14 pf3.inria.fr
15 pf3.inria.fr
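
A quick way to check how the processors are spread over the clusters is to count the hostnames written in hostname.out. The following one-liner is a sketch that assumes the hostname is the last field of each line, as in Table 25; the exact layout of the file should be verified first.

# Count the processors per cluster (domain part of the hostname).
awk '{print $NF}' hostname.out | sed 's/^[^.]*\.//' | sort | uniq -c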


Table 26 shows the neighbors of processor "0" and their hostnames.

Table 26: Processor hostname information
My Processor             My Neighbors
CPU  Hostname            CPU  Hostname
0    nina08.inria.fr      1   nina08.inria.fr
0    nina08.inria.fr      8   node24.cemef
0    nina08.inria.fr      9   node23.cemef
0    nina08.inria.fr     10   node22.cemef
0    nina08.inria.fr     12   pf8.inria.fr


Table 27 shows the message passing between and within clusters at one time step for nprocs = 16. The total number of messages passed is 108.

Table 27: Messages passed between clusters
Cluster names Number of messages
nina-nina 2
pf-pf 6
cemef-cemef 6
iusti-iusti 10
nina-pf 5
nina-cemef 7
nina-iusti 5
pf-cemef 7
pf-iusti 11
cemef-iusti 7


APPENDIX F
TIME ANALYSIS-I

The time analysis is found in the flu.glbcpu and flu.lclcpu files written at ktsav intervals. The non-GLOBUS run script saves these files. The globusrun script has a FILE_STAGE_OUT option that could be used to save the files but has not been used. Unless FILE_STAGE_OUT is used, the user must save the files manually before submitting another job; otherwise they will be overwritten.

The minimum, maximum, and average times over all the processors are computed. The average time is the sum of the individual processor times divided by the number of processors. MPI passes the fluxes, the time step, and the gradients (GRD) between partitions; the gradients contain the most data. A small script for extracting these timings is sketched after the listing below.

 -------------------------------------------------------------
globus_24_pf_12_cemef_12_run3.flu.glbcpu
 -------------------------------------------------------------
 Number of time steps            :    10
 Number of solution saves        :     1
  
             
 Values of local CPU times       :    1:MIN - 2:MAX - 3:AVRG
 -------------------------------------------------------------
 Wait time to get all needed CPUs:   7.058    22.328    15.205
 Total simulation time           : 795.915   811.187   804.063
 Problem setup time              : 163.527   178.799   171.675
 Total computational time        : 632.386   632.389   632.388
 Write local solution files      :   0.000     0.000     0.000
 Write global solution files     :   0.036   123.378     5.210
 -------------------------------------------------------------
 Total Computational time (Tcomp): 509.011   632.353   627.178
 Total Communication time (Tcomm): 244.473   476.911   442.616
 Twork = Tcomp - Tcomm           : 155.477   387.916   189.772
 Tcomm/Twork                     :   0.630     3.067     0.026
 -------------------------------------------------------------
 Global intra-communication time :  68.562   234.007   164.888
 Local  intra-communication time :  43.296   321.118   277.728
 -------------------------------------------------------------
 Explicit convective fluxes      :  51.132    82.393    56.942
 Explicit nodal gradients        :  25.299    43.051    34.956
 -------------------------------------------------------------
 Local intra-comm. transfer rates
    Dt  transfer rate (Mbps)     :   0.535     6.303     1.363
    Grd transfer rate (Mbps)     :   0.907     8.054     2.098
    Flx transfer rate (Mbps)     :   0.857     8.167     1.972
 -------------------------------------------------------------
 Local intra-communication times (sec)
    Dt  time                     :   0.651     7.650     5.174
    Grd time                     :  32.095   228.644   200.591
    Flx time                     :  10.550    86.602    71.963
 -------------------------------------------------------------
 Other local comput. and I/O time:   3.560     5.234     4.453
 Mesh motion and metrics update  :   0.000     0.000     0.000
 KtGrd10                         :  10.000    10.000    10.000
 -------------------------------------------------------------
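
When several runs are compared, the lines of interest can be extracted from the flu.glbcpu files with a small script instead of reading each file by hand. The sketch below assumes the layout shown in the listing above, i.e. a label, a colon, and then the MIN, MAX, and AVRG columns; the script name get_time.sh is illustrative.

#!/bin/sh
# Sketch: print the MIN/MAX/AVRG values of one timing line from flu.glbcpu files.
# Usage example: ./get_time.sh "Total Communication time" *.flu.glbcpu
label=$1; shift
for f in "$@"; do
    printf "%s: " "$f"
    grep "$label" "$f" | awk -F: '{print $2}'
done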

APPENDIX G
NEED FOR A COHERENT GRID COMPUTING POLICY

The need for a coherent Grid policy becomes evident when attempting to analyze Grid communication speeds with the Linux ping tool. Table 28 shows the results of ping tests between the different MecaGRID member sites. Knowing the routes allows additional ping tests to determine the transfer rates of the individual routers involved. A script automating such tests is sketched at the end of this appendix.


Table 28: Ping tests between MecaGRID sites.
From   To     Status      Routes
CEMEF  INRIA  Successful  node10.cemef (192.168.8.110)
                          192.168.101.12
                          cluster.inria.fr (193.51.209.126)
                          cluster.inria.fr (193.51.209.126)
                          sarek.cemef (192.168.8.152)
                          node10.cemef (192.168.8.110)
CEMEF  IUSTI  Failed      unknown
IUSTI  CEMEF  Failed      unknown
IUSTI  INRIA  Failed      unknown


Additional information is given below:

Example 1: ping nina01 from the INRIA frontend cluster

IP address of nina01 is 193.51.209.36

ping -c 100 -R 193.51.209.36

Comment: The -R option shows the routes involved.

PING 193.51.209.36 (193.51.209.36) from 193.51.209.126 :
RR:
cluster.inria.fr (193.51.209.126)
nina01.inria.fr (193.51.209.36)
nina01.inria.fr (193.51.209.36)
cluster.inria.fr (193.51.209.126)

64 bytes from 193.51.209.36: icmp_seq=0 ttl=64 time=0.1 ms
64 bytes from 193.51.209.36: icmp_seq=1 ttl=64 time=0.1 ms
64 bytes from 193.51.209.36: icmp_seq=2 ttl=64 time=0.1 ms
64 bytes from 193.51.209.36: icmp_seq=3 ttl=64 time=0.1 ms

193.51.209.36 ping statistics

100 packets transmitted, 100 packets received, 0% packet loss

round-trip min/avg/max = 0.0/0.1/0.5 ms

Positive: The routes are printed.
Negative: The times are reported with only one decimal place.

Example 2: ping nina01 from CEMEF node10

PING nina01.inria.fr (193.51.209.36) from 192.168.8.110

RR:
node10.cemef (192.168.8.110)
192.168.101.12
cluster.inria.fr (193.51.209.126)
nina01.inria.fr (193.51.209.36)
nina01.inria.fr (193.51.209.36)
192.168.101.21
sarek.cemef (192.168.8.152)
node10.cemef (192.168.8.110)

64 bytes from nina01.inria.fr (193.51.209.36): time=19.821 msec
64 bytes from nina01.inria.fr (193.51.209.36): time=10.425 msec
64 bytes from nina01.inria.fr (193.51.209.36): time=20.509 msec
64 bytes from nina01.inria.fr (193.51.209.36): time=20.590 msec
64 bytes from nina01.inria.fr (193.51.209.36): time=20.683 msec
64 bytes from nina01.inria.fr (193.51.209.36): time=20.747 msec
...

-- nina01.inria.fr ping statistics --

100 packets transmitted, 100 packets received, 0% packet loss

round-trip min/avg/max/mdev = 5.087/14.752/22.962/4.909 ms
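
The measurements above can be automated with a small script that pings a list of hosts from the local site and records both the routes and the round-trip statistics. The host list and the output file name in this sketch are illustrative; only hosts actually reachable from the local site (see Table 28) will answer.

#!/bin/sh
# Sketch: record ping routes and statistics from this site to a list of MecaGRID hosts.
for host in nina01.inria.fr cluster.inria.fr 192.168.8.110; do
    echo "=== ping $host ===" >> ping_report.txt
    ping -c 100 -R $host >> ping_report.txt 2>&1
done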


Stephen Wornom 2004-09-10