Next: Two-phase flow experiments
Up: Test case: AGARD swept
Previous: Individual cluster performance
Table 6 shows the performance on the different clusters of the MecaGRID for the AGARD test case, for various combinations of the clusters and for both the implicit and the explicit solvers, relative to the nina times.
For both the implicit and explicit solvers the performance degrades significantly
for inter-cluster computations.
Inter/intra-cluster computations are indicated by 4-4; thus, for
example, nina-cemef means 4 processors on nina and 4 processors on the CEMEF cluster.
Table 7:
AGARD: Inter/intra-cluster explicit solver performance

| Name of cluster          | nina-pf | nina-cemef | nina-iusti | iusti-cemef |
|--------------------------|---------|------------|------------|-------------|
| Run type                 | Globus  | Globus     | Globus     | Globus      |
| Processor speed          | 2/1 GHz | 2/1 GHz    | 2/2 GHz    | 2/1 GHz     |
| Cache (KB)               | 512/256 | 512/256    | 512/512    | 512/256     |
| Executable size          | 236 MB  | 236 MB     | 236 MB     | 236 MB      |
| Number of processors     | 4-4     | 4-4        | 4-4        | 4-4         |
| Total computational time | 208.6   | 1570.6     | 1438.4     | 1990.3      |
| Local inter-comm. time   | 50.4    | 1137.9     | 1178.0     | 1188.0      |
| Global inter-comm. time  | 6.3     | 158.9      | 159.3      | 146.0       |
| Computational ratio      | 2.4     | 17.9       | 16.4       | 22.7        |
| Communication/work       | 0.37    | 4.73       | 13.23      | 2.03        |
Table 7 shows some of the
inter-cluster and intra-cluster (nina-pf) performances using the explicit solver,
where it is seen that the
inter-cluster performances are degraded by large local inter-communication and
global communication times.
This may be due in part to the large physical distance between
Sophia Antipolis and Marseille, where different routers and networks may be involved.
The problem is certainly aggravated by the small mesh,
for which most of the time is spent in message passing rather than processor work.
Lastly, another factor, impossible to evaluate in the present MecaGRID configuration, is
the efficiency of the VPN.
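The small-mesh effect can be illustrated with a rough surface-to-volume model of the communication/work ratio. This is a sketch under assumed parameters (flop rate, link latency, bandwidth, and bytes exchanged per interface vertex are illustrative choices, not values measured on the MecaGRID):

```python
# Rough model of the communication/work ratio for a partitioned 3D mesh.
# All numeric parameters are assumptions chosen to mimic a slow
# inter-cluster VPN link; none are measured MecaGRID values.

def comm_to_work_ratio(n_vertices, n_procs, t_flop=1e-8,
                       flops_per_vertex=1000, latency=1e-3,
                       bytes_per_iface_vertex=200, bandwidth=1e6):
    """Estimate (communication time) / (work time) per time step.

    A roughly cubic partition of n_vertices / n_procs local vertices
    exchanges data on an interface of about (local)**(2/3) vertices,
    so the ratio shrinks as the mesh grows.
    """
    local = n_vertices / n_procs
    work = local * flops_per_vertex * t_flop
    iface = local ** (2.0 / 3.0)
    comm = latency + iface * bytes_per_iface_vertex / bandwidth
    return comm / work

# On 8 processors, the 22K-vertex AGARD mesh spends more time
# communicating than computing, while a 100x larger mesh does not.
print(f"22K mesh : comm/work = {comm_to_work_ratio(22_000, 8):.2f}")
print(f"2.2M mesh: comm/work = {comm_to_work_ratio(2_200_000, 8):.2f}")
```

Because interface size grows only as the 2/3 power of the local mesh size, the same slow link that dominates the 22K-vertex run would matter far less on a larger mesh, which is why these results should not be extrapolated directly.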
Regarding the AGARD test case with the 22K-vertex mesh, the following observations are made:
- In general, inter-cluster Grid performance was poor for both the implicit and the explicit solvers,
indicating that small meshes are not suitable for inter-cluster Grid applications.
For small meshes the communication time between
processors is larger than the total processor work time; therefore one must be careful in extrapolating
these results to larger meshes.
- The explicit time scheme shows better inter-cluster
Grid performance than the implicit time scheme.
This is perhaps a result of the matrix-vector product iterations in the implicit solver,
which may add more communication time than processor work on small meshes; on
larger meshes the opposite may hold.
However, for computing steady-state flows such as the AGARD swept wing test case,
the total CPU time for the implicit scheme is much
less than for the explicit scheme.
- Inter-cluster Grid computations involving the pf-CEMEF cluster combination
and the nina-CEMEF combination give approximately the same performance.
- Using the explicit solver, the CEMEF cluster showed slightly better performance
than the INRIA pf cluster, although equivalent performance was expected.
- In general, inter-cluster Grid computations involving the CEMEF cluster were the least efficient
of the inter-cluster computations.
The reasons for the poor performance are discussed in section 9.1.
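The explicit/implicit trade-off noted above can be sketched by counting halo exchanges per run: the implicit scheme pays one exchange per linear-solver iteration inside each time step, but needs far fewer steps to reach steady state. The step and iteration counts below are hypothetical illustrations, not values taken from the AGARD runs:

```python
# Illustrative count of halo (inter-processor) exchanges needed to
# reach steady state. All counts are assumed for illustration only.

def total_exchanges(n_steps, exchanges_per_step):
    """Total halo exchanges over a run."""
    return n_steps * exchanges_per_step

# Explicit: one exchange per time step, but small stable time steps
# mean many steps are needed to converge.
explicit_exchanges = total_exchanges(n_steps=20_000, exchanges_per_step=1)

# Implicit: each step also pays one exchange per matrix-vector product
# in the linear solve (say 30 iterations), but large CFL numbers allow
# far fewer time steps.
implicit_exchanges = total_exchanges(n_steps=500, exchanges_per_step=30)

print(explicit_exchanges, implicit_exchanges)
```

On a small mesh each implicit exchange is expensive relative to the little work between exchanges, which can erase the implicit scheme's advantage in exchange count; on a large mesh, the fewer total steps and exchanges favor the implicit scheme for steady-state problems.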
Stephen Wornom
2004-09-10