Next: Two-phase flow experiments
Up: Test case: AGARD swept
Previous: Individual cluster performance
Table 6 shows the performance on the different clusters of the MecaGRID for the AGARD test case, for various combinations of the clusters and for both the implicit and the explicit solvers, relative to the nina times.
For both the implicit and explicit solvers the performance degrades significantly
for inter-cluster computations.
Inter/intra-cluster computations are indicated by 4-4; thus, for
example, nina-cemef means 4 processors on nina and 4 processors on the CEMEF cluster.
Table 7:
AGARD: Inter/intra-cluster explicit solver performance

| Name of cluster          | nina-pf | nina-cemef | nina-iusti | iusti-cemef |
|--------------------------|---------|------------|------------|-------------|
| Run type                 | Globus  | Globus     | Globus     | Globus      |
| Processor speed          | 2/1 GHz | 2/1 GHz    | 2/2 GHz    | 2/1 GHz     |
| Cache (KB)               | 512/256 | 512/256    | 512/512    | 512/256     |
| Executable size          | 236 MB  | 236 MB     | 236 MB     | 236 MB      |
| Number of processors     | 4-4     | 4-4        | 4-4        | 4-4         |
| Total computational time | 208.6   | 1570.6     | 1438.4     | 1990.3      |
| Local inter-comm. time   | 50.4    | 1137.9     | 1178.0     | 1188.0      |
| Global inter-comm. time  | 6.3     | 158.9      | 159.3      | 146.0       |
| Computational ratio      | 2.4     | 17.9       | 16.4       | 22.7        |
| Communication/work       | 0.37    | 4.73       | 13.23      | 2.03        |
Table 7 shows some of the
inter-cluster and intra-cluster (nina-pf) performances using the explicit solver,
where it is seen that the
inter-cluster performances are degraded by large local inter-communication and
global communication times.
This may be due in part to the large physical distance between
Sophia Antipolis and Marseille, where different routers and networks may be involved.
The problem is certainly aggravated by the small mesh,
for which most of the time is spent in message passing rather than processor work.
Lastly, another factor, impossible to evaluate in the present MecaGRID configuration, is
the efficiency of the VPN.
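The small-mesh effect can be illustrated with a rough surface-to-volume model of the communication/work ratio. This is a sketch under assumed parameters (flop rate, link latency, bandwidth, and bytes exchanged per interface vertex are illustrative choices, not values measured on the MecaGRID):

```python
# Rough model of the communication/work ratio for a partitioned 3D mesh.
# All numeric parameters are assumptions chosen to mimic a slow
# inter-cluster VPN link; none are measured MecaGRID values.

def comm_to_work_ratio(n_vertices, n_procs, t_flop=1e-8,
                       flops_per_vertex=1000, latency=1e-3,
                       bytes_per_iface_vertex=200, bandwidth=1e6):
    """Estimate (communication time) / (work time) per time step.

    A roughly cubic partition of n_vertices / n_procs local vertices
    exchanges data on an interface of about (local)**(2/3) vertices,
    so the ratio shrinks as the mesh grows.
    """
    local = n_vertices / n_procs
    work = local * flops_per_vertex * t_flop
    iface = local ** (2.0 / 3.0)
    comm = latency + iface * bytes_per_iface_vertex / bandwidth
    return comm / work

# On 8 processors, the 22K-vertex AGARD mesh spends more time
# communicating than computing, while a 100x larger mesh does not.
print(f"22K mesh : comm/work = {comm_to_work_ratio(22_000, 8):.2f}")
print(f"2.2M mesh: comm/work = {comm_to_work_ratio(2_200_000, 8):.2f}")
```

Because interface size grows only as the 2/3 power of the local mesh size, the same slow link that dominates the 22K-vertex run would matter far less on a larger mesh, which is why these results should not be extrapolated directly.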
Regarding the AGARD test case with the 22K-vertex mesh, the following observations are made:
- In general, inter-cluster Grid performance was poor for both the implicit and the explicit solvers,
indicating that small meshes are not suitable for inter-cluster Grid applications.
For small meshes the communication time between
processors is larger than the total processor work time; therefore one must be careful in extrapolating
these results to larger meshes.
- The explicit time scheme shows better inter-cluster
Grid performance than the implicit time scheme.
This is perhaps a result of the matrix-vector product iterations in the implicit solver,
which may add more communication time than processor work on small meshes; on
larger meshes the opposite may hold.
However, for computing steady-state flows such as the AGARD swept wing test case,
the total CPU time for the implicit scheme is much
less than for the explicit scheme.
- Inter-cluster Grid computations involving the pf-CEMEF cluster combination
and the nina-CEMEF combination give approximately the same performance.
- Using the explicit solver, the CEMEF cluster showed slightly better performance
than the INRIA pf cluster, although equivalent performance was expected.
- In general, inter-cluster Grid computations involving the CEMEF cluster were the least efficient
of the inter-cluster computations.
The reasons for the poor performance are discussed in section 9.1.
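The explicit/implicit trade-off noted above can be sketched by counting halo exchanges per run: the implicit scheme pays one exchange per linear-solver iteration inside each time step, but needs far fewer steps to reach steady state. The step and iteration counts below are hypothetical illustrations, not values taken from the AGARD runs:

```python
# Illustrative count of halo (inter-processor) exchanges needed to
# reach steady state. All counts are assumed for illustration only.

def total_exchanges(n_steps, exchanges_per_step):
    """Total halo exchanges over a run."""
    return n_steps * exchanges_per_step

# Explicit: one exchange per time step, but small stable time steps
# mean many steps are needed to converge.
explicit_exchanges = total_exchanges(n_steps=20_000, exchanges_per_step=1)

# Implicit: each step also pays one exchange per matrix-vector product
# in the linear solve (say 30 iterations), but large CFL numbers allow
# far fewer time steps.
implicit_exchanges = total_exchanges(n_steps=500, exchanges_per_step=30)

print(explicit_exchanges, implicit_exchanges)
```

On a small mesh each implicit exchange is expensive relative to the little work between exchanges, which can erase the implicit scheme's advantage in exchange count; on a large mesh, the fewer total steps and exchanges favor the implicit scheme for steady-state problems.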
Stephen Wornom
2004-09-10