Table 4 compares non-Globus and Globus performance for the AGARD swept wing test case on the INRIA-nina cluster. The Globus performance is slightly better than the non-Globus performance. The non-Globus MPI uses the MPICH ch_p4 device, whereas the Globus MPI uses the globus2 device. The small differences in performance are due to slightly different configure options.
Global inter-communication occurs, for example, when the maximum, minimum, or sum of the values of a variable is computed over all the processors. Local inter-communication occurs when messages are passed between two processors. The total computation time includes the inter/intra-communication times but not setup times (reading data, meshes, initialization, etc.). The times shown in all tables are in seconds and are the maximum values over all the processors. Times to save intermediate solutions are not taken into account. The Communication/Work ratio is the sum of the local and global communication times divided by the Work, that is, the total computational time minus the communication time.¹ The minimum and average Communication/Work ratios are much smaller than the values reported.
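The Communication/Work definition above can be illustrated with a minimal sketch. It assumes the ratio is formed on each processor and then reduced to a maximum, minimum, and average over processors; the text does not spell out that order, so this is one plausible reading. The function name `comm_work_ratio` and the timings are hypothetical and are not taken from Table 4.

```python
from statistics import mean


def comm_work_ratio(local_comm, global_comm, total):
    """Communication/Work ratio for one processor (all times in seconds).

    `total` is the total computation time, which already includes the
    communication time; Work is what remains after subtracting it.
    """
    comm = local_comm + global_comm
    work = total - comm
    if work <= 0:
        raise ValueError("total time must exceed communication time")
    return comm / work


# Hypothetical per-processor timings (illustrative only, not measured data):
# (local communication, global communication, total computation time)
timings = [
    (2.0, 1.0, 103.0),
    (4.0, 1.0, 105.0),
    (9.0, 1.0, 110.0),
]

# Assumption: one ratio per processor, then reduce over processors.
ratios = [comm_work_ratio(*t) for t in timings]
summary = {"max": max(ratios), "min": min(ratios), "mean": mean(ratios)}

for name, value in summary.items():
    print(f"{name} Communication/Work ratio: {value:.3f}")
```

With uneven communication loads such as these, the maximum ratio is set by the most communication-bound processor, which is why minimum and average ratios can sit well below it.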
The non-Globus and Globus computational times are approximately the same on the nina cluster. Without testing each individual cluster, we hypothesize that the non-Globus and Globus times will be approximately the same on the other clusters as well.