PicsouGrid/Grid5000 Benchmark Results 2007-03-14

Other Results

Description

On 13 March 2007 I submitted 12 multi-node tasks with reservation to all the available Grid5000 clusters (Grenoble was not available), each one trying to use at least 20 nodes. The reservation was for 6:05am the next day, Wednesday 14 March. 9 clusters at 7 sites ran the jobs successfully. After resubmission attempts later the same day, 2 more clusters also completed the test.

Each task was a set of simple Monte Carlo simulations, where each one takes about 90 seconds on a "standard" desktop machine. Once the task started on the cluster, it spread to all nodes in the OAR_NODELIST and then forked one process per CPU.
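As a rough illustration of the "one process per CPU" step, the sketch below counts how many worker processes each host would get. It assumes the node list is a plain-text file with one hostname per line, where a host is repeated once per CPU; the function name and the sample hostnames are hypothetical and are not taken from the actual benchmark scripts.

```python
from collections import Counter


def processes_per_host(nodefile_text):
    """Return a mapping of hostname -> number of processes to fork.

    Assumption: one hostname per line, repeated once per CPU.
    Blank lines are ignored.
    """
    hosts = [line.strip() for line in nodefile_text.splitlines() if line.strip()]
    return Counter(hosts)


# Hypothetical node list: two hosts, each listed twice (two CPUs each).
sample = "node-1\nnode-1\nnode-2\nnode-2\n"
plan = processes_per_host(sample)
for host, count in sorted(plan.items()):
    print(host, count)
```

With this layout, a host that is listed fewer times than it has usable processors simply gets fewer processes, which matters for the Bordeaux discussion further down.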

The graphs below show a blue box for the cluster occupancy, which is the time from the task starting on a worker node until the last node in the nodelist has returned its results. The grey boxes signify either the task queue time or the task data stage-out time. The red boxes show the worker node occupancy, one box per worker node. The black lines show the life-lines of the core Monte Carlo algorithm.
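The two occupancy measures can be sketched as follows. This is only an illustration: the record layout and the timestamps are hypothetical, and it assumes that cluster occupancy is measured from the earliest start on any worker node to the latest return from any node.

```python
from dataclasses import dataclass


@dataclass
class NodeTiming:
    name: str
    started: float   # time (seconds) the task started on this node
    returned: float  # time (seconds) this node returned its results


def node_occupancy(timing):
    """Worker node occupancy (one red box): start to return for a single node."""
    return timing.returned - timing.started


def cluster_occupancy(timings):
    """Cluster occupancy (the blue box): earliest start to latest return."""
    first_start = min(t.started for t in timings)
    last_return = max(t.returned for t in timings)
    return last_return - first_start


# Hypothetical timings for two nodes.
timings = [
    NodeTiming("node-1", started=10.0, returned=100.0),
    NodeTiming("node-2", started=12.0, returned=130.0),
]
print(cluster_occupancy(timings))
print([node_occupancy(t) for t in timings])
```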

Images

(note: these are high resolution, in order to see the life-lines for individual cores)


Discussion

This is a significant improvement over the results from 7 March. Only Lille showed nodes (3 of them) with clocks that were one hour off. It is interesting to see that Bordeaux has such a big performance difference between different nodes. I need to look at the log files to see what the actual CPUs are behind these different nodes, and to check the node load level during the benchmark. It is also interesting to see the delay in start time, given the reservation. The reservation window was set for 20 minutes (although I think a bug meant this was interpreted as 20 hours).

For Bordeaux, the first several nodes have dual AMD Opteron 248 processors (2.2 GHz). The remainder have dual Intel Xeon 3 GHz processors with Hyper-Threading enabled (4 virtual processors). This means the benchmarks here aren't really "fair". Furthermore, the nodefile supplied by OAR only listed the number of physical processors for the Xeons. A rough "correction" would be to divide the Bordeaux results by two, although that would also wrongly halve the results for the 11 (of 51) nodes which use the AMD processors. Obviously my benchmark strategy needs to be improved here.
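One way to avoid halving the AMD nodes as well would be to apply the rough correction per node rather than to the whole cluster. The sketch below is only an illustration of that idea: the field names and the result values are hypothetical placeholders (not measured data), and it assumes the factor for each node is the ratio of its virtual processors to the processors listed for it in the nodefile.

```python
def correction_factor(virtual_cpus, listed_cpus):
    """Ratio of virtual processors to processors listed in the nodefile."""
    return virtual_cpus / listed_cpus


def corrected_result(node):
    """Divide a node's result by its own factor instead of a blanket 2."""
    factor = correction_factor(node["virtual_cpus"], node["listed_cpus"])
    return node["result"] / factor


# Placeholder values only; "result" stands for whatever per-node figure is compared.
nodes = [
    # Dual Opteron: 2 processors listed, 2 available, so factor 1 (unchanged).
    {"name": "amd-node", "listed_cpus": 2, "virtual_cpus": 2, "result": 100.0},
    # Dual Xeon with Hyper-Threading: 2 listed, 4 virtual, so factor 2 (halved).
    {"name": "xeon-node", "listed_cpus": 2, "virtual_cpus": 4, "result": 100.0},
]
corrected = [corrected_result(n) for n in nodes]
print(corrected)
```

This is still a crude adjustment, since two virtual processors on one Hyper-Threaded core are not equivalent to two physical processors.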

Follow up

If you have any questions or would like the raw data or images in a better format, email me at Ian.Stokes-Rees _AT_ inria.fr.