We would definitely be interested in hearing about hands-on experience with very large or complex systems. Indeed, our goal is to provide a venue where people who have experienced the reality of large scale can meet and exchange ideas, so that they can avoid reinventing the wheel again and again.
In a previous discussion, someone pointed out that the real problem is not processors but rather managing users and institutions. We could also add: making long-term projects out of short-term budget commitments. In this respect, grids are not that different from anything else, but since Foster and Kesselman's seminal comparison with the power grid, people have somehow collectively dreamed that they could be different.
Again, the goal is not to provide an extensive technical review of the projects, but rather to explain what made possible the miracle that such a complex technical, institutional, human and financial organization works over the long term.
In this talk, we will first present IBM's approach to commercial grid computing, illustrating it with real customer cases. We will then introduce IBM's strategic directions for grid computing and describe research projects aimed at developing new capabilities to fulfill enterprise grid requirements.
This presentation gives a brief survey of the major achievements of one of the earliest production grids in Europe, the Hungarian ClusterGrid Infrastructure. Coordinated by NIIF/HUNGARNET, it is the largest national grid system in Hungary. It uses the free CPU cycles of desktop PCs in institutional labs: the machines are used for education (on Microsoft Windows) during official working hours and for supercomputing (usually on Linux) during less busy periods, typically nights and weekends. The ClusterGrid infrastructure comprises 1100 compute nodes at 26 interconnected institutions (universities and research institutes), together with cost-effective storage, and serves over 200 academic users. The system went into production, at a smaller scale, in 2002. The presentation will also address the following questions: What does the software infrastructure stack look like? How is such a large number of PCs managed by a small number of people? How is the whole system maintained across more than 1000 nodes? How is platform independence guaranteed? In which ways are users supported? In conclusion, a brief summary will be given of the production grid experience gathered over the past 3 years, along with some remarks on future development plans.
References: http://www.clustergrid.hu