Master 1 Computer Science	Distributed Programming and N-Tier Architecture

Tutorial and Lab Session No. 5: ProActive - 2
Duration: 3 hours

Denis Caromel, Brian Amedro
Université Nice-Sophia Antipolis, Département Informatique

1. ProActive deployment

1.1. Deployment Related Concepts

Download this archive.

A first principle is to fully eliminate from the source code the following elements:

machine names
creation protocols
registry lookup protocols

The goal is to deploy any application anywhere without changing the source code. For instance, we must be able to use various protocols, rsh, ssh, Globus, LSF, etc., for the creation of the JVMs needed by the application. In the same manner, the discovery of existing resources or the registration of the ones created by the application can be done with various protocols such as RMIregistry, Globus etc. Therefore, we see that the creation, registration and discovery of resources have to be done externally to the application.

A second key principle is the capability to abstractly describe an application, or part of it, in terms of its conceptual activities. The description should indicate the various parallel or distributed entities in the program. For instance, an application that is designed to use three interactive visualization nodes, a node to capture input from a physics experiment, and a simulation engine designed to run on a cluster of machines should somewhere clearly advertise this information.

Note, however, that the abstract description of an application and the way to deploy it are not independent pieces of information. If, for example, we have a simulation engine, it might register with a specific registry protocol, and if so, the other entities of the computation might have to use the matching lookup protocol to bind to the engine. Moreover, one part of the program can either just look up the engine (assuming it is started independently) or explicitly create the engine itself. To summarize, in order to abstract away the underlying execution platform, and to allow a source-independent deployment, a framework has to provide the following elements:

an abstract description of the distributed entities of a parallel program or component,
an external mapping of those entities to real machines, using actual creation, registry, and lookup protocols.

To reach that goal, the programming model relies on the specific notion of Virtual Nodes (VNs):

a VN is identified by a name (a simple string)
a VN is used in a program source
a VN is defined and configured in a deployment descriptor (XML)
a VN, after activation, is mapped to one or to a set of actual ProActive Nodes

Of course, distributed entities (Active Objects) are created on Nodes, not on Virtual Nodes. There is a strong need for both Nodes and Virtual Nodes. Virtual Nodes are a much richer abstraction, as they provide mechanisms such as set or cyclic mapping. Another key aspect is the capability to describe and trigger the mapping of a single VN that generates the allocation of several JVMs. This is critical if we want to obtain, in a single step, machines from a cluster of PCs managed through Globus or LSF. It is even more critical in a Grid application, when trying to achieve the co-allocation of machines from several clusters across several continents.
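The separation between a name used in the source and the machines chosen at deployment time can be illustrated in plain Java, without the ProActive library. In this minimal sketch every name is hypothetical (the class VirtualNodeMappingSketch, the "Agent" virtual node, the node URLs): a table stands in for the XML descriptor, and the application side only ever mentions the virtual node name, receiving concrete nodes in a cyclic fashion.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; this is NOT the ProActive API.
final class VirtualNodeMappingSketch {
    // Stand-in for the deployment descriptor: virtual node name -> node URLs.
    private final Map<String, List<String>> mapping = new HashMap<>();
    // Next position to hand out for each virtual node (cyclic mapping).
    private final Map<String, Integer> cursor = new HashMap<>();

    // "Descriptor" side: the only place where concrete nodes are named.
    void map(String virtualNode, List<String> nodeUrls) {
        if (nodeUrls.isEmpty()) {
            throw new IllegalArgumentException("a virtual node needs at least one node");
        }
        mapping.put(virtualNode, new ArrayList<>(nodeUrls));
        cursor.put(virtualNode, 0);
    }

    // "Application" side: asks for a node by virtual node name only.
    String nextNode(String virtualNode) {
        List<String> nodes = mapping.get(virtualNode);
        if (nodes == null) {
            throw new IllegalArgumentException("unknown virtual node: " + virtualNode);
        }
        int i = cursor.get(virtualNode);
        cursor.put(virtualNode, (i + 1) % nodes.size()); // wrap around
        return nodes.get(i);
    }

    // Chooses a node for each of `count` objects created on the virtual node.
    static List<String> place(VirtualNodeMappingSketch descriptor, String virtualNode, int count) {
        List<String> chosen = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            chosen.add(descriptor.nextNode(virtualNode));
        }
        return chosen;
    }

    public static void main(String[] args) {
        VirtualNodeMappingSketch descriptor = new VirtualNodeMappingSketch();
        List<String> nodes = new ArrayList<>();
        nodes.add("//hostA/Node1"); // hypothetical node URLs
        nodes.add("//hostB/Node2");
        descriptor.map("Agent", nodes);
        System.out.println(place(descriptor, "Agent", 3));
    }
}
```

Changing the machines only requires changing the arguments to map(...), which plays the role of editing the descriptor; the place(...) call, standing for the application code, is untouched.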

Moreover, a Virtual Node is a concept of a distributed program or component, while a Node is actually a deployment concept: it is an object that lives in a JVM, hosting Active Objects. There is of course a correspondence between Virtual Nodes and Nodes: the function created by the deployment, the mapping. This mapping is specified in the Application Descriptor. The grid facilities are described in two deployment descriptors, separated according to the different concerns of the application developer and the grid infrastructure administrator. In the grid deployment descriptor we describe:

the resources provided by the infrastructure
how to acquire the resources provided by the infrastructure

In the application deployment descriptor we describe:

how to launch the application
the resources needed by the application
the resource providers

Look at and study the descriptor files GCMD_Local.xml, GCMD_SSH.xml and GCMA.xml.

Q	Write a Deployment Descriptor and an Application Descriptor file to deploy your ProActive application onto the classroom's machines.

Now, we will see how to start a monitoring agent on a remote machine using the deployment methods explained previously. To be able to deploy on remote machines we just have to use the deployment file, add a method that tells ProActive to activate the nodes used and tell the active object to start on the remote node.

We will use different classes:

org.objectweb.proactive.extensions.gcmdeployment.PAGCMDeployment - used to create a GCMApplication from an Application Descriptor
org.objectweb.proactive.gcmdeployment.GCMApplication - represents the application which is being deployed
org.objectweb.proactive.core.ProActiveException - used to catch exceptions
org.objectweb.proactive.gcmdeployment.GCMVirtualNode - used to control and instantiate virtual node objects
org.objectweb.proactive.core.node.Node - used to control and instantiate node objects

We will change the Main class to declare and load the deployment descriptors to be used. For this we will use a deploy() method that returns a Virtual Node, which has several nodes (as specified in the deployment file) that we can deploy on. First, the method creates an object representation of the deployment file, then activates all the nodes, and then returns the first available node. We also have to change the deployment descriptor files to fit our local settings.

Q	Complete the following steps:

Create GCMApplication from the application descriptor file using PAGCMDeployment.loadDescriptor(String descriptor)
Start the deployment of all the virtual nodes using GCMApplication.startDeployment()
Get a virtual node using GCMApplication.getVirtualNode(String virtualNodeName)
Create a virtual node object using the deployment method
Create the active object using a node from the virtual node
Get and print the state from the active object
Stop the application using GCMApplication.kill()
Change the State class so the initialization of the variables takes place in the toString() method. Run the deployed application again and explain the different results.

2. Groups of Monitoring Agents

Now, we will show how to use groups of active objects. We will create several active objects that we add to and remove from a group. The group will be used to retrieve a State object from every active object in the group. To ease the use of group communication, ProActive provides a set of static methods in the PAGroup class and a set of methods in the Group interface. ProActive also provides typed group communication, meaning that only methods defined on classes or interfaces implemented by members of the group can be called. There are several ways to create groups of active objects. As with active objects, there is instantiation-based creation and object-based creation. Instantiation-based creation is done through newGroup(..) and newGroupInParallel(..), while object-based creation is done through turnActiveAsGroup(...).

To deal with ProActive Groups, we will have to work with two main classes:

org.objectweb.proactive.api.PAGroup - used to create a group of active objects
org.objectweb.proactive.core.group.Group - used to control the group of objects

We only need to modify the Main class to create the group of objects. To instantiate the active object we will use the CMAgentInitialized class that we have defined previously.

Q	Complete the following steps:

Create a new empty group using PAGroup.newGroup(..)
Create a collection of active objects with one object on each node
Get a management representation of the monitors group using the Group interface
Print the Node URL using PAActiveObject.getActiveObjectNodeUrl(...)
Use PAGroup.waitAndGetOneThenRemoveIt() to control the list of State futures
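The idea of consuming replies one at a time, in whatever order they become available, can be approximated in plain Java. The sketch below uses java.util.concurrent rather than ProActive, so it does not reproduce the semantics of PAGroup; the class GroupResultsSketch, the stateOf(...) helper, and the node names are all hypothetical stand-ins for the monitoring agents and their State objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-Java analogy only; this is NOT the ProActive group API.
final class GroupResultsSketch {
    // Hypothetical stand-in for asking one monitoring agent for its state.
    static String stateOf(String nodeUrl) {
        return "state@" + nodeUrl;
    }

    // Sends the same request to every member, then consumes the replies
    // in completion order, removing each one from the pending set.
    static List<String> collectStates(List<String> nodeUrls) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nodeUrls.size()));
        CompletionService<String> replies = new ExecutorCompletionService<>(pool);
        try {
            for (String url : nodeUrls) {
                replies.submit(() -> stateOf(url)); // one asynchronous call per member
            }
            List<String> states = new ArrayList<>();
            for (int pending = nodeUrls.size(); pending > 0; pending--) {
                // take() blocks until some reply is ready and hands it over exactly once.
                states.add(replies.take().get());
            }
            return states;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // The order of the printed lines may differ from one run to the next.
        for (String state : collectStates(Arrays.asList("node-1", "node-2", "node-3"))) {
            System.out.println(state);
        }
    }
}
```

The loop never waits for a specific member: a slow agent delays only its own reply, not the processing of the others.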

2.1. Object-Oriented SPMD Groups

SPMD stands for Single Program Multiple Data, a technique for parallelizing applications by splitting the work into tasks and running them simultaneously on different machines or processors. ProActive combines object-oriented programming with SPMD techniques. It uses group communication so that the programmer does not have to implement the complex communication code required to set up identical groups in each SPMD activity. Group communication keeps the focus on the application itself rather than on synchronization. An SPMD group is a group of active objects in which each member holds a reference to a group containing all the active objects.

This section presents typed group communication as an alternative to the traditional SPMD programming model. Placed in an object-oriented context, this mechanism helps define and coordinate distributed activities. The approach offers more flexible structuring and a simpler implementation through a modest-sized API. Automating key communication and synchronization mechanisms simplifies code writing. The typed group communication system can be used to simulate MPI-style collective communication. Unlike MPI, which requires all members of a group to collectively call the same communication primitive, our group communication scheme makes it possible for a single activity to call methods on the group.

The main class for the SPMD groups is org.objectweb.proactive.api.PASPMD. This class contains methods for creating and controlling ProActive SPMD groups.

For this exercise, we want to distribute a primality test using the SPMD API. Our test will be a naive one:

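// Naive trial division: returns false as soon as a divider in [begin, end)
// divides candidate, true otherwise. Note that end itself is not tested.
private boolean isPrime(long candidate, long begin, long end) {
  for (long divider = begin; divider < end; divider++) {
    if ((candidate % divider) == 0) {
      return false;
    }
  }
  return true;
}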

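This test has to be performed on the range [2;ceil(sqrt(n))]. Since the loop in isPrime excludes end, the upper bound passed to it must be one past the last divider to test. Distribution will consist in running the test on subranges on different machines.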
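One possible way of cutting the divider range into subranges from a rank and a group size is sketched below in plain Java, with no ProActive dependency. The class PrimeRangeSketch and its methods subrange(...) and isPrimeDistributed(...) are hypothetical names, and the sketch makes one assumption of its own: it tests the dividers 2 to floor(sqrt(n)) inclusive, passed to isPrime as the half-open range [2, floor(sqrt(n)) + 1). A sequential loop over the ranks stands in for the group members.

```java
// Plain-Java sketch of the range splitting only; this is NOT the ProActive SPMD API.
final class PrimeRangeSketch {
    // Same naive test as in the text: tries every divider in [begin, end).
    static boolean isPrime(long candidate, long begin, long end) {
        for (long divider = begin; divider < end; divider++) {
            if ((candidate % divider) == 0) {
                return false;
            }
        }
        return true;
    }

    // Half-open subrange {begin, end} of dividers for member `rank` (0-based)
    // out of `size` members. Assumption: dividers 2..floor(sqrt(candidate)).
    // Math.sqrt works on doubles, so this is only meant for moderate values.
    static long[] subrange(long candidate, int rank, int size) {
        long first = 2;
        long last = (long) Math.sqrt((double) candidate) + 1; // exclusive bound
        long total = Math.max(0, last - first);               // number of dividers
        long begin = first + total * rank / size;
        long end = first + total * (rank + 1) / size;
        return new long[] {begin, end};
    }

    // Sequential stand-in for the group: the candidate is prime only if
    // no member finds a divider in its own subrange.
    static boolean isPrimeDistributed(long candidate, int size) {
        if (candidate < 2) {
            return false;
        }
        for (int rank = 0; rank < size; rank++) {
            long[] range = subrange(candidate, rank, size);
            if (!isPrime(candidate, range[0], range[1])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (int rank = 0; rank < 4; rank++) {
            long[] range = subrange(97, rank, 4);
            System.out.println("rank " + rank + ": [" + range[0] + ", " + range[1] + ")");
        }
        System.out.println("97 prime? " + isPrimeDistributed(97, 4));
    }
}
```

Because consecutive ranks share their boundary (the end of one is the begin of the next), the subranges cover every divider exactly once, and some of them may be empty when there are more members than dividers.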

Q	"From scratch", develop a small ProActive program to distribute our primality test. You have to write two classes:

a PrimeTest class (which is an Active Object) that will test primality on its subrange. Note that this subrange has to be determined by the object itself, from the group size and its rank.
a Main class that creates an SPMD group of PrimeTest objects and repeatedly asks the user for a number to test.

Q	Use the ProActive immediate services to find out which divider a given active object is currently testing.
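The motivation for this question can be shown with a plain-Java analogy, which is not the ProActive mechanism itself. In the sketch below, a single-threaded executor plays the role of an active object serving one request at a time, and a volatile field read directly from the caller's thread plays the role of a request answered without waiting in the queue. The class ImmediateServiceSketch and all its members are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java analogy only; this is NOT the ProActive immediate service API.
final class ImmediateServiceSketch {
    // One thread = one request served at a time, in arrival order.
    private final ExecutorService serviceThread = Executors.newSingleThreadExecutor();
    // Written by the service thread, readable at any time from other threads.
    private volatile long currentDivider = 0;

    // Ordinary request: queued, then served by the single service thread.
    Future<Boolean> isPrime(long candidate, long begin, long end) {
        return serviceThread.submit(() -> {
            for (long divider = begin; divider < end; divider++) {
                currentDivider = divider; // publish progress
                if ((candidate % divider) == 0) {
                    return false;
                }
            }
            return true;
        });
    }

    // "Immediate" request: answered on the caller's thread, bypassing the queue.
    long getCurrentDivider() {
        return currentDivider;
    }

    void shutdown() {
        serviceThread.shutdown();
    }

    // Runs one test to completion and reports the last divider that was tried.
    static long lastDividerTried(long candidate, long begin, long end) {
        ImmediateServiceSketch agent = new ImmediateServiceSketch();
        try {
            agent.isPrime(candidate, begin, end).get();
            return agent.getCurrentDivider();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            agent.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        ImmediateServiceSketch agent = new ImmediateServiceSketch();
        Future<Boolean> answer = agent.isPrime(1_000_000_007L, 2, 31_623);
        // Prints whatever value the loop has reached so far; it does not wait for the answer.
        System.out.println("divider so far: " + agent.getCurrentDivider());
        System.out.println("prime: " + answer.get());
        agent.shutdown();
    }
}
```

Had getCurrentDivider() been submitted to the same single-threaded queue instead, it would only have been answered after the running isPrime request finished, which is exactly the situation the question asks you to avoid.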
