MASCOTTE no longer exists => visit the new COATI project-team
 


Internship MASCOTTE
How Bandwidth Availability Affects Data Lifetime in Distributed Storage Systems

by Sandeep Kumar Gupta

 
Advisor Frédéric Giroire and Stéphane Pérennes
School IIT Delhi
Degree Bachelor
Period 2009

Distributed storage systems have been proposed as a possible scalable way to<br />backup data at low marginal cost. In this kind of systems, redundancy is introduced to<br />preserve the data in case of peer failures or departures. To ensure long term<br />fault tolerance, this redundancy has to be constantly maintained by a reconstruction process.<br />The speed of the reconstruction of a data block is crucial for its <br />survival. This speed is mainly determined by how much bandwidth is<br />available. Reconstruction durations are usually modeled as independent (e.g., <br />poissonian or deterministic, or more generally following any distribution). But<br />in practical systems, concurrent reconstructions do compete for the same <br />bandwidth. Moreover, they also tend to happen at the same time (when a peer<br />fails). They are thus correlated to each other.<br />In this study, we propose a new analytical framework that takes into account this<br />correlation. Mainly, we introduce a queuing model in which reconstructions are<br />served by peers at a cadency which mainly depends on peer bandwidth. We then<br />show that the load is unbalanced among peers in such systems (young peers<br />inherently store less data than the old ones). This leads us to introduce a<br />correcting factor computed from a more complex multi-queue model.We also propose to<br />introduce biases in the protocol to correct the imbalance between peers, and show<br />that it improves the efficiency of the system.<br />Last, we discuss scheduling and control issues. Indeed each peer is involved in<br />many reconstructions, so we need to schedule them. Also due to redundancy, for<br />each reconstruction process, we may select which peers, among a given set, are going<br />to be involved. We compare several policies, and propose a<br />simple greedy-like policy that allows a small reconstruction time. <br />The models and schemes proposed are validated by mathematical analysis and/or an<br />extensive set of simulations.<br /><br />

List of interships