dClock: Decentralized QoS Scheduling for Distributed Storage Systems

Problem:

Distributed storage systems provide a scalable, flexible and cheap solution to the growing storage needs of organizations. Servers built from commodity components can be organized together to provide the abstraction of a single, larger storage system. Data can be spread across multiple servers using erasure coding, both for redundancy and for fast parallel access. Examples of such systems include FAB (Federated Array of Bricks) from HP Labs and IceCube from IBM. Clients may access the system through various gateways, which can be servers themselves, so there is no single point through which all requests pass and a centralized scheduling approach isn't possible. Moreover, the scheduler needs to be aware of the resource constraints that each request may have: each request belongs to a certain server and can only be serviced by it. As a result, hot-spotting, where a few servers receive a disproportionate share of the requests, becomes a challenging issue in providing QoS. This work is in progress, and I will post a preliminary draft with a formal problem description and results sometime next month.
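To make the hot-spotting concern concrete, here is a minimal sketch in Python. It is only an illustration and not part of dClock: the modulo placement rule and the names home_server and per_server_load are assumptions made up for this example. It shows how two workloads of the same size can load the servers very differently when each request can only be serviced by the server that holds its data.

```python
def home_server(block_id: int, num_servers: int) -> int:
    """Hypothetical placement rule: each block lives on exactly one server."""
    return block_id % num_servers


def per_server_load(requests: list[int], num_servers: int) -> list[int]:
    """Count how many requests each server must service.

    A request cannot be redirected: it is bound to the server
    that holds the requested block.
    """
    load = [0] * num_servers
    for block_id in requests:
        load[home_server(block_id, num_servers)] += 1
    return load


# Eight requests spread evenly over blocks 0..7.
uniform = list(range(8))
# Eight requests again, but block 0 is popular.
skewed = [0] * 5 + [1, 2, 3]

print(per_server_load(uniform, 4))  # [2, 2, 2, 2]
print(per_server_load(skewed, 4))   # [5, 1, 1, 1]
```

In the skewed case server 0 receives five of the eight requests while the others sit mostly idle, and no single gateway sees the whole picture. A QoS scheduler in this setting has to account for where each request must run, not just how many requests a client issues.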