Student: Not assigned yet
Owner: David van Moolenbroek dcvmoole@cs.vu.nl / Raja Appuswamy raja@cs.vu.nl
Git branch name: N/A
The storage industry has seen tremendous change on both the hardware and software fronts over the past decade. On the hardware side, new classes of storage are being introduced and adopted, such as flash-based solid-state drives, which possess radically different characteristics from traditional disk drives. Storage software, on the other hand, has evolved from simple single-disk file systems into feature-rich, sophisticated storage systems like ZFS and Btrfs, which support features such as snapshotting, cloning, checksumming, and background defragmentation.
Our research involves building a next-generation storage stack called Loris on top of the MINIX 3 operating system. Before we started designing our stack, we analyzed existing solutions along three dimensions: reliability, flexibility, and heterogeneity. We found that none of them satisfies all the requirements of an ideal storage stack. Using the modular, layered network stack as a guiding example, we then designed and implemented the Loris stack, which addresses the problems we identified in existing approaches.
Loris' modularity makes it possible to implement a highly reliable storage solution that can protect itself from both hardware and software failures. Loris' flexibility makes it possible to deploy a storage stack that can snapshot and clone data at granularities ranging from individual files all the way up to entire file volumes.
As enterprises continue to produce massive amounts of data, and as managing that data remains prohibitively expensive, several storage vendors have started offering data storage as a service. In this project, we would like to investigate the best way to integrate such a cloud storage solution with the Loris stack.
There are several design alternatives that one could pursue. For instance, one could use the cloud store as a remote mirror that holds a copy of all local data. In that case, when any or all local nodes fail, the cloud store can take over as the primary with no downtime. Another alternative would be to use the cloud store in a RAID setting, where the cloud stores only the parity blocks while the data blocks remain on local storage.
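To make the second alternative concrete, here is a minimal sketch of single-parity (RAID 4/5 style) protection using XOR. It is only an illustration under assumptions: the function names are hypothetical and are not part of Loris, the "cloud" is just a variable holding the parity block, and all blocks in a stripe are assumed to have the same size.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compute_parity(data_blocks):
    """Parity block that would be sent to the (hypothetical) cloud store."""
    return xor_blocks(data_blocks)


def recover_block(surviving_blocks, parity):
    """Rebuild one lost data block from the survivors plus the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical stripe: three data blocks kept on local disks.
local_blocks = [b"AAAA", b"BBBB", b"CCCC"]
cloud_parity = compute_parity(local_blocks)

# Simulate losing the second local block and rebuilding it.
rebuilt = recover_block([local_blocks[0], local_blocks[2]], cloud_parity)
```

In this scheme only one block per stripe has to cross the network on a write, whereas the mirroring alternative uploads every block. The trade-off is that single parity can repair only one lost block per stripe, so the cloud store alone cannot take over if all local nodes fail.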
This project involves investigating these possibilities, implementing a cloud storage physical layer in Loris, and evaluating it using a range of benchmarks. If you are interested, please come and talk to us!