Research

MINIX 3 can be used as the base for (Ph.D.) research in operating systems, especially reliable, secure, or fault-tolerant operating systems. At the VU we have done so extensively. Here are some of the projects that have been carried out.

Automatic Recovery from Fatal System Errors

In most operating systems, a fatal error within the operating system, such as referencing an invalid pointer, leads almost immediately to a system crash. In MINIX 3, such an error crashes one of the operating system components, but not the entire system. The crash is reported to a system component called the reincarnation server, which takes appropriate action, typically logging the event, notifying the system administrator, and restarting the failed component. For stateless components, recovery is very quick and completely transparent to application processes; that is, they do not even notice the failure and recovery. Research on automatic recovery of stateful components is underway. Here are some selected papers on recovery in MINIX 3.
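The recovery cycle described above (detect the crash, log it, notify the administrator, restart the component) can be illustrated with a toy supervisor. This is a minimal sketch, not MINIX 3 code: the names `Component`, `ReincarnationServer`, and `make_flaky_driver_factory` are hypothetical, and routing requests through the supervisor is a simplification made only to keep the example short.

```python
class Component:
    """A hypothetical stateless component, e.g. a device driver."""

    def __init__(self, name, factory):
        self.name = name
        self.factory = factory      # builds a fresh instance of the component
        self.handler = factory()    # the currently running instance
        self.restarts = 0


class ReincarnationServer:
    """Toy supervisor: logs a crash, notifies the admin, restarts the component."""

    def __init__(self):
        self.log = []
        self.notifications = []

    def call(self, comp, request):
        try:
            return comp.handler(request)
        except Exception as exc:
            # The fault stays confined to one component; the rest keeps running.
            self.log.append(f"{comp.name} crashed: {exc!r}")
            self.notifications.append(f"admin: {comp.name} was restarted")
            comp.handler = comp.factory()   # a stateless restart needs no saved state
            comp.restarts += 1
            # Assumption: the request is simply reissued to the fresh instance.
            return comp.handler(request)


def make_flaky_driver_factory():
    """Return a factory whose first instance always fails (simulated bug)."""
    instances = {"count": 0}

    def factory():
        instances["count"] += 1
        generation = instances["count"]

        def handler(request):
            if generation == 1:
                raise RuntimeError("invalid pointer")
            return f"ok:{request}"

        return handler

    return factory
```

In this sketch the caller of `ReincarnationServer.call` receives a normal reply even though the first driver instance failed, which mirrors the transparency claimed for stateless components.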

A New File System called Loris

Loris is a new POSIX-conformant file system that has been designed in a highly modular way with four main layers. Its new properties include treating entire files as the unit of storage and replication, and excellent fault tolerance. The top layer, the naming layer, handles file naming, protection, and the POSIX attributes. The next layer is the cache layer, which keeps the most recently used files in RAM. Below this is the logical layer, which hides all aspects of physical storage, including replication over heterogeneous storage media (hard disks, SSDs, etc.), from the upper layers. RAID algorithms are implemented here on a per-file basis, so different files can have different RAID configurations. The lowest layer is the physical layer, which manages putting files on disks. Here are some selected papers on Loris.
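To make the layering concrete, here is a minimal sketch of a four-layer stack in which each layer talks only to the one below it. It is an illustration under stated assumptions, not the Loris implementation: all class and method names are hypothetical, devices are in-memory dictionaries, and a per-file replica count (simple mirroring) stands in for a per-file RAID configuration.

```python
from collections import OrderedDict


class PhysicalLayer:
    """Stores whole files on one simulated device."""

    def __init__(self, device_name):
        self.device_name = device_name
        self.files = {}

    def write(self, file_id, data):
        self.files[file_id] = data

    def read(self, file_id):
        return self.files[file_id]


class LogicalLayer:
    """Hides the devices and applies a per-file mirroring policy."""

    def __init__(self, devices):
        self.devices = devices
        self.replicas = {}  # file_id -> number of mirrored copies

    def set_policy(self, file_id, copies):
        if not 1 <= copies <= len(self.devices):
            raise ValueError("copies must be between 1 and the device count")
        self.replicas[file_id] = copies

    def _targets(self, file_id):
        return self.devices[: self.replicas.get(file_id, 1)]

    def write(self, file_id, data):
        for dev in self._targets(file_id):
            dev.write(file_id, data)

    def read(self, file_id):
        for dev in self._targets(file_id):  # fall back to the next replica
            if file_id in dev.files:
                return dev.read(file_id)
        raise FileNotFoundError(file_id)


class CacheLayer:
    """Write-through cache holding the most recently used whole files."""

    def __init__(self, lower, capacity=2):
        self.lower = lower
        self.capacity = capacity
        self.cache = OrderedDict()

    def set_policy(self, file_id, copies):
        self.lower.set_policy(file_id, copies)  # pass-through to the logical layer

    def _remember(self, file_id, data):
        self.cache[file_id] = data
        self.cache.move_to_end(file_id)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used file

    def write(self, file_id, data):
        self.lower.write(file_id, data)
        self._remember(file_id, data)

    def read(self, file_id):
        if file_id in self.cache:
            self.cache.move_to_end(file_id)
            return self.cache[file_id]
        data = self.lower.read(file_id)
        self._remember(file_id, data)
        return data


class NamingLayer:
    """Maps path names to file identifiers (protection and attributes omitted)."""

    def __init__(self, lower):
        self.lower = lower
        self.names = {}
        self.next_id = 1

    def create(self, path, data, copies=1):
        file_id = self.next_id
        self.next_id += 1
        self.names[path] = file_id
        self.lower.set_policy(file_id, copies)
        self.lower.write(file_id, data)

    def read(self, path):
        return self.lower.read(self.names[path])
```

With two devices, a file created with `copies=2` survives the loss of its copy on the first device, while a file created with `copies=1` occupies only one device. The upper layers never see which device served the read.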

Adapting MINIX 3 for Multicore Chips

MINIX 3 is a multiserver system, with nearly all of the operating system running as a collection of user-mode processes. On a multicore system, an obvious issue is where to run the processes. Should they all be on the same core? Should each one be on a different core? If the cores run at different speeds, which process should be on which core to get good performance while reducing energy usage? We have made a first prototype of part of the system by splitting up the networking stack into TCP, UDP, and IP components, each running as a separate user-mode process. We are examining splitting up other parts of the system as well. Here are some selected papers on multicore MINIX 3.
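The placement question can be made concrete with a small example. The sketch below is not the policy used in the MINIX 3 prototype; it is a generic greedy heuristic (heaviest process first, each placed on the core that would finish it earliest), and the function name `place`, the load figures, and the core speeds are all hypothetical. Energy usage is ignored.

```python
from typing import Dict, List


def place(processes: Dict[str, float], core_speeds: List[float]) -> Dict[str, int]:
    """Greedily assign each process to the core that finishes it earliest.

    processes:   name -> work per unit time (arbitrary units, assumed known)
    core_speeds: work each core completes per unit time
    """
    busy = [0.0] * len(core_speeds)  # accumulated time already committed per core
    placement: Dict[str, int] = {}
    # Place the heaviest processes first so they get the best choice of core.
    for name, load in sorted(processes.items(), key=lambda kv: kv[1], reverse=True):
        core = min(
            range(len(core_speeds)),
            key=lambda c: busy[c] + load / core_speeds[c],
        )
        busy[core] += load / core_speeds[core]
        placement[name] = core
    return placement
```

For example, with loads `{"tcp": 6.0, "ip": 3.0, "udp": 1.0}` on a fast core (speed 2.0) and a slow core (speed 1.0), this heuristic puts TCP and UDP on the fast core and IP on the slow one. A real policy would also have to weigh communication cost between the processes and power consumption.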

Live Update

Many applications need to keep running all the time and cannot go down, even to upgrade the operating system to a new release. This work is about replacing nearly all of the operating system (except the microkernel) on the fly, without a reboot and without disturbing running processes. In fact, applications on MINIX 3 are unaware that the operating system has been upgraded underneath them. We can also handle upgrades in which the data structures used by the new version are quite different from those used by the old version. The upgrade process is completely automatic in most cases. In a few cases, where the changes are fairly radical, some input from the programmer may be needed to transfer state from the old operating system component to the new one. We are also looking at applications of live update technology for other purposes. Here are some selected papers on live update.
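The state-transfer step is the heart of a live update, so a small illustration may help. The following is a minimal sketch under stated assumptions, not the MINIX 3 mechanism: `CounterV1`, `CounterV2`, `Endpoint`, and `transfer_v1_to_v2` are hypothetical names, and the transfer function is written by hand here, whereas the text above describes it as automatic in most cases.

```python
class CounterV1:
    """Old version of a component: records every request in a list."""

    def __init__(self):
        self.hits = []

    def handle(self, client):
        self.hits.append(client)
        return len(self.hits)


class CounterV2:
    """New version: same service, but a different data structure."""

    def __init__(self):
        self.hits_per_client = {}

    def handle(self, client):
        self.hits_per_client[client] = self.hits_per_client.get(client, 0) + 1
        return sum(self.hits_per_client.values())


def transfer_v1_to_v2(old):
    """Convert the old component's state into the new layout."""
    new = CounterV2()
    for client in old.hits:
        new.hits_per_client[client] = new.hits_per_client.get(client, 0) + 1
    return new


class Endpoint:
    """Stable handle that clients keep using across the update."""

    def __init__(self, impl):
        self._impl = impl

    def handle(self, client):
        return self._impl.handle(client)

    def live_update(self, transfer):
        # Swap the implementation without a restart; clients keep their handle.
        self._impl = transfer(self._impl)
```

A client that made two requests before `live_update` and one after it sees the running total continue from 2 to 3, with no indication that the component behind the endpoint was replaced.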

More Publications

We have published extensively about our research. Here are our publications.