One of the main goals of MINIX 3 is reliability. Below we discuss some of the more important principles that enhance MINIX 3's reliability. These principles also enhance security, since most security flaws are due to attackers exploiting bugs in the code, so greater reliability will also improve security. Some of the ideas discussed are in the current release, but a few are scheduled for the next release. As this is a research project, we often make changes as we think of new ways to improve reliability.
Monolithic operating systems (e.g., Windows, Linux, BSD) have millions of lines of kernel code. There is no way so much code can ever be made correct. In contrast, MINIX 3 has about 4000 lines of executable kernel code. We believe this code can eventually be made fairly close to bug free.
In monolithic operating systems, device drivers reside in the kernel. This means that when a new peripheral is installed, unknown, untrusted code is inserted in the kernel. A single bad line of code in a driver can bring down the system. This design is fundamentally flawed. In MINIX 3, each device driver is a separate user-mode process. Drivers cannot execute privileged instructions, change the page tables, perform I/O, or write to absolute memory. They have to make kernel calls for these services and the kernel checks each call for authority.
In monolithic operating systems, a driver can write to any word of memory and thus accidentally trash user programs. In MINIX 3, when a user expects data from, for example, the file system, it builds a descriptor telling who has access and at what addresses. It then passes an index to this descriptor to the file system, which may pass it to a driver. The file system or driver then asks the kernel to write via the descriptor, making it impossible for them to write to addresses outside the buffer.
Dereferencing a bad pointer within a driver will crash the driver process, but will have no effect on the system as a whole. The reincarnation server will restart the crashed driver automatically. For some drivers (e.g., disk and network) recovery is transparent to user processes. For others (e.g., audio and printer), the user may notice. In monolithic systems, dereferencing a bad pointer in a (kernel) driver normally leads to a system crash.
If a driver gets into an infinite loop, the scheduler will gradually lower its priority until it becomes the idle process. Eventually the reincarnation server will see that it is not responding to status requests, so it will kill and restart the looping driver. In a monolithic system, a looping driver hangs the system.
MINIX 3 uses fixed-length messages for internal communication, which eliminates certain buffer overruns and buffer management problems. Also, many exploits work by overrunning a buffer to trick the program into returning from a function call using an overwritten stacked return address pointing into the overrun buffer. In MINIX 3, this attack does not work because instruction and data space are split and only code in (read-only) instruction space can be executed.
Device drivers obtain kernel services (such as copying data to users' address spaces) by making kernel calls. The MINIX 3 kernel has a bit map for each driver specifying which calls it is authorized to make. In monolithic systems every driver can call every kernel function, authorized or not.
The kernel also maintains a table telling which I/O ports each driver may access. As a result, a driver can only touch its own I/O ports. In monolithic systems, a buggy driver can access I/O ports belonging to another device.
Not every driver and server needs to communicate with every other driver and server. Accordingly, a per-process bit map determines which destinations each process may send to.
A special process, called the reincarnation server, periodically pings each device driver. If the driver dies or fails to respond correctly to pings, the reincarnation server automatically replaces it by a fresh copy. The detection and replacement of nonfunctioning drivers is automatic, without any user action required. This feature does not work for disk drivers at present, but in the next release the system will be able to recover even disk drivers, which will be shadowed in RAM. Driver recovery does not affect running processes.
When a interrupt occurs, it is converted at a low level to a notification sent to the appropriate driver. If the driver is waiting for a message, it gets the interrupt immediately; otherwise it gets the notification the next time it does a RECEIVE to get a message. This scheme eliminates nested interrupts and makes driver programming easier.