This page details several crucial aspects of how the Minix kernel boots.

====== Outside Minix itself: multiboot setup procedure ======

Minix assumes that a Multiboot-compliant boot loader has loaded its components into memory. This means the kernel and the executables of the boot-time processes are loaded somewhere in memory. The kernel is loaded at the location specified in its ELF header; the other executables are loaded at arbitrary locations.

The boot loader passes control to the kernel at the entry point specified in its [[http://en.wikipedia.org/wiki/Executable_and_Linkable_Format|ELF header]]. You can discover the entry point of the kernel by reading the output of ''objdump -x kernel''. Typical output is ''start address 0x00400000''. The entry point is set by the ''ENTRY(symbol)'' directive in the kernel [[http://wiki.osdev.org/Linker_Scripts|linker script]], ''kernel/arch/i386/kernel.lds''.

So the end result of the multiboot loading procedure is: the kernel is loaded at its desired address in memory; flat 32-bit segments (base 0, 4GB in size) are loaded; protected mode is on and paging is off. The kernel can execute because its symbols (i.e. data/code references) correspond to correct addresses in memory. These addresses, i.e. the symbol addresses and where to load the sections so that the references are correct, are specified in the kernel linker script as well. Control is passed to the kernel's entry point and it can execute normally.

See the [[http://en.wikipedia.org/wiki/Multi_boot|Wikipedia article on Multiboot]] for more information and references on the multiboot procedure.

In theory the Minix kernel can be booted with any Multiboot-compliant loader, such as GRUB. The stock Minix boot loader is the NetBSD boot loader.

====== Kernel Entry point ======

On x86, the Minix kernel begins executing at its entry point, the symbol ''MINIX'' in ''arch/i386/head.S''.
The initial entry code has to be written in assembly, because some state that C compilers assume to be initialized must be set up first; specifically, the stack pointer has to point to a usable area of memory. Part of the multiboot protocol is that the boot loader leaves a magic number in ''%eax'' and a pointer to the multiboot information structure in ''%ebx''. So the assembly sets the stack pointer and calls the first C function, passing it ''%eax'' and ''%ebx'' for verification and information.

====== Physical and virtual addressing in the kernel ======

A source of some complexity in kernel addressing is the dichotomy between physical and virtual addressing. As no page table can be loaded at start time, the kernel must initially address all of its code and data using physical addressing. This means that the addresses of all symbols correspond to their physical addresses in memory. To make this possible, we tell the boot loader where the kernel is to be loaded and set the symbol addresses accordingly. This load address is ''_kern_phys_base'' in the linker script. The statement ''. = _kern_phys_base'' before the first section makes the load and virtual addresses of symbols in the first sections start at that constant. This ensures symbols are loaded at the expected place and references work normally as soon as the kernel starts executing.

The kernel, however, must logically execute at the //top// of the virtual address space, i.e. at a high address. This is because the kernel shares the virtual address space with the currently running process. Constraining user processes to be linked (i.e. to execute) above a certain minimum address, which would also bound the kernel's size and its virtual address space usage, is unacceptable. Of course, physically loading the kernel at such a high address is also impossible, as it is unlikely that memory exists there.
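
To make the relation between the low (physical) and high (virtual) addresses concrete, here is a minimal sketch in C. It is an illustration, not Minix code: ''KERN_PHYS_BASE'' reuses the ''0x00400000'' start address quoted earlier, ''KERN_VIR_BASE'' is an assumed high base chosen only for this example, and both helper functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants, not read from the Minix sources. */
#define KERN_PHYS_BASE 0x00400000u /* low load address (the objdump start address) */
#define KERN_VIR_BASE  0xF0400000u /* ASSUMED high link address, illustration only */
#define KERN_OFFSET    (KERN_VIR_BASE - KERN_PHYS_BASE)

/* Hypothetical helper: high virtual address -> physical address it is loaded at. */
static uint32_t kern_vir_to_phys(uint32_t vir)
{
    assert(vir >= KERN_VIR_BASE); /* only meaningful for highly linked symbols */
    return vir - KERN_OFFSET;
}

/* Hypothetical helper: physical load address -> high virtual address. */
static uint32_t kern_phys_to_vir(uint32_t phys)
{
    assert(phys >= KERN_PHYS_BASE); /* only meaningful inside the loaded kernel */
    return phys + KERN_OFFSET;
}
```

Under these assumed constants the two address ranges differ by a fixed offset of ''0xF0000000'', so a symbol linked high is found in physical memory simply by subtracting that offset.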
===== Setting up the mapping =====

So an important part of the kernel's early execution is to set up a page table that maps high-memory references to the physical location of the kernel; in other words, to map the kernel in at the desired high location and continue executing there. This happens in the first C function that is executed, which is called by ''head.S'': ''pre_init.c:pre_init()''. It creates the page table, mapping the virtual addresses of its current execution 1:1 to its current physical addresses (logically nothing changes), and also mapping the high address range to the same physical addresses, allowing execution to occur there. Once the mapping is done, the function returns and the assembly code in ''head.S'' calls the ''kmain()'' function, making the CPU jump to the highly mapped region. The start of the high region, in virtual addressing of course, is ''_kern_vir_base''.

===== Separating lowly and highly mapped symbols =====

The kernel initially executes in the low, i.e. physical, region. In this phase it may not reference any symbols meant for use in the high region, because they are not yet accessible at those addresses; the CPU would try to access high physical addresses, where memory most likely does not exist. At best this results in silent nonsense references (even if real memory is present, it holds no kernel data); at worst it interferes with hardware that is mapped there.

Correspondingly, once the kernel starts executing in the high region, it should not reference any lowly mapped symbols, as that part of the virtual address space no longer belongs to the kernel; everything below ''_kern_vir_base'' is the domain of the currently executing process and the kernel cannot use anything below it for itself. As before, references to lowly mapped symbols would be disastrous.

The solution chosen in Minix is to separate the two regions by putting the lowly mapped symbols in a separate namespace.
All object files that run in the initial phase get the prefix ''%%__%%k_unpaged_'' prepended to all their symbol definitions and symbol references. This way neither region can accidentally access symbols in the other region. Of course they can deliberately do so by using this prefix explicitly; e.g. the lowly mapped code in ''head.S'' jumps into the highly mapped code starting with ''kmain'' because ''kmain'' is actually declared ''%%__%%k_unpaged_kmain'', accomplished by a ''#define'' in ''proto.h''. This is done so that the prototype for ''kmain'' makes sense both in the unpaged and the paged code, where it has different names.

===== Specifying high/low symbol linkage =====

The lowly mapped symbols are collected by their object file names, which match ''unpaged_*.o''. These files are built in ''arch/i386/Makefile.inc'', where ''objcopy'' is also run to give their symbols the prefix. In ''kernel.lds'' all ''unpaged_*.o'' objects are collected and linked at the low, physically referenced address. Then the location counter is raised to ''_kern_vir_base'', where the rest of the symbols are linked. To make sure the remaining sections are still loaded at a low address, despite having high link addresses, the ''AT()'' directive is used.

An example of the result, from ''objdump -x kernel'' output:

  LOAD off 0x00001000 vaddr 0x00400000 paddr 0x00400000 align 2**12
  LOAD off 0x00004000 vaddr 0x00403000 paddr 0x00403000 align 2**12
  LOAD off 0x00006020 vaddr 0xf040f020 paddr 0x0040f020 align 2**12
  LOAD off 0x0001c000 vaddr 0xf0425000 paddr 0x00425000 align 2**12

This shows the first two ''LOAD'' entries being both loaded and linked at low addresses, and the last two being loaded at low addresses yet linked at high addresses.

==== NetBSD boot loader caveat ====

The NetBSD boot loader does not use the physical address field to load the sections; rather, it takes the virtual address and masks off some of the high bits.
This means that the virtual and physical addresses of sections must correspond exactly under this masking. This is buried in ''arch/x86/include/loadfile_machdep.h'' as:

  #define LOADADDR(a) ((((u_long)(a)) & 0x07ffffff) + offset)

I.e. the physical address must equal the virtual address masked with ''0x07ffffff'' for the section to be loaded at the expected place by the NetBSD loader.

====== Kernel starts VM ======

The first responsibility of the kernel is to set up the architecture-dependent basics and then make VM runnable. As VM manages the memory and the virtual address spaces of all processes, it is preferable that it does so for all processes, including the boot-time processes. It cannot do that for itself, of course. So part of the kernel initialization is to make a page table for VM, so that VM is mapped where it expects to be mapped, and to start executing it. VM will then create virtual address spaces (i.e. page tables and its own high-level data structures) for the other boot-time processes.
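
Returning to the NetBSD boot loader caveat: the constraint on virtual and physical addresses can be checked mechanically. The sketch below is an illustration in C, not loader code. ''netbsd_load_addr()'' and ''loads_where_expected()'' are hypothetical names, the mask is the one from the ''LOADADDR'' macro, and the loader-supplied ''offset'' is assumed to be 0 in the check.

```c
#include <stdbool.h>
#include <stdint.h>

#define NETBSD_LOAD_MASK 0x07ffffffu /* mask taken from the LOADADDR macro */

/* Hypothetical model of LOADADDR(a): mask the virtual address, add the offset. */
static uint32_t netbsd_load_addr(uint32_t vaddr, uint32_t offset)
{
    return (vaddr & NETBSD_LOAD_MASK) + offset;
}

/* True if a section with this vaddr lands at paddr, ASSUMING offset is 0. */
static bool loads_where_expected(uint32_t vaddr, uint32_t paddr)
{
    return netbsd_load_addr(vaddr, 0) == paddr;
}
```

For the ''objdump'' example shown earlier, each ''vaddr''/''paddr'' pair satisfies this check: for instance ''0xf040f020'' masked with ''0x07ffffff'' gives ''0x0040f020''.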