====== Live update and rerandomization ======

MINIX3 now has support for live update and rerandomization of its system services. These features are based on LLVM bitcode compilation and instrumentation in combination with various run-time extensions. Live update and rerandomization support is currently fully functional, although still in an experimental state, not enabled by default, and available for x86 only. This document describes the basic idea, provides instructions on how to enable and use the functionality, provides more in-depth information for developers, and lists open issues and further reading material.

===== Introduction =====

This section contains a high-level overview of the live update and rerandomization functionality.

==== Live update ====

A live update is an update to a software component while it is active, allowing the component's code and data to be changed without affecting the environment around it. The MINIX3 live update functionality allows such updates to be applied to its **system services**: the usermode server and driver processes that, in addition to the microkernel, make up the operating system. As a result, these services can be updated at run time without requiring a system reboot. There is no support for live updating the microkernel or user applications at this time.

The live update procedure can be summarized as follows. The component responsible for orchestrating live updates is the RS (Reincarnation Server) service. When RS applies an update to a particular system service, it first brings that service to a stop in a known **quiescence** state, ensuring that the live update will not interfere with the service's normal operation, by exploiting the message-based nature of MINIX3. A new instance of the service is created. This new instance performs its own **state transfer**, copying and adjusting all the relevant data from the old instance to itself.
If the state transfer succeeds, the new instance continues to run, and the old instance is killed. If the state transfer fails, RS performs a **rollback**: the new instance is killed, and the system resumes execution of the old instance. In order to maintain the illusion to the rest of the system that there only ever was one service process, the process slots of the old and the new instance are swapped before the new instance gets to run, and swapped back upon rollback.

The MINIX3 live update system allows updates to all system services. Those include the RS service itself, and the VM (Virtual Memory) service. The VM service can be updated with severe restrictions only, however. The system also supports **multicomponent** live updates: atomic live updates of several system services at once, possibly including RS and/or VM. In principle, this allows for an atomic live update of the entire MINIX3 service layer.

The state transfer aspect of live update relies heavily on compile-time and in particular link-time instrumentation of system services. This instrumentation is implemented in the form of LLVM "optimization" passes, which operate on LLVM bitcode modules. In most cases, these passes are run after (initial) program linking, by means of the LLVM Link-Time Optimization (LTO) system. Thus, in order to support live update and rerandomization, the system must be compiled using LLVM bitcode and with LTO support. The LLVM pass that performs the static analysis and link-time instrumentation for live update is called the **magic pass**.

In addition, live updates require runtime support for state transfer in each service. For this reason, system services are relinked with a library that provides all the run-time functionality that ultimately allows a new service instance to perform state transfer from its old instance. This library is called the **magic runtime library** or //libmagicrt//. Together, the magic pass and runtime library make up the **magic framework**.
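The commit-or-rollback decision at the heart of the live update procedure can be pictured with a small toy model. The sketch below is not MINIX3 code and does not talk to RS; the function name ''simulate_live_update'' and its argument are hypothetical, and the state transfer is stood in for by any command whose exit status signals success or failure.

```shell
#!/bin/sh
# Toy model of the update/rollback decision described in this section.
# Not MINIX3 code: simulate_live_update and its argument are hypothetical.

# $1: command standing in for the new instance's state transfer;
#     its exit status decides between commit and rollback.
# Prints which instance ends up running in the service's process slot.
simulate_live_update() {
    slot=old          # the old instance is quiesced in its slot
    slot=new          # slots are swapped before the new instance runs
    if "$1"; then
        :             # transfer succeeded: the new instance stays, the old one is killed
    else
        slot=old      # rollback: the new instance is killed, slots are swapped back
    fi
    echo "$slot"
}
```

Calling ''simulate_live_update true'' prints ''new'', while ''simulate_live_update false'' prints ''old'', mirroring the two outcomes of a state transfer.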
==== Live rerandomization ====

Live rerandomization consists of randomizing the internal address space layout of a component at run time. While the concept of ASR or ASLR - Address Space (Layout) Randomization - is well known, most implementations are rather limited: they perform such randomization only once, when starting a process; they merely randomize the base location of entire process regions, for example the process stack; and, they apply the concept to user processes only. In contrast, the MINIX3 live rerandomization can randomize the address space layout of operating system services, as often as desired, and with fine granularity. In order to achieve this, the live rerandomization makes use of live updates.

The fundamental approach consists of a two-step process. First, new versions of the service program are generated, using link-time randomization of various parts of its program binary. Ideally, this would be done at run time; due to various limitations, MINIX3 currently only supports pregenerated randomized binaries of system services. Then, at runtime, the live update system is used to update from one randomized version of each service to another.

The randomization of binaries is done with another link-time pass, called the **asr pass**. The magic runtime library implements various runtime aspects of ASR rerandomization during live update.

===== Users guide =====

In this section, we explain how to set up a MINIX3 system that supports live update and rerandomization, and we describe how to use the new functionality when running MINIX3.

==== Setting up the system ====

We cover all the steps to set up a MINIX3 system that is ready for live update and rerandomization. For now, it requires crosscompilation as well as an additional build of the LLVM source code. The procedure is for x86 targets only. The current procedure has been tested only from **Linux** as host platform, and may require minor adjustments on other host platforms.
We provide a few additional instructions for those other platforms, but these may currently not be complete. Please feel free to add more instructions to this page, and/or open GitHub issues for other platforms and link to them from here.

After setting up an initial environment, the first step is to obtain the MINIX3 source code. After that, the next step is to build an LLVM toolchain with LTO support, which is needed because the regular MINIX3 crosscompilation LLVM toolchain does not include LTO support (yet - we are working on this). Once the LTO-supporting toolchain has been built, the final step is to build the MINIX3 sources, with extra flags to enable magic instrumentation and possibly ASR rerandomization. Once these steps have been completed successfully for the first time, one can later update the MINIX3 source and then rebuild the system. The LTO-supporting toolchain need not be rebuilt unless we upgrade the LLVM source code itself.

We will now go through all steps in detail. At the end of this section, there is also a summary of the commands to issue. All of the commands in this section are to be performed on the crosscompilation host system rather than on MINIX3. None of the commands, except the Linux-specific ''sudo apt-get'' example in the first subsection, require more than ordinary user privileges.

=== Setting up the environment ===

The initial step is to set up a crosscompilation environment. General information about setting up a crosscompilation environment can be found on the [[.:crosscompiling|crosscompilation page]]. As one example, the reference platform used to test the instructions in this document was the developer desktop edition of Ubuntu 14.04, a.k.a. ''ubuntu-14.04.2-desktop-i386.iso'', with the following extra packages installed:

  $ sudo apt-get install curl clang binutils zlibc zlib1g zlib1g-dev libncurses-dev qemu-system-x86

The MINIX3 build system uses one single directory in which to place all its files.
This directory is one level up from the root of the MINIX3 source directory. Thus, it is advisable to create this containing directory at a location known to have enough free hard disk space. Here we use ''/home/user/minix-liveupdate'' as an example, but the location is entirely up to you. The containing directory will end up having one subdirectory for the MINIX3 source code (called ''minix-src'' in this document), one subdirectory for the LLVM LTO toolchain (called ''obj_llvm.i386''), and one subdirectory for the crosscompilation tool chain and compiled objects (called ''obj.i386''). Thus, the ultimate directory structure will look like this:

  /home/user/minix-liveupdate/minix-src
  /home/user/minix-liveupdate/obj_llvm.i386
  /home/user/minix-liveupdate/obj.i386

You have to choose a location for the containing directory, and create it yourself. The three subdirectories should be created automatically as part of the following steps. However, it has been reported that on some platforms (e.g., FreeBSD), some or all of these directories have to be created manually; this can be done with nothing more than a few basic ''mkdir'' commands.

In terms of disk space usage, expect to need a bare minimum of **30GB** for the combination of these three subdirectories, with a recommended **40GB** of available space.

=== Obtaining or updating the MINIX3 source code ===

The first real step is to fetch the MINIX3 source code. Other wiki pages cover this in more detail, but the simplest approach is to check out the sources from the main MINIX3 repository using [[.:usinggit|git]]:

  $ cd /home/user/minix-liveupdate
  $ git clone git://git.minix3.org/minix minix-src

This will create a ''minix-src'' subdirectory with the latest version of the MINIX3 source code. Later on, a newer version of the source code can be pulled from the MINIX3 repository:

  $ cd /home/user/minix-liveupdate/minix-src
  $ git pull

In both cases, the next step is now to build the source code.
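The preparation steps above (checking disk space, then cloning or pulling) can be wrapped in two small helpers. This is only a sketch: the helper names ''plan_source_step'' and ''has_free_gb'' are hypothetical and not part of the MINIX3 tree, the ''BASE'' default is the example location used in this document, and the first helper merely prints the git command that applies instead of running it.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; paths and the repository URL
# are the example values used in this document.
BASE=${BASE:-/home/user/minix-liveupdate}

# Print the git command that applies to the containing directory $1:
# a pull if minix-src is already a checkout, a clone otherwise.
plan_source_step() {
    if [ -d "$1/minix-src/.git" ]; then
        echo "git -C $1/minix-src pull"
    else
        echo "git clone git://git.minix3.org/minix $1/minix-src"
    fi
}

# Succeed if the file system holding directory $1 has at least $2 GB available.
has_free_gb() {
    avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}
```

For example, ''has_free_gb "$BASE" 30 || echo "less than 30GB available"'' warns before a build is started, and ''plan_source_step "$BASE"'' shows whether a fresh clone or an update is due.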
=== Building the LTO toolchain ===

The second step is to build the LLVM LTO infrastructure, if it has not yet been built before. Eventually, this will be done automatically as part of the regular build. For now, we have a script that can perform the build, called ''generate_gold_plugin.sh''. It is located in the ''minix/llvm'' subdirectory of the MINIX3 source tree. The basic procedure therefore consists of the following steps (but read this entire section first):

  $ cd /home/user/minix-liveupdate/minix-src/minix/llvm
  $ ./generate_gold_plugin.sh

On some platforms, it may be necessary to specify the C/C++ compiler and/or the name of the GNU make utility, which can be done as follows:

  $ CC=clang CXX=clang++ MAKE=make ./generate_gold_plugin.sh

On FreeBSD and similar platforms, one may have to ensure that GNU make is installed (typically as ''gmake'') first, and pass in ''MAKE=gmake'' to point to it.

This step may take several hours. It can be sped up by supplying a number of parallel jobs, through a ''JOBS=n'' variable:

  $ JOBS=8 ./generate_gold_plugin.sh

As stated before, after this command has finished successfully, it need not be reissued until LLVM is upgraded in the MINIX3 source tree. This is a rare event which is typically part of a larger resynchronization with NetBSD code, and we will clearly announce such events. When this happens, it may be advisable to remove the entire ''obj_llvm.i386'' directory as well as any files in ''minix-src/minix/llvm/bin'', before rerunning the ''generate_gold_plugin.sh'' script.

=== Building the system ===

The third step consists of building the system and generating a bootable image out of it. When run for the first time, this step will also build the regular (non-LTO) crosscompilation toolchain. The first run may therefore (also) take several hours.

The build procedure is just like regular MINIX3 crosscompilation, differing in only two aspects.
First, the appropriate build variables must be passed in to enable the desired functionality. In order to build the system with live update support through magic instrumentation, the build system must be invoked with the ''MKMAGIC'' build variable set to //yes//. This will perform a bitcode build of the entire system, and perform magic instrumentation on all system services. In order to build the system with ASR instrumentation, the build system must be invoked with the ''MKASR'' build variable set to //yes//. This will automatically enable magic instrumentation, perform ASR randomization on all system services, and pregenerate a number of ASR-rerandomized service binaries for each service. This number can be controlled with an additional ''ASRCOUNT=n'' build variable, where the //n// value must be between 1 and 65536 (inclusive). The default //ASRCOUNT// is 3.

Second, in order to build a hard disk image suitable for use by the resulting bitcode builds, the ''x86_hdimage.sh'' script must be invoked with the **-b** flag. This will enlarge the generated image to account for the larger binaries, and enable inclusion of ASR-rerandomized binaries if necessary.

These two aspects can be covered in a single build command. The following short procedure will build a hard disk image with magic instrumentation:

  $ cd /home/user/minix-liveupdate/minix-src
  $ BUILDVARS="-V MKMAGIC=yes" ./releasetools/x86_hdimage.sh -b

In order to speed up the build, a number of parallel jobs may be supplied. It is typically advisable to use as many jobs as there are hardware threads of execution (i.e., CPU cores or hyperthreads) in the system:

  $ JOBS=8 BUILDVARS="-V MKMAGIC=yes" ./releasetools/x86_hdimage.sh -b

It may be necessary to ensure that clang is used as the compiler:

  $ CC=clang CXX=clang++ JOBS=8 BUILDVARS="-V MKMAGIC=yes" ./releasetools/x86_hdimage.sh -b

Also, some platforms may not be able to compile the compiler toolchain for the target platform due to running out of memory.
In that case, it is possible to build an image that does not come with its own compiler toolchain, by passing in the ''MKLLVMCMDS=no'' build variable. This build variable can also be used simply to speed up the compilation procedure.

  $ BUILDVARS="-V MKMAGIC=yes -V MKLLVMCMDS=no" ./releasetools/x86_hdimage.sh -b

In order to build an image with ASR randomization, including four additional ASR-rerandomized versions of each system service, use the following build variables:

  $ BUILDVARS="-V MKASR=yes -V ASRCOUNT=4" ./releasetools/x86_hdimage.sh -b

Obviously, all variables shown above can be combined as appropriate. The author of this document has used the following command line on several occasions:

  $ CC=clang CXX=clang++ JOBS=4 BUILDVARS="-V MKASR=yes -V ASRCOUNT=2 -V MKLLVMCMDS=no" ./releasetools/x86_hdimage.sh -b

After the first run, the build system will perform recompilation of only the parts of the source code that have changed, and should not take nearly as long to run as the first time.

In case of unexpected problems when rebuilding, it may be necessary to throw away the previously generated objects and rebuild the MINIX3 source code in its entirety. This can be done by going to the top-level ''obj.i386'' directory and deleting all files and directories in there, except the ''tooldir.{yourplatform}'' subdirectory. Fully rebuilding the MINIX3 source code will take longer than an incremental rebuild, but since the crosscompilation toolchain is left as is, it will still take nowhere near as long as the first run.

As explained in more detail on the [[.:crosscompiling|crosscompilation page]], it is also possible to rebuild particular parts of the system without going through the entire "make build" process. This involves the use of the ''nbmake-i386'' tool and generally requires a good understanding of the compilation process.

=== Running the image ===

The x86_hdimage command produces a bootable MINIX3 hard disk image file.
The generated image file is called ''minix_x86.img'' and located in the root of the MINIX3 source tree - ''minix-src'' in our examples. Once an image has been generated, it can be run. The most convenient way to run the image is to use **qemu/KVM**. This can be done using the command as given at the end of the x86_hdimage output. While explaining the use of qemu is beyond the scope of this document, it may be useful to look into the ''-append'', ''-curses'', and ''-serial file:..'' qemu command line arguments. The following command line will launch qemu with KVM support (remove ''--enable-kvm'' to disable KVM support), a curses-based user interface, and system output redirected to a file named ''serial.out'':

  $ cd /home/user/minix-liveupdate/minix-src
  $ (cd ../obj.i386/destdir.i386/boot/minix/.temp && qemu-system-i386 --enable-kvm -m 256 -kernel kernel -initrd "mod01_ds,mod02_rs,mod03_pm,mod04_sched,mod05_vfs,mod06_memory,mod07_tty,mod08_mib,mod09_vm,mod10_pfs,mod11_mfs,mod12_init" -hda ../../../../../minix-src/minix_x86.img -curses -serial file:../../../../../minix-src/serial.out -append "rootdevname=c0d0p0 cttyline=0")

Extra [[usersguide:bootmonitor|boot options]] can be supplied in the (space-separated) list that follows the ''-append'' switch. For example, adding '' rs_verbose=1'' will enable verbose output in the RS service, which is highly useful for debugging issues with live update.

=== Summary ===

The following commands can be used to obtain and build a MINIX3 system that supports live update and live rerandomization, including three alternative rerandomized versions of all system services, in addition to the randomized standard ones:

  $ export CC=clang CXX=clang++ JOBS=8
  $ cd /home/user/minix-liveupdate
  $ git clone git://git.minix3.org/minix minix-src
  $ cd minix-src/minix/llvm
  $ ./generate_gold_plugin.sh
  $ cd ../..
  $ BUILDVARS="-V MKASR=yes -V MKLLVMCMDS=no" ./releasetools/x86_hdimage.sh -b

The entire procedure will typically take about 30GB of disk space and several hours of time. Sometime later, the following steps can be used to update the installation to a newer MINIX3 version:

  $ cd /home/user/minix-liveupdate/minix-src
  $ git pull
  $ CC=clang CXX=clang++ JOBS=8 BUILDVARS="-V MKASR=yes -V MKLLVMCMDS=no" ./releasetools/x86_hdimage.sh -b

In contrast to the initial run, the entire update procedure should take no more than an hour.

==== Using live update ====

Once an instrumented MINIX3 system has been built and started, it should be ready for live updates. MINIX3 offers two scripts that make use of the live update functionality: one for testing the infrastructure, and one for performing runtime ASR rerandomization. In addition, the user may perform live updates manually. In this section, we cover both parts.

The commands in this section are to be run within MINIX3, rather than on the host system. They must be run as root, because performing a live update of a system service requires superuser privileges. These two things are reflected by the ''minix#'' prompt used in the examples below.

=== Pre-provided scripts ===

The MINIX3 distribution comes with two scripts that can be used to test and use the live update and rerandomization functionality. The first one is //testrelpol//. This script may be used for basic regression testing of the MINIX3 live update infrastructure. The second one is //update_asr//. This command performs live rerandomization of system services at runtime.

== Infrastructure testing: testrelpol ==

The MINIX3 test suite has a test script that tests the basic MINIX3 crash recovery and live update functionality. The script is called **testrelpol** and can be found in ''/usr/tests/minix-posix'':

  minix# cd /usr/tests/minix-posix
  minix# ./testrelpol

For its live update tests, this script does //not// use the magic framework for state transfer at all.
Instead, it uses **identity transfer**, which performs a basic memory copy between the old and the new instance. As a result, the testrelpol script should succeed whether or not services are instrumented. However, it may not work reliably on MINIX3 systems that are not built for magic instrumentation (i.e., built with neither ''MKMAGIC=yes'' nor ''MKASR=yes'').

== Live rerandomization: update_asr ==

As we have shown before, the ''MKASR=yes'' host-side build variable performs the //build-time// preparation of a MINIX3 system for live rerandomization. Complementing this, the //run-time// side of the live rerandomization is provided by means of the **update_asr** command. The update_asr command will update system services to their next pregenerated ASR-rerandomized version, using a cyclic system. Live rerandomization is not automatic, and thus, the MINIX3 system administrator is responsible for running the update_asr command at appropriate times.

By default, the update_asr command performs one round of ASR rerandomization, updating each service to its next version:

  minix# update_asr

By default, this command will report errors only. More verbose information can be shown using the ''-v'' switch:

  minix# update_asr -v

For further details about this command, see the update_asr(8) manual page.

Aside from providing actual security benefits, the update_asr script is the **most complete test** of the live update and rerandomization functionality at this time. It uses the magic framework for state transfer, with full relocation of all state, and it applies the runtime ASR features. At the time of writing, it runs in the default qemu environment without any errors or subsequent issues. The only aspect that is not tested with this command is whether ASR rerandomization is //effective//, that is, whether all parts of the service address space were properly randomized by the asr pass.
After all, ASR rerandomization between identical service copies works just as well, but provides substantially fewer security guarantees. Developers working on the asr pass are encouraged to verify its effectiveness manually, for example using nm(1) on generated service binaries on the host side.

=== Live update commands ===

RS can be instructed to perform live updates through the minix-service(8) command, specifically through its **minix-service update** subcommand. This command is also used by the automated scripts. For a full overview of the command's functionality, please see the minix-service(8) manual page as well as the command's output when it is run with no parameters.

In its most fundamental form, the //minix-service update// command will update a running service, identified by its label, to a new version provided as an on-disk binary file. It is, however, also possible to tell RS to update the service into a copy of itself. In addition, various flags and options can be used for fine-grained control of the live update action.

The basic syntax to perform a live update on a single system service is as follows:

  minix# minix-service [flags] update [self|] -label