

This is a draft.

Live update and rerandomization

MINIX3 now has support for live update and rerandomization of its system services. These features are based on LLVM bitcode compilation and instrumentation in combination with various run-time extensions. Live update and rerandomization support is currently fully functional but in an experimental state, not enabled by default, and available for x86 only. This document describes the basic idea, provides instructions on how to enable and use the functionality, provides more in-depth information for developers, and lists open issues.

Introduction

This section contains a high-level overview of the live update and rerandomization functionality.

Live update

A live update is an update to a software component while it is active, allowing the component's code and data to be changed without affecting the environment around it. The MINIX3 live update functionality allows such updates to be applied to its system services: the usermode server and driver processes that, together with the microkernel, make up the operating system. As a result, these services can be updated at run time without requiring a system reboot. There is no support for live updating the microkernel or user applications at this time.

The live update procedure can be summarized as follows. The component responsible for orchestrating live updates is the RS (Reincarnation Server) service. When RS applies an update to a particular system service, it first brings that service to a stop in a known state, by exploiting the message-based nature of MINIX3. A new instance of the service is created. This new instance performs its own state transfer, copying and adjusting all the relevant data from the old instance to itself. If the state transfer succeeds, the new instance continues to run, and the old instance is killed. If the state transfer fails, RS performs a rollback: the new instance is killed, and the system resumes execution of the old instance. In order to maintain the illusion to the rest of the system that there only ever was one service process, the process slots of the old and the new instance are swapped before the new instance gets to run, and swapped back upon rollback.

The MINIX3 live update system allows updates to all system services. These include the RS service itself, and the VM (Virtual Memory) service. However, the VM service can be updated only under severe restrictions. The system also supports multicomponent live updates: atomic live updates of several system services at once, possibly including RS and/or VM. In principle, this allows for an atomic live update of the entire MINIX3 service layer.

The state transfer aspect of live update relies heavily on compile-time and in particular link-time instrumentation of system services. This instrumentation is implemented in the form of LLVM “optimization” passes, which operate on LLVM bitcode modules. In most cases, these passes are run after (initial) program linking, by means of the LLVM Link-Time Optimization (LTO) system. Thus, in order to support live update and rerandomization, the system must be compiled using LLVM bitcode and with LTO support. The LLVM pass that performs the static analysis and link-time instrumentation for live update is called the magic pass.

In addition, live updates require runtime support for state transfer in each service. For this reason, system services are relinked with a library that provides all the run-time functionality that ultimately allows a new service instance to perform state transfer from its old instance. This library is called the magic library or libmagic. Together, the magic pass and library make up the magic framework.

Live rerandomization

Live rerandomization consists of randomizing the internal address space layout of a component at run time. While the concept of ASR or ASLR - Address Space (Layout) Randomization - is well known, most implementations are rather limited: they perform such randomization only once, when starting a process; they randomize only the base location of entire process regions, for example the process stack; and they apply the concept to user processes only. In contrast, the MINIX3 live rerandomization can randomize the address space layout of system services, as often as desired, and with fine granularity. In order to achieve this, the live rerandomization makes use of live updates.

The fundamental idea is to first generate a new version of the service binary, using link-time randomization of various parts of the binary. Ideally, this would be done at run time; due to various limitations, MINIX3 currently only supports pregenerated randomized binaries of system services. Then, at runtime, the live update system is used to update from one randomized version of each service to another.

The randomization of binaries is done with another link-time pass, called the asr pass. The magic library implements the runtime aspects of ASR rerandomization during live update.

Users guide

In this section, we explain how to set up a MINIX3 system that supports live update and rerandomization, and we describe how to use these functionalities when running MINIX3.

Setting up the system

We cover all the steps to set up a MINIX3 system that is ready for live update and rerandomization. For now, it requires crosscompilation as well as an additional build of the LLVM source code. The procedure is for x86 targets only. The current procedure is not quite ideal, but it is what we have right now, and it should work.

After setting up an initial environment, the MINIX3 update cycle basically consists of four steps: obtaining or updating the MINIX3 source code, building the system, instrumenting the system, and generating a bootable image. We will go through all steps in detail. There is also a summary of commands to issue at the end.

All of the commands in this section are to be performed on the crosscompilation host system rather than on MINIX3. None of the commands, except the Linux-specific sudo apt-get example in the first subsection, require more than ordinary user privileges.

Setting up the environment

The initial step is to set up a crosscompilation environment. General information about setting up a crosscompilation environment can be found on the crosscompilation page. As one example, the reference platform used to test the instructions in this document was the developer desktop edition of Ubuntu 14.04, a.k.a. ubuntu-14.04.2-desktop-i386.iso, with the following extra packages installed:

$ sudo apt-get install curl clang binutils zlibc zlib1g zlib1g-dev libncurses-dev qemu-system-x86

In terms of directory organization, the idea is that everything will end up in one containing directory. Here we use /home/user/minix-liveupdate as an example, but the location is entirely up to you. This containing directory will end up having one subdirectory for the MINIX3 source code (called minix-src in this document), one subdirectory for the LLVM LTO toolchain (called obj_llvm.i386), and one subdirectory for the crosscompilation tool chain and compiled objects (called obj.i386). Thus, the ultimate directory structure will look like this:

/home/user/minix-liveupdate/minix-src
/home/user/minix-liveupdate/obj_llvm.i386
/home/user/minix-liveupdate/obj.i386

You have to choose a location for the containing directory, and create it yourself. The three subdirectories will be created automatically as part of the following steps. In terms of placement, expect to need a bare minimum of 30GB for the combination of these three subdirectories, with 40GB of available space recommended.
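
For example, to create the containing directory at the location used throughout this document:

$ mkdir -p /home/user/minix-liveupdate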

Obtaining or updating the MINIX3 source code

The first real step is then to check out the MINIX3 source code. Other wiki pages cover this in more detail, but the gist is to check out the sources from the main MINIX3 repository using git:

$ cd /home/user/minix-liveupdate
$ git clone git://git.minix3.org/minix minix-src

This will create a minix-src subdirectory containing the latest version of the MINIX3 source code.

Later on, a newer version of the source code can be pulled from the MINIX3 repository:

$ cd /home/user/minix-liveupdate/minix-src
$ git pull

In both cases, the next step is now to build the source code.

Building the system

The next step consists of building the system. When run for the first time, this step will also build the LLVM LTO infrastructure, the crosscompilation tools, and the instrumentation. The first run may take several hours.

The center of all the instrumentation activities is the minix/llvm subdirectory of the MINIX3 source tree. This directory contains the instrumentation passes, runtime library, and supporting scripts. This step and the next steps therefore assume this subdirectory as the current directory:

$ cd minix-src/minix/llvm

It may be necessary to ensure that clang is used as the compiler, by exporting the following shell variables. GCC should work as well, but has not been tested as thoroughly.

$ export CC=clang CXX=clang++

Then, the system can be built with support for instrumentation by running the configure.llvm script in the current directory, with the MKMAGIC build variable set to yes. To build the infrastructure and system without parallel compilation, simply run the script:

$ BUILDVARS="-V MKMAGIC=yes" ./configure.llvm

Alternatively, a number of parallel jobs may be supplied. It is typically advisable to use as many jobs as there are hardware threads of execution (i.e., CPUs or hyperthreads) in the system:

$ JOBS=8 BUILDVARS="-V MKMAGIC=yes" ./configure.llvm

After the first run, the configure.llvm script will recompile only the parts of the source code that have changed, and should not take nearly as long as the first run. In case of unexpected problems when rebuilding, it may be necessary to throw away the previously generated objects and rebuild the MINIX3 source code in its entirety. This can be done by going to the top-level obj.i386 directory and deleting all files and directories except the tooldir.{yourplatform} subdirectory in there. Fully rebuilding the MINIX3 source code will take longer than an incremental rebuild, but since the crosscompilation toolchain is left as is, it will still take nowhere near as long as the first run.
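
As a rough sketch of that cleanup, assuming the directory layout used in this document (double-check the path before running anything involving rm -rf):

$ cd /home/user/minix-liveupdate/obj.i386
$ for entry in *; do case "$entry" in tooldir.*) ;; *) rm -rf "$entry" ;; esac; done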

As explained in more detail on the crosscompilation page, it is also possible to rebuild particular parts of the system without going through the entire “make build” process. This involves the use of the nbmake-i386 tool and generally requires a good understanding of the compilation process. It may be worth mentioning that the first configure.llvm run saves the MKMAGIC value, so this variable need not be passed to nbmake-i386 each time.
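
For example, to rebuild just the PM service with nbmake-i386 (the same workflow is shown in more detail in the “Useful host commands” section near the end of this document):

$ export PATH=$PATH:/home/user/minix-liveupdate/obj.i386/tooldir.{yourplatform}/bin
$ cd minix-src/minix/servers/pm
$ nbmake-i386 all install

As explained below, the rebuilt service must then be reinstrumented, and a new image must be generated afterwards.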

Rebuilding the instrumentation

When building the system for the first time, this step may be skipped, as it is performed automatically. However, when the source code of any of the LLVM passes or the magic library (that is, the source code in minix/llvm) changes, the changed component must be recompiled. Warning: updating the MINIX3 source code with git pull may also update any of these components, in which case it is the responsibility of the user (you) to recompile and reinstall them!

Once we properly integrate the LLVM LTO infrastructure into the MINIX3 build system, this step should disappear altogether.

Rebuilding libmagic

This substep must be performed whenever the source code of libmagic changes. The reason is that dependency tracking does not work correctly for libmagic, which means the automated step in configure.llvm may not recompile the library properly.

The source code of libmagic is located in the minix/llvm/static/magic subdirectory of the MINIX3 source code. To (re)compile and install libmagic, go to its source directory, issue a make clean and a make install:

$ cd static/magic
$ make clean install

The library is installed to minix/llvm/bin. In a later step, the relink.llvm script will pick it up from there.

Rebuilding a pass

This substep is also performed automatically the first time, by the generate_gold_plugin.sh script invoked from configure.llvm. However, whenever the source code of any of the LLVM instrumentation passes changes, that pass must be recompiled and installed.

The source code of the passes is located in the minix/llvm/passes subdirectory of the MINIX3 source code. A pass can be compiled and installed by going to its minix/llvm/passes/{pass} subdirectory, and issuing make install.

For example, to recompile and install the magic pass:

$ cd passes/magic
$ make install

The passes are installed to minix/llvm/bin. In a later step, the build.llvm script will pick them up from there.

Instrumentation and image building

After building the system, two more steps need to be performed: instrumentation of system services, and generation of a bootable hard disk image. These steps must be performed every time the system is built, including the first time. In particular: every time a system service is (re)compiled, it must be (re)instrumented afterwards. Every time any part of the compiled MINIX3 installation is changed, a new image must be built.

In order to generate a fully instrumented system image with a number of pregenerated ASR binaries for all services, one can run a command that automates both steps. This is covered in the first subsection. Alternatively, the details of manual instrumentation and image building are covered in the two subsections that follow.

The easy way: bulk ASR generation

The clientctl script in minix/llvm provides a convenient way to instrument all services for live update and rerandomization, generate a number of rerandomized versions of each service, and build a hard disk image. The command has the following syntax:

$ ./clientctl buildasr [N]

Here, N is an optional parameter specifying the number of rerandomized sets of binaries that should be generated in addition to the standard set of randomized binaries. N defaults to 1. For example, the following command will produce a system with four randomized sets of service binaries: one set of ASR-randomized services that are used by default, and three extra sets of rerandomized binaries to which the system can switch at run time:

$ ./clientctl buildasr 3

The result is a MINIX3 hard disk image file which can be booted in (for example) qemu; see further below.

The manual way (1/2): instrumentation

Instrumentation takes place at the granularity of individual system services. The minix/llvm directory contains scripts that allow for relinking services against runtime libraries, and instrumenting services with LLVM passes. The general procedure is like this:

  1. First, the service is compiled and linked to its basic form.
  2. Then, the resulting linked bitcode object is relinked with libmagic.
  3. Finally, link-time instrumentation is applied by running the magic pass, possibly as well as the asr pass, on the linked bitcode object.

Each step also (re)generates a ready-to-execute machine code version of the service.

Step 1 happens in the “building the system” step, using configure.llvm, as explained before.

Step 2 is done with the relink.llvm script in minix/llvm. This script will relink services against a space-separated list of libraries. For live update, only the magic library is relevant:

$ ./relink.llvm magic

This command will relink all services against libmagic, thus providing them with runtime support for live update.

Step 3 is done with the build.llvm script in minix/llvm. This script will instrument services with a space-separated list of LLVM passes. For live update, the magic pass should be used:

$ ./build.llvm magic

This command will instrument all services with the magic pass, performing static analysis and changing the service to include the information necessary for libmagic to perform live updates at runtime.

For live rerandomization support, one must apply not only the magic pass, but also the asr pass:

$ ./build.llvm magic asr

The resulting service will not only be ready for live update, but also be subjected to fine-grained randomization, as well as be supplied with parameters to perform the runtime component of rerandomization during live updates.

For reference, the clientctl buildasr command shown above performs this step multiple times to generate different rerandomized versions of each service, storing each in a different location.

Some details that might be useful to know about relinking and applying passes:

  • By default, relink.llvm and build.llvm perform their respective actions on all system services. It is possible to instrument only a subset of services, leaving the other services as is. This can be done by passing a C shell variable with a comma-separated list of individual services. For example, the following command relinks the PM (Process Manager) service against the magic library:
$ C=pm ./relink.llvm magic

The pseudo-targets servers, fs, net, and drivers will perform actions on the services in the corresponding subdirectories in the MINIX3 source tree. The rd pseudo-target regenerates the ramdisk, which must be redone after changing any service on the ramdisk. For example, the following command instruments core servers and file system services with the magic and asr passes, and rebuilds the ramdisk:

$ C=servers,fs,rd ./build.llvm magic asr

The clientctl buildasr command accepts this optional C shell variable as well.
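
For example, the following command (a sketch, assuming the C variable behaves the same for clientctl as for the other scripts) would generate rerandomized versions of the PM service only:

$ C=pm ./clientctl buildasr 3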

  • Each of the three steps undoes the effects of prior invocations of that same step and of subsequent steps, but not of earlier steps: compiling and linking a service (step 1) will undo any previous relinking and instrumentation of that service. Relinking a service (step 2) will similarly undo any previous relinking and instrumentation of the same service. Instrumenting a service (step 3) will undo any previous instrumentation, reapplying the instrumentation to the same relinked binary. Therefore, a single build.llvm invocation must be used to apply all passes at once.
  • Instrumentation with the magic pass will fail if the service has not been relinked with libmagic first. The same applies to the asr pass. However, the asr pass will not fail if the service has not been instrumented with the magic pass. Instrumenting a service with the asr pass but not the magic pass is of limited use: the service will be randomized, but cannot be subjected to live rerandomization.
The manual way (2/2): building the image

Finally, a MINIX3 image can be built from the compiled MINIX3 code using the clientctl buildimage command:

$ ./clientctl buildimage

This command produces a bootable MINIX3 hard disk image file. The generated image file is called minix_x86.img and located in the root of the MINIX3 source tree - minix-src in our examples.

This command is called automatically as part of clientctl buildasr.

Running the image

Once a hard disk image has been generated, it can be run. The most convenient way to run the image is to use qemu. For convenience, the clientctl script in minix/llvm has a run command to run the image in qemu without further effort:

$ OUT=F ./clientctl run

The OUT shell variable can be set to other values to control what to do with serial output. The F value specifies that the serial output will be redirected to a File, namely serial.out. The other supported settings are Stdout, Console, and Pty.
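
For example, assuming the same first-letter shorthand applies to the other settings, serial output can be sent to standard output as follows:

$ OUT=S ./clientctl run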

Extra boot options can be supplied through the APPEND variable:

$ OUT=F APPEND="rs_verbose=1" ./clientctl run

This example will enable verbose output in the RS service, which is highly useful for debugging issues with live update.

Summary

The following commands can be used to obtain, build, instrument, and start a MINIX3 system that supports live update and live rerandomization, including three alternative rerandomized versions of all system services in addition to the standard ones:

$ git clone git://git.minix3.org/minix minix-src
$ cd minix-src/minix/llvm
$ export CC=clang CXX=clang++
$ JOBS=8 BUILDVARS="-V MKMAGIC=yes" ./configure.llvm
$ ./clientctl buildasr 3
$ OUT=F ./clientctl run

The entire procedure will typically take about 30GB of disk space and several hours of time.

Sometime later, the following steps can be used to update the installation to a newer MINIX3 version:

$ cd minix-src/minix/llvm
$ git pull
$ export CC=clang CXX=clang++
$ JOBS=8 BUILDVARS="-V MKMAGIC=yes" ./configure.llvm
$ for pass in WeakAliasModuleOverride sectionify magic asr; do (cd passes/$pass && make clean install); done
$ (cd static/magic && make clean install)
$ ./clientctl buildasr 3
$ OUT=F ./clientctl run

In contrast to the initial run, the entire update procedure should take no more than an hour.

Instead of the ./clientctl buildasr 3 step in the above two examples, one can for example also instrument the system for live update but not live rerandomization, using the following three replacement steps:

$ ./relink.llvm magic
$ ./build.llvm magic
$ ./clientctl buildimage

Using live update

Once an instrumented MINIX3 system has been built and started, it should be ready for live updates. MINIX3 offers two scripts that make use of the live update functionality: one for testing the infrastructure, and one for performing runtime ASR rerandomization. In addition, the user may let the system perform live updates explicitly. In this section, we cover both parts.

The commands in this section are to be run within MINIX3, rather than on the host system. They must be run as root, because performing a live update of a system service requires superuser privileges. These two things are reflected by the minix# prompt used in the examples below.

Pre-provided scripts

The MINIX3 distribution comes with two scripts that can be used to test and use the live update and rerandomization functionality. The first one is testrelpol. This script may be used for basic regression testing on the MINIX3 live update infrastructure. The second one is update_asr. This command performs live rerandomization of system services at runtime.

Infrastructure testing: testrelpol

The MINIX3 test suite has a test set script that tests the basic MINIX3 crash recovery and live update functionality. The script is called testrelpol and can be found in /usr/tests/minix-posix:

minix# cd /usr/tests/minix-posix
minix# ./testrelpol

For its live update tests, this script does not use the magic framework for state transfer at all. Instead, it uses identity transfer, which basically just performs a memory copy between the old and the new instance. As a result, the testrelpol script should work whether or not services are instrumented. However, it may not work reliably on MINIX3 systems that were not built for magic instrumentation at all (i.e., built without MKMAGIC=yes).

Live rerandomization: update_asr

As we have shown before, the clientctl buildasr host-side command can perform the build-time preparation of a MINIX3 system for live rerandomization. Complementing this, the run-time side of the live rerandomization is provided by means of the update_asr command. The update_asr command will update system services to their next pregenerated rerandomized version, using a cyclic system. Live rerandomization is not automatic, and thus, the MINIX3 system administrator is responsible for running the update_asr command at appropriate times.

By default, the update_asr command performs one round of ASR rerandomization, updating each service to its next version:

minix# update_asr

By default, this command will report errors only. More verbose information can be shown using the -v switch:

minix# update_asr -v

For further details about this command, see the update_asr(8) manual page.

Aside from providing actual security benefits, the update_asr script is the most complete test of the live update and rerandomization functionality at this time. It uses the magic framework for state transfer, with full-scale relocation of all state, and it applies the runtime ASR features. As of writing, it runs in the default qemu environment without any errors or subsequent issues.

The only aspect that is not tested with this command is whether ASR rerandomization is effective, that is, whether all parts of the service's address space were properly randomized by the asr pass. After all, ASR rerandomization between identical service copies works just as well, but provides substantially fewer security guarantees. Developers working on the asr pass are encouraged to check its effectiveness manually, for example using nm(1) on generated service binaries on the host side.

Live update commands

RS can be instructed to perform live updates through the service(8) command, specifically through its service update subcommand. This command is also used by the automated scripts. For a full overview of the command's functionality, please see the service(8) manual page as well as the command's output when it is run with no parameters.

In its most fundamental form, the service update command will update a running service, identified by its label, to a new on-disk binary file. It is however possible to tell RS to update the service into a copy of itself, and to influence the process using various flags and options. The basic syntax to perform a live update on a single system service is as follows:

minix# service [flags] update [self|<binary>] -label <label> [options]

Through various combinations of this command's parameters, MINIX3 basically supports four types of updates, representing increasingly challenging conditions for the overall live update infrastructure in general, and state transfer in particular. We will now go through all of them, and explain how they can be performed. Later on, the developers guide will provide a more in-depth explanation of the four types of updates.

Identity transfer

The first update type is identity transfer. In this case, the service is updated to an identical copy of itself, with all functions and static data in the new instance located at the exact same addresses as the old instance. Identity transfer bluntly copies over entire sections at once, thus requiring no instrumentation at all. This makes it suitable for testing of the MINIX3-specific side of the live update infrastructure, hence its use in the testrelpol script. Identity transfer is the default of the service(8) command when “self” is given instead of a path to a new binary:

minix# service update self -label pm

This will perform an identity transfer of the PM service. Identity transfer should work for literally all MINIX3 system services. As mentioned, it is guaranteed to work only when the system was built with MKMAGIC=yes, although it will mostly work on systems built without magic support as well. It works regardless of whether the target service was instrumented with the magic framework (or ASR).

If the live update is successful, the service(8) command will be silent, but RS will print a system message that the update succeeded:

RS: update succeeded

If the system was started in qemu with OUT=F, this message will end up in serial.out. Otherwise, the message should show up in the system log (/var/log/messages) and possibly on the first console.
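
In the OUT=F case, the serial log can be followed from another terminal on the host, assuming serial.out is written to the directory from which clientctl run was started:

$ tail -f serial.out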

If the live update fails, RS should print an error to the system log, and service(8) will complain. In order to debug such failures, it may be useful to enable verbose mode in RS, by starting the system with rs_verbose=1 as shown earlier.

Self state transfer

The second update type is self state transfer. Self state transfer also performs an update of a service into an identical copy of itself, but instead uses the state transfer functionality of the magic framework. Thus, self state transfer requires that the service be instrumented properly, and the update type can be used to test whether a service's state can be transferred without problems. Many of the things explained here also apply to the remaining two update types, as all three use the state transfer of the magic framework.

Self state transfer is performed by supplying the -t flag along with “self” to the service update command:

minix# service -t update self -label pm

This command will perform self state transfer of the PM service. The libmagic state transfer routine in the new service instance will print additional system messages while it is running. Upon success, the system output will look somewhat like this:

total remote functions: 57. relocated: 54
total remote sentries: 186. relocated normal: 84 relocated string: 101
total remote dsentries: 5
st_data_transfer: processing sentries
st_data_transfer: processing dsentries
st_data_transfer: processing sentries
st_data_transfer: processing dsentries
st_state_transfer: state transfer is done, num type transformations: 0
RS: update succeeded

If the state transfer routine is not able to perform state transfer successfully, it will print messages that start with [ERROR]. RS will then roll back the service to the old instance, and both RS and service(8) will report failure. Self state transfer should succeed for all MINIX3 system services that have been built with bitcode and instrumented with libmagic and the magic pass. As of writing, there are no system services for which self state transfer is known to result in [ERROR] lines and subsequent live update failure. However:

  • It is possible that changes to system services, and even usage scenarios of services which we have not yet tested, result in new state transfer errors. Such errors should be resolved. The developers guide further below contains instructions on how to resolve some of these errors.
  • Currently, one service is not built with bitcode, namely the memory driver. It is therefore also not instrumented. An attempt to perform self state transfer on any service that is not instrumented will result in a “Function not implemented (error 78)” error. This is usually a good indication that a step was missed during the build phase.
  • Some services have no state to transfer, in which case their new instances will perform a fresh start instead of state transfer. In that case, live update with self state transfer will succeed, but not print the state transfer system messages shown above. This is the case for the IS (Information Server) and readclock.drv services, for example.
  • Some services may only be updated once brought into a specific state of quiescence, because the default quiescence state is not sufficiently restrictive. In that case, the user must specify an alternative quiescence state explicitly, through the service(8) -state option. This currently applies to all services that make use of usermode threads, namely the VFS, ahci, and virtio_blk services. They must be updated using quiescence state 2 (request free) rather than state 1 (work free):
minix# service -t update self -label vfs -state 2

Omitting the appropriate state parameter may result in a crash of the service after live update. At the moment, the update_asr script has hardcoded knowledge about these necessary states. None of this is great, and we will be working towards a situation where the default state will not result in a crash.

  • State transfer may be slow, and RS applies a default timeout for live updates. Therefore, it may be necessary to set a longer timeout in order to avoid needless failures. This can be done through the -maxtime option to service(8):
minix# service -t update self -label vfs -state 2 -maxtime 120HZ

The maximum time is specified in clock ticks by default, but may be given in seconds by appending “HZ” to the timeout. The latter may sound confusing, and it is, but the original idea was supposedly that the number of seconds is multiplied by the system's clock frequency, a.k.a. its HZ setting. The above command allows the live update of VFS to take up to two minutes.

ASR rerandomization

The third update type is ASR rerandomization. Like self state transfer, ASR rerandomization uses the magic framework to perform state transfer. In this case, the service performs state transfer into a rerandomized version of the same service. This involves specifying the path to a rerandomized ASR binary to the service(8) command, as well as the -a flag. The -a flag tells the new instance to enable the run-time parts of rerandomization during the live update.

minix# service -a update /service/asr/1/pm -label pm

When a system has been built with ASR rerandomization, the (randomized) base service binaries are located in /service and the (randomized) alternative service binaries are located in numbered subdirectories in /service/asr. As mentioned before, the update_asr(8) command can be used to perform these updates semi-automatically.
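
For example, on a system built with clientctl buildasr 3, the layout should look roughly as follows:

minix# ls /service/asr
1  2  3
minix# ls /service/asr/2/pm
/service/asr/2/pm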

ASR rerandomization comes with one extra restriction: the VM service cannot be subjected to more complicated forms of state transfer than self state transfer. It is also skipped by the update_asr(8) command for this reason. We will explain the restrictions regarding the VM service in the developer section.

Functional update

The final update type is a functional update. Compared to self state transfer, ASR rerandomization relocates code and more data, but there are fundamentally no differences between the old and the new version of the service. A functional update, in contrast, has the service perform state transfer into a new program. While typically highly similar, the new program may differ from the running service in various ways.

In terms of the service(8) command, such functional updates can be performed by simply using service update with a new binary. For example, one could test a new version of the UDS (UNIX Domain Sockets) service, without installing it into /service yet, and without affecting its open sockets:

minix# service update /usr/src/minix/net/uds/uds -label uds

The fact that this time there may be actual differences between the old and new versions of the services adds an extra dimension to the state transfer issue. Additional state transfer failures can be expected in this case, and must be dealt with accordingly. The developers guide will eventually elaborate on this point.

Similarly, depending on the nature of the update, the update action may require a specific state of quiescence. Taking UDS as an example, an update may change how file descriptors are transferred over sockets, in which case the update may require that no file descriptors be in transit at the time of the update. The old instance of the service must support this as a quiescence state. This state can then be specified through the -state option of the service update command.
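
Purely as an illustration (the state number used here is hypothetical; which states a service supports depends on the service itself), such an update could then be issued as:

minix# service update /usr/src/minix/net/uds/uds -label uds -state 3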

Since the live update functionality is relatively new for MINIX3, we do not yet have much experience with the practical side of performing functional updates to services. This document will be expanded as we gain more insight into the common usage patterns of live update. Stay tuned!

Multicomponent updates

From the user's perspective, updating multiple services at once is not much more complex than updating a single service. First, a number of service update commands should be issued, each with the -q flag:

minix# service -q -t update /service/pm -label pm
minix# service -q -t update /service/vfs -label vfs -state 2

Then, the entire update can be launched with the service sysctl upd_run command:

minix# service sysctl upd_run

The RS output will be much more verbose in this case. Timeouts are still to be specified on a per-service basis, rather than for the entire update at once. If necessary, any queued service update commands may be canceled with the upd_stop sysctl subcommand:

minix# service sysctl upd_stop

This will cancel the entire multicomponent live update action.

Useful host commands

The host-side clientctl script in minix/llvm offers a number of additional convenient commands, mainly for developers. We list some of them here.

The buildboot command installs just the services that are part of the boot image. It can be used instead of clientctl buildimage when only boot-image services have been changed, thus speeding up the development cycle:

$ ./clientctl buildboot

Using this command, it is possible to make and test changes to boot system services fairly quickly. As an example, the following set of steps suffices to make and test changes to the PM service:

$ export PATH=$PATH:/home/user/minix-liveupdate/obj.i386/tooldir.{yourplatform}/bin
$ cd minix-src/minix/servers/pm
[make changes to the PM source code]
$ nbmake-i386 all install
$ cd ../../llvm
$ C=pm ./relink.llvm magic
$ C=pm ./build.llvm magic
$ ./clientctl buildboot
$ OUT=F ./clientctl run

The unstack command translates a stack trace of pretty much any MINIX3 binary into human-readable form:

$ ./clientctl unstack <name> [address [address ..]]

For example, to show a stack trace of the PM service in a human-readable form:

$ ./clientctl unstack pm 0x805a7fd 0x80492a5 0x8048050

Note that on ASR-enabled installations, the unstack command works only on the base versions of system services: there is currently no way to translate a stack trace for any of the ASR-rerandomized service binaries. On one occasion, the author of this document has performed that process by hand, by finding the matching assembly code of an ASR-rerandomized service's crash site in the service's base version.

Developers guide
