implementingjobcontrol

Things to do in order to integrate job control within MINIX.

Warning: This document has gone stale. In particular, the linked commits, which were never actually tested, matched the April 2013 state of MINIX and are not usable with MINIX-current.

The goal is to have POSIX-class job control in MINIX.

Context

Job control is a feature added in 1983 to the 4.2BSD “C shell”, csh. Shortly after, the idea was ported to AT&T System V, and many descendants of the Bourne shell (like ksh and bash) offer this feature. The basics of how it works have been carefully specified in the POSIX standard.

Minix conforms to POSIX.1-1990, where job control was optional; in the context of a small operating system to run on 16-bit Intel PCs, such an option was dropped.

Later, that option became mandatory as part of the FIPS standardization, so almost all *nix-like operating systems provide it these days. It has been required for conformance since the 2001 revision of the POSIX standard. Several important applications (gdb, bash,…) assume this feature is always available and do not provide an acceptable work-around.

Job Control in a Nutshell

This section is directly extracted from the BSD termios(4) manual page.

Every process is associated with a particular process group and session. The grouping is hierarchical: every member of a particular process group is a member of the same session. This structuring is used in managing groups of related processes for purposes of job control; that is, the ability from the keyboard (or from program control) to simultaneously stop or restart a complex command (a command composed of one or more related processes). The grouping into process groups allows delivering of signals that stop or start the group as a whole, along with arbitrating which process group has access to the single controlling terminal. The grouping at a higher layer into sessions is to restrict the job control related signals and system calls to within processes resulting from a particular instance of a login. Typically, a session is created when a user logs in, and the login terminal is setup to be the controlling terminal; all processes spawned from that login shell are in the same session, and inherit the controlling terminal.

A job control shell operating interactively (that is, reading commands from a terminal) normally groups related processes together by placing them into the same process group. A set of processes in the same process group is collectively referred to as a job. When the foreground process group of the terminal is the same as the process group of a particular job, that job is said to be in the foreground. When the process group of the terminal is different from the process group of a job (but is still the controlling terminal), that job is said to be in the background. Normally the shell reads a command and starts the job that implements that command. If the command is to be started in the foreground (typical), it sets the process group of the terminal to the process group of the started job, waits for the job to complete, and then sets the process group of the terminal back to its own process group (it puts itself into the foreground). If the job is to be started in the background (as denoted by the shell operator “&”), it never changes the process group of the terminal and does not wait for the job to complete (that is, it immediately attempts to read the next command). If the job is started in the foreground, the user may type a key (usually ^Z) which generates the terminal stop signal (SIGTSTP) and has the effect of stopping the entire job. The shell will notice that the job stopped, and will resume running after placing itself in the foreground. The shell also has commands for placing stopped jobs in the background, and for placing stopped or background jobs into the foreground.
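
The foreground case described above can be sketched from the shell's side as follows. This is only an illustration on a POSIX system with job control, not code taken from any shell: the function name run_foreground_job and the tty_fd convention (-1 meaning "no terminal hand-over") are assumptions made for the example. It is also simplified: a real shell usually calls tcsetpgrp() in the child as well, so that the job cannot touch the terminal before it is in the foreground.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv as a foreground job and return its wait status (-1 on error).
 * tty_fd is the controlling terminal, or -1 to skip the terminal
 * hand-over (hypothetical convention, handy when there is no terminal).
 */
int run_foreground_job(int tty_fd, char *const argv[])
{
    void (*old_ttou)(int);
    pid_t shell_pgid = getpgrp();
    pid_t pid = fork();
    int status = -1;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);              /* the job gets its own process group */
        execvp(argv[0], argv);
        _exit(127);                 /* exec failed */
    }

    /* Repeated in the parent, so the group exists whichever side runs first. */
    setpgid(pid, pid);

    /* Ignore SIGTTOU so the shell may call tcsetpgrp() from the background. */
    old_ttou = signal(SIGTTOU, SIG_IGN);
    if (tty_fd >= 0)
        tcsetpgrp(tty_fd, pid);     /* put the job in the foreground */

    /* WUNTRACED: also return when the job is stopped (e.g. by ^Z). */
    if (waitpid(pid, &status, WUNTRACED) != pid)
        status = -1;

    if (tty_fd >= 0)
        tcsetpgrp(tty_fd, shell_pgid);  /* the shell takes the terminal back */
    signal(SIGTTOU, old_ttou);
    return status;
}
```

A caller would inspect the result with WIFEXITED() or WIFSTOPPED() to tell a finished job from a stopped one.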

Things we need for this

  • support from applications (our ash(1) has it!)
  • support for stopping processes (here with ptrace(2)!)
  • support for having more than one process group within a session
  • support for changing the process group at controlling terminal

Current status

Implementation

Done  Functionality                                              Comments
20%   more than one process group within a session
15%   changing the process group at controlling terminal
10%   stopping processes, and report of it
10%   suspending process groups from terminal
      stopping background processes interacting with terminal    Blocker, see below
      enable job control in ash(1)
      enable multiple jobs in make(1) and bmake(1pkgsrc)
      support for SIGCHLD signal (enhanced to stopping)
      support for TOSTOP control
      support for orphaned process groups

Testing

As part of the normal MINIX test suite, a new program has to be added to test all the functionality specified by POSIX; this program is itself tested against other implementations (like NetBSD).

Tested  Target environment     By  Comments
        Clang-compiled MINIX
        GCC-compiled MINIX
        GDB
        Pkgsrc (bash, etc.)

Reported problems

Solved  Problem  Comments

Detailed implementation

More than one process group within a session

API support

The basic interface is setpgid(2), also known as setpgrp(2) on BSD systems (POSIX changed the name because setpgrp() on System V is different and more limited).

#!cplusplus
#define _POSIX_SOURCE
#include <unistd.h>

int setpgid(pid_t _pid, pid_t _pgid);

This function serves two purposes: it can create a new process group (passing 0 as second argument); or it can move a process to an already existing group.

The X/Open standard adds a supplementary function, getpgid(2), which is not strictly necessary but is often assumed by some packages (again, the name was changed away from BSD's getpgrp(2), because the function of that name on System V is different and returns the session identifier).

#!cplusplus
#define _XOPEN_SOURCE
#include <unistd.h>

pid_t getpgid(pid_t _pid);
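
The two uses of setpgid() described above, checked with getpgid(), can be exercised with a small program like the following. It is a sketch for any POSIX system that already has job control; the helper names spawn_sleeper and setpgid_demo are made up for the example.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that only sleeps until it is killed; returns its pid or -1. */
static pid_t spawn_sleeper(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        for (;;)
            pause();
    }
    return pid;
}

/* Returns 0 if both uses of setpgid() behaved as expected, -1 otherwise. */
int setpgid_demo(void)
{
    pid_t a = spawn_sleeper();
    pid_t b = spawn_sleeper();
    int ok = 0;

    if (a > 0 && b > 0) {
        /* Use 1: a second argument of 0 creates a new group led by a. */
        ok = setpgid(a, 0) == 0 && getpgid(a) == a;
        /* Use 2: move b into the group that now exists. */
        ok = ok && setpgid(b, a) == 0 && getpgid(b) == a;
    }

    /* Clean up; never call kill() with a pid of -1. */
    if (a > 0) {
        kill(a, SIGKILL);
        waitpid(a, NULL, 0);
    }
    if (b > 0) {
        kill(b, SIGKILL);
        waitpid(b, NULL, 0);
    }
    return ok ? 0 : -1;
}
```

Both children are still direct children of the caller and have not called exec, which is what allows the parent to change their group.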

The prototypes are declared within the libc headers, although they were conditionally compiled out.

That leaves setpgrp(). While there is a prototype for it in <unistd.h>, it corresponds to the legacy 4BSD function which is now superseded by setpgid(2); furthermore, X/Open standardized a different interface (with no arguments, originally from System V, corresponding to setpgid(0,0)), which is even indicated in <unistd.h> with the comment XXX prototype error. The best thing to do here is to carefully avoid such a trap, so we shall not provide any implementation of it, and shall leave the prototype declaration guarded with the #ifndef __minix barrier.

System implementation

Once the new call numbers are declared, the needed library support has to go in libc, which poses no difficulty.

Before the introduction of job control, there was some confusion within MINIX between process groups and sessions, since there was always only one process group within a session. This allowed the process-id of the leader to be used as both session-id and process-group-id, kept in the mp_procgrp field. This is of course no longer possible. This is solved by introducing a new member, mp_session, in the PM process table to record the session identifier, which is used less often.
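
To illustrate why two fields are needed, here is a toy model of the bookkeeping. Only the member names mp_procgrp and mp_session come from the text above; the structure, the mp_pid member and the sketch_* functions are hypothetical and much simpler than the real PM code.

```c
#include <sys/types.h>

/* Hypothetical, cut-down stand-in for a PM process table entry. */
struct mproc_sketch {
    pid_t mp_pid;
    pid_t mp_procgrp;   /* process group id */
    pid_t mp_session;   /* session id (the new member) */
};

/* setsid(): the caller becomes leader of a new session and a new group. */
void sketch_setsid(struct mproc_sketch *mp)
{
    mp->mp_session = mp->mp_pid;
    mp->mp_procgrp = mp->mp_pid;
}

/*
 * setpgid(): only the process group changes, the session is untouched.
 * A session leader may not leave its own group. Returns 0 or -1.
 */
int sketch_setpgid(struct mproc_sketch *mp, pid_t pgid)
{
    if (mp->mp_pid == mp->mp_session)
        return -1;
    mp->mp_procgrp = (pgid == 0) ? mp->mp_pid : pgid;
    return 0;
}
```

After a job is moved into its own group, mp_procgrp and mp_session differ, which is exactly the situation a single field could not represent.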

Furthermore, on the VFS side there was no information about the session or the process group of a given process: the only relevant data transmitted from PM are the pid, the endpoint of the session leader (more about this later), and a flag when the process is a session leader. So here we add both the process group id and the session id for each process.

The getpgid(2) system call is straightforward, much like similar calls such as getuid(2), handled in pm/getset.c, function do_get(). Since it uses exactly the same logic as getsid(2), we share the system call 113, as done with getpid(2)/getppid(2).

In MINIX, process group information is needed both in PM (e.g. to determine the processes which are signalled as part of a group) and in VFS (when it comes to the interface with the controlling terminal). So we follow the long-established practice of having the setpgid system call handled in PM, which then calls VFS asynchronously. Note there are two subtleties here:

  • Some checks are done only by VFS, namely that a child whose process group is about to be changed should not have done any exec*(2) system call. So the actual change within PM is deferred until VFS gives back the OK: this is implemented within handle_vfs_reply().
  • The process which should be changed can be a child of the current process. To accurately update its information, while replying to the caller, we need to keep the distinction during all the transitions PM→VFS→PM; so we keep a pointer to the caller process slot within PM as an additional message parameter (PM_CALLER_IS_PARENT), which indicates when non-NULL that the target process (indicated to VFS as PM_PROC, just like the other similar inter-service communications) is actually the child of the caller.
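
The deferred update can be pictured with the following single-process model. It is only a sketch of the idea, not the MINIX message flow: the structure, its members and the pm_*/vfs_* function names are invented for the illustration, and real PM and VFS exchange messages rather than call each other.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical merged view of what PM and VFS know about one process. */
struct proc_sketch {
    pid_t pid;
    pid_t procgrp;
    int has_execed;               /* known to VFS only in this model */
    pid_t pending_pgid;           /* PM: request awaiting the VFS verdict */
    struct proc_sketch *caller;   /* non-NULL: target is a child of caller */
};

/* PM, step 1: record the request but leave procgrp untouched. */
void pm_request_setpgid(struct proc_sketch *target,
                        struct proc_sketch *caller, pid_t pgid)
{
    target->pending_pgid = (pgid != 0) ? pgid : target->pid;
    target->caller = (caller != target) ? caller : NULL;
}

/* VFS, step 2: a child that has already called exec may not be moved. */
int vfs_check_setpgid(const struct proc_sketch *target)
{
    if (target->caller != NULL && target->has_execed)
        return EACCES;
    return 0;
}

/* PM, step 3: commit only when VFS agreed; returns the reply for the caller. */
int pm_handle_vfs_reply(struct proc_sketch *target, int vfs_result)
{
    if (vfs_result == 0)
        target->procgrp = target->pending_pgid;
    target->pending_pgid = 0;
    target->caller = NULL;
    return vfs_result;
}
```

Keeping the caller pointer through the round trip is what lets the last step reply to the parent while updating the child.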

As an additional optimization, in VFS the session to which a process belongs is recorded through a direct pointer to its session leader (rather than as a session id); this works in VFS because the session id is needed only with respect to the controlling terminal, which vanishes as soon as the session leader exits.

Changing the process group at controlling terminal

API support

The basic interface is tcsetpgrp(3) and tcgetpgrp(3) in POSIX, which are almost always thin wrappers around the TIOCSPGRP and TIOCGPGRP ioctls on the controlling terminal.

#!cplusplus
#define _POSIX_SOURCE
#include <unistd.h>

pid_t tcgetpgrp(int fd);
int tcsetpgrp(int fd, pid_t pgid);

The X/Open standard adds a supplementary function, tcgetsid(3), which is not strictly necessary but is sometimes blindly assumed by some packages.

#!cplusplus
#define _XOPEN_SOURCE
#include <sys/types.h>
#include <termios.h>

pid_t tcgetsid(int fd);

It returns the session identifier corresponding to the terminal, but only if that terminal is the controlling terminal of a session (whether this should be further restricted to the caller's own session, to avoid an information leak, is debatable). The corresponding ioctl is TIOCGSID.

The prototypes and the ioctl labels have been declared in MINIX for a long time (since Minix-vmd provided an implementation). The library support is already here too, thanks to NetBSD compatibility.

For tcgetsid(3), we are missing both the declarations and the library support.
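
For reference, the library side of these three functions is typically no more than the following. This is a sketch, not the NetBSD or MINIX libc source: the sketch_ prefixes are made up, and it assumes a system where TIOCGPGRP, TIOCSPGRP and TIOCGSID are visible through <sys/ioctl.h> and take a pointer to an integer process id.

```c
#include <sys/ioctl.h>
#include <sys/types.h>
#include <termios.h>
#include <unistd.h>

/* Foreground process group of the terminal open on fd, or -1. */
pid_t sketch_tcgetpgrp(int fd)
{
    int pgid;

    return ioctl(fd, TIOCGPGRP, &pgid) < 0 ? (pid_t)-1 : (pid_t)pgid;
}

/* Make pgid the foreground process group of the terminal; 0 or -1. */
int sketch_tcsetpgrp(int fd, pid_t pgid)
{
    int arg = (int)pgid;

    return ioctl(fd, TIOCSPGRP, &arg);
}

/* Session id of the session controlled by the terminal, or -1. */
pid_t sketch_tcgetsid(int fd)
{
    int sid;

    return ioctl(fd, TIOCGSID, &sid) < 0 ? (pid_t)-1 : (pid_t)sid;
}
```

All the real work (permission checks, validity of the group, background callers) is left to whoever services the ioctl.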

System support

The implementation of the TIOCSPGRP, TIOCGPGRP, and TIOCGSID ioctls takes place in VFS. To achieve that, we specialize the handling of terminal I/O within the general case: thus we can intercept the IOCTL messages and deal with their particularities. We take the occasion to add the process group of the caller to each message (with 0 when this is not a controlling terminal), to allow the tty driver to determine whether the call comes from a background or a foreground process.

A new member holding the foreground process group is also added to the tty structure. When a session takes control of a terminal (indicated by the O_NOCTTY flag being clear in the DEV_OPEN message), VFS sends the session as a supplementary field, which is the easiest way to have both members initialized (a session leader cannot quit its process group).

The validity of the new foreground process group id, as passed in tcsetpgrp(), should be checked; since VFS intercepts that call, it can easily perform the validation check. Note that the information is replicated in the TTY structures, mainly to allow the keyboard-generated signals to be delivered to the whole foreground process group, as prescribed by POSIX.

Another special case is when a foreground process group becomes empty, either because its last member exits, or because that member is moved to another group; POSIX describes this as the terminal having no foreground process group. The requirements are then for tcgetpgrp(3) to return a value greater than 1 that does not match the process group ID of any existing process group; the former value is fine, at least until that value is reused for a new process, or the process group is recreated; both cases are unlikely but possible (and we do not handle them correctly). Other consequences might apply. This needs revisiting.

Stopping processes (and report of it)

The good news here is that there is (almost) nothing to do! Stopped processes are already envisioned as part of the ptrace(2) support (as it historically was implemented in 4BSD; a vestige of that history is the WUNTRACED flag to waitpid).

To make matters clearer, we introduce a new flag in the PM process state, JOBCTL_STOPPED; and we optionally rename the former STOPPED to TRACE_STOPPED.

Reporting stopped processes with waitpid(2)

Handling the WUNTRACED detection in waitpid is a simple copy-and-paste of the code dealing with the ZOMBIE case, including the tell_parent subroutine, into a new tell_parent_untraced.
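
Seen from user space, the behaviour this code path has to produce is the following. The program is a minimal sketch for any POSIX system with job control; the function name observe_stop is made up for the example.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the signal that stopped the child, or -1 if no stop was seen. */
int observe_stop(void)
{
    int status;
    int stopsig = -1;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        raise(SIGSTOP);     /* the child stops itself */
        _exit(0);           /* reached only after SIGCONT */
    }

    /* With WUNTRACED, waitpid() also reports children that have stopped. */
    if (waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status))
        stopsig = WSTOPSIG(status);

    kill(pid, SIGCONT);         /* let the child run to completion */
    waitpid(pid, &status, 0);   /* and reap it */
    return stopsig;
}
```

Without WUNTRACED the first waitpid() would simply keep blocking until the child terminated.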

Dealing with SIGSTOP and SIGCONT

Stopping and continuing are two new default actions associated with signals, so the obvious place for them is the sig_proc() function. System services are not considered, since stopping them has never been thought through and could easily have fatal consequences.

Continuing is handled before checking for catching, since POSIX specifies that the continuing action occurs anyway; note here that in the Linux operating system, that action occurs even before the trapping of that signal by a tracer (debugger). Also, in the X/Open standard, some more actions could be taken here: see below for WCONTINUED and SIGCHLD.

Stopping is handled after checking for ignoring, masking or catching. The process is stopped, and various flags are set to remember its state. If the parent is actually waiting with the WUNTRACED flag set, the condition is now completed and the parent has to be told.

Dealing with SIGTSTP, SIGTTIN, and SIGTTOU

See below for the VFS/TTY part. The PM part is exactly the same as for SIGSTOP, hence the four signals are handled as a whole in an ad hoc set.
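
The ordering described in this section (continue first, stop last) can be summarised by the following decision function. It is a simplified model and not the sig_proc() code: the function name, its parameters and the ACT_* flags are invented for the illustration, and default actions other than stopping (termination, for instance) are deliberately left out.

```c
#include <signal.h>

enum {
    ACT_NONE     = 0,
    ACT_CONTINUE = 1,   /* resume a stopped process */
    ACT_STOP     = 2,   /* stop the process */
    ACT_HANDLER  = 4,   /* run the user's handler */
    ACT_PENDING  = 8    /* keep the signal pending while blocked */
};

/* Returns the set of ACT_* actions to take for signo. */
int sketch_sig_action(int signo, int ignored, int blocked, int caught,
                      int is_system_service)
{
    /* The ad hoc set of stop signals. */
    int is_stop = signo == SIGSTOP || signo == SIGTSTP ||
                  signo == SIGTTIN || signo == SIGTTOU;
    int actions = ACT_NONE;

    /* SIGSTOP can be neither ignored, blocked nor caught. */
    if (signo == SIGSTOP)
        ignored = blocked = caught = 0;

    /* Continuing happens before any check on the disposition. */
    if (signo == SIGCONT && !is_system_service)
        actions |= ACT_CONTINUE;

    if (ignored)
        return actions;
    if (blocked)
        return actions | ACT_PENDING;
    if (caught)
        return actions | ACT_HANDLER;

    /* Stopping is the default action, applied only at the very end. */
    if (is_stop && !is_system_service)
        actions |= ACT_STOP;
    return actions;
}
```

For instance, a caught SIGCONT yields both ACT_CONTINUE and ACT_HANDLER, while a caught SIGTSTP yields only ACT_HANDLER.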

Suspending and stopping processes and process groups interacting with terminal

Stopping foreground process group, SIGTSTP

This one is easy: just add a handler for ^Z, really the c_cc[VSUSP] setting of termios (emitting SIGTSTP), alongside c_cc[VINTR] emitting SIGINT, etc.
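
The mapping from special characters to signals can be sketched as below. This is not the MINIX tty driver code: the function names are made up, and the sketch only covers the three standard signal characters, honouring the ISIG flag and the _POSIX_VDISABLE value that switches a special character off.

```c
#include <signal.h>
#include <termios.h>
#include <unistd.h>

/* True if ch is the special character stored in slot idx and it is enabled. */
static int is_special(const struct termios *tp, int idx, unsigned char ch)
{
    return tp->c_cc[idx] != _POSIX_VDISABLE && tp->c_cc[idx] == ch;
}

/* Returns the signal generated by input character ch, or 0 if none. */
int sketch_sig_for_char(const struct termios *tp, unsigned char ch)
{
    if (!(tp->c_lflag & ISIG))
        return 0;               /* signal characters are switched off */
    if (is_special(tp, VINTR, ch))
        return SIGINT;          /* usually ^C */
    if (is_special(tp, VQUIT, ch))
        return SIGQUIT;         /* usually ^\ */
    if (is_special(tp, VSUSP, ch))
        return SIGTSTP;         /* usually ^Z, the new case */
    return 0;
}
```

The driver would then send the returned signal to the foreground process group of the terminal.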

Warning: Take extra care with the following: this was true back in 2011, but things have changed since then. The use of a POSIX system call like killpg() directly from a service to PM is controversial; an alternative could be to implement a sys_killpg() which does the work using the same codepath as sys_kill… and more ugly signal code inside the kernel.

We take the occasion to revise the code dealing with those signals (sig_char() in tty/tty.c), to use killpg directly. We also correct a POSIX non-conformance when dealing with the SIGHUP signal: the comments were correct about the job to be done, but the actual code in PM does not perform that job; it is then safer to actually ask for the precise tasks to do, rather than expecting remote code to handle the borderline cases!

Reading from background processes, SIGTTIN

When a background process wants to read from its controlling terminal, the natural action with job control is to stop the process, in the expectation that the shell will switch the process to the foreground, to give it the focus. However, a job-control-aware process has some control over this. The stop is delivered through the SIGTTIN signal, whose default action is to stop the process, as we saw above. Like most signals, however, it can be ignored, caught, or masked (blocked); and POSIX specifies various behaviours in each case:

  • if the SIGTTIN signal is ignored or blocked, the operation fails with the EIO error code, and no characters are read
  • if the SIGTTIN signal is caught, the signal handler is run; the read operation itself is aborted with EINTR (as happens whenever a signal is caught)
  • otherwise the SIGTTIN signal has its normal effect, stopping the process (and all the processes in the group, since the signal is raised against the process group); later, when the process is continued, the whole system call is restarted

In MINIX, this restarting is called reviving, and is handled in servers/vfs/pipe.c. So we create a new reason for a process to be suspended, FP_BLOCKED_ON_BGIO, to characterize this case. The implementation does not do anything special besides keeping the state; worth noting here is the new character letter, 'B', issued by the procfs server in the FS status field of the /proc/$$/psinfo pseudo file to mark this new status (notice also that the FP_BLOCKED_ON_xxx state, like any process-related VFS information, is not exposed by the is server).

Since this test on the disposition of the signal occurs while reading, it requires VFS to know or to learn how a given signal (signals are managed by PM) will be handled. We chose to learn, and created a new system call (reserved for system services), sighandled(2), to ask PM how a given signal would be handled. The implementation parallels the logic of sig_proc(), but no action is actually performed. A possible alternative to the current scheme could be to introduce another flag to sig_proc(), called for example dry_run, and actually call the sig_proc() routine; however this routine is already large and heavily loaded with several sub-cases and flags, so the chosen solution is probably clearer.

Once VFS has learnt how the signal will be handled, the actions follow a simple decision tree: either allow or abort the system call, or send the SIGTTIN signal (to the whole background process group); in the latter case the current system call is either aborted with EINTR, as happens whenever a signal is caught, or put on hold with the new FP_BLOCKED_ON_BGIO status.
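
The decision tree for a read can be written down as a small pure function. It is a sketch of the rules listed above rather than the VFS code: the enum, the function name and its parameters are hypothetical, and the orphaned process group case is not modelled.

```c
#include <sys/types.h>

enum bg_read_verdict {
    BG_READ_PROCEED,        /* foreground caller: pass the read on */
    BG_READ_FAIL_EIO,       /* SIGTTIN ignored or blocked: fail, no signal */
    BG_READ_SIGNAL_EINTR,   /* SIGTTIN caught: signal the group, abort */
    BG_READ_SIGNAL_HOLD     /* default: signal the group, hold the call
                               (the FP_BLOCKED_ON_BGIO state) */
};

/*
 * Decide what to do with a read on the controlling terminal, given the
 * caller's process group, the terminal's foreground process group and
 * the caller's disposition of SIGTTIN as reported by PM.
 */
enum bg_read_verdict sketch_bg_read(pid_t caller_pgrp, pid_t fg_pgrp,
                                    int ignored, int blocked, int caught)
{
    if (caller_pgrp == fg_pgrp)
        return BG_READ_PROCEED;
    if (ignored || blocked)
        return BG_READ_FAIL_EIO;
    if (caught)
        return BG_READ_SIGNAL_EINTR;
    return BG_READ_SIGNAL_HOLD;
}
```

Only the last outcome leaves the system call suspended until the process group is continued.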

Writing from background processes, SIGTTOU, and the TOSTOP flag

There are only marginal differences with the above case for reading.

TOSTOP prevents background processes from writing to their controlling terminal. The actual action is the same as with READ or IOCTL. But since it happens within VFS, we must remember the current value of the TOSTOP flag (which is part of the local control flags of the termios structure); so we also intercept the TCSETS ioctl, which is the only way to affect this flag.

Another difference with SIGTTIN is that if the signal is ignored or masked, the operation is allowed (in contrast to the EIO error code returned to a reading process), without any signal being delivered.
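
For the write case, the same kind of sketch shows the two differences: the check applies only while TOSTOP is set, and an ignored or blocked SIGTTOU lets the write through. As before, the names are hypothetical and the orphaned process group case is left out.

```c
#include <sys/types.h>
#include <termios.h>

enum bg_write_verdict {
    BG_WRITE_PROCEED,       /* let the write reach the terminal */
    BG_WRITE_SIGNAL_EINTR,  /* SIGTTOU caught: signal the group, abort */
    BG_WRITE_SIGNAL_HOLD    /* default: signal the group, hold the call */
};

/*
 * lflag is the remembered c_lflag of the terminal (as last set through
 * TCSETS); the other parameters describe the caller and its SIGTTOU
 * disposition.
 */
enum bg_write_verdict sketch_bg_write(pid_t caller_pgrp, pid_t fg_pgrp,
                                      tcflag_t lflag, int ignored,
                                      int blocked, int caught)
{
    if (caller_pgrp == fg_pgrp)
        return BG_WRITE_PROCEED;
    if (!(lflag & TOSTOP))
        return BG_WRITE_PROCEED;    /* background writes are allowed */
    if (ignored || blocked)
        return BG_WRITE_PROCEED;    /* unlike reads, no EIO here */
    if (caught)
        return BG_WRITE_SIGNAL_EINTR;
    return BG_WRITE_SIGNAL_HOLD;
}
```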

Suspended foreground processes sent to background

The last piece of the puzzle is when a foreground application is suspended on its controlling tty, waiting for some input, and meanwhile is sent to the background: it should then free the tty task to allow another, new foreground process to read. This is done by sending a CANCEL message to the task, and then by putting the now backgrounded process in the FP_BLOCKED_ON_BGIO state.

This is still WIP.

Enabling job control in applications

FreeBSD

NetBSD

BSD make (and bmake from pkgsrc)

gdb

shells/pdksh shells/mksh

The configure script should figure it out automatically. Recompile, check config.h for a definition of JOBS, do some testing to ensure it works.

bin/ksh from Minix-current

bin/ksh (imported from NetBSD) is just a modified version of pdksh with the configure script and a few other things removed or changed. To enable job control, copy the config.h generated by shells/pdksh over /usr/src/bin/ksh/config.h.

shells/tcsh shells/static-tcsh shells/standalone-tcsh

There is a Minix config file shipped with tcsh in the config directory. You need to set POSIXJOBS and/or BSDJOBS as appropriate. After testing, upstream the changes.

shells/bash shells/bash2

bash provides a simple configure script switch. To enable job control, just edit the package Makefile so that CONFIGURE_ARGS is set with --enable-job-control instead of --disable-job-control.

gnulib devel/m4 devel/bison

spawni.c from gnulib contains a call to setpgid(). The call to setpgid() in spawni.c has been commented out in devel/m4 and devel/bison. Those two patches haven't been upstreamed, and can be dropped once Minix has setpgid() support. It would also be prudent to check with upstream gnulib. At least one developer has been working on formal Minix support (i.e. porting all of the gnulib replacement functions).

Packages that people want once job control is working

  • devel/distcc
  • security/sudo
  • misc/tmux

Advanced features

Not yet implemented.

Support for SIGCHLD

Support for orphaned process groups

Support for WCONTINUED

Testing

Testing is probably the hardest part of the whole project. Here are some tips or ideas on how to test job control. The list is incomplete, but it might give some ideas for where to start.

use software that uses job control

Don't overlook the obvious: try out some of the job control features of the shells listed above.

tests that come with zsh

zsh ships with a test file called “job-control-tests” in the Misc directory that is supposed to be run interactively. Try it out. It might be useful to check the sources of other shells for additional job control test code.

cpulimit

It may require a little porting work, but there is a tool called cpulimit which uses job control to start and stop a process in order to limit the percentage of CPU time it gets. The code is on GitHub. You could use cpulimit to limit a CPU-heavy application (sysutils/cpuburn for example, or even just a tight infinite loop that does floating point math). That might help expose some bugs.
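
The stop/continue technique itself is easy to reproduce in a few lines, which can serve as a first test before porting the real tool. The following is a rough sketch and not cpulimit's code: the function names and the fixed run/stop intervals are made up, and it only works on a direct child of the caller because it relies on waitpid().

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Fork a child that burns CPU forever; returns its pid or -1. */
pid_t spawn_busy_child(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        volatile double x = 1.0;

        for (;;)
            x = x * 1.000001 + 0.5;
    }
    return pid;
}

/*
 * Alternately let the child run for run_ms and keep it stopped for
 * stop_ms. Returns the number of stops observed, or -1 on error.
 */
int throttle_child(pid_t pid, long run_ms, long stop_ms, int cycles)
{
    int i, status, seen = 0;

    for (i = 0; i < cycles; i++) {
        sleep_ms(run_ms);
        if (kill(pid, SIGSTOP) < 0)
            return -1;
        /* Check that the stop is really reported to the parent. */
        if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
            return -1;
        seen++;
        sleep_ms(stop_ms);
        if (kill(pid, SIGCONT) < 0)
            return -1;
    }
    return seen;
}
```

The caller remains responsible for killing and reaping the child once the test is over.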

writing your own tests

The glibc manual has a section titled Implementing a Shell. You could probably learn enough about shells and job control to implement a simple shell that supports job control (no fancy stuff like if/while/variables/…). You could write some sort of script to execute within the shell which tests job control. Alternatively, you could take an existing shell and add debugging commands to it which would help with job control testing. Either way, if the test is to be included in the Minix source tree, be sure it's compatible with the Minix license terms.

implementingjobcontrol.txt · Last modified: 2015/01/03 13:32 by antoineleca