Warning: This document has gone stale. In particular, the linked commits, which were never actually tested, matched the state of MINIX as of April 2013 and are not usable with MINIX-current.
The goal is to have POSIX-class job control in MINIX.
Job control is a feature added in 1983 to the 4.2BSD “C shell”, csh. Shortly after, the idea was ported to AT&T System V, and many descendants of the Bourne shell (such as ksh and bash) now offer this feature. The underlying mechanism has been carefully specified in the POSIX standard.
Minix conforms to POSIX.1-1990, where job control was optional; in the context of a small operating system to run on 16-bit Intel PCs, such an option was dropped.
That option then became mandatory as part of the FIPS standardization, so almost every *nix-like operating system provides it these days. It has been required for conformance since the 2001 revision of the POSIX standard. Several important applications (gdb, bash, …) also assume this feature is always available and do not provide an acceptable work-around.
This section is directly extracted from the BSD termios(4) manual page.
Every process is associated with a particular process group and session. The grouping is hierarchical: every member of a particular process group is a member of the same session. This structuring is used in managing groups of related processes for purposes of job control; that is, the ability from the keyboard (or from program control) to simultaneously stop or restart a complex command (a command composed of one or more related processes). The grouping into process groups allows delivering of signals that stop or start the group as a whole, along with arbitrating which process group has access to the single controlling terminal. The grouping at a higher layer into sessions is to restrict the job control related signals and system calls to within processes resulting from a particular instance of a login. Typically, a session is created when a user logs in, and the login terminal is setup to be the controlling terminal; all processes spawned from that login shell are in the same session, and inherit the controlling terminal.
A job control shell operating interactively (that is, reading commands
from a terminal) normally groups related processes together by placing
them into the same process group. A set of processes in the same process
group is collectively referred to as a job. When the foreground
process group of the terminal is the same as the process group of a particular
job, that job is said to be in the foreground. When the
process group of the terminal is different from the process group of a
job (but is still the controlling terminal), that job is said to be in
the background. Normally the shell reads a command and starts the
job that implements that command. If the command is to be started in the
foreground (typical), it sets the process group of the terminal to the
process group of the started job, waits for the job to complete, and then
sets the process group of the terminal back to its own process group (it
puts itself into the foreground). If the job is to be started in the
background (as denoted by the shell operator “&”), it never changes the
process group of the terminal and does not wait for the job to complete
(that is, it immediately attempts to read the next command). If the job
is started in the foreground, the user may type a key (usually ^Z)
which generates the terminal stop signal (SIGTSTP) and has the effect of
stopping the entire job. The shell will notice that the job stopped, and
will resume running after placing itself in the foreground. The shell
also has commands for placing stopped jobs in the background, and for
placing stopped or background jobs into the foreground.
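The foreground sequence described above (new process group, hand over the terminal, wait, take the terminal back) can be sketched as follows. This is only an illustration under stated assumptions, not the code of any particular shell: `run_foreground_job`, `give_terminal_to` and `sample_job` are hypothetical names, the job is a plain C function instead of an exec'd command, and passing a negative `tty_fd` skips the terminal handling so the sketch also runs without a controlling terminal.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the command a real shell would exec. */
int sample_job(void) { return 3; }

/* Make pgid the foreground process group of the terminal behind tty_fd.
 * SIGTTOU is blocked around the call, because a caller that is currently
 * in the background would otherwise be stopped by tcsetpgrp(). */
static void give_terminal_to(int tty_fd, pid_t pgid)
{
    sigset_t block, saved;

    if (tty_fd < 0)
        return;                       /* no terminal to manage */
    sigemptyset(&block);
    sigaddset(&block, SIGTTOU);
    sigprocmask(SIG_BLOCK, &block, &saved);
    tcsetpgrp(tty_fd, pgid);
    sigprocmask(SIG_SETMASK, &saved, NULL);
}

/* Run body() as a foreground job in its own process group.
 * Returns its exit status, -2 if the job stopped, -1 on error. */
int run_foreground_job(int (*body)(void), int tty_fd)
{
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        setpgid(0, 0);                /* child side of the race-free idiom */
        _exit(body());
    }
    setpgid(child, child);            /* parent side: same call, same result */
    give_terminal_to(tty_fd, child);  /* the job is now in the foreground */

    if (waitpid(child, &status, WUNTRACED) == -1)
        status = -1;
    give_terminal_to(tty_fd, getpgrp()); /* shell puts itself back in front */

    if (status == -1)
        return -1;
    if (WIFSTOPPED(status))
        return -2;                    /* a real shell would record the job */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `setpgid()` in both the parent and the child is the usual way to make sure the group exists before either side relies on it, whichever process is scheduled first.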
Done | Functionality | Comments |
20% | more than one process group within a session | |
15% | changing the process group at controlling terminal | |
10% | stopping processes, and report of it | |
10% | suspending process groups from terminal | |
☐ | stopping background processes interacting with terminal | Blocker, see below |
☐ | enable job control in ash(1) | |
☐ | enable multiple jobs in make(1) and bmake(1pkgsrc) | |
☐ | support for SIGCHLD signal (enhanced to stopping) | |
☐ | support for TOSTOP control | |
☐ | support for orphaned process groups |
As part of the normal MINIX test suite, a new program has to be added to test all the functionality specified by POSIX; this program is itself tested against other implementations (such as NetBSD).
Tested | Target environment | By | Comments |
☐ | Clang-compiled MINIX | ||
☐ | GCC-compiled MINIX | ||
☐ | GDB | ||
☐ | Pkgsrc (bash, etc.) |
Solved | Problem | Comments |
☐ |
The basic interface is setpgid(2), also known as setpgrp(2) on BSD systems. (POSIX changed the name because setpgrp() on System V is different and more limited.)
    #define _POSIX_SOURCE
    #include <unistd.h>

    int setpgid(pid_t _pid, pid_t _pgid);
This function serves two purposes: it can create a new process group (passing 0 as second argument); or it can move a process to an already existing group.
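Both uses can be seen in a small sketch. It assumes a POSIX system that already provides setpgid(2) and getpgid(2); `spawn_idle_child` and `demo_setpgid` are hypothetical helper names introduced only for this illustration.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that does nothing until it is killed. */
static pid_t spawn_idle_child(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        for (;;)
            pause();
    }
    return pid;
}

/* Returns 1 when both uses of setpgid() behaved as described, else 0. */
int demo_setpgid(void)
{
    pid_t first = spawn_idle_child();
    pid_t second = spawn_idle_child();
    int ok = 0;

    if (first > 0 && second > 0) {
        /* Purpose 1: a pgid of 0 means "use the pid", which creates a
         * new process group led by the first child. */
        int created = setpgid(first, 0) == 0 && getpgid(first) == first;
        /* Purpose 2: move the second child into that existing group. */
        int joined = setpgid(second, first) == 0 && getpgid(second) == first;

        ok = created && joined;
    }
    /* Clean up whichever children were actually created. */
    if (first > 0) {
        kill(first, SIGKILL);
        waitpid(first, NULL, 0);
    }
    if (second > 0) {
        kill(second, SIGKILL);
        waitpid(second, NULL, 0);
    }
    return ok;
}
```

The parent can do both moves itself because the children share its session and have not called exec.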
The X/Open standards add a supplementary function, getpgid(2), which is not strictly necessary but is often assumed by some packages. (Again, POSIX moved the name away from BSD's getpgrp(2) because the function of that name on System V is different, and returns the session identifier.)
    #define _XOPEN_SOURCE
    #include <unistd.h>

    pid_t getpgid(pid_t _pid);
The prototypes are declared within the libc headers, although they were conditionally compiled out.
Setpgrp() is left over. While there is a prototype for it in <unistd.h>, it corresponds to the legacy 4BSD function which is now superseded by setpgid(2); furthermore, X/Open standardized a different interface (with no arguments, originally from System V, corresponding to setpgid(0,0)), which is even flagged in <unistd.h> with the comment XXX prototype error. The best thing to do here is to carefully avoid such a trap, so we shall not provide any implementation of it, and shall leave the prototype declaration guarded by the #ifndef __minix barrier.
Once the new call numbers are declared, the needed library support has to go into libc, which poses no difficulty.
Before the introduction of job control, there was some confusion within MINIX between process groups and sessions, since there was always only one process group within a session. This made it possible to use the process ID of the leader as both session ID and process group ID, kept in the mp_procgrp field. This is of course no longer possible. The problem is solved by introducing a new member, mp_session, in the PM process table to record the session identifier, which is used less often.
Furthermore, on the VFS side there was no information about the session or the process group of a given process: the only relevant data transmitted from PM are the pid, the endpoint of the session leader (more about this later), and a flag set when the process is a session leader. So here we add both the process group ID and the session ID for each process.
The getpgid(2) system call is straightforward, much like similar calls such as getuid(2), and is handled in pm/getset.c, function do_get(). Since it uses exactly the same logic as getsid(2), we share system call 113, as is done with getpid(2)/getppid(2).
In MINIX, process group information is needed both in PM (e.g. to determine which processes are signalled as part of a group) and in VFS (when it comes to the interface with the controlling terminal). So we follow the long-established practice of having the setpgid system call handled in PM, which then calls VFS asynchronously. Note there are two subtleties here:
* Some checks are done only by VFS, namely that a child whose process group is about to be changed must not have made any exec*(2) system call. So the actual change within PM is deferred until VFS gives back the OK: this is implemented within handle_vfs_reply().
* The process to be changed can be a child of the current process. To accurately update its information while replying to the caller, we need to keep the distinction during all the transitions PM→VFS→PM; so we keep a pointer to the caller process slot within PM as an additional message parameter (PM_CALLER_IS_PARENT), which indicates when non-NULL that the target process (indicated to VFS as PM_PROC, just like the other similar inter-service communications) is actually the child of the caller.
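The exec check mentioned in the first point is visible from user space: POSIX requires setpgid() on a child that has already called exec to fail with EACCES. The sketch below demonstrates this on a system that already implements the rule; it says nothing about how PM and VFS exchange messages. `setpgid_after_exec_errno` is a hypothetical name, and the sketch assumes a `sleep` utility can be found through PATH. A close-on-exec pipe tells the parent when the exec has happened.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the errno left by setpgid() on a child that has already
 * exec'd, 0 if the call unexpectedly succeeded, -1 on setup failure. */
int setpgid_after_exec_errno(void)
{
    int fds[2];
    int err = 0;
    char byte;
    ssize_t n;
    pid_t child;

    if (pipe(fds) == -1)
        return -1;
    /* The write end closes automatically when exec succeeds. */
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        close(fds[0]);
        execlp("sleep", "sleep", "5", (char *)NULL);
        _exit(127);                   /* exec failed */
    }
    close(fds[1]);
    /* End of file on the pipe means the child is past its exec. */
    do {
        n = read(fds[0], &byte, 1);
    } while (n > 0 || (n == -1 && errno == EINTR));
    close(fds[0]);

    if (setpgid(child, child) == -1)
        err = errno;                  /* expected: EACCES */

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return err;
}
```

This is why shells call setpgid() in the child before exec as well as in the parent: once the exec has happened, the parent can no longer move the child.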
As an additional optimization, in VFS the session to which a process belongs is recorded through a direct pointer to its session leader (rather than as a session ID); this works in VFS because the session ID is needed only with respect to the controlling terminal, which vanishes as soon as the session leader exits.
The basic interface is tcsetpgrp(3) and tcgetpgrp(3) in POSIX, which are almost always thin wrappers around the TIOCSPGRP and TIOCGPGRP ioctls on the controlling terminal.
    #define _POSIX_SOURCE
    #include <unistd.h>

    pid_t tcgetpgrp(int fd);
    int tcsetpgrp(int fd, pid_t pgid);
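A typical use of tcgetpgrp() is to find out whether the caller is currently in the foreground. The following is a minimal sketch, assuming a system where the call is available; `in_foreground` is a hypothetical helper name.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the caller belongs to the foreground process group of
 * the terminal behind fd, 0 if it is in a background group, and -1 if
 * tcgetpgrp() fails (for instance because fd is not a terminal, or is
 * not the caller's controlling terminal); errno is set by tcgetpgrp(). */
int in_foreground(int fd)
{
    pid_t fg = tcgetpgrp(fd);

    if (fg == (pid_t)-1)
        return -1;
    return fg == getpgrp();
}
```

A program would normally pass STDIN_FILENO; a file descriptor that does not refer to a terminal, such as a pipe, yields -1.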
The X/Open standards add a supplementary function, tcgetsid(3), which is not strictly necessary but is sometimes blindly assumed by some packages.
    #define _XOPEN_SOURCE
    #include <sys/types.h>
    #include <termios.h>

    pid_t tcgetsid(int fd);
It returns the session identifier corresponding to the terminal, but only if that terminal is the controlling terminal of the session (whether that must be the caller's own session seems debatable; restricting it would avoid an information leak). The corresponding ioctl is TIOCGSID.
The prototypes and the ioctl labels were declared in MINIX for a long time (since Minix-vmd provided an implementation.) The library support is already here too thanks to NetBSD compatibility.
For tcgetsid(3), we are missing both the declarations and the library support.
The implementation of the TIOCSPGRP, TIOCGPGRP, and TIOCGSID ioctls takes place in VFS. To achieve that, we special-case the handling of terminal I/O within the general path, so that we can intercept the IOCTL messages and deal with their particularities. We take that occasion to add the process group of the caller to each message (with 0 when this is not a controlling terminal), to allow the tty driver to determine whether the call comes from a background or a foreground process.
A new member holding the foreground process group is also added to the tty structure. When a session takes control of a terminal (indicated by the O_NOCTTY flag being clear in the DEV_OPEN message), VFS sends the session as a supplementary field, which is the easiest way to have both members initialized (a session leader cannot quit its process group).
The validity of the new foreground process group id, as passed in tcsetpgrp(), should be checked; since VFS intercepts its case, it can easily perform the validation check. Note that the information is replicated in the TTY structures, mainly to allow the keyboard-generated signals to be delivered to the whole foreground process group, as prescribed by Posix.
Another special case is when a foreground process group becomes empty, either because its last member exits or because that member is moved to another group; POSIX describes this as the terminal having no foreground process group. The requirement is then for tcgetpgrp(3) to return a value greater than 1 that does not match the process group ID of any existing process group; the former value is fine, at least until that value is reused for a new process or the process group is recreated; both cases are unlikely but possible (and we do not handle them correctly). Other consequences might apply. This needs revisiting.
The good news here is that there is (almost) nothing to do! Stopped processes are already provided for as part of the ptrace(2) support (as it was historically implemented in 4BSD; a vestige of that history is the WUNTRACED flag to waitpid).
To make matters clearer, we introduce a new flag in the PM process state, JOBCTL_STOPPED; and we optionally rename the former STOPPED to TRACE_STOPPED.
Handling the WUNTRACED detection in waitpid is a simple copy-and-paste of the code dealing with the ZOMBIE case, including the tell_parent subroutine, into a new tell_parent_untraced.
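From the caller's point of view, the WUNTRACED behaviour looks like the sketch below. It is an illustration run against a system that already supports stopped-child reporting, not MINIX code; `observe_stop` is a hypothetical name. SIGSTOP is used rather than SIGTSTP so that the result does not depend on whether the process group is orphaned.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the signal that stopped the child, as reported to a parent
 * waiting with WUNTRACED, or -1 if no stop was reported. */
int observe_stop(void)
{
    int status;
    int stopsig = -1;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        raise(SIGSTOP);               /* stop ourselves */
        _exit(0);
    }
    /* Without WUNTRACED this call would only return once the child
     * has terminated. */
    if (waitpid(child, &status, WUNTRACED) == child && WIFSTOPPED(status))
        stopsig = WSTOPSIG(status);

    kill(child, SIGKILL);             /* also terminates a stopped process */
    waitpid(child, &status, 0);
    return stopsig;
}
```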
Stopping and continuing are two new default actions associated with signals, so the obvious place for them is the sig_proc() function. System services are not considered, since stopping them has never been thought through and could easily have fatal consequences.
Continuing is handled before checking for catching, since POSIX specifies that the continuing action occurs anyway; note here that in the Linux operating system, that action occurs even before the trapping of that signal by a tracer (debugger). Also, in the X/Open standard, some more actions could be taken here: see below for WCONTINUED and SIGCHLD.
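The rule that continuing happens regardless of the disposition of the signal can be checked with a child that ignores SIGCONT. This is a sketch against an existing POSIX system, and `resume_despite_ignored_sigcont` is a hypothetical name.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the exit status of the child (42 if it was resumed),
 * or -1 if something went wrong. */
int resume_despite_ignored_sigcont(void)
{
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        signal(SIGCONT, SIG_IGN);     /* the process still gets continued */
        raise(SIGSTOP);
        _exit(42);                    /* reached only after being resumed */
    }
    /* Wait until the child is really stopped before continuing it. */
    if (waitpid(child, &status, WUNTRACED) != child || !WIFSTOPPED(status)) {
        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
        return -1;
    }
    kill(child, SIGCONT);
    if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Ignoring SIGCONT only suppresses delivery of the signal to the process; the continue action has already taken place when the signal was generated.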
Stopping is handled after checking for ignoring, masking or catching. The process is stopped, and various flags are set to remember its state. If the parent is actually waiting with the WUNTRACED flag set, the condition is now completed and the parent has to be told.
See below for the VFS/TTY part. The PM part is exactly the same as for SIGSTOP, hence the four signals are handled as a whole in an ad hoc set.
This one is easy: just add a handler for ^Z, really the c_cc[VSUSP] setting of termios (emitting SIGTSTP), along with c_cc[VINTR] emitting SIGINT, etc.
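The mapping from special characters to signals can be sketched as a pure function over a termios structure. This is an illustration only and not the actual code of the MINIX tty driver; `signal_for_char` is a hypothetical name.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <termios.h>
#include <unistd.h>

/* Returns the signal the terminal should generate for input byte ch
 * under the settings in tp, or 0 if ch is an ordinary character. */
int signal_for_char(const struct termios *tp, unsigned char ch)
{
    if (!(tp->c_lflag & ISIG))
        return 0;                     /* signal characters are disabled */
    if (ch == _POSIX_VDISABLE)
        return 0;                     /* this value marks an unused slot */
    if (ch == tp->c_cc[VINTR])
        return SIGINT;
    if (ch == tp->c_cc[VQUIT])
        return SIGQUIT;
    if (ch == tp->c_cc[VSUSP])
        return SIGTSTP;               /* usually ^Z, the job control case */
    return 0;
}
```

With the usual settings, where c_cc[VSUSP] is 26 (^Z) and ISIG is set, byte 26 maps to SIGTSTP.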
Warning: take extra care with the following: it was true back in 2011, but things have changed since then. The use of a POSIX system call like killpg() directly from a service to PM is controversial; an alternative could be to implement a sys_killpg() which does the work using the same code path as sys_kill… and more ugly signal code inside the kernel.
We take the occasion to revise the code dealing with those signals (sig_char() in tty/tty.c) to use killpg directly. We also correct a POSIX non-conformance in the handling of the SIGHUP signal: the comments were correct about the job to be done, but the actual code in PM does not perform that job; it is then safer to ask explicitly for the precise tasks to be done, rather than expecting remote code to handle the borderline cases!
When a background process wants to read from its controlling terminal, the natural action with job control is to stop the process, in the expectation that the shell will switch the process to the foreground to give it the focus. However, a job-control-aware process has some control over this. The stop is delivered through the SIGTTIN signal, whose default action is to stop the process, as we saw above. Like most signals, however, it can be ignored, caught, or masked (blocked); and POSIX specifies various behaviours in each case:
servers/vfs/pipe.c. So we create a new reason for a process to be suspended, FP_BLOCKED_ON_BGIO, to characterize this case. The implementation does not do anything special besides keeping the state; worth noting here is the new letter, 'B', issued by the procfs server in the FS status field of the /proc/$$/psinfo pseudo file to mark this new status (notice also that the FP_BLOCKED_ON_xxx state, like any process-related VFS information, is not exposed by is).
Since this test on the disposition of the signal occurs while reading, it requires VFS to know or to learn how a given signal (signals are managed by PM) will be handled. We choose to learn, and we create a new system call (reserved to system services), sighandled(2), to ask PM how a given signal would be handled. The implementation parallels the actions done in sig_proc(), but no action is actually performed. A possible alternative to the current scheme could be to introduce another flag to sig_proc(), called for example dry_run, and actually call the sig_proc() routine; however this routine is already large and heavily loaded with several sub-cases and flags, so the chosen solution is probably clearer.
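The kind of answer such a query gives can be illustrated from user space, where a process may inspect its own signal state without sending anything. The sketch below is only an analogy: it is not the sighandled(2) interface, whose signature is not shown here, and `query_disposition` and the `DISP_*` names are hypothetical. Checking "ignored" before "blocked" is also an assumption of the sketch.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>

enum disposition {
    DISP_ERROR = -1,
    DISP_DEFAULT,                     /* default action would be taken */
    DISP_IGNORED,                     /* handler is SIG_IGN */
    DISP_BLOCKED,                     /* signal is in the process mask */
    DISP_CAUGHT                       /* a user handler is installed */
};

/* Report how sig would be handled by the calling process; nothing is
 * sent and no state is changed. */
enum disposition query_disposition(int sig)
{
    struct sigaction sa;
    sigset_t mask;

    /* A NULL new action or new mask only reads the current state. */
    if (sigaction(sig, NULL, &sa) == -1)
        return DISP_ERROR;
    if (sigprocmask(SIG_BLOCK, NULL, &mask) == -1)
        return DISP_ERROR;

    if (sa.sa_handler == SIG_IGN)
        return DISP_IGNORED;
    if (sigismember(&mask, sig) == 1)
        return DISP_BLOCKED;
    if (sa.sa_handler == SIG_DFL)
        return DISP_DEFAULT;
    return DISP_CAUGHT;
}
```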
Once VFS has learnt how the signal will be handled, the actions follow a simple decision tree: either allowing or aborting the system call, or sending the SIGTTIN signal (to the whole background process group), which will require either aborting the current system call with EINTR just as will be done any time a signal will be caught, or putting the system call on hold with the new FP_BLOCKED_ON_BGIO status.
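The decision tree can be written down as a pure function. This is a sketch under stated assumptions, not the VFS code: the structure, the enum and the function name are hypothetical, and the orphaned-group case follows the POSIX rule (EIO, no signal) rather than anything stated above.

```c
/* What is known about the reader; every field is a boolean. */
struct bg_reader {
    int in_foreground;                /* caller is in the foreground group */
    int sigttin_ignored;
    int sigttin_blocked;
    int sigttin_caught;               /* a user handler is installed */
    int pgrp_orphaned;
};

enum bg_read_action {
    BG_READ_ALLOW,                    /* let the read proceed */
    BG_READ_FAIL_EIO,                 /* abort with EIO, no signal sent */
    BG_READ_SIGNAL_EINTR,             /* send SIGTTIN, abort with EINTR */
    BG_READ_SIGNAL_HOLD               /* send SIGTTIN, park the call */
};

/* Decide what to do with a read on the controlling terminal. */
enum bg_read_action decide_bg_read(const struct bg_reader *r)
{
    if (r->in_foreground)
        return BG_READ_ALLOW;
    /* Nobody would react to the signal, so the read simply fails. */
    if (r->sigttin_ignored || r->sigttin_blocked || r->pgrp_orphaned)
        return BG_READ_FAIL_EIO;
    /* The signal goes to the whole background process group. */
    if (r->sigttin_caught)
        return BG_READ_SIGNAL_EINTR;  /* handler runs, call interrupted */
    return BG_READ_SIGNAL_HOLD;       /* default action: the group stops */
}
```

The last outcome corresponds to the FP_BLOCKED_ON_BGIO state: the process is stopped and its read is kept on hold until the job returns to the foreground.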
There are only marginal differences with the above case for reading.
TOSTOP prevents background processes from writing to their controlling terminal. The actual action is the same as with READ or IOCTL. But since it happens within VFS, we must remember the current value of the TOSTOP flag (which is part of the local control flags of the termios structure); so we also intercept the TCSETS ioctl, which is the only way to affect this flag.
Another difference with SIGTTIN is that if the signal is ignored or masked, the operation is allowed (in contrast to the EIO error code returned to a reading process), without any signal being delivered.
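The write-side rules, including the TOSTOP test, can be sketched in the same way. Again the names are hypothetical and this is not the VFS code; the orphaned-group case follows the POSIX rule (EIO, no signal) and is an addition of the sketch.

```c
#define _XOPEN_SOURCE 700
#include <termios.h>

enum bg_write_action {
    BG_WRITE_ALLOW,                   /* let the write proceed */
    BG_WRITE_FAIL_EIO,                /* abort with EIO, no signal sent */
    BG_WRITE_SIGNAL                   /* send SIGTTOU to the process group */
};

/* Decide what to do with a write to the controlling terminal.
 * lflag is the c_lflag value remembered from the last TCSETS;
 * the other arguments are booleans describing the caller. */
enum bg_write_action decide_bg_write(tcflag_t lflag, int in_foreground,
                                     int ignored_or_blocked, int orphaned)
{
    /* Background writes are only restricted when TOSTOP is set. */
    if (in_foreground || !(lflag & TOSTOP))
        return BG_WRITE_ALLOW;
    /* Unlike the read case, ignoring or blocking SIGTTOU lets the
     * write go through, and no signal is delivered. */
    if (ignored_or_blocked)
        return BG_WRITE_ALLOW;
    if (orphaned)
        return BG_WRITE_FAIL_EIO;
    return BG_WRITE_SIGNAL;
}
```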
The last piece of the puzzle is when a foreground application is suspended on its controlling tty, waiting for some input, and is meanwhile sent to the background: it should then free the tty task to allow another, new foreground process to read. This is done by sending a CANCEL message to the task, and then putting the now-backgrounded process in the FP_BLOCKED_ON_BGIO state.
This is still WIP.
The configure script should figure it out automatically. Recompile, check config.h for a definition of JOBS, do some testing to ensure it works.
bin/ksh (imported from NetBSD) is just a modified version of pdksh with the configure script and a few other things removed or changed. To enable job control, you need to update config.h: the config.h generated by shells/pdksh needs to be copied over to /usr/src/bin/ksh/config.h.
There is a Minix config file shipped with tcsh in the config directory. You need to set POSIXJOBS and/or BSDJOBS as appropriate. After testing, upstream the changes.
bash provides a simple configure script switch. To enable job control, just edit the package Makefile so that CONFIGURE_ARGS is set with --enable-job-control instead of --disable-job-control.
spawni.c from gnulib contains a call to setpgid(). The call to setpgid() in spawni.c has been commented out in devel/m4 and devel/bison. Those two patches haven't been upstreamed, and can be dropped once Minix has setpgid() support. It would also be prudent to check with upstream gnulib. At least one developer has been working on formal Minix support (i.e. porting all of the gnulib replacement functions).
Not yet implemented.
Testing is probably the hardest part of the whole project. Here are some tips or ideas on how to test job control. The list is incomplete, but it might give some ideas for where to start.
Don't pass over the obvious thing: try out some of the job control features of the shells listed above.
zsh ships with a test file called “job-control-tests” in the Misc directory that is supposed to be run interactively. Try it out. It might be useful to check the sources of other shells for additional job control test code.
It may require a little porting work, but there is a tool called cpulimit which uses job control to start and stop a process in order to limit the percentage of CPU time it gets. The code is on GitHub. You could use cpulimit to limit a CPU-heavy application (sysutils/cpuburn, for example, or even just a tight infinite loop that does floating-point math). That might help expose some bugs.
The glibc manual has a section titled Implementing a Shell. You could probably learn enough about shells and job control to implement a simple shell that supports job control (no fancy stuff like if/while/variables/…). You could write some sort of script to execute within the shell which tests job control. Alternatively, you could take an existing shell and add debugging commands to it which would help with job control testing. Either way, if the test is to be included in the Minix source tree, be sure it's compatible with the Minix license terms.