Stale page
All the changes listed on this page, except for the dynamic updates and failure resilience support, have been merged into the VFS-FS protocol. The VFS-FS protocol documentation has been updated accordingly. Further changes to the protocol are being documented on that page – not here. This page is kept only to preserve the design of dynamic updates and failure resilience support.
This page presents a new version of the VFS-FS protocol. This is a work in progress, so do not consider it final yet. In the future, the changes to the protocol will be incorporated into the VFS-FS protocol page, and this page will be removed.
The known issues of the previous VFS-FS protocol have been solved. Moreover, support for Dynamic Updates and Failure Resilience for an FS has been added (although these features are not actually implemented yet).
In order to provide failure recovery after an FS has crashed, all requests are part of a transaction. A transaction consists of:
An FS handles only one transaction at a time (VFS puts subsequent transactions on a queue).
A commit by the FS is an atomic operation. While a transaction is not yet committed, the FS stores the result of the request in step 1 in a temporary data structure; that is, the result is not yet part of the state of the FS. If, after a crash of the FS, it turns out there was a partially executed transaction, the recovery process can ignore the temporary data structure, as if the request had never happened.
However, if an FS crashes right after committing the changes but before successfully delivering the COMMITTED message, restarting the transaction could produce wrong results (e.g., consider unlinking a file successfully and then retrying the unlink: the first time the FS returns OK and the second time it returns ENOENT). To solve this, each transaction has an ID that is encoded in the message using the 'type' field. An FS must record the transaction ID when it commits a transaction, so it can verify whether it has committed the transaction when VFS asks. If it turns out the request was already committed, it simply replies COMMITTED.
Steps 3 and 4 of the transaction protocol can be omitted if a request is idempotent (for example, stat is a read request and can be issued multiple times and get the same result each time). To do this, the FS sets an 'auto-commit' flag in the reply. This flag is encoded in the 'type' field just like the transaction IDs. It is up to the FS to decide whether or not a request is idempotent. VFS will automatically send a COMMIT request when the reply from an FS indicates that the request is non-idempotent.
When an FS crashes, there must be a way to bring a freshly started FS back to the state from before the crash. This can be done by using shared memory regions that remain resident even after the program that created them is no longer executing. After the crash, a new FS maps in the old memory region and fixes errors if necessary. A newly started FS knows it has to recover state from a previous FS (as opposed to mounting a new file system), because VFS will send a REQ_RECOVER message.
When a transaction keeps failing a number of times, the communication layer returns EAGAIN, enabling VFS to undo any changes to its internal state and report an error message to the user (program).
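The flow described above (request, optional commit, restart after a crash, EAGAIN after repeated failures) can be sketched from the VFS side as follows. This is only an illustration under stated assumptions: `run_transaction`, `struct fs_ops`, `struct reply`, `MAX_ATTEMPTS`, and the mock FS are hypothetical names that do not appear in the protocol, and the retry limit of 3 is arbitrary.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_ATTEMPTS 3              /* hypothetical retry limit */

/* Hypothetical outcome of one message exchange with the FS. */
enum xfer { XFER_OK, XFER_FS_CRASHED };

struct reply {
    int  result;                    /* request result reported by the FS */
    bool auto_commit;               /* FS flagged the request as idempotent */
};

/* Hypothetical callbacks standing in for the real message passing. */
struct fs_ops {
    enum xfer (*request)(unsigned short id, struct reply *out);
    enum xfer (*commit)(unsigned short id);
};

/*
 * Run one transaction: send the request, then send COMMIT unless the FS
 * flagged the reply as auto-commit. A failed exchange restarts the
 * transaction with the same ID; after MAX_ATTEMPTS failures the caller
 * gets EAGAIN and can undo its own state changes.
 */
int run_transaction(const struct fs_ops *fs, unsigned short id, int *result)
{
    for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
        struct reply rep;

        if (fs->request(id, &rep) != XFER_OK)
            continue;               /* FS crashed: retry */
        if (!rep.auto_commit && fs->commit(id) != XFER_OK)
            continue;               /* crashed before COMMITTED arrived */
        *result = rep.result;
        return 0;
    }
    return EAGAIN;
}

/* Mock FS for illustration: the first `crashes_left` requests fail. */
static int  crashes_left;
static bool mock_auto_commit;
static int  commits_seen;

static enum xfer mock_request(unsigned short id, struct reply *out)
{
    (void)id;
    if (crashes_left > 0) {
        crashes_left--;
        return XFER_FS_CRASHED;
    }
    out->result = 0;
    out->auto_commit = mock_auto_commit;
    return XFER_OK;
}

static enum xfer mock_commit(unsigned short id)
{
    (void)id;
    commits_seen++;
    return XFER_OK;
}

static const struct fs_ops mock_fs = { mock_request, mock_commit };
```

Because the same ID is reused on every attempt, an FS that had already committed before crashing can recognize the repeated transaction instead of executing it twice.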
The following macros encode the request result (r, signed short), transaction ID (i, unsigned short), and auto-commit flag (f, unsigned short). Note that the transaction ID is actually 15 bits wide (not 16) and can therefore carry values of 0 up to 32767.
#define TGET_RESULT(t)     (((t) >> 16) & 0xFFFF)
#define TGET_TRNS_ID(t)    (((t) >> 1) & 0x7FFF)
#define TGET_AC(t)         ((t) & 0x0001)
#define TSET_RESULT(t, r)  ((t) |= ((r) & 0xFFFF) << 16)
#define TSET_TRNS_ID(t, i) ((t) |= ((i) & 0x7FFF) << 1)
#define TSET_AC(t, f)      ((t) |= ((f) & 0x0001))
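As a hedged illustration of this bit layout (result in the upper 16 bits, transaction ID in bits 1 to 15, auto-commit flag in bit 0), the sketch below packs and unpacks a 'type' word. It assumes a 32-bit `unsigned` and redefines the macros with unsigned arithmetic and a cast back to `short`, so that negative results survive a round trip; those casts and the helper `pack_type` are additions for this example, not part of the protocol.

```c
/* Same layout as the macros above, written with unsigned arithmetic so a
 * negative result does not shift into the sign bit of an int. */
#define TGET_RESULT(t)      ((short)(((t) >> 16) & 0xFFFFu))
#define TGET_TRNS_ID(t)     (((t) >> 1) & 0x7FFFu)
#define TGET_AC(t)          ((t) & 0x0001u)
#define TSET_RESULT(t, r)   ((t) |= ((unsigned)(r) & 0xFFFFu) << 16)
#define TSET_TRNS_ID(t, i)  ((t) |= ((unsigned)(i) & 0x7FFFu) << 1)
#define TSET_AC(t, f)       ((t) |= ((unsigned)(f) & 0x0001u))

/* Pack result, transaction ID, and auto-commit flag into a fresh word. */
unsigned pack_type(short result, unsigned short id, unsigned short ac)
{
    unsigned t = 0;     /* must start at 0: the setters only OR bits in */

    TSET_RESULT(t, result);
    TSET_TRNS_ID(t, id);
    TSET_AC(t, ac);
    return t;
}
```

Note that the setters never clear bits, so reusing a word that already carries an ID or result requires zeroing it first.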
A dynamic update of an FS allows the administrator to install and run a new version of an FS without needing to reboot the computer or unmount and mount file systems; a running copy of the FS is replaced by a new version. This is achieved by sending the FS a REQ_RESTART message, which tells it to write its buffers to disk and exit. Subsequently, the new FS is started and is told to reload state from disk by sending it REQ_RELOAD. That is, it reads from disk the inodes that were in the cache before the update, and thereby restores its state.
It is advised to read “Dynamic Updates and Failure Resilience for the Minix File Server” by Thomas Veerman (see link at the bottom) to gain better understanding of the mechanisms behind Dynamic Updates and Failure Resilience.
This specification reflects the protocol as it should be implemented, not how it is implemented by MFS. In particular, old and deprecated requests are not and should not be included.
The VFS-FS protocol is entirely POSIX-oriented. Any deviation from the requirements imposed by POSIX in this specification is unintentional except when mentioned explicitly. For convenience, links to the relevant Open Group function specifications and file access (ATIME), modification (MTIME) and change (CTIME) time update requirements are provided.
The reply codes in this document are advisory and mostly aimed at indicating additional restrictions needed for POSIX compliance. Not all of them may be applicable to every file server, and a file server may send other error codes where appropriate. Errors resulting from protocol validation checks (e.g., EROFS, sys_safecopy errors) are not included.
The requests are ordered according to the following rough categorization:
In the tables below, we use the following color coding:
The field has its name changed. |
Value has changed (e.g., new variable type, new spot in a message, different description). When the whole row has this color, it means this row was added to the request. |
Nothing has changed. |
This field has been dropped (or replaced by a new field). |
Mount the file system.
Request fields
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (READ) for the label of the block device driver to use |
REQ_PATH_LEN | m9_s2 | unsigned short | length of the label |
REQ_DEV | m9_l5 | dev_t | device number of block device to mount |
REQ_READONLY | m6_c1 | int | flag indicating whether the file system is mounted read-only (1 = read-only, 0 = read-write) |
REQ_ISROOT | m6_c2 | int | flag indicating whether the file system is the system root file system (1 = yes, 0 = no) |
REQ_FLAGS | m9_s3 | int | REQ_RDONLY flag indicates whether the file system is mounted read-only or not (i.e., read and write). REQ_ISROOT flag indicates the file system is the root file system. |
Reply fields
RES_INODE_NR | m9_l1 | ino_t | upon success: inode number of the root inode |
RES_MODE | m9_s2 | mode_t | upon success: mode of the root inode |
RES_FILE_SIZE | m9_l2 | off_t | upon success: file size of the root inode |
RES_FILE_SIZE_HI | m9_l2 | off_t | upon success: file size of the root inode (upper 32 bits) |
RES_FILE_SIZE_LO | m9_l3 | off_t | upon success: file size of the root inode (lower 32 bits) |
RES_DEV | m9_l4 | dev_t | upon success: resulting file device number |
RES_UID | m9_s4 | uid_t | upon success: user ID of the root inode |
RES_GID | m9_s1 | gid_t | upon success: group ID of the root inode |
Reply codes
EINVAL | label too long |
EINVAL | unable to retrieve endpoint from DS using label |
EINVAL | opening device driver failed |
EINVAL | reading superblock failed |
OK | file system initialized and mounted |
Notes
VFS assumes the root inode on the mounted FS is in use and will have a reference count of 1.
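The RES_FILE_SIZE_HI and RES_FILE_SIZE_LO reply fields, like the other HI/LO pairs in this protocol, carry one 64-bit value as two 32-bit halves. A minimal sketch of the split and join, assuming the halves are plain unsigned 32-bit quantities; the helper names `pos_hi`, `pos_lo`, and `pos_join` are hypothetical:

```c
#include <stdint.h>

/* Upper 32 bits of a 64-bit size or position (the *_HI field). */
static inline uint32_t pos_hi(uint64_t pos)
{
    return (uint32_t)(pos >> 32);
}

/* Lower 32 bits of a 64-bit size or position (the *_LO field). */
static inline uint32_t pos_lo(uint64_t pos)
{
    return (uint32_t)(pos & 0xFFFFFFFFu);
}

/* Reassemble the 64-bit value from its two halves. */
static inline uint64_t pos_join(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```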
Unmount the file system.
Request fields
Reply fields
Reply codes
OK | file system unmounted |
Notes
Analogous to how REQ_READSUPER opens the root inode, REQ_UNMOUNT puts the root inode. Previously, all inodes had to have a reference count of 0 before issuing this request.
Resolve a path string to an inode.
Request fields
REQ_GRANT2 | m9_l1 | cp_grant_id_t | memory grant (READ) of the buffer containing supplemental group data |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (READ and WRITE) of the buffer containing the pathname |
REQ_PATH_LEN | m9_s2 | int | length of the remaining part of the string to resolve |
REQ_PATH_SIZE | m9_l5 | size_t | total size of the buffer |
REQ_L_PATH_OFF | m9_l2 | size_t | starting offset of the string to resolve within the buffer |
REQ_DIR_INO | m9_l3 | ino_t | inode number of the starting directory |
REQ_ROOT_INO | m9_l4 | ino_t | inode number of the root directory of the caller, or 0 if not on this file system |
REQ_FLAGS | m9_s3 | int | PATH_RET_SYMLINK (do not resolve a symlink as the last path component), PATH_GET_UCRED (retrieve UID and GIDs from VFS instead of using REQ_UID and REQ_GID, because the UID is a member of multiple supplemental groups), or 0 |
REQ_UID | m9_s4 | uid_t | user ID of the caller |
REQ_GID | m9_s1 | gid_t | group ID of the caller |
REQ_UCRED_SIZE | m9_s4 | size_t | total size of ucred structure |
Reply fields
RES_INODE_NR | m9_l1 | ino_t | upon success: resulting file inode number |
RES_MODE | m9_s2 | mode_t | upon success: resulting file mode |
RES_FILE_SIZE | m6_l2 | off_t | upon success: resulting file size |
RES_FILE_SIZE_HI | m9_l2 | off_t | upon success: resulting file size (upper 32 bits) |
RES_FILE_SIZE_LO | m9_l3 | off_t | upon success: resulting file size (lower 32 bits) |
RES_DEV | m9_l4 | dev_t | upon success: resulting file device number |
RES_UID | m9_s4 | uid_t | upon success: resulting file user ID |
RES_GID | m9_s1 | gid_t | upon success: resulting file group ID |
RES_INODE_NR | m9_l1 | ino_t | upon EENTERMOUNT: inode number of the mountpoint inode |
RES_OFFSET | m9_s2 | int | upon EENTERMOUNT and ELEAVEMOUNT and ESYMLINK: new starting offset of string within buffer |
RES_SYMLOOP | m9_s3 | unsigned short | upon EENTERMOUNT and ELEAVEMOUNT and ESYMLINK: number of symbolic links followed |
Reply codes
ENAMETOOLONG | provided path length exceeds what file server can handle |
ENAMETOOLONG | any of the path components is longer than the file system supports |
ENOTDIR | any of the intermediate path components is not a directory |
EACCES | the caller has no search access permission on any of the intermediate directories |
ENFILE | no inodes are available in memory |
ELOOP | more than SYMLOOP_MAX symlinks were encountered during the lookup |
ENAMETOOLONG | resulting path to copy back (including terminating '\0') does not fit in provided buffer |
ENOENT | one of the components does not exist |
EENTERMOUNT | a mountpoint was encountered |
ELEAVEMOUNT | “..” is followed from the file system root and the file system root is not the caller root inode |
ESYMLINK | an absolute symlink was encountered |
EINVAL | starting inode was a mountpoint and first path component is not “..” |
OK | inode successfully looked up and opened |
Notes
REQ_GRANT2 provides a grant to a ucred structure holding user ID and (supplemental) group data that are to be used to check permissions during the lookup.
VFS assumes the opened inode on the FS is in use and will have a reference count +1 (i.e., 1 if just opened for the first time, x+1 if it was already opened).
Create a regular file.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number of the containing directory for the new file |
REQ_MODE | m9_s3 | mode_t | mode for the file |
REQ_UID | m9_s4 | uid_t | user ID for the file |
REQ_GID | m9_s1 | gid_t | group ID for the file |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (READ) for the last path component |
REQ_PATH_LEN | m9_s2 | unsigned short | length of the last path component |
Reply fields
RES_INODE_NR | m9_l1 | ino_t | upon success: inode number of created file |
RES_MODE | m9_s2 | mode_t | upon success: mode of created file |
RES_FILE_SIZE | m6_l2 | off_t | upon success: file size of created file |
RES_FILE_SIZE_HI | m9_l2 | off_t | upon success: file size of created file (upper 32 bits) |
RES_FILE_SIZE_LO | m9_l3 | off_t | upon success: file size of created file (lower 32 bits) |
RES_UID | m9_s4 | uid_t | upon success: user ID of created file |
RES_GID | m9_s1 | gid_t | upon success: group ID of created file |
RES_DEV | m9_l4 | dev_t | upon success: device node index |
RES_INODE_INDEX | m6_s2 | unsigned short | upon success: inode index to associate with this inode |
Reply codes
ENAMETOOLONG | the last path component is longer than the file system supports |
EEXIST | a directory entry with that name already exists |
ENFILE | no inodes are available |
ENOSPC | no space is left on the device |
EFBIG | the containing directory can not handle any more entries |
OK | regular file created and opened |
Notes
VFS assumes the created inode on the FS is in use and will have a reference count of 1. |
Create an open, unlinked file.
Request fields
REQ_MODE | m9_s3 | mode_t | mode for the inode |
REQ_DEV | m9_l5 | dev_t | device number for the inode |
REQ_UID | m9_s4 | uid_t | user ID for the inode |
REQ_GID | m9_s1 | gid_t | group ID for the inode |
Reply fields
RES_INODE_NR | m9_l1 | ino_t | upon success: inode number of the resulting inode |
RES_MODE | m9_s2 | mode_t | upon success: mode of the resulting inode |
RES_FILE_SIZE | m6_l2 | off_t | upon success: size of the resulting inode |
RES_FILE_SIZE_HI | m9_l2 | off_t | upon success: size of the resulting inode (upper 32 bits) |
RES_FILE_SIZE_LO | m9_l3 | off_t | upon success: size of the resulting inode (lower 32 bits) |
RES_DEV | m9_l4 | dev_t | upon success: device number of the resulting inode |
RES_UID | m9_s4 | uid_t | upon success: user ID of the resulting inode |
RES_GID | m9_s1 | gid_t | upon success: group ID of the resulting inode |
Reply codes
ENFILE | no inodes are available |
OK | temporary inode created and opened |
Decrease an open file's reference count.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number |
REQ_COUNT | m9_l2 | ino_t | number of references to drop |
Reply fields
Reply codes
OK | reference count decreased |
Notes
VFS assumes the inode on the FS:
- is not in use when REQ_COUNT equals exactly the number of times the inode was opened according to the FS;
- is still in use when REQ_COUNT is less than the number of times the inode was opened according to the FS (e.g., VFS will sometimes effectively set the reference counter to 1 to prevent the counter from wrapping).
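The reference-count rule in the note above can be sketched on the FS side as follows. This is an illustration only: `struct inode_ref` and `put_node` are hypothetical names, and rejecting a count larger than the number of held references is an assumption made for the example (the protocol lists only OK as a reply code).

```c
/* Hypothetical in-memory inode record as an FS might keep it. */
struct inode_ref {
    unsigned long nr;       /* inode number */
    unsigned int  count;    /* times VFS has opened this inode */
};

/*
 * Drop `count` references from the inode.
 * Returns 1 when the inode is no longer in use and may be released,
 * 0 when references remain, and -1 for an inconsistent count
 * (assumption: zero, or more references than are held).
 */
int put_node(struct inode_ref *ip, unsigned int count)
{
    if (count == 0 || count > ip->count)
        return -1;
    ip->count -= count;
    return ip->count == 0;
}
```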
Read from a file.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (WRITE) to store the resulting data in |
REQ_POS | m2_i3 | off_t | seek position into the open file |
REQ_SEEK_POS_HI | m9_l3 | off_t | seek position into the open file (upper 32 bits) |
REQ_SEEK_POS_LO | m9_l4 | off_t | seek position into the open file (lower 32 bits) |
REQ_NBYTES | m9_l5 | size_t | number of bytes to read |
REQ_FD_INODE_INDEX | m2_s1 | unsigned short | inode index associated with this inode |
Reply fields
RES_FD_POS | m2_i1 | off_t | upon success: resulting file position |
RES_SEEK_POS_HI | m9_l3 | off_t | upon success: resulting file position (upper 32 bits) |
RES_SEEK_POS_LO | m9_l4 | off_t | upon success: resulting file position (lower 32 bits) |
RES_NBYTES | m9_l5 | size_t | upon success: number of bytes read |
Reply codes
OK | results successfully (partially) read, or EOF reached |
Write to a file.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (READ) containing the data to write |
REQ_FD_POS | m2_i3 | off_t | seek position into the open file |
REQ_SEEK_POS_HI | m9_l3 | off_t | seek position into the open file (upper 32 bits) |
REQ_SEEK_POS_LO | m9_l4 | off_t | seek position into the open file (lower 32 bits) |
REQ_NBYTES | m9_l5 | size_t | number of bytes to write |
REQ_FD_INODE_INDEX | m2_s1 | unsigned short | inode index associated with this inode |
Reply fields
RES_FD_POS | m2_i1 | off_t | upon success: resulting file position |
RES_SEEK_POS_HI | m9_l3 | off_t | upon success: resulting file position (upper 32 bits) |
RES_SEEK_POS_LO | m9_l4 | off_t | upon success: resulting file position (lower 32 bits) |
RES_NBYTES | m9_l5 | size_t | upon success: number of bytes written |
Reply codes
ENOSPC | no space is left on the device |
EFBIG | the write would make the resulting file size too big |
OK | results successfully written |
Retrieve directory entries.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number of the directory |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (WRITE) to store resulting struct dirent entries and names in |
REQ_MEM_SIZE | m9_l5 | size_t | size of given memory grant |
REQ_GDE_POS | m2_l1 | off_t | seek position into the open file |
REQ_SEEK_POS_HI | m9_l3 | off_t | file position (upper 32 bits) |
REQ_SEEK_POS_LO | m9_l4 | off_t | file position (lower 32 bits) |
Reply fields
RES_GDE_POS_CHANGE | m2_l1 | off_t | upon success: the amount by which to adjust the seek position into the file |
RES_SEEK_POS_HI | m9_l3 | off_t | upon success: new seek position into the file (upper 32 bits) |
RES_SEEK_POS_LO | m9_l4 | off_t | upon success: new seek position into the file (lower 32 bits) |
RES_NBYTES | m9_l5 | size_t | upon success: the amount of resulting bytes stored, with 0 for EOF |
Reply codes
ENOENT | the given file position is not aligned to the internal data structures (file system specific) |
EINVAL | the given buffer is too small to store even one entry (including padding) |
OK | stored zero or more entries in the user's buffer |
Set size, or free space, of an open file.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number |
REQ_TRC_START_HI | m9_l2 | off_t | new file size or starting position (inclusive) of region to free (upper 32 bits) |
REQ_TRC_START_LO | m9_l3 | off_t | new file size or starting position (inclusive) of region to free (lower 32 bits) |
REQ_TRC_END_HI | m9_l4 | off_t | zero or ending position (exclusive) of region to free (upper 32 bits) |
REQ_TRC_END_LO | m9_l5 | off_t | zero or ending position (exclusive) of region to free (lower 32 bits) |
REQ_FD_START | m2_i2 | off_t | new file size or starting position (inclusive) of region to free |
REQ_FD_END | m2_i3 | off_t | zero or ending position (exclusive) of region to free |
Reply fields
Reply codes
EINVAL | an attempt is made to change the file size of a pipe to anything but zero |
EFBIG | the resulting file would be too big |
OK | file size changed and/or holes created |
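The field descriptions above imply two modes for this request: an end position of zero sets a new file size, while a nonzero end position frees the region from start (inclusive) to end (exclusive). A minimal sketch of how an FS might decode the four REQ_TRC_* halves, assuming they are unsigned 32-bit values; `decode_trunc`, `struct trunc_op`, and the TRUNC_* constants are hypothetical, and treating an end at or before the start as invalid is an assumption of this example:

```c
#include <stdint.h>

enum trunc_kind { TRUNC_SET_SIZE, TRUNC_FREE_REGION, TRUNC_INVALID };

struct trunc_op {
    enum trunc_kind kind;
    uint64_t start;     /* new size, or first byte of the region to free */
    uint64_t end;       /* one past the last byte to free; 0 for SET_SIZE */
};

/* Combine the HI/LO halves and classify the request. */
struct trunc_op decode_trunc(uint32_t start_hi, uint32_t start_lo,
                             uint32_t end_hi, uint32_t end_lo)
{
    struct trunc_op op;

    op.start = ((uint64_t)start_hi << 32) | start_lo;
    op.end   = ((uint64_t)end_hi << 32) | end_lo;

    if (op.end == 0)
        op.kind = TRUNC_SET_SIZE;       /* start is the new file size */
    else if (op.end > op.start)
        op.kind = TRUNC_FREE_REGION;    /* free [start, end) */
    else
        op.kind = TRUNC_INVALID;        /* assumption: empty or reversed range */
    return op;
}
```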
Mark file as target of seek operation.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number |
Reply fields
Reply codes
OK | request processed successfully |
Retrieve file status.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (WRITE) to store resulting “struct stat” in |
Reply fields
Reply codes
OK | result stored in buffer |
Change file ownership.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number |
REQ_UID | m6_s1 | uid_t | user ID of the caller |
REQ_GID | m6_c1 | gid_t | group ID of the caller |
REQ_UID | m9_s4 | uid_t | new user ID for the file |
REQ_GID | m9_s1 | gid_t | new group ID for the file |
Reply fields
RES_MODE | m9_s2 | mode_t | upon success: resulting inode mode |
Reply codes
OK | ownership changed |
Change file mode.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number |
REQ_MODE | m9_s3 | mode_t | new mode for the file |
REQ_UID | m6_s1 | uid_t | user ID of the caller |
REQ_GID | m6_c1 | gid_t | group ID of the caller |
Reply fields
RES_MODE | m9_s2 | mode_t | upon success: resulting inode mode |
Reply codes
OK | mode changed |
Notes
- The caller UID and GID are typically unused.
- While MFS changes the 06777 (octal) part of the mode, other file systems may choose to change S_ISVTX as well (07777).
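One way to read the note on the 06777 mask is that only those bits of the stored mode are replaced, while the file type and any bits outside the mask are kept. The sketch below shows that interpretation; `apply_chmod` and `CHMOD_MASK` are hypothetical names, and this is not taken from the MFS sources.

```c
#include <sys/types.h>

#define CHMOD_MASK 06777    /* bits a chmod may change, per the note above */

/* Replace the maskable bits of old_mode with those from requested. */
mode_t apply_chmod(mode_t old_mode, mode_t requested)
{
    return (old_mode & ~(mode_t)CHMOD_MASK) | (requested & CHMOD_MASK);
}
```

A file system that also lets the sticky bit change would use 07777 as the mask instead.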
Set file times.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number |
REQ_ACTIME | m9_l2 | time_t | new access time |
REQ_MODTIME | m9_l3 | time_t | new modification time |
Reply fields
Reply codes
OK | custom file times set |
Create a directory.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number of the containing directory for the new file |
REQ_MODE | m9_s3 | mode_t | mode for the directory |
REQ_UID | m9_s4 | uid_t | user ID for the directory |
REQ_GID | m9_s1 | gid_t | group ID for the directory |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (READ) for the last path component |
REQ_PATH_LEN | m9_s2 | unsigned short | length of the last path component |
Reply fields
Reply codes
ENAMETOOLONG | the last path component is longer than the file system supports |
EEXIST | a directory entry with that name already exists |
ENFILE | no inodes are available |
ENOSPC | no space is left on the device |
EFBIG | the containing directory can not handle any more entries |
EMLINK | the containing directory has the maximum number of links already |
OK | directory created |
Create a special file.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number of the containing directory for the new file |
REQ_MODE | m9_s3 | mode_t | mode for the file |
REQ_DEV | m9_l5 | dev_t | device number |
REQ_UID | m9_s4 | uid_t | user ID for the file |
REQ_GID | m9_s1 | gid_t | group ID for the file |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (READ) for the last path component |
REQ_PATH_LEN | m9_s2 | short | length of the last path component |
Reply fields
Reply codes
ENAMETOOLONG | the last path component is longer than the file system supports |
EEXIST | a directory entry with that name already exists |
EINVAL | the given file type is invalid or not supported |
ENFILE | no inodes are available |
ENOSPC | no space is left on the device |
EFBIG | the containing directory can not handle any more entries |
OK | special file created |
Create a hard link to a file.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | link file inode number |
REQ_DIR_INO | m9_l3 | ino_t | inode number of the containing directory for the new link |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (READ) for the last path component |
REQ_PATH_LEN | m9_s2 | unsigned short | length of the last path component |
Reply fields
Reply codes
ENAMETOOLONG | the last path component is longer than the file system supports |
EEXIST | a directory entry with that name already exists |
EPERM | the linked file is a directory |
EMLINK | the linked inode has the maximum number of links already |
ENOSPC | no space is left on the device |
EFBIG | the containing directory can not handle any more entries |
OK | new link created |
Unlink a file.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number of the containing directory for the file |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (READ) for last path component |
REQ_PATH_LEN | m9_s2 | unsigned short | length of the last path component |
Reply fields
Reply codes
ENAMETOOLONG | the last path component is longer than the file system supports |
ENOENT | no directory entry with that name exists |
EPERM | the given name refers to a directory |
OK | unlinked file |
Remove an empty directory.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number of the containing directory for the file |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (READ) for last path component |
REQ_PATH_LEN | m9_s2 | unsigned short | length of the last path component |
Reply fields
Reply codes
ENAMETOOLONG | the last path component is longer than the file system supports |
ENOENT | no directory entry with that name exists |
ENOTDIR | the given name does not refer to a directory |
ENOTEMPTY | the given directory is not empty |
EINVAL | the given directory is “.” or “..” |
EBUSY | the given directory is the root directory of the file system |
OK | removed directory |
Rename a file or directory.
Request fields
REQ_REN_OLD_DIR | m9_l3 | ino_t | inode number of containing directory for the old file |
REQ_REN_NEW_DIR | m9_l4 | ino_t | inode number of containing directory for the new file |
REQ_REN_GRANT_OLD | m9_l2 | cp_grant_id_t | memory grant (READ) for the old last path component |
REQ_REN_LEN_OLD | m9_s1 | unsigned short | length of the old last path component |
REQ_REN_GRANT_NEW | m9_l1 | cp_grant_id_t | memory grant (READ) for the new last path component |
REQ_REN_LEN_NEW | m9_s2 | unsigned short | length of the new last path component |
Reply fields
Reply codes
ENAMETOOLONG | the last path component of the old or new file is longer than the file system supports |
ENOENT | the old file does not exist |
OK | the old and new last path component and containing directory are the same |
EBUSY | the old file is a mountpoint directory |
EINVAL | an attempt is made to move a directory to within its own subtree |
EINVAL | the old or new last path component is “.” or “..” |
EMLINK | the old file is a directory and the new file doesn't exist but the new containing directory has the maximum number of links |
ENOTDIR | the old file is a directory and the new file exists but is not a directory |
EISDIR | the old file is not a directory and the new file exists but is a directory |
ENOTEMPTY | the new file is a directory but is not empty |
EBUSY | the new file is the root directory of the file system |
ENOSPC | no space is left on the device |
EFBIG | the new containing directory can not handle any more entries |
OK | file renamed |
Create a symbolic link.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number of the containing directory for the new file |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (READ) for the link name's last path component |
REQ_PATH_LEN | m9_s2 | unsigned short | length of the link name's last path component |
REQ_GRANT3 | m9_l3 | cp_grant_id_t | memory grant (READ) for the link target (not including a trailing '\0') |
REQ_MEM_SIZE | m9_l5 | size_t | length of the link target (not including a trailing '\0') |
REQ_UID | m9_s4 | uid_t | user ID for the new symlink |
REQ_GID | m9_s1 | gid_t | group ID for the new symlink |
Reply fields
Reply codes
ENAMETOOLONG | the last path component is longer than the file system supports |
EEXIST | a directory entry with that name already exists |
ENFILE | no inodes are available |
ENOSPC | no space is left on the device |
EFBIG | the containing directory can not handle any more entries |
ENAMETOOLONG | the link target contains '\0' bytes |
OK | symbolic link created |
Retrieve symbolic link target.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (WRITE) for buffer to write result to |
REQ_MEM_SIZE | m9_l5 | size_t | size of buffer to write to |
Reply fields
RES_NBYTES | m9_l5 | size_t | upon success: number of bytes written |
Reply codes
OK | result stored in buffer |
Mark an inode as mountpoint.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | inode number of file to use as mountpoint |
Reply fields
Reply codes
EBUSY | inode already in use as mountpoint |
ENOTDIR | given inode is not a directory |
OK | inode marked as mountpoint |
Retrieve file system status.
Request fields
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (WRITE) to store resulting “struct statfs” in |
Reply fields
Reply codes
OK | result stored in buffer |
Write any unwritten data to disk.
Request fields
Reply fields
Reply codes
OK | request processed successfully |
Flush cached data for an unmounted device.
Request fields
REQ_DEV | m9_l5 | dev_t | device number |
Reply fields
Reply codes
EBUSY | the device is mounted |
OK | cache flushed and invalidated for this device |
Set a new driver endpoint for a major device.
Request fields
REQ_DEV | m9_l5 | dev_t | device number |
REQ_DRIVER_E | m9_l2 | endpoint_t | driver endpoint |
Reply fields
Reply codes
OK | request processed successfully |
Read from a block device directly.
Request fields
REQ_DEV2 | m9_l1 | dev_t | device number |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (WRITE) to store the resulting data in |
REQ_SEEK_POS_LO | m9_l4 | off_t | low 32 bits of position |
REQ_SEEK_POS_HI | m9_l3 | off_t | high 32 bits of position |
REQ_NBYTES | m9_l5 | size_t | number of bytes to read |
Reply fields
RES_SEEK_POS_LO | m9_l4 | off_t | upon success and failure: low 32 bits of resulting position |
RES_SEEK_POS_HI | m9_l3 | off_t | upon success and failure: high 32 bits of resulting position |
RES_NBYTES | m9_l5 | size_t | upon success and failure: total number of bytes read |
Reply codes
EIO | I/O error reported by the device driver |
OK | results successfully (partially) read, or EOF reached |
Write to a block device directly.
Request fields
REQ_DEV2 | m9_l1 | dev_t | device number |
REQ_GRANT | m9_l2 | cp_grant_id_t | memory grant (READ) containing the data to write |
REQ_SEEK_POS_LO | m9_l4 | off_t | low 32 bits of position |
REQ_SEEK_POS_HI | m9_l3 | off_t | high 32 bits of position |
REQ_NBYTES | m9_l5 | size_t | number of bytes to write |
Reply fields
RES_SEEK_POS_LO | m9_l4 | off_t | upon success and failure: low 32 bits of resulting position |
RES_SEEK_POS_HI | m9_l3 | off_t | upon success and failure: high 32 bits of resulting position |
RES_NBYTES | m9_l5 | size_t | upon success and failure: total number of bytes written |
Reply codes
EIO | I/O error reported by the device driver |
OK | results successfully (partially) written, or EOF reached |
Commit a transaction (part of the transaction protocol).
Request fields
REQ_ID | m9_s1 | unsigned short | Request ID |
Reply fields
Reply codes
EINVAL | Request ID could not be committed (e.g., REQ_ID != (current id - 1)) |
COMMITTED | Request is committed |
Description
This request tells the file server to commit a transaction. When VFS sends this request in response to a reply that the FS flagged 'auto-commit', or sends it more than once for a transaction that is already committed, the FS replies COMMITTED.
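The behaviour described above can be sketched as an FS-side handler that records the ID of the last committed transaction and answers repeated COMMIT requests without redoing any work. Everything here is hypothetical: `struct fs_txn`, `handle_commit`, and the numeric value of `COMMITTED` are illustration-only, and the sketch assumes an auto-committed request has its ID recorded as committed when the FS sends its reply.

```c
#include <errno.h>
#include <stdbool.h>

#define COMMITTED 0x1000    /* hypothetical value; not defined in this document */

struct fs_txn {
    bool           pending;         /* a result is waiting in temporary storage */
    unsigned short pending_id;      /* ID of that uncommitted transaction */
    bool           have_committed;  /* last_committed holds a valid ID */
    unsigned short last_committed;  /* ID recorded together with the commit */
};

/* Handle a COMMIT request for transaction req_id. */
int handle_commit(struct fs_txn *tx, unsigned short req_id)
{
    if (tx->pending && tx->pending_id == req_id) {
        /* In a real FS, making the temporary result permanent and
         * recording the ID would have to be one atomic step. */
        tx->pending = false;
        tx->have_committed = true;
        tx->last_committed = req_id;
        return COMMITTED;
    }
    if (tx->have_committed && tx->last_committed == req_id)
        return COMMITTED;   /* repeated COMMIT for an already committed ID */
    return EINVAL;          /* unknown or out-of-sequence ID */
}
```

With this shape, a COMMIT that VFS resends after a lost COMMITTED message is answered from the recorded ID alone.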
Recover state after a crash.
Request fields
REQ_OLD_E | m9_l3 | endpoint_t | Endpoint of crashed (initial) FS |
Reply fields
Reply codes
EIO | Recovery process failed (e.g., due to data corruption) |
OK | Recovery process completed successfully |
Description
The FS allocates and registers shared memory regions for the inode cache and buffer cache based on the endpoint (e.g., using keys such as <endpoint>_i and <endpoint>_b) with DS, after receiving a mount request. Upon receiving a recovery request, it maps in the inode cache and buffer cache of the crashed FS and runs a recovery procedure. Note that the endpoint is the endpoint of the initial FS, because the key in DS will never change. There are no naming schemes defined; it is up to the FS to pick suitable names. |
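Since the naming scheme is left to the FS, the following is only one possible way to derive the two keys mentioned above, using the endpoint number followed by an `_i` or `_b` suffix. The helper `make_ds_key` is hypothetical, and representing the endpoint as a plain `int` is an assumption of this sketch.

```c
#include <stdio.h>

/*
 * Build a DS key such as "12345_i" (inode cache) or "12345_b" (buffer
 * cache) from the endpoint of the initial FS.
 * Returns 0 on success, -1 if the key does not fit in the buffer.
 */
int make_ds_key(char *buf, size_t size, int endpoint, char suffix)
{
    int n = snprintf(buf, size, "%d_%c", endpoint, suffix);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Because the key is derived from the initial endpoint, a recovering FS can rebuild it from REQ_OLD_E alone.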
Tell the FS it is about to perform a dynamic update, so it can flush dirty data to disk.
Request fields
Reply fields
Reply codes
OK | Dirty data is written to disk |
Description
The FS does a sync to write the inode table to the block buffer and the block buffer to disk, then sends the above reply and exits.
Restore buffers by reading a number of inodes from disk, such that the state of the FS is the same as before the update.
Request fields
REQ_INODE_NR | m9_l1 | ino_t | Inode number of file to use as mountpoint |
REQ_GRANT | m9_l2 | cp_grant_id_t | Memory grant (READ) containing a list of inodes that the FS has to reopen |
REQ_MEM_SIZE | m9_l5 | size_t | Size of the inode list |
REQ_OLD_E | m9_l3 | endpoint_t | Endpoint of FS before dynamic update |
Reply fields
Reply codes
OK | Reload completed successfully |
Description
VFS holds a list of the inodes that it believes the FS has open. For VFS, that is what the state of the FS looks like. By reading those inodes from disk, the state is restored.
Because the block buffer is stored in a shared memory region, the reloading process is sped up by mapping in that shared memory (using the endpoint to retrieve the shared memory key in DS) and reading the blocks from cache instead of disk. The inode cache must be overwritten.
This document is not based on the original VFS-FS protocol documentation by Balazs Gerofi. However, that document may still provide additional insights.
Design and implementation of the MINIX Virtual File system by Balazs Gerofi, August, 2006
For more information on Dynamic Updates and Failure Resilience, see the Master's Thesis by Thomas Veerman.
Dynamic Updates and Failure Resilience for the Minix File Server by Thomas Veerman, May, 2009