The Data Link protocol

This page provides the official documentation of the Data Link protocol of MINIX 3. It describes the protocol used between the inet server and the ethernet drivers that control network interface card (NIC) hardware. The current version documents the protocol used in SVN revisions 9568 and later. If you update this document because of changes to MINIX 3, please mention the revision number of the change in the wiki comment.

General information

The following information is written mainly for people implementing an ethernet driver.

Instances

Each ethernet driver instance is responsible for exactly one ethernet card. A machine may have multiple ethernet cards of the same type; in this case, multiple copies of the same ethernet driver will be started. Each copy will be given a unique instance number (0 for the first instance of that particular driver, 1 for the second, and so on).

It is up to the driver to decide how to map instances to ethernet cards. This should be reasonably stable across machine boots, to make sure that the user's per-instance configuration will consistently apply to the same instance. The typical approach is to iterate linearly over PCI devices at startup time, skipping N matching PCI devices before reserving the next one.
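The "skip N matching devices" approach can be sketched as follows. This is only an illustration under stated assumptions: struct pci_dev and find_instance_device() are hypothetical names, and a plain array stands in for the device list that a real driver would obtain from the PCI server.

```c
#include <stddef.h>

/* Hypothetical stand-in for one PCI device entry. A real driver would
 * obtain vendor/device IDs by querying the PCI server instead. */
struct pci_dev {
    unsigned short vid;   /* vendor ID */
    unsigned short did;   /* device ID */
};

/*
 * Return the index of the device that belongs to the given driver instance,
 * or -1 if fewer than instance + 1 matching devices exist.
 * Devices are scanned in a fixed order, so the mapping stays the same across
 * boots as long as the hardware layout does not change.
 */
int find_instance_device(const struct pci_dev *devs, size_t ndevs,
                         unsigned short vid, unsigned short did, int instance)
{
    int skip = instance;   /* matching devices still to be skipped */
    size_t i;

    if (instance < 0)
        return -1;

    for (i = 0; i < ndevs; i++) {
        if (devs[i].vid != vid || devs[i].did != did)
            continue;      /* not a card this driver supports */
        if (skip-- > 0)
            continue;      /* belongs to a lower-numbered instance */
        return (int)i;     /* the card for this instance */
    }
    return -1;
}
```

With two supported cards present, instance 0 gets the first match in scan order and instance 1 the second; an instance 2 would find no device and should fail initialization.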

Initialization

Upon initialization, the driver must retrieve the instance number. It is passed as one of the arguments to the driver, in the form instance=N where N is a decimal representation of the instance number. Obtaining the instance number is typically done by means of env_setargs() and env_parse(), as found in <minix/sysutil.h>.
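To illustrate the instance=N argument format, the following sketch parses it with standard C only. It is a portable stand-in, not the env_setargs()/env_parse() code path itself; the function name parse_instance() and the choice to return -1 on a missing or malformed argument are assumptions made for this example.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan the argument list for "instance=N" and return N.
 * Returns -1 if the argument is absent, not a decimal number, or negative.
 * (Hypothetical helper; MINIX drivers normally use env_parse() instead.)
 */
long parse_instance(int argc, char *argv[])
{
    static const char key[] = "instance=";
    const size_t keylen = sizeof(key) - 1;
    int i;

    for (i = 1; i < argc; i++) {
        char *end;
        long val;

        if (strncmp(argv[i], key, keylen) != 0)
            continue;                     /* some other argument */

        errno = 0;
        val = strtol(argv[i] + keylen, &end, 10);

        /* Reject empty values, trailing garbage, overflow and negatives. */
        if (end == argv[i] + keylen || *end != '\0' || errno != 0 || val < 0)
            return -1;
        return val;
    }
    return -1;                            /* no instance argument given */
}
```

A driver would call this once at startup, before probing for hardware, and use the result to select its card.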

The driver should try to initialize the device immediately when starting up, and abort its own startup (for example, by means of panic()) if the device cannot be initialized. Please note that for legacy reasons, this is not common practice in existing ethernet drivers.

Upon (successful) startup, the ethernet driver must announce its presence in DS. This should be done by calling the netdriver_announce() function found in libnetdriver.

Requests and replies

The basic protocol consists of inet sending requests to the ethernet driver, and the driver sending replies. The m_type message field contains the request or reply type. All message names and field aliases are defined in <minix/com.h>. The first request issued by inet will always be a DL_CONF request, although more DL_CONF requests may be issued later.

Inet will not send a new request until it has received a reply for the previous request. As such, the driver must always respond to each request with a reply immediately. In the case of the DL_READV_S and DL_WRITEV_S data transfer requests, this may lead to one immediate reply message to acknowledge a request, and one reply message later to complete the request. See the section on data transfer below. The two other requests, DL_CONF and DL_GETSTAT_S, follow a more strict request-reply form.

Ethernet drivers should use the netdriver_receive() function from libnetdriver to receive messages. The send() primitive should be used to send replies back to inet.

Data transfer

Like all other requests, data transfer is always initiated by inet. Inet will send a DL_READV_S request when it is ready to receive a packet. It will send a DL_WRITEV_S request to indicate that a packet should be sent.

The driver may not always be able to satisfy the send or receive request immediately, for example because there are no packets in the receive queue, or because the send queue is full. If the driver is able to satisfy the request immediately, it should send a DL_TASK_REPLY with the appropriate DL_PACK_ flag set in the DL_FLAGS field (and, in the case of receives, with DL_COUNT set appropriately).

If the driver is not able to satisfy the request immediately, it should immediately send a DL_TASK_REPLY with a value of DL_NOFLAGS (0) in DL_FLAGS, to indicate that the request has been received but is still pending. Once the driver has been able to satisfy such a pending request (because a new packet arrived, or enough previous data were sent to be able to queue the new packet), it should send another DL_TASK_REPLY message, with the appropriate DL_PACK_ flag set in DL_FLAGS (and, for receives, with the DL_COUNT field set, see above). This usually happens in response to an interrupt. If the same interrupt happens to satisfy both a pending receive and a pending send request, a single DL_TASK_REPLY message may be used to complete both; in that case, DL_FLAGS would contain the bitwise OR of DL_PACK_SEND and DL_PACK_RECV, and DL_COUNT would contain the size of the received packet.

In other words, while a send or receive request must be acknowledged immediately with a reply, the request itself will stay pending until a matching DL_PACK_ flag is set in a reply. The driver will never receive another request of the same (DL_READV_S, DL_WRITEV_S) type while the previous request has not yet fully completed.
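The acknowledge-then-complete behaviour described above can be modelled as a small state machine. The sketch below is an assumption-laden illustration: struct dl_state and the dl_on_*() functions are hypothetical names, the flag values are the ones listed in this document, and actual message passing and hardware access are left out.

```c
/* Flag values for the DL_FLAGS field, as listed in this document. */
#define DL_NOFLAGS    0x0
#define DL_PACK_SEND  0x1
#define DL_PACK_RECV  0x2

/* Hypothetical per-driver bookkeeping of pending requests. */
struct dl_state {
    int recv_pending;   /* a DL_READV_S request is waiting for a packet */
    int send_pending;   /* a DL_WRITEV_S request is waiting for queue room */
};

/*
 * New DL_READV_S request. pkt_size is the size of a packet already waiting
 * in the receive queue, or 0 if there is none. Returns the DL_FLAGS value
 * for the immediate reply and stores the DL_COUNT value in *count.
 */
unsigned long dl_on_readv(struct dl_state *st, int pkt_size, int *count)
{
    *count = 0;
    if (pkt_size > 0) {
        *count = pkt_size;
        return DL_PACK_RECV;    /* satisfied immediately */
    }
    st->recv_pending = 1;
    return DL_NOFLAGS;          /* acknowledged, still pending */
}

/* New DL_WRITEV_S request; same idea for the send direction. */
unsigned long dl_on_writev(struct dl_state *st, int queue_has_room)
{
    if (queue_has_room)
        return DL_PACK_SEND;
    st->send_pending = 1;
    return DL_NOFLAGS;
}

/*
 * Interrupt: complete whatever pending requests can now be satisfied.
 * Returns the combined flags for one follow-up DL_TASK_REPLY. In this
 * sketch a return value of DL_NOFLAGS means nothing completed, and the
 * caller sends no message at all.
 */
unsigned long dl_on_interrupt(struct dl_state *st, int pkt_size,
                              int queue_has_room, int *count)
{
    unsigned long flags = DL_NOFLAGS;

    *count = 0;
    if (st->recv_pending && pkt_size > 0) {
        st->recv_pending = 0;
        *count = pkt_size;
        flags |= DL_PACK_RECV;
    }
    if (st->send_pending && queue_has_room) {
        st->send_pending = 0;
        flags |= DL_PACK_SEND;
    }
    return flags;
}
```

Because inet never issues a second request of the same type while one is pending, a single flag per direction is enough; no request queue is needed in the driver.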

System interaction

The driver should use the System Event Framework (SEF). This framework automatically takes care of interaction with the Reincarnation Server (RS).

The driver should register a signal handler callback function through SEF, and immediately perform an exit from the callback function when it gets a SIGTERM signal. Before exiting, the driver should stop the device.

The driver will typically interact with the PCI server to find and reserve hardware devices. Description of this interaction is beyond the scope of this document.

Protocol messages

This section documents the messages used in the ethernet driver protocol.

Configuration

The DL_CONF message has two purposes. First, it specifies what promiscuity/multicast/broadcast mode the ethernet card should be changed to. Second, it requests the ethernet hardware address of the card. The driver may receive multiple DL_CONF messages over the course of its lifetime. The DL_CONF request looks like this:

Request   DL_CONF   request configuration and set mode
Fields    DL_MODE   m2_l1   unsigned int   flag field stating what mode(s) the NIC should have

The DL_MODE field is a bitwise combination of the following possible flags:

Alias            Value   Meaning
DL_PROMISC_REQ   0x1     promiscuous mode
DL_MULTI_REQ     0x2     multicast mode
DL_BROAD_REQ     0x4     broadcast mode

These flags indicate what types of packets are to be received, in addition to packets addressed specifically to the ethernet device's hardware address. The alias DL_NOMODE equals 0 and is used when none of the above flags are set.
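As a small illustration of how a driver might interpret DL_MODE, the sketch below decodes the flag field into separate booleans. The flag values are the ones from the table above; struct rx_filter and decode_dl_mode() are hypothetical names, and programming the result into the NIC is hardware-specific and not shown.

```c
/* DL_MODE flag values, as listed in this document. */
#define DL_NOMODE       0x0
#define DL_PROMISC_REQ  0x1
#define DL_MULTI_REQ    0x2
#define DL_BROAD_REQ    0x4

/* Hypothetical receive-filter settings a driver would program into its NIC. */
struct rx_filter {
    int promisc;     /* accept all packets on the wire */
    int multicast;   /* accept multicast packets */
    int broadcast;   /* accept broadcast packets */
};

/* Translate the DL_MODE bit field into individual filter settings.
 * Packets addressed to the card's own hardware address are always accepted
 * and therefore have no flag of their own. */
struct rx_filter decode_dl_mode(unsigned int mode)
{
    struct rx_filter f;

    f.promisc   = (mode & DL_PROMISC_REQ) != 0;
    f.multicast = (mode & DL_MULTI_REQ) != 0;
    f.broadcast = (mode & DL_BROAD_REQ) != 0;
    return f;
}
```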

The ethernet driver should change the mode of the hardware device to the mode indicated by these flags, and respond with the following reply:

Reply     DL_CONF_REPLY   provide ethernet configuration
Fields    DL_STAT     m3_i1    int            result code
          DL_HWADDR   m3_ca1   ether_addr_t   upon success: ethernet hardware address

The result code must be either OK to indicate success, or a negative error code. If the driver cannot successfully reserve and interact with the device, this error code is expected to be ENXIO. Upon success, the ethernet hardware address (also known as the MAC address) is to be stored in the reply message as well. The ether_addr_t structure is defined in <net/gen/ether.h>.

The driver may use the first DL_CONF message to initialize the hardware. However, it is recommended that the driver do this immediately at startup. The driver can then be hardcoded to return an OK response to DL_CONF requests. In the long term, the DL_STAT field may be removed.

Statistics

At any time after a device has been first configured with a DL_CONF message, inet may request packet transmission and error statistics from the driver:

Request   DL_GETSTAT_S   request ethernet statistics
Fields    DL_GRANT   m2_l2   cp_grant_id_t   grant (WRITE) for eth_stat_t structure

Upon receiving this request, the ethernet driver must fill in the fields of an eth_stat_t structure as best it can, use sys_safecopyto to copy it out through the grant provided by the caller, and send the following reply message:

Reply     DL_STAT_REPLY   provide ethernet statistics
Fields    none

The driver should not reset the statistics after processing this message. The eth_stat_t structure is defined in <net/gen/eth_io.h>, along with rough descriptions of what each field means.

There is currently no userland tool that prints these statistics.

Data transfer

The request from inet to receive a packet looks like this:

Request   DL_READV_S   receive ethernet packet
Fields    DL_GRANT   m2_l2   cp_grant_id_t   grant (READ) for iovec_s_t vector
          DL_COUNT   m2_i3   int             number of vector elements

The request comes with a grant for a vector that specifies grants and sizes for the destination buffers. For the driver, the process of copying out a packet consists of copying in the vector (using sys_safecopyfrom), and repeatedly copying out (using sys_safecopyto) the next iov_size bytes of the received packet to the next iov_grant grant as specified by that element of the vector, until the entire packet is copied out (sys_vsafecopy may be used as well). The vector itself is DL_COUNT * sizeof(iovec_s_t) bytes in size, and DL_COUNT will not exceed NR_IOREQS. The total size of the buffers specified by the vector is guaranteed to be at least ETH_MAX_PACK_SIZE_TAGGED bytes, which should be large enough to receive the packet; if not, the driver may truncate the packet or simply panic. The driver itself should remove any trailing CRC bytes from the ethernet packet before copying it out. The driver must ensure that the resulting packet is at least ETH_MIN_PACK_SIZE bytes. Smaller packets (runt frames) must be thrown away.
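The receive path described above (strip the CRC, drop runt frames, then scatter the packet over the vector elements) can be sketched as follows. This is a simplified model, not driver code: struct buf_elem uses plain pointers where the real iovec_s_t carries grants, memcpy() stands in for sys_safecopyto, and the values of ETH_MIN_PACK_SIZE and the CRC length are assumptions stated in the comments.

```c
#include <stddef.h>
#include <string.h>

#define ETH_MIN_PACK_SIZE  60   /* assumed value; minimum size without CRC */
#define ETH_CRC_SIZE        4   /* assumed length of the trailing CRC */

/* Local stand-in for one vector element. In the real protocol each element
 * holds a grant (iov_grant) and a size (iov_size) rather than a pointer. */
struct buf_elem {
    unsigned char *buf;
    size_t size;
};

/*
 * Copy a received frame out to the buffers described by vec[0..count-1].
 * Returns the number of bytes delivered, which is the value to report in
 * DL_COUNT, or 0 if the frame is a runt and was thrown away. If the buffers
 * are too small, the packet is truncated.
 */
size_t deliver_packet(const unsigned char *frame, size_t frame_len,
                      int has_crc, const struct buf_elem *vec, int count)
{
    size_t len = frame_len;
    size_t off = 0;
    int i;

    /* Remove the trailing CRC bytes if the hardware left them in place. */
    if (has_crc) {
        if (len < ETH_CRC_SIZE)
            return 0;
        len -= ETH_CRC_SIZE;
    }

    /* Runt frames must be thrown away. */
    if (len < ETH_MIN_PACK_SIZE)
        return 0;

    /* Fill each buffer in turn until the whole packet has been copied. */
    for (i = 0; i < count && off < len; i++) {
        size_t chunk = len - off;

        if (chunk > vec[i].size)
            chunk = vec[i].size;
        memcpy(vec[i].buf, frame + off, chunk);   /* sys_safecopyto stand-in */
        off += chunk;
    }
    return off;
}
```

In a real driver, the vector itself would first be copied in with sys_safecopyfrom using the DL_GRANT and DL_COUNT values from the request.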

The request from inet to send a packet looks like this:

Request   DL_WRITEV_S   send ethernet packet
Fields    DL_GRANT   m2_l2   cp_grant_id_t   grant (READ) for iovec_s_t vector
          DL_COUNT   m2_i3   int             number of vector elements

Similar to the receive request, the send request includes a vector that specifies grants and sizes for the buffers that contain the packet data. The process of copying in the packet consists of copying in the vector, and repeatedly copying in the next iov_size bytes of the packet from the next iov_grant grant as specified by that element of the vector, until the entire packet is copied in. The total size of the packet is guaranteed to be at least ETH_MIN_PACK_SIZE bytes and at most ETH_MAX_PACK_SIZE_TAGGED bytes. Depending on the underlying hardware, the driver may have to copy in the vector before being able to determine whether there is room in the send queue to send the entire packet.
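The send direction can be modelled in the same simplified way. The sketch below first sums the element sizes, which is the step that lets a driver decide whether the packet fits in its send queue, and only then gathers the data. As before, struct src_elem, vector_total() and gather_packet() are hypothetical names, memcpy() stands in for sys_safecopyfrom, and the two size constants are assumed values.

```c
#include <stddef.h>
#include <string.h>

#define ETH_MIN_PACK_SIZE           60   /* assumed value */
#define ETH_MAX_PACK_SIZE_TAGGED  1518   /* assumed value */

/* Local stand-in for one vector element; the real iovec_s_t holds a grant. */
struct src_elem {
    const unsigned char *buf;
    size_t size;
};

/* Total packet size described by the vector. Knowing this up front lets the
 * driver check for room in the send queue before copying any data. */
size_t vector_total(const struct src_elem *vec, int count)
{
    size_t total = 0;
    int i;

    for (i = 0; i < count; i++)
        total += vec[i].size;
    return total;
}

/*
 * Gather the packet into one contiguous transmit buffer.
 * Returns the packet size, or 0 if the size is outside the guaranteed range
 * or the packet does not fit in the buffer.
 */
size_t gather_packet(const struct src_elem *vec, int count,
                     unsigned char *out, size_t out_size)
{
    size_t total = vector_total(vec, count);
    size_t off = 0;
    int i;

    if (total < ETH_MIN_PACK_SIZE || total > ETH_MAX_PACK_SIZE_TAGGED)
        return 0;
    if (total > out_size)
        return 0;

    for (i = 0; i < count; i++) {
        memcpy(out + off, vec[i].buf, vec[i].size);  /* sys_safecopyfrom stand-in */
        off += vec[i].size;
    }
    return off;
}
```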

The reply message for both DL_READV_S and DL_WRITEV_S requests is the same. It may acknowledge a request and/or signify completion of a pending request. When signifying completion of a receive request, it must specify the size of the received packet. The message looks like this:

Reply     DL_TASK_REPLY   acknowledge pending or successful data transfer
Fields    DL_FLAGS   m2_l1   unsigned long   completion flags
          DL_COUNT   m2_i3   int             if DL_PACK_RECV is set: received packet size, in bytes

The completion flags in DL_FLAGS may be a bitwise combination of the following flags:

Alias          Value   Meaning
DL_PACK_SEND   0x1     the send request has been completed
DL_PACK_RECV   0x2     the receive request has been completed

The alias DL_NOFLAGS equals 0 and is typically used to indicate that the just-received send or receive request could not be satisfied immediately, and is now pending.

Inet guarantees that all grants, including the grant for the vector, stay valid until the driver has acknowledged that the request has been completed. Thus, the driver need not save a copy of the vector contents before sending a DL_NOFLAGS task reply; it can simply save the vector grant (and size).