releases:3.2.0:developersguide:datalinkprotocol [2014/11/11 14:52]
releases:3.2.0:developersguide:datalinkprotocol [2014/11/11 14:52] (current)
 +====== The Data Link protocol ======
 +
 +This page provides the official documentation of the Data Link protocol of MINIX 3. It describes the protocol used between the inet server and the ethernet drivers that control network interface card (NIC) hardware. The current version documents the protocol used in SVN revisions **9568** and later. If you update this document because of changes to MINIX 3, please mention the revision number of the change in the wiki comment.
 +
 +
 +===== General information =====
 +
 +The following information is written mainly for people implementing an ethernet driver.
 +
 +==== Instances ====
 +
 +Each ethernet driver instance is responsible for exactly one ethernet card. A machine may have multiple ethernet cards of the same type; in this case, multiple copies of the same ethernet driver will be started. Each copy will be given a unique instance number (0 for the first instance of that particular driver, 1 for the second, and so on).
 +
 +It is up to the driver to decide how to map instances to ethernet cards. This mapping should be reasonably stable across machine boots, to make sure that the user's per-instance configuration will consistently apply to the same card. The typical approach is to iterate linearly over PCI devices at startup time, skipping the first N matching PCI devices (where N is the instance number) and reserving the next one.
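 +
 +The skip-N approach can be sketched as follows. This is a minimal illustration, not actual driver code: ''struct pci_dev'', the device table and ''find_instance()'' are hypothetical stand-ins for the enumeration that a real driver performs through the PCI server.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified description of one PCI device. */
struct pci_dev {
    unsigned short vid;   /* vendor ID */
    unsigned short did;   /* device ID */
};

/*
 * Return the index of the device that belongs to the given instance, or -1
 * if there are fewer than instance + 1 matching devices.  Devices are
 * visited in a fixed order, so the mapping stays the same as long as the
 * enumeration order does not change between boots.
 */
int find_instance(const struct pci_dev *devs, size_t ndevs,
    unsigned short vid, unsigned short did, int instance)
{
    int skip = instance;   /* matching devices that still have to be skipped */
    size_t i;

    assert(instance >= 0);

    for (i = 0; i < ndevs; i++) {
        if (devs[i].vid != vid || devs[i].did != did)
            continue;          /* not a device this driver supports */
        if (skip-- == 0)
            return (int)i;     /* this is match number instance + 1 */
    }
    return -1;
}
```

 +With two matching cards in the table, instance 0 gets the first one and instance 1 the second; a third instance of the same driver finds no device and should fail its initialization.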
 +
 +==== Initialization ====
 +
 +Upon initialization, the driver must retrieve the instance number. It is passed as one of the arguments to the driver, in the form ''instance=N'' where N is a decimal representation of the instance number. Obtaining the instance number is typically done by means of env_setargs() and env_parse(), as found in ''<minix/sysutil.h>''.
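 +
 +To make the argument format concrete, the following sketch extracts the instance number from an argument vector using only the standard C library. ''parse_instance()'' is a hypothetical helper written for this illustration, and the upper bound of 255 is an arbitrary sanity limit; a MINIX driver would normally rely on env_setargs() and env_parse() instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan the argument vector for "instance=N" and return N, or -1 if the
 * argument is missing or malformed.  Hypothetical helper for illustration
 * only; it does not use the MINIX env_*() functions.
 */
int parse_instance(int argc, char **argv)
{
    static const char key[] = "instance=";
    const size_t keylen = sizeof(key) - 1;
    int i;

    assert(argv != NULL);

    for (i = 1; i < argc; i++) {
        const char *val;
        char *end;
        long n;

        if (strncmp(argv[i], key, keylen) != 0)
            continue;                 /* some other argument */
        val = argv[i] + keylen;
        if (*val < '0' || *val > '9')
            return -1;                /* empty, signed, or not a number */
        n = strtol(val, &end, 10);    /* decimal representation only */
        if (*end != '\0' || n > 255)
            return -1;                /* trailing garbage or out of range */
        return (int)n;
    }
    return -1;
}
```

 +A driver started with the argument ''instance=2'' would obtain 2 from this function; a missing or malformed argument yields -1, which the driver can treat as a fatal startup error.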
 +
 +The driver should try to initialize the device immediately when starting up, and fail its own initialization (for example, by means of panic()) if the device cannot be initialized. Please note that for legacy reasons, this is not common practice in existing ethernet drivers.
 +
 +Upon (successful) startup, the ethernet driver must announce its presence in DS. This should be done by calling the netdriver_announce() function found in libnetdriver.
 +
 +==== Requests and replies ====
 +
 +The basic protocol consists of inet sending requests to the ethernet driver, and the driver sending replies. The <fs small>m_type</fs> message field contains the request or reply type. All message names and field aliases are defined in ''<minix/com.h>''. The first request issued by inet will always be a <fs small>DL_CONF</fs> request, although more <fs small>DL_CONF</fs> requests may be issued later.
 +
 +Inet will not send a new request until it has received a reply for the previous request. As such, the driver must always reply to each request immediately. In the case of the <fs small>DL_READV_S</fs> and <fs small>DL_WRITEV_S</fs> data transfer requests, this may lead to one immediate reply message to acknowledge a request, and a second reply message later to complete the request. See the section on data transfer below. The two other requests, <fs small>DL_CONF</fs> and <fs small>DL_GETSTAT_S</fs>, follow a stricter request-reply form.
 +
 +Ethernet drivers should use the netdriver_receive() function from libnetdriver to receive messages. The send() primitive should be used to send replies back to inet.
 +
 +==== Data transfer ====
 +
 +Like all other requests, data transfer is always initiated by inet. Inet will send a <fs small>DL_READV_S</fs> request when it is ready to receive a packet. It will send a <fs small>DL_WRITEV_S</fs> request to indicate that a packet should be sent.
 +
 +The driver may not always be able to satisfy the send or receive request immediately, for example because there are no packets in the receive queue, or because the send queue is full. If the driver is able to satisfy the request immediately, it should send a <fs small>DL_TASK_REPLY</fs> with the appropriate <fs small>DL_PACK_</fs> flag set in the <fs small>DL_FLAGS</fs> field (and, in the case of receives, with <fs small>DL_COUNT</fs> set to the size of the received packet).
 +
 +If the driver is //not// able to satisfy the request immediately, it should immediately send a <fs small>DL_TASK_REPLY</fs> with a value of <fs small>DL_NOFLAGS</fs> (0) in <fs small>DL_FLAGS</fs>, to indicate that the request has been received but is still pending. Once the driver is able to satisfy such a pending request (because a new packet arrived, or because enough previous data were sent to make room for the new packet), it should send another <fs small>DL_TASK_REPLY</fs> message, with the appropriate <fs small>DL_PACK_</fs> flag set in <fs small>DL_FLAGS</fs> (and, for receives, with the <fs small>DL_COUNT</fs> field set, see above). This usually happens in response to an interrupt. If the same interrupt happens to satisfy both a pending receive and a pending send request, a single <fs small>DL_TASK_REPLY</fs> message may be used to acknowledge both; in that case, <fs small>DL_FLAGS</fs> would contain the bitwise OR of <fs small>DL_PACK_SEND</fs> and <fs small>DL_PACK_RECV</fs>, and <fs small>DL_COUNT</fs> would contain the size of the received packet.
 +
 +In other words, while a send or receive request must be acknowledged immediately with a reply, the request itself will stay pending until a matching <fs small>DL_PACK_</fs> flag is set in a reply. The driver will never receive another request of the same (<fs small>DL_READV_S</fs>, <fs small>DL_WRITEV_S</fs>) type while the previous request has not yet fully completed.
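 +
 +The acknowledge-then-complete behaviour described in this section can be modelled as a small state machine. The sketch below is illustrative only: ''struct drv_state'' and the three ''handle_*()'' functions are hypothetical, the queues are reduced to simple counters, and the flag values are copied from the tables in this document instead of being taken from ''<minix/com.h>''. Each function returns the value to place in <fs small>DL_FLAGS</fs>.

```c
#include <assert.h>

/* Flag values as listed in this document (normally from <minix/com.h>). */
#define DL_NOFLAGS   0x0
#define DL_PACK_SEND 0x1
#define DL_PACK_RECV 0x2

/* Hypothetical driver state, reduced to what the reply logic needs. */
struct drv_state {
    int recv_pending;  /* a DL_READV_S request is waiting for a packet */
    int send_pending;  /* a DL_WRITEV_S request is waiting for queue space */
    int rx_len;        /* size of the queued received packet, 0 if none */
    int tx_free;       /* free slots in the send queue */
};

/* DL_FLAGS for the immediate reply to a DL_READV_S request. */
unsigned long handle_readv(struct drv_state *s, int *count)
{
    assert(!s->recv_pending);   /* inet never has two reads outstanding */
    if (s->rx_len > 0) {
        *count = s->rx_len;     /* satisfied right away */
        s->rx_len = 0;
        return DL_PACK_RECV;
    }
    s->recv_pending = 1;        /* acknowledged, completed later */
    return DL_NOFLAGS;
}

/* DL_FLAGS for the immediate reply to a DL_WRITEV_S request. */
unsigned long handle_writev(struct drv_state *s)
{
    assert(!s->send_pending);   /* inet never has two writes outstanding */
    if (s->tx_free > 0) {
        s->tx_free--;           /* packet queued for transmission */
        return DL_PACK_SEND;
    }
    s->send_pending = 1;
    return DL_NOFLAGS;
}

/*
 * Called after an interrupt has updated rx_len and/or tx_free.  Returns the
 * DL_FLAGS for one combined DL_TASK_REPLY.  A result of DL_NOFLAGS means
 * that nothing completed, so no reply is sent at all.
 */
unsigned long handle_interrupt(struct drv_state *s, int *count)
{
    unsigned long flags = DL_NOFLAGS;

    if (s->recv_pending && s->rx_len > 0) {
        *count = s->rx_len;
        s->rx_len = 0;
        s->recv_pending = 0;
        flags |= DL_PACK_RECV;
    }
    if (s->send_pending && s->tx_free > 0) {
        s->tx_free--;
        s->send_pending = 0;
        flags |= DL_PACK_SEND;
    }
    return flags;
}
```

 +Note the asymmetry in this model: a request handler always produces a reply, even if it only carries <fs small>DL_NOFLAGS</fs>, whereas the interrupt path only produces a reply when at least one pending request has completed.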
 +
 +==== System interaction ====
 +
 +The driver should use the [[.:sef|System Event Framework]] (SEF). This framework automatically takes care of interaction with the Reincarnation Server (RS).
 +
 +The driver should register a signal handler callback function through SEF. When it receives a SIGTERM signal, the driver should stop the device and then exit immediately from within the callback function.
 +
 +The driver will typically interact with the PCI server to find and reserve hardware devices. Description of this interaction is beyond the scope of this document.
 +
 +===== Protocol messages =====
 +
 +This section documents the messages used in the ethernet driver protocol.
 +
 +==== Configuration ====
 +
 +The <fs small>DL_CONF</fs> message has two purposes. First, it specifies what promiscuity/multicast/broadcast mode the ethernet card should be changed to. Second, it requests the ethernet hardware address of the card. The driver may receive multiple <fs small>DL_CONF</fs> messages over the course of its lifetime. The <fs small>DL_CONF</fs> request looks like this:
 +
 +|<90%>|
 +| @#E0E0FF:< 16%  >**Request**| @#E0E0FF:**DL_CONF**||| @#E0E0FF: request configuration and set mode |
 +|**Fields**|<12%>DL_MODE|<6%>m2_l1|<16%>unsigned int | flag field stating what mode(s) the NIC should have |
 +
 +The <fs small>DL_MODE</fs> field is a bitwise combination of the following possible flags:
 +
 +|<35%>|
 +|**Alias**|**Value**|**Meaning**|
 +| DL_PROMISC_REQ | 0x1 | promiscuous mode |
 +| DL_MULTI_REQ | 0x2 | multicast mode |
 +| DL_BROAD_REQ | 0x4 | broadcast mode |
 +
 +These flags indicate what types of packets are to be received, in addition to packets addressed specifically to the ethernet device's hardware address. The alias <fs small>DL_NOMODE</fs> equals 0 and is used when none of the above flags are set.
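 +
 +As a rough software model of what these flags mean, the sketch below decides whether a frame with a given destination address would be received under a given <fs small>DL_MODE</fs> value. It is an illustration under assumptions: ''frame_accepted()'' is a hypothetical function, real drivers program the filtering into the NIC instead of doing it per frame, and treating broadcast separately from multicast is a choice made for this example. The flag values are the ones from the table above.

```c
#include <assert.h>
#include <string.h>

/* Mode flags as listed in the table above (normally from <minix/com.h>). */
#define DL_NOMODE      0x0
#define DL_PROMISC_REQ 0x1
#define DL_MULTI_REQ   0x2
#define DL_BROAD_REQ   0x4

/*
 * Return 1 if a frame with destination address 'dst' should be received by
 * a card with hardware address 'own' that is configured with 'mode'.
 */
int frame_accepted(unsigned int mode, const unsigned char dst[6],
    const unsigned char own[6])
{
    static const unsigned char bcast[6] =
        { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    assert(dst != NULL && own != NULL);

    if (memcmp(dst, own, 6) == 0)
        return 1;                            /* own address: always received */
    if (mode & DL_PROMISC_REQ)
        return 1;                            /* promiscuous: everything */
    if (memcmp(dst, bcast, 6) == 0)
        return (mode & DL_BROAD_REQ) != 0;   /* all-ones broadcast address */
    if (dst[0] & 0x01)
        return (mode & DL_MULTI_REQ) != 0;   /* group bit set: multicast */
    return 0;                                /* unicast for another station */
}
```

 +With <fs small>DL_NOMODE</fs>, only frames addressed to the card itself pass this check.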
 +
 +The ethernet driver should change the mode of the hardware device to the mode indicated by these flags, and respond with the following reply:
 +
 +|<90%>|
 +| @#E0E0FF:< 16%  >**Reply**| @#E0E0FF:**DL_CONF_REPLY**||| @#E0E0FF: provide ethernet configuration |
 +|**Fields**|<12%>DL_STAT|<6%>m3_i1|<16%>int | result code |
 +| ::: |DL_HWADDR|m3_ca1|ether_addr_t| //upon success:// ethernet hardware address|
 +
 +The result code must be either <fs small>OK</fs> to indicate success, or a negative error code. If the driver cannot successfully reserve and interact with the device, this error code is expected to be <fs small>ENXIO</fs>. Upon success, the ethernet hardware address (aka MAC address) is to be stored in the reply message as well. The ether_addr_t structure is defined in ''<net/gen/ether.h>''.
 +
 +The driver may use the first <fs small>DL_CONF</fs> message to initialize the hardware. However, it is recommended that the driver do this immediately at startup. The driver can then be hardcoded to return an <fs small>OK</fs> response to <fs small>DL_CONF</fs> requests. In the long term, the <fs small>DL_STAT</fs> field may be removed.
 +
 +==== Statistics ====
 +
 +At any time after a device has been first configured with a <fs small>DL_CONF</fs> message, inet may request packet transmission and error statistics from the driver:
 +
 +|<90%>|
 +| @#E0E0FF:< 16%  >**Request**| @#E0E0FF:**DL_GETSTAT_S**||| @#E0E0FF: request ethernet statistics |
 +|**Fields**|<12%>DL_GRANT|<6%>m2_l2|<16%>cp_grant_id_t| grant (WRITE) for //eth_stat_t// structure |
 +
 +Upon receiving this request, the ethernet driver must fill the fields of an //eth_stat_t// structure as best it can, use [[.:kernelapi#sys_safecopyto|sys_safecopyto]] to copy it out to the caller's provided grant, and send the following reply message:
 +
 +|<90%>|
 +| @#E0E0FF:< 16%  >**Reply**| @#E0E0FF:<34%  >**DL_STAT_REPLY**| @#E0E0FF: provide ethernet statistics |
 +|**Fields**| //none//||
 +
 +The driver should not reset the statistics after processing this message.
 +The eth_stat_t structure is defined in ''<net/gen/eth_io.h>'', along with rough descriptions of what each field means.
 +
 +There is currently no userland tool that prints these statistics.
 +
 +==== Data transfer ====
 +
 +The request from inet to receive a packet looks like this:
 +
 +|<90%>|
 +| @#E0E0FF:< 16%  >**Request**| @#E0E0FF:**DL_READV_S**||| @#E0E0FF: receive ethernet packet |
 +|**Fields**|<12%>DL_GRANT|<6%>m2_l2|<16%>cp_grant_id_t| grant (READ) for //iovec_s_t// vector |
 +| ::: |DL_COUNT|m2_i3|int| number of vector elements |
 +
 +The request comes with a grant for a vector that specifies grants and sizes for the destination buffers. For the driver, the process of copying out a packet consists of copying in the vector (using [[.:kernelapi#sys_safecopyfrom|sys_safecopyfrom]]), and repeatedly copying out (using [[.:kernelapi#sys_safecopyto|sys_safecopyto]]) the next ''iov_size'' bytes of the received packet to the next ''iov_grant'' grant as specified by that element of the vector, until the entire packet is copied out ([[.:kernelapi#sys_vsafecopy|sys_vsafecopy]] may be used as well). The vector itself is DL_COUNT * sizeof(iovec_s_t) bytes in size, and DL_COUNT will not exceed NR_IOREQS. The total size of the buffers specified by the vector is guaranteed to be at least ETH_MAX_PACK_SIZE_TAGGED bytes, which should be large enough to receive the packet; if not, the driver may truncate the packet or simply panic. The driver itself should remove any trailing CRC bytes from the ethernet packet before copying it out. The driver must ensure that the resulting packet is at least ETH_MIN_PACK_SIZE bytes. Smaller packets (//runt frames//) must be thrown away.
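 +
 +The copy-out loop can be sketched in plain C as shown below. This is a simulation under stated assumptions, not driver code: ''struct sim_iovec'' is a hypothetical stand-in for //iovec_s_t// in which an ordinary pointer replaces the ''iov_grant'' grant, memcpy() takes the place of sys_safecopyto, and the values given for ETH_MIN_PACK_SIZE and the CRC size are assumed here instead of being taken from ''<net/gen/ether.h>''.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_CRC_SIZE       4   /* trailing frame check sequence (assumed) */
#define ETH_MIN_PACK_SIZE 60   /* assumed value; see <net/gen/ether.h> */

/* Stand-in for iovec_s_t: a plain pointer replaces the iov_grant grant. */
struct sim_iovec {
    unsigned char *buf;
    size_t iov_size;
};

/*
 * Scatter a received frame, which still carries its CRC, over the buffers
 * described by the vector.  Returns the number of bytes copied out (the
 * DL_COUNT value of the reply), or 0 if the frame is a runt and must be
 * thrown away.  A frame larger than the buffers is truncated.
 */
size_t copy_out_packet(const unsigned char *frame, size_t frame_len,
    const struct sim_iovec *iov, int count)
{
    size_t len, off = 0;
    int i;

    assert(frame != NULL && iov != NULL);

    if (frame_len < ETH_CRC_SIZE + ETH_MIN_PACK_SIZE)
        return 0;                        /* runt frame */
    len = frame_len - ETH_CRC_SIZE;      /* strip the trailing CRC */

    for (i = 0; i < count && off < len; i++) {
        size_t chunk = len - off;

        if (chunk > iov[i].iov_size)
            chunk = iov[i].iov_size;     /* fill this buffer, then move on */
        /* A real driver would call sys_safecopyto() on iov_grant here. */
        memcpy(iov[i].buf, frame + off, chunk);
        off += chunk;
    }
    return off;
}
```

 +The return value doubles as the drop indicator in this sketch: a result of 0 means that no reply completing the receive request should be generated for this frame.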
 +
 +The request from inet to send a packet looks like this:
 +
 +|<90%>|
 +| @#E0E0FF:< 16%  >**Request**| @#E0E0FF:**DL_WRITEV_S**||| @#E0E0FF: send ethernet packet |
 +|**Fields**|<12%>DL_GRANT|<6%>m2_l2|<16%>cp_grant_id_t| grant (READ) for //iovec_s_t// vector |
 +| ::: |DL_COUNT|m2_i3|int| number of vector elements |
 +
 +Similar to the receive request, the send request includes a vector that specifies grants and sizes for the buffers that contain the packet data. The process of copying in the packet consists of copying in the vector, and repeatedly copying in the next ''iov_size'' bytes of the packet to send from the next ''iov_grant'' grant as specified by that element of the vector, until the entire packet is copied in. The total size of the packet is guaranteed to be at least <fs small>ETH_MIN_PACK_SIZE</fs> bytes and at most <fs small>ETH_MAX_PACK_SIZE_TAGGED</fs> bytes. Depending on the underlying hardware, the driver may have to copy in the vector before being able to determine whether there is room in the send queue to send the entire packet.
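 +
 +The send side can be simulated in the same spirit. In the sketch below, ''struct sim_iovec'', ''packet_size()'' and ''copy_in_packet()'' are hypothetical names, ordinary pointers stand in for grants, memcpy() stands in for sys_safecopyfrom, and the two size limits are assumed values. The total packet size is computed from the vector first, because that is what a driver needs to know before it can tell whether the packet fits in the send queue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_MIN_PACK_SIZE          60   /* assumed; see <net/gen/ether.h> */
#define ETH_MAX_PACK_SIZE_TAGGED 1518   /* assumed; see <net/gen/ether.h> */

/* Stand-in for iovec_s_t: a plain pointer replaces the iov_grant grant. */
struct sim_iovec {
    const unsigned char *buf;
    size_t iov_size;
};

/* Total packet size described by the vector, or 0 if it exceeds the limit. */
size_t packet_size(const struct sim_iovec *iov, int count)
{
    size_t total = 0;
    int i;

    for (i = 0; i < count; i++) {
        if (iov[i].iov_size > ETH_MAX_PACK_SIZE_TAGGED - total)
            return 0;                   /* too large; also avoids overflow */
        total += iov[i].iov_size;
    }
    return total;
}

/*
 * Gather the packet into 'frame', which has room for 'room' bytes.  Returns
 * the packet size, or 0 if the packet is out of range or does not fit.  In
 * the latter case a real driver would leave the request pending.
 */
size_t copy_in_packet(unsigned char *frame, size_t room,
    const struct sim_iovec *iov, int count)
{
    size_t total = packet_size(iov, count), off = 0;
    int i;

    assert(frame != NULL);

    if (total < ETH_MIN_PACK_SIZE || total > room)
        return 0;

    for (i = 0; i < count; i++) {
        /* A real driver would call sys_safecopyfrom() on iov_grant here. */
        memcpy(frame + off, iov[i].buf, iov[i].iov_size);
        off += iov[i].iov_size;
    }
    return off;
}
```

 +A typical vector might hold a 14-byte header buffer followed by a payload buffer; the function concatenates the two into one contiguous frame.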
 +
 +The reply message for both <fs small>DL_READV_S</fs> and <fs small>DL_WRITEV_S</fs> requests is the same. It may acknowledge a request and/or signify completion of a pending request. When signifying completion of a receive request, it must specify the size of the received packet. The message looks like this:
 +
 +|<90%>|
 +| @#E0E0FF:< 16%  >**Reply**| @#E0E0FF:**DL_TASK_REPLY**||| @#E0E0FF: acknowledge pending or successful data transfer |
 +|**Fields**|<12%>DL_FLAGS|<6%>m2_l1|<16%>unsigned long|completion flags |
 +| ::: |DL_COUNT|m2_i3|int| //if DL_PACK_RECV is set:// received packet size, in bytes |
 +
 +The completion flags in <fs small>DL_FLAGS</fs> may be a bitwise combination of the following flags:
 +
 +|<55%>|
 +|**Alias**|**Value**|**Meaning**|
 +| DL_PACK_SEND | 0x1 | the send request has been completed |
 +| DL_PACK_RECV | 0x2 | the receive request has been completed |
 +
 +The alias <fs small>DL_NOFLAGS</fs> equals 0 and is typically used to indicate that the just-received send or receive request could not be satisfied immediately, and is now pending.
  
releases/3.2.0/developersguide/datalinkprotocol.txt · Last modified: 2014/11/11 14:52 (external edit)