Achieving the lowest latency for delay-sensitive traffic
Almost every packet on a digital network is part of a "flow": a sequence of packets from the same source to the same destination. Flows are of two types:
- those that carry a continuous stream of data, such as an audio or video signal
- those that transfer information between processes running in computers, as in a TCP session
We can think of the former as "AV" flows and of the latter as "IT" flows. For many applications, AV flows are sensitive to "latency", which is the time between a packet being transmitted by the sender and received at its destination; in a phone call, for example, longer delays make it difficult to have a natural conversation. New applications proposed for 5G, such as those involving augmented or virtual reality or tactile feedback, will have even more severe requirements. For IT flows, if latency matters at all, it is the average over time that counts, whereas for AV flows it is the delay of the slowest packet.
Current-generation networks were originally designed as IT networks, carrying IT flows, and have had various features added to assist AV flows. These additions increase complexity but still do not provide the best service for AV flows.
For NGP we propose to have separate services for AV and IT flows; on communication links, the two services are multiplexed together with the AV service taking the space it needs when it needs it and the IT service using all of the remaining capacity. This would have been expensive to implement on 1980s hardware, but is easy to provide in current-generation logic. The IT service uses label routing as described in the previous blog post; the AV service uses synchronization to achieve the lowest possible latency, as described below.
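The multiplexing rule can be illustrated with a small Python sketch. It is not taken from the NGP specification: the function name `multiplex_frame`, the queue structures, and the simplification that IT data is handed over in slot-sized pieces are all assumptions made for illustration.

```python
from collections import deque


def multiplex_frame(slot_owner, av_queues, it_queue):
    """Fill one pass over the slots on a link (illustrative model only).

    slot_owner: list whose entry i names the AV flow owning slot i,
                or None for a free slot (the "null" flow).
    av_queues:  dict mapping flow name -> deque of pending AV packets.
    it_queue:   deque of pending IT data, assumed already cut to slot size.
    """
    frame = []
    for owner in slot_owner:
        queue = av_queues.get(owner)
        if queue:
            # The owning AV flow has a packet ready: it takes its slot.
            frame.append(("AV", owner, queue.popleft()))
        elif it_queue:
            # Free slot, or the owner has nothing to send: IT uses the space.
            frame.append(("IT", None, it_queue.popleft()))
        else:
            frame.append(("IDLE", None, None))
    return frame


av = {"voice": deque(["v1"]), "video": deque(["d1", "d2"])}
it = deque(["i1", "i2", "i3"])
frame = multiplex_frame(["voice", None, "video", "voice"], av, it)
```

In this example the second "voice" slot is unused by its owner, so the IT service fills it; the AV flows never wait for IT traffic.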
We define a "slot" as an opportunity for an AV packet to be transmitted. Each slot is allocated to a specific flow (with free slots being allocated to a "null" flow), so each flow has its own set of slots and the service it gets can't be affected by traffic on other flows; thus, no shaping or policing is needed. Packets don't need labels, because they can be identified by the slot in which they arrive. If the slots on different links are phase-locked (which proved straightforward in the prototype implementation), the delay through each switch, and hence the flow's latency through the network, is fixed.
Forwarding can be done by simply copying all incoming AV packets (or rather, the contents of all incoming slots, from all ports on the switch) to a buffer where they stay for a few microseconds before being overwritten; each output has a routing table which tells it from where in the buffer to take the packet to fill each slot. Thus a flow can be multicast simply by setting more than one output to transmit it.
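As a rough sketch of this forwarding scheme, the Python below models the buffer as a dictionary keyed by (input port, slot index) and gives each output a routing table listing where to read from for each of its slots. The names `forward_cycle`, `incoming` and `routing_tables`, and the use of a dictionary rather than a fixed-width memory, are assumptions for illustration and not the prototype's design.

```python
def forward_cycle(incoming, routing_tables):
    """Forward one cycle of AV slots through a switch (illustrative model).

    incoming:       dict mapping input port -> list of slot contents.
    routing_tables: dict mapping output port -> list with one entry per
                    output slot: an (input_port, slot_index) pair naming
                    the buffer location to read, or None for a free slot.
    """
    # Copy the contents of every incoming slot into the shared buffer.
    buffer = {}
    for port, slots in incoming.items():
        for index, contents in enumerate(slots):
            buffer[(port, index)] = contents

    # Each output fills its slots by reading the buffer; no labels needed.
    outgoing = {}
    for out_port, table in routing_tables.items():
        outgoing[out_port] = [
            buffer.get(source) if source is not None else None
            for source in table
        ]
    return outgoing


incoming = {"A": [b"a0", b"a1"], "B": [b"b0", b"b1"]}
routing_tables = {
    "X": [("A", 1), None],
    "Y": [("A", 1), ("B", 0)],  # slot 1 of port A is multicast to X and Y
}
outgoing = forward_cycle(incoming, routing_tables)
```

Multicast falls out of the structure: outputs X and Y both point at the same buffer location, and the packet is never duplicated on the input side.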
The original proposal was to allow slots to be of any size, from a few bytes (for packets carrying a single audio sample) up to about 4 KB (for video). In practice, the minimum size is the width of the buffer, which needs to be quite wide to achieve the necessary throughput. The maximum needs to be quite small, because once a few flows have been routed, fragmentation of the space on a link could make it impossible to route a flow that needs a large slot. A fixed size of 64 bytes was therefore chosen; using a fixed size also means that there is no need for control plane protocols to signal where the slots begin. That is, of course, not very different from the size of an ATM cell, but unlike in ATM, unused space is not wasted: it is made available to the IT service.
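The fragmentation problem with variable-size slots can be seen in a toy model. In the Python sketch below, the 8192-byte repeating region on a link and the positions of the existing flows are invented for illustration; only the 64-byte and roughly 4 KB sizes come from the text.

```python
def free_runs(frame_len, allocations):
    """Return the contiguous free runs as (start, length) pairs.

    frame_len:   size in bytes of the (hypothetical) repeating region.
    allocations: list of (start, length) pairs already given to flows.
    """
    runs = []
    cursor = 0
    for start, length in sorted(allocations):
        if start > cursor:
            runs.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    if cursor < frame_len:
        runs.append((cursor, frame_len - cursor))
    return runs


def can_place(frame_len, allocations, size):
    """True if some contiguous free run can hold a slot of `size` bytes."""
    return any(length >= size for _, length in free_runs(frame_len, allocations))


# Four small flows scattered across an 8192-byte region.
allocations = [(1000, 64), (3000, 64), (5000, 64), (7000, 64)]
total_free = sum(length for _, length in free_runs(8192, allocations))
fits_video = can_place(8192, allocations, 4096)
fits_small = can_place(8192, allocations, 64)
```

Here 7936 bytes are free in total, yet no single gap can take a 4096-byte slot, so the large flow cannot be routed. With fixed 64-byte slots, any free slot is as good as any other.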
The AV service is very simple and efficient to implement, and can be used for any traffic that needs guaranteed throughput. It provides very low latency as standard; the buffering delay is less than 15 microseconds per hop. For "variable bit rate" traffic, such as constant-quality compressed media, the flow can simply be set up to carry the peak bandwidth, and any unused capacity will be available for the IT service.
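A back-of-envelope sketch of the sizing involved might look like the following. The 64-byte slot size and the 15-microsecond per-hop figure come from the text; the idea that a flow's slots recur at a fixed rate (`recurrences_per_second`), the example rate of 8000 per second, and the assumption that all 64 bytes carry payload are hypothetical and only there to make the arithmetic concrete.

```python
SLOT_BYTES = 64           # fixed slot size, from the text
PER_HOP_BUFFERING_US = 15  # upper bound on buffering delay per hop, from the text


def slots_needed(peak_bits_per_second, recurrences_per_second):
    """Slots per recurrence needed to carry a flow's peak rate.

    Assumes (hypothetically) that each allocated slot recurs
    `recurrences_per_second` times and carries 64 bytes of payload.
    """
    bits_per_slot_per_second = SLOT_BYTES * 8 * recurrences_per_second
    # Ceiling division: a partly used slot still has to be allocated.
    return -(-peak_bits_per_second // bits_per_slot_per_second)


def buffering_bound_us(hops):
    """Upper bound on total buffering delay along a path, in microseconds."""
    return hops * PER_HOP_BUFFERING_US


video_slots = slots_needed(10_000_000, 8000)  # 10 Mbit/s peak
path_bound = buffering_bound_us(5)
```

Under these assumptions a 10 Mbit/s peak needs three slots per recurrence, and whatever the flow does not use at any instant goes to the IT service; a five-hop path adds less than 75 microseconds of buffering delay.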
Because the AV service can be used for any traffic that has requirements on throughput or latency, the IT service can be a purely best-effort service, without any provision for prioritising one flow over another, although if such facilities are required (to support slicing, perhaps) they can be implemented in the same way as in current networks.
The AV service may also be the best way to transfer large files. The main purpose of protocols such as TCP is to control the rate of transmission and to request retransmission of packets that have been discarded because of buffer overflow. With the AV service, the transmission rate is fixed and packets will not be discarded, so a much simpler protocol which merely checks for transmission errors can be used. The flow only needs to be set up in one direction in the data plane, because any request for retransmission can be sent via the control plane; usually all that will be needed is to indicate success when clearing the flow down. This process can also be used to multicast a file to a large number of recipients, for instance when distributing a software update.
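To give a feel for how little such a protocol needs to do, here is a minimal Python sketch. It is an assumption-laden illustration, not a protocol from the specification: the 60-byte payload with a 4-byte CRC-32 trailer, and the names `make_packets` and `receive`, are invented here. Because packets arrive in slot order and are never discarded, the sketch carries no sequence numbers and no rate control.

```python
import zlib

PAYLOAD_BYTES = 60  # hypothetical split: 60 bytes of data + 4-byte CRC-32 = 64


def make_packets(data):
    """Cut a file into slot-sized packets, each ending in a CRC-32."""
    packets = []
    for offset in range(0, len(data), PAYLOAD_BYTES):
        payload = data[offset:offset + PAYLOAD_BYTES]
        packets.append(payload + zlib.crc32(payload).to_bytes(4, "big"))
    return packets


def receive(packets):
    """Check each packet; return (payloads, indices of corrupted packets).

    The list of bad indices is what the receiver would report back via
    the control plane; an empty list means the transfer succeeded.
    """
    payloads, bad = [], []
    for index, packet in enumerate(packets):
        payload = packet[:-4]
        expected = int.from_bytes(packet[-4:], "big")
        if zlib.crc32(payload) == expected:
            payloads.append(payload)
        else:
            payloads.append(None)
            bad.append(index)
    return payloads, bad


data = bytes(range(200))
packets = make_packets(data)
payloads, bad = receive(packets)

# Simulate a transmission error by flipping one bit in the second packet.
damaged = list(packets)
damaged[1] = bytes([damaged[1][0] ^ 0x01]) + damaged[1][1:]
_, bad_after_damage = receive(damaged)
```

In the error-free case the receiver only has to confirm success when the flow is cleared down; in the damaged case it would ask, via the control plane, for packet 1 to be sent again.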
For more details see clause 5.3.4 of GR 003, also available from the Specifications tab.