Toolbox snapshot
The Reactive C++ Toolbox
A sensible upper-bound for message payloads is 1400 bytes:
The User Datagram Protocol (UDP) is a message-oriented transport layer protocol. UDP is the most common transport layer protocol used with multicast addressing.
UDP is a simple, stateless protocol that is best suited to applications that can tolerate:
For these reasons, UDP is often referred to as an unreliable protocol.
Although UDP supports datagrams of up to 64K in length, keeping each datagram within a single Maximum Transmission Unit (MTU) is highly desirable to avoid fragmentation and subsequent reassembly. By default, Linux will return EMSGSIZE if a user attempts to send a datagram that exceeds the target link's MTU.
UDP adds an 8-byte header on top of the 20-byte header used by IPv4 (without options).
The maximum size of a datagram that avoids fragmentation is, therefore, the MTU size minus 28 bytes. This can be verified on a link with an MTU of 1500 bytes as follows:
$ ping -M do -c 1 -s 1472 www.reactivemarkets.com
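The same arithmetic can be captured at compile time. The following is a minimal sketch, assuming an IPv4 header without options; the names `Ipv4HeaderSize`, `UdpHeaderSize` and `maxUdpPayload` are illustrative and not part of any library.

```cpp
#include <cassert>
#include <cstddef>

// Header sizes taken from the text: IPv4 without options, and UDP.
constexpr std::size_t Ipv4HeaderSize = 20;
constexpr std::size_t UdpHeaderSize = 8;

// Largest UDP payload that fits in a single unfragmented IPv4 packet.
// Assumes the MTU is larger than the combined header size.
constexpr std::size_t maxUdpPayload(std::size_t mtu) noexcept
{
    return mtu - Ipv4HeaderSize - UdpHeaderSize;
}

// An MTU of 1500 bytes gives the 1472-byte payload used in the ping command above.
static_assert(maxUdpPayload(1500) == 1472, "1500 - 20 - 8 == 1472");
```

A payload limit of 1400 bytes sits comfortably below this bound, which leaves headroom for links with smaller MTUs or additional encapsulation.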
UDP can be made more reliable by adding sequence numbers to the application protocol; receivers can implement a receive window to detect duplicates and restore datagram order.
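As a rough illustration of such a receive window, the sketch below buffers out-of-order datagrams by sequence number, drops duplicates, and releases payloads in order. The `ReceiveWindow` class and its interface are hypothetical, and the window is unbounded for brevity; a real receiver would likely cap the number of buffered datagrams.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical receive window; names and layout are illustrative only.
class ReceiveWindow {
public:
    explicit ReceiveWindow(std::uint64_t firstSeq) : next_{firstSeq} {}

    // Accept a datagram and return the payloads that are now deliverable in order.
    std::vector<std::string> receive(std::uint64_t seq, std::string payload)
    {
        std::vector<std::string> ready;
        if (seq < next_) {
            return ready; // Duplicate of a datagram that was already delivered.
        }
        // emplace() leaves an existing entry untouched, so buffered duplicates are dropped.
        pending_.emplace(seq, std::move(payload));
        // Drain the contiguous run that starts at the next expected sequence number.
        for (auto it = pending_.begin(); it != pending_.end() && it->first == next_;
             it = pending_.erase(it)) {
            ready.push_back(std::move(it->second));
            ++next_;
        }
        return ready;
    }

    // Next sequence number expected by the application.
    std::uint64_t next() const noexcept { return next_; }

private:
    std::uint64_t next_;
    std::map<std::uint64_t, std::string> pending_;
};
```

For example, if datagram 2 arrives before datagram 1, the call for 2 returns nothing and the call for 1 returns both payloads in order.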
Re-sending the last datagram at regular intervals may be used as a form of keep-alive and for detecting tail-loss on low volume channels.
When a sequence gap is detected, the receiver must use a suitable method for recovering lost transmissions. Recovery can be achieved by sending Negative Acknowledgements (NAKs) back upstream to the sender. Care must be taken, however, to avoid re-request storms that further burden the network and risk compounding any capacity-related issues.
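Gap detection itself can be sketched in a few lines. In the example below, the `Nak` and `GapDetector` types are hypothetical. Each missing range is reported once only, which is one simple way to limit repeated requests; a production receiver would probably also wait briefly for reordered datagrams and add a randomised back-off before sending the NAK.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical NAK request covering the inclusive range [first, last].
struct Nak {
    std::uint64_t first;
    std::uint64_t last;
};

// Hypothetical gap detector; it tracks the next expected sequence number.
class GapDetector {
public:
    explicit GapDetector(std::uint64_t firstSeq) : next_{firstSeq} {}

    // Returns a NAK for the sequence numbers skipped by this datagram, if any.
    std::optional<Nak> onDatagram(std::uint64_t seq)
    {
        if (seq < next_) {
            return std::nullopt; // Duplicate or retransmission; nothing new is missing.
        }
        std::optional<Nak> nak;
        if (seq > next_) {
            nak = Nak{next_, seq - 1}; // Everything between expected and received is missing.
        }
        // Advancing past the gap ensures that the same range is not requested twice.
        next_ = seq + 1;
        return nak;
    }

private:
    std::uint64_t next_;
};
```

If the detector expects sequence 2 and receives 5, it reports the range 2 to 4; a later retransmission of 3 produces no further request.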
A solution often deployed by market-data feeds is to provide a separate recovery channel that periodically broadcasts snapshots for late-join or recovery purposes. Snapshot and delta channels use the same sequence numbers, so that they can be synchronised by downstream consumers. The main advantage of this approach is that events continue to flow in a single direction at a predictable rate.
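The synchronisation between the two channels might look something like the sketch below. The `Delta`, `Snapshot` and `Synchroniser` types are hypothetical, the state is reduced to a single integer, and deltas are assumed to arrive in sequence order. Deltas are buffered until a snapshot arrives; those already covered by the snapshot are then discarded and the remainder are applied.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical message types; a real feed would carry order-book data.
struct Delta {
    std::uint64_t seq;
    int change;
};
struct Snapshot {
    std::uint64_t seq; // Sequence number of the last delta reflected in the snapshot.
    int state;
};

class Synchroniser {
public:
    void onDelta(const Delta& d)
    {
        if (!synced_) {
            buffered_.push_back(d); // Hold until a snapshot arrives.
        } else if (d.seq == seq_ + 1) {
            state_ += d.change;
            seq_ = d.seq;
        } else if (d.seq > seq_ + 1) {
            // Gap detected: fall back to waiting for the next snapshot.
            synced_ = false;
            buffered_.push_back(d);
        }
        // Otherwise d.seq <= seq_: already reflected in the state, so it is dropped.
    }

    void onSnapshot(const Snapshot& s)
    {
        if (synced_) {
            return; // Snapshots are only needed for late-join or recovery.
        }
        state_ = s.state;
        seq_ = s.seq;
        synced_ = true;
        // Replay buffered deltas; those covered by the snapshot are dropped by onDelta().
        std::deque<Delta> replay;
        replay.swap(buffered_);
        for (const auto& d : replay) {
            onDelta(d);
        }
    }

    bool synced() const noexcept { return synced_; }
    int state() const noexcept { return state_; }
    std::uint64_t seq() const noexcept { return seq_; }

private:
    bool synced_{false};
    int state_{0};
    std::uint64_t seq_{0};
    std::deque<Delta> buffered_;
};
```

Note that the consumer never sends anything upstream: after a gap it simply waits for the next snapshot, so traffic continues to flow in one direction.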
Several characteristics of the Transmission Control Protocol (TCP) make it more suitable for reliable messaging than UDP:
In addition, the connected nature of TCP streams and their APIs allow applications to perform specific actions when streams are connected and disconnected.
Disclaimer: these points are intended to highlight key reliability features only. (Consult your favourite TCP reference for more advanced features, including TCP's network congestion avoidance algorithms.)
In summary, UDP is ideal for messages that:
Good candidates for UDP are application heartbeats and snapshot-based market-data feeds.
TCP should generally be preferred otherwise.
When a client subscribes to a stream, depending on the type of data, it may need to synchronise with the current "state of the world". This is normally achieved by replaying historical messages or by sending a snapshot.
A typical example would be a delta-based market-data feed, where clients must acquire the current state of the order-book, on which subsequent deltas can be applied.
Such problems are easily solved using a reliable, connection-oriented protocol like TCP:
Packed data-structures are important for message encodings where smaller sizes require less bandwidth, decrease transmit times, and help to avoid fragmentation when message-oriented protocols such as UDP are used.
This is often in contrast with design considerations for internal data-structures. Concurrent data-structures such as Events in the Disruptor pattern, for example, are often padded to avoid false sharing and to adhere to the single-writer principle.
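The contrast can be illustrated with two layouts of the same fields. This is a sketch only: the `WireTick` and `PaddedSlot` types are hypothetical, `#pragma pack` is a compiler extension (supported by GCC, Clang and MSVC) rather than standard C++, and the 64-byte cache line is an assumption that holds for many, but not all, processors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Wire format: packed so that no padding bytes are transmitted.
#pragma pack(push, 1)
struct WireTick {
    std::uint64_t seq;
    std::uint32_t price;
    std::uint8_t side;
};
#pragma pack(pop)

static_assert(sizeof(WireTick) == 13, "8 + 4 + 1 bytes, with no padding");

// In-memory slot: aligned to an assumed 64-byte cache line, so that adjacent
// slots written by different threads do not share a line (false sharing).
struct alignas(64) PaddedSlot {
    std::uint64_t seq;
    std::uint32_t price;
    std::uint8_t side;
};

static_assert(sizeof(PaddedSlot) == 64, "size is rounded up to the alignment");
static_assert(alignof(PaddedSlot) == 64, "each slot starts on its own cache line");
```

The packed form uses 13 bytes per message on the wire, while the padded form spends 64 bytes per slot in memory in exchange for isolation between writers.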