Super High-Performance Networking Code

January 4th, 2010

I’ve run into the same faulty assumption about how TCP works in several of the code bases I’ve maintained. My colleague Jeff and perhaps soon-to-be colleague Dane have run into it more than once too, and have dubbed it “Super High-Performance Networking Code.”

In Super High-Performance Networking Code, the sending side might look something like this:

    void OnFooEvent(Event e)
    {
        char *ptr = e.GetVariableLengthPayloadDataAsBytes();
        int len = e.GetLengthOfPayloadData();
        int rc = send(m_socket, ptr, len, 0);
        // handle error.
    }

… and the receiving side might look like this …

    void ReceiveLoop()
    {
        while (!m_bDone)
        {
            char buf[REALLY_BIG];
            int len = recv(m_socket, buf, sizeof(buf), 0);
            if (len > 0)
            {
                HandleExactlyOneFooEventWithVariableLengthData(buf, len);
            }
            else
            {
                // 0: the peer closed the connection; < 0: error. Stop either way.
                m_bDone = true;
            }
        }
    }

The fatal assumption is that read sizes on the receiving end are coincident with the write sizes on the sending end: that each recv() returns exactly the bytes handed to one send().

TCP is a stream-oriented protocol: it delivers an ordered sequence of bytes and makes NO guarantee that the read and write sizes are coincident. In fact, it has a few mechanisms which will actively thwart this. Nagle’s algorithm can hold back small writes and combine them into a single segment. Also, every link has an MTU, and data that doesn’t fit in one packet will be split across several for delivery (although without affecting the data contained within or the guaranteed order of delivery). The endpoints negotiate a maximum segment size and can discover the smallest MTU along a route, though this is intended to let them adapt for maximum throughput and is not a guarantee that splitting won’t happen. It’s more like a guarantee that splitting won’t happen frequently. Additionally, routers and other devices along the path may fragment packets, and the receiving network stack buffers whatever has arrived, so a single read can return part of one write or several writes run together.

Unfortunately, this assumption holds often enough with small packet sizes to lull some developers into believing they’ve confirmed that it is sound.

If you are making this assumption, you need to redesign your wire protocol to include a mechanism for delineating the data, such as sending the data size up front or using some sort of end-of-data marker. You then need to accumulate the reads on the receiving side until you’ve reached the marker or the expected data size.
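As a rough illustration of the size-up-front approach, here is a minimal sketch. The 4-byte big-endian length header and the names `EncodeFrame` and `FrameDecoder` are assumptions made up for this example, not part of any existing code base, and the end-of-data marker alternative is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical wire format: 4-byte big-endian payload length, then the payload.
constexpr std::size_t kHeaderSize = 4;

// Sender side: prepend the payload length so the receiver can find the boundary.
std::string EncodeFrame(const std::string& payload)
{
    assert(payload.size() <= 0xFFFFFFFFu);  // length must fit in the header
    const std::uint32_t len = static_cast<std::uint32_t>(payload.size());
    std::string frame;
    frame.reserve(kHeaderSize + payload.size());
    for (int shift = 24; shift >= 0; shift -= 8)
        frame.push_back(static_cast<char>((len >> shift) & 0xFF));
    frame += payload;
    return frame;
}

// Receiver side: accumulates whatever each recv() returned and hands back
// complete payloads only.
class FrameDecoder
{
public:
    // Append the bytes from one recv() call. The chunk may hold part of a
    // frame, exactly one frame, or several frames back to back.
    void Append(const char* data, std::size_t len)
    {
        m_buffer.append(data, len);
    }

    // Extract the next complete payload into 'payload' and return true,
    // or return false if more bytes are needed first.
    bool NextFrame(std::string& payload)
    {
        if (m_buffer.size() < kHeaderSize)
            return false;  // header not complete yet

        std::uint32_t len = 0;
        for (std::size_t i = 0; i < kHeaderSize; ++i)
            len = (len << 8) | static_cast<unsigned char>(m_buffer[i]);

        if (m_buffer.size() - kHeaderSize < len)
            return false;  // payload not complete yet

        payload.assign(m_buffer, kHeaderSize, len);
        m_buffer.erase(0, kHeaderSize + len);
        return true;
    }

private:
    std::string m_buffer;  // bytes received but not yet consumed
};
```

In the receive loop above, this would mean calling `decoder.Append(buf, len)` after each successful `recv`, then calling `NextFrame` in a loop and handling one event per returned payload. A real protocol would probably also cap the accepted length, so that a corrupt or hostile header cannot make the buffer grow without bound.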
