I never thought about that but I think you're absolutely right! In hindsight it's a glaring oversight to offer a stream API without the ability to flush the buffer.
The API should have been message-oriented from the start. This would avoid having the network stack try to compensate for the behavior of the application layer. Then Nagle's algorithm or something like it would just be a library available for applications that might need it.
The stream API is just as annoying on the receiving end, especially when wrapping (like TLS) is involved. Basically you have to code your layers as if the underlying network is handing you a byte at a time, and the application has to try to figure out where the message boundaries are, which adds a great deal of complexity.
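To make the receive-side complexity concrete, here is a minimal sketch of the kind of incremental decoder each layer ends up carrying. The 4-byte big-endian length prefix and the `FrameDecoder` name are assumptions chosen for illustration, not something the stream API gives you.

```python
import struct


class FrameDecoder:
    """Reassembles length-prefixed messages from arbitrary stream chunks.

    Hypothetical helper: assumes each message is preceded by a 4-byte
    big-endian unsigned length.
    """

    HEADER_SIZE = 4

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes) -> list:
        """Add received bytes and return every message completed so far."""
        self._buf += data
        messages = []
        while len(self._buf) >= self.HEADER_SIZE:
            (length,) = struct.unpack_from("!I", self._buf, 0)
            end = self.HEADER_SIZE + length
            if len(self._buf) < end:
                break  # body not fully arrived yet; wait for more bytes
            messages.append(bytes(self._buf[self.HEADER_SIZE:end]))
            del self._buf[:end]
        return messages
```

The decoder has to behave the same whether it is fed one byte or several whole messages per call, which is exactly the bookkeeping a message-oriented API would hide.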
The problem is that this is not in practice quite what most applications need, but the Internet evolved towards UDP and TCP only.
So you can have message-based if you want, but then you have to do sequencing, gap filling or flow control yourself, or you can have the overkill reliable byte stream with limited control or visibility at the application level.
I’m not suggesting exposing retransmission, fragmentation, etc. to the API user.
The sender provides n bytes of data (a message) to the network stack. The receiver API provides the user with the block of n bytes (the message) as part of an atomic operation. Optionally the sender can be notified when the n bytes have been delivered to the receiver.
Very well said. I think there is enormous complexity in many layers because we don't have that building block easily available.
But yeah, where that's unnecessary, it's probably just as easy to have a 4-byte length prefix, since TCP handles the checksum and retransmit and everything for you.
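For reference, a rough sketch of the 4-byte length prefix over a blocking TCP socket. The helper names (`send_message`, `recv_exact`, `recv_message`) are made up for this example, and big-endian byte order is an assumption.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian unsigned length (assumed format)


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one message as length prefix + payload in a single sendall."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> bytes:
    """Read one length-prefixed message."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

Real code would probably also want a maximum message size so that a bad prefix can't make the receiver wait for gigabytes.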
TCP_CORK is a rather kludgey alternative.
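For anyone who hasn't used it, this is roughly what the two knobs look like from Python. `TCP_NODELAY` turns Nagle's algorithm off; `TCP_CORK` is Linux-specific, so the sketch falls back to a single joined write where it isn't available. The function names are hypothetical.

```python
import socket


def set_nodelay(sock: socket.socket, enabled: bool = True) -> None:
    """Disable (or re-enable) Nagle's algorithm on a TCP socket."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if enabled else 0)


def send_corked(sock: socket.socket, parts) -> None:
    """Send several pieces while corked, then uncork to flush.

    TCP_CORK only exists on Linux; elsewhere this joins the parts and
    sends them with one call instead.
    """
    cork = getattr(socket, "TCP_CORK", None)
    if cork is None:
        sock.sendall(b"".join(parts))
        return
    sock.setsockopt(socket.IPPROTO_TCP, cork, 1)
    try:
        for part in parts:
            sock.sendall(part)
    finally:
        # Removing the cork is the closest thing to an explicit flush.
        sock.setsockopt(socket.IPPROTO_TCP, cork, 0)
```

The kludge is visible in the shape of the code: the "flush" is a side effect of toggling a socket option rather than an operation of its own.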
The same issue exists with file IO. Writing via an in-process buffer (the default behavior of stdio and quite a few programming languages) is not interchangeable with unbuffered writes: with a buffer, it’s okay to do many small writes, but you cannot assume that the data will ever actually be written until you flush.
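A small Python sketch of that difference, using a temporary file and an explicit 4096-byte buffer (both chosen just for the demonstration; `buffered_write_sizes` is a made-up name):

```python
import os
import tempfile


def buffered_write_sizes(payload: bytes = b"hello"):
    """Return the on-disk file size before and after flush()."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb", buffering=4096) as f:
            f.write(payload)                 # stays in the in-process buffer
            before = os.path.getsize(path)   # nothing has reached the file yet
            f.flush()                        # hand the bytes to the OS
            after = os.path.getsize(path)
        return before, after
    finally:
        os.remove(path)
```

With a payload smaller than the buffer, the size before the flush is 0 and only becomes the payload length afterwards. Note that `flush()` only moves the data from the process to the OS; getting it onto disk would additionally need `os.fsync`.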
I’m a bit disappointed that Zig’s fancy new IO system pretends that buffered and unbuffered IO are two implementations of the same thing.
Seems like there's been a disconnect between users and kernel developers here?
For stuff where no answer is required, Nagle's algorithm works very well for me, but many TCP channels are mixed-use these days. They send messages that expect a fast answer and others that are more asynchronous (from a user's point of view, not a programmer's).
Wouldn't it be nice if all operating systems, (home) routers, firewalls and programming languages had high-quality implementations of something like SCTP...