Why do I not see MSG_EOR for SOCK_SEQPACKET on linux?

With SOCK_SEQPACKET unix domain sockets the only way for the message to be cut short is if the buffer you give to recvmsg() isn't big enough (and in that case you'll get MSG_TRUNC).

POSIX says that SOCK_SEQPACKET sockets must set MSG_EOR at the end of a record, but Linux unix domain sockets don't.

(Refs: POSIX 2008 2.10.10 says SOCK_SEQPACKET must support records, and 2.10.6 says record boundaries are visible to the receiver via the MSG_EOR flag.)

What a 'record' means for a given protocol is up to the implementation to define.

If Linux did implement MSG_EOR for unix domain sockets, I think the only sensible way would be to say that each packet was a record in itself, and so always set MSG_EOR (or maybe always set it when not setting MSG_TRUNC), so it wouldn't be informative anyway.
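If you want to check this on your own kernel, here is a minimal sketch, assuming a Linux system where socketpair() supports AF_UNIX with SOCK_SEQPACKET. The helper names seqpacket_roundtrip() and report_flags() are made up for this example. It sends one message, receives it into a buffer of a chosen size, and reports which of MSG_TRUNC and MSG_EOR came back in msg_flags.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send `msg` over a fresh AF_UNIX SOCK_SEQPACKET
 * socketpair and receive it into a buffer of `buflen` bytes.
 * Returns the byte count from recvmsg() (or -1 on error) and stores the
 * received msg_flags in *flags_out. */
ssize_t seqpacket_roundtrip(const char *msg, size_t buflen, int *flags_out)
{
    char buf[256];
    int sv[2];
    ssize_t n = -1;

    assert(buflen > 0 && buflen <= sizeof buf);
    if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) == -1)
        return -1;

    if (send(sv[0], msg, strlen(msg), 0) == (ssize_t)strlen(msg)) {
        struct iovec iov = { .iov_base = buf, .iov_len = buflen };
        struct msghdr mh = { .msg_iov = &iov, .msg_iovlen = 1 };

        n = recvmsg(sv[1], &mh, 0);
        if (n >= 0)
            *flags_out = mh.msg_flags;
    }
    close(sv[0]);
    close(sv[1]);
    return n;
}

/* Print which of the two flags discussed above were reported. */
void report_flags(const char *label, ssize_t n, int flags)
{
    printf("%s: %zd bytes, MSG_TRUNC=%d, MSG_EOR=%d\n", label, n,
           (flags & MSG_TRUNC) != 0, (flags & MSG_EOR) != 0);
}
```

Calling seqpacket_roundtrip("hello, world", 5, &flags) should report MSG_TRUNC, and calling it with a 64-byte buffer should not. Whether MSG_EOR shows up in either case is exactly the thing to observe with report_flags().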


When you read the docs, SOCK_SEQPACKET differs from SOCK_STREAM in two distinct ways. Firstly -

Sequenced, reliable, two-way connection-based data transmission path for datagrams of fixed maximum length; a consumer is required to read an entire packet with each input system call.

-- socket(2) from Linux manpages project

aka

For message-based sockets, such as SOCK_DGRAM and SOCK_SEQPACKET, the entire message shall be read in a single operation. If a message is too long to fit in the supplied buffers, and MSG_PEEK is not set in the flags argument, the excess bytes shall be discarded, and MSG_TRUNC shall be set in the msg_flags member of the msghdr structure.

-- recvmsg() in POSIX standard.

In this sense it is similar to SOCK_DGRAM.
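Since the excess bytes are discarded, it helps to know the message size before reading it. A sketch of one way to do that on Linux: per recv(2), passing MSG_TRUNC as an input flag makes recv() return the real length of the message for UNIX datagram and sequenced-packet sockets (since Linux 3.4), and MSG_PEEK leaves the message queued. This is Linux-specific behaviour, not something POSIX promises, and the helper names next_message_size() and read_whole_message() are invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: length of the next queued message, without
 * consuming it. Relies on the Linux meaning of MSG_TRUNC as an input flag. */
ssize_t next_message_size(int fd)
{
    char probe;
    return recv(fd, &probe, sizeof probe, MSG_PEEK | MSG_TRUNC);
}

/* Hypothetical helper: read one whole message into a malloc'd buffer sized
 * from the peek above. The caller frees the result. Returns NULL on error. */
char *read_whole_message(int fd, size_t *len_out)
{
    assert(len_out != NULL);

    ssize_t need = next_message_size(fd);
    if (need < 0)
        return NULL;

    char *buf = malloc(need > 0 ? (size_t)need : 1);
    if (buf == NULL)
        return NULL;

    /* The buffer is large enough, so nothing is truncated or discarded. */
    ssize_t got = recv(fd, buf, (size_t)need, 0);
    if (got != need) {
        free(buf);
        return NULL;
    }
    *len_out = (size_t)got;
    return buf;
}
```

This assumes a single reader per socket; with several readers another thread could take the message between the peek and the read.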

Secondly each "datagram" (Linux) / "message" (POSIX) carries a flag called MSG_EOR.

However Linux SOCK_SEQPACKET for AF_UNIX does not implement MSG_EOR. The current docs do not match reality :-)


Allegedly some SOCK_SEQPACKET implementations provide only the second behaviour (MSG_EOR, without whole-packet reads), and some implement both. So that covers all the possible combinations :-)

[1] Packet oriented protocols generally use packet level reads with truncation / discard semantics and no MSG_EOR. X.25, Bluetooth, IRDA, and Unix domain sockets use SOCK_SEQPACKET this way.

[2] Record oriented protocols generally use byte stream reads and MSG_EOR - no packet level visibility, no truncation / discard. DECNet and ISO TP use SOCK_SEQPACKET that way.

[3] Packet / record hybrids generally use SOCK_SEQPACKET with truncation / discard semantics on the packet level, and record terminating packets marked with MSG_EOR. SPX and XNS SPP use SOCK_SEQPACKET this way.

https://mailarchive.ietf.org/arch/msg/tsvwg/9pDzBOG1KQDzQ2wAul5vnAjrRkA

You've shown an example of paragraph 1.

Paragraph 2 also applies to SOCK_SEQPACKET as defined for SCTP, although by default SCTP sets MSG_EOR on every sendmsg(). The option to disable this is called SCTP_EXPLICIT_EOR.

Paragraph 3, the one most consistent with the docs, seems to be the most obscure case.

And even the docs are not properly consistent with themselves.

The SOCK_SEQPACKET socket type is similar to the SOCK_STREAM type, and is also connection-oriented. The only difference between these types is that record boundaries are maintained using the SOCK_SEQPACKET type. A record can be sent using one or more output operations and received using one or more input operations, but a single operation never transfers parts of more than one record. Record boundaries are visible to the receiver via the MSG_EOR flag in the received message flags returned by the recvmsg() function. -- POSIX standard


That's not what MSG_EOR is for.

Remember that the sockets API is an abstraction over a number of different protocols, including UNIX filesystem sockets, socketpairs, TCP, UDP, and many many different network protocols, including X.25 and some entirely forgotten ones.

MSG_EOR is there to signal end of record where that makes sense for the underlying protocol. I.e. it passes a message to the next layer down that "this completes a record". This may affect buffering, for example by causing a buffer to be flushed. But if the protocol itself doesn't have a concept of a "record", there is no reason to expect the flag to be propagated.

Secondly, if using SEQPACKET you must read the entire message at once. If you do not, the remainder will be discarded. That's documented. In particular, MSG_EOR is not a flag to tell you that this is the last part of the packet.

Advice: You are obviously writing a non-SEQPACKET version for use on macOS. I suggest you dump the SEQPACKET version, as it is only going to double the maintenance and coding burden. SOCK_STREAM is fine for all platforms.
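If you do go with SOCK_STREAM, you have to mark message boundaries yourself, since a stream has none. One common approach (my assumption, not something stated above) is a 4-byte big-endian length prefix in front of each payload. A minimal sketch, with send_frame() and recv_frame() as made-up names:

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; fails if the peer closes before that. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n == 0)
            return -1;              /* EOF in the middle of a frame */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one frame: 4-byte big-endian length, then the payload. */
int send_frame(int fd, const void *payload, uint32_t len)
{
    assert(payload != NULL || len == 0);
    uint32_t be = htonl(len);
    if (write_all(fd, &be, sizeof be) < 0)
        return -1;
    return write_all(fd, payload, len);
}

/* Receive one frame into buf. Returns the payload length, or -1 on error
 * or if the frame does not fit in cap bytes. */
ssize_t recv_frame(int fd, void *buf, size_t cap)
{
    uint32_t be;
    if (read_all(fd, &be, sizeof be) < 0)
        return -1;
    uint32_t len = ntohl(be);
    if (len > cap)
        return -1;
    if (read_all(fd, buf, len) < 0)
        return -1;
    return (ssize_t)len;
}
```

Unlike SEQPACKET, an oversized frame here is an error rather than a silent truncation, and the stream is out of sync afterwards, so the caller would normally close the connection in that case.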

Tags:

Linux

C++

C