Sortix volatile manual
This manual documents Sortix volatile, a development build that has not been officially released.
UDP(4) | Device Drivers Manual | UDP(4) |
NAME
udp - user datagram protocol
SYNOPSIS
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/udp.h>

int
socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
DESCRIPTION
The User Datagram Protocol (UDP) is a connectionless transport layer for the Internet Protocol ip(4) that provides best-effort delivery of datagrams. It is designed for packet-switched networks and provides multiplexing with a 16-bit port number, basic data integrity checks (16-bit ones' complement sum), and broadcasting. It does not guarantee delivery, prevent duplicate delivery, preserve ordering, or provide out-of-band data or flow control. UDP provides the SOCK_DGRAM abstraction for the inet(4) protocol family.
UDP sockets are made with socket(2) by passing an appropriate domain (AF_INET), SOCK_DGRAM as the type, and 0 or IPPROTO_UDP as the protocol. Initially a socket is not bound, it won't receive datagrams, and it does not have a remote address and port set.
A UDP socket has the following state:
- The address family it belongs to.
- The network interface it is bound to (if any) (SO_BINDTODEVICE and SO_BINDTOINDEX) (initially none).
- The local address and port (when bound) (initially none).
- The remote address and port (when connected) (initially none).
- A receive queue (initially empty).
- Whether the socket has been shutdown(2) for read and/or write (initially neither).
- A single pending asynchronous error (if any) (SO_ERROR) (initially none).
- Whether broadcast datagrams can be sent (SO_BROADCAST) (initially no).
- Whether binding to the any address and a port doesn't conflict with binding to another address on the same port (SO_REUSEADDR) (initially no).
- Limits on the size of the receive and send queues (SO_RCVBUF and SO_SNDBUF).
Datagrams are sent as a packet with a header and the datagram itself. The header contains the source port, the destination port, the checksum, and the packet's length. The length is a 16-bit value, allowing the packet to be up to 65535 bytes. The header is 8 bytes, allowing the maximum datagram size of 65527 bytes. However, the actual maximum datagram size may be smaller, as the network layer and link layer, as well as the path to the destination host, will add their own headers and maximum transmission unit (MTU) restrictions.
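The size arithmetic above can be sketched in C. This is only an illustration: the helper names are made up for this example, and the mtu and ip_header_size parameters are assumed inputs that a real program would have to obtain from the network layer.

```c
#include <stddef.h>

#define UDP_HEADER_SIZE 8u     /* source port, destination port, length, checksum */
#define UDP_MAX_PACKET  65535u /* largest value of the 16-bit length field */

/* Largest datagram that the 16-bit UDP length field can describe. */
static size_t udp_max_datagram(void)
{
	return UDP_MAX_PACKET - UDP_HEADER_SIZE;
}

/* Largest datagram that fits in a single packet of at most mtu bytes when
   the network layer header takes ip_header_size bytes (both are assumed
   inputs, for instance 1500 and 20 for IPv4 without options on Ethernet). */
static size_t udp_datagram_for_mtu(size_t mtu, size_t ip_header_size)
{
	if ( mtu < ip_header_size + UDP_HEADER_SIZE )
		return 0;
	size_t room = mtu - ip_header_size - UDP_HEADER_SIZE;
	size_t limit = udp_max_datagram();
	return room < limit ? room : limit;
}
```

With these assumptions, udp_max_datagram() yields 65527 and udp_datagram_for_mtu(1500, 20) yields 1472.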
Port numbers are 16-bit and range from 1 to 65535. Port 0 is not a valid port. Binding to port 0 will instead assign an available port on the requested address. Sending or connecting to port 0 will fail with EADDRNOTAVAIL. Received packets whose source or destination port is 0 will be silently dropped. UDP ports are distinct from ports in other transport layer protocols.
Packets contain a 16-bit ones' complement checksum by default. Unless the packet has no checksum, a received packet will be silently discarded if its checksum does not match its contents.
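As an illustration of the 16-bit ones' complement sum, the following sketch computes the generic Internet checksum over a byte buffer in the style of RFC 1071. It is not the Sortix implementation. The real UDP checksum, as specified in RFC 768, also covers a pseudo-header taken from the network layer, which is left out here, and the function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of big-endian 16-bit words. An odd trailing byte is
   padded with a zero byte. The 32-bit accumulator is sufficient for the
   buffer sizes that occur in UDP (at most 65535 bytes). */
static uint16_t ones_complement_sum(const uint8_t* data, size_t len)
{
	uint32_t sum = 0;
	for ( size_t i = 0; i + 1 < len; i += 2 )
		sum += (uint32_t) data[i] << 8 | data[i + 1];
	if ( len & 1 )
		sum += (uint32_t) data[len - 1] << 8;
	/* Fold the carries back into the low 16 bits. */
	while ( sum >> 16 )
		sum = (sum & 0xFFFF) + (sum >> 16);
	return (uint16_t) sum;
}

/* The checksum is the complement of the sum. */
static uint16_t internet_checksum(const uint8_t* data, size_t len)
{
	return (uint16_t) ~ones_complement_sum(data, len);
}
```

A receiver can verify a buffer that includes its checksum by summing all of it: the ones' complement sum is 0xFFFF when the checksum matches the contents.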
Sockets can be bound to a local address and port with bind(2) (if not already bound), or a local address and port will be automatically assigned on the first send or connect operation. The local address and port can be read with getsockname(2). If the socket hasn't been bound, the local address and port are reported as the any address on port 0. Binding to a well-known port (port 1 through port 1023) requires superuser privileges.
Sockets can be bound to the any address, the broadcast address, the address of a network interface, or the broadcast address of a network interface. Binding to port 0 will automatically assign an available port on the requested local address or fail with EAGAIN if no port is available. No two sockets can bind to the same local address and port. No two sockets can be bound such that one is bound to the any address and a port, and the other socket is bound to another address and the same port, unless both sockets had the SO_REUSEADDR socket option set when the second socket was bound, and the current user is the same user that bound the first socket or the current user has superuser privileges.
A socket bound to a local address and port will receive an incoming datagram if the following conditions hold:
- The datagram belongs to the socket's address family and the protocol is UDP;
- The datagram's checksum matches the datagram or it has no checksum;
- The datagram is not sent from port 0 and is not sent to port 0;
- The datagram is sent to the address or broadcast address of the network interface it is received on, or the datagram was sent to the broadcast address;
- The socket is either bound to the receiving network interface, or the socket is not bound to a network interface;
- The datagram is sent to the socket's local port;
- The datagram is sent to the socket's local address, or the socket's local address is the any address (and no other socket is bound to the datagram's address and that port);
- The socket is connected and the datagram was sent from the remote address and the remote port, or the socket is not connected; and
- The socket is not shut down for reading.
If so, the datagram is added to the socket's receive queue, otherwise it is discarded. The receive queue contains incoming packets waiting to be received. Incoming packets are dropped if the receive queue is full. Shrinking the receive queue limit drops packets as needed to stay below the limit.
The remote address and port can be set multiple times with connect(2), after which the socket is said to be connected, but UDP is connectionless and no handshake is sent. The remote port must not be port 0 or the connection will fail with EADDRNOTAVAIL. If the socket is not bound, connect(2) will determine which network interface will be used to send to the remote address, and then bind to the address of that network interface together with an available port. connect(2) will fail if there is no route from the local address to the requested remote address. A connected socket only receives datagrams from the remote address and port. connect(2) will drop datagrams in the receive queue that don't originate from the requested remote address. The send(2), write(2), and writev(2) functions can be used on a connected socket and they send to the remote address and port by default. If the socket is connected, the destination given to sendto(2) and sendmsg(2) must be NULL. The remote address and port can be read with getpeername(2).
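The connect(2) and getpeername(2) usage described above might look as follows for AF_INET. This is a minimal sketch: the helper names udp_connect_ipv4 and udp_peer_port are made up for this example, and error handling is reduced to returning -1.

```c
#include <sys/socket.h>

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Create a UDP socket and set its remote address and port. No packet is
   sent, as UDP has no handshake. Returns the socket or -1 on error. */
static int udp_connect_ipv4(const char* address, uint16_t port)
{
	struct sockaddr_in remote;
	memset(&remote, 0, sizeof(remote));
	remote.sin_family = AF_INET;
	remote.sin_port = htons(port);
	if ( inet_pton(AF_INET, address, &remote.sin_addr) != 1 )
		return -1;
	int fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
	if ( fd < 0 )
		return -1;
	if ( connect(fd, (const struct sockaddr*) &remote, sizeof(remote)) < 0 )
	{
		close(fd);
		return -1;
	}
	return fd;
}

/* The remote port as reported by getpeername(2), or -1 on error (such as
   when the socket is not connected). */
static int udp_peer_port(int fd)
{
	struct sockaddr_in peer;
	socklen_t peer_len = sizeof(peer);
	if ( getpeername(fd, (struct sockaddr*) &peer, &peer_len) < 0 )
		return -1;
	return ntohs(peer.sin_port);
}
```

A socket returned by udp_connect_ipv4 can then be used with send(2) or write(2) without naming a destination.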
The socket can be disconnected by connecting to a socket address with the family value set to AF_UNSPEC, which resets the remote address and port. Disconnecting a socket that is not connected succeeds but has no effect.
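Disconnecting with AF_UNSPEC could be written as in the sketch below. The helper names are made up for this example, and the COMPATIBILITY section lists systems where disconnecting is not implemented or behaves differently.

```c
#include <sys/socket.h>

#include <stdbool.h>
#include <string.h>

/* Reset the remote address and port by connecting to an AF_UNSPEC socket
   address. Returns 0 on success or -1 on error. */
static int udp_disconnect(int fd)
{
	struct sockaddr unspec;
	memset(&unspec, 0, sizeof(unspec));
	unspec.sa_family = AF_UNSPEC;
	return connect(fd, &unspec, sizeof(unspec));
}

/* Whether getpeername(2) reports a remote address for the socket. */
static bool udp_is_connected(int fd)
{
	struct sockaddr_storage peer;
	socklen_t peer_len = sizeof(peer);
	return getpeername(fd, (struct sockaddr*) &peer, &peer_len) == 0;
}
```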
Datagrams can be sent with sendmsg(2) and sendto(2). Sending on an unbound socket will bind to the any address and an available port, or fail with EAGAIN if no port is available. Datagrams can be received with recvmsg(2), recvfrom(2), recv(2), read(2), and readv(2). If an asynchronous error is pending, the next send or receive operation will fail with that error and clear the asynchronous error, so the following operation can succeed. Asynchronous errors can arise from network problems. There is no send queue at the UDP level and datagrams are directly forwarded to the network layer. It is an error to use any of the flags MSG_CMSG_CLOEXEC, MSG_CMSG_CLOFORK, MSG_EOR, MSG_OOB, and MSG_WAITALL.
The condition of the socket can be tested with poll(2) where POLLIN signifies a packet has been received (or the socket is shut down for reading), POLLOUT signifies a packet can be sent now (and the socket is not shut down for writing), POLLHUP signifies the socket is shut down for writing, and POLLERR signifies an asynchronous error is pending.
The socket can be shut down for receiving and/or sending with shutdown(2). The receive queue is emptied when shut down for receive (asynchronous errors are preserved) and receive operations will succeed with an end of file condition, but any pending asynchronous errors will take precedence and be delivered instead. Sending when shut down for writing will raise SIGPIPE and fail with EPIPE (regardless of a pending asynchronous error).
Socket options can be set with setsockopt(2) and read with getsockopt(2). They exist at the IPPROTO_UDP level as well as at the applicable underlying protocol levels.
Broadcast datagrams can be sent by setting the SO_BROADCAST socket option with setsockopt(2) and sending to a broadcast address of the network layer.
SOCKET OPTIONS
UDP sockets support these setsockopt(2) / getsockopt(2) options at level SOL_SOCKET:
SO_BINDTODEVICE char[] - Bind to a network interface by its name. (Described in if(4))
SO_BINDTOINDEX unsigned int - Bind to a network interface by its index number. (Described in if(4))
SO_BROADCAST int - Whether sending to a broadcast address is allowed. (Described in if(4))
SO_DEBUG int - Whether the socket is in debug mode. This option is not implemented and is initially 0. Attempting to set it to non-zero will fail with EPERM. (Described in if(4))
SO_DOMAIN sa_family_t - The socket domain (the address family). This option can only be read. (Described in if(4))
SO_DONTROUTE int - Whether to bypass the routing table and only send on the local network. This option is not implemented and is initially 0. Attempting to set it to non-zero will fail with EPERM. (Described in if(4))
SO_ERROR int - The asynchronous pending error (an errno(3) value). Cleared to 0 when read. This option can only be read. (Described in if(4))
SO_PROTOCOL int - The socket protocol (IPPROTO_UDP). This option can only be read. (Described in if(4))
SO_RCVBUF int - How many bytes the receive queue can use (default is 64 pages, max 4096 pages). (Described in if(4))
SO_REUSEADDR int - Whether binding to the any address on a port doesn't conflict with binding to another address and the same port, if both sockets have this option set and the user binding the second socket is the same that bound the first socket or the user binding the second socket has superuser privileges. (Described in if(4))
SO_SNDBUF int - How many bytes the send queue can use (default is 64 pages, max 4096 pages). (Described in if(4))
SO_TYPE int - The socket type (SOCK_DGRAM). This option can only be read. (Described in if(4))
UDP sockets currently implement no setsockopt(2) / getsockopt(2) options at level IPPROTO_UDP.
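Reading one of the int-valued SOL_SOCKET options listed above could look like the following sketch. The helper name socket_int_option is made up for this example, and it only suits options whose value is an int.

```c
#include <sys/socket.h>

/* Read an int-valued SOL_SOCKET option such as SO_TYPE, SO_ERROR, or
   SO_RCVBUF into *value. Returns 0 on success or -1 on error. */
static int socket_int_option(int fd, int option, int* value)
{
	socklen_t value_len = sizeof(*value);
	return getsockopt(fd, SOL_SOCKET, option, value, &value_len);
}
```

On a UDP socket, reading SO_TYPE this way is expected to give SOCK_DGRAM, and reading SO_ERROR gives the pending asynchronous error (0 if none) and clears it.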
IMPLEMENTATION NOTES
There is no way to disable the checksum on sent packets; however, the checksum of received packets without a checksum will not be verified.
Each packet currently uses a page of memory, which counts towards the receive queue limit.
If no specific port is requested, one is randomly selected in the dynamic port range 32768 (inclusive) through 61000 (exclusive).
EXAMPLES
This example creates and binds a UDP socket to a local address and port and sends a broadcast datagram to a remote address and port and receives a response and remembers who sent the response. local is the local socket address that is bound to and local_len is the size of the local socket address and likewise with remote and remote_len. responder is an uninitialized socket address of the appropriate size responder_len for the protocol family af where the source address of the response is stored. The response is stored in the incoming array of size amount. The af, local, local_len, remote, remote_len, responder, and responder_len values should all be chosen according to the address family and network layer.
#include <sys/socket.h>
#include <netinet/in.h>
#include <err.h>
#include <string.h>

sa_family_t af = /* ... */;
const struct sockaddr *local = /* ... */;
socklen_t local_len = /* ... */;
const struct sockaddr *remote = /* ... */;
socklen_t remote_len = /* ... */;
/* Not const, as recvfrom(2) stores the source address here. */
struct sockaddr *responder = /* ... */;
socklen_t responder_len = /* ... */;

int fd = socket(af, SOCK_DGRAM, IPPROTO_UDP);
if (fd < 0)
	err(1, "socket");
if (bind(fd, local, local_len) < 0)
	err(1, "bind");
/* Allow sending to a broadcast address. */
int value = 1;
if (setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &value, sizeof(value)) < 0)
	err(1, "setsockopt");
char outgoing[] = "Hello";
if (sendto(fd, outgoing, strlen(outgoing), 0, remote, remote_len) < 0)
	err(1, "sendto");
char incoming[1024];
ssize_t amount = recvfrom(fd, incoming, sizeof(incoming), 0,
                          responder, &responder_len);
if (amount < 0)
	err(1, "recvfrom");
COMPATIBILITY
Sortix is the only known system where connect(2) will remove datagrams from the wrong source from the receive queue. All other systems will deliver datagrams already present in the receive queue, even if from the wrong source, despite the POSIX requirement that connect(2) “limits the remote sender for subsequent recv() functions”. Software for affected systems must either first empty the receive queue after connect(2), or use recvmsg(2) and validate the source address rather than rely on the kernel validation.
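On systems that deliver already queued datagrams from the wrong source, the source validation mentioned above might be done as in this sketch for AF_INET. It uses recvfrom(2) rather than recvmsg(2) for brevity, the helper names are made up for this example, and it does not handle a socket that is shut down for reading.

```c
#include <sys/socket.h>
#include <sys/types.h>

#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>

/* Whether two IPv4 socket addresses have the same address and port. */
static bool same_ipv4_endpoint(const struct sockaddr_in* a,
                               const struct sockaddr_in* b)
{
	return a->sin_family == AF_INET && b->sin_family == AF_INET &&
	       a->sin_port == b->sin_port &&
	       a->sin_addr.s_addr == b->sin_addr.s_addr;
}

/* Receive the next datagram sent from the expected remote address and port,
   discarding datagrams from any other source. Returns the size of the
   datagram or -1 on error. */
static ssize_t recv_from_expected(int fd, void* buffer, size_t size,
                                  const struct sockaddr_in* expected)
{
	while ( true )
	{
		struct sockaddr_in source;
		socklen_t source_len = sizeof(source);
		ssize_t amount = recvfrom(fd, buffer, size, 0,
		                          (struct sockaddr*) &source, &source_len);
		if ( amount < 0 )
			return -1;
		if ( sizeof(source) <= (size_t) source_len &&
		     same_ipv4_endpoint(&source, expected) )
			return amount;
		/* The datagram came from another source, so wait for the next. */
	}
}
```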
sendto(2) or sendmsg(2) on a connected socket must have the destination be NULL (the default destination) on Sortix, FreeBSD, Haiku, macOS, NetBSD, OpenBSD, and SunOS; but the destination can be NULL or any address on DragonFly, GNU/Hurd, Linux, and Minix.
Socket disconnect is implemented on Sortix, DragonFly, Haiku, GNU/Hurd, Linux, Minix, and SunOS; but socket disconnect is not implemented on FreeBSD, macOS, NetBSD, and OpenBSD. Storing the AF_UNSPEC value in the address family's socket address structure or struct sockaddr is portable to the systems implementing socket disconnect. A socket can be disconnected even if not connected on Sortix, DragonFly, Haiku, GNU/Hurd, Linux, and Minix; but SunOS requires the socket to be connected before it can be disconnected.
The broadcast address can be bound on Sortix, GNU/Hurd, Linux, OpenBSD, and SunOS; but can't be bound on DragonFly, FreeBSD, macOS, Minix and NetBSD.
SO_BROADCAST doesn't need to be enabled to connect(2) to the broadcast address on Sortix, DragonFly, FreeBSD, Haiku, macOS, Minix, NetBSD, OpenBSD, and SunOS; but is required on GNU/Hurd and Linux.
Reconnecting a socket to an address that is not reachable from the local address will fail on Sortix, GNU/Hurd, and Linux. On DragonFly, FreeBSD, Haiku, macOS, NetBSD, OpenBSD, and SunOS, the socket will instead be bound to another address that can reach the remote address (on the same port if possible), even though it is otherwise not possible to bind a socket twice.
connect(2) will not deliver asynchronous errors on Sortix, DragonFly, FreeBSD, Haiku, GNU/Hurd, Linux, and Minix; however it will deliver asynchronous errors on macOS, NetBSD, OpenBSD, and SunOS.
Shutting a socket down for reading will cause receives to return 0 on Sortix, DragonFly, FreeBSD, macOS, Minix, NetBSD, OpenBSD, and SunOS; but receives will fail with EWOULDBLOCK on Linux.
Shutting a socket down for writing will cause sends to raise SIGPIPE and fail with EPIPE on Sortix, DragonFly, FreeBSD, GNU/Hurd, macOS, NetBSD, OpenBSD, and SunOS; but will not raise SIGPIPE and only fail with EPIPE on Linux and Minix.
Sortix, GNU/Hurd, Linux, and Minix will signal POLLIN if a datagram has been received or if shut down for read. DragonFly, FreeBSD, macOS, NetBSD, OpenBSD, and SunOS will signal POLLIN if a datagram has been received, if shut down for read, or if an error is pending.
Sortix and DragonFly will signal POLLOUT if a datagram can be sent, unless the socket has been shut down for write. FreeBSD will signal POLLOUT if a datagram can be sent, unless the socket has been shut down for both read and write. GNU/Hurd will signal POLLOUT if a datagram can be sent, unless the socket has been shut down for write or if an error is pending. Linux, Minix, OpenBSD, and SunOS will signal POLLOUT if a datagram can be sent, regardless of whether the socket has been shut down. macOS will signal POLLOUT if a datagram can be sent, unless the socket has been shut down for either read or write.
Sortix and DragonFly will signal POLLHUP if shut down for write. FreeBSD and Linux will signal POLLHUP if shut down for both read and write. macOS will signal POLLHUP if shut down for either read or write. GNU/Hurd, Minix, NetBSD, OpenBSD, and SunOS will not signal POLLHUP.
Sortix, Haiku, GNU/Hurd, and Linux will signal POLLERR if an error is pending. DragonFly, FreeBSD, macOS, Minix, NetBSD, OpenBSD, and SunOS will not signal POLLERR.
Shutting a socket down for read doesn't work on GNU/Hurd and Linux, where the socket continues to receive datagrams.
Linux delivers asynchronous errors on send, even if shut down for write.
Sockets can be shut down even if not connected on Sortix, DragonFly, Minix, NetBSD, and OpenBSD; but sockets must be connected before they can be shut down on FreeBSD, GNU/Hurd, Linux, macOS, and SunOS.
Connecting to the any address will fail with ENETUNREACH on Sortix. On DragonFly, FreeBSD, Haiku, GNU/Hurd, Linux, macOS, OpenBSD, and SunOS it will succeed and getpeername(2) will report the loopback address (OpenBSD will report the any address).
Connecting to port 0 will fail on Sortix, FreeBSD, macOS, Minix, NetBSD, OpenBSD, and SunOS; but will succeed on DragonFly, Haiku, GNU/Hurd and Linux.
Sortix's handling of SO_REUSEADDR requires the two sockets to be bound by the same user or the second socket to be bound by a user with superuser privileges. It's unclear what other systems also perform this check and when the user identity is captured.
Setting SO_REUSEADDR on both sockets is required on Sortix, Haiku, GNU/Hurd, and Linux; but DragonFly, FreeBSD, Minix, macOS, NetBSD, OpenBSD, and SunOS only require it to be set on the second socket.
Two sockets can't be bound to the same address and port on Sortix, DragonFly, FreeBSD, Haiku, macOS, NetBSD, and OpenBSD; but GNU/Hurd, Linux, Minix, and SunOS allow it when SO_REUSEADDR is set.
ERRORS
Socket operations can fail due to these error conditions, in addition to the error conditions of the network and link layer, and the error conditions of the invoked function.
- [EACCES] - A datagram was sent to a broadcast address, but SO_BROADCAST is turned off.
- [EADDRINUSE] - The socket cannot be bound to the requested address and port because another socket was already bound to 1) the same address and port, 2) the any address and the same port (and SO_REUSEADDR was not set on both sockets), or 3) some address and the same port but the requested address was the any address (and SO_REUSEADDR was not set on both sockets).
- [EADDRNOTAVAIL] - The socket cannot be bound to the requested address because no network interface had that address or broadcast address.
- [EADDRNOTAVAIL] - The socket was connected to port 0, or a datagram was sent to port 0.
- [EAGAIN] - A port could not be assigned because each port in the dynamic port range had already been bound to a socket in a conflicting manner.
- [ECONNREFUSED] - The destination host of a datagram was not listening on the port. This error can happen asynchronously.
- [EHOSTDOWN] - The destination host of a datagram is not up. This error can happen asynchronously.
- [EHOSTUNREACH] - The destination host of a datagram was unreachable. This error can happen asynchronously.
- [EISCONN] - A destination address and port was specified when sending a datagram, but the socket has already been connected to a remote address and port.
- [EMSGSIZE] - The datagram was too large to be sent because it exceeded the maximum transmission unit (MTU) on the path between the local and remote address, or it exceeded the UDP datagram size limit of 65527 bytes. This error can happen asynchronously.
- [ENETDOWN] - The network interface used to deliver a datagram isn't up. This error can happen asynchronously.
- [ENETUNREACH] - The destination network of a datagram was unreachable. This error can happen asynchronously.
- [ENETUNREACH] - The remote address could not be connected because there was no route from the local address to the remote address.
- [ENOBUFS] - There was not enough memory available for network packets.
- [EPERM] - One of the unimplemented SO_DEBUG and SO_DONTROUTE socket options was attempted to be set to a non-zero value.
SEE ALSO
bind(2), connect(2), getpeername(2), getsockname(2), getsockopt(2), poll(2), recvfrom(2), recvmsg(2), sendmsg(2), sendto(2), setsockopt(2), shutdown(2), socket(2), if(4), inet(4), ip(4), kernel(7)
STANDARDS
J. Postel, User Datagram Protocol, STD 6, RFC 768, USC/Information Sciences Institute, August 1980.
Internet Engineering Task Force and R. Braden (ed.), Requirements for Internet Hosts -- Communication Layers, STD 3, RFC 1122, USC/Information Sciences Institute, October 1989.
IEEE Std 1003.1-2008 (“POSIX.1”) specifies the UDP socket programming interface and defines the socket options SO_BROADCAST, SO_DEBUG, SO_DONTROUTE, SO_ERROR, SO_RCVBUF, SO_REUSEADDR, SO_SNDBUF, and SO_TYPE.
BUGS
bind(2) does not yet enforce that binding to a well-known port (port 1 through port 1023) requires superuser privileges.
The handling of SO_REUSEADDR in bind(2) does not yet enforce that the two sockets are bound by the same user or that the second socket is bound by a user with superuser privileges. The requirement that both sockets have SO_REUSEADDR set might be relaxed to only the second socket having it set when this permission check is implemented.
The integration with the network layer is inadequate and the asynchronous errors ECONNREFUSED, EHOSTDOWN, EHOSTUNREACH, and ENETUNREACH are never delivered asynchronously from the network.
The send(2) flag MSG_DONTROUTE and the SO_DONTROUTE socket option are not implemented yet.
The SO_SNDBUF socket option is currently not used and the send queue is not limited at the socket level.
The automatic assignment of ports is random, but is statistically biased. A random port is picked, and if it is taken, the search sequentially iterates ports in ascending order until an available port is found or the search terminates.
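The search described above, and why it is biased, can be illustrated with the following sketch. It is not the Sortix implementation. The names are made up for this example, the random starting port is passed in by the caller, the availability check is a hypothetical callback, and wrapping around at the end of the dynamic port range is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define DYNAMIC_PORT_FIRST 32768u /* inclusive */
#define DYNAMIC_PORT_END   61000u /* exclusive */

/* Hypothetical callback that tells whether a port is already taken. */
typedef bool (*port_taken_fn)(uint16_t port, void* ctx);

/* Starting at the (randomly chosen) port start, try ports in ascending order
   and return the first available one, or 0 if every port is taken. Wrapping
   around to the start of the range is an assumption of this sketch. */
static uint16_t pick_port_from(uint16_t start, port_taken_fn taken, void* ctx)
{
	unsigned int count = DYNAMIC_PORT_END - DYNAMIC_PORT_FIRST;
	for ( unsigned int i = 0; i < count; i++ )
	{
		unsigned int offset = (start - DYNAMIC_PORT_FIRST + i) % count;
		uint16_t port = (uint16_t) (DYNAMIC_PORT_FIRST + offset);
		if ( !taken(port, ctx) )
			return port;
	}
	return 0;
}

/* Example availability check: a single run of consecutive taken ports. */
struct taken_run
{
	uint16_t first;
	uint16_t last;
};

static bool in_taken_run(uint16_t port, void* ctx)
{
	const struct taken_run* run = ctx;
	return run->first <= port && port <= run->last;
}
```

If ports 40000 through 40009 are taken, every starting point from 40000 through 40010 selects port 40010, so that port is eleven times as likely to be chosen as a port that follows an available port.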
FreeBSD's and OpenBSD's UDP documentation states in the BUGS section that receiving a datagram on a socket shut down for read should reply with an ICMP Port Unreachable message; however, they don't implement this behavior. No other system appears to implement this behavior, and it is unclear whether it should be implemented.
June 4, 2017 | Sortix 1.1.0-dev |