sortix-mirror/share/man/man4/udp.4

700 lines
23 KiB
Groff

.Dd June 4, 2017
.Dt UDP 4
.Os
.Sh NAME
.Nm udp
.Nd user datagram protocol
.Sh SYNOPSIS
.In sys/socket.h
.In netinet/in.h
.In netinet/udp.h
.Ft int
.Fn socket AF_INET SOCK_DGRAM IPPROTO_UDP
.Sh DESCRIPTION
The User Datagram Protocol (UDP) is a connectionless transport layer for the
Internet Protocol
.Xr ip 4
that provides best-effort delivery of datagrams.
It is designed for packet-switched networks and provides multiplexing with a
16-bit port number, basic data integrity checks (16-bit ones' complement sum),
and broadcasting.
It does not provide a guarantee of delivery, avoidance of delivering multiple
times, ordering, out of band data, nor flow control.
UDP provides the
.Dv SOCK_DGRAM
abstraction for the
.Xr inet 4
protocol family.
.Pp
UDP sockets are made with
.Xr socket 2
by passing an appropriate
.Fa domain
.Dv ( AF_INET ) ,
.Dv SOCK_DGRAM
as the
.Fa type ,
and 0 or
.Dv IPPROTO_UDP
as the
.Fa protocol .
Initially a socket is not bound, it won't receive datagrams, and it does not
have a remote address and port set.
.Pp
A UDP socket has the following state:
.Pp
.Bl -bullet -compact
.It
The address family it belongs to.
.It
The network interface it is bound to (if any)
.Dv ( SO_BINDTODEVICE
and
.Dv SO_BINDTOINDEX )
(initially none).
.It
The local address and port (when bound) (initially none).
.It
The remote address and port (when connected) (initially none).
.It
A receive queue (initially empty).
.It
Whether the socket has been
.Xr shutdown 2
for read and/or write (initially neither).
.It
A single pending asynchronous error (if any)
.Dv ( SO_ERROR )
(initially none).
.It
Whether broadcast datagrams can be sent
.Dv ( SO_BROADCAST )
(initially no).
.It
Whether binding to the any address and a port doesn't conflict with binding to
another address on the same port
.Dv ( SO_REUSEADDR )
(initially no).
.It
Limits on the size of the receive and send queues
.Dv ( SO_RCVBUF
and
.Dv SO_SNDBUF ) .
.El
.Pp
Datagrams are sent as a packet with a header and the datagram itself.
The header contains the source port, the destination port, the checksum, and the
packet's length.
The length is a 16-bit value, allowing the packet to be up to 65535 bytes.
The header is 8 bytes, allowing the maximum datagram size of 65527 bytes.
However, the actual maximum datagram size may be smaller, as the network layer
and link layer, as well as the path to the destination host, will add their own
headers and maximum transmission unit (MTU) restrictions.
.Pp
Port numbers are 16-bit and range from 1 to 65535.
Port 0 is not valid.
Binding to port 0 will assign an available port on the requested address.
Sending or connecting to port 0 will fail with
.Er EADDRNOTAVAIL .
Received packets whose source or destination address is port 0 will be silently
dropped.
UDP ports are distinct from ports in other transport layer protocols.
.Pp
Packets contain a 16-bit ones' complement checksum by default.
Unless the packet has no checksum, a received packet will be silently discarded
if its checksum does not match its contents.
.Pp
Sockets can be bound to a local address and port with
.Xr bind 2
(if not already bound),
or an local address and port will be automatically assigned on the first send
or connect operation.
The local address and port can be read with
.Xr getsockname 2 .
If the socket hasn't been bound, the local address and port is reported as the
any address on port 0.
Binding to a well-known port (port 1 through port 1023) requires superuser
privileges.
.Pp
Sockets can be bound to the any address, the broadcast address, the address of
a network interface, or the broadcast address of a network interface.
Binding to port 0 will automatically assign an available port on the requested
local address or fail with
.Er EAGAIN
if no port is available.
No two sockets can bind to the same local address and port.
No two sockets can be bound such that one is bound to the any address and a
port, and the other socket is bound to another address and the same port; unless
both sockets had the
.Dv SO_REUSEADDR
socket option set when the second socket was bound, and the current user is the
same that bound the first socket or the current user has superuser privileges.
.Pp
A socket bound to a local address and port will receive an incoming datagram if
the following conditions hold:
.Pp
.Bl -bullet -compact
.It
The datagram belongs to the socket's address family and the protocol is UDP.
.It
The datagram's checksum matches the datagram or it has no checksum.
.It
The datagram is not sent from port 0 and is not sent to port 0.
.It
The datagram is sent to the address or broadcast address of the network
interface it is received on, or the datagram was sent to the broadcast address;
.It
The socket is either bound to the receiving network interface, or the socket is
not bound to a network interface;
.It
The datagram is sent to the socket's local port;
.It
The datagram is sent to the socket's local address, or the socket's local
address is the any address (and no other socket is bound to the datagram's
address and that port);
.It
The socket is connected and the datagram was sent from the remote address and
the remote port, or the socket is not connected; and
.It
The socket is not shut down for reading.
.El
.Pp
If so, the datagram is added to the socket's receive queue, otherwise it is
discarded.
The receive queue contains incoming packets waiting to be received.
Incoming packets are dropped if the receive queue is full.
Shrinking the receive queue limit drops packets as needed to stay below the
limit.
.Pp
The remote address and port can be set multiple times with
.Xr connect 2 ,
after which the socket is said to be connected, but UDP is connectionless and
no handshake is sent.
The remote port must not be port 0 or the connection will fail with
.Er EADDRNOTAVAIL .
If the socket is not bound,
.Xr connect 2
will determine which network interface will be used to send to the remote
address, and then bind to the address of that network interface together with an
available port.
.Xr connect 2
will fail if there is no route from the local address to the requested remote
address.
A connected socket only receive datagrams from the remote address and port.
.Xr connect 2
will drop datagrams in the receive queue that don't originate from the
requested remote address.
The
.Xr send 2 ,
.Xr write 2 ,
and
.Xr writev 2
functions can be used on a connected socket and they send to the remote address
and port by default.
If the socket is connected, the destination given to
.Xr sendto 2
and
.Xr sendmsg 2
must be
.Dv NULL .
The remote address and port can be read with
.Xr getpeername 2 .
.Pp
The socket can be disconnected by connecting to a socket address with the family
value set to
.Dv AF_UNSPEC ,
which resets the remote address and port (if set), and otherwise has no effect.
The socket can be disconnected even if not connected, but it has no effect.
.Pp
Datagrams can be sent with
.Xr sendmsg 2
and
.Xr sendto 2 .
Sending on a unbound socket will bind to the any address and an available port,
or fail with
.Er EAGAIN
if no port is available.
Datagrams can be received with
.Xr recvmsg 2 ,
.Xr recvfrom 2 ,
.Xr recv 2 ,
.Xr read 2 ,
and
.Xr readv 2 .
If an asynchronous error is pending, the next send and receive operation will
fail with that error and clear the asynchronous eror, so the next operation can
succeed.
Asynchronous errors can arise from network problems.
There is no send queue at the UDP level and datagrams are directly forwarded to
the network layer.
It is an error to use any of the flags
.Dv MSG_CMSG_CLOEXEC ,
.Dv MSG_CMSG_CLOFORK ,
.Dv MSG_EOR ,
.Dv MSG_OOB ,
and
.Dv MSG_WAITALL .
.Pp
The condition of the socket can be tested with
.Xr poll 2
where
.Dv POLLIN
signifies a packet has been received (or the socket is shut down for reading),
.Dv POLLOUT
signifies a packet can be sent now (and the socket is not shut down for
writing),
.Dv POLLHUP
signifies the socket is shut down for writing, and
.Dv POLLERR
signifies an asynchronous error is pending.
.Pp
The socket can be shut down for receiving and/or sending with
.Xr shutdown 2 .
The receive queue is emptied when shut down for receive (asynchronous errors are
preserved) and receive operations will succeed with an end of file
condition, but any pending asynchronous errors will take precedence and be
delivered instead.
Sending when shut down for writing will raise
.Dv SIGPIPE
and fail with
.Er EPIPE
(regardless of a pending asynchronous error).
.Pp
Socket options can be set with
.Xr setsockopt 2
and read with
.Xr getsockopt 2
and exist on the
.Dv IPPROTO_UDP
level as well as applicable underlying protocol levels.
.Pp
Broadcast datagrams can be sent by setting the
.Dv SO_BROADCAST
socket option with
.Xr setsockopt 2
and sending to a broadcast address of the network layer.
.Sh SOCKET OPTIONS
UDP sockets support these
.Xr setsockopt 2 /
.Xr getsockopt 2
options at level
.Dv SOL_SOCKET :
.Bl -tag -width "12345678"
.It Dv SO_BINDTODEVICE Fa "char[]"
Bind to a network interface by its name.
(Described in
.Xr if 4 )
.It Dv SO_BINDTOINDEX Fa "unsigned int"
Bind to a network interface by its index number.
(Described in
.Xr if 4 )
.It Dv SO_BROADCAST Fa "int"
Whether sending to a broadcast address is allowed.
(Described in
.Xr if 4 )
.It Dv SO_DEBUG Fa "int"
Whether the socket is in debug mode.
This option is not implemented and is initially 0.
Attempting to set it to non-zero will fail with
.Er EPERM .
(Described in
.Xr if 4 )
.It Dv SO_DOMAIN Fa "sa_family_t"
The socket
.Fa domain
(the address family).
This option can only be read.
(Described in
.Xr if 4 )
.It Dv SO_DONTROUTE Fa "int"
Whether to bypass the routing table and only send on the local network.
This option is not implemented and is initially 0.
Attempting to set it to non-zero will fail with
.Er EPERM .
(Described in
.Xr if 4 )
.It Dv SO_ERROR Fa "int"
The asynchronous pending error
(an
.Xr errno 3
value).
Cleared to 0 when read.
This option can only be read.
(Described in
.Xr if 4 )
.It Dv SO_PROTOCOL Fa "int"
The socket protocol
.Dv ( IPPROTO_UDP ) .
This option can only be read.
(Described in
.Xr if 4 )
.It Dv SO_RCVBUF Fa "int"
How many bytes the receive queue can use (default is 64 pages, max 4096 pages).
(Described in
.Xr if 4 )
.It Dv SO_REUSEADDR Fa "int"
Whether binding to the any address on a port doesn't conflict with binding to
another address and the same port, if both sockets have this option set and the
user binding the second socket is the same that bound the first socket or the
user binding the second socket has superuser privileges.
(Described in
.Xr if 4 )
.It Dv SO_SNDBUF Fa "int"
How many bytes the send queue can use (default is 64 pages, max 4096 pages).
(Described in
.Xr if 4 )
.It Dv SO_TYPE Fa "int"
The socket type
.Dv ( SOCK_DGRAM ) .
This option can only be read.
(Described in
.Xr if 4 )
.El
.Pp
UDP sockets currently implement no
.Xr setsockopt 2 /
.Xr getsockopt 2
options at level
.Dv IPPROTO_UDP .
.Sh IMPLEMENTATION NOTES
There is no way to disable the checksum on sent packets, however received
packets without a checksum will not be checksummed.
.Pp
Each packet currently use a page of memory, which counts towards the receive
queue limit.
.Pp
If no specific port is requested, one is randomly selected in the dynamic port
range 32768 (inclusive) through 61000 (exclusive).
.Sh EXAMPLES
This example creates and binds a UDP socket to a local address and port and
sends a broadcast datagram to a remote address and port and receives a response
and remembers who sent the response.
.Va local
is the local socket address that is bound to and
.Va local_len
is the size of the local socket address and likewise with
.Va remote
and
.Va remote_len .
.Va responder
is an uninitialized socket address of the appropriate size
.Va responder_len
for the protocol family
.Va af
where the source address of the response is stored.
The response is stored in the
.Va incoming
array of size
.Va amount .
The
.Va af , local , local_len , remote , remote_len , responder ,
and
.Va responder_len
values should all be chosen according to the address family and network layer.
.Bd -literal
sa_family_t af = /* ... */;
const struct sockaddr *local = /* ... */;
socklen_t local_len = /* ... */;
const struct sockaddr *remote = /* ... */;
socklen_t remote_len = /* ... */;
const struct sockaddr *responder = /* ... */;
socklen_t responder_len = /* ... */;
int fd = socket(af, SOCK_DGRAM, IPPROTO_UDP);
if (fd < 0)
err(1, "socket");
if (bind(fd, local, local_len) < 0)
err(1, "bind");
int value = 1;
if (setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &value, sizeof(value)) < 0)
err(1, "setsockopt");
char outgoing[] = "Hello";
if (sendto(fd, outgoing, strlen(outgoing), 0, remote, remote_len) < 0)
err(1, "sendto");
char incoming[1024];
ssize_t amount = recvfrom(fd, incoming, sizeof(incoming), 0,
responder, &responder_len);
if (amount < 0 )
err(1, "recvfrom");
.Ed
.Sh COMPATIBILITY
Sortix is the only known system where
.Xr connect 2
will remove datagrams from the wrong source from the receive queue.
All other systems will deliver datagrams already present in the receive queue,
even if from the wrong source, despite the POSIX requirement that
.Xr connect 2
"limits the remote sender for subsequent recv() functions".
Software for affected systems must either first empty the receive queue after
.Xr connect 2 ,
or use
.Xr recvmsg 2
and validate the source address rather than rely on the kernel validation.
.Pp
.Xr sendto 2
or
.Xr sendmsg 2
on a connected socket must have the destination be
.Dv NULL
(the default destination)
on Sortix, FreeBSD, Haiku, macOS, NetBSD, OpenBSD, and SunOS; but the
destination can be
.Dv NULL
or any address on DragonFly, GNU/Hurd, Linux, and Minix.
.Pp
Socket disconnect is implemented on Sortix, DragonFly, Haiku, GNU/Hurd, Linux,
Minix, and SunOS; but socket disconnect is not implemented on on FreeBSD, macOS,
NetBSD and OpenBSD.
Storing the
.Dv AF_FAMILY
value in the address family's socket address structure or struct sockaddr is
portable to the systems implementing socket disconnect.
A socket can be disconnected even if not connected on Sortix, DragonFly, Haiku,
GNU/Hurd, Linux, and Minix; but SunOS requires the socket to be connected
before it can be disconnected.
.Pp
The broadcast address can be bound on Sortix, GNU/Hurd, Linux, OpenBSD, and
SunOS; but can't be bound on DragonFly, FreeBSD, macOS, Minix and NetBSD.
.Pp
.Dv SO_BROADCAST
doesn't need to be enabled to
.Xr connect 2
to the broadcast address on Sortix, DragonFly, FreeBSD, Haiku, macOS, Minix,
NetBSD, OpenBSD, and SunOS; but is required on GNU/Hurd and Linux.
.Pp
Reconnecting a socket to an address that is not reachable from the local address
will fail on Sortix, GNU/Hurd, and Linux; but the socket will be bound to
another address that can reach the remote address (even though it is not
possible to bind a socket twice) (on the same port if possible) on DragonFly,
FreeBSD, Haiku, macOS, NetBSD, OpenBSD, and SunOS.
.Pp
.Xr connect 2
will not deliver asynchronous errors on Sortix, DragonFly, FreeBSD, Haiku,
GNU/Hurd, Linux, and Minix; however it will deliver asynchronous errors on
macOS, NetBSD, OpenBSD, and SunOS.
.Pp
Shutting a socket down for reading will cause receives to return 0 on Sortix,
DragonFly, FreeBSD, macOS, Minix, NetBSD, OpenBSD, and SunOS; but receives will
fail with fail with
.Er EWOULDBLOCK
on Linux.
.Pp
Shutting a socket down for writing will cause sends to raise SIGPIPE and fail
with EPIPE on Sortix, DragonFly, FreeBSD, GNU/Hurd, macOS, NetBSD, OpenBSD, and
SunOS; but will not raise SIGPIPE and only fail with EPIPE on Linux and Minix.
.Pp
Sortix, GNU/Hurd, Linux, and Minix will signal POLLIN if a datagram has been
received or if shut down for read.
DragonFly, FreeBSD, macOS, NetBSD, OpenBSD, and SunOS will signal POLLIN if a
datagram has been received, if shut down for read, or if an error is pending.
.Pp
Sortix and DragonFly will signal POLLOUT if a datagram can be sent, unless the
socket has been shut down for write.
FreeBSD will signal POLLOUT if a datagram can be sent, unless the socket has
been shut down for both read and write.
GNU/Hurd will signal POLLOUT if a datagram can be sent, unless the socket has
been shut down for write or if an error is pending.
Linux, Minix, OpenBSD, and SunOS will signal POLLOUT if a datagram can be sent,
regardless of whether the socket has been shut down.
macOS will signal POLLOUT if a datagram can be sent, unless the socket has been
shut down for either read or write.
.Pp
Sortix and DragonFly will signal POLLHUP if shut down for write.
FreeBSD and Linux will signal POLLHUP if shut down for both read and write.
GNU/Hurd, macOS, Minix, NetBSD, OpenBSD, and SunOS will not signal POLLHUP.
macOS will signal POLLHUP if shut down for either read or write.
.Pp
Sortix, Haiku, GNU/Hurd, and Linux will signal POLLERR if an error is pending.
DragonFly, FreeBSD, macOS, Minix, NetBSD, OpenBSD, and SunOS will not signal
POLLERR.
.Pp
Shutting a socket down for read doesn't work on GNU/Hurd and Linux, where the
socket continues to receive datagrams.
.Pp
Linux delivers asynchronous errors on send, even if shut down for write.
.Pp
Sockets can be shut down even if not connected on Sortix, DragonFly, Minix,
NetBSD, and OpenBSD; but sockets must be connected before they can be shut down
on FreeBSD, GNU/Hurd, Linux, macOS, and SunOS.
.Pp
Connecting to the any address will fail with
.Er ENETUNREACH
on Sortix.
On DragonFly, FreeBSD, Haiku, GNU/Hurd, Linux, macOS, OpenBSD, and SunOS it will
succeed and
.Xr getpeername 2
will report the loopback address (OpenBSD will report the any address).
.Pp
Connecting to port 0 will fail on Sortix, FreeBSD, macOS, Minix, NetBSD, OpenBSD,
and SunOS; but will succeed on DragonFly, Haiku, GNU/Hurd and Linux.
.Pp
Sortix's handling of
.Dv SO_REUSEADDR
requires the two sockets to bound by the same user or the second socket to be
bound by a user with superuser privileges.
It's unclear what other systems also perform this check and when the user
identity is captured.
.Pp
Setting
.Dv SO_REUSEADDR
on both sockets is required on Sortix, Haiku, GNU/Hurd, and Linux; but
DragonFly, FreeBSD, Minix, macOS, NetBSD, OpenBSD, and SunOS only require it to
be set on the second socket.
.Pp
Two sockets can't be bound to the same address and port on Sortix, DragonFly,
FreeBSD, Haiku, macOS, NetBSD, and OpenBSD; but GNU/Hurd, Linux, Minix, and
SunOS allows it when
.Dv SO_REUSEADDR
is set.
.Sh ERRORS
Socket operations can fail due to these error conditions, in addition to the
error conditions of the network and link layer, and the error conditions of the
invoked function.
.Bl -tag -width [EADDRNOTAVAIL]
.It Bq Er EACCES
A datagram was sent to a broadcast address, but
.Dv SO_BROADCAST
is turned off.
.It Bq Er EADDRINUSE
The socket cannot be bound to the requested address and port because another
socket was already bound to 1) the same address and port 2) the any address
and the same port (and
.Dv SO_REUSEADDR
was not set on both sockets), or 3) some address and the same port but the
requested address was the any address (and
.Dv SO_REUSEADDR
was not set on both sockets).
.It Bq Er EADDRNOTAVAIL
The socket cannot be bound to the requested address because no network interface
had that address or broadcast address.
.It Bq Er EADDRNOTAVAIL
The socket was connected to port 0, or a datagram was sent to port 0.
.It Bq Er EAGAIN
A port could not be assigned because each port in the dynamic port range had
already been bound to a socket in a conflicting manner.
.It Bq Er ECONNREFUSED
The destination host of a datagram was not listening on the port.
This error can happen asynchronously.
.It Bq Er EHOSTDOWN
The destination host of a datagram is not up.
This error can happen asynchronously.
.It Bq Er EHOSTUNREACH
The destination host of a datagram was unreachable.
This error can happen asynchronously.
.It Bq Er EISCONN
A destination address and port was specified when sending a datagram, but the
socket has already been connected to a remote address and port.
.It Bq Er EMSGSIZE
The datagram was too large to be sent because it exceeded the maximum
transmission unit (MTU) on the path between the local and remote address, or it
exceeded the UDP datagram size limit of 65527 bytes.
This error can happen asynchronously.
.It Bq Er ENETDOWN
The network interface used to deliver a datagram isn't up.
This error can happen asynchronously.
.It Bq Er ENETUNREACH
The destination network of a datagram was unreachable.
This error can happen asynchronously.
.It Bq Er ENETUNREACH
The remote address could not be connected because there was no route from the
local address to the remote address.
.It Bq Er ENOBUFS
There was not enough memory available for network packets.
.It Bq Er EPERM
One of the unimplemented
.Dv SO_DEBUG
and
.Dv SO_DONTROUTE
socket options was attempted to be set to a non-zero value.
.El
.Sh SEE ALSO
.Xr bind 2 ,
.Xr connect 2 ,
.Xr getpeername 2 ,
.Xr getsockname 2 ,
.Xr getsockopt 2 ,
.Xr poll 2 ,
.Xr recvfrom 2 ,
.Xr recvmsg 2 ,
.Xr sendmsg 2 ,
.Xr sendto 2 ,
.Xr setsockopt 2 ,
.Xr shutdown 2 ,
.Xr socket 2 ,
.Xr if 4 ,
.Xr inet 4 ,
.Xr ip 4 ,
.Xr kernel 7
.Sh STANDARDS
.Rs
.%A J. Postel
.%D August 1980
.%R STD 6
.%R RFC 768
.%T User Datagram Protocol
.%Q USC/Information Sciences Institute
.Re
.Pp
.Rs
.%A Internet Engineering Task Force
.%A R. Braden (ed.)
.%D October 1989
.%R STD 3
.%R RFC 1122
.%T Requirements for Internet Hosts -- Communication Layers
.%Q USC/Information Sciences Institute
.Re
.Pp
.St -p1003.1-2008 specifies the UDP socket programming interface and defines the
socket options
.Dv SO_BROADCAST , SO_DEBUG , SO_DONTROUTE, SO_ERROR, SO_RCVBUF, SO_REUSEADDR ,
.Dv SO_SNDBUF ,
and
.Dv SO_TYPE .
.Sh BUGS
.Xr bind 2
does not yet enforce that binding to a well-known port (port 1 through port
1023) requires superuser privileges.
.Pp
The handling of
.Dv SO_REUSEADDR in
.Xr bind 2
does not yet enforce the two sockets to be bound by the same user or the second
socket to be bound by a user with superuser privileges.
The requirement that both sockets have
.Dv SO_REUSEADDR
set might be relaxed to only the second socket having it set when this
permission check is implemented.
.Pp
The integration with the network layer is inadequate and the asynchronous errors
.Er ECONNREFUSED ,
.Er EHOSTDOWN ,
.Er EHOSTUNREACH ,
and
.Er ENETUNREACH
are never delivered asynchronously from the network.
.Pp
The
.Xr send 2
flag
.Dv MSG_DONTROUTE
and the
.Dv SO_DONTROUTE
socket option are not implemented yet.
.Pp
The
.Dv SO_SNDBUF
socket option is currently not used and the send queue is not limited at the
socket level.
.Pp
The automatic assignment of ports is random, but is statistically biased.
A random port is picked, and if it is taken, the search sequentially iterates
ports in ascending order until an available port is found or the search
terminates.
.Pp
FreeBSD's and OpenBSD's UDP documentation states in the BUGS section that
receiving a datagram on a socket shutdown for read should reply with a ICMP
Port Unreachable message, however they don't implement this behavior.
No other system appears to implement this behavior, and it is unclear whether
it should be implemented.