Sortix

Sortix cross-nightly manual

This manual documents Sortix cross-nightly.

NAME

tcp — transmission control protocol

SYNOPSIS

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
int
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

DESCRIPTION

The Transmission Control Protocol (TCP) is a connection-oriented transport layer for the Internet Protocol ip(4) that provides a reliable byte stream connection between two hosts. It is designed for packet-switched networks and provides sequenced data, retransmissions on packet loss, handling of duplicated packets, flow control, basic data integrity checks, multiplexing with a 16-bit port number, support for out-of-band urgent data, and detection of lost connections. TCP provides the SOCK_STREAM abstraction for the inet(4) protocol family.
TCP sockets are made with socket(2) by passing an appropriate domain (AF_INET), SOCK_STREAM as the type, and 0 or IPPROTO_TCP as the protocol. Newly created TCP sockets are not bound to a local address nor connected to a remote socket.
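As a minimal sketch of the socket(2) call described above, the following helper creates an unbound, unconnected TCP socket. The function name open_tcp_socket is made up for this illustration and is not part of any system interface.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: create an unbound, unconnected IPv4 TCP socket.
 * Returns the file descriptor, or -1 with errno set by socket(2). */
int open_tcp_socket(void)
{
	/* Passing 0 as the protocol would select TCP as well. */
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (fd < 0)
		perror("socket");
	return fd;
}
```

The caller owns the returned descriptor and is expected to close(2) it when done.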
Port numbers are 16-bit and range from 1 to 65535. Port 0 is not valid. Binding to port 0 will assign an available port on the requested address. Connecting to port 0 will fail with EADDRNOTAVAIL. Received packets whose source or destination port is 0 will be silently dropped. TCP ports are distinct from ports in other transport layer protocols.
Packets contain a 16-bit ones' complement checksum. Received packets will be silently discarded if their checksum does not match the contents.
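The arithmetic behind a 16-bit ones' complement checksum can be sketched as follows. This only illustrates the summation over an arbitrary byte buffer; a real TCP checksum is computed over a pseudo-header, the TCP header, and the data, which this sketch does not assemble. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute the 16-bit ones' complement checksum of a
 * buffer, reading it as big-endian 16-bit words. An odd trailing byte is
 * treated as if it were followed by a zero byte. */
uint16_t ones_complement_checksum(const uint8_t *data, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint64_t)data[i] << 8) | data[i + 1];
	if (i < len)
		sum += (uint64_t)data[i] << 8;

	/* Fold the carries back into the low 16 bits (end-around carry). */
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);

	return (uint16_t)~sum;
}
```

A receiver can run the same function over the data with the transmitted checksum included; a result of 0 means the checksum matches the contents.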
Sockets can be bound to a local address and port with bind(2) (if not already bound), or a local address and port will be automatically assigned when connected. The local address and port can be read with getsockname(2). If the socket hasn't been bound, the local address and port are reported as the any address on port 0. Binding to a well-known port (port 1 through port 1023) requires superuser privileges.
Sockets can be bound to the any address, the broadcast address, the address of a network interface, or the broadcast address of a network interface. Binding to port 0 will automatically assign an available port on the requested local address or fail with EAGAIN if no port is available. No two sockets can bind to the same local address and port. No two sockets can be bound such that one is bound to the any address and a port, and the other socket is bound to another address and the same port, unless both sockets had the SO_REUSEADDR socket option set when the second socket was bound, and the current user is the same user that bound the first socket or has superuser privileges.
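Binding to port 0 and reading back the assigned port with getsockname(2) might look like the sketch below. It assumes the IPv4 loopback address is available as a local address; the helper name is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a new TCP socket to the loopback address on
 * port 0 so a port is assigned automatically, then store the assigned port
 * (in host byte order) in *assigned_port.
 * Returns the socket, or -1 on failure with errno set. */
int bind_loopback_any_port(in_port_t *assigned_port)
{
	struct sockaddr_in addr;
	socklen_t addrlen = sizeof(addr);
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* assumed available */
	addr.sin_port = htons(0); /* request automatic port assignment */

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &addrlen) < 0) {
		close(fd);
		return -1;
	}

	*assigned_port = ntohs(addr.sin_port);
	return fd;
}
```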
A connection to a remote TCP socket can be established with connect(2). Connections can be established when both sides call connect(2) on each other. If the socket is not bound, connect(2) will determine which network interface will be used to send to the remote address, and then bind to the address of that network interface together with an available port. connect(2) will fail if there is no route from the local address to the requested remote address.
Incoming connections can be received by binding to a local address with bind(2) and listening for connections with listen(2), after which incoming connections can be retrieved with accept(2).
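The bind(2), listen(2), connect(2), and accept(2) sequence can be sketched in a single process over the loopback address. This is an illustration under assumptions: the loopback address is available, and the listen queue completes the handshake before accept(2) is called, so a blocking connect(2) returns without a second thread. The helper name is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: establish a TCP connection over loopback and return
 * both ends. Returns 0 on success or -1 on failure. */
int loopback_connection(int *client_fd, int *server_fd)
{
	struct sockaddr_in addr;
	socklen_t addrlen = sizeof(addr);
	int listener = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	int client = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	int server = -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(0); /* let the system pick the listening port */

	if (listener < 0 || client < 0 ||
	    bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(listener, 1) < 0 ||
	    /* Learn which port was assigned so the client can connect to it. */
	    getsockname(listener, (struct sockaddr *)&addr, &addrlen) < 0 ||
	    connect(client, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    (server = accept(listener, NULL, NULL)) < 0) {
		if (listener >= 0)
			close(listener);
		if (client >= 0)
			close(client);
		return -1;
	}

	close(listener); /* the accepted connection outlives the listener */
	*client_fd = client;
	*server_fd = server;
	return 0;
}
```

A real server would normally keep the listening socket open and call accept(2) in a loop.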
Bytes can be received from the remote TCP socket with recv(2), recvmsg(2), recvfrom(2), read(2), or readv(2). Bytes can be transmitted to the remote TCP socket with send(2), sendmsg(2), sendto(2), write(2), or writev(2). Transmitting when the connection is broken sends the SIGPIPE signal to the process and fails with EPIPE.
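Because send(2) on a byte stream may accept fewer bytes than requested, callers commonly loop until the whole buffer is sent. The sketch below does that and shows one way to receive EPIPE as an ordinary error by ignoring SIGPIPE process-wide. Both helper names are hypothetical, and ignoring the signal is a design choice of this example rather than a requirement.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: ignore SIGPIPE for the whole process so that a
 * broken connection is reported only through EPIPE. */
void ignore_sigpipe(void)
{
	signal(SIGPIPE, SIG_IGN);
}

/* Hypothetical helper: send the entire buffer, retrying after partial
 * sends and after EINTR. Returns 0 on success or -1 with errno set
 * (EPIPE if the connection is broken). */
int send_all(int fd, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	while (len > 0) {
		ssize_t n = send(fd, p, len, 0);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```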
The receiving socket will acknowledge any received data. If no acknowledgement is received in a timely manner, the transmitting socket will transmit the data again. If an acknowledgement still isn't received after a while, the connection is considered broken and no further receipt or transmission is possible.
The condition of the socket can be tested with poll(2), where POLLIN signifies that new data has been received, the remote socket has shut down for writing, or an incoming connection can be retrieved with accept(2); POLLOUT signifies that new data can be sent now (and the socket is not shut down for writing); POLLHUP signifies that the socket is shut down for writing; and POLLERR signifies that an asynchronous error is pending.
The connection can be shut down with shutdown(2) in either the reading direction (discarding further received data) or the writing direction (which sends the finish control flag). The connection is closed when both sockets have sent and acknowledged the finish control flag. Upon the close(2) of the last file descriptor for a connected socket, the socket is shut down in both directions.
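An orderly close from the application's point of view can be sketched as shutting down the writing direction and then reading until the remote socket has finished as well, which recv(2) reports by returning 0. The helper name is hypothetical and the received bytes are simply discarded here.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: shut down the writing direction, then read and
 * discard data until the remote side has shut down for writing too.
 * Returns the number of bytes discarded, or -1 on error. */
long finish_and_drain(int fd)
{
	char buf[4096];
	long total = 0;

	if (shutdown(fd, SHUT_WR) < 0)
		return -1;

	for (;;) {
		ssize_t n = recv(fd, buf, sizeof(buf), 0);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			return total; /* end of stream */
		total += n;
	}
}
```

The descriptor remains open afterwards and still has to be released with close(2).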
Socket options can be set with setsockopt(2) and read with getsockopt(2) and exist on the IPPROTO_TCP level as well as applicable underlying protocol levels.

SOCKET OPTIONS

TCP sockets support these setsockopt(2) / getsockopt(2) options at level SOL_SOCKET:
SO_BINDTODEVICE char[]
Bind to a network interface by its name. (Described in if(4))
SO_BINDTOINDEX unsigned int
Bind to a network interface by its index number. (Described in if(4))
SO_DEBUG int
Whether the socket is in debug mode. This option is not implemented and is initially 0. Attempting to set it to non-zero will fail with EPERM. (Described in if(4))
SO_DOMAIN sa_family_t
The socket domain (the address family). This option can only be read. (Described in if(4))
SO_ERROR int
The asynchronous pending error (an errno(3) value). Errors are permanent. This option can only be read. (Described in if(4))
SO_PROTOCOL int
The socket protocol (IPPROTO_TCP). This option can only be read. (Described in if(4))
SO_RCVBUF int
How many bytes the receive queue can use (default is 64 KiB). (Described in if(4))
SO_REUSEADDR int
Whether binding to the any address on a port doesn't conflict with binding to another address and the same port, if both sockets have this option set and the user binding the second socket is the same user that bound the first socket or has superuser privileges. (Described in if(4))
SO_SNDBUF int
How many bytes the send queue can use (default is 64 KiB). (Described in if(4))
SO_TYPE int
The socket type (SOCK_STREAM). This option can only be read. (Described in if(4))
TCP sockets currently implement no setsockopt(2) / getsockopt(2) options at level IPPROTO_TCP.
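Reading SOL_SOCKET options with getsockopt(2) follows the same pattern for every int-valued option in the list above. The sketch below reads SO_TYPE and SO_RCVBUF; the helper names are hypothetical, and the value reported for SO_RCVBUF varies between systems.

```c
#include <sys/socket.h>

/* Hypothetical helper: return 1 if the socket is a SOCK_STREAM socket,
 * 0 if it is another type, or -1 if getsockopt(2) failed. */
int is_stream_socket(int fd)
{
	int type = 0;
	socklen_t len = sizeof(type);

	if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
		return -1;
	return type == SOCK_STREAM;
}

/* Hypothetical helper: return the receive queue size in bytes as reported
 * by SO_RCVBUF, or -1 if getsockopt(2) failed. */
int receive_buffer_size(int fd)
{
	int size = 0;
	socklen_t len = sizeof(size);

	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) < 0)
		return -1;
	return size;
}
```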

IMPLEMENTATION NOTES

Connections time out when a segment has not been acknowledged by the remote socket after 6 attempts to deliver the segment. Each retransmission happens after 1 second plus 1 second per failed transmission so far. Successful delivery of any segment resets the retransmission count to 0.
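The retransmission schedule can be worked through with a little arithmetic. The sketch below rests on an interpretation that the text does not spell out: the wait after the first transmission is 1 second (no failures counted yet), each later wait grows by 1 second, and the connection times out once the wait after the sixth attempt expires. Under that assumption the waits are 1 + 2 + 3 + 4 + 5 + 6 = 21 seconds in total. The names are hypothetical and this is not the kernel's code.

```c
/* Number of delivery attempts before the connection times out. */
#define TCP_DELIVERY_ATTEMPTS 6

/* Hypothetical helper: seconds to wait for an acknowledgement when
 * `failed` earlier transmissions of the segment have gone unacknowledged. */
unsigned retransmit_delay_seconds(unsigned failed)
{
	return 1 + failed;
}

/* Hypothetical helper: total seconds from the first transmission until the
 * connection is considered timed out, ASSUMING the wait after the final
 * attempt also has to expire. */
unsigned seconds_until_timeout(unsigned attempts)
{
	unsigned total = 0;

	for (unsigned failed = 0; failed < attempts; failed++)
		total += retransmit_delay_seconds(failed);
	return total;
}
```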
The receive and transmission buffers are both 64 KiB by default.
If no specific port is requested, one is randomly selected in the dynamic port range 32768 (inclusive) through 61000 (exclusive).
The Maximum Segment Lifetime (MSL) is set to 30 seconds and the quiet time of two MSLs before reusing sockets is 60 seconds.

ERRORS

Socket operations can fail due to these error conditions, in addition to the error conditions of the network and link layer, and the error conditions of the invoked function.
[EADDRINUSE]
The socket cannot be bound to the requested address and port because another socket was already bound to 1) the same address and port, 2) the any address and the same port (and SO_REUSEADDR was not set on both sockets), or 3) some address and the same port but the requested address was the any address (and SO_REUSEADDR was not set on both sockets).
[EADDRNOTAVAIL]
The socket cannot be bound to the requested address because no network interface had that address or broadcast address.
[EADDRNOTAVAIL]
The socket was connected to port 0.
[EAGAIN]
A port could not be assigned because each port in the dynamic port range had already been bound to a socket in a conflicting manner.
[ECONNREFUSED]
The destination host refused the connection.
[ECONNRESET]
The connection was reset by the remote socket.
[EHOSTDOWN]
The destination host is not up. This error can happen asynchronously.
[EHOSTUNREACH]
The destination host was unreachable. This error can happen asynchronously.
[ENETDOWN]
The network interface isn't up. This error can happen asynchronously.
[ENETUNREACH]
The destination network was unreachable. This error can happen asynchronously.
[ENETUNREACH]
The remote address could not be connected because there was no route from the local address to the remote address.
[ENOBUFS]
There was not enough memory available for network packets.
[EPERM]
An attempt was made to set the unimplemented SO_DEBUG socket option to a non-zero value.
[EPIPE]
The transmission failed because the connection is broken. The SIGPIPE signal is sent as well unless disabled.
[ETIMEDOUT]
The connection timed out delivering a segment. This error can happen asynchronously.
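Errors marked as asynchronous above are not necessarily returned by the call that caused them. One way to check for such a pending error, for example after poll(2) reports POLLERR, is to read the SO_ERROR socket option. The helper name below is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper: return the pending asynchronous error of the socket
 * as an errno value (0 if none is pending), or -1 if getsockopt(2) itself
 * failed. */
int pending_socket_error(int fd)
{
	int error = 0;
	socklen_t len = sizeof(error);

	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &len) < 0)
		return -1;
	return error;
}
```

The returned value can be compared against constants such as ETIMEDOUT or EHOSTUNREACH, or turned into a message with strerror(3).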

SEE ALSO

accept(2), bind(2), connect(2), getpeername(2), getsockname(2), getsockopt(2), poll(2), recv(2), recvfrom(2), recvmsg(2), send(2), sendmsg(2), sendto(2), setsockopt(2), shutdown(2), socket(2), if(4), inet(4), ip(4), kernel(7)

STANDARDS

J. Postel (ed.), Transmission Control Protocol, STD 7, RFC 793, USC/Information Sciences Institute, September 1981.
Internet Engineering Task Force and R. Braden (ed.), Requirements for Internet Hosts -- Communication Layers, STD 3, RFC 1122, USC/Information Sciences Institute, October 1989.
IEEE Std 1003.1-2008 (“POSIX.1”) specifies the TCP socket programming interface.

BUGS

The implementation is incomplete and has known bugs.
Out-of-band data is not yet supported and is ignored on receipt.
The round trip time is not estimated, which prevents efficient retransmission when data is lost. Retransmissions happen after a second, which means unnecessary retransmissions happen if the round trip time is more than a second.
Options are not supported and are ignored on receipt.
No extensions are implemented yet that improve efficiency for long fast networks with large bandwidth * delay products.
There is not yet any support for sending keep-alive packets.
There is not yet any support for respecting icmp(4) conditions such as destination unreachable or source quench.
Half-open connections use memory, but until the handshake is complete, it is not confirmed whether the remote is actually able to transmit from the source address. An attacker may be able to transmit many packets from forged addresses, reaching the limit on pending TCP sockets in the listen queue and thus deny service to further legitimate connections. A SYN queue or SYN cookies would mitigate this problem, but neither is yet implemented.
bind(2) does not yet enforce that binding to a well-known port (port 1 through port 1023) requires superuser privileges.
The automatic assignment of ports is random, but is statistically biased. A random port is picked, and if it is taken, the search sequentially iterates ports in ascending order until an available port is found or the search terminates.
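The source of the bias can be seen in a small model of the search. The sketch below is not the kernel's code: it takes the random starting port as a parameter so the behavior is reproducible, and it assumes the ascending search wraps around at the end of the dynamic range, which the text above does not state. All names are hypothetical.

```c
#include <stdbool.h>

#define PORT_MIN 32768u /* inclusive */
#define PORT_END 61000u /* exclusive */

/* Example occupancy model: every port in [first, last] is taken. */
struct port_range {
	unsigned first;
	unsigned last;
};

bool range_taken(unsigned port, void *ctx)
{
	const struct port_range *r = ctx;
	return r->first <= port && port <= r->last;
}

/* Hypothetical model of the search: begin at `start` (which a real
 * implementation would pick at random) and walk upwards until a free port
 * is found. Returns the port, or 0 if every port in the range is taken.
 * ASSUMPTION: the search wraps around to PORT_MIN. */
unsigned pick_port(unsigned start, bool (*taken)(unsigned, void *), void *ctx)
{
	unsigned count = PORT_END - PORT_MIN;

	for (unsigned i = 0; i < count; i++) {
		unsigned port = PORT_MIN + (start - PORT_MIN + i) % count;
		if (!taken(port, ctx))
			return port;
	}
	return 0;
}
```

With ports 32768 through 32777 taken, any starting point from 32768 to 32778 yields port 32778, so in this model that port is chosen 11 times as often as a free port whose predecessor is also free.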
Copyright 2011-2025 Jonas 'Sortie' Termansen and contributors.
Sortix's source code is free software under the ISC license.