sortix-mirror/utils/sort.1

171 lines
3.9 KiB
Groff

.Dd April 8, 2018
.Dt SORT 1
.Os
.Sh NAME
.Nm sort
.Nd sort lines of text
.Sh SYNOPSIS
.Nm
.Op Fl CcmRruVz
.Op Fl o Ar path
.Ar
.Sh DESCRIPTION
.Nm
reads lines of text from the standard input and writes the lines in sorted order
to the standard output.
If files are specified, the input is the concatenated content of the files read
in sequential order.
The
.Ar file
path can be set to
.Sq -
to specify the standard input.
The lines are compared according to the current locale's collating rules.
.Pp
The options are as follows:
.Bl -tag -width "12345678"
.It Fl c, \-check, \-check=diagnose-first
Check whether the input is already sorted.
If a line is out of order (or an equal line is found if
.Fl u ) ,
write an error describing which line was out of
order and exit 1.
.It Fl C, \-check=quiet, \-check=silent
Same as
.Fl c ,
but write no error to the standard output about the input being out of order.
.It Fl m, \-merge
Merge the presorted input files into a sorted output.
.It Fl o Ar path , Fl \-output Ns = Ns Ar path
After reading the full input; write the output to the file at
.Pa path
(creating it if it does not already exist, discarding its previous contents if
it already existed).
The output file can be one of the input files.
This option is incompatible with
.Fl C
and
.Fl c .
.It Fl R , \-random-sort
Sort the lines randomly with a uniform distribution, where all permutations are
equally likely.
This option is incompatible with
.Fl C
and
.Fl c .
If
.Fl u ,
don't write duplicate lines to the output.
.It Fl r , \-reverse
Compare the lines in reverse order.
.It Fl u , \-unique
Don't write a line if it is equal to the previous line.
.It Fl V , \-version-sort
Sort according to the version string, per
.Xr strverscmp 3 .
.It Fl z , \-zero-terminated
Lines are delimited with the NUL byte (0) instead of the newline byte (10).
.El
.Sh IMPLEMENTATION NOTES
In the event of an input error,
.Nm
will write an error to the standard error and exit unsuccessfully.
.Pp
.Nm
reads the whole input into memory, rather than storing intermediate sorting
steps in the filesystem, and requires enough memory to store a copy of the whole
input.
.Sh ENVIRONMENT
.Bl -tag -width "LC_COLLATE"
.It Dv LANG
The default locale for locale variables that are unset or null.
.It Dv LC_ALL
Overrides all the other locale variables if set.
.It Dv LC_COLLATE
Compare the input according to this locale's collating rules using
.Xr strcoll 3 .
.El
.Sh EXIT STATUS
.Nm
will exit 0 on success, exit 1 if the input was out of order when
.Fl C
or
.Fl c ,
or exit 2 (or higher) otherwise.
.Sh EXAMPLES
Read lines from the standard input and write them in sorted order to the
standard output:
.Bd -literal
sort < input > output
.Ed
.Pp
Read lines from the three specified files (where the second happens to be the
standard input) and write them in sorted to the standard output:
.Bd -literal
grep pattern lines.txt | sort foo - bar -o output.txt
.Ed
.Pp
Sort the input file if it isn't already sorted:
.Bd -literal
if sort -C file; [ $? = 1 ]; then
sort file -o file
fi
.Ed
.Pp
Remove duplicate lines from the input by sorting it and removing lines equal to
the previous line:
.Bd -literal
sort -u
.Ed
.Sh SEE ALSO
.Xr cat 1 ,
.Xr comm 1 ,
.Xr join 1 ,
.Xr uniq 1 ,
.Xr qsort 3 ,
.Xr strcoll 3 ,
.Xr strverscmp 3
.Sh STANDARDS
.Nm
is standardized in
.St -p1003.1-2008 ,
which is currently partially implemented in this implementation of
.Nm .
.Pp
The
.Fl R , V ,
and
.Fl z
options, as well as the long options, are extensions also found in GNU
coreutils.
.Pp
Unlike GNU coreutils,
.Fl R
will not remove duplicates unless
.Fl u
is passed.
.Pp
As an extension, the
.Fl C
and
.Fl c
options support multiple input files.
.Sh BUGS
The
.St -p1003.1-2008
options
.Fl b ,
.Fl d ,
.Fl f ,
.Fl i ,
.Fl k ,
.Fl n ,
and
.Fl t
are not currently implemented.
.Pp
The
.Fl m
option is not currently taken advantage of to speed up the sorting, rather the
presorted input files are sorted all over again.