summaryrefslogtreecommitdiffstats
path: root/man-pages-posix-2013/man1p/sort.1p
diff options
context:
space:
mode:
Diffstat (limited to 'man-pages-posix-2013/man1p/sort.1p')
-rw-r--r--man-pages-posix-2013/man1p/sort.1p776
1 files changed, 776 insertions, 0 deletions
diff --git a/man-pages-posix-2013/man1p/sort.1p b/man-pages-posix-2013/man1p/sort.1p
new file mode 100644
index 0000000..a8cde07
--- /dev/null
+++ b/man-pages-posix-2013/man1p/sort.1p
@@ -0,0 +1,776 @@
+'\" et
+.TH SORT "1P" 2013 "IEEE/The Open Group" "POSIX Programmer's Manual"
+.SH PROLOG
+This manual page is part of the POSIX Programmer's Manual.
+The Linux implementation of this interface may differ (consult
+the corresponding Linux manual page for details of Linux behavior),
+or the interface may not be implemented on Linux.
+
+.SH NAME
+sort
+\(em sort, merge, or sequence check text files
+.SH SYNOPSIS
+.LP
+.nf
+sort \fB[\fR\(mim\fB] [\fR\(mio \fIoutput\fB] [\fR\(mibdfinru\fB] [\fR\(mit \fIchar\fB] [\fR\(mik \fIkeydef\fB]\fR... \fB[\fIfile\fR...\fB]\fR
+.P
+sort \fB[\fR\(mic|\(miC\fB] [\fR\(mibdfinru\fB] [\fR\(mit \fIchar\fB] [\fR\(mik \fIkeydef\fB] [\fIfile\fB]\fR
+.fi
+.SH DESCRIPTION
+The
+.IR sort
+utility shall perform one of the following functions:
+.IP " 1." 4
+Sort lines of all the named files together and write the result to
+the specified output.
+.IP " 2." 4
+Merge lines of all the named (presorted) files together and write the
+result to the specified output.
+.IP " 3." 4
+Check that a single input file is correctly presorted.
+.P
+Comparisons shall be based on one or more sort keys extracted from each
+line of input (or, if no sort keys are specified, the entire line up
+to, but not including, the terminating
+<newline>),
+and shall be performed using the collating sequence of the current
+locale.
+.SH OPTIONS
+The
+.IR sort
+utility shall conform to the Base Definitions volume of POSIX.1\(hy2008,
+.IR "Section 12.2" ", " "Utility Syntax Guidelines",
+except for Guideline 9, and the
+.BR \(mik
+.IR keydef
+option should follow the
+.BR \(mib ,
+.BR \(mid ,
+.BR \(mif ,
+.BR \(mii ,
+.BR \(min ,
+and
+.BR \(mir
+options. In addition,
+.BR '\(pl'
+may be recognized as an option delimiter as well as
+.BR '\(mi' .
+.P
+The following options shall be supported:
+.IP "\fB\(mic\fP" 10
+Check that the single input file is ordered as specified by the
+arguments and the collating sequence of the current locale. Output
+shall not be sent to standard output. The exit code shall indicate
+whether or not disorder was detected or an error occurred. If
+disorder (or, with
+.BR \(miu ,
+a duplicate key) is detected, a warning message shall be sent to
+standard error indicating where the disorder or duplicate key
+was found.
+.IP "\fB\(miC\fP" 10
+Same as
+.BR \(mic ,
+except that a warning message shall not be sent to standard error
+if disorder or, with
+.BR \(miu ,
+a duplicate key is detected.
+.IP "\fB\(mim\fP" 10
+Merge only; the input file shall be assumed to be already sorted.
+.IP "\fB\(mio\ \fIoutput\fR" 10
+Specify the name of an output file to be used instead of the standard
+output. This file can be the same as one of the input
+.IR file s.
+.IP "\fB\(miu\fP" 10
+Unique: suppress all but one in each set of lines having equal keys.
+If used with the
+.BR \(mic
+option, check that there are no lines with duplicate keys, in addition
+to checking that the input file is sorted.
+.P
+The following options shall override the default ordering rules. When
+ordering options appear independent of any key field specifications,
+the requested field ordering rules shall be applied globally to all
+sort keys. When attached to a specific key (see
+.BR \(mik ),
+the specified ordering options shall override all global ordering
+options for that key.
+.IP "\fB\(mid\fP" 10
+Specify that only
+<blank>
+characters and alphanumeric characters, according to the current
+setting of
+.IR LC_CTYPE ,
+shall be significant in comparisons. The behavior is undefined for a
+sort key to which
+.BR \(mii
+or
+.BR \(min
+also applies.
+.IP "\fB\(mif\fP" 10
+Consider all lowercase characters that have uppercase equivalents,
+according to the current setting of
+.IR LC_CTYPE ,
+to be the uppercase equivalent for the purposes of comparison.
+.IP "\fB\(mii\fP" 10
+Ignore all characters that are non-printable, according to the current
+setting of
+.IR LC_CTYPE .
+The behavior is undefined for a sort key for which
+.BR \(min
+also applies.
+.IP "\fB\(min\fP" 10
+Restrict the sort key to an initial numeric string, consisting of
+optional
+<blank>
+characters, optional minus-sign, and zero or more digits with an
+optional radix character and thousands separators (as defined in the
+current locale), which shall be sorted by arithmetic value. An empty
+digit string shall be treated as zero. Leading zeros and signs on zeros
+shall not affect ordering.
+.IP "\fB\(mir\fP" 10
+Reverse the sense of comparisons.
+.P
+The treatment of field separators can be altered using the options:
+.IP "\fB\(mib\fP" 10
+Ignore leading
+<blank>
+characters when determining the starting and ending positions of a
+restricted sort key. If the
+.BR \(mib
+option is specified before the first
+.BR \(mik
+option, it shall be applied to all
+.BR \(mik
+options. Otherwise, the
+.BR \(mib
+option can be attached independently to each
+.BR \(mik
+.IR field_start
+or
+.IR field_end
+option-argument (see below).
+.IP "\fB\(mit\ \fIchar\fR" 10
+Use
+.IR char
+as the field separator character;
+.IR char
+shall not be considered to be part of a field (although it can be
+included in a sort key). Each occurrence of
+.IR char
+shall be significant (for example, <\fIchar\fR><\fIchar\fR> delimits an
+empty field). If
+.BR \(mit
+is not specified,
+<blank>
+characters shall be used as default field separators; each maximal
+non-empty sequence of
+<blank>
+characters that follows a non-\c
+<blank>
+shall be a field separator.
+.P
+Sort keys can be specified using the options:
+.IP "\fB\(mik\ \fIkeydef\fR" 10
+The
+.IR keydef
+argument is a restricted sort key field definition. The format of this
+definition is:
+.RS 10
+.sp
+.RS 4
+.nf
+\fB
+\fIfield_start\fB[\fItype\fB][\fR,\fIfield_end\fB[\fItype\fB]]\fR
+.fi \fR
+.P
+.RE
+.P
+where
+.IR field_start
+and
+.IR field_end
+define a key field restricted to a portion of the line (see the
+EXTENDED DESCRIPTION section), and
+.IR type
+is a modifier from the list of characters
+.BR 'b' ,
+.BR 'd' ,
+.BR 'f' ,
+.BR 'i' ,
+.BR 'n' ,
+.BR 'r' .
+The
+.BR 'b'
+modifier shall behave like the
+.BR \(mib
+option, but shall apply only to the
+.IR field_start
+or
+.IR field_end
+to which it is attached. The other modifiers shall behave like the
+corresponding options, but shall apply only to the key field to which
+they are attached; they shall have this effect if specified with
+.IR field_start ,
+.IR field_end ,
+or both. If any modifier is attached to a
+.IR field_start
+or to a
+.IR field_end ,
+no option shall apply to either. Implementations shall support at
+least nine occurrences of the
+.BR \(mik
+option, which shall be significant in command line order. If no
+.BR \(mik
+option is specified, a default sort key of the entire line shall be
+used.
+.P
+When there are multiple key fields, later keys shall be compared only
+after all earlier keys compare equal. Except when the
+.BR \(miu
+option is specified, lines that otherwise compare equal shall be
+ordered as if none of the options
+.BR \(mid ,
+.BR \(mif ,
+.BR \(mii ,
+.BR \(min ,
+or
+.BR \(mik
+were present (but with
+.BR \(mir
+still in effect, if it was specified) and with all bytes in the lines
+significant to the comparison. The order in which lines that still
+compare equal are written is unspecified.
+.RE
+.SH OPERANDS
+The following operand shall be supported:
+.IP "\fIfile\fR" 10
+A pathname of a file to be sorted, merged, or checked. If no
+.IR file
+operands are specified, or if a
+.IR file
+operand is
+.BR '\(mi' ,
+the standard input shall be used.
+.SH STDIN
+The standard input shall be used only if no
+.IR file
+operands are specified, or if a
+.IR file
+operand is
+.BR '\(mi' .
+See the INPUT FILES section.
+.SH "INPUT FILES"
+The input files shall be text files, except that the
+.IR sort
+utility shall add a
+<newline>
+to the end of a file ending with an incomplete last line.
+.SH "ENVIRONMENT VARIABLES"
+The following environment variables shall affect the execution of
+.IR sort :
+.IP "\fILANG\fP" 10
+Provide a default value for the internationalization variables that are
+unset or null. (See the Base Definitions volume of POSIX.1\(hy2008,
+.IR "Section 8.2" ", " "Internationalization Variables"
+for the precedence of internationalization variables used to determine
+the values of locale categories.)
+.IP "\fILC_ALL\fP" 10
+If set to a non-empty string value, override the values of all the
+other internationalization variables.
+.IP "\fILC_COLLATE\fP" 10
+.br
+Determine the locale for ordering rules.
+.IP "\fILC_CTYPE\fP" 10
+Determine the locale for the interpretation of sequences of bytes of
+text data as characters (for example, single-byte as opposed to
+multi-byte characters in arguments and input files) and the behavior of
+character classification for the
+.BR \(mib ,
+.BR \(mid ,
+.BR \(mif ,
+.BR \(mii ,
+and
+.BR \(min
+options.
+.IP "\fILC_MESSAGES\fP" 10
+.br
+Determine the locale that should be used to affect the format and
+contents of diagnostic messages written to standard error.
+.IP "\fILC_NUMERIC\fP" 10
+.br
+Determine the locale for the definition of the radix character and
+thousands separator for the
+.BR \(min
+option.
+.IP "\fINLSPATH\fP" 10
+Determine the location of message catalogs for the processing of
+.IR LC_MESSAGES .
+.SH "ASYNCHRONOUS EVENTS"
+Default.
+.SH STDOUT
+Unless the
+.BR \(mio
+or
+.BR \(mic
+options are in effect, the standard output shall contain the sorted
+input.
+.SH STDERR
+The standard error shall be used for diagnostic messages. When
+.BR \(mic
+is specified, if disorder is detected (or if
+.BR \(miu
+is also specified and a duplicate key is detected), a message shall
+be written to the standard error which identifies the input line at
+which disorder (or a duplicate key) was detected. A warning
+message about correcting an incomplete last line of an input file
+may be generated, but need not affect the final exit status.
+.SH "OUTPUT FILES"
+If the
+.BR \(mio
+option is in effect, the sorted input shall be written to the file
+.IR output .
+.SH "EXTENDED DESCRIPTION"
+The notation:
+.sp
+.RS 4
+.nf
+\fB
+\(mik \fIfield_start\fB[\fItype\fB][\fR,\fIfield_end\fB[\fItype\fB]]\fR
+.fi \fR
+.P
+.RE
+.P
+shall define a key field that begins at
+.IR field_start
+and ends at
+.IR field_end
+inclusive, unless
+.IR field_start
+falls beyond the end of the line or after
+.IR field_end ,
+in which case the key field is empty. A missing
+.IR field_end
+shall mean the last character of the line.
+.P
+A field comprises a maximal sequence of non-separating characters and,
+in the absence of option
+.BR \(mit ,
+any preceding field separator.
+.P
+The
+.IR field_start
+portion of the
+.IR keydef
+option-argument shall have the form:
+.sp
+.RS 4
+.nf
+\fB
+\fIfield_number\fB[\fR.\fIfirst_character\fB]\fR
+.fi \fR
+.P
+.RE
+.P
+Fields and characters within fields shall be numbered starting with 1.
+The
+.IR field_number
+and
+.IR first_character
+pieces, interpreted as positive decimal integers, shall specify the
+first character to be used as part of a sort key. If
+.IR .first_character
+is omitted, it shall refer to the first character of the field.
+.P
+The
+.IR field_end
+portion of the
+.IR keydef
+option-argument shall have the form:
+.sp
+.RS 4
+.nf
+\fB
+\fIfield_number\fB[\fR.\fIlast_character\fB]\fR
+.fi \fR
+.P
+.RE
+.P
+The
+.IR field_number
+shall be as described above for
+.IR field_start.
+The
+.IR last_character
+piece, interpreted as a non-negative decimal integer, shall specify the
+last character to be used as part of the sort key. If
+.IR last_character
+evaluates to zero or
+.IR .last_character
+is omitted, it shall refer to the last character of the field specified
+by
+.IR field_number .
+.P
+If the
+.BR \(mib
+option or
+.BR b
+type modifier is in effect, characters within a field shall be counted
+from the first non-\c
+<blank>
+in the field. (This shall apply separately to
+.IR first_character
+and
+.IR last_character .)
+.SH "EXIT STATUS"
+The following exit values shall be returned:
+.IP "\00" 6
+All input files were output successfully, or
+.BR \(mic
+was specified and the input file was correctly sorted.
+.IP "\01" 6
+Under the
+.BR \(mic
+option, the file was not ordered as specified, or if the
+.BR \(mic
+and
+.BR \(miu
+options were both specified, two input lines were found with equal
+keys.
+.IP >1 6
+An error occurred.
+.SH "CONSEQUENCES OF ERRORS"
+Default.
+.LP
+.IR "The following sections are informative."
+.SH "APPLICATION USAGE"
+The default value for
+.BR \(mit ,
+<blank>,
+has different properties from, for example,
+.BR \(mit \c
+"<space>". If a line contains:
+.sp
+.RS 4
+.nf
+\fB
+<space><space>foo
+.fi \fR
+.P
+.RE
+.P
+the following treatment would occur with default separation as opposed
+to specifically selecting a
+<space>:
+.TS
+center box tab(@);
+cB | cB | cB
+n | l | l.
+Field@Default@\(mit "<space>"
+_
+1@<space><space>foo@\fIempty\fP
+2@\fIempty\fP@\fIempty\fP
+3@\fIempty\fP@foo
+.TE
+.P
+The leading field separator itself is included in a field when
+.BR \(mit
+is not used. For example, this command returns an exit status of zero,
+meaning the input was already sorted:
+.sp
+.RS 4
+.nf
+\fB
+sort \(mic \(mik 2 <<eof
+y<tab>b
+x<space>a
+eof
+.fi \fR
+.P
+.RE
+.P
+(assuming that a
+<tab>
+precedes the
+<space>
+in the current collating sequence). The field separator is not included
+in a field when it is explicitly set via
+.BR \(mit .
+This is historical practice and allows usage such as:
+.sp
+.RS 4
+.nf
+\fB
+sort \(mit "|" \(mik 2n <<eof
+Atlanta|425022|Georgia
+Birmingham|284413|Alabama
+Columbia|100385|South Carolina
+eof
+.fi \fR
+.P
+.RE
+.P
+where the second field can be correctly sorted numerically without
+regard to the non-numeric field separator.
+.P
+The wording in the OPTIONS section clarifies that the
+.BR \(mib ,
+.BR \(mid ,
+.BR \(mif ,
+.BR \(mii ,
+.BR \(min ,
+and
+.BR \(mir
+options have to come before the first sort key specified if they are
+intended to apply to all specified keys. The way it is described in
+\&this volume of POSIX.1\(hy2008 matches historical practice, not historical documentation.
+The results are unspecified if these options are specified after a
+.BR \(mik
+option.
+.P
+The
+.BR \(mif
+option might not work as expected in locales where there is not a
+one-to-one mapping between an uppercase and a lowercase letter.
+.SH EXAMPLES
+.IP " 1." 4
+The following command sorts the contents of
+.BR infile
+with the second field as the sort key:
+.RS 4
+.sp
+.RS 4
+.nf
+\fB
+sort \(mik 2,2 infile
+.fi \fR
+.P
+.RE
+.RE
+.IP " 2." 4
+The following command sorts, in reverse order, the contents of
+.BR infile1
+and
+.BR infile2 ,
+placing the output in
+.BR outfile
+and using the second character of the second field as the sort key
+(assuming that the first character of the second field is the field
+separator):
+.RS 4
+.sp
+.RS 4
+.nf
+\fB
+sort \(mir \(mio outfile \(mik 2.2,2.2 infile1 infile2
+.fi \fR
+.P
+.RE
+.RE
+.IP " 3." 4
+The following command sorts the contents of
+.BR infile1
+and
+.BR infile2
+using the second non-\c
+<blank>
+of the second field as the sort key:
+.RS 4
+.sp
+.RS 4
+.nf
+\fB
+sort \(mik 2.2b,2.2b infile1 infile2
+.fi \fR
+.P
+.RE
+.RE
+.IP " 4." 4
+The following command prints the System\ V password file (user
+database) sorted by the numeric user ID (the third
+<colon>-separated
+field):
+.RS 4
+.sp
+.RS 4
+.nf
+\fB
+sort \(mit : \(mik 3,3n /etc/passwd
+.fi \fR
+.P
+.RE
+.RE
+.IP " 5." 4
+The following command prints the lines of the already sorted file
+.BR infile ,
+suppressing all but one occurrence of lines having the same third
+field:
+.RS 4
+.sp
+.RS 4
+.nf
+\fB
+sort \(mium \(mik 3.1,3.0 infile
+.fi \fR
+.P
+.RE
+.RE
+.SH RATIONALE
+Examples in some historical documentation state that options
+.BR \(mium
+with one input file keep the first in each set of lines with equal
+keys. This behavior was deemed to be an implementation artifact and
+was not standardized.
+.P
+The
+.BR \(miz
+option was omitted; it is not standard practice on most systems and is
+inconsistent with using
+.IR sort
+to sort several files individually and then merge them together. The
+text concerning
+.BR \(miz
+in historical documentation appeared to require implementations to
+determine the proper buffer length during the sort phase of operation,
+but not during the merge.
+.P
+The
+.BR \(miy
+option was omitted because of non-portability. The
+.BR \(miM
+option, present in System V, was omitted because of non-portability in
+international usage.
+.P
+An undocumented
+.BR \(miT
+option exists in some implementations. It is used to specify a
+directory for intermediate files. Implementations are encouraged to
+support the use of the
+.IR TMPDIR
+environment variable instead of adding an option to support this
+functionality.
+.P
+The
+.BR \(mik
+option was added to satisfy two objections. First, the zero-based
+counting used by
+.IR sort
+is not consistent with other utility conventions. Second, it did not
+meet syntax guideline requirements.
+.P
+Historical documentation indicates that ``setting
+.BR \(min
+implies
+.BR \(mib ''.
+The description of
+.BR \(min
+already states that optional leading <blank>s are tolerated in doing
+the comparison. If
+.BR \(mib
+is enabled, rather than implied, by
+.BR \(min ,
+this has unusual side-effects. When a character offset is used in a
+column of numbers (for example, to sort modulo 100), that offset is
+measured relative to the most significant digit, not to the column.
+Based upon a recommendation from the author of the original
+.IR sort
+utility, the
+.BR \(mib
+implication has been omitted from this volume of POSIX.1\(hy2008, and an application wishing to
+achieve the previously mentioned side-effects has to code the
+.BR \(mib
+flag explicitly.
+.P
+Earlier versions of this standard allowed the
+.BR \(mio
+option to appear after operands. Historical practice allowed all
+options to be interspersed with operands. This version of the
+standard allows implementations to accept options after operands
+but conforming applications should not use this form.
+.P
+Earlier versions of this standard also allowed the
+.BR \(mi \c
+.IR number
+and
+.BR \(pl \c
+.IR number
+options. These options are no longer specified by POSIX.1\(hy2008 but may
+be present in some implementations.
+.P
+Historical implementations produced a message on standard error when
+.BR \(mic
+was specified and disorder was detected, and when
+.BR \(mic
+and
+.BR \(miu
+were specified and a duplicate key was detected. An earlier version of
+this standard contained wording that did not make it clear that this
+message was allowed and some implementations removed this message to
+be sure that they conformed to the standard's requirements. Confronted
+with this difference in behavior, interactive users that wanted to be
+sure that they got visual feedback instead of just exit code 1 could
+have used a command like:
+.sp
+.RS 4
+.nf
+\fB
+sort \(mic file || echo disorder
+.fi \fR
+.P
+.RE
+.P
+whether or not the
+.IR sort
+utility provided a message in this case. But, it was not easy for a user
+to find where the disorder or duplicate key occurred on implementations
+that do not produce a message, especially when some parts of the input
+line were not part of the key and when one or more of the
+.BR \(mib ,
+.BR \(mid ,
+.BR \(mif ,
+.BR \(mii ,
+.BR \(min ,
+or
+.BR \(mi r
+options or
+.IR keydef
+type modifiers were in use. POSIX.1\(hy2008 requires a message to be
+produced in this case. POSIX.1\(hy2008 also contains the
+.BR \(miC
+option giving users the ability to choose either behavior.
+.P
+When a disorder or duplicate is found when the
+.BR \(mic
+option is specified, some implementations print a message containing
+the first line that is out of order or contains a duplicate key; others
+print a message specifying the line number of the offending line. This
+standard allows either type of message.
+.SH "FUTURE DIRECTIONS"
+None.
+.SH "SEE ALSO"
+.IR "\fIcomm\fR\^",
+.IR "\fIjoin\fR\^",
+.IR "\fIuniq\fR\^"
+.P
+The Base Definitions volume of POSIX.1\(hy2008,
+.IR "Chapter 8" ", " "Environment Variables",
+.IR "Section 12.2" ", " "Utility Syntax Guidelines"
+.P
+The System Interfaces volume of POSIX.1\(hy2008,
+.IR "\fItoupper\fR\^(\|)"
+.SH COPYRIGHT
+Portions of this text are reprinted and reproduced in electronic form
+from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
+-- Portable Operating System Interface (POSIX), The Open Group Base
+Specifications Issue 7, Copyright (C) 2013 by the Institute of
+Electrical and Electronics Engineers, Inc and The Open Group.
+(This is POSIX.1-2008 with the 2013 Technical Corrigendum 1 applied.) In the
+event of any discrepancy between this version and the original IEEE and
+The Open Group Standard, the original IEEE and The Open Group Standard
+is the referee document. The original Standard can be obtained online at
+http://www.unix.org/online.html .
+
+Any typographical or formatting errors that appear
+in this page are most likely
+to have been introduced during the conversion of the source files to
+man page format. To report such errors, see
+https://www.kernel.org/doc/man-pages/reporting_bugs.html .