summaryrefslogtreecommitdiffstats
path: root/man-pages-posix-2017/man1p/uniq.1p
diff options
context:
space:
mode:
Diffstat (limited to 'man-pages-posix-2017/man1p/uniq.1p')
-rw-r--r--man-pages-posix-2017/man1p/uniq.1p408
1 files changed, 408 insertions, 0 deletions
diff --git a/man-pages-posix-2017/man1p/uniq.1p b/man-pages-posix-2017/man1p/uniq.1p
new file mode 100644
index 0000000..964ca0d
--- /dev/null
+++ b/man-pages-posix-2017/man1p/uniq.1p
@@ -0,0 +1,408 @@
+'\" et
+.TH UNIQ "1P" 2017 "IEEE/The Open Group" "POSIX Programmer's Manual"
+.\"
+.SH PROLOG
+This manual page is part of the POSIX Programmer's Manual.
+The Linux implementation of this interface may differ (consult
+the corresponding Linux manual page for details of Linux behavior),
+or the interface may not be implemented on Linux.
+.\"
+.SH NAME
+uniq
+\(em report or filter out repeated lines in a file
+.SH SYNOPSIS
+.LP
+.nf
+uniq \fB[\fR-c|-d|-u\fB] [\fR-f \fIfields\fB] [\fR-s \fIchar\fB] [\fIinput_file \fB[\fIoutput_file\fB]]\fR
+.fi
+.SH DESCRIPTION
+The
+.IR uniq
+utility shall read an input file comparing adjacent lines, and write
+one copy of each input line on the output. The second and succeeding
+copies of repeated adjacent input lines shall not be written.
+The trailing
+<newline>
+of each line in the input shall be ignored when doing comparisons.
+.P
+Repeated lines in the input shall not be detected if they are not
+adjacent.
+.SH OPTIONS
+The
+.IR uniq
+utility shall conform to the Base Definitions volume of POSIX.1\(hy2017,
+.IR "Section 12.2" ", " "Utility Syntax Guidelines",
+except that
+.BR '\(pl'
+may be recognized as an option delimiter as well as
+.BR '\-' .
+.P
+The following options shall be supported:
+.IP "\fB\-c\fP" 10
+Precede each output line with a count of the number of times the line
+occurred in the input.
+.IP "\fB\-d\fP" 10
+Suppress the writing of lines that are not repeated in the input.
+.IP "\fB\-f\ \fIfields\fR" 10
+Ignore the first
+.IR fields
+fields on each input line when doing comparisons, where
+.IR fields
+is a positive decimal integer. A field is the maximal string matched
+by the basic regular expression:
+.RS 10
+.sp
+.RS 4
+.nf
+
+[[:blank:]]*[\(ha[:blank:]]*
+.fi
+.P
+.RE
+.P
+If the
+.IR fields
+option-argument specifies more fields than appear on an input line, a
+null string shall be used for comparison.
+.RE
+.IP "\fB\-s\ \fIchars\fR" 10
+Ignore the first
+.IR chars
+characters when doing comparisons, where
+.IR chars
+shall be a positive decimal integer. If specified in conjunction with
+the
+.BR \-f
+option, the first
+.IR chars
+characters after the first
+.IR fields
+fields shall be ignored. If the
+.IR chars
+option-argument specifies more characters than remain on an input line,
+a null string shall be used for comparison.
+.IP "\fB\-u\fP" 10
+Suppress the writing of lines that are repeated in the input.
+.SH OPERANDS
+The following operands shall be supported:
+.IP "\fIinput_file\fR" 10
+A pathname of the input file. If the
+.IR input_file
+operand is not specified, or if the
+.IR input_file
+is
+.BR '\-' ,
+the standard input shall be used.
+.IP "\fIoutput_file\fR" 10
+A pathname of the output file. If the
+.IR output_file
+operand is not specified, the standard output shall be used. The
+results are unspecified if the file named by
+.IR output_file
+is the file named by
+.IR input_file .
+.SH STDIN
+The standard input shall be used only if no
+.IR input_file
+operand is specified or if
+.IR input_file
+is
+.BR '\-' .
+See the INPUT FILES section.
+.SH "INPUT FILES"
+The input file shall be a text file.
+.SH "ENVIRONMENT VARIABLES"
+The following environment variables shall affect the execution of
+.IR uniq :
+.IP "\fILANG\fP" 10
+Provide a default value for the internationalization variables that are
+unset or null. (See the Base Definitions volume of POSIX.1\(hy2017,
+.IR "Section 8.2" ", " "Internationalization Variables"
+for the precedence of internationalization variables used to determine
+the values of locale categories.)
+.IP "\fILC_ALL\fP" 10
+If set to a non-empty string value, override the values of all the
+other internationalization variables.
+.IP "\fILC_CTYPE\fP" 10
+Determine the locale for the interpretation of sequences of bytes of
+text data as characters (for example, single-byte as opposed to
+multi-byte characters in arguments and input files) and which
+characters constitute a
+<blank>
+in the current locale.
+.IP "\fILC_MESSAGES\fP" 10
+.br
+Determine the locale that should be used to affect the format and
+contents of diagnostic messages written to standard error.
+.IP "\fINLSPATH\fP" 10
+Determine the location of message catalogs for the processing of
+.IR LC_MESSAGES .
+.SH "ASYNCHRONOUS EVENTS"
+Default.
+.SH STDOUT
+The standard output shall be used if no
+.IR output_file
+operand is specified, and shall be used if the
+.IR output_file
+operand is
+.BR '\-'
+and the implementation treats the
+.BR '\-'
+as meaning standard output. Otherwise, the standard output shall
+not be used.
+See the OUTPUT FILES section.
+.SH STDERR
+The standard error shall be used only for diagnostic messages.
+.SH "OUTPUT FILES"
+If the
+.BR \-c
+option is specified, the output file shall be empty or each line
+shall be of the form:
+.sp
+.RS 4
+.nf
+
+"%d %s", <\fInumber of duplicates\fR>, <\fIline\fR>
+.fi
+.P
+.RE
+.P
+otherwise, the output file shall be empty or each line shall be
+of the form:
+.sp
+.RS 4
+.nf
+
+"%s", <\fIline\fR>
+.fi
+.P
+.RE
+.SH "EXTENDED DESCRIPTION"
+None.
+.SH "EXIT STATUS"
+The following exit values shall be returned:
+.IP "\00" 6
+The utility executed successfully.
+.IP >0 6
+An error occurred.
+.SH "CONSEQUENCES OF ERRORS"
+Default.
+.LP
+.IR "The following sections are informative."
+.SH "APPLICATION USAGE"
+If the collating sequence of the current locale has a total ordering
+of all characters, the
+.IR sort
+utility can be used to cause repeated lines to be adjacent in the input
+file. If the collating sequence does not have a total ordering of all
+characters, the
+.IR sort
+utility should still do this but it might not. To ensure that all
+duplicate lines are eliminated, and have the output sorted according
+the collating sequence of the current locale, applications should use:
+.sp
+.RS 4
+.nf
+
+LC_ALL=C sort -u | sort
+.fi
+.P
+.RE
+.P
+instead of:
+.sp
+.RS 4
+.nf
+
+sort | uniq
+.fi
+.P
+.RE
+.P
+To remove duplicate lines based on whether they collate equally
+instead of whether they are identical, applications should use:
+.sp
+.RS 4
+.nf
+
+sort -u
+.fi
+.P
+.RE
+.P
+instead of:
+.sp
+.RS 4
+.nf
+
+sort | uniq
+.fi
+.P
+.RE
+.P
+When using
+.IR uniq
+to process pathnames, it is recommended that LC_ALL, or at least
+LC_CTYPE and LC_COLLATE, are set to POSIX or C in the environment,
+since pathnames can contain byte sequences that do not form valid
+characters in some locales, in which case the utility's behavior would
+be undefined. In the POSIX locale each byte is a valid single-byte
+character, and therefore this problem is avoided.
+.SH EXAMPLES
+The following input file data (but flushed left) was used for a test
+series on
+.IR uniq :
+.sp
+.RS 4
+.nf
+
+#01 foo0 bar0 foo1 bar1
+#02 bar0 foo1 bar1 foo1
+#03 foo0 bar0 foo1 bar1
+#04
+#05 foo0 bar0 foo1 bar1
+#06 foo0 bar0 foo1 bar1
+#07 bar0 foo1 bar1 foo0
+.fi
+.P
+.RE
+.P
+What follows is a series of test invocations of the
+.IR uniq
+utility that use a mixture of
+.IR uniq
+options against the input file data. These tests verify the meaning of
+.IR adjacent .
+The
+.IR uniq
+utility views the input data as a sequence of strings delimited by
+.BR '\en' .
+Accordingly, for the
+.IR fields th
+member of the sequence,
+.IR uniq
+interprets unique or repeated adjacent lines strictly relative to the
+.IR fields +1th
+member.
+.IP " 1." 4
+This first example tests the line counting option, comparing each line
+of the input file data starting from the second field:
+.RS 4
+.sp
+.RS 4
+.nf
+
+uniq -c -f 1 uniq_0I.t
+ 1 #01 foo0 bar0 foo1 bar1
+ 1 #02 bar0 foo1 bar1 foo1
+ 1 #03 foo0 bar0 foo1 bar1
+ 1 #04
+ 2 #05 foo0 bar0 foo1 bar1
+ 1 #07 bar0 foo1 bar1 foo0
+.fi
+.P
+.RE
+.P
+The number
+.BR '2' ,
+prefixing the fifth line of output, signifies that the
+.IR uniq
+utility detected a pair of repeated lines. Given the input data, this
+can only be true when
+.IR uniq
+is run using the
+.BR "\-f\ 1"
+option (which shall cause
+.IR uniq
+to ignore the first field on each input line).
+.RE
+.IP " 2." 4
+The second example tests the option to suppress unique lines, comparing
+each line of the input file data starting from the second field:
+.RS 4
+.sp
+.RS 4
+.nf
+
+uniq -d -f 1 uniq_0I.t
+#05 foo0 bar0 foo1 bar1
+.fi
+.P
+.RE
+.RE
+.IP " 3." 4
+This test suppresses repeated lines, comparing each line of the input
+file data starting from the second field:
+.RS 4
+.sp
+.RS 4
+.nf
+
+uniq -u -f 1 uniq_0I.t
+#01 foo0 bar0 foo1 bar1
+#02 bar0 foo1 bar1 foo1
+#03 foo0 bar0 foo1 bar1
+#04
+#07 bar0 foo1 bar1 foo0
+.fi
+.P
+.RE
+.RE
+.IP " 4." 4
+This suppresses unique lines, comparing each line of the input file
+data starting from the third character:
+.RS 4
+.sp
+.RS 4
+.nf
+
+uniq -d -s 2 uniq_0I.t
+.fi
+.P
+.RE
+.P
+In the last example, the
+.IR uniq
+utility found no input matching the above criteria.
+.RE
+.SH RATIONALE
+Some historical implementations have limited lines to be 1\|080 bytes
+in length, which does not meet the implied
+{LINE_MAX}
+limit.
+.P
+Earlier versions of this standard allowed the
+.BR \- \c
+.IR number
+and
+.BR \(pl \c
+.IR number
+options. These options are no longer specified by POSIX.1\(hy2008 but
+may be present in some implementations.
+.SH "FUTURE DIRECTIONS"
+None.
+.SH "SEE ALSO"
+.IR "\fIcomm\fR\^",
+.IR "\fIsort\fR\^"
+.P
+The Base Definitions volume of POSIX.1\(hy2017,
+.IR "Chapter 8" ", " "Environment Variables",
+.IR "Section 12.2" ", " "Utility Syntax Guidelines"
+.\"
+.SH COPYRIGHT
+Portions of this text are reprinted and reproduced in electronic form
+from IEEE Std 1003.1-2017, Standard for Information Technology
+-- Portable Operating System Interface (POSIX), The Open Group Base
+Specifications Issue 7, 2018 Edition,
+Copyright (C) 2018 by the Institute of
+Electrical and Electronics Engineers, Inc and The Open Group.
+In the event of any discrepancy between this version and the original IEEE and
+The Open Group Standard, the original IEEE and The Open Group Standard
+is the referee document. The original Standard can be obtained online at
+http://www.opengroup.org/unix/online.html .
+.PP
+Any typographical or formatting errors that appear
+in this page are most likely
+to have been introduced during the conversion of the source files to
+man page format. To report such errors, see
+https://www.kernel.org/doc/man-pages/reporting_bugs.html .