Diffstat (limited to 'man2/seccomp.2')
-rw-r--r-- man2/seccomp.2 141
1 files changed, 71 insertions, 70 deletions
diff --git a/man2/seccomp.2 b/man2/seccomp.2
index 25376c3bf..c356b0b32 100644
--- a/man2/seccomp.2
+++ b/man2/seccomp.2
@@ -32,7 +32,7 @@ Standard C library
glibc provides no wrapper for
.BR seccomp (),
necessitating the use of
-.BR syscall (2).
+.MR syscall 2 .
.SH DESCRIPTION
The
.BR seccomp ()
@@ -45,13 +45,13 @@ values:
.TP
.B SECCOMP_SET_MODE_STRICT
The only system calls that the calling thread is permitted to make are
-.BR read (2),
-.BR write (2),
-.BR _exit (2)
+.MR read 2 ,
+.MR write 2 ,
+.MR _exit 2
(but not
-.BR exit_group (2)),
+.MR exit_group 2 ),
and
-.BR sigreturn (2).
+.MR sigreturn 2 .
Other system calls result in the termination of the calling thread,
or termination of the entire process with the
.B SIGKILL
@@ -61,21 +61,21 @@ applications that may need to execute untrusted byte code, perhaps
obtained by reading from a pipe or socket.
.IP
Note that although the calling thread can no longer call
-.BR sigprocmask (2),
+.MR sigprocmask 2 ,
it can use
-.BR sigreturn (2)
+.MR sigreturn 2
to block all signals apart from
.B SIGKILL
and
.BR SIGSTOP .
This means that
-.BR alarm (2)
+.MR alarm 2
(for example) is not sufficient for restricting the process's execution time.
Instead, to reliably terminate the process,
.B SIGKILL
must be used.
This can be done by using
-.BR timer_create (2)
+.MR timer_create 2
with
.B SIGEV_SIGNAL
and
@@ -83,7 +83,7 @@ and
set to
.BR SIGKILL ,
or by using
-.BR setrlimit (2)
+.MR setrlimit 2
to set the hard limit for
.BR RLIMIT_CPU .
.IP
@@ -121,16 +121,16 @@ in
.IR errno .
.IP
If
-.BR fork (2)
+.MR fork 2
or
-.BR clone (2)
+.MR clone 2
is allowed by the filter, any child processes will be constrained to
the same system call filters as the parent.
If
-.BR execve (2)
+.MR execve 2
is allowed,
the existing filters will be preserved across a call to
-.BR execve (2).
+.MR execve 2 .
.IP
In order to use the
.B SECCOMP_SET_MODE_FILTER
@@ -157,10 +157,10 @@ in
This requirement ensures that an unprivileged process cannot apply
a malicious filter and then invoke a set-user-ID or
other privileged program using
-.BR execve (2),
+.MR execve 2 ,
thus potentially compromising that program.
(Such a malicious filter might, for example, cause an attempt to use
-.BR setuid (2)
+.MR setuid 2
to set the caller's user IDs to nonzero values to instead
return 0 without actually making the system call.
Thus, the program might be tricked into retaining superuser privileges
@@ -168,7 +168,7 @@ in circumstances where it is possible to influence it to do
dangerous things because it did not actually drop privileges.)
.IP
If
-.BR prctl (2)
+.MR prctl 2
or
.BR seccomp ()
is allowed by the attached filter, further filters may be added.
@@ -220,7 +220,7 @@ At most one seccomp filter using the
flag can be installed for a thread.
.IP
See
-.BR seccomp_unotify (2)
+.MR seccomp_unotify 2
for further details.
.TP
.BR SECCOMP_FILTER_FLAG_SPEC_ALLOW " (since Linux 4.17)"
@@ -282,7 +282,7 @@ struct seccomp_notif_sizes
.EE
.IP
See
-.BR seccomp_unotify (2)
+.MR seccomp_unotify 2
for further details.
.\"
.SS Filters
@@ -341,7 +341,7 @@ Because numbering of system calls varies between architectures and
some architectures (e.g., x86-64) allow user-space code to use
the calling conventions of multiple architectures
(and the convention being used may vary over the life of a process that uses
-.BR execve (2)
+.MR execve 2
to execute binaries that employ the different conventions),
it is usually necessary to verify the value of the
.I arch
@@ -406,7 +406,7 @@ For example,
== (101 |
.BR __X32_SYSCALL_BIT )
would result in invocations of
-.BR ptrace (2)
+.MR ptrace 2
with potentially confused x32-vs-x86_64 semantics in the kernel.
Policies intended to work on kernels before Linux 5.4 must ensure that they
deny or otherwise correctly handle these system calls.
@@ -425,9 +425,9 @@ This might be useful in conjunction with the use of
to perform checks based on which region (mapping) of the program
made the system call.
(Probably, it is wise to lock down the
-.BR mmap (2)
+.MR mmap 2
and
-.BR mprotect (2)
+.MR mprotect 2
system calls to prevent the program from subverting such checks.)
.P
When checking values from
@@ -492,7 +492,7 @@ below, all threads in the thread group are terminated.
(For a discussion of thread groups, see the description of the
.B CLONE_THREAD
flag in
-.BR clone (2).)
+.MR clone 2 .)
.IP
The process terminates
.I "as though"
@@ -503,7 +503,7 @@ Even if a signal handler has been registered for
.BR SIGSYS ,
the handler will be ignored in this case and the process always terminates.
To a parent process that is waiting on this process (using
-.BR waitpid (2)
+.MR waitpid 2
or similar), the returned
.I wstatus
will indicate that its child was terminated as though by a
@@ -535,7 +535,7 @@ any process terminated in this way would not trigger a coredump
(even though
.B SIGSYS
is documented in
-.BR signal (7)
+.MR signal 7
as having a default action of termination with a core dump).
Since Linux 4.11,
a single-threaded process will dump core if terminated in this way.
@@ -562,7 +562,7 @@ signal to the triggering thread.
Various fields will be set in the
.I siginfo_t
structure (see
-.BR sigaction (2))
+.MR sigaction 2 )
associated with signal:
.RS
.IP \[bu] 3
@@ -616,7 +616,7 @@ flag or because the file descriptor was closed), the filter returns
.B SECCOMP_RET_TRACE
and there is no tracer).
See
-.BR seccomp_unotify (2)
+.MR seccomp_unotify 2
for further details.
.IP
Note that the supervisor process will not be notified
@@ -625,7 +625,7 @@ if another filter returns an action value with a precedence greater than
.TP
.B SECCOMP_RET_TRACE
When returned, this value will cause the kernel to attempt to notify a
-.BR ptrace (2)-based
+.MR ptrace 2 -based
tracer prior to executing the system call.
If there is no tracer present,
the system call is not executed and returns a failure status with
@@ -660,7 +660,7 @@ notified.
(This means that, on older kernels, seccomp-based sandboxes
.B "must not"
allow use of
-.BR ptrace (2)\[em]even
+.MR ptrace 2 \[em]even
of other
sandboxed processes\[em]without extreme care;
ptracers can use this mechanism to escape from the seccomp sandbox.)
@@ -763,7 +763,8 @@ and the action appears in the
file, the action is logged.
.IP \[bu]
Otherwise, if kernel auditing is enabled and the process is being audited
-.RB ( autrace (8)),
+\%(\c
+.MR autrace 8 ),
the action is logged.
.IP \[bu]
Otherwise, the action is not logged.
@@ -777,9 +778,9 @@ was used,
the return value is the ID of the thread
that caused the synchronization failure.
(This ID is a kernel thread ID of the type returned by
-.BR clone (2)
+.MR clone 2
and
-.BR gettid (2).)
+.MR gettid 2 .)
On other errors, \-1 is returned, and
.I errno
is set to indicate the error.
@@ -877,17 +878,17 @@ The
field of the
.IR /proc/ pid /status
file provides a method of viewing the seccomp mode of a process; see
-.BR proc (5).
+.MR proc 5 .
.P
.BR seccomp ()
provides a superset of the functionality provided by the
-.BR prctl (2)
+.MR prctl 2
.B PR_SET_SECCOMP
operation (which does not support
.IR flags ).
.P
Since Linux 4.4, the
-.BR ptrace (2)
+.MR ptrace 2
.B PTRACE_SECCOMP_GET_FILTER
operation can be used to dump a process's seccomp filters.
.\"
@@ -921,17 +922,17 @@ There are various subtleties to consider when applying seccomp filters
to a program, including the following:
.IP \[bu] 3
Some traditional system calls have user-space implementations in the
-.BR vdso (7)
+.MR vdso 7
on many architectures.
Notable examples include
-.BR clock_gettime (2),
-.BR gettimeofday (2),
+.MR clock_gettime 2 ,
+.MR gettimeofday 2 ,
and
-.BR time (2).
+.MR time 2 .
On such architectures,
seccomp filtering for these system calls will have no effect.
(However, there are cases where the
-.BR vdso (7)
+.MR vdso 7
implementations may fall back to invoking the true system call,
in which case seccomp filters would see the system call.)
.IP \[bu]
@@ -945,13 +946,13 @@ Consequently, one must be aware of the following:
The glibc wrappers for some traditional system calls may actually
employ system calls with different names in the kernel.
For example, the
-.BR exit (2)
+.MR exit 2
wrapper function actually employs the
-.BR exit_group (2)
+.MR exit_group 2
system call, and the
-.BR fork (2)
+.MR fork 2
wrapper function actually calls
-.BR clone (2).
+.MR clone 2 .
.IP \[bu]
The behavior of wrapper functions may vary across architectures,
according to the range of system calls provided on those architectures.
@@ -960,10 +961,10 @@ different system calls on different architectures.
.IP \[bu]
Finally, the behavior of wrapper functions can change across glibc versions.
For example, in older versions, the glibc wrapper function for
-.BR open (2)
+.MR open 2
invoked the system call of the same name,
but starting in glibc 2.26, the implementation switched to calling
-.BR openat (2)
+.MR openat 2
on all architectures.
.RE
.P
@@ -1023,9 +1024,9 @@ being set to the specified error number.
The remaining command-line arguments specify
the pathname and additional arguments of a program
that the example program should attempt to execute using
-.BR execv (3)
+.MR execv 3
(a library function that employs the
-.BR execve (2)
+.MR execve 2
system call).
Some example runs of the program are shown below.
.P
@@ -1057,9 +1058,9 @@ EADDRNOTAVAIL 99 Cannot assign requested address
.in
.P
In the following example, we attempt to run the command
-.BR whoami (1),
+.MR whoami 1 ,
but the BPF filter rejects the
-.BR execve (2)
+.MR execve 2
system call, so that the command is not even executed:
.P
.in +4n
@@ -1076,9 +1077,9 @@ execv: Cannot assign requested address
.in
.P
In the next example, the BPF filter rejects the
-.BR write (2)
+.MR write 2
system call, so that, although it is successfully started, the
-.BR whoami (1)
+.MR whoami 1
command is not able to write output:
.P
.in +4n
@@ -1091,7 +1092,7 @@ $ \fB./a.out 1 0xC000003E 99 /bin/whoami\fP
.P
In the final example,
the BPF filter rejects a system call that is not used by the
-.BR whoami (1)
+.MR whoami 1
command, so it is able to successfully execute and produce output:
.P
.in +4n
@@ -1208,26 +1209,26 @@ main(int argc, char *argv[])
.EE
.\" SRC END
.SH SEE ALSO
-.BR bpfc (1),
-.BR strace (1),
-.BR bpf (2),
-.BR prctl (2),
-.BR ptrace (2),
-.BR seccomp_unotify (2),
-.BR sigaction (2),
-.BR proc (5),
-.BR signal (7),
-.BR socket (7)
+.MR bpfc 1 ,
+.MR strace 1 ,
+.MR bpf 2 ,
+.MR prctl 2 ,
+.MR ptrace 2 ,
+.MR seccomp_unotify 2 ,
+.MR sigaction 2 ,
+.MR proc 5 ,
+.MR signal 7 ,
+.MR socket 7
.P
Various pages from the
.I libseccomp
library, including:
-.BR scmp_sys_resolver (1),
-.BR seccomp_export_bpf (3),
-.BR seccomp_init (3),
-.BR seccomp_load (3),
+.MR scmp_sys_resolver 1 ,
+.MR seccomp_export_bpf 3 ,
+.MR seccomp_init 3 ,
+.MR seccomp_load 3 ,
and
-.BR seccomp_rule_add (3).
+.MR seccomp_rule_add 3 .
.P
The kernel source files
.I Documentation/networking/filter.txt