Diffstat (limited to 'man7/user_namespaces.7')
-rw-r--r-- man7/user_namespaces.7 226
1 files changed, 115 insertions, 111 deletions
diff --git a/man7/user_namespaces.7 b/man7/user_namespaces.7
index 290771af2..190b51841 100644
--- a/man7/user_namespaces.7
+++ b/man7/user_namespaces.7
@@ -9,19 +9,19 @@
user_namespaces \- overview of Linux user namespaces
.SH DESCRIPTION
For an overview of namespaces, see
-.BR namespaces (7).
+.MR namespaces 7 .
.P
User namespaces isolate security-related identifiers and attributes,
in particular,
user IDs and group IDs (see
-.BR credentials (7)),
+.MR credentials 7 ),
the root directory,
keys (see
-.BR keyrings (7)),
+.MR keyrings 7 ),
.\" FIXME: This page says very little about the interaction
.\" of user namespaces and keys. Add something on this topic.
and capabilities (see
-.BR capabilities (7)).
+.MR capabilities 7 ).
A process's user and group IDs can be different
inside and outside a user namespace.
In particular,
@@ -40,9 +40,9 @@ namespace\[em]has a parent user namespace,
and can have zero or more child user namespaces.
The parent user namespace is the user namespace
of the process that creates the user namespace via a call to
-.BR unshare (2)
+.MR unshare 2
or
-.BR clone (2)
+.MR clone 2
with the
.B CLONE_NEWUSER
flag.
@@ -52,45 +52,45 @@ The kernel imposes (since Linux 3.11) a limit of 32 nested levels of
user namespaces.
.\" FIXME Explain the rationale for this limit. (What is the rationale?)
Calls to
-.BR unshare (2)
+.MR unshare 2
or
-.BR clone (2)
+.MR clone 2
that would cause this limit to be exceeded fail with the error
.BR EUSERS .
.P
Each process is a member of exactly one user namespace.
A process created via
-.BR fork (2)
+.MR fork 2
or
-.BR clone (2)
+.MR clone 2
without the
.B CLONE_NEWUSER
flag is a member of the same user namespace as its parent.
A single-threaded process can join another user namespace with
-.BR setns (2)
+.MR setns 2
if it has the
.B CAP_SYS_ADMIN
in that namespace;
upon doing so, it gains a full set of capabilities in that namespace.
.P
A call to
-.BR clone (2)
+.MR clone 2
or
-.BR unshare (2)
+.MR unshare 2
with the
.B CLONE_NEWUSER
flag makes the new child process (for
-.BR clone (2))
+.MR clone 2 )
or the caller (for
-.BR unshare (2))
+.MR unshare 2 )
a member of the new user namespace created by the call.
.P
The
.B NS_GET_PARENT
-.BR ioctl (2)
+.MR ioctl 2
operation can be used to discover the parental relationship
between user namespaces; see
-.BR ioctl_ns (2).
+.MR ioctl_ns 2 .
.P
A task that changes one of its effective IDs
will have its dumpability reset to the value in
@@ -101,44 +101,44 @@ to write to mapping files of child processes running in a new user namespace.
In such cases making the parent process dumpable, using
.B PR_SET_DUMPABLE
in a call to
-.BR prctl (2),
+.MR prctl 2 ,
before creating a child process in a new user namespace
may rectify this problem.
See
-.BR prctl (2)
+.MR prctl 2
and
-.BR proc (5)
+.MR proc 5
for details on how ownership is affected.
.\"
.\" ============================================================
.\"
.SS Capabilities
The child process created by
-.BR clone (2)
+.MR clone 2
with the
.B CLONE_NEWUSER
flag starts out with a complete set
of capabilities in the new user namespace.
Likewise, a process that creates a new user namespace using
-.BR unshare (2)
+.MR unshare 2
or joins an existing user namespace using
-.BR setns (2)
+.MR setns 2
gains a full set of capabilities in that namespace.
On the other hand,
that process has no capabilities in the parent (in the case of
-.BR clone (2))
+.MR clone 2 )
or previous (in the case of
-.BR unshare (2)
+.MR unshare 2
and
-.BR setns (2))
+.MR setns 2 )
user namespace,
even if the new namespace is created or joined by the root user
(i.e., a process with user ID 0 in the root namespace).
.P
Note that a call to
-.BR execve (2)
+.MR execve 2
will cause a process's capabilities to be recalculated in the usual way (see
-.BR capabilities (7)).
+.MR capabilities 7 ).
Consequently,
unless the process has a user ID of 0 within the namespace,
or the executable file has a nonempty inheritable capabilities mask,
@@ -146,30 +146,30 @@ the process will lose all capabilities.
See the discussion of user and group ID mappings, below.
.P
A call to
-.BR clone (2)
+.MR clone 2
or
-.BR unshare (2)
+.MR unshare 2
using the
.B CLONE_NEWUSER
flag
or a call to
-.BR setns (2)
+.MR setns 2
that moves the caller into another user namespace
sets the "securebits" flags
(see
-.BR capabilities (7))
+.MR capabilities 7 )
to their default values (all flags disabled) in the child (for
-.BR clone (2))
+.MR clone 2 )
or caller (for
-.BR unshare (2)
+.MR unshare 2
or
-.BR setns (2)).
+.MR setns 2 ).
Note that because the caller no longer has capabilities
in its original user namespace after a call to
-.BR setns (2),
+.MR setns 2 ,
it is not possible for a process to reset its "securebits" flags while
retaining its user namespace membership by using a pair of
-.BR setns (2)
+.MR setns 2
calls to move to another user namespace and then return to
its original user namespace.
.P
@@ -185,10 +185,10 @@ For example, it may execute a set-user-ID program or an
executable with associated file capabilities.
In addition,
a process may gain capabilities via the effect of
-.BR clone (2),
-.BR unshare (2),
+.MR clone 2 ,
+.MR unshare 2 ,
or
-.BR setns (2),
+.MR setns 2 ,
as already described.
.\" In the 3.8 sources, see security/commoncap.c::cap_capable():
.IP \[bu]
@@ -215,10 +215,10 @@ this means that the process has all capabilities in all
further removed descendant user namespaces as well.
The
.B NS_GET_OWNER_UID
-.BR ioctl (2)
+.MR ioctl 2
operation can be used to discover the user ID of the owner of the namespace;
see
-.BR ioctl_ns (2).
+.MR ioctl_ns 2 .
.\"
.\" ============================================================
.\"
@@ -262,7 +262,7 @@ and mount the following types of filesystems:
.I devpts
(since Linux 3.9)
.IP \[bu]
-.BR tmpfs (5)
+.MR tmpfs 5
(since Linux 3.9)
.IP \[bu]
.I ramfs
@@ -325,22 +325,24 @@ If
is specified along with other
.B CLONE_NEW*
flags in a single
-.BR clone (2)
+.MR clone 2
or
-.BR unshare (2)
+.MR unshare 2
call, the user namespace is guaranteed to be created first,
giving the child
-.RB ( clone (2))
+\%(\c
+.MR clone 2 )
or caller
-.RB ( unshare (2))
+\%(\c
+.MR unshare 2 )
privileges over the remaining namespaces created by the call.
Thus, it is possible for an unprivileged caller to specify this combination
of flags.
.P
When a new namespace (other than a user namespace) is created via
-.BR clone (2)
+.MR clone 2
or
-.BR unshare (2),
+.MR unshare 2 ,
the kernel records the user namespace of the creating process as the owner of
the new namespace.
(This association can't be changed.)
@@ -350,7 +352,8 @@ resources isolated by the namespace,
the permission checks are performed according to the process's capabilities
in the user namespace that the kernel associated with the new namespace.
For example, suppose that a process attempts to change the hostname
-.RB ( sethostname (2)),
+\%(\c
+.MR sethostname 2 ),
a resource governed by the UTS namespace.
In this case,
the kernel will determine which user namespace owns
@@ -361,10 +364,10 @@ in that user namespace.
.P
The
.B NS_GET_USERNS
-.BR ioctl (2)
+.MR ioctl 2
operation can be used to discover the user namespace
that owns a nonuser namespace; see
-.BR ioctl_ns (2).
+.MR ioctl_ns 2 .
.\"
.\" ============================================================
.\"
@@ -450,17 +453,17 @@ The length of the range of user IDs that is mapped between the two
user namespaces.
.P
System calls that return user IDs (group IDs)\[em]for example,
-.BR getuid (2),
-.BR getgid (2),
+.MR getuid 2 ,
+.MR getgid 2 ,
and the credential fields in the structure returned by
-.BR stat (2)\[em]return
+.MR stat 2 \[em]return
the user ID (group ID) mapped into the caller's user namespace.
.P
When a process accesses a file, its user and group IDs
are mapped into the initial user namespace for the purpose of permission
checking and assigning IDs when creating a file.
When a process retrieves file user and group IDs via
-.BR stat (2),
+.MR stat 2 ,
the IDs are mapped in the opposite direction,
to produce values relative to the process user and group ID mappings.
.P
@@ -488,7 +491,7 @@ This leaves 4294967295 (the 32-bit signed \-1 value) unmapped.
This is deliberate:
.I (uid_t)\~\-1
is used in several interfaces (e.g.,
-.BR setreuid (2))
+.MR setreuid 2 )
as a way to specify "no user ID".
Leaving
.I (uid_t)\~\-1
@@ -533,9 +536,9 @@ the limit is 340 lines.
In addition, the number of bytes written to
the file must be less than the system page size,
and the write must be performed at the start of the file (i.e.,
-.BR lseek (2)
+.MR lseek 2
and
-.BR pwrite (2)
+.MR pwrite 2
can't be used to write to nonzero offsets in the file).
.IP \[bu]
The range of user IDs (group IDs)
@@ -601,7 +604,7 @@ a UID 0 process that lacks the
capability,
which is needed to create a binary with namespaced file capabilities
(as described in
-.BR capabilities (7)),
+.MR capabilities 7 ),
could nevertheless create such a binary,
by the following steps:
.RS
@@ -652,7 +655,7 @@ that created the user namespace.
In the case of
.IR gid_map ,
use of the
-.BR setgroups (2)
+.MR setgroups 2
system call must first be denied by writing
.RI \[dq] deny \[dq]
to the
@@ -671,9 +674,9 @@ Writes that violate the above rules fail with the error
Similarly to user and group ID mappings,
it is possible to create project ID mappings for a user namespace.
(Project IDs are used for disk quotas; see
-.BR setquota (8)
+.MR setquota 8
and
-.BR quotactl (2).)
+.MR quotactl 2 .)
.P
Project ID mappings are defined by writing to the
.IR /proc/ pid /projid_map
@@ -686,7 +689,7 @@ The validity rules for writing to the
file are as for writing to the
.I uid_map
file; violation of these rules causes
-.BR write (2)
+.MR write 2
to fail with the error
.BR EINVAL .
.P
@@ -703,7 +706,7 @@ The mapped project IDs must in turn have a mapping
in the parent user namespace.
.P
Violation of these rules causes
-.BR write (2)
+.MR write 2
to fail with the error
.BR EPERM .
.\"
@@ -724,18 +727,18 @@ files have been written, only the mapped values may be used in
system calls that change user and group IDs.
.P
For user IDs, the relevant system calls include
-.BR setuid (2),
-.BR setfsuid (2),
-.BR setreuid (2),
+.MR setuid 2 ,
+.MR setfsuid 2 ,
+.MR setreuid 2 ,
and
-.BR setresuid (2).
+.MR setresuid 2 .
For group IDs, the relevant system calls include
-.BR setgid (2),
-.BR setfsgid (2),
-.BR setregid (2),
-.BR setresgid (2),
+.MR setgid 2 ,
+.MR setfsgid 2 ,
+.MR setregid 2 ,
+.MR setresgid 2 ,
and
-.BR setgroups (2).
+.MR setgroups 2 .
.P
Writing
.RI \[dq] deny \[dq]
@@ -748,7 +751,7 @@ file before writing to
.\" commit 66d2f338ee4c449396b6f99f5e75cd18eb6df272
.\" http://lwn.net/Articles/626665/
will permanently disable
-.BR setgroups (2)
+.MR setgroups 2
in a user namespace and allow writing to
.IR /proc/ pid /gid_map
without having the
@@ -771,16 +774,16 @@ file displays the string
if processes in the user namespace that contains the process
.I pid
are permitted to employ the
-.BR setgroups (2)
+.MR setgroups 2
system call; it displays
.RI \[dq] deny \[dq]
if
-.BR setgroups (2)
+.MR setgroups 2
is not permitted in that user namespace.
Note that regardless of the value in the
.IR /proc/ pid /setgroups
file (and regardless of the process's capabilities), calls to
-.BR setgroups (2)
+.MR setgroups 2
are also not permitted if
.IR /proc/ pid /gid_map
has not yet been set.
@@ -799,25 +802,25 @@ for this user namespace to the file
Writing the string
.RI \[dq] deny \[dq]
prevents any process in the user namespace from employing
-.BR setgroups (2).
+.MR setgroups 2 .
.P
The essence of the restrictions described in the preceding
paragraph is that it is permitted to write to
.IR /proc/ pid /setgroups
only so long as calling
-.BR setgroups (2)
+.MR setgroups 2
is disallowed because
.IR /proc/ pid /gid_map
has not been set.
This ensures that a process cannot transition from a state where
-.BR setgroups (2)
+.MR setgroups 2
is allowed to a state where
-.BR setgroups (2)
+.MR setgroups 2
is denied;
a process can transition only from
-.BR setgroups (2)
+.MR setgroups 2
being disallowed to
-.BR setgroups (2)
+.MR setgroups 2
being allowed.
.P
The default value of this file in the initial user namespace is
@@ -827,10 +830,10 @@ Once
.IR /proc/ pid /gid_map
has been written to
(which has the effect of enabling
-.BR setgroups (2)
+.MR setgroups 2
in the user namespace),
it is no longer possible to disallow
-.BR setgroups (2)
+.MR setgroups 2
by writing
.RI \[dq] deny \[dq]
to
@@ -847,7 +850,7 @@ If the
file has the value
.RI \[dq] deny \[dq],
then the
-.BR setgroups (2)
+.MR setgroups 2
system call can't subsequently be reenabled (by writing
.RI \[dq] allow \[dq]
to the file) in this user namespace.
@@ -864,13 +867,13 @@ because it addresses a security issue.
The issue concerned files with permissions such as "rwx\-\-\-rwx".
Such files give fewer permissions to "group" than they do to "other".
This means that dropping groups using
-.BR setgroups (2)
+.MR setgroups 2
might allow a process file access that it did not formerly have.
Before the existence of user namespaces this was not a concern,
since only a privileged process (one with the
.B CAP_SETGID
capability) could call
-.BR setgroups (2).
+.MR setgroups 2 .
However, with the introduction of user namespaces,
it became possible for an unprivileged process to create
a new namespace in which the user had all privileges.
@@ -881,7 +884,7 @@ The
.IR /proc/ pid /setgroups
file was added to address this security issue,
by denying any pathway for an unprivileged process to drop groups with
-.BR setgroups (2).
+.MR setgroups 2 .
.\"
.\" /proc/PID/setgroups
.\" [allow == setgroups() is allowed, "deny" == setgroups() is disallowed]
@@ -901,7 +904,7 @@ by denying any pathway for an unprivileged process to drop groups with
There are various places where an unmapped user ID (group ID)
may be exposed to user space.
For example, the first process in a new user namespace may call
-.BR getuid (2)
+.MR getuid 2
before a user ID mapping has been defined for the namespace.
In most such cases, an unmapped user ID is converted
.\" from_kuid_munged(), from_kgid_munged()
@@ -912,18 +915,19 @@ See the descriptions of
and
.I /proc/sys/kernel/overflowgid
in
-.BR proc (5).
+.MR proc 5 .
.P
The cases where unmapped IDs are mapped in this fashion include
system calls that return user IDs
-.RB ( getuid (2),
-.BR getgid (2),
+\%(\c
+.MR getuid 2 ,
+.MR getgid 2 ,
and similar),
credentials passed over a UNIX domain socket,
.\" also SO_PEERCRED
credentials returned by
-.BR stat (2),
-.BR waitid (2),
+.MR stat 2 ,
+.MR waitid 2 ,
and the System V IPC "ctl"
.B IPC_STAT
operations,
@@ -936,11 +940,11 @@ credentials returned via the
field in the
.I siginfo_t
received with a signal (see
-.BR sigaction (2)),
+.MR sigaction 2 ),
credentials written to the process accounting file (see
-.BR acct (5)),
+.MR acct 5 ),
and credentials returned with POSIX message queue notifications (see
-.BR mq_notify (3)).
+.MR mq_notify 3 ).
.P
There is one notable case where unmapped user and group IDs are
.I not
@@ -1017,7 +1021,7 @@ but the process's effective user (group) ID is left unchanged.
program that resides on a filesystem that was mounted with the
.B MS_NOSUID
flag, as described in
-.BR mount (2).)
+.MR mount 2 .)
.\"
.\" ============================================================
.\"
@@ -1026,7 +1030,7 @@ When a process's user and group IDs are passed over a UNIX domain socket
to a process in a different user namespace (see the description of
.B SCM_CREDENTIALS
in
-.BR unix (7)),
+.MR unix 7 ),
they are translated into the corresponding values as per the
receiving process's user and group ID mappings.
.\"
@@ -1452,18 +1456,18 @@ main(int argc, char *argv[])
.SH SEE ALSO
.BR newgidmap (1), \" From the shadow package
.BR newuidmap (1), \" From the shadow package
-.BR clone (2),
-.BR ptrace (2),
-.BR setns (2),
-.BR unshare (2),
-.BR proc (5),
+.MR clone 2 ,
+.MR ptrace 2 ,
+.MR setns 2 ,
+.MR unshare 2 ,
+.MR proc 5 ,
.BR subgid (5), \" From the shadow package
.BR subuid (5), \" From the shadow package
-.BR capabilities (7),
-.BR cgroup_namespaces (7),
-.BR credentials (7),
-.BR namespaces (7),
-.BR pid_namespaces (7)
+.MR capabilities 7 ,
+.MR cgroup_namespaces 7 ,
+.MR credentials 7 ,
+.MR namespaces 7 ,
+.MR pid_namespaces 7
.P
The kernel source file
.IR Documentation/admin\-guide/namespaces/resource\-control.rst .