Diffstat (limited to 'man7/capabilities.7')
-rw-r--r-- man7/capabilities.7 1872
1 files changed, 0 insertions, 1872 deletions
diff --git a/man7/capabilities.7 b/man7/capabilities.7
deleted file mode 100644
index dca8b01cf..000000000
--- a/man7/capabilities.7
+++ /dev/null
@@ -1,1872 +0,0 @@
-.\" Copyright (c) 2002 by Michael Kerrisk <mtk.manpages@gmail.com>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" 6 Aug 2002 - Initial Creation
-.\" Modified 2003-05-23, Michael Kerrisk, <mtk.manpages@gmail.com>
-.\" Modified 2004-05-27, Michael Kerrisk, <mtk.manpages@gmail.com>
-.\" 2004-12-08, mtk Added O_NOATIME for CAP_FOWNER
-.\" 2005-08-16, mtk, Added CAP_AUDIT_CONTROL and CAP_AUDIT_WRITE
-.\" 2008-07-15, Serge Hallyn <serue@us.bbm.com>
-.\" Document file capabilities, per-process capability
-.\" bounding set, changed semantics for CAP_SETPCAP,
-.\" and other changes in Linux 2.6.2[45].
-.\" Add CAP_MAC_ADMIN, CAP_MAC_OVERRIDE, CAP_SETFCAP.
-.\" 2008-07-15, mtk
-.\" Add text describing circumstances in which CAP_SETPCAP
-.\" (theoretically) permits a thread to change the
-.\" capability sets of another thread.
-.\" Add section describing rules for programmatically
-.\" adjusting thread capability sets.
-.\" Describe rationale for capability bounding set.
-.\" Document "securebits" flags.
-.\" Add text noting that if we set the effective flag for one file
-.\" capability, then we must also set the effective flag for all
-.\" other capabilities where the permitted or inheritable bit is set.
-.\" 2011-09-07, mtk/Serge hallyn: Add CAP_SYSLOG
-.\"
-.TH Capabilities 7 (date) "Linux man-pages (unreleased)"
-.SH NAME
-capabilities \- overview of Linux capabilities
-.SH DESCRIPTION
-For the purpose of performing permission checks,
-traditional UNIX implementations distinguish two categories of processes:
-.I privileged
-processes (whose effective user ID is 0, referred to as superuser or root),
-and
-.I unprivileged
-processes (whose effective UID is nonzero).
-Privileged processes bypass all kernel permission checks,
-while unprivileged processes are subject to full permission
-checking based on the process's credentials
-(usually: effective UID, effective GID, and supplementary group list).
-.PP
-Starting with Linux 2.2, Linux divides the privileges traditionally
-associated with superuser into distinct units, known as
-.IR capabilities ,
-which can be independently enabled and disabled.
-Capabilities are a per-thread attribute.
-.\"
-.SS Capabilities list
-The following list shows the capabilities implemented on Linux,
-and the operations or behaviors that each capability permits:
-.TP
-.BR CAP_AUDIT_CONTROL " (since Linux 2.6.11)"
-Enable and disable kernel auditing; change auditing filter rules;
-retrieve auditing status and filtering rules.
-.TP
-.BR CAP_AUDIT_READ " (since Linux 3.16)"
-.\" commit a29b694aa1739f9d76538e34ae25524f9c549d59
-.\" commit 3a101b8de0d39403b2c7e5c23fd0b005668acf48
-Allow reading the audit log via a multicast netlink socket.
-.TP
-.BR CAP_AUDIT_WRITE " (since Linux 2.6.11)"
-Write records to kernel auditing log.
-.\" FIXME Add FAN_ENABLE_AUDIT
-.TP
-.BR CAP_BLOCK_SUSPEND " (since Linux 3.5)"
-Employ features that can block system suspend
-.RB ( epoll (7)
-.BR EPOLLWAKEUP ,
-.IR /sys/power/wake_lock ).
-.TP
-.BR CAP_BPF " (since Linux 5.8)"
-Employ privileged BPF operations; see
-.BR bpf (2)
-and
-.BR bpf\-helpers (7).
-.IP
-This capability was added in Linux 5.8 to separate out
-BPF functionality from the overloaded
-.B CAP_SYS_ADMIN
-capability.
-.TP
-.BR CAP_CHECKPOINT_RESTORE " (since Linux 5.9)"
-.\" commit 124ea650d3072b005457faed69909221c2905a1f
-.PD 0
-.RS
-.IP \[bu] 3
-Update
-.I /proc/sys/kernel/ns_last_pid
-(see
-.BR pid_namespaces (7));
-.IP \[bu]
-employ the
-.I set_tid
-feature of
-.BR clone3 (2);
-.\" FIXME There is also some use case relating to
-.\" prctl_set_mm_exe_file(); in the 5.9 sources, see
-.\" prctl_set_mm_map().
-.IP \[bu]
-read the contents of the symbolic links in
-.IR /proc/ pid /map_files
-for other processes.
-.RE
-.PD
-.IP
-This capability was added in Linux 5.9 to separate out
-checkpoint/restore functionality from the overloaded
-.B CAP_SYS_ADMIN
-capability.
-.TP
-.B CAP_CHOWN
-Make arbitrary changes to file UIDs and GIDs (see
-.BR chown (2)).
-.TP
-.B CAP_DAC_OVERRIDE
-Bypass file read, write, and execute permission checks.
-(DAC is an abbreviation of "discretionary access control".)
-.TP
-.B CAP_DAC_READ_SEARCH
-.PD 0
-.RS
-.IP \[bu] 3
-Bypass file read permission checks and
-directory read and execute permission checks;
-.IP \[bu]
-invoke
-.BR open_by_handle_at (2);
-.IP \[bu]
-use the
-.BR linkat (2)
-.B AT_EMPTY_PATH
-flag to create a link to a file referred to by a file descriptor.
-.RE
-.PD
-.TP
-.B CAP_FOWNER
-.PD 0
-.RS
-.IP \[bu] 3
-Bypass permission checks on operations that normally
-require the filesystem UID of the process to match the UID of
-the file (e.g.,
-.BR chmod (2),
-.BR utime (2)),
-excluding those operations covered by
-.B CAP_DAC_OVERRIDE
-and
-.BR CAP_DAC_READ_SEARCH ;
-.IP \[bu]
-set inode flags (see
-.BR ioctl_iflags (2))
-on arbitrary files;
-.IP \[bu]
-set Access Control Lists (ACLs) on arbitrary files;
-.IP \[bu]
-ignore directory sticky bit on file deletion;
-.IP \[bu]
-modify
-.I user
-extended attributes on a sticky directory owned by any user;
-.IP \[bu]
-specify
-.B O_NOATIME
-for arbitrary files in
-.BR open (2)
-and
-.BR fcntl (2).
-.RE
-.PD
-.TP
-.B CAP_FSETID
-.PD 0
-.RS
-.IP \[bu] 3
-Don't clear set-user-ID and set-group-ID mode
-bits when a file is modified;
-.IP \[bu]
-set the set-group-ID bit for a file whose GID does not match
-the filesystem GID or any of the supplementary GIDs of the calling process.
-.RE
-.PD
-.TP
-.B CAP_IPC_LOCK
-.\" FIXME . As at Linux 3.2, there are some strange uses of this capability
-.\" in other places; they probably should be replaced with something else.
-.PD 0
-.RS
-.IP \[bu] 3
-Lock memory
-.RB ( mlock (2),
-.BR mlockall (2),
-.BR mmap (2),
-.BR shmctl (2));
-.IP \[bu]
-allocate memory using huge pages
-.RB ( memfd_create (2),
-.BR mmap (2),
-.BR shmctl (2)).
-.RE
-.PD
-.TP
-.B CAP_IPC_OWNER
-Bypass permission checks for operations on System V IPC objects.
-.TP
-.B CAP_KILL
-Bypass permission checks for sending signals (see
-.BR kill (2)).
-This includes use of the
-.BR ioctl (2)
-.B KDSIGACCEPT
-operation.
-.\" FIXME . CAP_KILL also has an effect for threads + setting child
-.\" termination signal to other than SIGCHLD: without this
-.\" capability, the termination signal reverts to SIGCHLD
-.\" if the child does an exec(). What is the rationale
-.\" for this?
-.TP
-.BR CAP_LEASE " (since Linux 2.4)"
-Establish leases on arbitrary files (see
-.BR fcntl (2)).
-.TP
-.B CAP_LINUX_IMMUTABLE
-Set the
-.B FS_APPEND_FL
-and
-.B FS_IMMUTABLE_FL
-inode flags (see
-.BR ioctl_iflags (2)).
-.TP
-.BR CAP_MAC_ADMIN " (since Linux 2.6.25)"
-Allow MAC configuration or state changes.
-Implemented for the Smack Linux Security Module (LSM).
-.TP
-.BR CAP_MAC_OVERRIDE " (since Linux 2.6.25)"
-Override Mandatory Access Control (MAC).
-Implemented for the Smack LSM.
-.TP
-.BR CAP_MKNOD " (since Linux 2.4)"
-Create special files using
-.BR mknod (2).
-.TP
-.B CAP_NET_ADMIN
-Perform various network-related operations:
-.PD 0
-.RS
-.IP \[bu] 3
-interface configuration;
-.IP \[bu]
-administration of IP firewall, masquerading, and accounting;
-.IP \[bu]
-modify routing tables;
-.IP \[bu]
-bind to any address for transparent proxying;
-.IP \[bu]
-set type-of-service (TOS);
-.IP \[bu]
-clear driver statistics;
-.IP \[bu]
-set promiscuous mode;
-.IP \[bu]
-enable multicasting;
-.IP \[bu]
-use
-.BR setsockopt (2)
-to set the following socket options:
-.BR SO_DEBUG ,
-.BR SO_MARK ,
-.B SO_PRIORITY
-(for a priority outside the range 0 to 6),
-.BR SO_RCVBUFFORCE ,
-and
-.BR SO_SNDBUFFORCE .
-.RE
-.PD
-.TP
-.B CAP_NET_BIND_SERVICE
-Bind a socket to Internet domain privileged ports
-(port numbers less than 1024).
-.TP
-.B CAP_NET_BROADCAST
-(Unused) Make socket broadcasts, and listen to multicasts.
-.\" FIXME Since Linux 4.2, there are use cases for netlink sockets
-.\" commit 59324cf35aba5336b611074028777838a963d03b
-.TP
-.B CAP_NET_RAW
-.PD 0
-.RS
-.IP \[bu] 3
-Use RAW and PACKET sockets;
-.IP \[bu]
-bind to any address for transparent proxying.
-.RE
-.PD
-.\" Also various IP options and setsockopt(SO_BINDTODEVICE)
-.TP
-.BR CAP_PERFMON " (since Linux 5.8)"
-Employ various performance-monitoring mechanisms, including:
-.RS
-.IP \[bu] 3
-.PD 0
-call
-.BR perf_event_open (2);
-.IP \[bu]
-employ various BPF operations that have performance implications.
-.RE
-.PD
-.IP
-This capability was added in Linux 5.8 to separate out
-performance monitoring functionality from the overloaded
-.B CAP_SYS_ADMIN
-capability.
-See also the kernel source file
-.IR Documentation/admin\-guide/perf\-security.rst .
-.TP
-.B CAP_SETGID
-.RS
-.PD 0
-.IP \[bu] 3
-Make arbitrary manipulations of process GIDs and supplementary GID list;
-.IP \[bu]
-forge GID when passing socket credentials via UNIX domain sockets;
-.IP \[bu]
-write a group ID mapping in a user namespace (see
-.BR user_namespaces (7)).
-.PD
-.RE
-.TP
-.BR CAP_SETFCAP " (since Linux 2.6.24)"
-Set arbitrary capabilities on a file.
-.IP
-.\" commit db2e718a47984b9d71ed890eb2ea36ecf150de18
-Since Linux 5.12, this capability is
-also needed to map user ID 0 in a new user namespace; see
-.BR user_namespaces (7)
-for details.
-.TP
-.B CAP_SETPCAP
-If file capabilities are supported (i.e., since Linux 2.6.24):
-add any capability from the calling thread's bounding set
-to its inheritable set;
-drop capabilities from the bounding set (via
-.BR prctl (2)
-.BR PR_CAPBSET_DROP );
-make changes to the
-.I securebits
-flags.
-.IP
-If file capabilities are not supported (i.e., before Linux 2.6.24):
-grant or remove any capability in the
-caller's permitted capability set to or from any other process.
-(This property of
-.B CAP_SETPCAP
-is not available when the kernel is configured to support
-file capabilities, since
-.B CAP_SETPCAP
-has entirely different semantics for such kernels.)
-.TP
-.B CAP_SETUID
-.RS
-.PD 0
-.IP \[bu] 3
-Make arbitrary manipulations of process UIDs
-.RB ( setuid (2),
-.BR setreuid (2),
-.BR setresuid (2),
-.BR setfsuid (2));
-.IP \[bu]
-forge UID when passing socket credentials via UNIX domain sockets;
-.IP \[bu]
-write a user ID mapping in a user namespace (see
-.BR user_namespaces (7)).
-.PD
-.RE
-.\" FIXME CAP_SETUID also an effect in exec(); document this.
-.TP
-.B CAP_SYS_ADMIN
-.IR Note :
-this capability is overloaded; see
-.I Notes to kernel developers
-below.
-.IP
-.PD 0
-.RS
-.IP \[bu] 3
-Perform a range of system administration operations including:
-.BR quotactl (2),
-.BR mount (2),
-.BR umount (2),
-.BR pivot_root (2),
-.BR swapon (2),
-.BR swapoff (2),
-.BR sethostname (2),
-and
-.BR setdomainname (2);
-.IP \[bu]
-perform privileged
-.BR syslog (2)
-operations (since Linux 2.6.37,
-.B CAP_SYSLOG
-should be used to permit such operations);
-.IP \[bu]
-perform
-.B VM86_REQUEST_IRQ
-.BR vm86 (2)
-command;
-.IP \[bu]
-access the same checkpoint/restore functionality that is governed by
-.B CAP_CHECKPOINT_RESTORE
-(but the latter, weaker capability is preferred for accessing
-that functionality);
-.IP \[bu]
-perform the same BPF operations as are governed by
-.B CAP_BPF
-(but the latter, weaker capability is preferred for accessing
-that functionality);
-.IP \[bu]
-employ the same performance monitoring mechanisms as are governed by
-.B CAP_PERFMON
-(but the latter, weaker capability is preferred for accessing
-that functionality);
-.IP \[bu]
-perform
-.B IPC_SET
-and
-.B IPC_RMID
-operations on arbitrary System V IPC objects;
-.IP \[bu]
-override
-.B RLIMIT_NPROC
-resource limit;
-.IP \[bu]
-perform operations on
-.I trusted
-and
-.I security
-extended attributes (see
-.BR xattr (7));
-.IP \[bu]
-use
-.BR lookup_dcookie (2);
-.IP \[bu]
-use
-.BR ioprio_set (2)
-to assign
-.B IOPRIO_CLASS_RT
-and (before Linux 2.6.25)
-.B IOPRIO_CLASS_IDLE
-I/O scheduling classes;
-.IP \[bu]
-forge PID when passing socket credentials via UNIX domain sockets;
-.IP \[bu]
-exceed
-.IR /proc/sys/fs/file\-max ,
-the system-wide limit on the number of open files,
-in system calls that open files (e.g.,
-.BR accept (2),
-.BR execve (2),
-.BR open (2),
-.BR pipe (2));
-.IP \[bu]
-employ
-.B CLONE_*
-flags that create new namespaces with
-.BR clone (2)
-and
-.BR unshare (2)
-(but, since Linux 3.8,
-creating user namespaces does not require any capability);
-.IP \[bu]
-access privileged
-.I perf
-event information;
-.IP \[bu]
-call
-.BR setns (2)
-(requires
-.B CAP_SYS_ADMIN
-in the
-.I target
-namespace);
-.IP \[bu]
-call
-.BR fanotify_init (2);
-.IP \[bu]
-perform privileged
-.B KEYCTL_CHOWN
-and
-.B KEYCTL_SETPERM
-.BR keyctl (2)
-operations;
-.IP \[bu]
-perform
-.BR madvise (2)
-.B MADV_HWPOISON
-operation;
-.IP \[bu]
-employ the
-.B TIOCSTI
-.BR ioctl (2)
-to insert characters into the input queue of a terminal other than
-the caller's controlling terminal;
-.IP \[bu]
-employ the obsolete
-.BR nfsservctl (2)
-system call;
-.IP \[bu]
-employ the obsolete
-.BR bdflush (2)
-system call;
-.IP \[bu]
-perform various privileged block-device
-.BR ioctl (2)
-operations;
-.IP \[bu]
-perform various privileged filesystem
-.BR ioctl (2)
-operations;
-.IP \[bu]
-perform privileged
-.BR ioctl (2)
-operations on the
-.I /dev/random
-device (see
-.BR random (4));
-.IP \[bu]
-install a
-.BR seccomp (2)
-filter without first having to set the
-.I no_new_privs
-thread attribute;
-.IP \[bu]
-modify allow/deny rules for device control groups;
-.IP \[bu]
-employ the
-.BR ptrace (2)
-.B PTRACE_SECCOMP_GET_FILTER
-operation to dump the tracee's seccomp filters;
-.IP \[bu]
-employ the
-.BR ptrace (2)
-.B PTRACE_SETOPTIONS
-operation to suspend the tracee's seccomp protections (i.e., the
-.B PTRACE_O_SUSPEND_SECCOMP
-flag);
-.IP \[bu]
-perform administrative operations on many device drivers;
-.IP \[bu]
-modify autogroup nice values by writing to
-.IR /proc/ pid /autogroup
-(see
-.BR sched (7)).
-.RE
-.PD
-.TP
-.B CAP_SYS_BOOT
-Use
-.BR reboot (2)
-and
-.BR kexec_load (2).
-.TP
-.B CAP_SYS_CHROOT
-.RS
-.PD 0
-.IP \[bu] 3
-Use
-.BR chroot (2);
-.IP \[bu]
-change mount namespaces using
-.BR setns (2).
-.PD
-.RE
-.TP
-.B CAP_SYS_MODULE
-.RS
-.PD 0
-.IP \[bu] 3
-Load and unload kernel modules
-(see
-.BR init_module (2)
-and
-.BR delete_module (2));
-.IP \[bu]
-before Linux 2.6.25:
-drop capabilities from the system-wide capability bounding set.
-.PD
-.RE
-.TP
-.B CAP_SYS_NICE
-.PD 0
-.RS
-.IP \[bu] 3
-Lower the process nice value
-.RB ( nice (2),
-.BR setpriority (2))
-and change the nice value for arbitrary processes;
-.IP \[bu]
-set real-time scheduling policies for the calling process,
-and set scheduling policies and priorities for arbitrary processes
-.RB ( sched_setscheduler (2),
-.BR sched_setparam (2),
-.BR sched_setattr (2));
-.IP \[bu]
-set CPU affinity for arbitrary processes
-.RB ( sched_setaffinity (2));
-.IP \[bu]
-set I/O scheduling class and priority for arbitrary processes
-.RB ( ioprio_set (2));
-.IP \[bu]
-apply
-.BR migrate_pages (2)
-to arbitrary processes and allow processes
-to be migrated to arbitrary nodes;
-.\" FIXME CAP_SYS_NICE also has the following effect for
-.\" migrate_pages(2):
-.\" do_migrate_pages(mm, &old, &new,
-.\" capable(CAP_SYS_NICE) ? MPOL_MF_MOVE_ALL : MPOL_MF_MOVE);
-.\"
-.\" Document this.
-.IP \[bu]
-apply
-.BR move_pages (2)
-to arbitrary processes;
-.IP \[bu]
-use the
-.B MPOL_MF_MOVE_ALL
-flag with
-.BR mbind (2)
-and
-.BR move_pages (2).
-.RE
-.PD
-.TP
-.B CAP_SYS_PACCT
-Use
-.BR acct (2).
-.TP
-.B CAP_SYS_PTRACE
-.PD 0
-.RS
-.IP \[bu] 3
-Trace arbitrary processes using
-.BR ptrace (2);
-.IP \[bu]
-apply
-.BR get_robust_list (2)
-to arbitrary processes;
-.IP \[bu]
-transfer data to or from the memory of arbitrary processes using
-.BR process_vm_readv (2)
-and
-.BR process_vm_writev (2);
-.IP \[bu]
-inspect processes using
-.BR kcmp (2).
-.RE
-.PD
-.TP
-.B CAP_SYS_RAWIO
-.PD 0
-.RS
-.IP \[bu] 3
-Perform I/O port operations
-.RB ( iopl (2)
-and
-.BR ioperm (2));
-.IP \[bu]
-access
-.IR /proc/kcore ;
-.IP \[bu]
-employ the
-.B FIBMAP
-.BR ioctl (2)
-operation;
-.IP \[bu]
-open devices for accessing x86 model-specific registers (MSRs, see
-.BR msr (4));
-.IP \[bu]
-update
-.IR /proc/sys/vm/mmap_min_addr ;
-.IP \[bu]
-create memory mappings at addresses below the value specified by
-.IR /proc/sys/vm/mmap_min_addr ;
-.IP \[bu]
-map files in
-.IR /proc/bus/pci ;
-.IP \[bu]
-open
-.I /dev/mem
-and
-.IR /dev/kmem ;
-.IP \[bu]
-perform various SCSI device commands;
-.IP \[bu]
-perform certain operations on
-.BR hpsa (4)
-and
-.BR cciss (4)
-devices;
-.IP \[bu]
-perform a range of device-specific operations on other devices.
-.RE
-.PD
-.TP
-.B CAP_SYS_RESOURCE
-.PD 0
-.RS
-.IP \[bu] 3
-Use reserved space on ext2 filesystems;
-.IP \[bu]
-make
-.BR ioctl (2)
-calls controlling ext3 journaling;
-.IP \[bu]
-override disk quota limits;
-.IP \[bu]
-increase resource limits (see
-.BR setrlimit (2));
-.IP \[bu]
-override
-.B RLIMIT_NPROC
-resource limit;
-.IP \[bu]
-override maximum number of consoles on console allocation;
-.IP \[bu]
-override maximum number of keymaps;
-.IP \[bu]
-allow more than 64 Hz interrupts from the real-time clock;
-.IP \[bu]
-raise
-.I msg_qbytes
-limit for a System V message queue above the limit in
-.I /proc/sys/kernel/msgmnb
-(see
-.BR msgop (2)
-and
-.BR msgctl (2));
-.IP \[bu]
-allow the
-.B RLIMIT_NOFILE
-resource limit on the number of "in-flight" file descriptors
-to be bypassed when passing file descriptors to another process
-via a UNIX domain socket (see
-.BR unix (7));
-.IP \[bu]
-use the
-.B F_SETPIPE_SZ
-.BR fcntl (2)
-command to increase the capacity of a pipe above the limit specified by
-.IR /proc/sys/fs/pipe\-max\-size ;
-.IP \[bu]
-override
-.IR /proc/sys/fs/mqueue/queues_max ,
-.IR /proc/sys/fs/mqueue/msg_max ,
-and
-.I /proc/sys/fs/mqueue/msgsize_max
-limits when creating POSIX message queues (see
-.BR mq_overview (7));
-.IP \[bu]
-employ the
-.BR prctl (2)
-.B PR_SET_MM
-operation;
-.IP \[bu]
-set
-.IR /proc/ pid /oom_score_adj
-to a value lower than the value last set by a process with
-.BR CAP_SYS_RESOURCE .
-.RE
-.PD
-.TP
-.B CAP_SYS_TIME
-Set system clock
-.RB ( settimeofday (2),
-.BR stime (2),
-.BR adjtimex (2));
-set real-time (hardware) clock.
-.TP
-.B CAP_SYS_TTY_CONFIG
-Use
-.BR vhangup (2);
-employ various privileged
-.BR ioctl (2)
-operations on virtual terminals.
-.TP
-.BR CAP_SYSLOG " (since Linux 2.6.37)"
-.RS
-.PD 0
-.IP \[bu] 3
-Perform privileged
-.BR syslog (2)
-operations.
-See
-.BR syslog (2)
-for information on which operations require privilege.
-.IP \[bu]
-View kernel addresses exposed via
-.I /proc
-and other interfaces when
-.I /proc/sys/kernel/kptr_restrict
-has the value 1.
-(See the discussion of
-.I kptr_restrict
-in
-.BR proc (5).)
-.PD
-.RE
-.TP
-.BR CAP_WAKE_ALARM " (since Linux 3.0)"
-Trigger something that will wake up the system (set
-.B CLOCK_REALTIME_ALARM
-and
-.B CLOCK_BOOTTIME_ALARM
-timers).
-.\"
-.SS Past and current implementation
-A full implementation of capabilities requires that:
-.IP \[bu] 3
-For all privileged operations,
-the kernel must check whether the thread has the required
-capability in its effective set.
-.IP \[bu]
-The kernel must provide system calls allowing a thread's capability sets to
-be changed and retrieved.
-.IP \[bu]
-The filesystem must support attaching capabilities to an executable file,
-so that a process gains those capabilities when the file is executed.
-.PP
-Before Linux 2.6.24, only the first two of these requirements are met;
-since Linux 2.6.24, all three requirements are met.
-.\"
-.SS Notes to kernel developers
-When adding a new kernel feature that should be governed by a capability,
-consider the following points.
-.IP \[bu] 3
-The goal of capabilities is to divide the power of superuser into pieces,
-such that if a program that has one or more capabilities is compromised,
-its power to do damage to the system would be less than the same program
-running with root privilege.
-.IP \[bu]
-You have the choice of either creating a new capability for your new feature,
-or associating the feature with one of the existing capabilities.
-In order to keep the set of capabilities to a manageable size,
-the latter option is preferable,
-unless there are compelling reasons to take the former option.
-(There is also a technical limit:
-the size of capability sets is currently limited to 64 bits.)
-.IP \[bu]
-To determine which existing capability might best be associated
-with your new feature, review the list of capabilities above in order
-to find a "silo" into which your new feature best fits.
-One approach to take is to determine if there are other features
-requiring capabilities that will always be used along with the new feature.
-If the new feature is useless without these other features,
-you should use the same capability as the other features.
-.IP \[bu]
-.I Don't
-choose
-.B CAP_SYS_ADMIN
-if you can possibly avoid it!
-A vast proportion of existing capability checks are associated
-with this capability (see the partial list above).
-It can plausibly be called "the new root",
-since on the one hand, it confers a wide range of powers,
-and on the other hand,
-its broad scope means that this is the capability
-that is required by many privileged programs.
-Don't make the problem worse.
-The only new features that should be associated with
-.B CAP_SYS_ADMIN
-are ones that
-.I closely
-match existing uses in that silo.
-.IP \[bu]
-If you have determined that it really is necessary to create
-a new capability for your feature,
-don't make or name it as a "single-use" capability.
-Thus, for example, the addition of the highly specific
-.B CAP_SYS_PACCT
-was probably a mistake.
-Instead, try to identify and name your new capability as a broader
-silo into which other related future use cases might fit.
-.\"
-.SS Thread capability sets
-Each thread has the following capability sets containing zero or more
-of the above capabilities:
-.TP
-.I Permitted
-This is a limiting superset for the effective
-capabilities that the thread may assume.
-It is also a limiting superset for the capabilities that
-may be added to the inheritable set by a thread that does not have the
-.B CAP_SETPCAP
-capability in its effective set.
-.IP
-If a thread drops a capability from its permitted set,
-it can never reacquire that capability (unless it
-.BR execve (2)s
-either a set-user-ID-root program, or
-a program whose associated file capabilities grant that capability).
-.TP
-.I Inheritable
-This is a set of capabilities preserved across an
-.BR execve (2).
-Inheritable capabilities remain inheritable when executing any program,
-and inheritable capabilities are added to the permitted set when executing
-a program that has the corresponding bits set in the file inheritable set.
-.IP
-Because inheritable capabilities are not generally preserved across
-.BR execve (2)
-when running as a non-root user, applications that wish to run helper
-programs with elevated capabilities should consider using
-ambient capabilities, described below.
-.TP
-.I Effective
-This is the set of capabilities used by the kernel to
-perform permission checks for the thread.
-.TP
-.IR Bounding " (per-thread since Linux 2.6.25)"
-The capability bounding set is a mechanism that can be used
-to limit the capabilities that are gained during
-.BR execve (2).
-.IP
-Since Linux 2.6.25, this is a per-thread capability set.
-In older kernels, the capability bounding set was a system-wide attribute
-shared by all threads on the system.
-.IP
-For more details, see
-.I Capability bounding set
-below.
-.TP
-.IR Ambient " (since Linux 4.3)"
-.\" commit 58319057b7847667f0c9585b9de0e8932b0fdb08
-This is a set of capabilities that are preserved across an
-.BR execve (2)
-of a program that is not privileged.
-The ambient capability set obeys the invariant that no capability
-can ever be ambient if it is not both permitted and inheritable.
-.IP
-The ambient capability set can be directly modified using
-.BR prctl (2).
-Ambient capabilities are automatically lowered if either of
-the corresponding permitted or inheritable capabilities is lowered.
-.IP
-Executing a program that changes UID or GID due to the
-set-user-ID or set-group-ID bits or executing a program that has
-any file capabilities set will clear the ambient set.
-Ambient capabilities are added to the permitted set and
-assigned to the effective set when
-.BR execve (2)
-is called.
-If ambient capabilities cause a process's permitted and effective
-capabilities to increase during an
-.BR execve (2),
-this does not trigger the secure-execution mode described in
-.BR ld.so (8).
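The ambient-set invariant described above is plain bit arithmetic once each capability set is modeled as an integer bit mask. The following Python sketch illustrates this; the function names are hypothetical (they are not part of any kernel or libcap interface), and the capability number is assumed from <linux/capability.h>.

```python
CAP_NET_BIND_SERVICE = 10  # assumed value, taken from <linux/capability.h>


def may_raise_ambient(cap, permitted, inheritable):
    """Return True if capability number `cap` may be added to the ambient set.

    A capability can be ambient only if it is both permitted and inheritable.
    """
    bit = 1 << cap
    return bool(permitted & bit) and bool(inheritable & bit)


def clamp_ambient(ambient, permitted, inheritable):
    """Model the automatic lowering of ambient capabilities.

    Any ambient bit whose permitted or inheritable bit is no longer set
    is dropped.
    """
    return ambient & permitted & inheritable
```

For example, with the capability present in the permitted set but absent from the inheritable set, `may_raise_ambient()` returns False, and `clamp_ambient()` removes the bit from an ambient mask that previously held it.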
-.PP
-A child created via
-.BR fork (2)
-inherits copies of its parent's capability sets.
-For details on how
-.BR execve (2)
-affects capabilities, see
-.I Transformation of capabilities during execve()
-below.
-.PP
-Using
-.BR capset (2),
-a thread may manipulate its own capability sets; see
-.I Programmatically adjusting capability sets
-below.
-.PP
-Since Linux 3.2, the file
-.I /proc/sys/kernel/cap_last_cap
-.\" commit 73efc0394e148d0e15583e13712637831f926720
-exposes the numerical value of the highest capability
-supported by the running kernel;
-this can be used to determine the highest bit
-that may be set in a capability set.
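As a minimal sketch of how a program might use this file, the Python fragment below reads the value and builds a bit mask covering every capability the running kernel supports. The helper names are hypothetical; only the file path comes from the text above.

```python
from pathlib import Path


def read_last_cap(path="/proc/sys/kernel/cap_last_cap"):
    """Return the number of the highest capability supported by the kernel."""
    return int(Path(path).read_text().strip())


def full_capability_mask(last_cap):
    """Return a mask with bits 0 through `last_cap` (inclusive) set."""
    return (1 << (last_cap + 1)) - 1
```

On a Linux system, `full_capability_mask(read_last_cap())` yields the largest value a capability set can hold on that kernel; on kernels before Linux 3.2 the file is absent and `read_last_cap()` raises `FileNotFoundError`.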
-.\"
-.SS File capabilities
-Since Linux 2.6.24, the kernel supports
-associating capability sets with an executable file using
-.BR setcap (8).
-The file capability sets are stored in an extended attribute (see
-.BR setxattr (2)
-and
-.BR xattr (7))
-named
-.IR "security.capability" .
-Writing to this extended attribute requires the
-.B CAP_SETFCAP
-capability.
-The file capability sets,
-in conjunction with the capability sets of the thread,
-determine the capabilities of a thread after an
-.BR execve (2).
-.PP
-The three file capability sets are:
-.TP
-.IR Permitted " (formerly known as " forced ):
-These capabilities are automatically permitted to the thread,
-regardless of the thread's inheritable capabilities.
-.TP
-.IR Inheritable " (formerly known as " allowed ):
-This set is ANDed with the thread's inheritable set to determine which
-inheritable capabilities are enabled in the permitted set of
-the thread after the
-.BR execve (2).
-.TP
-.IR Effective :
-This is not a set, but rather just a single bit.
-If this bit is set, then during an
-.BR execve (2)
-all of the new permitted capabilities for the thread are
-also raised in the effective set.
-If this bit is not set, then after an
-.BR execve (2),
-none of the new permitted capabilities is in the new effective set.
-.IP
-Enabling the file effective capability bit implies that,
-for any file permitted or inheritable capability that causes a
-thread to acquire the corresponding permitted capability during an
-.BR execve (2)
-(see
-.I Transformation of capabilities during execve()
-below), the thread will also acquire that
-capability in its effective set.
-Therefore, when assigning capabilities to a file
-.RB ( setcap (8),
-.BR cap_set_file (3),
-.BR cap_set_fd (3)),
-if we specify the effective flag as being enabled for any capability,
-then the effective flag must also be specified as enabled
-for all other capabilities for which the corresponding permitted or
-inheritable flag is enabled.
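The consistency rule for the effective flag can be checked mechanically. The sketch below assumes that the per-capability flags being assigned to a file are held as three integer bit masks, one per flag; the function name is hypothetical and this is an illustration of the rule as stated above, not the code used by any particular tool.

```python
def file_effective_flags_consistent(permitted, inheritable, effective):
    """Check the rule for per-capability effective flags on a file.

    If the effective flag is enabled for any capability, it must be
    enabled for every capability whose permitted or inheritable flag
    is enabled.
    """
    if effective == 0:
        return True  # effective flag not used at all: always consistent
    missing = (permitted | inheritable) & ~effective
    return missing == 0
```

A request that marks one capability as effective while leaving another permitted capability without the effective flag fails this check, which reflects the fact that the file effective flag is stored as a single bit.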
-.\"
-.SS File capability extended attribute versioning
-To allow extensibility,
-the kernel supports a scheme to encode a version number inside the
-.I security.capability
-extended attribute that is used to implement file capabilities.
-These version numbers are internal to the implementation,
-and not directly visible to user-space applications.
-To date, the following versions are supported:
-.TP
-.B VFS_CAP_REVISION_1
-This was the original file capability implementation,
-which supported 32-bit masks for file capabilities.
-.TP
-.BR VFS_CAP_REVISION_2 " (since Linux 2.6.25)"
-.\" commit e338d263a76af78fe8f38a72131188b58fceb591
-This version allows for file capability masks that are 64 bits in size,
-and was necessary as the number of supported capabilities grew beyond 32.
-The kernel transparently continues to support the execution of files
-that have 32-bit version 1 capability masks,
-but when adding capabilities to files that did not previously
-have capabilities, or modifying the capabilities of existing files,
-it automatically uses the version 2 scheme
-(or possibly the version 3 scheme, as described below).
-.TP
-.BR VFS_CAP_REVISION_3 " (since Linux 4.14)"
-.\" commit 8db6c34f1dbc8e06aa016a9b829b06902c3e1340
-Version 3 file capabilities are provided
-to support namespaced file capabilities (described below).
-.IP
-As with version 2 file capabilities,
-version 3 capability masks are 64 bits in size.
-But in addition, the root user ID of the namespace is encoded in the
-.I security.capability
-extended attribute.
-(A namespace's root user ID is the value that user ID 0
-inside that namespace maps to in the initial user namespace.)
-.IP
-Version 3 file capabilities are designed to coexist
-with version 2 capabilities;
-that is, on a modern Linux system,
-there may be some files with version 2 capabilities
-while others have version 3 capabilities.
-.PP
-Before Linux 4.14,
-the only kind of file capability extended attribute
-that could be attached to a file was a
-.B VFS_CAP_REVISION_2
-attribute.
-Since Linux 4.14,
-the version of the
-.I security.capability
-extended attribute that is attached to a file
-depends on the circumstances in which the attribute was created.
-.PP
-Starting with Linux 4.14, a
-.I security.capability
-extended attribute is automatically created as (or converted to)
-a version 3
-.RB ( VFS_CAP_REVISION_3 )
-attribute if both of the following are true:
-.IP \[bu] 3
-The thread writing the attribute resides in a noninitial user namespace.
-(More precisely: the thread resides in a user namespace other
-than the one from which the underlying filesystem was mounted.)
-.IP \[bu]
-The thread has the
-.B CAP_SETFCAP
-capability over the file inode,
-meaning that (a) the thread has the
-.B CAP_SETFCAP
-capability in its own user namespace;
-and (b) the UID and GID of the file inode have mappings in
-the writer's user namespace.
-.PP
-When a
-.B VFS_CAP_REVISION_3
-.I security.capability
-extended attribute is created, the root user ID of the creating thread's
-user namespace is saved in the extended attribute.
-.PP
-By contrast, creating or modifying a
-.I security.capability
-extended attribute from a privileged
-.RB ( CAP_SETFCAP )
-thread that resides in the
-namespace where the underlying filesystem was mounted
-(this normally means the initial user namespace)
-automatically results in the creation of a version 2
-.RB ( VFS_CAP_REVISION_2 )
-attribute.
-.PP
-Note that the creation of a version 3
-.I security.capability
-extended attribute is automatic.
-That is to say, when a user-space application writes
-.RB ( setxattr (2))
-a
-.I security.capability
-attribute in the version 2 format,
-the kernel will automatically create a version 3 attribute
-if the attribute is created in the circumstances described above.
-Correspondingly, when a version 3
-.I security.capability
-attribute is retrieved
-.RB ( getxattr (2))
-by a process that resides inside a user namespace that was created by the
-root user ID (or a descendant of that user namespace),
-the returned attribute is (automatically)
-simplified to appear as a version 2 attribute
-(i.e., the returned value is the size of a version 2 attribute and does
-not include the root user ID).
-These automatic translations mean that no changes are required to
-user-space tools (e.g.,
-.BR setcap (8)
-and
-.BR getcap (8))
-in order for those tools to be used to create and retrieve version 3
-.I security.capability
-attributes.
-.PP
-Note that a file can have either a version 2 or a version 3
-.I security.capability
-extended attribute associated with it, but not both:
-creation or modification of the
-.I security.capability
-extended attribute will automatically modify the version
-according to the circumstances in which the extended attribute is
-created or modified.
-.\"
-.SS Transformation of capabilities during execve()
-During an
-.BR execve (2),
-the kernel calculates the new capabilities of
-the process using the following algorithm:
-.PP
-.in +4n
-.EX
-P'(ambient)     = (file is privileged) ? 0 : P(ambient)
-\&
-P'(permitted)   = (P(inheritable) & F(inheritable)) |
-                  (F(permitted) & P(bounding)) | P'(ambient)
-\&
-P'(effective)   = F(effective) ? P'(permitted) : P'(ambient)
-\&
-P'(inheritable) = P(inheritable)    [i.e., unchanged]
-\&
-P'(bounding)    = P(bounding)       [i.e., unchanged]
-.EE
-.in
-.PP
-where:
-.RS 4
-.TP
-P()
-denotes the value of a thread capability set before the
-.BR execve (2)
-.TP
-P'()
-denotes the value of a thread capability set after the
-.BR execve (2)
-.TP
-F()
-denotes a file capability set
-.RE
-.PP
-Note the following details relating to the above capability
-transformation rules:
-.IP \[bu] 3
-The ambient capability set is present only since Linux 4.3.
-When determining the transformation of the ambient set during
-.BR execve (2),
-a privileged file is one that has capabilities or
-has the set-user-ID or set-group-ID bit set.
-.IP \[bu]
-Prior to Linux 2.6.25,
-the bounding set was a system-wide attribute shared by all threads.
-That system-wide value was employed to calculate the new permitted set during
-.BR execve (2)
-in the same manner as shown above for
-.IR P(bounding) .
-.PP
-.IR Note :
-during the capability transitions described above,
-file capabilities may be ignored (treated as empty) for the same reasons
-that the set-user-ID and set-group-ID bits are ignored; see
-.BR execve (2).
-File capabilities are similarly ignored if the kernel was booted with the
-.I no_file_caps
-option.
-.PP
-.IR Note :
-according to the rules above,
-if a process with nonzero user IDs performs an
-.BR execve (2)
-then any capabilities that are present in
-its permitted and effective sets will be cleared.
-For the treatment of capabilities when a process with a
-user ID of zero performs an
-.BR execve (2),
-see
-.I Capabilities and execution of programs by root
-below.
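-.PP
-The transformation rules above can be illustrated with a small C model.
-This is a sketch only, not kernel code:
-the structure and function names are hypothetical,
-each capability set is assumed to fit in a 64-bit mask,
-and the special handling of root described below is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; each set is a bit mask indexed by capability. */
struct thread_caps {
    uint64_t permitted, effective, inheritable, bounding, ambient;
};

struct file_caps {
    uint64_t permitted, inheritable;
    bool effective;     /* the single file effective bit */
    bool privileged;    /* file has capabilities or a set-ID mode bit */
};

/* Apply the execve(2) rules shown above to the pre-exec sets "p". */
struct thread_caps
caps_after_execve(struct thread_caps p, struct file_caps f)
{
    struct thread_caps n;

    n.ambient = f.privileged ? 0 : p.ambient;
    n.permitted = (p.inheritable & f.inheritable) |
                  (f.permitted & p.bounding) | n.ambient;
    n.effective = f.effective ? n.permitted : n.ambient;
    n.inheritable = p.inheritable;      /* unchanged */
    n.bounding = p.bounding;            /* unchanged */
    return n;
}
```

-.PP
-With empty file capability sets and an empty ambient set,
-the function returns empty permitted and effective sets,
-which is the case described in the note above for
-a process with nonzero user IDs.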
-.\"
-.SS Safety checking for capability-dumb binaries
-A capability-dumb binary is an application that has been
-marked to have file capabilities, but has not been converted to use the
-.BR libcap (3)
-API to manipulate its capabilities.
-(In other words, this is a traditional set-user-ID-root program
-that has been switched to use file capabilities,
-but whose code has not been modified to understand capabilities.)
-For such applications,
-the effective capability bit is set on the file,
-so that the file permitted capabilities are automatically
-enabled in the process effective set when executing the file.
-The kernel recognizes a file which has the effective capability bit set
-as capability-dumb for the purpose of the check described here.
-.PP
-When executing a capability-dumb binary,
-the kernel checks if the process obtained all permitted capabilities
-that were specified in the file permitted set,
-after the capability transformations described above have been performed.
-(The typical reason why this might
-.I not
-occur is that the capability bounding set masked out some
-of the capabilities in the file permitted set.)
-If the process did not obtain the full set of
-file permitted capabilities, then
-.BR execve (2)
-fails with the error
-.BR EPERM .
-This prevents possible security risks that could arise when
-a capability-dumb application is executed with less privilege than it needs.
-Note that, by definition,
-the application could not itself recognize this problem,
-since it does not employ the
-.BR libcap (3)
-API.
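-.PP
-The check described above amounts to a single mask comparison.
-A minimal sketch, with a hypothetical function name,
-assuming 64-bit masks and that
-.I new_permitted
-holds P'(permitted) as computed by the transformation rules:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if execve(2) would be refused with EPERM by the
 * capability-dumb check: the file effective bit is set, but some
 * capability in the file permitted set is missing from P'(permitted).
 */
bool
capability_dumb_exec_fails(uint64_t new_permitted,
                           uint64_t file_permitted, bool file_effective)
{
    if (!file_effective)
        return false;   /* binary is not treated as capability-dumb */

    return (file_permitted & ~new_permitted) != 0;
}
```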
-.\"
-.SS Capabilities and execution of programs by root
-.\" See cap_bprm_set_creds(), bprm_caps_from_vfs_cap() and
-.\" handle_privileged_root() in security/commoncap.c (Linux 5.0 source)
-In order to mirror traditional UNIX semantics,
-the kernel performs special treatment of file capabilities when
-a process with UID 0 (root) executes a program and
-when a set-user-ID-root program is executed.
-.PP
-After having performed any changes to the process effective ID that
-were triggered by the set-user-ID mode bit of the binary\[em]e.g.,
-switching the effective user ID to 0 (root) because
-a set-user-ID-root program was executed\[em]the
-kernel calculates the file capability sets as follows:
-.IP (1) 5
-If the real or effective user ID of the process is 0 (root),
-then the file inheritable and permitted sets are ignored;
-instead they are notionally considered to be all ones
-(i.e., all capabilities enabled).
-(There is one exception to this behavior, described in
-.I Set-user-ID-root programs that have file capabilities
-below.)
-.IP (2)
-If the effective user ID of the process is 0 (root) or
-the file effective bit is in fact enabled,
-then the file effective bit is notionally defined to be one (enabled).
-.PP
-These notional values for the file's capability sets are then used
-as described above to calculate the transformation of the process's
-capabilities during
-.BR execve (2).
-.PP
-Thus, when a process with nonzero UIDs
-.BR execve (2)s
-a set-user-ID-root program that does not have capabilities attached,
-or when a process whose real and effective UIDs are zero
-.BR execve (2)s
-a program, the calculation of the process's new
-permitted capabilities simplifies to:
-.PP
-.in +4n
-.EX
-P'(permitted)   = P(inheritable) | P(bounding)
-\&
-P'(effective)   = P'(permitted)
-.EE
-.in
-.PP
-Consequently, the process gains all capabilities in its permitted and
-effective capability sets,
-except those masked out by the capability bounding set.
-(In the calculation of P'(permitted),
-the P'(ambient) term can be simplified away because it is by
-definition a subset of P(inheritable).)
-.PP
-The special treatments of user ID 0 (root) described in this subsection
-can be disabled using the securebits mechanism described below.
-.\"
-.\"
-.SS Set-user-ID-root programs that have file capabilities
-There is one exception to the behavior described in
-.I Capabilities and execution of programs by root
-above.
-If (a) the binary that is being executed has capabilities attached and
-(b) the real user ID of the process is
-.I not
-0 (root) and
-(c) the effective user ID of the process
-.I is
-0 (root), then the file capability bits are honored
-(i.e., they are not notionally considered to be all ones).
-The usual way in which this situation can arise is when executing
-a set-UID-root program that also has file capabilities.
-When such a program is executed,
-the process gains just the capabilities granted by the program
-(i.e., not all capabilities,
-as would occur when executing a set-user-ID-root program
-that does not have any associated file capabilities).
-.PP
-Note that one can assign empty capability sets to a program file,
-and thus it is possible to create a set-user-ID-root program that
-changes the effective and saved set-user-ID of the process
-that executes the program to 0,
-but confers no capabilities to that process.
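-.PP
-The decision of whether the file inheritable and permitted sets are
-honored or notionally replaced by all ones can be summarized as follows.
-The function name is hypothetical;
-the sketch covers only rule (1) and the exception described in this
-subsection, and ignores securebits and user namespaces.

```c
#include <stdbool.h>
#include <sys/types.h>

/*
 * Return true if the file inheritable and permitted sets are ignored
 * (treated as all ones) because the process is root, false if the
 * file capability bits are honored as stored.
 */
bool
root_overrides_file_caps(uid_t real_uid, uid_t effective_uid,
                         bool has_file_caps)
{
    /* Exception: set-user-ID-root binary that has file capabilities. */
    if (has_file_caps && real_uid != 0 && effective_uid == 0)
        return false;

    return real_uid == 0 || effective_uid == 0;
}
```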
-.\"
-.SS Capability bounding set
-The capability bounding set is a security mechanism that can be used
-to limit the capabilities that can be gained during an
-.BR execve (2).
-The bounding set is used in the following ways:
-.IP \[bu] 3
-During an
-.BR execve (2),
-the capability bounding set is ANDed with the file permitted
-capability set, and the result of this operation is assigned to the
-thread's permitted capability set.
-The capability bounding set thus places a limit on the permitted
-capabilities that may be granted by an executable file.
-.IP \[bu]
-(Since Linux 2.6.25)
-The capability bounding set acts as a limiting superset for
-the capabilities that a thread can add to its inheritable set using
-.BR capset (2).
-This means that if a capability is not in the bounding set,
-then a thread can't add this capability to its
-inheritable set, even if it was in its permitted capabilities,
-and thereby cannot have this capability preserved in its
-permitted set when it
-.BR execve (2)s
-a file that has the capability in its inheritable set.
-.PP
-Note that the bounding set masks the file permitted capabilities,
-but not the inheritable capabilities.
-If a thread maintains a capability in its inheritable set
-that is not in its bounding set,
-then it can still gain that capability in its permitted set
-by executing a file that has the capability in its inheritable set.
-.PP
-Depending on the kernel version, the capability bounding set is either
-a system-wide attribute, or a per-thread attribute.
-.PP
-.B "Capability bounding set from Linux 2.6.25 onward"
-.PP
-From Linux 2.6.25, the
-.I "capability bounding set"
-is a per-thread attribute.
-(The system-wide capability bounding set described below no longer exists.)
-.PP
-The bounding set is inherited at
-.BR fork (2)
-from the thread's parent, and is preserved across an
-.BR execve (2).
-.PP
-A thread may remove capabilities from its capability bounding set using the
-.BR prctl (2)
-.B PR_CAPBSET_DROP
-operation, provided it has the
-.B CAP_SETPCAP
-capability.
-Once a capability has been dropped from the bounding set,
-it cannot be restored to that set.
-A thread can determine if a capability is in its bounding set using the
-.BR prctl (2)
-.B PR_CAPBSET_READ
-operation.
-.PP
-Removing capabilities from the bounding set is supported only if file
-capabilities are compiled into the kernel.
-Before Linux 2.6.33,
-file capabilities were an optional feature configurable via the
-.B CONFIG_SECURITY_FILE_CAPABILITIES
-option.
-Since Linux 2.6.33,
-.\" commit b3a222e52e4d4be77cc4520a57af1a4a0d8222d1
-the configuration option has been removed
-and file capabilities are always part of the kernel.
-When file capabilities are compiled into the kernel, the
-.B init
-process (the ancestor of all processes) begins with a full bounding set.
-If file capabilities are not compiled into the kernel, then
-.B init
-begins with a full bounding set minus
-.BR CAP_SETPCAP ,
-because this capability has a different meaning when there are
-no file capabilities.
-.PP
-Removing a capability from the bounding set does not remove it
-from the thread's inheritable set.
-However it does prevent the capability from being added
-back into the thread's inheritable set in the future.
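-.PP
-A minimal sketch of the two
-.BR prctl (2)
-operations described above.
-The wrapper names are hypothetical; the unused
-.BR prctl (2)
-arguments are passed as zero.

```c
#include <sys/prctl.h>
#include <linux/capability.h>

/*
 * Return 1 if "cap" is in the calling thread's bounding set, 0 if it
 * is not, or -1 (with errno set) if the kernel rejects "cap".
 */
int
bounding_set_has(int cap)
{
    return prctl(PR_CAPBSET_READ, (unsigned long) cap, 0UL, 0UL, 0UL);
}

/*
 * Remove "cap" from the calling thread's bounding set.  Returns 0 on
 * success or -1 on error, for example EPERM without CAP_SETPCAP.
 * The removal cannot be undone.
 */
int
bounding_set_drop(int cap)
{
    return prctl(PR_CAPBSET_DROP, (unsigned long) cap, 0UL, 0UL, 0UL);
}
```

-.PP
-A caller might test
-.I bounding_set_has(CAP_SYS_ADMIN)
-before deciding whether a drop is still needed.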
-.PP
-.B "Capability bounding set prior to Linux 2.6.25"
-.PP
-Before Linux 2.6.25, the capability bounding set is a system-wide
-attribute that affects all threads on the system.
-The bounding set is accessible via the file
-.IR /proc/sys/kernel/cap\-bound .
-(Confusingly, this bit mask parameter is expressed as a
-signed decimal number in
-.IR /proc/sys/kernel/cap\-bound .)
-.PP
-Only the
-.B init
-process may set capabilities in the capability bounding set;
-other than that, the superuser (more precisely: a process with the
-.B CAP_SYS_MODULE
-capability) may only clear capabilities from this set.
-.PP
-On a standard system the capability bounding set always masks out the
-.B CAP_SETPCAP
-capability.
-To remove this restriction (dangerous!), modify the definition of
-.B CAP_INIT_EFF_SET
-in
-.I include/linux/capability.h
-and rebuild the kernel.
-.PP
-The system-wide capability bounding set feature was added
-to Linux 2.2.11.
-.\"
-.\"
-.\"
-.SS Effect of user ID changes on capabilities
-To preserve the traditional semantics for transitions between
-0 and nonzero user IDs,
-the kernel makes the following changes to a thread's capability
-sets on changes to the thread's real, effective, saved set,
-and filesystem user IDs (using
-.BR setuid (2),
-.BR setresuid (2),
-or similar):
-.IP \[bu] 3
-If one or more of the real, effective, or saved set user IDs
-was previously 0, and as a result of the UID changes all of these IDs
-have a nonzero value,
-then all capabilities are cleared from the permitted, effective, and ambient
-capability sets.
-.IP \[bu]
-If the effective user ID is changed from 0 to nonzero,
-then all capabilities are cleared from the effective set.
-.IP \[bu]
-If the effective user ID is changed from nonzero to 0,
-then the permitted set is copied to the effective set.
-.IP \[bu]
-If the filesystem user ID is changed from 0 to nonzero (see
-.BR setfsuid (2)),
-then the following capabilities are cleared from the effective set:
-.BR CAP_CHOWN ,
-.BR CAP_DAC_OVERRIDE ,
-.BR CAP_DAC_READ_SEARCH ,
-.BR CAP_FOWNER ,
-.BR CAP_FSETID ,
-.B CAP_LINUX_IMMUTABLE
-(since Linux 2.6.30),
-.BR CAP_MAC_OVERRIDE ,
-and
-.B CAP_MKNOD
-(since Linux 2.6.30).
-If the filesystem UID is changed from nonzero to 0,
-then any of these capabilities that are enabled in the permitted set
-are enabled in the effective set.
-.PP
-If a thread that has a 0 value for one or more of its user IDs wants
-to prevent its permitted capability set being cleared when it resets
-all of its user IDs to nonzero values, it can do so using the
-.B SECBIT_KEEP_CAPS
-securebits flag described below.
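-.PP
-The rules in the list above can be modeled as follows.
-This is an illustrative sketch, not kernel code:
-the type and function names are hypothetical,
-capability sets are assumed to fit in 64-bit masks,
-and the
-.B SECBIT_KEEP_CAPS
-and
-.B SECBIT_NO_SETUID_FIXUP
-flags are assumed to be clear.

```c
#include <stdint.h>
#include <sys/types.h>
#include <linux/capability.h>

#define CAP_BIT(c)  (UINT64_C(1) << (c))

/* Capabilities affected by filesystem UID changes, as listed above. */
#define FS_CAP_MASK (CAP_BIT(CAP_CHOWN) | CAP_BIT(CAP_DAC_OVERRIDE) | \
                     CAP_BIT(CAP_DAC_READ_SEARCH) | CAP_BIT(CAP_FOWNER) | \
                     CAP_BIT(CAP_FSETID) | CAP_BIT(CAP_LINUX_IMMUTABLE) | \
                     CAP_BIT(CAP_MAC_OVERRIDE) | CAP_BIT(CAP_MKNOD))

struct uid_set { uid_t real, effective, saved, fs; };
struct cap_sets { uint64_t permitted, effective, ambient; };

/* Update "c" for a transition from the IDs in "old" to those in "now". */
void
caps_after_uid_change(struct uid_set old, struct uid_set now,
                      struct cap_sets *c)
{
    int had_zero = old.real == 0 || old.effective == 0 || old.saved == 0;
    int all_nonzero = now.real != 0 && now.effective != 0 &&
                      now.saved != 0;

    if (had_zero && all_nonzero)
        c->permitted = c->effective = c->ambient = 0;

    if (old.effective == 0 && now.effective != 0)
        c->effective = 0;
    if (old.effective != 0 && now.effective == 0)
        c->effective = c->permitted;

    if (old.fs == 0 && now.fs != 0)
        c->effective &= ~FS_CAP_MASK;
    if (old.fs != 0 && now.fs == 0)
        c->effective |= c->permitted & FS_CAP_MASK;
}
```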
-.\"
-.SS Programmatically adjusting capability sets
-A thread can retrieve and change its permitted, effective, and inheritable
-capability sets using the
-.BR capget (2)
-and
-.BR capset (2)
-system calls.
-However, the use of
-.BR cap_get_proc (3)
-and
-.BR cap_set_proc (3),
-both provided in the
-.I libcap
-package,
-is preferred for this purpose.
-The following rules govern changes to the thread capability sets:
-.IP \[bu] 3
-If the caller does not have the
-.B CAP_SETPCAP
-capability,
-the new inheritable set must be a subset of the combination
-of the existing inheritable and permitted sets.
-.IP \[bu]
-(Since Linux 2.6.25)
-The new inheritable set must be a subset of the combination of the
-existing inheritable set and the capability bounding set.
-.IP \[bu]
-The new permitted set must be a subset of the existing permitted set
-(i.e., it is not possible to acquire permitted capabilities
-that the thread does not currently have).
-.IP \[bu]
-The new effective set must be a subset of the new permitted set.
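-.PP
-The four rules above can be written as a validity check on a
-proposed change.
-This is a sketch with hypothetical names, assuming 64-bit masks;
-.I has_setpcap
-stands for whether the caller has the
-.B CAP_SETPCAP
-capability.

```c
#include <stdbool.h>
#include <stdint.h>

struct capset_state { uint64_t permitted, effective, inheritable; };

/* Return true if changing from "cur" to "want" obeys the rules above. */
bool
capset_allowed(const struct capset_state *cur,
               const struct capset_state *want,
               uint64_t bounding, bool has_setpcap)
{
    /* Without CAP_SETPCAP: new inheritable within old I | old P. */
    if (!has_setpcap &&
        (want->inheritable & ~(cur->inheritable | cur->permitted)))
        return false;

    /* New inheritable within old inheritable | bounding set. */
    if (want->inheritable & ~(cur->inheritable | bounding))
        return false;

    /* New permitted must not exceed old permitted. */
    if (want->permitted & ~cur->permitted)
        return false;

    /* New effective must lie within new permitted. */
    if (want->effective & ~want->permitted)
        return false;

    return true;
}
```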
-.SS The securebits flags: establishing a capabilities-only environment
-.\" For some background:
-.\" see http://lwn.net/Articles/280279/ and
-.\" http://article.gmane.org/gmane.linux.kernel.lsm/5476/
-Starting with Linux 2.6.26,
-and with a kernel in which file capabilities are enabled,
-Linux implements a set of per-thread
-.I securebits
-flags that can be used to disable special handling of capabilities for UID 0
-.RI ( root ).
-These flags are as follows:
-.TP
-.B SECBIT_KEEP_CAPS
-Setting this flag allows a thread that has one or more 0 UIDs to retain
-capabilities in its permitted set
-when it switches all of its UIDs to nonzero values.
-If this flag is not set,
-then such a UID switch causes the thread to lose all permitted capabilities.
-This flag is always cleared on an
-.BR execve (2).
-.IP
-Note that even with the
-.B SECBIT_KEEP_CAPS
-flag set, the effective capabilities of a thread are cleared when it
-switches its effective UID to a nonzero value.
-However,
-if the thread has set this flag and its effective UID is already nonzero,
-and the thread subsequently switches all other UIDs to nonzero values,
-then the effective capabilities will not be cleared.
-.IP
-The setting of the
-.B SECBIT_KEEP_CAPS
-flag is ignored if the
-.B SECBIT_NO_SETUID_FIXUP
-flag is set.
-(The latter flag provides a superset of the effect of the former flag.)
-.IP
-This flag provides the same functionality as the older
-.BR prctl (2)
-.B PR_SET_KEEPCAPS
-operation.
-.TP
-.B SECBIT_NO_SETUID_FIXUP
-Setting this flag stops the kernel from adjusting the process's
-permitted, effective, and ambient capability sets when
-the thread's effective and filesystem UIDs are switched between
-zero and nonzero values.
-See
-.I Effect of user ID changes on capabilities
-above.
-.TP
-.B SECBIT_NOROOT
-If this bit is set, then the kernel does not grant capabilities
-when a set-user-ID-root program is executed, or when a process with
-an effective or real UID of 0 calls
-.BR execve (2).
-(See
-.I Capabilities and execution of programs by root
-above.)
-.TP
-.B SECBIT_NO_CAP_AMBIENT_RAISE
-Setting this flag disallows raising ambient capabilities via the
-.BR prctl (2)
-.B PR_CAP_AMBIENT_RAISE
-operation.
-.PP
-Each of the above "base" flags has a companion "locked" flag.
-Setting any of the "locked" flags is irreversible,
-and has the effect of preventing further changes to the
-corresponding "base" flag.
-The locked flags are:
-.BR SECBIT_KEEP_CAPS_LOCKED ,
-.BR SECBIT_NO_SETUID_FIXUP_LOCKED ,
-.BR SECBIT_NOROOT_LOCKED ,
-and
-.BR SECBIT_NO_CAP_AMBIENT_RAISE_LOCKED .
-.PP
-The
-.I securebits
-flags can be modified and retrieved using the
-.BR prctl (2)
-.B PR_SET_SECUREBITS
-and
-.B PR_GET_SECUREBITS
-operations.
-The
-.B CAP_SETPCAP
-capability is required to modify the flags.
-Note that the
-.B SECBIT_*
-constants are available only after including the
-.I <linux/securebits.h>
-header file.
-.PP
-The
-.I securebits
-flags are inherited by child processes.
-During an
-.BR execve (2),
-all of the flags are preserved, except
-.B SECBIT_KEEP_CAPS
-which is always cleared.
-.PP
-An application can use the following call to lock itself,
-and all of its descendants,
-into an environment where the only way of gaining capabilities
-is by executing a program with associated file capabilities:
-.PP
-.in +4n
-.EX
-prctl(PR_SET_SECUREBITS,
-        /* SECBIT_KEEP_CAPS off */
-        SECBIT_KEEP_CAPS_LOCKED |
-        SECBIT_NO_SETUID_FIXUP |
-        SECBIT_NO_SETUID_FIXUP_LOCKED |
-        SECBIT_NOROOT |
-        SECBIT_NOROOT_LOCKED);
-        /* Setting/locking SECBIT_NO_CAP_AMBIENT_RAISE
-           is not required */
-.EE
-.in
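-.PP
-A program may want to verify that it is running in such an environment.
-The following sketch tests a securebits value for the flag
-combination used in the call above;
-the macro and function names are hypothetical.

```c
#include <linux/securebits.h>

/* The flag combination used in the prctl() call above. */
#define CAPS_ONLY_BITS (SECBIT_KEEP_CAPS_LOCKED | \
                        SECBIT_NO_SETUID_FIXUP | \
                        SECBIT_NO_SETUID_FIXUP_LOCKED | \
                        SECBIT_NOROOT | \
                        SECBIT_NOROOT_LOCKED)

/*
 * Return 1 if "bits" describes the locked, capabilities-only
 * environment: all of CAPS_ONLY_BITS set and SECBIT_KEEP_CAPS clear.
 */
int
is_capabilities_only(int bits)
{
    return (bits & CAPS_ONLY_BITS) == CAPS_ONLY_BITS &&
           (bits & SECBIT_KEEP_CAPS) == 0;
}
```

-.PP
-The value to test would typically be the result of the
-.BR prctl (2)
-.B PR_GET_SECUREBITS
-operation.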
-.\"
-.\"
-.SS Per-user-namespace """set-user-ID-root""" programs
-A set-user-ID program whose UID matches the UID that
-created a user namespace will confer capabilities
-in the process's permitted and effective sets
-when executed by any process inside that namespace
-or any descendant user namespace.
-.PP
-The rules about the transformation of the process's capabilities during the
-.BR execve (2)
-are exactly as described in
-.I Transformation of capabilities during execve()
-and
-.I Capabilities and execution of programs by root
-above,
-with the difference that, in the latter subsection, "root"
-is the UID of the creator of the user namespace.
-.\"
-.\"
-.SS Namespaced file capabilities
-.\" commit 8db6c34f1dbc8e06aa016a9b829b06902c3e1340
-Traditional (i.e., version 2) file capabilities associate
-only a set of capability masks with a binary executable file.
-When a process executes a binary with such capabilities,
-it gains the associated capabilities (within its user namespace)
-as per the rules described in
-.I Transformation of capabilities during execve()
-above.
-.PP
-Because version 2 file capabilities confer capabilities to
-the executing process regardless of which user namespace it resides in,
-only privileged processes are permitted to associate capabilities with a file.
-Here, "privileged" means a process that has the
-.B CAP_SETFCAP
-capability in the user namespace where the filesystem was mounted
-(normally the initial user namespace).
-This limitation renders file capabilities useless for certain use cases.
-For example, in user-namespaced containers,
-it can be desirable to be able to create a binary that
-confers capabilities only to processes executed inside that container,
-but not to processes that are executed outside the container.
-.PP
-Linux 4.14 added so-called namespaced file capabilities
-to support such use cases.
-Namespaced file capabilities are recorded as version 3 (i.e.,
-.BR VFS_CAP_REVISION_3 )
-.I security.capability
-extended attributes.
-Such an attribute is automatically created in the circumstances described
-in
-.I File capability extended attribute versioning
-above.
-When a version 3
-.I security.capability
-extended attribute is created,
-the kernel records not just the capability masks in the extended attribute,
-but also the namespace root user ID.
-.PP
-As with a binary that has
-.B VFS_CAP_REVISION_2
-file capabilities, a binary with
-.B VFS_CAP_REVISION_3
-file capabilities confers capabilities to a process during
-.BR execve (2).
-However, capabilities are conferred only if the binary is executed by
-a process that resides in a user namespace whose
-UID 0 maps to the root user ID that is saved in the extended attribute,
-or when executed by a process that resides in a descendant of such a namespace.
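-.PP
-The difference between the two attribute versions can be seen by
-inspecting the raw attribute bytes.
-The sketch below assumes the on-disk layout declared in
-.I <linux/capability.h>
-(a little-endian 32-bit magic word, the capability masks,
-and for version 3 a trailing 32-bit root user ID);
-the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <linux/capability.h>

/* Read a little-endian 32-bit value from "p". */
static uint32_t
le32(const unsigned char *p)
{
    return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
           ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/*
 * Return 2 or 3 for a version 2 or version 3 security.capability
 * value, or -1 if the buffer is not recognized.  For version 3, the
 * saved namespace root user ID is stored in "*rootid" if non-NULL.
 */
int
filecap_revision(const unsigned char *buf, size_t len, uint32_t *rootid)
{
    uint32_t revision;

    if (len < sizeof(uint32_t))
        return -1;

    revision = le32(buf) & VFS_CAP_REVISION_MASK;

    if (revision == VFS_CAP_REVISION_2 && len == XATTR_CAPS_SZ_2)
        return 2;

    if (revision == VFS_CAP_REVISION_3 && len == XATTR_CAPS_SZ_3) {
        if (rootid != NULL)
            *rootid = le32(buf + XATTR_CAPS_SZ_3 - sizeof(uint32_t));
        return 3;
    }

    return -1;
}
```

-.PP
-Note that a value obtained with
-.BR getxattr (2)
-from inside the owning user namespace is presented in the
-version 2 format, as described in
-.I File capability extended attribute versioning
-above.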
-.\"
-.\"
-.SS Interaction with user namespaces
-For further information on the interaction of
-capabilities and user namespaces, see
-.BR user_namespaces (7).
-.SH STANDARDS
-No standards govern capabilities, but the Linux capability implementation
-is based on the withdrawn
-.UR https://archive.org\:/details\:/posix_1003.1e\-990310
-POSIX.1e draft standard
-.UE .
-.SH NOTES
-When attempting to
-.BR strace (1)
-binaries that have capabilities (or set-user-ID-root binaries),
-you may find the
-.I \-u <username>
-option useful.
-For example:
-.PP
-.in +4n
-.EX
-$ \fBsudo strace \-o trace.log \-u ceci ./myprivprog\fP
-.EE
-.in
-.PP
-From Linux 2.5.27 to Linux 2.6.26,
-.\" commit 5915eb53861c5776cfec33ca4fcc1fd20d66dd27 removed
-.\" CONFIG_SECURITY_CAPABILITIES
-capabilities were an optional kernel component,
-and could be enabled/disabled via the
-.B CONFIG_SECURITY_CAPABILITIES
-kernel configuration option.
-.PP
-The
-.IR /proc/ pid /task/TID/status
-file can be used to view the capability sets of a thread.
-The
-.IR /proc/ pid /status
-file shows the capability sets of a process's main thread.
-Before Linux 3.8, nonexistent capabilities were shown as being
-enabled (1) in these sets.
-Since Linux 3.8,
-.\" 7b9a7ec565505699f503b4fcf61500dceb36e744
-all nonexistent capabilities (above
-.BR CAP_LAST_CAP )
-are shown as disabled (0).
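-.PP
-The capability fields in these files can be decoded with a few
-lines of C.
-The sketch below assumes the format documented in
-.BR proc (5),
-in which each set appears on a line such as
-.I CapEff:
-followed by a hexadecimal mask;
-the function name is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one status line.  "field" is a name such as "CapEff" (without
 * the colon).  Returns 0 and stores the mask on success, or -1 if the
 * line does not belong to "field" or cannot be parsed.
 */
int
parse_cap_line(const char *line, const char *field, uint64_t *mask)
{
    size_t n = strlen(field);

    if (strncmp(line, field, n) != 0 || line[n] != ':')
        return -1;

    /* The hexadecimal value follows the colon, after white space. */
    return sscanf(line + n + 1, "%" SCNx64, mask) == 1 ? 0 : -1;
}
```

-.PP
-A caller would read the status file line by line and pass each line
-to the function until it returns 0 for the field of interest.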
-.PP
-The
-.I libcap
-package provides a suite of routines for setting and
-getting capabilities that is more comfortable and less likely
-to change than the interface provided by
-.BR capset (2)
-and
-.BR capget (2).
-This package also provides the
-.BR setcap (8)
-and
-.BR getcap (8)
-programs.
-It can be found at
-.br
-.UR https://git.kernel.org\:/pub\:/scm\:/libs\:/libcap\:/libcap.git\:/refs/
-.UE .
-.PP
-Before Linux 2.6.24, and from Linux 2.6.24 to Linux 2.6.32 if
-file capabilities are not enabled, a thread with the
-.B CAP_SETPCAP
-capability can manipulate the capabilities of threads other than itself.
-However, this is only theoretically possible,
-since no thread ever has
-.B CAP_SETPCAP
-in either of these cases:
-.IP \[bu] 3
-In the pre-2.6.25 implementation the system-wide capability bounding set,
-.IR /proc/sys/kernel/cap\-bound ,
-always masks out the
-.B CAP_SETPCAP
-capability, and this cannot be changed
-without modifying the kernel source and rebuilding the kernel.
-.IP \[bu]
-If file capabilities are disabled (i.e., the kernel
-.B CONFIG_SECURITY_FILE_CAPABILITIES
-option is disabled), then
-.B init
-starts out with the
-.B CAP_SETPCAP
-capability removed from its per-process bounding
-set, and that bounding set is inherited by all other processes
-created on the system.
-.SH SEE ALSO
-.BR capsh (1),
-.BR setpriv (1),
-.BR prctl (2),
-.BR setfsuid (2),
-.BR cap_clear (3),
-.BR cap_copy_ext (3),
-.BR cap_from_text (3),
-.BR cap_get_file (3),
-.BR cap_get_proc (3),
-.BR cap_init (3),
-.BR capgetp (3),
-.BR capsetp (3),
-.BR libcap (3),
-.BR proc (5),
-.BR credentials (7),
-.BR pthreads (7),
-.BR user_namespaces (7),
-.BR captest (8), \" from libcap-ng
-.BR filecap (8), \" from libcap-ng
-.BR getcap (8),
-.BR getpcaps (8),
-.BR netcap (8), \" from libcap-ng
-.BR pscap (8), \" from libcap-ng
-.BR setcap (8)
-.PP
-.I include/linux/capability.h
-in the Linux kernel source tree