proc.5, proc_sys.5: Split /proc/sys/ from proc(5)

Signed-off-by: Alejandro Colomar <alx@kernel.org>
author: Alejandro Colomar <alx@kernel.org> 2023-08-15 20:35:38 +0200
committer: Alejandro Colomar <alx@kernel.org> 2023-08-15 23:19:17 +0200
commit: bfc1299e7d6794265dc89806f601737d2bb958dc (patch)
tree: 160de735507a3f988d2dcc4d05a17e2c8c7d8bfa /man5
parent: 30bcb8114ea32e33b99caacc018c68a5893468dc (diff)
2 files changed, 1623 insertions, 1610 deletions
diff --git a/man5/proc.5 b/man5/proc.5
index d7d706e5f..de29415eb 100644
--- a/man5/proc.5
+++ b/man5/proc.5
@@ -1,7 +1,6 @@
 '\" t
 .\" Copyright (C) 1994, 1995, Daniel Quinlan <quinlan@yggdrasil.com>
 .\" Copyright (C) 2002-2008, 2017, Michael Kerrisk <mtk.manpages@gmail.com>
-.\" and sysctl additions from Andries Brouwer (aeb@cwi.nl)
 .\" and System V IPC (as well as various other) additions from
 .\" Michael Kerrisk <mtk.manpages@gmail.com>
 .\"
@@ -228,1615 +227,6 @@ hierarchy.
 .\" FIXME Document /proc/sched_debug (since Linux 2.6.23)
 .\" See also /proc/[pid]/sched
 .TP
-.I /proc/sys
-This directory (present since Linux 1.3.57) contains a number of files
-and subdirectories corresponding to kernel variables.
-These variables can be read and in some cases modified using
-the \fI/proc\fP filesystem, and the (deprecated)
-.BR sysctl (2)
-system call.
-.IP
-String values may be terminated by either \[aq]\e0\[aq] or \[aq]\en\[aq].
-.IP
-Integer and long values may be written either in decimal or in
-hexadecimal notation (e.g., 0x3FFF).
-When writing multiple integer or long values, these may be separated
-by any of the following whitespace characters:
-\[aq]\ \[aq], \[aq]\et\[aq], or \[aq]\en\[aq].
-Using other separators leads to the error
-.BR EINVAL .
-.TP
-.IR /proc/sys/abi " (since Linux 2.4.10)"
-This directory may contain files with application binary information.
-.\" On some systems, it is not present.
-See the Linux kernel source file
-.I Documentation/sysctl/abi.rst
-(or
-.I Documentation/sysctl/abi.txt
-before Linux 5.3)
-for more information.
-.TP
-.I /proc/sys/debug
-This directory may be empty.
-.TP
-.I /proc/sys/dev
-This directory contains device-specific information (e.g.,
-.IR dev/cdrom/info ).
-On
-some systems, it may be empty.
-.TP
-.I /proc/sys/fs
-This directory contains the files and subdirectories for kernel variables
-related to filesystems.
-.TP
-.IR /proc/sys/fs/aio\-max\-nr " and " /proc/sys/fs/aio\-nr " (since Linux 2.6.4)"
-.I aio\-nr
-is the running total of the number of events specified by
-.BR io_setup (2)
-calls for all currently active AIO contexts.
-If
-.I aio\-nr
-reaches
-.IR aio\-max\-nr ,
-then
-.BR io_setup (2)
-will fail with the error
-.BR EAGAIN .
-Raising
-.I aio\-max\-nr
-does not result in the preallocation or resizing
-of any kernel data structures.
-.TP
-.I /proc/sys/fs/binfmt_misc
-Documentation for files in this directory can be found
-in the Linux kernel source in the file
-.I Documentation/admin\-guide/binfmt\-misc.rst
-(or in
-.I Documentation/binfmt_misc.txt
-on older kernels).
-.TP
-.IR /proc/sys/fs/dentry\-state " (since Linux 2.2)"
-This file contains information about the status of the
-directory cache (dcache).
-The file contains six numbers,
-.IR nr_dentry ,
-.IR nr_unused ,
-.I age_limit
-(age in seconds),
-.I want_pages
-(pages requested by system) and two dummy values.
-.RS
-.IP \[bu] 3
-.I nr_dentry
-is the number of allocated dentries (dcache entries).
-This field is unused in Linux 2.2.
-.IP \[bu]
-.I nr_unused
-is the number of unused dentries.
-.IP \[bu]
-.I age_limit
-.\" looks like this is unused in Linux 2.2 to Linux 2.6
-is the age in seconds after which dcache entries
-can be reclaimed when memory is short.
-.IP \[bu]
-.I want_pages
-.\" looks like this is unused in Linux 2.2 to Linux 2.6
-is nonzero when the kernel has called shrink_dcache_pages() and the
-dcache isn't pruned yet.
-.RE
-.TP
-.I /proc/sys/fs/dir\-notify\-enable
-This file can be used to disable or enable the
-.I dnotify
-interface described in
-.BR fcntl (2)
-on a system-wide basis.
-A value of 0 in this file disables the interface,
-and a value of 1 enables it.
-.TP
-.I /proc/sys/fs/dquot\-max
-This file shows the maximum number of cached disk quota entries.
-On some (2.4) systems, it is not present.
-If the number of free cached disk quota entries is very low and
-you have some awesome number of simultaneous system users,
-you might want to raise the limit.
-.TP
-.I /proc/sys/fs/dquot\-nr
-This file shows the number of allocated disk quota
-entries and the number of free disk quota entries.
-.TP
-.IR /proc/sys/fs/epoll " (since Linux 2.6.28)"
-This directory contains the file
-.IR max_user_watches ,
-which can be used to limit the amount of kernel memory consumed by the
-.I epoll
-interface.
-For further details, see
-.BR epoll (7).
-.TP
-.I /proc/sys/fs/file\-max
-This file defines
-a system-wide limit on the number of open files for all processes.
-System calls that fail when encountering this limit fail with the error
-.BR ENFILE .
-(See also
-.BR setrlimit (2),
-which can be used by a process to set the per-process limit,
-.BR RLIMIT_NOFILE ,
-on the number of files it may open.)
-If you get lots
-of error messages in the kernel log about running out of file handles
-(open file descriptions)
-(look for "VFS: file\-max limit <number> reached"),
-try increasing this value:
-.IP
-.in +4n
-.EX
-echo 100000 > /proc/sys/fs/file\-max
-.EE
-.in
-.IP
-Privileged processes
-.RB ( CAP_SYS_ADMIN )
-can override the
-.I file\-max
-limit.
-.TP
-.I /proc/sys/fs/file\-nr
-This (read-only) file contains three numbers:
-the number of allocated file handles
-(i.e., the number of open file descriptions; see
-.BR open (2));
-the number of free file handles;
-and the maximum number of file handles (i.e., the same value as
-.IR /proc/sys/fs/file\-max ).
-If the number of allocated file handles is close to the
-maximum, you should consider increasing the maximum.
-Before Linux 2.6,
-the kernel allocated file handles dynamically,
-but it didn't free them again.
-Instead the free file handles were kept in a list for reallocation;
-the "free file handles" value indicates the size of that list.
-A large number of free file handles indicates that there was
-a past peak in the usage of open file handles.
-Since Linux 2.6, the kernel does deallocate freed file handles,
-and the "free file handles" value is always zero.
-.TP
-.IR /proc/sys/fs/inode\-max " (only present until Linux 2.2)"
-This file contains the maximum number of in-memory inodes.
-This value should be 3\[en]4 times larger
-than the value in
-.IR file\-max ,
-since \fIstdin\fP, \fIstdout\fP
-and network sockets also need an inode to handle them.
-When you regularly run out of inodes, you need to increase this value.
-.IP
-Starting with Linux 2.4,
-there is no longer a static limit on the number of inodes,
-and this file is removed.
-.TP
-.I /proc/sys/fs/inode\-nr
-This file contains the first two values from
-.IR inode\-state .
-.TP
-.I /proc/sys/fs/inode\-state
-This file
-contains seven numbers:
-.IR nr_inodes ,
-.IR nr_free_inodes ,
-.IR preshrink ,
-and four dummy values (always zero).
-.IP
-.I nr_inodes
-is the number of inodes the system has allocated.
-.\" This can be slightly more than
-.\" .I inode\-max
-.\" because Linux allocates them one page full at a time.
-.I nr_free_inodes
-represents the number of free inodes.
-.IP
-.I preshrink
-is nonzero when the
-.I nr_inodes
->
-.I inode\-max
-and the system needs to prune the inode list instead of allocating more;
-since Linux 2.4, this field is a dummy value (always zero).
-.TP
-.IR /proc/sys/fs/inotify " (since Linux 2.6.13)"
-This directory contains files
-.IR max_queued_events ", " max_user_instances ", and " max_user_watches ,
-that can be used to limit the amount of kernel memory consumed by the
-.I inotify
-interface.
-For further details, see
-.BR inotify (7).
-.TP
-.I /proc/sys/fs/lease\-break\-time
-This file specifies the grace period that the kernel grants to a process
-holding a file lease
-.RB ( fcntl (2))
-after it has sent a signal to that process notifying it
-that another process is waiting to open the file.
-If the lease holder does not remove or downgrade the lease within
-this grace period, the kernel forcibly breaks the lease.
-.TP
-.I /proc/sys/fs/leases\-enable
-This file can be used to enable or disable file leases
-.RB ( fcntl (2))
-on a system-wide basis.
-If this file contains the value 0, leases are disabled.
-A nonzero value enables leases.
-.TP
-.IR /proc/sys/fs/mount\-max " (since Linux 4.9)"
-.\" commit d29216842a85c7970c536108e093963f02714498
-The value in this file specifies the maximum number of mounts that may exist
-in a mount namespace.
-The default value in this file is 100,000.
-.TP
-.IR /proc/sys/fs/mqueue " (since Linux 2.6.6)"
-This directory contains files
-.IR msg_max ", " msgsize_max ", and " queues_max ,
-controlling the resources used by POSIX message queues.
-See
-.BR mq_overview (7)
-for details.
-.TP
-.IR /proc/sys/fs/nr_open " (since Linux 2.6.25)"
-.\" commit 9cfe015aa424b3c003baba3841a60dd9b5ad319b
-This file imposes a ceiling on the value to which the
-.B RLIMIT_NOFILE
-resource limit can be raised (see
-.BR getrlimit (2)).
-This ceiling is enforced for both unprivileged and privileged process.
-The default value in this file is 1048576.
-(Before Linux 2.6.25, the ceiling for
-.B RLIMIT_NOFILE
-was hard-coded to the same value.)
-.TP
-.IR /proc/sys/fs/overflowgid " and " /proc/sys/fs/overflowuid
-These files
-allow you to change the value of the fixed UID and GID.
-The default is 65534.
-Some filesystems support only 16-bit UIDs and GIDs, although in Linux
-UIDs and GIDs are 32 bits.
-When one of these filesystems is mounted
-with writes enabled, any UID or GID that would exceed 65535 is translated
-to the overflow value before being written to disk.
-.TP
-.IR /proc/sys/fs/pipe\-max\-size " (since Linux 2.6.35)"
-See
-.BR pipe (7).
-.TP
-.IR /proc/sys/fs/pipe\-user\-pages\-hard " (since Linux 4.5)"
-See
-.BR pipe (7).
-.TP
-.IR /proc/sys/fs/pipe\-user\-pages\-soft " (since Linux 4.5)"
-See
-.BR pipe (7).
-.TP
-.IR /proc/sys/fs/protected_fifos " (since Linux 4.19)"
-The value in this file is/can be set to one of the following:
-.RS
-.TP 4
-0
-Writing to FIFOs is unrestricted.
-.TP
-1
-Don't allow
-.B O_CREAT
-.BR open (2)
-on FIFOs that the caller doesn't own in world-writable sticky directories,
-unless the FIFO is owned by the owner of the directory.
-.TP
-2
-As for the value 1,
-but the restriction also applies to group-writable sticky directories.
-.RE
-.IP
-The intent of the above protections is to avoid unintentional writes to an
-attacker-controlled FIFO when a program expected to create a regular file.
-.TP
-.IR /proc/sys/fs/protected_hardlinks " (since Linux 3.6)"
-.\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7
-When the value in this file is 0,
-no restrictions are placed on the creation of hard links
-(i.e., this is the historical behavior before Linux 3.6).
-When the value in this file is 1,
-a hard link can be created to a target file
-only if one of the following conditions is true:
-.RS
-.IP \[bu] 3
-The calling process has the
-.B CAP_FOWNER
-capability in its user namespace
-and the file UID has a mapping in the namespace.
-.IP \[bu]
-The filesystem UID of the process creating the link matches
-the owner (UID) of the target file
-(as described in
-.BR credentials (7),
-a process's filesystem UID is normally the same as its effective UID).
-.IP \[bu]
-All of the following conditions are true:
-.RS 4
-.IP \[bu] 3
-the target is a regular file;
-.IP \[bu]
-the target file does not have its set-user-ID mode bit enabled;
-.IP \[bu]
-the target file does not have both its set-group-ID and
-group-executable mode bits enabled; and
-.IP \[bu]
-the caller has permission to read and write the target file
-(either via the file's permissions mask or because it has
-suitable capabilities).
-.RE
-.RE
-.IP
-The default value in this file is 0.
-Setting the value to 1
-prevents a longstanding class of security issues caused by
-hard-link-based time-of-check, time-of-use races,
-most commonly seen in world-writable directories such as
-.IR /tmp .
-The common method of exploiting this flaw
-is to cross privilege boundaries when following a given hard link
-(i.e., a root process follows a hard link created by another user).
-Additionally, on systems without separated partitions,
-this stops unauthorized users from "pinning" vulnerable set-user-ID and
-set-group-ID files against being upgraded by
-the administrator, or linking to special files.
-.TP
-.IR /proc/sys/fs/protected_regular " (since Linux 4.19)"
-The value in this file is/can be set to one of the following:
-.RS
-.TP 4
-0
-Writing to regular files is unrestricted.
-.TP
-1
-Don't allow
-.B O_CREAT
-.BR open (2)
-on regular files that the caller doesn't own in
-world-writable sticky directories,
-unless the regular file is owned by the owner of the directory.
-.TP
-2
-As for the value 1,
-but the restriction also applies to group-writable sticky directories.
-.RE
-.IP
-The intent of the above protections is similar to
-.IR protected_fifos ,
-but allows an application to
-avoid writes to an attacker-controlled regular file,
-where the application expected to create one.
-.TP
-.IR /proc/sys/fs/protected_symlinks " (since Linux 3.6)"
-.\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7
-When the value in this file is 0,
-no restrictions are placed on following symbolic links
-(i.e., this is the historical behavior before Linux 3.6).
-When the value in this file is 1, symbolic links are followed only
-in the following circumstances:
-.RS
-.IP \[bu] 3
-the filesystem UID of the process following the link matches
-the owner (UID) of the symbolic link
-(as described in
-.BR credentials (7),
-a process's filesystem UID is normally the same as its effective UID);
-.IP \[bu]
-the link is not in a sticky world-writable directory; or
-.IP \[bu]
-the symbolic link and its parent directory have the same owner (UID)
-.RE
-.IP
-A system call that fails to follow a symbolic link
-because of the above restrictions returns the error
-.B EACCES
-in
-.IR errno .
-.IP
-The default value in this file is 0.
-Setting the value to 1 avoids a longstanding class of security issues
-based on time-of-check, time-of-use races when accessing symbolic links.
-.TP
-.IR /proc/sys/fs/suid_dumpable " (since Linux 2.6.13)"
-.\" The following is based on text from Documentation/sysctl/kernel.txt
-The value in this file is assigned to a process's "dumpable" flag
-in the circumstances described in
-.BR prctl (2).
-In effect,
-the value in this file determines whether core dump files are
-produced for set-user-ID or otherwise protected/tainted binaries.
-The "dumpable" setting also affects the ownership of files in a process's
-.IR /proc/ pid
-directory, as described above.
-.IP
-Three different integer values can be specified:
-.RS
-.TP
-\fI0\ (default)\fP
-.\" In kernel source: SUID_DUMP_DISABLE
-This provides the traditional (pre-Linux 2.6.13) behavior.
-A core dump will not be produced for a process which has
-changed credentials (by calling
-.BR seteuid (2),
-.BR setgid (2),
-or similar, or by executing a set-user-ID or set-group-ID program)
-or whose binary does not have read permission enabled.
-.TP
-\fI1\ ("debug")\fP
-.\" In kernel source: SUID_DUMP_USER
-All processes dump core when possible.
-(Reasons why a process might nevertheless not dump core are described in
-.BR core (5).)
-The core dump is owned by the filesystem user ID of the dumping process
-and no security is applied.
-This is intended for system debugging situations only:
-this mode is insecure because it allows unprivileged users to
-examine the memory contents of privileged processes.
-.TP
-\fI2\ ("suidsafe")\fP
-.\" In kernel source: SUID_DUMP_ROOT
-Any binary which normally would not be dumped (see "0" above)
-is dumped readable by root only.
-This allows the user to remove the core dump file but not to read it.
-For security reasons core dumps in this mode will not overwrite one
-another or other files.
-This mode is appropriate when administrators are
-attempting to debug problems in a normal environment.
-.IP
-Additionally, since Linux 3.6,
-.\" 9520628e8ceb69fa9a4aee6b57f22675d9e1b709
-.I /proc/sys/kernel/core_pattern
-must either be an absolute pathname
-or a pipe command, as detailed in
-.BR core (5).
-Warnings will be written to the kernel log if
-.I core_pattern
-does not follow these rules, and no core dump will be produced.
-.\" 54b501992dd2a839e94e76aa392c392b55080ce8
-.RE
-.IP
-For details of the effect of a process's "dumpable" setting
-on ptrace access mode checking, see
-.BR ptrace (2).
-.TP
-.I /proc/sys/fs/super\-max
-This file
-controls the maximum number of superblocks, and
-thus the maximum number of mounted filesystems the kernel
-can have.
-You need increase only
-.I super\-max
-if you need to mount more filesystems than the current value in
-.I super\-max
-allows you to.
-.TP
-.I /proc/sys/fs/super\-nr
-This file
-contains the number of filesystems currently mounted.
-.TP
-.I /proc/sys/kernel
-This directory contains files controlling a range of kernel parameters,
-as described below.
-.TP
-.I /proc/sys/kernel/acct
-This file
-contains three numbers:
-.IR highwater ,
-.IR lowwater ,
-and
-.IR frequency .
-If BSD-style process accounting is enabled, these values control
-its behavior.
-If free space on filesystem where the log lives goes below
-.I lowwater
-percent, accounting suspends.
-If free space gets above
-.I highwater
-percent, accounting resumes.
-.I frequency
-determines
-how often the kernel checks the amount of free space (value is in
-seconds).
-Default values are 4, 2, and 30.
-That is, suspend accounting if 2% or less space is free; resume it
-if 4% or more space is free; consider information about amount of free space
-valid for 30 seconds.
-.TP
-.IR /proc/sys/kernel/auto_msgmni " (Linux 2.6.27 to Linux 3.18)"
-.\" commit 9eefe520c814f6f62c5d36a2ddcd3fb99dfdb30e (introduces feature)
-.\" commit 0050ee059f7fc86b1df2527aaa14ed5dc72f9973 (rendered redundant)
-From Linux 2.6.27 to Linux 3.18,
-this file was used to control recomputing of the value in
-.I /proc/sys/kernel/msgmni
-upon the addition or removal of memory or upon IPC namespace creation/removal.
-Echoing "1" into this file enabled
-.I msgmni
-automatic recomputing (and triggered a recomputation of
-.I msgmni
-based on the current amount of available memory and number of IPC namespaces).
-Echoing "0" disabled automatic recomputing.
-(Automatic recomputing was also disabled if a value was explicitly assigned to
-.IR /proc/sys/kernel/msgmni .)
-The default value in
-.I auto_msgmni
-was 1.
-.IP
-Since Linux 3.19, the content of this file has no effect (because
-.I msgmni
-.\" FIXME Must document the 3.19 'msgmni' changes.
-defaults to near the maximum value possible),
-and reads from this file always return the value "0".
-.TP
-.IR /proc/sys/kernel/cap_last_cap " (since Linux 3.2)"
-See
-.BR capabilities (7).
-.TP
-.IR /proc/sys/kernel/cap\-bound " (from Linux 2.2 to Linux 2.6.24)"
-This file holds the value of the kernel
-.I "capability bounding set"
-(expressed as a signed decimal number).
-This set is ANDed against the capabilities permitted to a process
-during
-.BR execve (2).
-Starting with Linux 2.6.25,
-the system-wide capability bounding set disappeared,
-and was replaced by a per-thread bounding set; see
-.BR capabilities (7).
-.TP
-.I /proc/sys/kernel/core_pattern
-See
-.BR core (5).
-.TP
-.I /proc/sys/kernel/core_pipe_limit
-See
-.BR core (5).
-.TP
-.I /proc/sys/kernel/core_uses_pid
-See
-.BR core (5).
-.TP
-.I /proc/sys/kernel/ctrl\-alt\-del
-This file
-controls the handling of Ctrl-Alt-Del from the keyboard.
-When the value in this file is 0, Ctrl-Alt-Del is trapped and
-sent to the
-.BR init (1)
-program to handle a graceful restart.
-When the value is greater than zero, Linux's reaction to a Vulcan
-Nerve Pinch (tm) will be an immediate reboot, without even
-syncing its dirty buffers.
-Note: when a program (like dosemu) has the keyboard in "raw"
-mode, the Ctrl-Alt-Del is intercepted by the program before it
-ever reaches the kernel tty layer, and it's up to the program
-to decide what to do with it.
-.TP
-.IR /proc/sys/kernel/dmesg_restrict " (since Linux 2.6.37)"
-The value in this file determines who can see kernel syslog contents.
-A value of 0 in this file imposes no restrictions.
-If the value is 1, only privileged users can read the kernel syslog.
-(See
-.BR syslog (2)
-for more details.)
-Since Linux 3.4,
-.\" commit 620f6e8e855d6d447688a5f67a4e176944a084e8
-only users with the
-.B CAP_SYS_ADMIN
-capability may change the value in this file.
-.TP
-.IR /proc/sys/kernel/domainname " and " /proc/sys/kernel/hostname
-can be used to set the NIS/YP domainname and the
-hostname of your box in exactly the same way as the commands
-.BR domainname (1)
-and
-.BR hostname (1),
-that is:
-.IP
-.in +4n
-.EX
-.RB "#" " echo \[aq]darkstar\[aq] > /proc/sys/kernel/hostname"
-.RB "#" " echo \[aq]mydomain\[aq] > /proc/sys/kernel/domainname"
-.EE
-.in
-.IP
-has the same effect as
-.IP
-.in +4n
-.EX
-.RB "#" " hostname \[aq]darkstar\[aq]"
-.RB "#" " domainname \[aq]mydomain\[aq]"
-.EE
-.in
-.IP
-Note, however, that the classic darkstar.frop.org has the
-hostname "darkstar" and DNS (Internet Domain Name Server)
-domainname "frop.org", not to be confused with the NIS (Network
-Information Service) or YP (Yellow Pages) domainname.
-These two
-domain names are in general different.
-For a detailed discussion
-see the
-.BR hostname (1)
-man page.
-.TP
-.I /proc/sys/kernel/hotplug
-This file
-contains the pathname for the hotplug policy agent.
-The default value in this file is
-.IR /sbin/hotplug .
-.TP
-.\" Removed in commit 87f504e5c78b910b0c1d6ffb89bc95e492322c84 (tglx/history.git)
-.IR /proc/sys/kernel/htab\-reclaim " (before Linux 2.4.9.2)"
-(PowerPC only) If this file is set to a nonzero value,
-the PowerPC htab
-.\" removed in commit 1b483a6a7b2998e9c98ad985d7494b9b725bd228, before Linux 2.6.28
-(see kernel file
-.IR Documentation/powerpc/ppc_htab.txt )
-is pruned
-each time the system hits the idle loop.
-.TP
-.I /proc/sys/kernel/keys/*
-This directory contains various files that define parameters and limits
-for the key-management facility.
-These files are described in
-.BR keyrings (7).
-.TP
-.IR /proc/sys/kernel/kptr_restrict " (since Linux 2.6.38)"
-.\" 455cd5ab305c90ffc422dd2e0fb634730942b257
-The value in this file determines whether kernel addresses are exposed via
-.I /proc
-files and other interfaces.
-A value of 0 in this file imposes no restrictions.
-If the value is 1, kernel pointers printed using the
-.I %pK
-format specifier will be replaced with zeros unless the user has the
-.B CAP_SYSLOG
-capability.
-If the value is 2, kernel pointers printed using the
-.I %pK
-format specifier will be replaced with zeros regardless
-of the user's capabilities.
-The initial default value for this file was 1,
-but the default was changed
-.\" commit 411f05f123cbd7f8aa1edcae86970755a6e2a9d9
-to 0 in Linux 2.6.39.
-Since Linux 3.4,
-.\" commit 620f6e8e855d6d447688a5f67a4e176944a084e8
-only users with the
-.B CAP_SYS_ADMIN
-capability can change the value in this file.
-.TP
-.I /proc/sys/kernel/l2cr
-(PowerPC only) This file
-contains a flag that controls the L2 cache of G3 processor
-boards.
-If 0, the cache is disabled.
-Enabled if nonzero.
-.TP
-.I /proc/sys/kernel/modprobe
-This file contains the pathname for the kernel module loader.
-The default value is
-.IR /sbin/modprobe .
-The file is present only if the kernel is built with the
-.B CONFIG_MODULES
-.RB ( CONFIG_KMOD
-in Linux 2.6.26 and earlier)
-option enabled.
-It is described by the Linux kernel source file
-.I Documentation/kmod.txt
-(present only in Linux 2.4 and earlier).
-.TP
-.IR /proc/sys/kernel/modules_disabled " (since Linux 2.6.31)"
-.\" 3d43321b7015387cfebbe26436d0e9d299162ea1
-.\" From Documentation/sysctl/kernel.txt
-A toggle value indicating if modules are allowed to be loaded
-in an otherwise modular kernel.
-This toggle defaults to off (0), but can be set true (1).
-Once true, modules can be neither loaded nor unloaded,
-and the toggle cannot be set back to false.
-The file is present only if the kernel is built with the
-.B CONFIG_MODULES
-option enabled.
-.TP
-.IR /proc/sys/kernel/msgmax " (since Linux 2.2)"
-This file defines
-a system-wide limit specifying the maximum number of bytes in
-a single message written on a System V message queue.
-.TP
-.IR /proc/sys/kernel/msgmni " (since Linux 2.4)"
-This file defines the system-wide limit on the number of
-message queue identifiers.
-See also
-.IR /proc/sys/kernel/auto_msgmni .
-.TP
-.IR /proc/sys/kernel/msgmnb " (since Linux 2.2)"
-This file defines a system-wide parameter used to initialize the
-.I msg_qbytes
-setting for subsequently created message queues.
-The
-.I msg_qbytes
-setting specifies the maximum number of bytes that may be written to the
-message queue.
-.TP
-.IR /proc/sys/kernel/ngroups_max " (since Linux 2.6.4)"
-This is a read-only file that displays the upper limit on the
-number of a process's group memberships.
-.TP
-.IR /proc/sys/kernel/ns_last_pid " (since Linux 3.3)"
-See
-.BR pid_namespaces (7).
-.TP
-.IR /proc/sys/kernel/ostype " and " /proc/sys/kernel/osrelease
-These files
-give substrings of
-.IR /proc/version .
-.TP
-.IR /proc/sys/kernel/overflowgid " and " /proc/sys/kernel/overflowuid
-These files duplicate the files
-.I /proc/sys/fs/overflowgid
-and
-.IR /proc/sys/fs/overflowuid .
-.TP
-.I /proc/sys/kernel/panic
-This file gives read/write access to the kernel variable
-.IR panic_timeout .
-If this is zero, the kernel will loop on a panic; if nonzero,
-it indicates that the kernel should autoreboot after this number
-of seconds.
-When you use the
-software watchdog device driver, the recommended setting is 60.
-.TP
-.IR /proc/sys/kernel/panic_on_oops " (since Linux 2.5.68)"
-This file controls the kernel's behavior when an oops
-or BUG is encountered.
-If this file contains 0, then the system
-tries to continue operation.
-If it contains 1, then the system
-delays a few seconds (to give klogd time to record the oops output)
-and then panics.
-If the
-.I /proc/sys/kernel/panic
-file is also nonzero, then the machine will be rebooted.
-.TP
-.IR /proc/sys/kernel/pid_max " (since Linux 2.5.34)"
-This file specifies the value at which PIDs wrap around
-(i.e., the value in this file is one greater than the maximum PID).
-PIDs greater than this value are not allocated;
-thus, the value in this file also acts as a system-wide limit
-on the total number of processes and threads.
-The default value for this file, 32768,
-results in the same range of PIDs as on earlier kernels.
-On 32-bit platforms, 32768 is the maximum value for
-.IR pid_max .
-On 64-bit systems,
-.I pid_max
-can be set to any value up to 2\[ha]22
-.RB ( PID_MAX_LIMIT ,
-approximately 4 million).
-.\" Prior to Linux 2.6.10, pid_max could also be raised above 32768 on 32-bit
-.\" platforms, but this broke /proc/[pid]
-.\" See http://marc.theaimsgroup.com/?l=linux-kernel&m=109513010926152&w=2
-.TP
-.IR /proc/sys/kernel/powersave\-nap " (PowerPC only)"
-This file contains a flag.
-If set, Linux-PPC will use the "nap" mode of
-powersaving,
-otherwise the "doze" mode will be used.
-.TP
-.I /proc/sys/kernel/printk
-See
-.BR syslog (2).
-.TP
-.IR /proc/sys/kernel/pty " (since Linux 2.6.4)"
-This directory contains two files relating to the number of UNIX 98
-pseudoterminals (see
-.BR pts (4))
-on the system.
-.TP
-.I /proc/sys/kernel/pty/max
-This file defines the maximum number of pseudoterminals.
-.\" FIXME Document /proc/sys/kernel/pty/reserve
-.\"     New in Linux 3.3
-.\"     commit e9aba5158a80098447ff207a452a3418ae7ee386
-.TP
-.I /proc/sys/kernel/pty/nr
-This read-only file
-indicates how many pseudoterminals are currently in use.
-.TP
-.I /proc/sys/kernel/random
-This directory
-contains various parameters controlling the operation of the file
-.IR /dev/random .
-See
-.BR random (4)
-for further information.
-.TP
-.IR /proc/sys/kernel/random/uuid " (since Linux 2.4)"
-Each read from this read-only file returns a randomly generated 128-bit UUID,
-as a string in the standard UUID format.
-.TP
-.IR /proc/sys/kernel/randomize_va_space " (since Linux 2.6.12)"
-.\" Some further details can be found in Documentation/sysctl/kernel.txt
-Select the address space layout randomization (ASLR) policy for the system
-(on architectures that support ASLR).
-Three values are supported for this file:
-.RS
-.TP
-.B 0
-Turn ASLR off.
-This is the default for architectures that don't support ASLR,
-and when the kernel is booted with the
-.I norandmaps
-parameter.
-.TP
-.B 1
-Make the addresses of
-.BR mmap (2)
-allocations, the stack, and the VDSO page randomized.
-Among other things, this means that shared libraries will be
-loaded at randomized addresses.
-The text segment of PIE-linked binaries will also be loaded
-at a randomized address.
-This value is the default if the kernel was configured with
-.BR CONFIG_COMPAT_BRK .
-.TP
-.B 2
-(Since Linux 2.6.25)
-.\" commit c1d171a002942ea2d93b4fbd0c9583c56fce0772
-Also support heap randomization.
-This value is the default if the kernel was not configured with
-.BR CONFIG_COMPAT_BRK .
-.RE
-.TP
-.I /proc/sys/kernel/real\-root\-dev
-This file is documented in the Linux kernel source file
-.I Documentation/admin\-guide/initrd.rst
-.\" commit 9d85025b0418163fae079c9ba8f8445212de8568
-(or
-.I Documentation/initrd.txt
-before Linux 4.10).
-.TP
-.IR /proc/sys/kernel/reboot\-cmd " (Sparc only)"
-This file seems to be a way to give an argument to the SPARC
-ROM/Flash boot loader.
-Maybe to tell it what to do after
-rebooting?
-.TP
-.I /proc/sys/kernel/rtsig\-max
-(Up to and including Linux 2.6.7; see
-.BR setrlimit (2))
-This file can be used to tune the maximum number
-of POSIX real-time (queued) signals that can be outstanding
-in the system.
-.TP
-.I /proc/sys/kernel/rtsig\-nr
-(Up to and including Linux 2.6.7.)
-This file shows the number of POSIX real-time signals currently queued.
-.TP
-.IR /proc/ pid /sched_autogroup_enabled " (since Linux 2.6.38)"
-.\" commit 5091faa449ee0b7d73bc296a93bca9540fc51d0a
-See
-.BR sched (7).
-.TP
-.IR /proc/sys/kernel/sched_child_runs_first " (since Linux 2.6.23)"
-If this file contains the value zero, then, after a
-.BR fork (2),
-the parent is first scheduled on the CPU.
-If the file contains a nonzero value,
-then the child is scheduled first on the CPU.
-(Of course, on a multiprocessor system,
-the parent and the child might both immediately be scheduled on a CPU.)
-.TP
-.IR /proc/sys/kernel/sched_rr_timeslice_ms " (since Linux 3.9)"
-See
-.BR sched_rr_get_interval (2).
-.TP
-.IR /proc/sys/kernel/sched_rt_period_us " (since Linux 2.6.25)"
-See
-.BR sched (7).
-.TP
-.IR /proc/sys/kernel/sched_rt_runtime_us " (since Linux 2.6.25)"
-See
-.BR sched (7).
-.TP
-.IR /proc/sys/kernel/seccomp " (since Linux 4.14)"
-.\" commit 8e5f1ad116df6b0de65eac458d5e7c318d1c05af
-This directory provides additional seccomp information and
-configuration.
-See
-.BR seccomp (2)
-for further details.
-.TP
-.IR /proc/sys/kernel/sem " (since Linux 2.4)"
-This file contains 4 numbers defining limits for System V IPC semaphores.
-These fields are, in order:
-.RS
-.TP
-SEMMSL
-The maximum semaphores per semaphore set.
-.TP
-SEMMNS
-A system-wide limit on the number of semaphores in all semaphore sets.
-.TP
-SEMOPM
-The maximum number of operations that may be specified in a
-.BR semop (2)
-call.
-.TP
-SEMMNI
-A system-wide limit on the maximum number of semaphore identifiers.
-.RE
-.TP
-.I /proc/sys/kernel/sg\-big\-buff
-This file
-shows the size of the generic SCSI device (sg) buffer.
-You can't tune it just yet, but you could change it at
-compile time by editing
-.I include/scsi/sg.h
-and changing
-the value of
-.BR SG_BIG_BUFF .
-However, there shouldn't be any reason to change this value.
-.TP
-.IR /proc/sys/kernel/shm_rmid_forced " (since Linux 3.1)"
-.\" commit b34a6b1da371ed8af1221459a18c67970f7e3d53
-.\" See also Documentation/sysctl/kernel.txt
-If this file is set to 1, all System V shared memory segments will
-be marked for destruction as soon as the number of attached processes
-falls to zero;
-in other words, it is no longer possible to create shared memory segments
-that exist independently of any attached process.
-.IP
-The effect is as though a
-.BR shmctl (2)
-.B IPC_RMID
-is performed on all existing segments as well as all segments
-created in the future (until this file is reset to 0).
-Note that existing segments that are attached to no process will be
-immediately destroyed when this file is set to 1.
-Setting this option will also destroy segments that were created,
-but never attached,
-upon termination of the process that created the segment with
-.BR shmget (2).
-.IP
-Setting this file to 1 provides a way of ensuring that
-all System V shared memory segments are counted against the
-resource usage and resource limits (see the description of
-.B RLIMIT_AS
-in
-.BR getrlimit (2))
-of at least one process.
-.IP
-Because setting this file to 1 produces behavior that is nonstandard
-and could also break existing applications,
-the default value in this file is 0.
-Set this file to 1 only if you have a good understanding
-of the semantics of the applications using
-System V shared memory on your system.
-.TP
-.IR /proc/sys/kernel/shmall " (since Linux 2.2)"
-This file
-contains the system-wide limit on the total number of pages of
-System V shared memory.
-.TP
-.IR /proc/sys/kernel/shmmax " (since Linux 2.2)"
-This file
-can be used to query and set the run-time limit
-on the maximum (System V IPC) shared memory segment size that can be
-created.
-Shared memory segments up to 1 GB are now supported in the
-kernel.
-This value defaults to
-.BR SHMMAX .
-.TP
-.IR /proc/sys/kernel/shmmni " (since Linux 2.4)"
-This file
-specifies the system-wide maximum number of System V shared memory
-segments that can be created.
-.TP
-.IR /proc/sys/kernel/sysctl_writes_strict " (since Linux 3.16)"
-.\" commit f88083005ab319abba5d0b2e4e997558245493c8
-.\" commit 2ca9bb456ada8bcbdc8f77f8fc78207653bbaa92
-.\" commit f4aacea2f5d1a5f7e3154e967d70cf3f711bcd61
-.\" commit 24fe831c17ab8149413874f2fd4e5c8a41fcd294
-The value in this file determines how the file offset affects
-the behavior of updating entries in files under
-.IR /proc/sys .
-The file has three possible values:
-.RS
-.TP 4
-\-1
-This provides legacy handling, with no printk warnings.
-Each
-.BR write (2)
-must fully contain the value to be written,
-and multiple writes on the same file descriptor
-will overwrite the entire value, regardless of the file position.
-.TP
-0
-(default) This provides the same behavior as for \-1,
-but printk warnings are written for processes that
-perform writes when the file offset is not 0.
-.TP
-1
-Respect the file offset when writing strings into
-.I /proc/sys
-files.
-Multiple writes will
-.I append
-to the value buffer.
-Anything written beyond the maximum length
-of the value buffer will be ignored.
-Writes to numeric
-.I /proc/sys
-entries must always be at file offset 0 and the value must be
-fully contained in the buffer provided to
-.BR write (2).
-.\" FIXME .
-.\"     With /proc/sys/kernel/sysctl_writes_strict==1, writes at an
-.\"     offset other than 0 do not generate an error. Instead, the
-.\"     write() succeeds, but the file is left unmodified.
-.\"     This is surprising. The behavior may change in the future.
-.\"     See thread.gmane.org/gmane.linux.man/9197
-.\"		From: Michael Kerrisk (man-pages <mtk.manpages@...>
-.\"		Subject: sysctl_writes_strict documentation + an oddity?
-.\"		Newsgroups: gmane.linux.man, gmane.linux.kernel
-.\"		Date: 2015-05-09 08:54:11 GMT
-.RE
-.TP
-.I /proc/sys/kernel/sysrq
-This file controls the functions allowed to be invoked by the SysRq key.
-By default,
-the file contains 1 meaning that every possible SysRq request is allowed
-(in older kernel versions, SysRq was disabled by default,
-and you were required to specifically enable it at run-time,
-but this is not the case any more).
-Possible values in this file are:
-.RS
-.TP 5
-0
-Disable sysrq completely
-.TP
-1
-Enable all functions of sysrq
-.TP
-> 1
-Bit mask of allowed sysrq functions, as follows:
-.PD 0
-.RS
-.TP 5
-\ \ 2
-Enable control of console logging level
-.TP
-\ \ 4
-Enable control of keyboard (SAK, unraw)
-.TP
-\ \ 8
-Enable debugging dumps of processes etc.
-.TP
-\ 16
-Enable sync command
-.TP
-\ 32
-Enable remount read-only
-.TP
-\ 64
-Enable signaling of processes (term, kill, oom-kill)
-.TP
-128
-Allow reboot/poweroff
-.TP
-256
-Allow nicing of all real-time tasks
-.RE
-.PD
-.RE
-.IP
-This file is present only if the
-.B CONFIG_MAGIC_SYSRQ
-kernel configuration option is enabled.
-For further details see the Linux kernel source file
-.I Documentation/admin\-guide/sysrq.rst
-.\" commit 9d85025b0418163fae079c9ba8f8445212de8568
-(or
-.I Documentation/sysrq.txt
-before Linux 4.10).
-.TP
-.I /proc/sys/kernel/version
-This file contains a string such as:
-.IP
-.in +4n
-.EX
-#5 Wed Feb 25 21:49:24 MET 1998
-.EE
-.in
-.IP
-The "#5" means that
-this is the fifth kernel built from this source base and the
-date following it indicates the time the kernel was built.
-.TP
-.IR /proc/sys/kernel/threads\-max " (since Linux 2.3.11)"
-.\" The following is based on Documentation/sysctl/kernel.txt
-This file specifies the system-wide limit on the number of
-threads (tasks) that can be created on the system.
-.IP
-Since Linux 4.1,
-.\" commit 230633d109e35b0a24277498e773edeb79b4a331
-the value that can be written to
-.I threads\-max
-is bounded.
-The minimum value that can be written is 20.
-The maximum value that can be written is given by the
-constant
-.B FUTEX_TID_MASK
-(0x3fffffff).
-If a value outside of this range is written to
-.IR threads\-max ,
-the error
-.B EINVAL
-occurs.
-.IP
-The value written is checked against the available RAM pages.
-If the thread structures would occupy too much (more than 1/8th)
-of the available RAM pages,
-.I threads\-max
-is reduced accordingly.
-.TP
-.IR /proc/sys/kernel/yama/ptrace_scope " (since Linux 3.5)"
-See
-.BR ptrace (2).
-.TP
-.IR /proc/sys/kernel/zero\-paged " (PowerPC only)"
-This file
-contains a flag.
-When enabled (nonzero), Linux-PPC will pre-zero pages in
-the idle loop, possibly speeding up get_free_pages.
-.TP
-.I /proc/sys/net
-This directory contains networking stuff.
-Explanations for some of the files under this directory can be found in
-.BR tcp (7)
-and
-.BR ip (7).
-.TP
-.I /proc/sys/net/core/bpf_jit_enable
-See
-.BR bpf (2).
-.TP
-.I /proc/sys/net/core/somaxconn
-This file defines a ceiling value for the
-.I backlog
-argument of
-.BR listen (2);
-see the
-.BR listen (2)
-manual page for details.
-.TP
-.I /proc/sys/proc
-This directory may be empty.
-.TP
-.I /proc/sys/sunrpc
-This directory supports Sun remote procedure call for network filesystem
-(NFS).
-On some systems, it is not present.
-.TP
-.IR /proc/sys/user " (since Linux 4.9)"
-See
-.BR namespaces (7).
-.TP
-.I /proc/sys/vm
-This directory contains files for memory management tuning, buffer, and
-cache management.
-.TP
-.IR /proc/sys/vm/admin_reserve_kbytes " (since Linux 3.10)"
-.\" commit 4eeab4f5580d11bffedc697684b91b0bca0d5009
-This file defines the amount of free memory (in KiB) on the system that
-should be reserved for users with the capability
-.BR CAP_SYS_ADMIN .
-.IP
-The default value in this file is the minimum of [3% of free pages, 8MiB]
-expressed as KiB.
-The default is intended to provide enough for the superuser
-to log in and kill a process, if necessary,
-under the default overcommit 'guess' mode (i.e., 0 in
-.IR /proc/sys/vm/overcommit_memory ).
-.IP
-Systems running in "overcommit never" mode (i.e., 2 in
-.IR /proc/sys/vm/overcommit_memory )
-should increase the value in this file to account
-for the full virtual memory size of the programs used to recover (e.g.,
-.BR login (1)
-.BR ssh (1),
-and
-.BR top (1))
-Otherwise, the superuser may not be able to log in to recover the system.
-For example, on x86-64 a suitable value is 131072 (128MiB reserved).
-.IP
-Changing the value in this file takes effect whenever
-an application requests memory.
-.TP
-.IR /proc/sys/vm/compact_memory " (since Linux 2.6.35)"
-When 1 is written to this file, all zones are compacted such that free
-memory is available in contiguous blocks where possible.
-The effect of this action can be seen by examining
-.IR /proc/buddyinfo .
-.IP
-Present only if the kernel was configured with
-.BR CONFIG_COMPACTION .
-.TP
-.IR /proc/sys/vm/drop_caches " (since Linux 2.6.16)"
-Writing to this file causes the kernel to drop clean caches, dentries, and
-inodes from memory, causing that memory to become free.
-This can be useful for memory management testing and
-performing reproducible filesystem benchmarks.
-Because writing to this file causes the benefits of caching to be lost,
-it can degrade overall system performance.
-.IP
-To free pagecache, use:
-.IP
-.in +4n
-.EX
-echo 1 > /proc/sys/vm/drop_caches
-.EE
-.in
-.IP
-To free dentries and inodes, use:
-.IP
-.in +4n
-.EX
-echo 2 > /proc/sys/vm/drop_caches
-.EE
-.in
-.IP
-To free pagecache, dentries, and inodes, use:
-.IP
-.in +4n
-.EX
-echo 3 > /proc/sys/vm/drop_caches
-.EE
-.in
-.IP
-Because writing to this file is a nondestructive operation and dirty objects
-are not freeable, the
-user should run
-.BR sync (1)
-first.
-.TP
-.IR  /proc/sys/vm/sysctl_hugetlb_shm_group " (since Linux 2.6.7)"
-This writable file contains a group ID that is allowed
-to allocate memory using huge pages.
-If a process has a filesystem group ID or any supplementary group ID that
-matches this group ID,
-then it can make huge-page allocations without holding the
-.B CAP_IPC_LOCK
-capability; see
-.BR memfd_create (2),
-.BR mmap (2),
-and
-.BR shmget (2).
-.TP
-.IR /proc/sys/vm/legacy_va_layout " (since Linux 2.6.9)"
-.\" The following is from Documentation/filesystems/proc.txt
-If nonzero, this disables the new 32-bit memory-mapping layout;
-the kernel will use the legacy (2.4) layout for all processes.
-.TP
-.IR /proc/sys/vm/memory_failure_early_kill " (since Linux 2.6.32)"
-.\" The following is based on the text in Documentation/sysctl/vm.txt
-Control how to kill processes when an uncorrected memory error
-(typically a 2-bit error in a memory module)
-that cannot be handled by the kernel
-is detected in the background by hardware.
-In some cases (like the page still having a valid copy on disk),
-the kernel will handle the failure
-transparently without affecting any applications.
-But if there is no other up-to-date copy of the data,
-it will kill processes to prevent any data corruptions from propagating.
-.IP
-The file has one of the following values:
-.RS
-.TP
-.B 1
-Kill all processes that have the corrupted-and-not-reloadable page mapped
-as soon as the corruption is detected.
-Note that this is not supported for a few types of pages,
-such as kernel internally
-allocated data or the swap cache, but works for the majority of user pages.
-.TP
-.B 0
-Unmap the corrupted page from all processes and kill a process
-only if it tries to access the page.
-.RE
-.IP
-The kill is performed using a
-.B SIGBUS
-signal with
-.I si_code
-set to
-.BR BUS_MCEERR_AO .
-Processes can handle this if they want to; see
-.BR sigaction (2)
-for more details.
-.IP
-This feature is active only on architectures/platforms with advanced machine
-check handling and depends on the hardware capabilities.
-.IP
-Applications can override the
-.I memory_failure_early_kill
-setting individually with the
-.BR prctl (2)
-.B PR_MCE_KILL
-operation.
-.IP
-Present only if the kernel was configured with
-.BR CONFIG_MEMORY_FAILURE .
-.TP
-.IR /proc/sys/vm/memory_failure_recovery " (since Linux 2.6.32)"
-.\" The following is based on the text in Documentation/sysctl/vm.txt
-Enable memory failure recovery (when supported by the platform).
-.RS
-.TP
-.B 1
-Attempt recovery.
-.TP
-.B 0
-Always panic on a memory failure.
-.RE
-.IP
-Present only if the kernel was configured with
-.BR CONFIG_MEMORY_FAILURE .
-.TP
-.IR /proc/sys/vm/oom_dump_tasks " (since Linux 2.6.25)"
-.\" The following is from Documentation/sysctl/vm.txt
-Enables a system-wide task dump (excluding kernel threads) to be
-produced when the kernel performs an OOM-killing.
-The dump includes the following information
-for each task (thread, process):
-thread ID, real user ID, thread group ID (process ID),
-virtual memory size, resident set size,
-the CPU that the task is scheduled on,
-oom_adj score (see the description of
-.IR /proc/ pid /oom_adj ),
-and command name.
-This is helpful to determine why the OOM-killer was invoked
-and to identify the rogue task that caused it.
-.IP
-If this contains the value zero, this information is suppressed.
-On very large systems with thousands of tasks,
-it may not be feasible to dump the memory state information for each one.
-Such systems should not be forced to incur a performance penalty in
-OOM situations when the information may not be desired.
-.IP
-If this is set to nonzero, this information is shown whenever the
-OOM-killer actually kills a memory-hogging task.
-.IP
-The default value is 0.
-.TP
-.IR /proc/sys/vm/oom_kill_allocating_task " (since Linux 2.6.24)"
-.\" The following is from Documentation/sysctl/vm.txt
-This enables or disables killing the OOM-triggering task in
-out-of-memory situations.
-.IP
-If this is set to zero, the OOM-killer will scan through the entire
-tasklist and select a task based on heuristics to kill.
-This normally selects a rogue memory-hogging task that
-frees up a large amount of memory when killed.
-.IP
-If this is set to nonzero, the OOM-killer simply kills the task that
-triggered the out-of-memory condition.
-This avoids a possibly expensive tasklist scan.
-.IP
-If
-.I /proc/sys/vm/panic_on_oom
-is nonzero, it takes precedence over whatever value is used in
-.IR /proc/sys/vm/oom_kill_allocating_task .
-.IP
-The default value is 0.
-.TP
-.IR /proc/sys/vm/overcommit_kbytes " (since Linux 3.14)"
-.\" commit 49f0ce5f92321cdcf741e35f385669a421013cb7
-This writable file provides an alternative to
-.I /proc/sys/vm/overcommit_ratio
-for controlling the
-.I CommitLimit
-when
-.I /proc/sys/vm/overcommit_memory
-has the value 2.
-It allows the amount of memory overcommitting to be specified as
-an absolute value (in kB),
-rather than as a percentage, as is done with
-.IR overcommit_ratio .
-This allows for finer-grained control of
-.I CommitLimit
-on systems with extremely large memory sizes.
-.IP
-Only one of
-.I overcommit_kbytes
-or
-.I overcommit_ratio
-can have an effect:
-if
-.I overcommit_kbytes
-has a nonzero value, then it is used to calculate
-.IR CommitLimit ,
-otherwise
-.I overcommit_ratio
-is used.
-Writing a value to either of these files causes the
-value in the other file to be set to zero.
-.TP
-.I /proc/sys/vm/overcommit_memory
-This file contains the kernel virtual memory accounting mode.
-Values are:
-.RS
-.IP
-0: heuristic overcommit (this is the default)
-.br
-1: always overcommit, never check
-.br
-2: always check, never overcommit
-.RE
-.IP
-In mode 0, calls of
-.BR mmap (2)
-with
-.B MAP_NORESERVE
-are not checked, and the default check is very weak,
-leading to the risk of getting a process "OOM-killed".
-.IP
-In mode 1, the kernel pretends there is always enough memory,
-until memory actually runs out.
-One use case for this mode is scientific computing applications
-that employ large sparse arrays.
-Before Linux 2.6.0, any nonzero value implies mode 1.
-.IP
-In mode 2 (available since Linux 2.6), the total virtual address space
-that can be allocated
-.RI ( CommitLimit
-in
-.IR /proc/meminfo )
-is calculated as
-.IP
-.in +4n
-.EX
-CommitLimit = (total_RAM \- total_huge_TLB) *
-	      overcommit_ratio / 100 + total_swap
-.EE
-.in
-.IP
-where:
-.RS
-.IP \[bu] 3
-.I total_RAM
-is the total amount of RAM on the system;
-.IP \[bu]
-.I total_huge_TLB
-is the amount of memory set aside for huge pages;
-.IP \[bu]
-.I overcommit_ratio
-is the value in
-.IR /proc/sys/vm/overcommit_ratio ;
-and
-.IP \[bu]
-.I total_swap
-is the amount of swap space.
-.RE
-.IP
-For example, on a system with 16 GB of physical RAM, 16 GB
-of swap, no space dedicated to huge pages, and an
-.I overcommit_ratio
-of 50, this formula yields a
-.I CommitLimit
-of 24 GB.
-.IP
-Since Linux 3.14, if the value in
-.I /proc/sys/vm/overcommit_kbytes
-is nonzero, then
-.I CommitLimit
-is instead calculated as:
-.IP
-.in +4n
-.EX
-CommitLimit = overcommit_kbytes + total_swap
-.EE
-.in
-.IP
-See also the description of
-.I /proc/sys/vm/admin_reserve_kbytes
-and
-.IR /proc/sys/vm/user_reserve_kbytes .
-.TP
-.IR /proc/sys/vm/overcommit_ratio " (since Linux 2.6.0)"
-This writable file defines a percentage by which memory
-can be overcommitted.
-The default value in the file is 50.
-See the description of
-.IR /proc/sys/vm/overcommit_memory .
-.TP
-.IR /proc/sys/vm/panic_on_oom " (since Linux 2.6.18)"
-.\" The following is adapted from Documentation/sysctl/vm.txt
-This enables or disables a kernel panic in
-an out-of-memory situation.
-.IP
-If this file is set to the value 0,
-the kernel's OOM-killer will kill some rogue process.
-Usually, the OOM-killer is able to kill a rogue process and the
-system will survive.
-.IP
-If this file is set to the value 1,
-then the kernel normally panics when out-of-memory happens.
-However, if a process limits allocations to certain nodes
-using memory policies
-.RB ( mbind (2)
-.BR MPOL_BIND )
-or cpusets
-.RB ( cpuset (7))
-and those nodes reach memory exhaustion status,
-one process may be killed by the OOM-killer.
-No panic occurs in this case:
-because other nodes' memory may be free,
-this means the system as a whole may not have reached
-an out-of-memory situation yet.
-.IP
-If this file is set to the value 2,
-the kernel always panics when an out-of-memory condition occurs.
-.IP
-The default value is 0.
-1 and 2 are for failover of clustering.
-Select either according to your policy of failover.
-.TP
-.I /proc/sys/vm/swappiness
-.\" The following is from Documentation/sysctl/vm.txt
-The value in this file controls how aggressively the kernel will swap
-memory pages.
-Higher values increase aggressiveness, lower values
-decrease aggressiveness.
-The default value is 60.
-.TP
-.IR /proc/sys/vm/user_reserve_kbytes " (since Linux 3.10)"
-.\" commit c9b1d0981fcce3d9976d7b7a56e4e0503bc610dd
-Specifies an amount of memory (in KiB) to reserve for user processes.
-This is intended to prevent a user from starting a single memory hogging
-process, such that they cannot recover (kill the hog).
-The value in this file has an effect only when
-.I /proc/sys/vm/overcommit_memory
-is set to 2 ("overcommit never" mode).
-In this case, the system reserves an amount of memory that is the minimum
-of [3% of current process size,
-.IR user_reserve_kbytes ].
-.IP
-The default value in this file is the minimum of [3% of free pages, 128MiB]
-expressed as KiB.
-.IP
-If the value in this file is set to zero,
-then a user will be allowed to allocate all free memory with a single process
-(minus the amount reserved by
-.IR /proc/sys/vm/admin_reserve_kbytes ).
-Any subsequent attempts to execute a command will result in
-"fork: Cannot allocate memory".
-.IP
-Changing the value in this file takes effect whenever
-an application requests memory.
-.TP
-.IR /proc/sys/vm/unprivileged_userfaultfd " (since Linux 5.2)"
-.\" cefdca0a86be517bc390fc4541e3674b8e7803b0
-This (writable) file exposes a flag that controls whether
-unprivileged processes are allowed to employ
-.BR userfaultfd (2).
-If this file has the value 1, then unprivileged processes may use
-.BR userfaultfd (2).
-If this file has the value 0, then only processes that have the
-.B CAP_SYS_PTRACE
-capability may employ
-.BR userfaultfd (2).
-The default value in this file is 1.
-.TP
 .IR /proc/sysrq\-trigger " (since Linux 2.4.21)"
 Writing a character to this file triggers the same SysRq function as
 typing ALT-SysRq-<character> (see the description of
diff --git a/man5/proc_sys.5 b/man5/proc_sys.5
new file mode 100644
index 000000000..78f0c192c
--- /dev/null
+++ b/man5/proc_sys.5
@@ -0,0 +1,1623 @@
+'\" t
+.\" Copyright (C) 1994, 1995, Daniel Quinlan <quinlan@yggdrasil.com>
+.\" Copyright (C) 2002-2008, 2017, Michael Kerrisk <mtk.manpages@gmail.com>
+.\" Copyright (C) , Andries Brouwer <aeb@cwi.nl>
+.\" Copyright (C) 2023, Alejandro Colomar <alx@kernel.org>
+.\"
+.\" SPDX-License-Identifier: GPL-3.0-or-later
+.\"
+.TH proc_sys 5 (date) "Linux man-pages (unreleased)"
+.SH NAME
+/proc/sys/ \- system information, and sysctl pseudo-filesystem
+.SH DESCRIPTION
+.TP
+.I /proc/sys/
+This directory (present since Linux 1.3.57) contains a number of files
+and subdirectories corresponding to kernel variables.
+These variables can be read and in some cases modified using
+the \fI/proc\fP filesystem, and the (deprecated)
+.BR sysctl (2)
+system call.
+.IP
+String values may be terminated by either \[aq]\e0\[aq] or \[aq]\en\[aq].
+.IP
+Integer and long values may be written either in decimal or in
+hexadecimal notation (e.g., 0x3FFF).
+When writing multiple integer or long values, these may be separated
+by any of the following whitespace characters:
+\[aq]\ \[aq], \[aq]\et\[aq], or \[aq]\en\[aq].
+Using other separators leads to the error
+.BR EINVAL .
+.TP
+.IR /proc/sys/abi/ " (since Linux 2.4.10)"
+This directory may contain files with application binary information.
+.\" On some systems, it is not present.
+See the Linux kernel source file
+.I Documentation/sysctl/abi.rst
+(or
+.I Documentation/sysctl/abi.txt
+before Linux 5.3)
+for more information.
+.TP
+.I /proc/sys/debug/
+This directory may be empty.
+.TP
+.I /proc/sys/dev/
+This directory contains device-specific information (e.g.,
+.IR dev/cdrom/info ).
+On
+some systems, it may be empty.
+.TP
+.I /proc/sys/fs/
+This directory contains the files and subdirectories for kernel variables
+related to filesystems.
+.TP
+.IR /proc/sys/fs/aio\-max\-nr " and " /proc/sys/fs/aio\-nr " (since Linux 2.6.4)"
+.I aio\-nr
+is the running total of the number of events specified by
+.BR io_setup (2)
+calls for all currently active AIO contexts.
+If
+.I aio\-nr
+reaches
+.IR aio\-max\-nr ,
+then
+.BR io_setup (2)
+will fail with the error
+.BR EAGAIN .
+Raising
+.I aio\-max\-nr
+does not result in the preallocation or resizing
+of any kernel data structures.
+.TP
+.I /proc/sys/fs/binfmt_misc
+Documentation for files in this directory can be found
+in the Linux kernel source in the file
+.I Documentation/admin\-guide/binfmt\-misc.rst
+(or in
+.I Documentation/binfmt_misc.txt
+on older kernels).
+.TP
+.IR /proc/sys/fs/dentry\-state " (since Linux 2.2)"
+This file contains information about the status of the
+directory cache (dcache).
+The file contains six numbers,
+.IR nr_dentry ,
+.IR nr_unused ,
+.I age_limit
+(age in seconds),
+.I want_pages
+(pages requested by system) and two dummy values.
+.RS
+.IP \[bu] 3
+.I nr_dentry
+is the number of allocated dentries (dcache entries).
+This field is unused in Linux 2.2.
+.IP \[bu]
+.I nr_unused
+is the number of unused dentries.
+.IP \[bu]
+.I age_limit
+.\" looks like this is unused in Linux 2.2 to Linux 2.6
+is the age in seconds after which dcache entries
+can be reclaimed when memory is short.
+.IP \[bu]
+.I want_pages
+.\" looks like this is unused in Linux 2.2 to Linux 2.6
+is nonzero when the kernel has called shrink_dcache_pages() and the
+dcache isn't pruned yet.
+.RE
+.TP
+.I /proc/sys/fs/dir\-notify\-enable
+This file can be used to disable or enable the
+.I dnotify
+interface described in
+.BR fcntl (2)
+on a system-wide basis.
+A value of 0 in this file disables the interface,
+and a value of 1 enables it.
+.TP
+.I /proc/sys/fs/dquot\-max
+This file shows the maximum number of cached disk quota entries.
+On some (2.4) systems, it is not present.
+If the number of free cached disk quota entries is very low and
+you have some awesome number of simultaneous system users,
+you might want to raise the limit.
+.TP
+.I /proc/sys/fs/dquot\-nr
+This file shows the number of allocated disk quota
+entries and the number of free disk quota entries.
+.TP
+.IR /proc/sys/fs/epoll/ " (since Linux 2.6.28)"
+This directory contains the file
+.IR max_user_watches ,
+which can be used to limit the amount of kernel memory consumed by the
+.I epoll
+interface.
+For further details, see
+.BR epoll (7).
+.TP
+.I /proc/sys/fs/file\-max
+This file defines
+a system-wide limit on the number of open files for all processes.
+System calls that fail when encountering this limit fail with the error
+.BR ENFILE .
+(See also
+.BR setrlimit (2),
+which can be used by a process to set the per-process limit,
+.BR RLIMIT_NOFILE ,
+on the number of files it may open.)
+If you get lots
+of error messages in the kernel log about running out of file handles
+(open file descriptions)
+(look for "VFS: file\-max limit <number> reached"),
+try increasing this value:
+.IP
+.in +4n
+.EX
+echo 100000 > /proc/sys/fs/file\-max
+.EE
+.in
+.IP
+Privileged processes
+.RB ( CAP_SYS_ADMIN )
+can override the
+.I file\-max
+limit.
+.TP
+.I /proc/sys/fs/file\-nr
+This (read-only) file contains three numbers:
+the number of allocated file handles
+(i.e., the number of open file descriptions; see
+.BR open (2));
+the number of free file handles;
+and the maximum number of file handles (i.e., the same value as
+.IR /proc/sys/fs/file\-max ).
+If the number of allocated file handles is close to the
+maximum, you should consider increasing the maximum.
+Before Linux 2.6,
+the kernel allocated file handles dynamically,
+but it didn't free them again.
+Instead the free file handles were kept in a list for reallocation;
+the "free file handles" value indicates the size of that list.
+A large number of free file handles indicates that there was
+a past peak in the usage of open file handles.
+Since Linux 2.6, the kernel does deallocate freed file handles,
+and the "free file handles" value is always zero.
+.TP
+.IR /proc/sys/fs/inode\-max " (only present until Linux 2.2)"
+This file contains the maximum number of in-memory inodes.
+This value should be 3\[en]4 times larger
+than the value in
+.IR file\-max ,
+since \fIstdin\fP, \fIstdout\fP
+and network sockets also need an inode to handle them.
+When you regularly run out of inodes, you need to increase this value.
+.IP
+Starting with Linux 2.4,
+there is no longer a static limit on the number of inodes,
+and this file is removed.
+.TP
+.I /proc/sys/fs/inode\-nr
+This file contains the first two values from
+.IR inode\-state .
+.TP
+.I /proc/sys/fs/inode\-state
+This file
+contains seven numbers:
+.IR nr_inodes ,
+.IR nr_free_inodes ,
+.IR preshrink ,
+and four dummy values (always zero).
+.IP
+.I nr_inodes
+is the number of inodes the system has allocated.
+.\" This can be slightly more than
+.\" .I inode\-max
+.\" because Linux allocates them one page full at a time.
+.I nr_free_inodes
+represents the number of free inodes.
+.IP
+.I preshrink
+is nonzero when the
+.I nr_inodes
+>
+.I inode\-max
+and the system needs to prune the inode list instead of allocating more;
+since Linux 2.4, this field is a dummy value (always zero).
+.TP
+.IR /proc/sys/fs/inotify/ " (since Linux 2.6.13)"
+This directory contains files
+.IR max_queued_events ", " max_user_instances ", and " max_user_watches ,
+that can be used to limit the amount of kernel memory consumed by the
+.I inotify
+interface.
+For further details, see
+.BR inotify (7).
+.TP
+.I /proc/sys/fs/lease\-break\-time
+This file specifies the grace period that the kernel grants to a process
+holding a file lease
+.RB ( fcntl (2))
+after it has sent a signal to that process notifying it
+that another process is waiting to open the file.
+If the lease holder does not remove or downgrade the lease within
+this grace period, the kernel forcibly breaks the lease.
+.TP
+.I /proc/sys/fs/leases\-enable
+This file can be used to enable or disable file leases
+.RB ( fcntl (2))
+on a system-wide basis.
+If this file contains the value 0, leases are disabled.
+A nonzero value enables leases.
+.TP
+.IR /proc/sys/fs/mount\-max " (since Linux 4.9)"
+.\" commit d29216842a85c7970c536108e093963f02714498
+The value in this file specifies the maximum number of mounts that may exist
+in a mount namespace.
+The default value in this file is 100,000.
+.TP
+.IR /proc/sys/fs/mqueue/ " (since Linux 2.6.6)"
+This directory contains files
+.IR msg_max ", " msgsize_max ", and " queues_max ,
+controlling the resources used by POSIX message queues.
+See
+.BR mq_overview (7)
+for details.
+.TP
+.IR /proc/sys/fs/nr_open " (since Linux 2.6.25)"
+.\" commit 9cfe015aa424b3c003baba3841a60dd9b5ad319b
+This file imposes a ceiling on the value to which the
+.B RLIMIT_NOFILE
+resource limit can be raised (see
+.BR getrlimit (2)).
+This ceiling is enforced for both unprivileged and privileged process.
+The default value in this file is 1048576.
+(Before Linux 2.6.25, the ceiling for
+.B RLIMIT_NOFILE
+was hard-coded to the same value.)
+.TP
+.IR /proc/sys/fs/overflowgid " and " /proc/sys/fs/overflowuid
+These files
+allow you to change the value of the fixed UID and GID.
+The default is 65534.
+Some filesystems support only 16-bit UIDs and GIDs, although in Linux
+UIDs and GIDs are 32 bits.
+When one of these filesystems is mounted
+with writes enabled, any UID or GID that would exceed 65535 is translated
+to the overflow value before being written to disk.
+.TP
+.IR /proc/sys/fs/pipe\-max\-size " (since Linux 2.6.35)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/pipe\-user\-pages\-hard " (since Linux 4.5)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/pipe\-user\-pages\-soft " (since Linux 4.5)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/protected_fifos " (since Linux 4.19)"
+The value in this file is/can be set to one of the following:
+.RS
+.TP 4
+0
+Writing to FIFOs is unrestricted.
+.TP
+1
+Don't allow
+.B O_CREAT
+.BR open (2)
+on FIFOs that the caller doesn't own in world-writable sticky directories,
+unless the FIFO is owned by the owner of the directory.
+.TP
+2
+As for the value 1,
+but the restriction also applies to group-writable sticky directories.
+.RE
+.IP
+The intent of the above protections is to avoid unintentional writes to an
+attacker-controlled FIFO when a program expected to create a regular file.
+.TP
+.IR /proc/sys/fs/protected_hardlinks " (since Linux 3.6)"
+.\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7
+When the value in this file is 0,
+no restrictions are placed on the creation of hard links
+(i.e., this is the historical behavior before Linux 3.6).
+When the value in this file is 1,
+a hard link can be created to a target file
+only if one of the following conditions is true:
+.RS
+.IP \[bu] 3
+The calling process has the
+.B CAP_FOWNER
+capability in its user namespace
+and the file UID has a mapping in the namespace.
+.IP \[bu]
+The filesystem UID of the process creating the link matches
+the owner (UID) of the target file
+(as described in
+.BR credentials (7),
+a process's filesystem UID is normally the same as its effective UID).
+.IP \[bu]
+All of the following conditions are true:
+.RS 4
+.IP \[bu] 3
+the target is a regular file;
+.IP \[bu]
+the target file does not have its set-user-ID mode bit enabled;
+.IP \[bu]
+the target file does not have both its set-group-ID and
+group-executable mode bits enabled; and
+.IP \[bu]
+the caller has permission to read and write the target file
+(either via the file's permissions mask or because it has
+suitable capabilities).
+.RE
+.RE
+.IP
+The default value in this file is 0.
+Setting the value to 1
+prevents a longstanding class of security issues caused by
+hard-link-based time-of-check, time-of-use races,
+most commonly seen in world-writable directories such as
+.IR /tmp .
+The common method of exploiting this flaw
+is to cross privilege boundaries when following a given hard link
+(i.e., a root process follows a hard link created by another user).
+Additionally, on systems without separated partitions,
+this stops unauthorized users from "pinning" vulnerable set-user-ID and
+set-group-ID files against being upgraded by
+the administrator, or linking to special files.
+.TP
+.IR /proc/sys/fs/protected_regular " (since Linux 4.19)"
+The value in this file is/can be set to one of the following:
+.RS
+.TP 4
+0
+Writing to regular files is unrestricted.
+.TP
+1
+Don't allow
+.B O_CREAT
+.BR open (2)
+on regular files that the caller doesn't own in
+world-writable sticky directories,
+unless the regular file is owned by the owner of the directory.
+.TP
+2
+As for the value 1,
+but the restriction also applies to group-writable sticky directories.
+.RE
+.IP
+The intent of the above protections is similar to
+.IR protected_fifos ,
+but allows an application to
+avoid writes to an attacker-controlled regular file,
+where the application expected to create one.
+.TP
+.IR /proc/sys/fs/protected_symlinks " (since Linux 3.6)"
+.\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7
+When the value in this file is 0,
+no restrictions are placed on following symbolic links
+(i.e., this is the historical behavior before Linux 3.6).
+When the value in this file is 1, symbolic links are followed only
+in the following circumstances:
+.RS
+.IP \[bu] 3
+the filesystem UID of the process following the link matches
+the owner (UID) of the symbolic link
+(as described in
+.BR credentials (7),
+a process's filesystem UID is normally the same as its effective UID);
+.IP \[bu]
+the link is not in a sticky world-writable directory; or
+.IP \[bu]
+the symbolic link and its parent directory have the same owner (UID)
+.RE
+.IP
+A system call that fails to follow a symbolic link
+because of the above restrictions returns the error
+.B EACCES
+in
+.IR errno .
+.IP
+The default value in this file is 0.
+Setting the value to 1 avoids a longstanding class of security issues
+based on time-of-check, time-of-use races when accessing symbolic links.
+.TP
+.IR /proc/sys/fs/suid_dumpable " (since Linux 2.6.13)"
+.\" The following is based on text from Documentation/sysctl/kernel.txt
+The value in this file is assigned to a process's "dumpable" flag
+in the circumstances described in
+.BR prctl (2).
+In effect,
+the value in this file determines whether core dump files are
+produced for set-user-ID or otherwise protected/tainted binaries.
+The "dumpable" setting also affects the ownership of files in a process's
+.IR /proc/ pid
+directory, as described above.
+.IP
+Three different integer values can be specified:
+.RS
+.TP
+\fI0\ (default)\fP
+.\" In kernel source: SUID_DUMP_DISABLE
+This provides the traditional (pre-Linux 2.6.13) behavior.
+A core dump will not be produced for a process which has
+changed credentials (by calling
+.BR seteuid (2),
+.BR setgid (2),
+or similar, or by executing a set-user-ID or set-group-ID program)
+or whose binary does not have read permission enabled.
+.TP
+\fI1\ ("debug")\fP
+.\" In kernel source: SUID_DUMP_USER
+All processes dump core when possible.
+(Reasons why a process might nevertheless not dump core are described in
+.BR core (5).)
+The core dump is owned by the filesystem user ID of the dumping process
+and no security is applied.
+This is intended for system debugging situations only:
+this mode is insecure because it allows unprivileged users to
+examine the memory contents of privileged processes.
+.TP
+\fI2\ ("suidsafe")\fP
+.\" In kernel source: SUID_DUMP_ROOT
+Any binary which normally would not be dumped (see "0" above)
+is dumped readable by root only.
+This allows the user to remove the core dump file but not to read it.
+For security reasons core dumps in this mode will not overwrite one
+another or other files.
+This mode is appropriate when administrators are
+attempting to debug problems in a normal environment.
+.IP
+Additionally, since Linux 3.6,
+.\" 9520628e8ceb69fa9a4aee6b57f22675d9e1b709
+.I /proc/sys/kernel/core_pattern
+must either be an absolute pathname
+or a pipe command, as detailed in
+.BR core (5).
+Warnings will be written to the kernel log if
+.I core_pattern
+does not follow these rules, and no core dump will be produced.
+.\" 54b501992dd2a839e94e76aa392c392b55080ce8
+.RE
+.IP
+For details of the effect of a process's "dumpable" setting
+on ptrace access mode checking, see
+.BR ptrace (2).
+.TP
+.I /proc/sys/fs/super\-max
+This file
+controls the maximum number of superblocks, and
+thus the maximum number of mounted filesystems the kernel
+can have.
+You need increase only
+.I super\-max
+if you need to mount more filesystems than the current value in
+.I super\-max
+allows you to.
+.TP
+.I /proc/sys/fs/super\-nr
+This file
+contains the number of filesystems currently mounted.
+.TP
+.I /proc/sys/kernel/
+This directory contains files controlling a range of kernel parameters,
+as described below.
+.TP
+.I /proc/sys/kernel/acct
+This file
+contains three numbers:
+.IR highwater ,
+.IR lowwater ,
+and
+.IR frequency .
+If BSD-style process accounting is enabled, these values control
+its behavior.
+If free space on filesystem where the log lives goes below
+.I lowwater
+percent, accounting suspends.
+If free space gets above
+.I highwater
+percent, accounting resumes.
+.I frequency
+determines
+how often the kernel checks the amount of free space (value is in
+seconds).
+Default values are 4, 2, and 30.
+That is, suspend accounting if 2% or less space is free; resume it
+if 4% or more space is free; consider information about amount of free space
+valid for 30 seconds.
+.TP
+.IR /proc/sys/kernel/auto_msgmni " (Linux 2.6.27 to Linux 3.18)"
+.\" commit 9eefe520c814f6f62c5d36a2ddcd3fb99dfdb30e (introduces feature)
+.\" commit 0050ee059f7fc86b1df2527aaa14ed5dc72f9973 (rendered redundant)
+From Linux 2.6.27 to Linux 3.18,
+this file was used to control recomputing of the value in
+.I /proc/sys/kernel/msgmni
+upon the addition or removal of memory or upon IPC namespace creation/removal.
+Echoing "1" into this file enabled
+.I msgmni
+automatic recomputing (and triggered a recomputation of
+.I msgmni
+based on the current amount of available memory and number of IPC namespaces).
+Echoing "0" disabled automatic recomputing.
+(Automatic recomputing was also disabled if a value was explicitly assigned to
+.IR /proc/sys/kernel/msgmni .)
+The default value in
+.I auto_msgmni
+was 1.
+.IP
+Since Linux 3.19, the content of this file has no effect (because
+.I msgmni
+.\" FIXME Must document the 3.19 'msgmni' changes.
+defaults to near the maximum value possible),
+and reads from this file always return the value "0".
+.TP
+.IR /proc/sys/kernel/cap_last_cap " (since Linux 3.2)"
+See
+.BR capabilities (7).
+.TP
+.IR /proc/sys/kernel/cap\-bound " (from Linux 2.2 to Linux 2.6.24)"
+This file holds the value of the kernel
+.I "capability bounding set"
+(expressed as a signed decimal number).
+This set is ANDed against the capabilities permitted to a process
+during
+.BR execve (2).
+Starting with Linux 2.6.25,
+the system-wide capability bounding set disappeared,
+and was replaced by a per-thread bounding set; see
+.BR capabilities (7).
+.TP
+.I /proc/sys/kernel/core_pattern
+See
+.BR core (5).
+.TP
+.I /proc/sys/kernel/core_pipe_limit
+See
+.BR core (5).
+.TP
+.I /proc/sys/kernel/core_uses_pid
+See
+.BR core (5).
+.TP
+.I /proc/sys/kernel/ctrl\-alt\-del
+This file
+controls the handling of Ctrl-Alt-Del from the keyboard.
+When the value in this file is 0, Ctrl-Alt-Del is trapped and
+sent to the
+.BR init (1)
+program to handle a graceful restart.
+When the value is greater than zero, Linux's reaction to a Vulcan
+Nerve Pinch (tm) will be an immediate reboot, without even
+syncing its dirty buffers.
+Note: when a program (like dosemu) has the keyboard in "raw"
+mode, the Ctrl-Alt-Del is intercepted by the program before it
+ever reaches the kernel tty layer, and it's up to the program
+to decide what to do with it.
+.TP
+.IR /proc/sys/kernel/dmesg_restrict " (since Linux 2.6.37)"
+The value in this file determines who can see kernel syslog contents.
+A value of 0 in this file imposes no restrictions.
+If the value is 1, only privileged users can read the kernel syslog.
+(See
+.BR syslog (2)
+for more details.)
+Since Linux 3.4,
+.\" commit 620f6e8e855d6d447688a5f67a4e176944a084e8
+only users with the
+.B CAP_SYS_ADMIN
+capability may change the value in this file.
+.TP
+.IR /proc/sys/kernel/domainname " and " /proc/sys/kernel/hostname
+can be used to set the NIS/YP domainname and the
+hostname of your box in exactly the same way as the commands
+.BR domainname (1)
+and
+.BR hostname (1),
+that is:
+.IP
+.in +4n
+.EX
+.RB "#" " echo \[aq]darkstar\[aq] > /proc/sys/kernel/hostname"
+.RB "#" " echo \[aq]mydomain\[aq] > /proc/sys/kernel/domainname"
+.EE
+.in
+.IP
+has the same effect as
+.IP
+.in +4n
+.EX
+.RB "#" " hostname \[aq]darkstar\[aq]"
+.RB "#" " domainname \[aq]mydomain\[aq]"
+.EE
+.in
+.IP
+Note, however, that the classic darkstar.frop.org has the
+hostname "darkstar" and DNS (Internet Domain Name Server)
+domainname "frop.org", not to be confused with the NIS (Network
+Information Service) or YP (Yellow Pages) domainname.
+These two
+domain names are in general different.
+For a detailed discussion
+see the
+.BR hostname (1)
+man page.
+.TP
+.I /proc/sys/kernel/hotplug
+This file
+contains the pathname for the hotplug policy agent.
+The default value in this file is
+.IR /sbin/hotplug .
+.TP
+.\" Removed in commit 87f504e5c78b910b0c1d6ffb89bc95e492322c84 (tglx/history.git)
+.IR /proc/sys/kernel/htab\-reclaim " (before Linux 2.4.9.2)"
+(PowerPC only) If this file is set to a nonzero value,
+the PowerPC htab
+.\" removed in commit 1b483a6a7b2998e9c98ad985d7494b9b725bd228, before Linux 2.6.28
+(see kernel file
+.IR Documentation/powerpc/ppc_htab.txt )
+is pruned
+each time the system hits the idle loop.
+.TP
+.I /proc/sys/kernel/keys/
+This directory contains various files that define parameters and limits
+for the key-management facility.
+These files are described in
+.BR keyrings (7).
+.TP
+.IR /proc/sys/kernel/kptr_restrict " (since Linux 2.6.38)"
+.\" 455cd5ab305c90ffc422dd2e0fb634730942b257
+The value in this file determines whether kernel addresses are exposed via
+.I /proc
+files and other interfaces.
+A value of 0 in this file imposes no restrictions.
+If the value is 1, kernel pointers printed using the
+.I %pK
+format specifier will be replaced with zeros unless the user has the
+.B CAP_SYSLOG
+capability.
+If the value is 2, kernel pointers printed using the
+.I %pK
+format specifier will be replaced with zeros regardless
+of the user's capabilities.
+The initial default value for this file was 1,
+but the default was changed
+.\" commit 411f05f123cbd7f8aa1edcae86970755a6e2a9d9
+to 0 in Linux 2.6.39.
+Since Linux 3.4,
+.\" commit 620f6e8e855d6d447688a5f67a4e176944a084e8
+only users with the
+.B CAP_SYS_ADMIN
+capability can change the value in this file.
+.TP
+.I /proc/sys/kernel/l2cr
+(PowerPC only) This file
+contains a flag that controls the L2 cache of G3 processor
+boards.
+If 0, the cache is disabled.
+Enabled if nonzero.
+.TP
+.I /proc/sys/kernel/modprobe
+This file contains the pathname for the kernel module loader.
+The default value is
+.IR /sbin/modprobe .
+The file is present only if the kernel is built with the
+.B CONFIG_MODULES
+.RB ( CONFIG_KMOD
+in Linux 2.6.26 and earlier)
+option enabled.
+It is described by the Linux kernel source file
+.I Documentation/kmod.txt
+(present only in Linux 2.4 and earlier).
+.TP
+.IR /proc/sys/kernel/modules_disabled " (since Linux 2.6.31)"
+.\" 3d43321b7015387cfebbe26436d0e9d299162ea1
+.\" From Documentation/sysctl/kernel.txt
+A toggle value indicating if modules are allowed to be loaded
+in an otherwise modular kernel.
+This toggle defaults to off (0), but can be set true (1).
+Once true, modules can be neither loaded nor unloaded,
+and the toggle cannot be set back to false.
+The file is present only if the kernel is built with the
+.B CONFIG_MODULES
+option enabled.
+.TP
+.IR /proc/sys/kernel/msgmax " (since Linux 2.2)"
+This file defines
+a system-wide limit specifying the maximum number of bytes in
+a single message written on a System V message queue.
+.TP
+.IR /proc/sys/kernel/msgmni " (since Linux 2.4)"
+This file defines the system-wide limit on the number of
+message queue identifiers.
+See also
+.IR /proc/sys/kernel/auto_msgmni .
+.TP
+.IR /proc/sys/kernel/msgmnb " (since Linux 2.2)"
+This file defines a system-wide parameter used to initialize the
+.I msg_qbytes
+setting for subsequently created message queues.
+The
+.I msg_qbytes
+setting specifies the maximum number of bytes that may be written to the
+message queue.
+.TP
+.IR /proc/sys/kernel/ngroups_max " (since Linux 2.6.4)"
+This is a read-only file that displays the upper limit on the
+number of a process's group memberships.
+.TP
+.IR /proc/sys/kernel/ns_last_pid " (since Linux 3.3)"
+See
+.BR pid_namespaces (7).
+.TP
+.IR /proc/sys/kernel/ostype " and " /proc/sys/kernel/osrelease
+These files
+give substrings of
+.IR /proc/version .
+.TP
+.IR /proc/sys/kernel/overflowgid " and " /proc/sys/kernel/overflowuid
+These files duplicate the files
+.I /proc/sys/fs/overflowgid
+and
+.IR /proc/sys/fs/overflowuid .
+.TP
+.I /proc/sys/kernel/panic
+This file gives read/write access to the kernel variable
+.IR panic_timeout .
+If this is zero, the kernel will loop on a panic; if nonzero,
+it indicates that the kernel should autoreboot after this number
+of seconds.
+When you use the
+software watchdog device driver, the recommended setting is 60.
+.TP
+.IR /proc/sys/kernel/panic_on_oops " (since Linux 2.5.68)"
+This file controls the kernel's behavior when an oops
+or BUG is encountered.
+If this file contains 0, then the system
+tries to continue operation.
+If it contains 1, then the system
+delays a few seconds (to give klogd time to record the oops output)
+and then panics.
+If the
+.I /proc/sys/kernel/panic
+file is also nonzero, then the machine will be rebooted.
+.TP
+.IR /proc/sys/kernel/pid_max " (since Linux 2.5.34)"
+This file specifies the value at which PIDs wrap around
+(i.e., the value in this file is one greater than the maximum PID).
+PIDs greater than this value are not allocated;
+thus, the value in this file also acts as a system-wide limit
+on the total number of processes and threads.
+The default value for this file, 32768,
+results in the same range of PIDs as on earlier kernels.
+On 32-bit platforms, 32768 is the maximum value for
+.IR pid_max .
+On 64-bit systems,
+.I pid_max
+can be set to any value up to 2\[ha]22
+.RB ( PID_MAX_LIMIT ,
+approximately 4 million).
+.\" Prior to Linux 2.6.10, pid_max could also be raised above 32768 on 32-bit
+.\" platforms, but this broke /proc/[pid]
+.\" See http://marc.theaimsgroup.com/?l=linux-kernel&m=109513010926152&w=2
+.TP
+.IR /proc/sys/kernel/powersave\-nap " (PowerPC only)"
+This file contains a flag.
+If set, Linux-PPC will use the "nap" mode of
+powersaving,
+otherwise the "doze" mode will be used.
+.TP
+.I /proc/sys/kernel/printk
+See
+.BR syslog (2).
+.TP
+.IR /proc/sys/kernel/pty " (since Linux 2.6.4)"
+This directory contains two files relating to the number of UNIX 98
+pseudoterminals (see
+.BR pts (4))
+on the system.
+.TP
+.I /proc/sys/kernel/pty/max
+This file defines the maximum number of pseudoterminals.
+.\" FIXME Document /proc/sys/kernel/pty/reserve
+.\"     New in Linux 3.3
+.\"     commit e9aba5158a80098447ff207a452a3418ae7ee386
+.TP
+.I /proc/sys/kernel/pty/nr
+This read-only file
+indicates how many pseudoterminals are currently in use.
+.TP
+.I /proc/sys/kernel/random/
+This directory
+contains various parameters controlling the operation of the file
+.IR /dev/random .
+See
+.BR random (4)
+for further information.
+.TP
+.IR /proc/sys/kernel/random/uuid " (since Linux 2.4)"
+Each read from this read-only file returns a randomly generated 128-bit UUID,
+as a string in the standard UUID format.
+.TP
+.IR /proc/sys/kernel/randomize_va_space " (since Linux 2.6.12)"
+.\" Some further details can be found in Documentation/sysctl/kernel.txt
+Select the address space layout randomization (ASLR) policy for the system
+(on architectures that support ASLR).
+Three values are supported for this file:
+.RS
+.TP
+.B 0
+Turn ASLR off.
+This is the default for architectures that don't support ASLR,
+and when the kernel is booted with the
+.I norandmaps
+parameter.
+.TP
+.B 1
+Make the addresses of
+.BR mmap (2)
+allocations, the stack, and the VDSO page randomized.
+Among other things, this means that shared libraries will be
+loaded at randomized addresses.
+The text segment of PIE-linked binaries will also be loaded
+at a randomized address.
+This value is the default if the kernel was configured with
+.BR CONFIG_COMPAT_BRK .
+.TP
+.B 2
+(Since Linux 2.6.25)
+.\" commit c1d171a002942ea2d93b4fbd0c9583c56fce0772
+Also support heap randomization.
+This value is the default if the kernel was not configured with
+.BR CONFIG_COMPAT_BRK .
+.RE
+.TP
+.I /proc/sys/kernel/real\-root\-dev
+This file is documented in the Linux kernel source file
+.I Documentation/admin\-guide/initrd.rst
+.\" commit 9d85025b0418163fae079c9ba8f8445212de8568
+(or
+.I Documentation/initrd.txt
+before Linux 4.10).
+.TP
+.IR /proc/sys/kernel/reboot\-cmd " (Sparc only)"
+This file seems to be a way to give an argument to the SPARC
+ROM/Flash boot loader.
+Maybe to tell it what to do after
+rebooting?
+.TP
+.I /proc/sys/kernel/rtsig\-max
+(Up to and including Linux 2.6.7; see
+.BR setrlimit (2))
+This file can be used to tune the maximum number
+of POSIX real-time (queued) signals that can be outstanding
+in the system.
+.TP
+.I /proc/sys/kernel/rtsig\-nr
+(Up to and including Linux 2.6.7.)
+This file shows the number of POSIX real-time signals currently queued.
+.TP
+.IR /proc/ pid /sched_autogroup_enabled " (since Linux 2.6.38)"
+.\" commit 5091faa449ee0b7d73bc296a93bca9540fc51d0a
+See
+.BR sched (7).
+.TP
+.IR /proc/sys/kernel/sched_child_runs_first " (since Linux 2.6.23)"
+If this file contains the value zero, then, after a
+.BR fork (2),
+the parent is first scheduled on the CPU.
+If the file contains a nonzero value,
+then the child is scheduled first on the CPU.
+(Of course, on a multiprocessor system,
+the parent and the child might both immediately be scheduled on a CPU.)
+.TP
+.IR /proc/sys/kernel/sched_rr_timeslice_ms " (since Linux 3.9)"
+See
+.BR sched_rr_get_interval (2).
+.TP
+.IR /proc/sys/kernel/sched_rt_period_us " (since Linux 2.6.25)"
+See
+.BR sched (7).
+.TP
+.IR /proc/sys/kernel/sched_rt_runtime_us " (since Linux 2.6.25)"
+See
+.BR sched (7).
+.TP
+.IR /proc/sys/kernel/seccomp/ " (since Linux 4.14)"
+.\" commit 8e5f1ad116df6b0de65eac458d5e7c318d1c05af
+This directory provides additional seccomp information and
+configuration.
+See
+.BR seccomp (2)
+for further details.
+.TP
+.IR /proc/sys/kernel/sem " (since Linux 2.4)"
+This file contains 4 numbers defining limits for System V IPC semaphores.
+These fields are, in order:
+.RS
+.TP
+SEMMSL
+The maximum semaphores per semaphore set.
+.TP
+SEMMNS
+A system-wide limit on the number of semaphores in all semaphore sets.
+.TP
+SEMOPM
+The maximum number of operations that may be specified in a
+.BR semop (2)
+call.
+.TP
+SEMMNI
+A system-wide limit on the maximum number of semaphore identifiers.
+.RE
+.TP
+.I /proc/sys/kernel/sg\-big\-buff
+This file
+shows the size of the generic SCSI device (sg) buffer.
+You can't tune it just yet, but you could change it at
+compile time by editing
+.I include/scsi/sg.h
+and changing
+the value of
+.BR SG_BIG_BUFF .
+However, there shouldn't be any reason to change this value.
+.TP
+.IR /proc/sys/kernel/shm_rmid_forced " (since Linux 3.1)"
+.\" commit b34a6b1da371ed8af1221459a18c67970f7e3d53
+.\" See also Documentation/sysctl/kernel.txt
+If this file is set to 1, all System V shared memory segments will
+be marked for destruction as soon as the number of attached processes
+falls to zero;
+in other words, it is no longer possible to create shared memory segments
+that exist independently of any attached process.
+.IP
+The effect is as though a
+.BR shmctl (2)
+.B IPC_RMID
+is performed on all existing segments as well as all segments
+created in the future (until this file is reset to 0).
+Note that existing segments that are attached to no process will be
+immediately destroyed when this file is set to 1.
+Setting this option will also destroy segments that were created,
+but never attached,
+upon termination of the process that created the segment with
+.BR shmget (2).
+.IP
+Setting this file to 1 provides a way of ensuring that
+all System V shared memory segments are counted against the
+resource usage and resource limits (see the description of
+.B RLIMIT_AS
+in
+.BR getrlimit (2))
+of at least one process.
+.IP
+Because setting this file to 1 produces behavior that is nonstandard
+and could also break existing applications,
+the default value in this file is 0.
+Set this file to 1 only if you have a good understanding
+of the semantics of the applications using
+System V shared memory on your system.
+.TP
+.IR /proc/sys/kernel/shmall " (since Linux 2.2)"
+This file
+contains the system-wide limit on the total number of pages of
+System V shared memory.
+.TP
+.IR /proc/sys/kernel/shmmax " (since Linux 2.2)"
+This file
+can be used to query and set the run-time limit
+on the maximum (System V IPC) shared memory segment size that can be
+created.
+Shared memory segments up to 1 GB are now supported in the
+kernel.
+This value defaults to
+.BR SHMMAX .
+.TP
+.IR /proc/sys/kernel/shmmni " (since Linux 2.4)"
+This file
+specifies the system-wide maximum number of System V shared memory
+segments that can be created.
+.TP
+.IR /proc/sys/kernel/sysctl_writes_strict " (since Linux 3.16)"
+.\" commit f88083005ab319abba5d0b2e4e997558245493c8
+.\" commit 2ca9bb456ada8bcbdc8f77f8fc78207653bbaa92
+.\" commit f4aacea2f5d1a5f7e3154e967d70cf3f711bcd61
+.\" commit 24fe831c17ab8149413874f2fd4e5c8a41fcd294
+The value in this file determines how the file offset affects
+the behavior of updating entries in files under
+.IR /proc/sys .
+The file has three possible values:
+.RS
+.TP 4
+\-1
+This provides legacy handling, with no printk warnings.
+Each
+.BR write (2)
+must fully contain the value to be written,
+and multiple writes on the same file descriptor
+will overwrite the entire value, regardless of the file position.
+.TP
+0
+(default) This provides the same behavior as for \-1,
+but printk warnings are written for processes that
+perform writes when the file offset is not 0.
+.TP
+1
+Respect the file offset when writing strings into
+.I /proc/sys
+files.
+Multiple writes will
+.I append
+to the value buffer.
+Anything written beyond the maximum length
+of the value buffer will be ignored.
+Writes to numeric
+.I /proc/sys
+entries must always be at file offset 0 and the value must be
+fully contained in the buffer provided to
+.BR write (2).
+.\" FIXME .
+.\"     With /proc/sys/kernel/sysctl_writes_strict==1, writes at an
+.\"     offset other than 0 do not generate an error. Instead, the
+.\"     write() succeeds, but the file is left unmodified.
+.\"     This is surprising. The behavior may change in the future.
+.\"     See thread.gmane.org/gmane.linux.man/9197
+.\"		From: Michael Kerrisk (man-pages <mtk.manpages@...>
+.\"		Subject: sysctl_writes_strict documentation + an oddity?
+.\"		Newsgroups: gmane.linux.man, gmane.linux.kernel
+.\"		Date: 2015-05-09 08:54:11 GMT
+.RE
+.TP
+.I /proc/sys/kernel/sysrq
+This file controls the functions allowed to be invoked by the SysRq key.
+By default,
+the file contains 1 meaning that every possible SysRq request is allowed
+(in older kernel versions, SysRq was disabled by default,
+and you were required to specifically enable it at run-time,
+but this is not the case any more).
+Possible values in this file are:
+.RS
+.TP 5
+0
+Disable sysrq completely
+.TP
+1
+Enable all functions of sysrq
+.TP
+> 1
+Bit mask of allowed sysrq functions, as follows:
+.PD 0
+.RS
+.TP 5
+\ \ 2
+Enable control of console logging level
+.TP
+\ \ 4
+Enable control of keyboard (SAK, unraw)
+.TP
+\ \ 8
+Enable debugging dumps of processes etc.
+.TP
+\ 16
+Enable sync command
+.TP
+\ 32
+Enable remount read-only
+.TP
+\ 64
+Enable signaling of processes (term, kill, oom-kill)
+.TP
+128
+Allow reboot/poweroff
+.TP
+256
+Allow nicing of all real-time tasks
+.RE
+.PD
+.RE
+.IP
+This file is present only if the
+.B CONFIG_MAGIC_SYSRQ
+kernel configuration option is enabled.
+For further details see the Linux kernel source file
+.I Documentation/admin\-guide/sysrq.rst
+.\" commit 9d85025b0418163fae079c9ba8f8445212de8568
+(or
+.I Documentation/sysrq.txt
+before Linux 4.10).
+.TP
+.I /proc/sys/kernel/version
+This file contains a string such as:
+.IP
+.in +4n
+.EX
+#5 Wed Feb 25 21:49:24 MET 1998
+.EE
+.in
+.IP
+The "#5" means that
+this is the fifth kernel built from this source base and the
+date following it indicates the time the kernel was built.
+.TP
+.IR /proc/sys/kernel/threads\-max " (since Linux 2.3.11)"
+.\" The following is based on Documentation/sysctl/kernel.txt
+This file specifies the system-wide limit on the number of
+threads (tasks) that can be created on the system.
+.IP
+Since Linux 4.1,
+.\" commit 230633d109e35b0a24277498e773edeb79b4a331
+the value that can be written to
+.I threads\-max
+is bounded.
+The minimum value that can be written is 20.
+The maximum value that can be written is given by the
+constant
+.B FUTEX_TID_MASK
+(0x3fffffff).
+If a value outside of this range is written to
+.IR threads\-max ,
+the error
+.B EINVAL
+occurs.
+.IP
+The value written is checked against the available RAM pages.
+If the thread structures would occupy too much (more than 1/8th)
+of the available RAM pages,
+.I threads\-max
+is reduced accordingly.
+.TP
+.IR /proc/sys/kernel/yama/ptrace_scope " (since Linux 3.5)"
+See
+.BR ptrace (2).
+.TP
+.IR /proc/sys/kernel/zero\-paged " (PowerPC only)"
+This file
+contains a flag.
+When enabled (nonzero), Linux-PPC will pre-zero pages in
+the idle loop, possibly speeding up get_free_pages.
+.TP
+.I /proc/sys/net
+This directory contains networking stuff.
+Explanations for some of the files under this directory can be found in
+.BR tcp (7)
+and
+.BR ip (7).
+.TP
+.I /proc/sys/net/core/bpf_jit_enable
+See
+.BR bpf (2).
+.TP
+.I /proc/sys/net/core/somaxconn
+This file defines a ceiling value for the
+.I backlog
+argument of
+.BR listen (2);
+see the
+.BR listen (2)
+manual page for details.
+.TP
+.I /proc/sys/proc
+This directory may be empty.
+.TP
+.I /proc/sys/sunrpc
+This directory supports Sun remote procedure call for network filesystem
+(NFS).
+On some systems, it is not present.
+.TP
+.IR /proc/sys/user " (since Linux 4.9)"
+See
+.BR namespaces (7).
+.TP
+.I /proc/sys/vm/
+This directory contains files for memory management tuning, buffer, and
+cache management.
+.TP
+.IR /proc/sys/vm/admin_reserve_kbytes " (since Linux 3.10)"
+.\" commit 4eeab4f5580d11bffedc697684b91b0bca0d5009
+This file defines the amount of free memory (in KiB) on the system that
+should be reserved for users with the capability
+.BR CAP_SYS_ADMIN .
+.IP
+The default value in this file is the minimum of [3% of free pages, 8MiB]
+expressed as KiB.
+The default is intended to provide enough for the superuser
+to log in and kill a process, if necessary,
+under the default overcommit 'guess' mode (i.e., 0 in
+.IR /proc/sys/vm/overcommit_memory ).
+.IP
+Systems running in "overcommit never" mode (i.e., 2 in
+.IR /proc/sys/vm/overcommit_memory )
+should increase the value in this file to account
+for the full virtual memory size of the programs used to recover (e.g.,
+.BR login (1)
+.BR ssh (1),
+and
+.BR top (1))
+Otherwise, the superuser may not be able to log in to recover the system.
+For example, on x86-64 a suitable value is 131072 (128MiB reserved).
+.IP
+Changing the value in this file takes effect whenever
+an application requests memory.
+.TP
+.IR /proc/sys/vm/compact_memory " (since Linux 2.6.35)"
+When 1 is written to this file, all zones are compacted such that free
+memory is available in contiguous blocks where possible.
+The effect of this action can be seen by examining
+.IR /proc/buddyinfo .
+.IP
+Present only if the kernel was configured with
+.BR CONFIG_COMPACTION .
+.TP
+.IR /proc/sys/vm/drop_caches " (since Linux 2.6.16)"
+Writing to this file causes the kernel to drop clean caches, dentries, and
+inodes from memory, causing that memory to become free.
+This can be useful for memory management testing and
+performing reproducible filesystem benchmarks.
+Because writing to this file causes the benefits of caching to be lost,
+it can degrade overall system performance.
+.IP
+To free pagecache, use:
+.IP
+.in +4n
+.EX
+echo 1 > /proc/sys/vm/drop_caches
+.EE
+.in
+.IP
+To free dentries and inodes, use:
+.IP
+.in +4n
+.EX
+echo 2 > /proc/sys/vm/drop_caches
+.EE
+.in
+.IP
+To free pagecache, dentries, and inodes, use:
+.IP
+.in +4n
+.EX
+echo 3 > /proc/sys/vm/drop_caches
+.EE
+.in
+.IP
+Because writing to this file is a nondestructive operation and dirty objects
+are not freeable, the
+user should run
+.BR sync (1)
+first.
+.TP
+.IR  /proc/sys/vm/sysctl_hugetlb_shm_group " (since Linux 2.6.7)"
+This writable file contains a group ID that is allowed
+to allocate memory using huge pages.
+If a process has a filesystem group ID or any supplementary group ID that
+matches this group ID,
+then it can make huge-page allocations without holding the
+.B CAP_IPC_LOCK
+capability; see
+.BR memfd_create (2),
+.BR mmap (2),
+and
+.BR shmget (2).
+.TP
+.IR /proc/sys/vm/legacy_va_layout " (since Linux 2.6.9)"
+.\" The following is from Documentation/filesystems/proc.txt
+If nonzero, this disables the new 32-bit memory-mapping layout;
+the kernel will use the legacy (2.4) layout for all processes.
+.TP
+.IR /proc/sys/vm/memory_failure_early_kill " (since Linux 2.6.32)"
+.\" The following is based on the text in Documentation/sysctl/vm.txt
+Control how to kill processes when an uncorrected memory error
+(typically a 2-bit error in a memory module)
+that cannot be handled by the kernel
+is detected in the background by hardware.
+In some cases (like the page still having a valid copy on disk),
+the kernel will handle the failure
+transparently without affecting any applications.
+But if there is no other up-to-date copy of the data,
+it will kill processes to prevent any data corruptions from propagating.
+.IP
+The file has one of the following values:
+.RS
+.TP
+.B 1
+Kill all processes that have the corrupted-and-not-reloadable page mapped
+as soon as the corruption is detected.
+Note that this is not supported for a few types of pages,
+such as kernel internally
+allocated data or the swap cache, but works for the majority of user pages.
+.TP
+.B 0
+Unmap the corrupted page from all processes and kill a process
+only if it tries to access the page.
+.RE
+.IP
+The kill is performed using a
+.B SIGBUS
+signal with
+.I si_code
+set to
+.BR BUS_MCEERR_AO .
+Processes can handle this if they want to; see
+.BR sigaction (2)
+for more details.
+.IP
+This feature is active only on architectures/platforms with advanced machine
+check handling and depends on the hardware capabilities.
+.IP
+Applications can override the
+.I memory_failure_early_kill
+setting individually with the
+.BR prctl (2)
+.B PR_MCE_KILL
+operation.
+.IP
+Present only if the kernel was configured with
+.BR CONFIG_MEMORY_FAILURE .
+.TP
+.IR /proc/sys/vm/memory_failure_recovery " (since Linux 2.6.32)"
+.\" The following is based on the text in Documentation/sysctl/vm.txt
+Enable memory failure recovery (when supported by the platform).
+.RS
+.TP
+.B 1
+Attempt recovery.
+.TP
+.B 0
+Always panic on a memory failure.
+.RE
+.IP
+Present only if the kernel was configured with
+.BR CONFIG_MEMORY_FAILURE .
+.TP
+.IR /proc/sys/vm/oom_dump_tasks " (since Linux 2.6.25)"
+.\" The following is from Documentation/sysctl/vm.txt
+Enables a system-wide task dump (excluding kernel threads) to be
+produced when the kernel performs an OOM-killing.
+The dump includes the following information
+for each task (thread, process):
+thread ID, real user ID, thread group ID (process ID),
+virtual memory size, resident set size,
+the CPU that the task is scheduled on,
+oom_adj score (see the description of
+.IR /proc/ pid /oom_adj ),
+and command name.
+This is helpful to determine why the OOM-killer was invoked
+and to identify the rogue task that caused it.
+.IP
+If this contains the value zero, this information is suppressed.
+On very large systems with thousands of tasks,
+it may not be feasible to dump the memory state information for each one.
+Such systems should not be forced to incur a performance penalty in
+OOM situations when the information may not be desired.
+.IP
+If this is set to nonzero, this information is shown whenever the
+OOM-killer actually kills a memory-hogging task.
+.IP
+The default value is 0.
+.TP
+.IR /proc/sys/vm/oom_kill_allocating_task " (since Linux 2.6.24)"
+.\" The following is from Documentation/sysctl/vm.txt
+This enables or disables killing the OOM-triggering task in
+out-of-memory situations.
+.IP
+If this is set to zero, the OOM-killer will scan through the entire
+tasklist and select a task based on heuristics to kill.
+This normally selects a rogue memory-hogging task that
+frees up a large amount of memory when killed.
+.IP
+If this is set to nonzero, the OOM-killer simply kills the task that
+triggered the out-of-memory condition.
+This avoids a possibly expensive tasklist scan.
+.IP
+If
+.I /proc/sys/vm/panic_on_oom
+is nonzero, it takes precedence over whatever value is used in
+.IR /proc/sys/vm/oom_kill_allocating_task .
+.IP
+The default value is 0.
+.TP
+.IR /proc/sys/vm/overcommit_kbytes " (since Linux 3.14)"
+.\" commit 49f0ce5f92321cdcf741e35f385669a421013cb7
+This writable file provides an alternative to
+.I /proc/sys/vm/overcommit_ratio
+for controlling the
+.I CommitLimit
+when
+.I /proc/sys/vm/overcommit_memory
+has the value 2.
+It allows the amount of memory overcommitting to be specified as
+an absolute value (in kB),
+rather than as a percentage, as is done with
+.IR overcommit_ratio .
+This allows for finer-grained control of
+.I CommitLimit
+on systems with extremely large memory sizes.
+.IP
+Only one of
+.I overcommit_kbytes
+or
+.I overcommit_ratio
+can have an effect:
+if
+.I overcommit_kbytes
+has a nonzero value, then it is used to calculate
+.IR CommitLimit ,
+otherwise
+.I overcommit_ratio
+is used.
+Writing a value to either of these files causes the
+value in the other file to be set to zero.
+.TP
+.I /proc/sys/vm/overcommit_memory
+This file contains the kernel virtual memory accounting mode.
+Values are:
+.RS
+.IP
+0: heuristic overcommit (this is the default)
+.br
+1: always overcommit, never check
+.br
+2: always check, never overcommit
+.RE
+.IP
+In mode 0, calls of
+.BR mmap (2)
+with
+.B MAP_NORESERVE
+are not checked, and the default check is very weak,
+leading to the risk of getting a process "OOM-killed".
+.IP
+In mode 1, the kernel pretends there is always enough memory,
+until memory actually runs out.
+One use case for this mode is scientific computing applications
+that employ large sparse arrays.
+Before Linux 2.6.0, any nonzero value implies mode 1.
+.IP
+In mode 2 (available since Linux 2.6), the total virtual address space
+that can be allocated
+.RI ( CommitLimit
+in
+.IR /proc/meminfo )
+is calculated as
+.IP
+.in +4n
+.EX
+CommitLimit = (total_RAM \- total_huge_TLB) *
+	      overcommit_ratio / 100 + total_swap
+.EE
+.in
+.IP
+where:
+.RS
+.IP \[bu] 3
+.I total_RAM
+is the total amount of RAM on the system;
+.IP \[bu]
+.I total_huge_TLB
+is the amount of memory set aside for huge pages;
+.IP \[bu]
+.I overcommit_ratio
+is the value in
+.IR /proc/sys/vm/overcommit_ratio ;
+and
+.IP \[bu]
+.I total_swap
+is the amount of swap space.
+.RE
+.IP
+For example, on a system with 16 GB of physical RAM, 16 GB
+of swap, no space dedicated to huge pages, and an
+.I overcommit_ratio
+of 50, this formula yields a
+.I CommitLimit
+of 24 GB.
+.IP
+Since Linux 3.14, if the value in
+.I /proc/sys/vm/overcommit_kbytes
+is nonzero, then
+.I CommitLimit
+is instead calculated as:
+.IP
+.in +4n
+.EX
+CommitLimit = overcommit_kbytes + total_swap
+.EE
+.in
+.IP
+See also the description of
+.I /proc/sys/vm/admin_reserve_kbytes
+and
+.IR /proc/sys/vm/user_reserve_kbytes .
+.TP
+.IR /proc/sys/vm/overcommit_ratio " (since Linux 2.6.0)"
+This writable file defines a percentage by which memory
+can be overcommitted.
+The default value in the file is 50.
+See the description of
+.IR /proc/sys/vm/overcommit_memory .
+.TP
+.IR /proc/sys/vm/panic_on_oom " (since Linux 2.6.18)"
+.\" The following is adapted from Documentation/sysctl/vm.txt
+This enables or disables a kernel panic in
+an out-of-memory situation.
+.IP
+If this file is set to the value 0,
+the kernel's OOM-killer will kill some rogue process.
+Usually, the OOM-killer is able to kill a rogue process and the
+system will survive.
+.IP
+If this file is set to the value 1,
+then the kernel normally panics when out-of-memory happens.
+However, if a process limits allocations to certain nodes
+using memory policies
+.RB ( mbind (2)
+.BR MPOL_BIND )
+or cpusets
+.RB ( cpuset (7))
+and those nodes reach memory exhaustion status,
+one process may be killed by the OOM-killer.
+No panic occurs in this case:
+because other nodes' memory may be free,
+this means the system as a whole may not have reached
+an out-of-memory situation yet.
+.IP
+If this file is set to the value 2,
+the kernel always panics when an out-of-memory condition occurs.
+.IP
+The default value is 0.
+1 and 2 are for failover of clustering.
+Select either according to your policy of failover.
+.TP
+.I /proc/sys/vm/swappiness
+.\" The following is from Documentation/sysctl/vm.txt
+The value in this file controls how aggressively the kernel will swap
+memory pages.
+Higher values increase aggressiveness, lower values
+decrease aggressiveness.
+The default value is 60.
+.TP
+.IR /proc/sys/vm/user_reserve_kbytes " (since Linux 3.10)"
+.\" commit c9b1d0981fcce3d9976d7b7a56e4e0503bc610dd
+Specifies an amount of memory (in KiB) to reserve for user processes.
+This is intended to prevent a user from starting a single memory hogging
+process, such that they cannot recover (kill the hog).
+The value in this file has an effect only when
+.I /proc/sys/vm/overcommit_memory
+is set to 2 ("overcommit never" mode).
+In this case, the system reserves an amount of memory that is the minimum
+of [3% of current process size,
+.IR user_reserve_kbytes ].
+.IP
+The default value in this file is the minimum of [3% of free pages, 128MiB]
+expressed as KiB.
+.IP
+If the value in this file is set to zero,
+then a user will be allowed to allocate all free memory with a single process
+(minus the amount reserved by
+.IR /proc/sys/vm/admin_reserve_kbytes ).
+Any subsequent attempts to execute a command will result in
+"fork: Cannot allocate memory".
+.IP
+Changing the value in this file takes effect whenever
+an application requests memory.
+.TP
+.IR /proc/sys/vm/unprivileged_userfaultfd " (since Linux 5.2)"
+.\" cefdca0a86be517bc390fc4541e3674b8e7803b0
+This (writable) file exposes a flag that controls whether
+unprivileged processes are allowed to employ
+.BR userfaultfd (2).
+If this file has the value 1, then unprivileged processes may use
+.BR userfaultfd (2).
+If this file has the value 0, then only processes that have the
+.B CAP_SYS_PTRACE
+capability may employ
+.BR userfaultfd (2).
+The default value in this file is 1.
+.SH SEE ALSO
+.BR proc (5)
author	Alejandro Colomar <alx@kernel.org>	2023-08-15 20:35:38 +0200
committer	Alejandro Colomar <alx@kernel.org>	2023-08-15 23:19:17 +0200
commit	bfc1299e7d6794265dc89806f601737d2bb958dc (patch)
tree	160de735507a3f988d2dcc4d05a17e2c8c7d8bfa /man5
parent	30bcb8114ea32e33b99caacc018c68a5893468dc (diff)