summaryrefslogtreecommitdiffstats
path: root/man/man5/proc_sys_fs.5
diff options
context:
space:
mode:
Diffstat (limited to 'man/man5/proc_sys_fs.5')
-rw-r--r--man/man5/proc_sys_fs.5471
1 files changed, 471 insertions, 0 deletions
diff --git a/man/man5/proc_sys_fs.5 b/man/man5/proc_sys_fs.5
new file mode 100644
index 000000000..e745a64fd
--- /dev/null
+++ b/man/man5/proc_sys_fs.5
@@ -0,0 +1,471 @@
+.\" Copyright (C) 1994, 1995, Daniel Quinlan <quinlan@yggdrasil.com>
+.\" Copyright (C) 2002-2008, 2017, Michael Kerrisk <mtk.manpages@gmail.com>
+.\" Copyright (C) , Andries Brouwer <aeb@cwi.nl>
+.\" Copyright (C) 2023, Alejandro Colomar <alx@kernel.org>
+.\"
+.\" SPDX-License-Identifier: GPL-3.0-or-later
+.\"
+.TH proc_sys_fs 5 (date) "Linux man-pages (unreleased)"
+.SH NAME
+/proc/sys/fs/ \- kernel variables related to filesystems
+.SH DESCRIPTION
+.TP
+.I /proc/sys/fs/
+This directory contains the files and subdirectories for kernel variables
+related to filesystems.
+.TP
+.IR /proc/sys/fs/aio\-max\-nr " and " /proc/sys/fs/aio\-nr " (since Linux 2.6.4)"
+.I aio\-nr
+is the running total of the number of events specified by
+.BR io_setup (2)
+calls for all currently active AIO contexts.
+If
+.I aio\-nr
+reaches
+.IR aio\-max\-nr ,
+then
+.BR io_setup (2)
+will fail with the error
+.BR EAGAIN .
+Raising
+.I aio\-max\-nr
+does not result in the preallocation or resizing
+of any kernel data structures.
+.TP
+.I /proc/sys/fs/binfmt_misc
+Documentation for files in this directory can be found
+in the Linux kernel source in the file
+.I Documentation/admin\-guide/binfmt\-misc.rst
+(or in
+.I Documentation/binfmt_misc.txt
+on older kernels).
+.TP
+.IR /proc/sys/fs/dentry\-state " (since Linux 2.2)"
+This file contains information about the status of the
+directory cache (dcache).
+The file contains six numbers,
+.IR nr_dentry ,
+.IR nr_unused ,
+.I age_limit
+(age in seconds),
+.I want_pages
+(pages requested by system) and two dummy values.
+.RS
+.IP \[bu] 3
+.I nr_dentry
+is the number of allocated dentries (dcache entries).
+This field is unused in Linux 2.2.
+.IP \[bu]
+.I nr_unused
+is the number of unused dentries.
+.IP \[bu]
+.I age_limit
+.\" looks like this is unused in Linux 2.2 to Linux 2.6
+is the age in seconds after which dcache entries
+can be reclaimed when memory is short.
+.IP \[bu]
+.I want_pages
+.\" looks like this is unused in Linux 2.2 to Linux 2.6
+is nonzero when the kernel has called shrink_dcache_pages() and the
+dcache isn't pruned yet.
+.RE
+.TP
+.I /proc/sys/fs/dir\-notify\-enable
+This file can be used to disable or enable the
+.I dnotify
+interface described in
+.BR fcntl (2)
+on a system-wide basis.
+A value of 0 in this file disables the interface,
+and a value of 1 enables it.
+.TP
+.I /proc/sys/fs/dquot\-max
+This file shows the maximum number of cached disk quota entries.
+On some (2.4) systems, it is not present.
+If the number of free cached disk quota entries is very low and
+you have some awesome number of simultaneous system users,
+you might want to raise the limit.
+.TP
+.I /proc/sys/fs/dquot\-nr
+This file shows the number of allocated disk quota
+entries and the number of free disk quota entries.
+.TP
+.IR /proc/sys/fs/epoll/ " (since Linux 2.6.28)"
+This directory contains the file
+.IR max_user_watches ,
+which can be used to limit the amount of kernel memory consumed by the
+.I epoll
+interface.
+For further details, see
+.BR epoll (7).
+.TP
+.I /proc/sys/fs/file\-max
+This file defines
+a system-wide limit on the number of open files for all processes.
+System calls that fail when encountering this limit fail with the error
+.BR ENFILE .
+(See also
+.BR setrlimit (2),
+which can be used by a process to set the per-process limit,
+.BR RLIMIT_NOFILE ,
+on the number of files it may open.)
+If you get lots
+of error messages in the kernel log about running out of file handles
+(open file descriptions)
+(look for "VFS: file\-max limit <number> reached"),
+try increasing this value:
+.IP
+.in +4n
+.EX
+echo 100000 > /proc/sys/fs/file\-max
+.EE
+.in
+.IP
+Privileged processes
+.RB ( CAP_SYS_ADMIN )
+can override the
+.I file\-max
+limit.
+.TP
+.I /proc/sys/fs/file\-nr
+This (read-only) file contains three numbers:
+the number of allocated file handles
+(i.e., the number of open file descriptions; see
+.BR open (2));
+the number of free file handles;
+and the maximum number of file handles (i.e., the same value as
+.IR /proc/sys/fs/file\-max ).
+If the number of allocated file handles is close to the
+maximum, you should consider increasing the maximum.
+Before Linux 2.6,
+the kernel allocated file handles dynamically,
+but it didn't free them again.
+Instead the free file handles were kept in a list for reallocation;
+the "free file handles" value indicates the size of that list.
+A large number of free file handles indicates that there was
+a past peak in the usage of open file handles.
+Since Linux 2.6, the kernel does deallocate freed file handles,
+and the "free file handles" value is always zero.
+.TP
+.IR /proc/sys/fs/inode\-max " (only present until Linux 2.2)"
+This file contains the maximum number of in-memory inodes.
+This value should be 3\[en]4 times larger
+than the value in
+.IR file\-max ,
+since \fIstdin\fP, \fIstdout\fP
+and network sockets also need an inode to handle them.
+When you regularly run out of inodes, you need to increase this value.
+.IP
+Starting with Linux 2.4,
+there is no longer a static limit on the number of inodes,
+and this file is removed.
+.TP
+.I /proc/sys/fs/inode\-nr
+This file contains the first two values from
+.IR inode\-state .
+.TP
+.I /proc/sys/fs/inode\-state
+This file
+contains seven numbers:
+.IR nr_inodes ,
+.IR nr_free_inodes ,
+.IR preshrink ,
+and four dummy values (always zero).
+.IP
+.I nr_inodes
+is the number of inodes the system has allocated.
+.\" This can be slightly more than
+.\" .I inode\-max
+.\" because Linux allocates them one page full at a time.
+.I nr_free_inodes
+represents the number of free inodes.
+.IP
+.I preshrink
+is nonzero when the
+.I nr_inodes
+>
+.I inode\-max
+and the system needs to prune the inode list instead of allocating more;
+since Linux 2.4, this field is a dummy value (always zero).
+.TP
+.IR /proc/sys/fs/inotify/ " (since Linux 2.6.13)"
+This directory contains files
+.IR max_queued_events ", " max_user_instances ", and " max_user_watches ,
+that can be used to limit the amount of kernel memory consumed by the
+.I inotify
+interface.
+For further details, see
+.BR inotify (7).
+.TP
+.I /proc/sys/fs/lease\-break\-time
+This file specifies the grace period that the kernel grants to a process
+holding a file lease
+.RB ( fcntl (2))
+after it has sent a signal to that process notifying it
+that another process is waiting to open the file.
+If the lease holder does not remove or downgrade the lease within
+this grace period, the kernel forcibly breaks the lease.
+.TP
+.I /proc/sys/fs/leases\-enable
+This file can be used to enable or disable file leases
+.RB ( fcntl (2))
+on a system-wide basis.
+If this file contains the value 0, leases are disabled.
+A nonzero value enables leases.
+.TP
+.IR /proc/sys/fs/mount\-max " (since Linux 4.9)"
+.\" commit d29216842a85c7970c536108e093963f02714498
+The value in this file specifies the maximum number of mounts that may exist
+in a mount namespace.
+The default value in this file is 100,000.
+.TP
+.IR /proc/sys/fs/mqueue/ " (since Linux 2.6.6)"
+This directory contains files
+.IR msg_max ", " msgsize_max ", and " queues_max ,
+controlling the resources used by POSIX message queues.
+See
+.BR mq_overview (7)
+for details.
+.TP
+.IR /proc/sys/fs/nr_open " (since Linux 2.6.25)"
+.\" commit 9cfe015aa424b3c003baba3841a60dd9b5ad319b
+This file imposes a ceiling on the value to which the
+.B RLIMIT_NOFILE
+resource limit can be raised (see
+.BR getrlimit (2)).
+This ceiling is enforced for both unprivileged and privileged process.
+The default value in this file is 1048576.
+(Before Linux 2.6.25, the ceiling for
+.B RLIMIT_NOFILE
+was hard-coded to the same value.)
+.TP
+.IR /proc/sys/fs/overflowgid " and " /proc/sys/fs/overflowuid
+These files
+allow you to change the value of the fixed UID and GID.
+The default is 65534.
+Some filesystems support only 16-bit UIDs and GIDs, although in Linux
+UIDs and GIDs are 32 bits.
+When one of these filesystems is mounted
+with writes enabled, any UID or GID that would exceed 65535 is translated
+to the overflow value before being written to disk.
+.TP
+.IR /proc/sys/fs/pipe\-max\-size " (since Linux 2.6.35)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/pipe\-user\-pages\-hard " (since Linux 4.5)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/pipe\-user\-pages\-soft " (since Linux 4.5)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/protected_fifos " (since Linux 4.19)"
+The value in this file is/can be set to one of the following:
+.RS
+.TP 4
+0
+Writing to FIFOs is unrestricted.
+.TP
+1
+Don't allow
+.B O_CREAT
+.BR open (2)
+on FIFOs that the caller doesn't own in world-writable sticky directories,
+unless the FIFO is owned by the owner of the directory.
+.TP
+2
+As for the value 1,
+but the restriction also applies to group-writable sticky directories.
+.RE
+.IP
+The intent of the above protections is to avoid unintentional writes to an
+attacker-controlled FIFO when a program expected to create a regular file.
+.TP
+.IR /proc/sys/fs/protected_hardlinks " (since Linux 3.6)"
+.\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7
+When the value in this file is 0,
+no restrictions are placed on the creation of hard links
+(i.e., this is the historical behavior before Linux 3.6).
+When the value in this file is 1,
+a hard link can be created to a target file
+only if one of the following conditions is true:
+.RS
+.IP \[bu] 3
+The calling process has the
+.B CAP_FOWNER
+capability in its user namespace
+and the file UID has a mapping in the namespace.
+.IP \[bu]
+The filesystem UID of the process creating the link matches
+the owner (UID) of the target file
+(as described in
+.BR credentials (7),
+a process's filesystem UID is normally the same as its effective UID).
+.IP \[bu]
+All of the following conditions are true:
+.RS 4
+.IP \[bu] 3
+the target is a regular file;
+.IP \[bu]
+the target file does not have its set-user-ID mode bit enabled;
+.IP \[bu]
+the target file does not have both its set-group-ID and
+group-executable mode bits enabled; and
+.IP \[bu]
+the caller has permission to read and write the target file
+(either via the file's permissions mask or because it has
+suitable capabilities).
+.RE
+.RE
+.IP
+The default value in this file is 0.
+Setting the value to 1
+prevents a longstanding class of security issues caused by
+hard-link-based time-of-check, time-of-use races,
+most commonly seen in world-writable directories such as
+.IR /tmp .
+The common method of exploiting this flaw
+is to cross privilege boundaries when following a given hard link
+(i.e., a root process follows a hard link created by another user).
+Additionally, on systems without separated partitions,
+this stops unauthorized users from "pinning" vulnerable set-user-ID and
+set-group-ID files against being upgraded by
+the administrator, or linking to special files.
+.TP
+.IR /proc/sys/fs/protected_regular " (since Linux 4.19)"
+The value in this file is/can be set to one of the following:
+.RS
+.TP 4
+0
+Writing to regular files is unrestricted.
+.TP
+1
+Don't allow
+.B O_CREAT
+.BR open (2)
+on regular files that the caller doesn't own in
+world-writable sticky directories,
+unless the regular file is owned by the owner of the directory.
+.TP
+2
+As for the value 1,
+but the restriction also applies to group-writable sticky directories.
+.RE
+.IP
+The intent of the above protections is similar to
+.IR protected_fifos ,
+but allows an application to
+avoid writes to an attacker-controlled regular file,
+where the application expected to create one.
+.TP
+.IR /proc/sys/fs/protected_symlinks " (since Linux 3.6)"
+.\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7
+When the value in this file is 0,
+no restrictions are placed on following symbolic links
+(i.e., this is the historical behavior before Linux 3.6).
+When the value in this file is 1, symbolic links are followed only
+in the following circumstances:
+.RS
+.IP \[bu] 3
+the filesystem UID of the process following the link matches
+the owner (UID) of the symbolic link
+(as described in
+.BR credentials (7),
+a process's filesystem UID is normally the same as its effective UID);
+.IP \[bu]
+the link is not in a sticky world-writable directory; or
+.IP \[bu]
+the symbolic link and its parent directory have the same owner (UID)
+.RE
+.IP
+A system call that fails to follow a symbolic link
+because of the above restrictions returns the error
+.B EACCES
+in
+.IR errno .
+.IP
+The default value in this file is 0.
+Setting the value to 1 avoids a longstanding class of security issues
+based on time-of-check, time-of-use races when accessing symbolic links.
+.TP
+.IR /proc/sys/fs/suid_dumpable " (since Linux 2.6.13)"
+.\" The following is based on text from Documentation/sysctl/kernel.txt
+The value in this file is assigned to a process's "dumpable" flag
+in the circumstances described in
+.BR prctl (2).
+In effect,
+the value in this file determines whether core dump files are
+produced for set-user-ID or otherwise protected/tainted binaries.
+The "dumpable" setting also affects the ownership of files in a process's
+.IR /proc/ pid
+directory, as described above.
+.IP
+Three different integer values can be specified:
+.RS
+.TP
+\fI0\ (default)\fP
+.\" In kernel source: SUID_DUMP_DISABLE
+This provides the traditional (pre-Linux 2.6.13) behavior.
+A core dump will not be produced for a process which has
+changed credentials (by calling
+.BR seteuid (2),
+.BR setgid (2),
+or similar, or by executing a set-user-ID or set-group-ID program)
+or whose binary does not have read permission enabled.
+.TP
+\fI1\ ("debug")\fP
+.\" In kernel source: SUID_DUMP_USER
+All processes dump core when possible.
+(Reasons why a process might nevertheless not dump core are described in
+.BR core (5).)
+The core dump is owned by the filesystem user ID of the dumping process
+and no security is applied.
+This is intended for system debugging situations only:
+this mode is insecure because it allows unprivileged users to
+examine the memory contents of privileged processes.
+.TP
+\fI2\ ("suidsafe")\fP
+.\" In kernel source: SUID_DUMP_ROOT
+Any binary which normally would not be dumped (see "0" above)
+is dumped readable by root only.
+This allows the user to remove the core dump file but not to read it.
+For security reasons core dumps in this mode will not overwrite one
+another or other files.
+This mode is appropriate when administrators are
+attempting to debug problems in a normal environment.
+.IP
+Additionally, since Linux 3.6,
+.\" 9520628e8ceb69fa9a4aee6b57f22675d9e1b709
+.I /proc/sys/kernel/core_pattern
+must either be an absolute pathname
+or a pipe command, as detailed in
+.BR core (5).
+Warnings will be written to the kernel log if
+.I core_pattern
+does not follow these rules, and no core dump will be produced.
+.\" 54b501992dd2a839e94e76aa392c392b55080ce8
+.RE
+.IP
+For details of the effect of a process's "dumpable" setting
+on ptrace access mode checking, see
+.BR ptrace (2).
+.TP
+.I /proc/sys/fs/super\-max
+This file
+controls the maximum number of superblocks, and
+thus the maximum number of mounted filesystems the kernel
+can have.
+You need increase only
+.I super\-max
+if you need to mount more filesystems than the current value in
+.I super\-max
+allows you to.
+.TP
+.I /proc/sys/fs/super\-nr
+This file
+contains the number of filesystems currently mounted.
+.SH SEE ALSO
+.BR proc (5),
+.BR proc_sys (5)