summaryrefslogtreecommitdiffstats
path: root/man/man2/ioctl_userfaultfd.2
diff options
context:
space:
mode:
Diffstat (limited to 'man/man2/ioctl_userfaultfd.2')
-rw-r--r--man/man2/ioctl_userfaultfd.21072
1 files changed, 1072 insertions, 0 deletions
diff --git a/man/man2/ioctl_userfaultfd.2 b/man/man2/ioctl_userfaultfd.2
new file mode 100644
index 000000000..cbd0c7978
--- /dev/null
+++ b/man/man2/ioctl_userfaultfd.2
@@ -0,0 +1,1072 @@
+.\" Copyright (c) 2016, IBM Corporation.
+.\" Written by Mike Rapoport <rppt@linux.vnet.ibm.com>
+.\" and Copyright (C) 2016 Michael Kerrisk <mtk.manpages@gmail.com>
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.\"
+.TH ioctl_userfaultfd 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+ioctl_userfaultfd \- create a file descriptor for handling page faults in user
+space
+.SH LIBRARY
+Standard C library
+.RI ( libc ", " \-lc )
+.SH SYNOPSIS
+.nf
+.BR "#include <linux/userfaultfd.h>" " /* Definition of " UFFD* " constants */"
+.B #include <sys/ioctl.h>
+.P
+.BI "int ioctl(int " fd ", int " op ", ...);"
+.fi
+.SH DESCRIPTION
+Various
+.BR ioctl (2)
+operations can be performed on a userfaultfd object (created by a call to
+.BR userfaultfd (2))
+using calls of the form:
+.P
+.in +4n
+.EX
+ioctl(fd, op, argp);
+.EE
+.in
+.P
+In the above,
+.I fd
+is a file descriptor referring to a userfaultfd object,
+.I op
+is one of the operations listed below, and
+.I argp
+is a pointer to a data structure that is specific to
+.IR op .
+.P
+The various
+.BR ioctl (2)
+operations are described below.
+The
+.BR UFFDIO_API ,
+.BR UFFDIO_REGISTER ,
+and
+.B UFFDIO_UNREGISTER
+operations are used to
+.I configure
+userfaultfd behavior.
+These operations allow the caller to choose what features will be enabled and
+what kinds of events will be delivered to the application.
+The remaining operations are
+.I range
+operations.
+These operations enable the calling application to resolve page-fault
+events.
+.\"
+.SS UFFDIO_API
+(Since Linux 4.3.)
+Enable operation of the userfaultfd and perform API handshake.
+.P
+The
+.I argp
+argument is a pointer to a
+.I uffdio_api
+structure, defined as:
+.P
+.in +4n
+.EX
+struct uffdio_api {
+ __u64 api; /* Requested API version (input) */
+ __u64 features; /* Requested features (input/output) */
+ __u64 ioctls; /* Available ioctl() operations (output) */
+};
+.EE
+.in
+.P
+The
+.I api
+field denotes the API version requested by the application.
+The kernel verifies that it can support the requested API version,
+and sets the
+.I features
+and
+.I ioctls
+fields to bit masks representing all the available features and the generic
+.BR ioctl (2)
+operations available.
+.P
+Since Linux 4.11,
+applications should use the
+.I features
+field to perform a two-step handshake.
+First,
+.B UFFDIO_API
+is called with the
+.I features
+field set to zero.
+The kernel responds by setting all supported feature bits.
+.P
+Applications which do not require any specific features
+can begin using the userfaultfd immediately.
+Applications which do need specific features
+should call
+.B UFFDIO_API
+again with a subset of the reported feature bits set
+to enable those features.
+.P
+Before Linux 4.11, the
+.I features
+field must be initialized to zero before the call to
+.BR UFFDIO_API ,
+and zero (i.e., no feature bits) is placed in the
+.I features
+field by the kernel upon return from
+.BR ioctl (2).
+.P
+If the application sets unsupported feature bits,
+the kernel will zero out the returned
+.I uffdio_api
+structure and return
+.BR EINVAL .
+.P
+The following feature bits may be set:
+.TP
+.BR UFFD_FEATURE_EVENT_FORK " (since Linux 4.11)"
+When this feature is enabled,
+the userfaultfd objects associated with a parent process are duplicated
+into the child process during
+.BR fork (2)
+and a
+.B UFFD_EVENT_FORK
+event is delivered to the userfaultfd monitor
+.TP
+.BR UFFD_FEATURE_EVENT_REMAP " (since Linux 4.11)"
+If this feature is enabled,
+when the faulting process invokes
+.BR mremap (2),
+the userfaultfd monitor will receive an event of type
+.BR UFFD_EVENT_REMAP .
+.TP
+.BR UFFD_FEATURE_EVENT_REMOVE " (since Linux 4.11)"
+If this feature is enabled,
+when the faulting process calls
+.BR madvise (2)
+with the
+.B MADV_DONTNEED
+or
+.B MADV_REMOVE
+advice value to free a virtual memory area
+the userfaultfd monitor will receive an event of type
+.BR UFFD_EVENT_REMOVE .
+.TP
+.BR UFFD_FEATURE_EVENT_UNMAP " (since Linux 4.11)"
+If this feature is enabled,
+when the faulting process unmaps virtual memory either explicitly with
+.BR munmap (2),
+or implicitly during either
+.BR mmap (2)
+or
+.BR mremap (2),
+the userfaultfd monitor will receive an event of type
+.BR UFFD_EVENT_UNMAP .
+.TP
+.BR UFFD_FEATURE_MISSING_HUGETLBFS " (since Linux 4.11)"
+If this feature bit is set,
+the kernel supports registering userfaultfd ranges on hugetlbfs
+virtual memory areas
+.TP
+.BR UFFD_FEATURE_MISSING_SHMEM " (since Linux 4.11)"
+If this feature bit is set,
+the kernel supports registering userfaultfd ranges on shared memory areas.
+This includes all kernel shared memory APIs:
+System V shared memory,
+.BR tmpfs (5),
+shared mappings of
+.IR /dev/zero ,
+.BR mmap (2)
+with the
+.B MAP_SHARED
+flag set,
+.BR memfd_create (2),
+and so on.
+.TP
+.BR UFFD_FEATURE_SIGBUS " (since Linux 4.14)"
+.\" commit 2d6d6f5a09a96cc1fec7ed992b825e05f64cb50e
+If this feature bit is set, no page-fault events
+.RB ( UFFD_EVENT_PAGEFAULT )
+will be delivered.
+Instead, a
+.B SIGBUS
+signal will be sent to the faulting process.
+Applications using this
+feature will not require the use of a userfaultfd monitor for processing
+memory accesses to the regions registered with userfaultfd.
+.TP
+.BR UFFD_FEATURE_THREAD_ID " (since Linux 4.14)"
+If this feature bit is set,
+.I uffd_msg.pagefault.feat.ptid
+will be set to the faulted thread ID for each page-fault message.
+.TP
+.BR UFFD_FEATURE_PAGEFAULT_FLAG_WP " (since Linux 5.10)"
+If this feature bit is set,
+userfaultfd supports write-protect faults
+for anonymous memory.
+(Note that shmem / hugetlbfs support
+is indicated by a separate feature.)
+.TP
+.BR UFFD_FEATURE_MINOR_HUGETLBFS " (since Linux 5.13)"
+If this feature bit is set,
+the kernel supports registering userfaultfd ranges
+in minor mode on hugetlbfs-backed memory areas.
+.TP
+.BR UFFD_FEATURE_MINOR_SHMEM " (since Linux 5.14)"
+If this feature bit is set,
+the kernel supports registering userfaultfd ranges
+in minor mode on shmem-backed memory areas.
+.TP
+.BR UFFD_FEATURE_EXACT_ADDRESS " (since Linux 5.18)"
+If this feature bit is set,
+.I uffd_msg.pagefault.address
+will be set to the exact page-fault address that was reported by the hardware,
+and will not mask the offset within the page.
+Note that old Linux versions might indicate the exact address as well,
+even though the feature bit is not set.
+.TP
+.BR UFFD_FEATURE_WP_HUGETLBFS_SHMEM " (since Linux 5.19)"
+If this feature bit is set,
+userfaultfd supports write-protect faults
+for hugetlbfs and shmem / tmpfs memory.
+.TP
+.BR UFFD_FEATURE_WP_UNPOPULATED " (since Linux 6.4)"
+If this feature bit is set,
+the kernel will handle anonymous memory the same way as file memory,
+by allowing the user to write-protect unpopulated page table entries.
+.TP
+.BR UFFD_FEATURE_POISON " (since Linux 6.6)"
+If this feature bit is set,
+the kernel supports resolving faults with the
+.B UFFDIO_POISON
+ioctl.
+.TP
+.BR UFFD_FEATURE_WP_ASYNC " (since Linux 6.7)"
+If this feature bit is set,
+the write protection faults would be asynchronously resolved
+by the kernel.
+.P
+The returned
+.I ioctls
+field can contain the following bits:
+.\" FIXME This user-space API seems not fully polished. Why are there
+.\" not constants defined for each of the bit-mask values listed below?
+.TP
+.B 1 << _UFFDIO_API
+The
+.B UFFDIO_API
+operation is supported.
+.TP
+.B 1 << _UFFDIO_REGISTER
+The
+.B UFFDIO_REGISTER
+operation is supported.
+.TP
+.B 1 << _UFFDIO_UNREGISTER
+The
+.B UFFDIO_UNREGISTER
+operation is supported.
+.P
+This
+.BR ioctl (2)
+operation returns 0 on success.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+If an error occurs,
+the kernel may zero the provided
+.I uffdio_api
+structure.
+The caller should treat its contents as unspecified,
+and reinitialize it before re-attempting another
+.B UFFDIO_API
+call.
+Possible errors include:
+.TP
+.B EFAULT
+.I argp
+refers to an address that is outside the calling process's
+accessible address space.
+.TP
+.B EINVAL
+The API version requested in the
+.I api
+field is not supported by this kernel, or the
+.I features
+field passed to the kernel includes feature bits that are not supported
+by the current kernel version.
+.TP
+.B EINVAL
+A previous
+.B UFFDIO_API
+call already enabled one or more features for this userfaultfd.
+Calling
+.B UFFDIO_API
+twice,
+the first time with no features set,
+is explicitly allowed
+as per the two-step feature detection handshake.
+.TP
+.B EPERM
+The
+.B UFFD_FEATURE_EVENT_FORK
+feature was enabled,
+but the calling process doesn't have the
+.B CAP_SYS_PTRACE
+capability.
+.SS UFFDIO_REGISTER
+(Since Linux 4.3.)
+Register a memory address range with the userfaultfd object.
+The pages in the range must be \[lq]compatible\[rq].
+Please refer to the list of register modes below
+for the compatible memory backends for each mode.
+.P
+The
+.I argp
+argument is a pointer to a
+.I uffdio_register
+structure, defined as:
+.P
+.in +4n
+.EX
+struct uffdio_range {
+ __u64 start; /* Start of range */
+ __u64 len; /* Length of range (bytes) */
+};
+\&
+struct uffdio_register {
+ struct uffdio_range range;
+ __u64 mode; /* Desired mode of operation (input) */
+ __u64 ioctls; /* Available ioctl() operations (output) */
+};
+.EE
+.in
+.P
+The
+.I range
+field defines a memory range starting at
+.I start
+and continuing for
+.I len
+bytes that should be handled by the userfaultfd.
+.P
+The
+.I mode
+field defines the mode of operation desired for this memory region.
+The following values may be bitwise ORed to set the userfaultfd mode for
+the specified range:
+.TP
+.B UFFDIO_REGISTER_MODE_MISSING
+Track page faults on missing pages.
+Since Linux 4.3,
+only private anonymous ranges are compatible.
+Since Linux 4.11,
+hugetlbfs and shared memory ranges are also compatible.
+.TP
+.B UFFDIO_REGISTER_MODE_WP
+Track page faults on write-protected pages.
+Since Linux 5.7,
+only private anonymous ranges are compatible.
+.TP
+.B UFFDIO_REGISTER_MODE_MINOR
+Track minor page faults.
+Since Linux 5.13,
+only hugetlbfs ranges are compatible.
+Since Linux 5.14,
+compatibility with shmem ranges was added.
+.P
+If the operation is successful, the kernel modifies the
+.I ioctls
+bit-mask field to indicate which
+.BR ioctl (2)
+operations are available for the specified range.
+This returned bit mask can contain the following bits:
+.TP
+.B 1 << _UFFDIO_COPY
+The
+.B UFFDIO_COPY
+operation is supported.
+.TP
+.B 1 << _UFFDIO_WAKE
+The
+.B UFFDIO_WAKE
+operation is supported.
+.TP
+.B 1 << _UFFDIO_WRITEPROTECT
+The
+.B UFFDIO_WRITEPROTECT
+operation is supported.
+.TP
+.B 1 << _UFFDIO_ZEROPAGE
+The
+.B UFFDIO_ZEROPAGE
+operation is supported.
+.TP
+.B 1 << _UFFDIO_CONTINUE
+The
+.B UFFDIO_CONTINUE
+operation is supported.
+.TP
+.B 1 << _UFFDIO_POISON
+The
+.B UFFDIO_POISON
+operation is supported.
+.P
+This
+.BR ioctl (2)
+operation returns 0 on success.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+Possible errors include:
+.\" FIXME Is the following error list correct?
+.\"
+.TP
+.B EBUSY
+A mapping in the specified range is registered with another
+userfaultfd object.
+.TP
+.B EFAULT
+.I argp
+refers to an address that is outside the calling process's
+accessible address space.
+.TP
+.B EINVAL
+An invalid or unsupported bit was specified in the
+.I mode
+field; or the
+.I mode
+field was zero.
+.TP
+.B EINVAL
+There is no mapping in the specified address range.
+.TP
+.B EINVAL
+.I range.start
+or
+.I range.len
+is not a multiple of the system page size; or,
+.I range.len
+is zero; or these fields are otherwise invalid.
+.TP
+.B EINVAL
+There as an incompatible mapping in the specified address range.
+.\" Mike Rapoport:
+.\" ENOMEM if the process is exiting and the
+.\" mm_struct has gone by the time userfault grabs it.
+.SS UFFDIO_UNREGISTER
+(Since Linux 4.3.)
+Unregister a memory address range from userfaultfd.
+The pages in the range must be \[lq]compatible\[rq]
+(see the description of
+.BR UFFDIO_REGISTER .)
+.P
+The address range to unregister is specified in the
+.I uffdio_range
+structure pointed to by
+.IR argp .
+.P
+This
+.BR ioctl (2)
+operation returns 0 on success.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+Possible errors include:
+.TP
+.B EINVAL
+Either the
+.I start
+or the
+.I len
+field of the
+.I ufdio_range
+structure was not a multiple of the system page size; or the
+.I len
+field was zero; or these fields were otherwise invalid.
+.TP
+.B EINVAL
+There as an incompatible mapping in the specified address range.
+.TP
+.B EINVAL
+There was no mapping in the specified address range.
+.\"
+.SS UFFDIO_COPY
+(Since Linux 4.3.)
+Atomically copy a continuous memory chunk into the userfault registered
+range and optionally wake up the blocked thread.
+The source and destination addresses and the number of bytes to copy are
+specified by the
+.IR src ,
+.IR dst ,
+and
+.I len
+fields of the
+.I uffdio_copy
+structure pointed to by
+.IR argp :
+.P
+.in +4n
+.EX
+struct uffdio_copy {
+ __u64 dst; /* Destination of copy */
+ __u64 src; /* Source of copy */
+ __u64 len; /* Number of bytes to copy */
+ __u64 mode; /* Flags controlling behavior of copy */
+ __s64 copy; /* Number of bytes copied, or negated error */
+};
+.EE
+.in
+.P
+The following value may be bitwise ORed in
+.I mode
+to change the behavior of the
+.B UFFDIO_COPY
+operation:
+.TP
+.B UFFDIO_COPY_MODE_DONTWAKE
+Do not wake up the thread that waits for page-fault resolution
+.TP
+.B UFFDIO_COPY_MODE_WP
+Copy the page with read-only permission.
+This allows the user to trap the next write to the page,
+which will block and generate another write-protect userfault message.
+This is used only when both
+.B UFFDIO_REGISTER_MODE_MISSING
+and
+.B UFFDIO_REGISTER_MODE_WP
+modes are enabled for the registered range.
+.P
+The
+.I copy
+field is used by the kernel to return the number of bytes
+that was actually copied, or an error (a negated
+.IR errno -style
+value).
+.\" FIXME Above: Why is the 'copy' field used to return error values?
+.\" This should be explained in the manual page.
+If the value returned in
+.I copy
+doesn't match the value that was specified in
+.IR len ,
+the operation fails with the error
+.BR EAGAIN .
+The
+.I copy
+field is output-only;
+it is not read by the
+.B UFFDIO_COPY
+operation.
+.P
+This
+.BR ioctl (2)
+operation returns 0 on success.
+In this case, the entire area was copied.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+Possible errors include:
+.TP
+.B EAGAIN
+The number of bytes copied (i.e., the value returned in the
+.I copy
+field)
+does not equal the value that was specified in the
+.I len
+field.
+.TP
+.B EINVAL
+Either
+.I dst
+or
+.I len
+was not a multiple of the system page size, or the range specified by
+.I src
+and
+.I len
+or
+.I dst
+and
+.I len
+was invalid.
+.TP
+.B EINVAL
+An invalid bit was specified in the
+.I mode
+field.
+.TP
+.BR ENOENT " (since Linux 4.11)"
+The faulting process has changed
+its virtual memory layout simultaneously with an outstanding
+.B UFFDIO_COPY
+operation.
+.TP
+.BR ENOSPC " (from Linux 4.11 until Linux 4.13)"
+The faulting process has exited at the time of a
+.B UFFDIO_COPY
+operation.
+.TP
+.BR ESRCH " (since Linux 4.13)"
+The faulting process has exited at the time of a
+.B UFFDIO_COPY
+operation.
+.\"
+.SS UFFDIO_ZEROPAGE
+(Since Linux 4.3.)
+Zero out a memory range registered with userfaultfd.
+.P
+The requested range is specified by the
+.I range
+field of the
+.I uffdio_zeropage
+structure pointed to by
+.IR argp :
+.P
+.in +4n
+.EX
+struct uffdio_zeropage {
+ struct uffdio_range range;
+ __u64 mode; /* Flags controlling behavior of copy */
+ __s64 zeropage; /* Number of bytes zeroed, or negated error */
+};
+.EE
+.in
+.P
+The following value may be bitwise ORed in
+.I mode
+to change the behavior of the
+.B UFFDIO_ZEROPAGE
+operation:
+.TP
+.B UFFDIO_ZEROPAGE_MODE_DONTWAKE
+Do not wake up the thread that waits for page-fault resolution.
+.P
+The
+.I zeropage
+field is used by the kernel to return the number of bytes
+that was actually zeroed,
+or an error in the same manner as
+.BR UFFDIO_COPY .
+.\" FIXME Why is the 'zeropage' field used to return error values?
+.\" This should be explained in the manual page.
+If the value returned in the
+.I zeropage
+field doesn't match the value that was specified in
+.IR range.len ,
+the operation fails with the error
+.BR EAGAIN .
+The
+.I zeropage
+field is output-only;
+it is not read by the
+.B UFFDIO_ZEROPAGE
+operation.
+.P
+This
+.BR ioctl (2)
+operation returns 0 on success.
+In this case, the entire area was zeroed.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+Possible errors include:
+.TP
+.B EAGAIN
+The number of bytes zeroed (i.e., the value returned in the
+.I zeropage
+field)
+does not equal the value that was specified in the
+.I range.len
+field.
+.TP
+.B EINVAL
+Either
+.I range.start
+or
+.I range.len
+was not a multiple of the system page size; or
+.I range.len
+was zero; or the range specified was invalid.
+.TP
+.B EINVAL
+An invalid bit was specified in the
+.I mode
+field.
+.TP
+.BR ESRCH " (since Linux 4.13)"
+The faulting process has exited at the time of a
+.B UFFDIO_ZEROPAGE
+operation.
+.\"
+.SS UFFDIO_WAKE
+(Since Linux 4.3.)
+Wake up the thread waiting for page-fault resolution on
+a specified memory address range.
+.P
+The
+.B UFFDIO_WAKE
+operation is used in conjunction with
+.B UFFDIO_COPY
+and
+.B UFFDIO_ZEROPAGE
+operations that have the
+.B UFFDIO_COPY_MODE_DONTWAKE
+or
+.B UFFDIO_ZEROPAGE_MODE_DONTWAKE
+bit set in the
+.I mode
+field.
+The userfault monitor can perform several
+.B UFFDIO_COPY
+and
+.B UFFDIO_ZEROPAGE
+operations in a batch and then explicitly wake up the faulting thread using
+.BR UFFDIO_WAKE .
+.P
+The
+.I argp
+argument is a pointer to a
+.I uffdio_range
+structure (shown above) that specifies the address range.
+.P
+This
+.BR ioctl (2)
+operation returns 0 on success.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+Possible errors include:
+.TP
+.B EINVAL
+The
+.I start
+or the
+.I len
+field of the
+.I ufdio_range
+structure was not a multiple of the system page size; or
+.I len
+was zero; or the specified range was otherwise invalid.
+.SS UFFDIO_WRITEPROTECT
+(Since Linux 5.7.)
+Write-protect or write-unprotect a userfaultfd-registered memory range
+registered with mode
+.BR UFFDIO_REGISTER_MODE_WP .
+.P
+The
+.I argp
+argument is a pointer to a
+.I uffdio_range
+structure as shown below:
+.P
+.in +4n
+.EX
+struct uffdio_writeprotect {
+ struct uffdio_range range; /* Range to change write permission*/
+ __u64 mode; /* Mode to change write permission */
+};
+.EE
+.in
+.P
+There are two mode bits that are supported in this structure:
+.TP
+.B UFFDIO_WRITEPROTECT_MODE_WP
+When this mode bit is set,
+the ioctl will be a write-protect operation upon the memory range specified by
+.IR range .
+Otherwise it will be a write-unprotect operation upon the specified range,
+which can be used to resolve a userfaultfd write-protect page fault.
+.TP
+.B UFFDIO_WRITEPROTECT_MODE_DONTWAKE
+When this mode bit is set,
+do not wake up any thread that waits for
+page-fault resolution after the operation.
+This can be specified only if
+.B UFFDIO_WRITEPROTECT_MODE_WP
+is not specified.
+.P
+This
+.BR ioctl (2)
+operation returns 0 on success.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+Possible errors include:
+.TP
+.B EINVAL
+The
+.I start
+or the
+.I len
+field of the
+.I ufdio_range
+structure was not a multiple of the system page size; or
+.I len
+was zero; or the specified range was otherwise invalid.
+.TP
+.B EAGAIN
+The process was interrupted; retry this call.
+.TP
+.B ENOENT
+The range specified in
+.I range
+is not valid.
+For example, the virtual address does not exist,
+or not registered with userfaultfd write-protect mode.
+.TP
+.B EFAULT
+Encountered a generic fault during processing.
+.\"
+.SS UFFDIO_CONTINUE
+(Since Linux 5.13.)
+Resolve a minor page fault
+by installing page table entries
+for existing pages in the page cache.
+.P
+The
+.I argp
+argument is a pointer to a
+.I uffdio_continue
+structure as shown below:
+.P
+.in +4n
+.EX
+struct uffdio_continue {
+ struct uffdio_range range;
+ /* Range to install PTEs for and continue */
+ __u64 mode; /* Flags controlling the behavior of continue */
+ __s64 mapped; /* Number of bytes mapped, or negated error */
+};
+.EE
+.in
+.P
+The following value may be bitwise ORed in
+.I mode
+to change the behavior of the
+.B UFFDIO_CONTINUE
+operation:
+.TP
+.B UFFDIO_CONTINUE_MODE_DONTWAKE
+Do not wake up the thread that waits for page-fault resolution.
+.P
+The
+.I mapped
+field is used by the kernel
+to return the number of bytes that were actually mapped,
+or an error in the same manner as
+.BR UFFDIO_COPY .
+If the value returned in the
+.I mapped
+field doesn't match the value that was specified in
+.IR range.len ,
+the operation fails with the error
+.BR EAGAIN .
+The
+.I mapped
+field is output-only;
+it is not read by the
+.B UFFDIO_CONTINUE
+operation.
+.P
+This
+.BR ioctl (2)
+operation returns 0 on success.
+In this case,
+the entire area was mapped.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+Possible errors include:
+.TP
+.B EAGAIN
+The number of bytes mapped
+(i.e., the value returned in the
+.I mapped
+field)
+does not equal the value that was specified in the
+.I range.len
+field.
+.TP
+.B EEXIST
+One or more pages were already mapped in the given range.
+.TP
+.B EFAULT
+No existing page could be found in the page cache for the given range.
+.TP
+.B EINVAL
+Either
+.I range.start
+or
+.I range.len
+was not a multiple of the system page size; or
+.I range.len
+was zero; or the range specified was invalid.
+.TP
+.B EINVAL
+An invalid bit was specified in the
+.I mode
+field.
+.TP
+.B ENOENT
+The faulting process has changed its virtual memory layout simultaneously with
+an outstanding
+.B UFFDIO_CONTINUE
+operation.
+.TP
+.B ENOMEM
+Allocating memory needed to setup the page table mappings failed.
+.TP
+.B ESRCH
+The faulting process has exited at the time of a
+.B UFFDIO_CONTINUE
+operation.
+.\"
+.SS UFFDIO_POISON
+(Since Linux 6.6.)
+Mark an address range as "poisoned".
+Future accesses to these addresses will raise a
+.B SIGBUS
+signal.
+Unlike
+.B MADV_HWPOISON
+this works by installing page table entries,
+rather than "really" poisoning the underlying physical pages.
+This means it only affects this particular address space.
+.P
+The
+.I argp
+argument is a pointer to a
+.I uffdio_poison
+structure as shown below:
+.P
+.in +4n
+.EX
+struct uffdio_poison {
+ struct uffdio_range range;
+ /* Range to install poison PTE markers in */
+ __u64 mode; /* Flags controlling the behavior of poison */
+ __s64 updated; /* Number of bytes poisoned, or negated error */
+};
+.EE
+.in
+.P
+The following value may be bitwise ORed in
+.I mode
+to change the behavior of the
+.B UFFDIO_POISON
+operation:
+.TP
+.B UFFDIO_POISON_MODE_DONTWAKE
+Do not wake up the thread that waits for page-fault resolution.
+.P
+The
+.I updated
+field is used by the kernel
+to return the number of bytes that were actually poisoned,
+or an error in the same manner as
+.BR UFFDIO_COPY .
+If the value returned in the
+.I updated
+field doesn't match the value that was specified in
+.IR range.len ,
+the operation fails with the error
+.BR EAGAIN .
+The
+.I updated
+field is output-only;
+it is not read by the
+.B UFFDIO_POISON
+operation.
+.P
+This
+.BR ioctl (2)
+operation returns 0 on success.
+In this case,
+the entire area was poisoned.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+Possible errors include:
+.TP
+.B EAGAIN
+The number of bytes mapped
+(i.e., the value returned in the
+.I updated
+field)
+does not equal the value that was specified in the
+.I range.len
+field.
+.TP
+.B EINVAL
+Either
+.I range.start
+or
+.I range.len
+was not a multiple of the system page size; or
+.I range.len
+was zero; or the range specified was invalid.
+.TP
+.B EINVAL
+An invalid bit was specified in the
+.I mode
+field.
+.TP
+.B EEXIST
+One or more pages were already mapped in the given range.
+.TP
+.B ENOENT
+The faulting process has changed its virtual memory layout simultaneously with
+an outstanding
+.B UFFDIO_POISON
+operation.
+.TP
+.B ENOMEM
+Allocating memory for page table entries failed.
+.TP
+.B ESRCH
+The faulting process has exited at the time of a
+.B UFFDIO_POISON
+operation.
+.\"
+.SH RETURN VALUE
+See descriptions of the individual operations, above.
+.SH ERRORS
+See descriptions of the individual operations, above.
+In addition, the following general errors can occur for all of the
+operations described above:
+.TP
+.B EFAULT
+.I argp
+does not point to a valid memory address.
+.TP
+.B EINVAL
+(For all operations except
+.BR UFFDIO_API .)
+The userfaultfd object has not yet been enabled (via the
+.B UFFDIO_API
+operation).
+.SH STANDARDS
+Linux.
+.SH BUGS
+In order to detect available userfault features and
+enable some subset of those features
+the userfaultfd file descriptor must be closed after the first
+.B UFFDIO_API
+operation that queries features availability and reopened before
+the second
+.B UFFDIO_API
+operation that actually enables the desired features.
+.SH EXAMPLES
+See
+.BR userfaultfd (2).
+.SH SEE ALSO
+.BR ioctl (2),
+.BR mmap (2),
+.BR userfaultfd (2)
+.P
+.I Documentation/admin\-guide/mm/userfaultfd.rst
+in the Linux kernel source tree