.\" Copyright (C) 2021 Suren Baghdasaryan <surenb@google.com>
.\" and Copyright (C) 2021 Minchan Kim <minchan@kernel.org>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
.\" Commit ecb8ac8b1f146915aa6b96449b66dd48984caacc
.\"
.TH process_madvise 2 (date) "Linux man-pages (unreleased)"
.SH NAME
process_madvise \- give advice about use of memory to a process
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
.SH SYNOPSIS
.nf
.BR "#include <sys/mman.h>" "      /* Definition of " MADV_* " constants */"
.BR "#include <sys/syscall.h>" "   /* Definition of " SYS_* " constants */"
.BR "#include <sys/uio.h>" "       /* Definition of " "struct iovec" " type */"
.B #include <unistd.h>
.PP
.BI "ssize_t syscall(SYS_process_madvise, int " pidfd ,
.BI "                const struct iovec *" iovec ", size_t " vlen \
", int " advice ,
.BI "                unsigned int " flags ");"
.fi
.PP
.IR Note :
glibc provides no wrapper for
.BR process_madvise (),
necessitating the use of
.BR syscall (2).
.\" FIXME: See <https://sourceware.org/bugzilla/show_bug.cgi?id=27380>
.SH DESCRIPTION
The
.BR process_madvise ()
system call is used to give advice or directions to the kernel about the
address ranges of another process or of the calling process.
It provides the advice for the address ranges described by
.I iovec
and
.IR vlen .
The goal of such advice is to improve system or application performance.
.PP
The
.I pidfd
argument is a PID file descriptor (see
.BR pidfd_open (2))
that specifies the process to which the advice is to be applied.
.PP
The pointer
.I iovec
points to an array of
.I iovec
structures, described in
.BR iovec (3type).
.PP
.I vlen
specifies the number of elements in the array of
.I iovec
structures.
This value must be less than or equal to
.B IOV_MAX
(defined in
.I <limits.h>
or accessible via the call
.IR sysconf(_SC_IOV_MAX) ).
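.PP
As a rough illustration of the limit described above,
a caller might validate
.I vlen
before issuing the system call.
The helper name
.IR vlen_within_iov_max ()
is hypothetical and not part of any library,
and the fallback value of 16 assumes the POSIX minimum for
.BR IOV_MAX ;
this is a sketch, not a required step.
.PP
.in +4n
.EX
```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not a library function): report whether
   vlen respects the IOV_MAX limit of the running system. */
static bool
vlen_within_iov_max(size_t vlen)
{
    long limit;

    limit = sysconf(_SC_IOV_MAX);
    if (limit == -1) {
        /* The limit is indeterminate or the query failed.
           Assumption: fall back to 16, the smallest value that
           POSIX permits for IOV_MAX. */
        limit = 16;
    }

    return vlen <= (size_t) limit;
}
```
.EE
.in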
.PP
The
.I advice
argument is one of the following values:
.TP
.B MADV_COLD
See
.BR madvise (2).
.TP
.B MADV_COLLAPSE
See
.BR madvise (2).
.TP
.B MADV_PAGEOUT
See
.BR madvise (2).
.TP
.B MADV_WILLNEED
See
.BR madvise (2).
.PP
The
.I flags
argument is reserved for future use; currently, this argument must be
specified as 0.
.PP
The
.I vlen
and
.I iovec
arguments are checked before applying any advice.
If
.I vlen
is too big, or
.I iovec
is invalid,
then an error will be returned immediately and no advice will be applied.
.PP
The advice might be applied to only a part of
.I iovec
if one of its elements points to an invalid memory region in the
remote process.
No further elements will be processed beyond that point.
(See the discussion regarding partial advice in RETURN VALUE.)
.PP
.\" commit 96cfe2c0fd23ea7c2368d14f769d287e7ae1082e
Starting in Linux 5.12,
permission to apply advice to another process is governed by
a ptrace access mode
.B PTRACE_MODE_READ_FSCREDS
check (see
.BR ptrace (2));
in addition,
because of the performance implications of applying the advice,
the caller must have the
.B CAP_SYS_NICE
capability
(see
.BR capabilities (7)).
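.PP
Because glibc provides no wrapper,
a program typically defines its own.
The following is a minimal sketch of such a wrapper,
together with a helper that applies
.B MADV_COLD
to a single range.
The names
.IR my_process_madvise ()
and
.IR advise_cold ()
are hypothetical.
The fallback definitions (system call number 440 and
.B MADV_COLD
value 20) are assumptions that hold for x86-64 and the generic
kernel headers, and are used only if the installed headers lack them.
.PP
.in +4n
.EX
```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Assumption: fallbacks for older headers; the values match
   x86-64 and the generic kernel headers. */
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif
#ifndef MADV_COLD
#define MADV_COLD 20
#endif

/* Hypothetical wrapper; glibc does not provide one. */
static ssize_t
my_process_madvise(int pidfd, const struct iovec *iov, size_t vlen,
                   int advice, unsigned int flags)
{
    return syscall(SYS_process_madvise, pidfd, iov, vlen,
                   advice, flags);
}

/* Hypothetical helper: mark one address range of the target
   process as cold.  addr is an address in the target process,
   not necessarily in the caller. */
static ssize_t
advise_cold(int pidfd, void *addr, size_t len)
{
    struct iovec iov;

    iov.iov_base = addr;
    iov.iov_len = len;

    /* flags must currently be 0. */
    return my_process_madvise(pidfd, &iov, 1, MADV_COLD, 0);
}
```
.EE
.in
.PP
In this sketch,
.I pidfd
would normally be obtained from
.BR pidfd_open (2),
and a return value of \-1 means that
.I errno
describes the failure.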
.SH RETURN VALUE
On success,
.BR process_madvise ()
returns the number of bytes advised.
This return value may be less than the total number of requested bytes,
if an error occurred after some
.I iovec
elements were already processed.
The caller should compare the return value with the total number of
requested bytes to determine whether the advice was applied only partially.
.PP
On error, \-1 is returned and
.I errno
is set to indicate the error.
.SH ERRORS
.TP
.B EBADF
.I pidfd
is not a valid PID file descriptor.
.TP
.B EFAULT
The memory described by
.I iovec
is outside the accessible address space of the process referred to by
.IR pidfd .
.TP
.B EINVAL
.I flags
is not 0.
.TP
.B EINVAL
The sum of the
.I iov_len
values of
.I iovec
overflows a
.I ssize_t
value.
.TP
.B EINVAL
.I vlen
is too large.
.TP
.B ENOMEM
Could not allocate memory for internal copies of the
.I iovec
structures.
.TP
.B EPERM
The caller does not have permission to access the address space of the process
referred to by
.IR pidfd .
.TP
.B ESRCH
The target process does not exist (i.e., it has terminated and been waited on).
.PP
See
.BR madvise (2)
for
.IR advice -specific
errors.
.SH STANDARDS
Linux.
.SH HISTORY
Linux 5.10.
.\" commit ecb8ac8b1f146915aa6b96449b66dd48984caacc
.PP
Support for this system call is optional,
depending on the setting of the
.B CONFIG_ADVISE_SYSCALLS
configuration option.
.PP
When this system call first appeared in Linux 5.10,
permission to apply advice to another process was entirely governed by
a ptrace access mode
.B PTRACE_MODE_ATTACH_FSCREDS
check (see
.BR ptrace (2)).
This requirement was relaxed in Linux 5.12 so that the caller didn't require
full control over the target process.
.SH SEE ALSO
.BR madvise (2),
.BR pidfd_open (2),
.BR process_vm_readv (2),
.BR process_vm_writev (2)