bpf: Add support for sleepable tracepoint programs #11398
kernel-patches-daemon-bpf[bot] wants to merge 5 commits into bpf-next_base from
Conversation
Upstream branch: ca0f39a
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
Rework __bpf_trace_run() to support sleepable BPF programs by using
explicit RCU flavor selection, following the uprobe_prog_run() pattern.
For sleepable programs, use rcu_read_lock_trace() for lifetime
protection and add a might_fault() annotation. For non-sleepable
programs, use the regular rcu_read_lock().

Replace the combined rcu_read_lock_dont_migrate() with separate
rcu_read_lock()/migrate_disable() calls, since sleepable programs need
rcu_read_lock_trace() instead of rcu_read_lock().

Remove the preempt_disable_notrace/preempt_enable_notrace pair from the
faultable tracepoint BPF probe wrapper in bpf_probe.h, since preemption
management is now handled inside __bpf_trace_run().

This enables both BTF-based raw tracepoints (tp_btf.s) and classic raw
tracepoints (raw_tp.s) to run sleepable BPF programs when attached to
faultable tracepoints (e.g. syscall tracepoints).

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
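The enter/exit sequence described above can be sketched in plain C with all primitives stubbed (in the kernel these are the real rcu_read_lock_trace(), migrate_disable(), etc., and the sleepable flag would come from the program's load-time attribute; the logging helper here is purely illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Log of "lock" operations so the ordering can be inspected.  In the
 * kernel these would be the real RCU/migration primitives. */
static char log_buf[256];
static void op(const char *name)
{
    strcat(log_buf, name);
    strcat(log_buf, ";");
}

/* Stubs standing in for the kernel primitives named in the patch. */
static void rcu_read_lock(void)         { op("rcu_read_lock"); }
static void rcu_read_unlock(void)       { op("rcu_read_unlock"); }
static void rcu_read_lock_trace(void)   { op("rcu_read_lock_trace"); }
static void rcu_read_unlock_trace(void) { op("rcu_read_unlock_trace"); }
static void migrate_disable(void)       { op("migrate_disable"); }
static void migrate_enable(void)        { op("migrate_enable"); }
static void might_fault(void)           { op("might_fault"); }
static void run_prog(void)              { op("run_prog"); }

/* Sketch of the reworked __bpf_trace_run() control flow: sleepable
 * programs are protected by RCU-tasks-trace and may fault; regular
 * programs take normal RCU.  Migration is disabled around the program
 * run in both cases, which is why the combined
 * rcu_read_lock_dont_migrate() had to be split. */
static void bpf_trace_run_sketch(bool sleepable)
{
    if (sleepable) {
        might_fault();
        rcu_read_lock_trace();
    } else {
        rcu_read_lock();
    }
    migrate_disable();
    run_prog();
    migrate_enable();
    if (sleepable)
        rcu_read_unlock_trace();
    else
        rcu_read_unlock();
}
```
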
Add trace_call_bpf_faultable(), a variant of trace_call_bpf() for
faultable tracepoints that supports sleepable BPF programs. It uses
rcu_read_lock_trace() for lifetime protection and
bpf_prog_run_array_uprobe() for per-program RCU flavor selection,
following the uprobe_prog_run() pattern. Use the preempt-safe
this_cpu_inc_return/this_cpu_dec for the bpf_prog_active recursion
counter, since preemption is enabled in this context.

Restructure perf_syscall_enter() and perf_syscall_exit() to run the
BPF filter before perf event processing. Previously, BPF ran after the
per-cpu perf trace buffer was allocated under preempt_disable,
requiring cleanup via perf_swevent_put_recursion_context() when the
event was filtered out. Now BPF runs in faultable context before
preempt_disable, reading syscall arguments from local variables
instead of the per-cpu trace record, removing the dependency on buffer
allocation. This allows sleepable BPF programs to execute and avoids
unnecessary buffer allocation when BPF filters the event. The perf
event submission path (buffer allocation, fill, submit) remains under
preempt_disable as before.

Add an attach-time check in __perf_event_set_bpf_prog() to reject
sleepable BPF_PROG_TYPE_TRACEPOINT programs on non-syscall
tracepoints, since only syscall tracepoints run in faultable context.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
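The bpf_prog_active recursion guard mentioned above follows a standard increment-and-test shape; a minimal sketch (a plain int here, whereas the kernel uses a per-CPU counter via this_cpu_inc_return/this_cpu_dec; the function names are invented for illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Plain int stand-in for the per-CPU bpf_prog_active counter. */
static int bpf_prog_active;

/* Returns true if the BPF program may run; false means we are already
 * inside a BPF invocation and must skip to avoid recursion.  The
 * kernel uses this_cpu_inc_return(), which stays correct with
 * preemption enabled -- the point of the "preempt-safe" note above,
 * since faultable tracepoints run in preemptible context. */
static bool bpf_recursion_enter(void)
{
    if (++bpf_prog_active != 1) {
        --bpf_prog_active;
        return false;
    }
    return true;
}

static void bpf_recursion_exit(void)
{
    --bpf_prog_active;
}
```
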
Allow BPF_PROG_TYPE_RAW_TRACEPOINT, BPF_PROG_TYPE_TRACEPOINT, and
BPF_TRACE_RAW_TP (tp_btf) programs to be sleepable by adding them to
can_be_sleepable().

For BTF-based raw tracepoints (tp_btf), add a load-time check in
bpf_check_attach_target() that rejects sleepable programs attaching to
non-faultable tracepoints with a descriptive error message.

For classic raw tracepoints (raw_tp), add an attach-time check in
bpf_raw_tp_link_attach() that rejects sleepable programs on
non-faultable tracepoints. The attach-time check is needed because the
tracepoint name is not known at load time for classic raw_tp.

Replace the verbose error message that enumerates allowed program
types with a generic "program of this type cannot be sleepable"
message, since the list of sleepable-capable types keeps growing.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
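Both the load-time and the attach-time check reduce to the same predicate; a hypothetical distillation (the function name is invented here, not the kernel's; in the verifier path the rejection is accompanied by the "Sleepable program cannot attach to non-faultable tracepoint" message):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Condensed form of the checks described above: a sleepable program
 * may only attach to a faultable tracepoint.  tp_btf evaluates this at
 * load time (the target is known then); classic raw_tp can only
 * evaluate it at attach time, when the tracepoint name arrives. */
static int check_sleepable_tp_attach(bool prog_sleepable, bool tp_faultable)
{
    if (prog_sleepable && !tp_faultable)
        return -EINVAL;
    return 0;
}
```
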
Add SEC_DEF entries for sleepable tracepoint variants:

- "tp_btf.s+" for sleepable BTF-based raw tracepoints
- "raw_tp.s+" for sleepable classic raw tracepoints
- "raw_tracepoint.s+" (alias)
- "tp.s+" for sleepable classic tracepoints
- "tracepoint.s+" (alias)

Update attach_raw_tp() to recognize "raw_tp.s" and "raw_tracepoint.s"
prefixes when extracting the tracepoint name. Rewrite attach_tp() to
use a prefix array including "tp.s/" and "tracepoint.s/" variants for
proper section name parsing.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
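The prefix-array parsing in attach_tp() can be sketched as follows (a simplified stand-in, not libbpf's actual code; the helper name is invented):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Given an ELF section name, strip a recognized tracepoint prefix and
 * return the "<category>/<name>" part, or NULL if no prefix matches.
 * Mirrors the prefix-array approach described above; the ".s" variants
 * mark sleepable programs and must be listed alongside the plain
 * forms. */
static const char *tp_name_from_sec(const char *sec)
{
    static const char * const prefixes[] = {
        "tp.s/", "tracepoint.s/", "tp/", "tracepoint/",
    };
    size_t i;

    for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
        size_t n = strlen(prefixes[i]);

        if (strncmp(sec, prefixes[i], n) == 0)
            return sec + n;
    }
    return NULL;
}
```
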
Add functional tests for sleepable tracepoint programs that attach to
the nanosleep syscall and use bpf_copy_from_user() to read user memory:
- tp_btf: BTF-based raw tracepoint using SEC("tp_btf.s/sys_enter")
with PT_REGS_PARM1_SYSCALL (non-CO-RE macro for BTF programs).
- classic: Classic raw tracepoint using SEC("raw_tp.s/sys_enter")
with PT_REGS_PARM1_CORE_SYSCALL (CO-RE macro needed for classic).
- tracepoint: Classic tracepoint using
SEC("tp.s/syscalls/sys_enter_nanosleep") receiving
struct syscall_trace_enter with direct access to args[].
Add a negative test (test_sleepable_raw_tp_fail) that verifies
sleepable programs are rejected on non-faultable tracepoints
(sched_switch).
Update verifier/sleepable.c tests:
- Add "sleepable raw tracepoint accept" test for sys_enter.
- Rename reject test and update error message to match the new
descriptive "Sleepable program cannot attach to non-faultable
tracepoint" message.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
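The tp_btf variant described above would look roughly like this (a sketch only, compiled for the BPF target rather than the host; the struct, variable, and function names here are assumptions, not the selftest's actual source):

```c
// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

long nsec_seen; /* written by the program, read back by the test */

/* Sleepable BTF-based raw tracepoint on sys_enter: pull the first
 * syscall argument (for nanosleep, the user timespec pointer) out of
 * pt_regs with the non-CO-RE PT_REGS_PARM1_SYSCALL macro, then read
 * user memory with bpf_copy_from_user() -- only legal because the
 * ".s" suffix makes the program sleepable and the syscall tracepoint
 * is faultable. */
SEC("tp_btf.s/sys_enter")
int BPF_PROG(handle_sys_enter, struct pt_regs *regs, long id)
{
    struct __kernel_timespec ts;
    void *uptr = (void *)PT_REGS_PARM1_SYSCALL(regs);

    if (bpf_copy_from_user(&ts, sizeof(ts), uptr) == 0)
        nsec_seen = ts.tv_nsec;
    return 0;
}

char _license[] SEC("license") = "GPL";
```

The classic raw_tp variant would differ only in its section name ("raw_tp.s/sys_enter") and in needing the CO-RE macro PT_REGS_PARM1_CORE_SYSCALL, as noted in the test list above.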
Pull request for series with
subject: bpf: Add support for sleepable tracepoint programs
version: 4
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1066584