[DTrace-devel] [PATCH v3 dtrace 0/4] kprobe support for .isra.0, sched fix
Kris Van Hees
kris.van.hees at oracle.com
Fri Oct 18 15:41:34 UTC 2024
Before I review further, I have a question... Do we need to consider the
<func>.<suffix> symbols as separate probes from <func> (at a user level),
or can we group them together? I am hoping that grouping them together
would be the preference, if only because the suffixed versions result from
compiler optimizations. A user would therefore likely want to probe <func>
and expect it to work even if the compiler decided to do something under
the covers that also creates a suffixed variant.
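
The grouping asked about here could be sketched roughly as follows. This is
only an illustration: probe_func_name() is a hypothetical helper, not a
function from libdtrace or from this series, and it assumes that everything
from the first '.' onward is a compiler-generated suffix.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch series): return a newly
 * allocated copy of a kernel symbol name with any "."-suffix removed,
 * e.g. "finish_task_switch.isra.0" -> "finish_task_switch".
 * Names without a '.' are copied unchanged.  The caller frees the result.
 */
static char *
probe_func_name(const char *sym)
{
	const char	*dot = strchr(sym, '.');
	size_t		len = dot ? (size_t)(dot - sym) : strlen(sym);
	char		*name = malloc(len + 1);

	if (name == NULL)
		return NULL;

	memcpy(name, sym, len);
	name[len] = '\0';
	return name;
}
```

With a mapping like this, the user-visible probe would carry the unsuffixed
name, while the full symbol name would still have to be kept somewhere for
the actual attach.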
Kris
On Wed, Oct 16, 2024 at 04:54:05PM +0100, Alan Maguire wrote:
> This series is focused on solving a few issues with fprobe-based
> attachment which prevent us from attaching to functions
> like finish_task_switch.isra.0. Such functions are present in
> available_filter_functions, and represent real function boundaries
> (since they correspond to the mcount function boundary sites)
> but because they either lack BTF representations, or because
> those BTF representations are named without the .isra suffix, attach
> via fentry/fexit is currently impossible. Falling back to the
> kprobe implementation is the best solution here.
>
> However, for stability, it is best to represent the probes for
> these functions without the ".isra" suffix, so we need to store
> the full function name (with suffix) in the tracepoint data when
> the probe is populated. Patch 1 supports this.
>
> Patch 2 ensures that we use the kprobe implementation for any
> "."-suffixed functions. An additional fbt provider with a kprobe
> implementation is created to support this (so as not to disturb
> existing fprobes for other functions). At kprobe attach we use the
> full function name stored as tp event data to carry out the attach.
>
> Next we need to ensure we do not end up with a mix of kprobes and
> fprobes. Ideally we would do this in a more fine-grained manner, but
> for now just ensure we do not have an fprobe/kprobe mix program-wide.
> When fprobes are active, we will only use kprobes for "."-suffixed
> functions that are used, so in practice such mixes will be relatively
> rare.
>
> As Kris pointed out [1], trampolines have not yet been set up at
> compilation time, so we can replace the provider underlying fbt at that
> time.
> The probe_info() callbacks are used to check for a mix of kprobe and
> fprobe implementations; we check for multiple fbt providers which
> have a count of used probes > 0; if this occurs, switch the fbt provider
> using fprobe to use the kprobe implementation and reset any event
> ids associated with fprobes from the BTF id used in fprobes to 0.
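
The check described in the paragraph above can be sketched as follows. This
is a minimal illustration under stated assumptions, not the code from patch
3: the types and the names struct fbt_prov, struct probe and resolve_mix()
are all made up for the example and do not exist in libdtrace.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the libdtrace types. */
enum impl { IMPL_FPROBE, IMPL_KPROBE };

struct probe {
	int	event_id;	/* BTF id when the probe is fprobe-backed */
	int	used;		/* non-zero if the program uses this probe */
};

struct fbt_prov {
	enum impl	impl;
	struct probe	*probes;
	size_t		nprobes;
};

/* Count the probes of one provider that the program actually uses. */
static size_t
used_count(const struct fbt_prov *p)
{
	size_t	i, n = 0;

	for (i = 0; i < p->nprobes; i++)
		if (p->probes[i].used)
			n++;
	return n;
}

/*
 * If more than one provider has used probes, switch every fprobe-based
 * provider to the kprobe implementation and reset its event ids to 0.
 * Returns 1 if at least one provider was switched, 0 otherwise.
 */
static int
resolve_mix(struct fbt_prov *provs, size_t nprovs)
{
	size_t	i, j, active = 0;
	int	switched = 0;

	for (i = 0; i < nprovs; i++)
		if (used_count(&provs[i]) > 0)
			active++;
	if (active < 2)
		return 0;

	for (i = 0; i < nprovs; i++) {
		if (provs[i].impl != IMPL_FPROBE)
			continue;
		provs[i].impl = IMPL_KPROBE;
		for (j = 0; j < provs[i].nprobes; j++)
			provs[i].probes[j].event_id = 0;
		switched = 1;
	}
	return switched;
}
```

When only one provider has used probes, nothing changes, so a program that
never touches a "."-suffixed function keeps its fprobes.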
>
> Finally we can then use fbt::finish_task_switch:return as the
> dependent probe for sched:::on-cpu, as we now can probe it even
> if it becomes finish_task_switch.isra.0.
>
> So to recap:
>
> Patch 1 supports storing/freeing event data with tp events.
> Patch 2 allows tracing of "."-suffixed functions like
> finish_task_switch.isra.0 via a kprobe-backed fbt implementation.
> Patch 3 ensures we do not end up with a kprobe/fprobe mix.
> Patch 4 then uses the fact we can now trace "."-suffixed functions
> (with kprobe fallback) by using fbt:vmlinux:finish_task_switch:return
> as the kprobe dependent event for sched:::on-cpu . This function is
> often optimized to become finish_task_switch.isra.0.
>
> Tested on upstream, 5.15 and 5.4 kernels.
>
> Changes since v2:
>
> - probe function name exposed drops the suffix (Kris, patches 1, 2)
> - restrict kprobe use to "."-suffixed functions; this makes their use
>   less likely in the fprobe environment. Do this instead of creating
>   a "fake" fprobe probe with kprobe backing (Kris, patch 2)
> - modify fallback logic to handle kprobe/fprobe mix (patch 3)
> - modify sched:::on-cpu to use fbt::finish_task_switch:return ; no
>   wildcard needed now that probe function name is unsuffixed.
>
> Changes since v1:
>
> - simplified approach by just swapping out probe impl when BTF lookup fails
>   (Kris, patch 2)
>
> [1] https://lore.kernel.org/dtrace/20241009140236.883884-1-alan.maguire@oracle.com/
>
> Alan Maguire (4):
>   dt_provider_tp: add optional event data, freed on tp free
>   fbt: support "."-suffixed functions for kprobes
>   fbt: avoid mix of kprobe, fprobe implementations for used probes
>   sched: fix on-cpu firing for kernels < 5.16
>
> libdtrace/dt_prov_fbt.c | 138 ++++++++++++++++++++++++++++++++-----
> libdtrace/dt_prov_sched.c | 23 +------
> libdtrace/dt_provider_tp.c | 27 ++++++++
> libdtrace/dt_provider_tp.h | 8 +++
> 4 files changed, 158 insertions(+), 38 deletions(-)
>
> --
> 2.43.5
>