[DTrace-devel] [PATCH v2 dtrace 1/3] fbt: support "."-suffixed functions for kprobes

Kris Van Hees kris.van.hees at oracle.com
Fri Oct 11 20:17:13 UTC 2024


On Fri, Oct 11, 2024 at 05:23:48PM +0100, Alan Maguire wrote:
> On 11/10/2024 16:00, Kris Van Hees wrote:
> > On Fri, Oct 11, 2024 at 11:28:08AM +0100, Alan Maguire wrote:
> >> gcc adds suffixes when it carries out optimizations, but often
> >> these leave parameters to functions intact.  Many of these functions
> >> (like finish_task_switch()) are important for tracing (and they
> >> are present in available_filter_functions so are traceable) so it is
> >> valuable to support probing them.  For kprobes, all that is needed
> >> is to ensure that the event name does not contain a ".".
> > 
> > The challenge here is that not everything that is present in
> > available_filter_functions is actually traceable.  I.e. anything with a
> > compiler-optimization-generated name (*.*) cannot be traced with fprobes.
> > Also, at least with older kernels, my testing has shown that not all that
> > is listed in that file can even be traced with kprobes (probably due to a
> > kernel bug that has since been fixed).
> 
> Yeah the results of wildcarding on all fbt functions are pretty varied
> unfortunately. I wonder if there's a way we could come up with a test
> and feed back the results upstream; maybe a few more functions just need
> to be annotated notrace. Unfortunately it's such a fast-moving target..
> 
> The logic in this patch is coupled with that in the next in a way; so we
> present fprobes for everything in available_filter_functions, and if any
> probes are _used_ that do not have a BTF representation, we can then
> fall back to kprobes. Previously I just added "."-suffixed functions as
> kprobes initially, but that then raises the issue of intermixing of
> kprobe- and fprobe-backed fbt at a clause.

Yes, but my point is that this is the function that creates the probes that
are listed when you do dtrace -lP fbt, so by not filtering out the names
with "." in them, you end up creating probe entries for things that we will
not be able to probe at all.  We already know we cannot provide fprobe-based
probes for everything in available_filter_functions, so pretending to the user
that we can would be a bad user experience.

> > This leaves us with an interesting challenge.  Since we prefer fprobes
> > because they are faster (and give us argument type info), using the content
> > of that file indiscriminately is not a good idea because it will cause us
> > to list probes that cannot be used.  On the other hand, if probes with "."
> > in their names *can* be probed using kprobe, then it would certainly be
> > beneficial to make use of that.
> > 
> > Now, since the .-names are compiler generated we do not necessarily need to
> > list those as probes.  We could just list the base function *if* we can
> > determine (for FBT) which suffixes refer to variants that actually constitute
> > a valid alternative (where function entry and return are valid representations
> > of the original function).
> >
> 
> Right, we could eliminate for example .cold. suffixes and .part.
> suffixes as these do not represent function boundaries; we'd want to
> keep .isra and .constprop as these do correspond to function boundaries.
> As you say, after doing that, it might be better to represent the probe
> to the user without the suffix, as this would mean they could trace the
> function in a stable manner regardless of what the compiler did for a
> specific kernel.
> 
> >> Signed-off-by: Alan Maguire <alan.maguire at oracle.com>
> >> ---
> >>  libdtrace/dt_prov_fbt.c | 31 +++++++++++++++++++++----------
> >>  1 file changed, 21 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c
> >> index 21f63ddf..0ba0fb71 100644
> >> --- a/libdtrace/dt_prov_fbt.c
> >> +++ b/libdtrace/dt_prov_fbt.c
> >> @@ -106,10 +106,6 @@ static int populate(dtrace_hdl_t *dtp)
> >>  				p++;
> >>  		}
> >>  
> >> -		/* Weed out synthetic symbol names (that are invalid). */
> >> -		if (strchr(buf, '.') != NULL)
> >> -			continue;
> >> -
> >>  #define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0)
> >>  		/* Weed out __ftrace_invalid_address___* entries. */
> >>  		if (strstarts(buf, "__ftrace_invalid_address__") ||
> >> @@ -363,7 +359,7 @@ static int kprobe_trampoline(dt_pcb_t *pcb, uint_t exitlbl)
> >>  static int kprobe_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
> >>  {
> >>  	if (!dt_tp_probe_has_info(prp)) {
> >> -		char	*fn;
> >> +		char	*fn, *event, *s;
> >>  		FILE	*f;
> >>  		size_t	len;
> >>  		int	fd, rc = -1;
> >> @@ -376,22 +372,37 @@ static int kprobe_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
> >>  		if (fd == -1)
> >>  			return -ENOENT;
> >>  
> >> +		if (asprintf(&event, "%s", prp->desc->fun) < 0)
> >> +			return -ENOMEM;
> >> +		/* replace possible "." in event name as "." is not supported
> >> +		 * in trace event names, and function could be named foo.isra.0
> >> +		 */
> >> +		for (s = strchr(event, '.'); s != NULL && *s != '\0'; s++) {
> >> +			if (*s == '.')
> >> +				*s = '_';
> >> +		}
> >> +
> >>  		rc = dprintf(fd, "%c:" FBT_GROUP_FMT "/%s %s\n",
> >>  			     prp->desc->prb[0] == 'e' ? 'p' : 'r',
> >> -			     FBT_GROUP_DATA, prp->desc->fun, prp->desc->fun);
> >> +			     FBT_GROUP_DATA, event, prp->desc->fun);
> >>  		close(fd);
> >> -		if (rc == -1)
> >> +		if (rc == -1) {
> >> +			free(event);
> >>  			return -ENOENT;
> >> +		}
> >>  
> >>  		/* create format file name */
> >>  		len = snprintf(NULL, 0, "%s" FBT_GROUP_FMT "/%s/format",
> >> -			       EVENTSFS, FBT_GROUP_DATA, prp->desc->fun) + 1;
> >> +			       EVENTSFS, FBT_GROUP_DATA, event) + 1;
> >>  		fn = dt_alloc(dtp, len);
> >> -		if (fn == NULL)
> >> +		if (fn == NULL) {
> >> +			free(event);
> >>  			return -ENOENT;
> >> +		}
> >>  
> >>  		snprintf(fn, len, "%s" FBT_GROUP_FMT "/%s/format", EVENTSFS,
> >> -			 FBT_GROUP_DATA, prp->desc->fun);
> >> +			 FBT_GROUP_DATA, event);
> >> +		free(event);
> >>  
> >>  		/* open format file */
> >>  		f = fopen(fn, "r");
> >> -- 
> >> 2.43.5
> >>



More information about the DTrace-devel mailing list