[DTrace-devel] [PATCH 09/15] provider: Implement a rawtp provider

Nick Alcock nick.alcock at oracle.com
Thu Feb 23 13:01:38 UTC 2023


On 23 Feb 2023, Kris Van Hees via DTrace-devel outgrape:

> BPF allows attaching to raw forms of tracepoints, which allows access
> to the raw arguments.  (Regular tracepoints often expose translated
> arguments based on the raw arguments that were passed in the tracepoint
> call.)
>
> Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>

Reviewed-by: Nick Alcock <nick.alcock at oracle.com>

with much YESS!ing.

A few comments below though. Mostly for later, but the one about error
handling in probe_info() probably should actually get fixed sooner
rather than later. It's a really confusing problem waiting to happen.

> +/*
> + * The PROBE_LIST file lists all tracepoints in a <group>:<name> format.
> + * We need to ignore these groups:
> + *   - GROUP_FMT (created by DTrace processes)
> + *   - kprobes and uprobes
> + *   - syscalls (handled by a different provider)
> + *   - pid and usdt probes (ditto)
> + */

What if there are other providers of tracepoints on the system? We
should really ignore all uprobes created by anyone that isn't us, but
just as we named ours differently, so too can they... I don't see an
obvious way around this, wich presuambly means you could set up a pair
of warring tracers each madly tracing the other one :)

> +static int populate(dtrace_hdl_t *dtp)
> +{
> +	dt_provider_t	*prv;
> +	FILE		*f;
> +	char		*buf = NULL;
> +	char		*p;
> +	size_t		n;
> +
> +	prv = dt_provider_create(dtp, prvname, &dt_rawtp, &pattr);
> +	if (prv == NULL)
> +		return 0;
> +
> +	f = fopen(PROBE_LIST, "r");
> +	if (f == NULL)
> +		return 0;
> +
> +	while (getline(&buf, &n, f) >= 0) {

*cheer*

> +		p = strchr(buf, '\n');
> +		if (p)
> +			*p = '\0';
> +
> +		p = strchr(buf, ':');
> +		if (p != NULL) {
> +			int	dummy;
> +			char	*str;
> +
> +			*p++ = '\0';
> +
> +			if (sscanf(buf, GROUP_SFMT, &dummy, &str) == 2) {

Not entirely sure that the use of #defines for format strings here (and
in dt_prov_sdt.c) is buying us anything other than potential future
bugs: it's bad enough with printf but used with scanf it's just
terrifying. I'm shuddering inwardly at the idea of separating format
strings and format arguments like this. Maybe the whole sscanf should be
bundled up in a shared function in the future instead...

> +	/*
> +	 * After the dt_cg_tramp_prologue() call, we have:
> +	 *				//     (%r7 = dctx->mst)
> +	 *				//     (%r8 = dctx->ctx)
> +	 */

Nice comment!

> +static int probe_info(dtrace_hdl_t *dtp, const dt_probe_t *prp,
> +		      int *argcp, dt_argdesc_t **argvp)
> +{
> +	int		argc, i;
> +	dt_argdesc_t	*argv = NULL;
> +
> +	/*
> +	 * This is an unfortunate necessity.  The BPF verifier will not allow
> +	 * us to access more argument values than are passed to the raw
> +	 * tracepoint but the number of argument values for any given raw
> +	 * tracepoint is not made available to userspace.  So we use a trial
> +	 * and error loop to see what the BPF verifier accepts.

EW! :) but also, cunning.

> +	 */
> +	for (argc = ARRAY_SIZE(((dt_mstate_t *)0)->argv); argc > 0; argc--) {
> +		int		bpf_fd, rtp_fd;
> +		struct bpf_insn	prog[2];
> +		dtrace_difo_t	dif;
> +
> +		prog[0] = BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_1, (argc - 1) * sizeof(uint64_t));
> +		prog[1] = BPF_RETURN();
> +		dif.dtdo_buf = prog;
> +		dif.dtdo_len = 2;
> +
> +		bpf_fd = dt_bpf_prog_load(dt_rawtp.prog_type, &dif, 0, NULL, 0);
> +		if (bpf_fd == -1)
> +			continue;
> +		rtp_fd = dt_bpf_raw_tracepoint_open(prp->desc->prb, bpf_fd);
> +		close(bpf_fd);
> +		if (rtp_fd == -1)
> +			continue;
> +		close(rtp_fd);

... I suppose we could at least check for EPERM? It's a shame there's no
immediate way to get a notification that this specifically got a
verifier failure, let alone what the failure was.

> +		break;
> +	}
> +
> +	if (argc == 0)
> +		goto done;

... so if we can't find any number of valid args, we assume it zero and
silently keep going? We should probably also check for success in the
the zero-args case, and if that still fails, raise an actual error more
loudly, since something else is clearly wrong.

> +	argv = dt_zalloc(dtp, argc * sizeof(dt_argdesc_t));
> +
> +	for (i = 0; i < argc; i++) {
> +		argv[i].mapping = i;
> +		argv[i].native = "uint64_t";
> +		argv[i].xlate = NULL;
> +	}
> +
> +done:
> +        *argcp = argc;
> +        *argvp = argv;
> +
> +        return 0;
> +}

(There really should be at least some error case here.)

-- 
NULL && (void)



More information about the DTrace-devel mailing list