[DTrace-devel] [PATCH 4/4] uprobe: Implement PID-specific uprobes
Kris Van Hees
kris.van.hees at oracle.com
Wed Nov 12 00:23:57 UTC 2025
On Tue, Nov 11, 2025 at 06:19:26PM -0500, Eugene Loh wrote:
> First, I had been under the impression there might be some test suite
> changes. New tests? Something like that?
I am working on that but wanted to get the code out for review in the meantime.
I am adding a few unit tests for USDT post-start discovery because the only
tests that currently exist are rather complex and can fail for various reasons,
which means that we do not have any that test the very basics of the machinery
(the pre-existing machinery as well).
Although, even without adding tests, this patch could be reviewed and be given
R-b simply because we do have proper tests for discovered USDT probes already,
and the fact that they pass is a valid test for this functionality.
But I hope to be able to add a few tests in the v2 so that the unit tests make
it into the testsuite, anyway. That may come in handy for future detection of
regressions in the fundamental logic of the discovery mechanism.
> On 11/10/25 10:27, Kris Van Hees via DTrace-devel wrote:
> > The mechanism to create uprobes by writing to $TRACEFS/uprobe_events
> > caused probes to be placed in the dev/inode based mapping. This means
> > that all tasks that use that mapping are be subject to the probes
>
> s/be//?
Thanks.
> > firing.
> >
> > The kernel supports placing uprobes for a specific task (by PID), which
> > avoids impacting all other tasks that share the same code but are not
> > the target of the tracing.
> >
> > This new mechanism places uprobes using the perf_event_open interface.
> > Perf event attribute configuration data is read from
> > /sys/bus/event_source/devices/uprobe/ as needed (and cached to ease
> > repeated use). Underlying probes are now organized by PID-specific
> > providers (uprobe$PID), and attach/detach no longer depends on the
> > generic tracepoint support.
> >
> > The usdt_prids BPF map is no longer needed because USDT BPF programs
> > are now task-specific. The trampoline generation for USDT Probes
> > discovered after tracing started can now perform a simple loop over
> > all compiled clauses, adding those that match the probe description
> > to the program.
>
> Should we remove:
> libdtrace/dt_dlibs.c: DT_BPF_SYMBOL(usdt_prids, DT_IDENT_PTR),
Yes, thanks.
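For illustration only, the "simple loop over all compiled clauses" from the
commit message could look roughly like the sketch below. This is not the patch
code: probe_name_t, elem_match() and select_clauses() are made-up names, and
POSIX fnmatch() merely stands in for dt_gmatch().

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical probe description: provider, module, function, name. */
typedef struct {
	const char *prv, *mod, *fun, *prb;
} probe_name_t;

/* Match one element; an empty pattern is treated as "match anything". */
static int
elem_match(const char *pat, const char *str)
{
	return pat[0] == '\0' || fnmatch(pat, str, 0) == 0;
}

/*
 * Store the indices of all clauses whose description matches the newly
 * discovered probe in out[] and return how many were found.  The caller
 * must provide room for n entries in out[].
 */
static size_t
select_clauses(const probe_name_t *clauses, size_t n,
	       const probe_name_t *probe, size_t *out)
{
	size_t i, cnt = 0;

	assert(clauses != NULL && probe != NULL && out != NULL);

	for (i = 0; i < n; i++) {
		const probe_name_t *c = &clauses[i];

		if (elem_match(c->prv, probe->prv) &&
		    elem_match(c->mod, probe->mod) &&
		    elem_match(c->fun, probe->fun) &&
		    elem_match(c->prb, probe->prb))
			out[cnt++] = i;
	}

	return cnt;
}
```

Because each program is now tied to a single task, a plain linear scan like
this is enough; no per-PID lookup table is needed to decide which clauses
apply.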
> > diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
> > @@ -974,19 +974,12 @@ gmap_create_probes(dtrace_hdl_t *dtp)
> > }
> > /*
> > - * Create the 'usdt_names' and 'usdt_prids' BPF maps.
> > + * Create the 'usdt_names' BPF maps.
>
> s/maps/map/
Thanks.
> > *
> > * 'usdt_names': a global hash map indexed by PRID and whose value has probe
> > * name elements at fixed offsets within the value. This map
> > * is used for get_bvar() to look up probe name elements for
> > * any prid that was created after dtrace_go().
> > - *
> > - * 'usdt_prids': a global hash map indexed by (pid, underlying probe ID).
> > - * The value is a probe ID for the overlying USDT probe and
> > - * a bit mask indicating which clauses to execute for this pid.
> > - *
> > - * For a given (pid, PRID) key, there can be at most one
> > - * overlying USDT probe.
> > */
> > static int
> > gmap_create_usdt(dtrace_hdl_t *dtp)
> > diff --git a/libdtrace/dt_program.c b/libdtrace/dt_program.c
> > @@ -20,21 +20,6 @@
> > #include <dt_probe.h>
> > #include <dt_bpf.h>
> > -int
> > -dt_stmt_clsflag_set(dtrace_stmtdesc_t *stp, int flags) {
> > - stp->dtsd_clauseflags |= flags;
> > -
> > - return 0;
> > -}
> > -
> > -int
> > -dt_stmt_clsflag_test(dtrace_stmtdesc_t *stp, int flags) {
> > - if (stp->dtsd_clauseflags & flags)
> > - return 1;
> > -
> > - return 0;
> > -}
> > -
>
> Okay, but then in dt_program.h get rid of
> dt_program.h:extern int dt_stmt_clsflag_set(dtrace_stmtdesc_t *stp, int
> flags);
> dt_program.h:extern int dt_stmt_clsflag_test(dtrace_stmtdesc_t *stp, int
> flags);
Correct - thanks.
> > dtrace_prog_t *
> > dt_program_create(dtrace_hdl_t *dtp)
> > {
> > diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
> > @@ -316,11 +319,71 @@ dt_provimpl_t dt_pid;
> > static int populate(dtrace_hdl_t *dtp)
> > {
> > + uprobe_data_t *udp = dt_alloc(dtp, sizeof(uprobe_data_t));
> > +
> > + udp->perf_type = -1; /* not initialized */
> > + udp->ret_flag = -1; /* not initialized */
> > + udp->ref_shift = -1; /* not initialized */
> > +
> > if (dt_provider_create(dtp, dt_uprobe.name, &dt_uprobe, &pattr,
> > - NULL) == NULL ||
> > - dt_provider_create(dtp, dt_pid.name, &dt_pid, &pattr,
> > + udp) == NULL)
> > + return -1;
> > +
> > + if (dt_provider_create(dtp, dt_pid.name, &dt_pid, &pattr,
> > NULL) == NULL ||
> > dt_provider_create(dtp, dt_stapsdt.name, &dt_stapsdt, &pattr,
> > NULL) == NULL)
>
> Why is create(dt_uprobe) being split off the other two create()s? This makes
> both the delta and the resulting code (a tiny bit) more complex.
I want to highlight the difference that the first one creates a provider with
provider-specific private data.
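To make the role of that private data a bit more concrete: the -1 values mark
fields that have not been read yet, so the sysfs lookup only has to succeed
once. A rough sketch of that idea follows. It is not the patch code:
uprobe_cfg_t, parse_config_bit(), read_config_bit() and get_ret_flag() are
hypothetical names, and the sketch assumes the sysfs files hold strings of the
form "config:N" or "config:N-M".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the provider-private data; -1 = not read yet. */
typedef struct {
	int	ret_flag;	/* bit number of the return-probe flag */
	int	ref_shift;	/* shift for the reference counter offset */
} uprobe_cfg_t;

/*
 * Parse "config:N" or "config:N-M" and return N (the lowest bit), or -1
 * if the string does not have that form.
 */
static int
parse_config_bit(const char *s)
{
	int	lo;

	assert(s != NULL);

	if (sscanf(s, "config:%d", &lo) != 1 || lo < 0 || lo > 63)
		return -1;

	return lo;
}

/* Read one configuration file and return its low bit, or -1 on failure. */
static int
read_config_bit(const char *path)
{
	char	buf[64];
	FILE	*fp = fopen(path, "r");
	int	rc = -1;

	if (fp == NULL)
		return -1;
	if (fgets(buf, sizeof(buf), fp) != NULL)
		rc = parse_config_bit(buf);
	fclose(fp);

	return rc;
}

/*
 * Return the cached return-probe bit.  The file is only read while the
 * cached value is still -1, so a failed read is retried on the next call.
 */
static int
get_ret_flag(uprobe_cfg_t *cfg, const char *path)
{
	if (cfg->ret_flag == -1)
		cfg->ret_flag = read_config_bit(path);

	return cfg->ret_flag;
}
```

A caller would presumably use such a value to set the matching bit in the perf
event attribute before calling perf_event_open() for the target PID; that part
is left out here because it needs privileges and a live process.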
> > @@ -401,182 +474,57 @@ static void probe_disable(dtrace_hdl_t *dtp, dt_probe_t *prp)
> > -/*
> > - * Judge whether clause "n" could ever be called as a USDT probe
> > - * for this underlying probe. We can pass uprp==NULL to see if
> > - * the clause can be excluded for every probe.
> > - */
> > static int
> > -ignore_clause(dtrace_hdl_t *dtp, int n, const dt_probe_t *uprp)
> > +clean_usdt_probes(dtrace_hdl_t *dtp)
> > {
> > - dtrace_stmtdesc_t *stp = dtp->dt_stmts[n];
> > - dtrace_probedesc_t *pdp = &stp->dtsd_ecbdesc->dted_probe;
> > + int fdnames = dtp->dt_usdt_namesmap_fd;
> > + uint32_t key, nxt;
> > + del_list_t dlist = { 0, };
> > + del_list_t *del, *ndel;
> > + dt_probe_t *prp;
> > - if (stp == NULL)
> > - return 1;
> > + /* Initialize key to a probe id that cannot be found. */
> > + key = DTRACE_IDNONE;
> > - /*
> > - * Some clauses could never be called for a USDT probe,
> > - * regardless of the underlying probe uprp. Cache this
> > - * status in the clause flags for dt_stmts[n].
> > - */
> > - if (dt_stmt_clsflag_test(stp, DT_CLSFLAG_USDT_INCLUDE | DT_CLSFLAG_USDT_EXCLUDE) == 0) {
> > - size_t len = strlen(pdp->prv);
> > + /* Loop over usdt_names entries. */
> > + while (dt_bpf_map_next_key(fdnames, &key, &nxt) == 0) {
> > + dtrace_probedesc_t pd = { 0, };
> > - /*
> > - * If the last char in the provider description is
> > - * neither '*' nor a digit, it cannot be a USDT probe.
> > - */
> > - if (len > 1) {
> > - char lastchar = (pdp->prv[0] != '\0' ? pdp->prv[len - 1] : '*');
> > -
> > - if (lastchar != '*' && !isdigit(lastchar)) {
> > - dt_stmt_clsflag_set(stp, DT_CLSFLAG_USDT_EXCLUDE);
> > - return 1;
> > - }
> > - }
> > + key = nxt;
> > + pd.id = key;
> > /*
> > - * If the provider description is "pid[0-9]*", it
> > - * is a pid probe, not USDT.
> > + * If the probe exists (as it should), and the process exists,
> > + * we should keep it.
> > */
> > - if (strncmp(pdp->prv, "pid", 3) == 0) {
> > - int i, l = strlen(pdp->prv);
> > -
> > - for (i = 3; i < l; i++)
> > - if (!isdigit((pdp->prv[i])))
> > - break;
> > + prp = dt_probe_lookup(dtp, &pd);
> > + if (prp != NULL) {
> > + list_probe_t *pup = prp->prv_data;
> > + dt_uprobe_t *upp = pup->probe->prv_data;
> > - if (i == l) {
> > - dt_stmt_clsflag_set(stp, DT_CLSFLAG_USDT_EXCLUDE);
> > - return 1;
> > - }
> > + if (Pexists(upp->pid))
> > + continue;
> > }
> > - /* Otherwise, it is possibly a USDT probe. */
> > - dt_stmt_clsflag_set(stp, DT_CLSFLAG_USDT_INCLUDE);
> > + /* Add the key and probe to the delete list. */
> > + del = dt_zalloc(dtp, sizeof(del_list_t));
> > + del->probe = prp;
> > + dt_list_append((dt_list_t *)&dlist, del);
> > }
> > - if (dt_stmt_clsflag_test(stp, DT_CLSFLAG_USDT_EXCLUDE) == 1)
> > - return 1;
> > - if (uprp == NULL)
> > - return 0;
> > - /*
> > - * If we cannot ignore this statement, try to use uprp.
> > - */
> > -
> > - /* We know what function we're in. It must match the probe description (unless "-"). */
> > - if (strcmp(pdp->fun, "-") != 0) {
> > - dt_uprobe_t *upp = uprp->prv_data;
> > + /* Really delete entries from usdt_names. */
> > + for (del = dt_list_next(&dlist); del != NULL; del = ndel) {
> > + ndel = dt_list_next(del);
> > + prp = del->probe;
> > - assert(upp->func); // never a return probe
> > - if (!dt_gmatch(upp->func, pdp->fun))
> > - return 1;
> > + dt_bpf_map_delete(fdnames, &prp->desc->id);
> > + probe_disable(dtp, prp);
> > + free(del);
>
> Okay. Just out of curiosity, under what conditions does one use dt_free()?
Thanks - this is indeed a case where it should be used.
In all, we should probably discontinue the use of dt_* alloc/free functions
soon, because the rationale behind them was carried over from Solaris days,
and we are way past that (and I do not believe there is any use for it).
But for consistency, I will change this to dt_free(dtp, del); because that is
actually what should be used when the allocation was with a dt_*() function.
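As a generic illustration of that pairing rule, together with the
collect-then-delete pattern used in the hunk above, here is a small sketch.
None of it is libdtrace code: my_zalloc()/my_free() merely stand in for a
dt_zalloc()/dt_free()-style pair, the "map" is a plain array, and is_odd() is
just an example predicate.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator pair; memory from my_zalloc() goes to my_free(). */
static void *my_zalloc(size_t sz) { return calloc(1, sz); }
static void my_free(void *p) { free(p); }

typedef struct del_node {
	struct del_node	*next;
	int		slot;	/* index of the entry to delete */
} del_node_t;

/* Example predicate: treat odd values as stale. */
static int is_odd(int v) { return v & 1; }

/*
 * Clear every entry of map[] (0 = empty) for which stale() returns
 * non-zero.  Stale entries are first collected on a list and only deleted
 * once the walk is over, so the map is never modified while it is being
 * walked.  Returns the number of deleted entries, or -1 on allocation
 * failure (in which case nothing is deleted).
 */
static int
purge_stale(int *map, int n, int (*stale)(int))
{
	del_node_t	*head = NULL, *del, *ndel;
	int		i, cnt = 0, rc = 0;

	assert(map != NULL && stale != NULL);

	/* Pass 1: collect. */
	for (i = 0; i < n; i++) {
		if (map[i] == 0 || !stale(map[i]))
			continue;

		del = my_zalloc(sizeof(del_node_t));
		if (del == NULL) {
			rc = -1;
			break;
		}
		del->slot = i;
		del->next = head;
		head = del;
	}

	/* Pass 2: delete (unless pass 1 failed) and release the nodes. */
	for (del = head; del != NULL; del = ndel) {
		ndel = del->next;
		if (rc == 0) {
			map[del->slot] = 0;
			cnt++;
		}
		my_free(del);	/* matches my_zalloc(), not a bare free() */
	}

	return rc == 0 ? cnt : -1;
}
```

Keeping allocation and release behind the same pair of functions means the
pair can later be changed (or dropped) in one place.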
> > }
> > return 0;
> > @@ -1508,95 +1400,73 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
> > /*
> > * USDT.
> > */
> > - /* In some cases, we know there are no USDT probes. */ // FIXME: add more checks
> > - if (upp->flags & PP_IS_RETURN)
> > - goto out;
> > -
> > + /*
> > + * First check whether the USDT probe is active, i.e. its probe ID is
> > + * in the usdt_names BPF map. If not, ignore it for now.
> > + */
> > + emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0), usdtp->desc->id));
> > + dt_cg_xsetx(dlp, usdt_names, DT_LBL_NONE, BPF_REG_1, usdt_names->di_id);
> > + emit(dlp, BPF_MOV_REG(BPF_REG_2, BPF_REG_FP));
> > + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, DT_TRAMP_SP_SLOT(0)));
> > + emit(dlp, BPF_CALL_HELPER(BPF_FUNC_map_lookup_elem));
> > + emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, pcb->pcb_exitlbl));
> > +
> > + /* Set up probe arguments. */
> > if (upp->sargc)
> > copy_args(pcb, upp);
> > else
> > dt_cg_tramp_copy_args_from_regs(pcb, 0);
> > - /*
> > - * Retrieve the PID of the process that caused the probe to fire.
> > - */
> > - emit(dlp, BPF_CALL_HELPER(BPF_FUNC_get_current_pid_tgid));
> > - emit(dlp, BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 32));
> > -
> > - /*
> > - * Look up in the BPF 'usdt_prids' map. The key should fit into
> > - * trampoline stack slot 0.
> > - */
> > - assert(sizeof(usdt_prids_map_key_t) <= DT_STK_SLOT_SZ);
> > - emit(dlp, BPF_STORE(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0), BPF_REG_0));
> > - emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0) + (int)sizeof(pid_t), uprp->desc->id));
> > - dt_cg_xsetx(dlp, usdt_prids, DT_LBL_NONE, BPF_REG_1, usdt_prids->di_id);
> > - emit(dlp, BPF_MOV_REG(BPF_REG_2, BPF_REG_FP));
> > - emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, DT_TRAMP_SP_SLOT(0)));
> > - emit(dlp, BPF_CALL_HELPER(BPF_FUNC_map_lookup_elem));
> > - emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, lbl_exit));
> > -
> > if (upp->flags & PP_IS_ENABLED) {
> > /*
> > - * Generate a BPF trampoline for an is-enabled probe. The is-enabled probe
> > - * prototype looks like:
> > + * Generate a BPF trampoline for an is-enabled probe. The
> > + * is-enabled probe * prototype looks like:
>
> s/* prototype/prototype/
Thanks.
> > *
> > * int is_enabled(int *arg)
> > *
> > - * The trampoline writes 1 into the location pointed to by the passed-in arg.
> > + * The trampoline writes 1 into the location pointed to by the
> > + * passed-in arg.
> > */
> > emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0), 1));
> > emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_1, BPF_REG_7, DMST_ARG(0)));