[DTrace-devel] [PATCH 16/16] PID provider implementation
Eugene Loh
eugene.loh at oracle.com
Fri Mar 26 20:30:48 PDT 2021
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
but lots of comments below.
On 3/19/21 12:55 AM, Kris Van Hees wrote:
> This patch provides an implementation of the pid provider. This is
> a provider that makes it possible to do function boundary tracing
> at the userspace level (in shared libraries and executables).
>
> The implementation of userspace probes is built on the uprobes
> support in the kernel tracing infrastructure. The uprobes are
> inode based and therefore not process specific whereas pid probes in
> DTrace are process specific. This means that there is a n-to-1
"a n-to-1" should be either
"an n-to-1" or
"a many-to-one". (I prefer the latter.)
> relation between pid probes and uprobes. It is also worth noting
> that a pid probe together with its underlying uprobe provides
> equivalent functionality as any other non-pid DTrace probes. The
> (process specific) pid probe represents the named probe and has
> D clauses associated eith it. The uprobe represents the underlying
eith -> with
> system probe (not process specific) that our program will be attached
> to.
>
> When we want to trace the main() functon in a process with PID 1234,
functon -> function
> we can specify this probes as pid1234:a.out:main:entry. A lookup is
> performed to see if we already have a pid:<ino>:main:entry probe to
> associate the new pid probe with. If not, it is created. It is this
> pid:<ino>:main:entry probe that will be added to the enablings list of
> probes to which we need to attach a BPF program.
>
> Any clauses for pid1234:a.out:main:entry will be associated with that
> specific probe.
>
> When the final BPF program is constructed for a pid related enaling,
enaling -> enabling
> we construct the program for probe pid:<ino>:main:entry. The program
> (trampoline) will comprise basic environment setup, copying of any
> possible probe arguments, and a sequence of code blocks following the
> pattern:
>
> if (pid == <some-pid>) {
> casee dt_clause_0(dctx);
> casee dt_clause_1(dctx);
> <...>
> }
casee -> case: (2 occurrences)
> In other words, although the uprobe will fire for any process that
> reaches the probe point, we check the PID of the process that caused
> the probe to fire against the registered probes and we only execute
> the clauses for the matching PID.
The commit message has very useful descriptions. Therefore, how about
putting it in the source code? After all, the information is less about
a delta and more about how source code is being done. I would suggest
at the top of dt_prov_pid.c.
> Future patches will augment the functionality provided here with
> arbitrary instruction tracing at the userspace level.
>
> Signee-off-by: Kris Van Hees <kris.van.hees at oracle.com>
Signee->Signed
> ---
> libdtrace/Build | 2 +
> libdtrace/dt_cc.c | 4 +-
> libdtrace/dt_impl.h | 1 +
> libdtrace/dt_open.c | 1 +
> libdtrace/dt_pid.c | 32 ++-
> libdtrace/dt_prov_pid.c | 360 ++++++++++++++++++++++++++++
> libdtrace/dt_provider.h | 5 +
> test/unittest/pid/tst.addprobes.sh | 2 +-
> test/unittest/pid/tst.args1.d | 6 +-
> test/unittest/pid/tst.coverage.d | 4 +-
> test/unittest/pid/tst.float.d | 4 +-
> test/unittest/pid/tst.main.sh | 3 +-
> test/unittest/pid/tst.manypids.sh | 3 +-
> test/unittest/pid/tst.probemod.sh | 7 +-
> test/unittest/pid/tst.provregex1.sh | 3 +-
> test/unittest/usdt/tst.dlclose3.sh | 3 +-
> 16 files changed, 416 insertions(+), 24 deletions(-)
> create mode 100644 libdtrace/dt_prov_pid.c
>
> diff --git a/libdtrace/Build b/libdtrace/Build
> index 9c683e8f..cbc52659 100644
> --- a/libdtrace/Build
> +++ b/libdtrace/Build
> @@ -50,6 +50,7 @@ libdtrace-build_SOURCES = dt_aggregate.c \
> dt_program.c \
> dt_prov_dtrace.c \
> dt_prov_fbt.c \
> + dt_prov_pid.c \
> dt_prov_profile.c \
> dt_prov_sdt.c \
> dt_prov_syscall.c \
> @@ -88,6 +89,7 @@ dt_dis.c_CFLAGS := -Wno-pedantic
> dt_proc.c_CFLAGS := -Wno-pedantic
> dt_prov_dtrace.c_CFLAGS := -Wno-pedantic
> dt_prov_fbt.c_CFLAGS := -Wno-pedantic
> +dt_prov_pid.c_CFLAGS := -Wno-pedantic
> dt_prov_profile.c_CFLAGS := -Wno-pedantic
> dt_prov_sdt.c_CFLAGS := -Wno-pedantic
> dt_prov_syscall.c_CFLAGS := -Wno-pedantic
> diff --git a/libdtrace/dt_cc.c b/libdtrace/dt_cc.c
> index 4769daa6..3bb521f5 100644
> --- a/libdtrace/dt_cc.c
> +++ b/libdtrace/dt_cc.c
> @@ -1707,7 +1707,7 @@ dt_setcontext(dtrace_hdl_t *dtp, dtrace_probedesc_t *pdp)
> * that process if:
> *
> * (1) The provider doesn't exist, or,
> - * (2) The provider exists and has DTRACE_PRIV_PROC privilege.
> + * (2) The provider exists and has DT_PROVIDER_PID flag set.
> *
> * On an error, dt_pid_create_probes() will set the error message
> * and tag -- we just have to longjmp() out of here.
> @@ -1715,7 +1715,7 @@ dt_setcontext(dtrace_hdl_t *dtp, dtrace_probedesc_t *pdp)
> if (pdp->prv && pdp->prv[0] &&
> isdigit(pdp->prv[strlen(pdp->prv) - 1]) &&
> ((pvp = dt_provider_lookup(dtp, pdp->prv)) == NULL ||
> - pvp->desc.dtvd_priv.dtpp_flags & DTRACE_PRIV_PROC) &&
> + pvp->pv_flags & DT_PROVIDER_PID) &&
> dt_pid_create_probes(pdp, dtp, yypcb) != 0) {
> longjmp(yypcb->pcb_jmpbuf, EDT_COMPILER);
> }
Okay, so at this point should all the DTRACE_PRIV_* be deleted? They are
no longer used.
> diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
> index 687132fe..47029804 100644
> --- a/libdtrace/dt_impl.h
> +++ b/libdtrace/dt_impl.h
> @@ -293,6 +293,7 @@ struct dtrace_hdl {
> struct dt_provider **dt_provs; /* hash table of dt_provider_t's */
> uint_t dt_provbuckets; /* number of provider hash buckets */
> uint_t dt_nprovs; /* number of providers in hash and list */
> + const struct dt_provider *dt_prov_pid; /* PID provider */
> dt_proc_hash_t *dt_procs; /* hash table of grabbed process handles */
> dt_intdesc_t dt_ints[6]; /* cached integer type descriptions */
> ctf_id_t dt_type_func; /* cached CTF identifier for function type */
IIUC, dt_prov_pid has rather limited/dubious utility. It'd be better to
live without it.
> diff --git a/libdtrace/dt_open.c b/libdtrace/dt_open.c
> index 79b834e7..74405d3b 100644
> --- a/libdtrace/dt_open.c
> +++ b/libdtrace/dt_open.c
> @@ -64,6 +64,7 @@ const dt_version_t _dtrace_versions[] = {
> static const dt_provimpl_t *dt_providers[] = {
> &dt_dtrace,
> &dt_fbt,
> + &dt_pid,
> &dt_profile,
> &dt_sdt,
> &dt_syscall,
> diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c
> index 7bd299e5..210b199d 100644
> --- a/libdtrace/dt_pid.c
> +++ b/libdtrace/dt_pid.c
> @@ -21,6 +21,7 @@
>
> #include <dt_impl.h>
> #include <dt_program.h>
> +#include <dt_provider.h>
> #include <dt_pid.h>
> #include <dt_string.h>
>
> @@ -94,15 +95,29 @@ static int
> dt_pid_create_fbt_probe(struct ps_prochandle *P, dtrace_hdl_t *dtp,
> pid_probespec_t *psp, const GElf_Sym *symp, pid_probetype_t type)
> {
> + const dt_provider_t *pvp;
> +
> psp->pps_type = type;
> psp->pps_pc = (uintptr_t)symp->st_value;
> psp->pps_size = (size_t)symp->st_size;
> psp->pps_glen = 0; /* no glob pattern */
> psp->pps_gstr[0] = '\0';
>
> + /* Make sure we have a PID provider. */
> + pvp = dtp->dt_prov_pid;
> + if (pvp == NULL) {
> + pvp = dt_provider_lookup(dtp, "pid");
> + if (pvp == NULL)
> + return 0;
> +
> + dtp->dt_prov_pid = pvp;
> + }
IIUC, dt_prov_pid has rather limited/dubious utility. It'd be better to
live without it.
> +
> + assert(pvp->impl != NULL && pvp->impl->provide_pid != NULL);
> +
> /* Create a probe using 'psp'. */
>
> - return 1;
> + return pvp->impl->provide_pid(dtp, psp);
> }
>
> static int
> @@ -231,6 +246,7 @@ dt_pid_per_sym(dt_pid_probe_t *pp, const GElf_Sym *symp, const char *func)
>
> out:
> free(psp->pps_mod);
> + free(psp->pps_fn);
> dt_free(dtp, psp);
> return rc;
> }
> @@ -261,6 +277,12 @@ dt_pid_sym_filt(void *arg, const GElf_Sym *symp, const char *func)
> if (strcmp(func, "_init") == 0 || strcmp(func, "_fini") == 0)
> return 0;
>
> + /*
> + * Versioned identifiers are a problem.
> + */
> + if (strchr(func, '@') != NULL)
> + return 0;
> +
> if ((pp->dpp_last_taken = gmatch(func, pp->dpp_func)) != 0) {
> pp->dpp_last = *symp;
> return dt_pid_per_sym(pp, symp, func);
> @@ -286,7 +308,7 @@ dt_pid_per_mod(void *arg, const prmap_t *pmp, const char *obj)
> dt_Plmid(pp->dpp_dtp, pid, pmp->pr_vaddr, &pp->dpp_lmid);
>
> pp->dpp_ino = pmp->pr_inum;
> - pp->dpp_vaddr = pmp->pr_vaddr;
> + pp->dpp_vaddr = pmp->pr_file->first_segment->pr_vaddr;
>
> /*
> * Note: if an execve() happens in the victim after this point, the
> @@ -518,6 +540,7 @@ dt_pid_create_pid_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp,
> pp.dpp_func = pdp->fun[0] != '\0' ? pdp->fun : "*";
> pp.dpp_name = pdp->prb[0] != '\0' ? pdp->prb : "*";
> pp.dpp_last_taken = 0;
> + pp.dpp_fname = NULL;
>
> if (strcmp(pp.dpp_func, "-") == 0) {
> const prmap_t *aout, *pmp;
> @@ -568,6 +591,11 @@ dt_pid_create_pid_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp,
> }
> }
>
> + if (pp.dpp_func != pdp->fun) {
> + free((char *)pdp->fun);
> + pdp->fun = pp.dpp_func;
> + }
> +
> free(pp.dpp_fname);
>
> return ret;
> diff --git a/libdtrace/dt_prov_pid.c b/libdtrace/dt_prov_pid.c
> new file mode 100644
> index 00000000..681874f1
> --- /dev/null
> +++ b/libdtrace/dt_prov_pid.c
> @@ -0,0 +1,360 @@
> +/*
> + * Oracle Linux DTrace.
> + * Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved.
> + * Licensed under the Universal Permissive License v 1.0 as shown at
> + * http://oss.oracle.com/licenses/upl.
> + *
> + * The PID provider for DTrace.
> + */
> +#include <assert.h>
> +#include <errno.h>
> +#include <string.h>
> +
> +#include <bpf_asm.h>
> +
> +#include "dt_dctx.h"
> +#include "dt_cg.h"
> +#include "dt_list.h"
> +#include "dt_provider.h"
> +#include "dt_probe.h"
> +
> +static const char prvname[] = "pid";
> +
> +#define PRV "pid%d"
This definition is not used anywhere (and "pid%d" is used only once).
Get rid of PRV.
> +
> +#define UPROBE_EVENTS TRACEFS "uprobe_events"
> +
> +#define PID_GROUP_FMT GROUP_FMT "_%lx"
> +#define PID_GROUP_DATA GROUP_DATA, pp->ino
> +
> +typedef struct pid_probe {
> + ino_t ino;
> + char *fn;
> + uint64_t off;
> + tp_probe_t *tp;
> + dt_list_t probes;
> +} pid_probe_t;
> +
> +static const dtrace_pattr_t pattr = {
> +{ DTRACE_STABILITY_EVOLVING, DTRACE_STABILITY_EVOLVING, DTRACE_CLASS_ISA },
> +{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_UNKNOWN },
> +{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_UNKNOWN },
> +{ DTRACE_STABILITY_EVOLVING, DTRACE_STABILITY_EVOLVING, DTRACE_CLASS_ISA },
> +{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_UNKNOWN },
> +};
> +
> +dt_provimpl_t dt_pid_sub;
> +
> +static int populate(dtrace_hdl_t *dtp)
> +{
> + dt_provider_create(dtp, prvname, &dt_pid, &pattr);
> + return 0;
> +}
> +
> +static void probe_destroy(dtrace_hdl_t *dtp, void *datap)
> +{
> + pid_probe_t *pp = datap;
> +
> + dt_tp_destroy(dtp, pp->tp);
> + dt_free(dtp, pp->fn);
> + dt_free(dtp, pp);
> +}
> +
> +static int provide_pid(dtrace_hdl_t *dtp, const pid_probespec_t *psp)
> +{
The whole business of distinguishing between two pid providers -- the
"real" one and the process-specific one -- is never really explained,
whether in the commit message or in the code. (I admit that the
difference is somewhat explained with respect to probes, and one could
from there deduce the difference between the providers, but I think we
can do better.) I suggest:
- Introduce a comment block before the provide() function that says
that there are two kinds of pid providers:
- the "main" provider
- process-specific providers that the DTrace user sees
- Rename dt_pid_sub to be dt_pid_proc.
> + char prv[DTRACE_PROVNAMELEN];
> + char mod[DTRACE_MODNAMELEN];
> + char prb[DTRACE_NAMELEN];
> + dt_provider_t *pidpvp;
> + dt_provider_t *pvp;
> + dtrace_probedesc_t pd;
> + pid_probe_t *pp;
> + dt_probe_t *prp;
> + uint64_t off = psp->pps_pc - psp->pps_vaddr;
Some comments would be good to explain what these things are, especially
since you ultimately deal with a uprobe and a DTrace process-specific
probe, and different variables get used for different (or both!)
probes. How about:
prv /* process-specific provider name */
mod /* inode for the uprobe */
prb /* probe name ("entry", "return", or hex offset) for either
pid probe or uprobe */
pidpvp /* main pid provider */ // rename it mainpvp
pvp /* process-specific pid provider */ // rename it procpvp
pd /* probe descriptor used for both main and process-specific
providers */
pp /* provider-specific data for the uprobe */
prp /* probe pointer for either main or process-specific provider */
> +
> + /*
> + * First check whether this pid probe already exists. If so, there is
> + * nothing left to do.
> + */
I would defer that comment for later. Right now, just say we're
constructing the name of the process-specific provider.
> + snprintf(prv, sizeof(prv), "pid%d", psp->pps_pid);
> +
And here just say we are constructing the name of the probe, whether for
the uprobe or the process-specific probe.
> + switch (psp->pps_type) {
> + case DTFTP_ENTRY:
> + strncpy(prb, "entry", sizeof(prb));
> + break;
> + case DTFTP_RETURN:
> + strncpy(prb, "return", sizeof(prb));
> + break;
> + case DTFTP_OFFSETS:
> + snprintf(prb, sizeof(prb), "%lx", off);
> + break;
> + default:
> + return 0;
> + }
> +
Now say we're looking for the process-specific probe!
> + pd.id = DTRACE_IDNONE;
> + pd.prv = prv;
> + pd.mod = psp->pps_mod;
> + pd.fun = psp->pps_fun;
> + pd.prb = prb;
> +
> + prp = dt_probe_lookup(dtp, &pd);
> + if (prp != NULL)
> + return 1; /* probe found */
> +
> + /* Get the main (real) pid provider. */
"Real" has not been defined. Omit it. Again, though, the "main"
provider can be defined earlier and pidpvp can be renamed mainpvp to be
clear. ("pidpvp"... what does "pid" mean in this context? Everything is
"pid" in this file!)
> + pidpvp = dt_provider_lookup(dtp, prvname);
> + if (pidpvp == NULL)
> + return 0;
> +
> + /* Get (or create) the provider for the PID of the probe. */
Again, "process-specific" provider, hopefully named "procpvp".
> + pvp = dt_provider_lookup(dtp, prv);
> + if (pvp == NULL) {
> + pvp = dt_provider_create(dtp, prv, &dt_pid_sub, &pattr);
> + if (pvp == NULL)
> + return 0;
> + }
> +
> + /* Mark the provider as a PID provider. */
> + pvp->pv_flags |= DT_PROVIDER_PID;
> +
> + /*
> + * Fill in the probe description for the main (real) probe. The
A few times, you refer to the "main (real) probe". I think it makes
sense to talk about it in terms of uprobes. (That is, it has a
one-to-one correspondence with a uprobe.) Most of all, I think it makes
sense to start with a high-level description that lays out the
nomenclature you're going to use and then to stick with that
nomenclature throughout. In particular:
*) There is a probe that corresponds to the uprobe.
*) There are process-specific probes that are visible to the DTrace user.
> + * module is the inode number (in hex), the function name is as
> + * specified for the pid probe, and the probe name is "entry",
> + * "return", or the offset into the function (in hex).
> + */
> + snprintf(mod, sizeof(mod), "%lx", psp->pps_ino);
> +
> + /*
> + * Try to lookup the main (real) probe. Since multiple pid probes may
> + * all map onto the same underlying main (real) probe, we may already
> + * have one in the system.
Again "uprobe" rather than "main (real)" probe.
> + *
> + * If not found, we create a new probe.
> + */
> + pd.id = DTRACE_IDNONE;
> + pd.prv = prvname;
> + pd.mod = mod;
> + pd.fun = psp->pps_fun;
> + pd.prb = prb;
> + prp = dt_probe_lookup(dtp, &pd);
> + if (prp == NULL) {
> + /* Set up the pid probe data. */
> + pp = dt_zalloc(dtp, sizeof(pid_probe_t));
> + if (pp == NULL)
> + return 0;
> +
> + pp->ino = psp->pps_ino;
> + pp->fn = strdup(psp->pps_fn);
> + pp->off = off;
> + pp->tp = dt_tp_alloc(dtp);
> + if (pp->tp == NULL)
> + goto fail;
> +
> + prp = dt_probe_insert(dtp, pidpvp, prvname, mod, psp->pps_fun,
> + prb, pp);
> + if (prp == NULL)
> + goto fail;
> + } else
> + pp = prp->prv_data;
> +
A little confusing since prp is used in quick succession to mean
different things as one goes through a nonlinear flow of execution. I
don't know.
> + /* Try to add the pid probe. */
> + prp = dt_probe_insert(dtp, pvp, prv, psp->pps_mod, psp->pps_fun, prb,
> + prp);
> + if (prp == NULL)
> + goto fail;
> +
> + /* Add the pid probe to the list of probes for the main (real) probe. */
> + dt_list_append(&pp->probes, prp);
Maybe this warrants a little bit of explanation? Normally, we append
the probe to the list of enablings. Here, only the uprobe is. The
process-specific probes get appended onto the uprobe's probes list. Not
sure if this is worth pointing out. Or, maybe it's already sufficiently
clear in the commit message and this stuff is just a little
complicated? Dunno, but I lost track of things by the time I got to
that dt_list_append() call. Anyhow, this dt_list_append() is a cousin
to the dt_list_append() in enabling(), which comes up next.
> +
> + return 1;
> +
> +fail:
> + probe_destroy(dtp, pp);
> + return 0;
> +}
> +
> +static void enable(dtrace_hdl_t *dtp, dt_probe_t *prp)
> +{
> + assert(prp->prov->impl == &dt_pid_sub);
> +
> + /* We should enable the main (real) probe (if not enabled yet). */
> + prp = prp->prv_data;
> +
> + if (!dt_in_list(&dtp->dt_enablings, prp))
> + dt_list_append(&dtp->dt_enablings, prp);
> +}
> +
> +/*
> + * Generate a BPF trampoline for a pid probe.
> + *
> + * The trampoline function is called when a pid probe triggers, and it must
> + * satisfy the following prototype:
> + *
> + * int dt_pid(dt_pt_regs *regs)
> + *
> + * The trampoline will first populate a dt_dctx_t struct. It will then emulate
> + * the firing of all dependent pid* probes and their clauses.
> + */
> +static void trampoline(dt_pcb_t *pcb)
> +{
> + dt_irlist_t *dlp = &pcb->pcb_ir;
> + const dt_probe_t *prp = pcb->pcb_probe;
> + const dt_probe_t *pprp;
> + const pid_probe_t *pp = prp->prv_data;
> +
> + dt_cg_tramp_prologue(pcb);
> +
> + /*
> + * After the dt_cg_tramp_prologue() call, we have:
> + * // (%r7 = dctx->mst)
> + * // (%r8 = dctx->ctx)
> + */
> +
> +#if 0
> + /*
> + * dctx->mst->regs = *(dt_pt_regs *)dctx->ctx;
> + * // lddw %r0, [%r8 + 0]
> + * // stdw [%r7 + DMST_REGS + 0], %r0
> + * // lddw %r0, [%r8 + 8]
> + * // stdw [%r7 + DMST_REGS + 8], %r0
> + * // (...)
> + */
> + for (i = 0; i < sizeof(dt_pt_regs); i += 8) {
> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_8, i));
> + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_REGS + i, BPF_REG_0));
> + }
> +#endif
You intend to resuscitate this code soon?
> +
> + dt_cg_tramp_copy_args_from_regs(pcb, BPF_REG_8);
> +
> + /*
> + * Retrieve the PID of the process that caused the probe to fire.
> + */
> + emit(dlp, BPF_CALL_HELPER(BPF_FUNC_get_current_pid_tgid));
> + emit(dlp, BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 32));
> + for (pprp = dt_list_next(&pp->probes); pprp != NULL;
> + pprp = dt_list_next(pprp)) {
> + uint_t lbl_next = dt_irlist_label(dlp);
> + pid_t pid = strtoul(pprp->desc->prv + 3, NULL, 10);
> +
> + /*
> + * Check whether this pid-provider probe serves the current
> + * process.
> + */
> + emit(dlp, BPF_BRANCH_IMM(BPF_JNE, BPF_REG_0, pid, lbl_next));
> + dt_cg_tramp_call_clauses(pcb, pprp, DT_ACTIVITY_ACTIVE);
> + emitl(dlp, lbl_next,
> + BPF_NOP());
> + }
I guess we talked about this earlier. Just to trampoline exit after a
match to bypass the subsequent unnecessary conditionals?
> +
> + dt_cg_tramp_return(pcb);
I think the case for having
dt_cg_tramp_epilogue(dt_pcb_t *pcb)
dt_cg_tramp_epilogue_advance(dt_pcb_t *pcb, dt_activity_t act)
has disappeared. The other provider trampolines would be clearer if
these two functions were inlined. The structure among the various
providers would be clearer if the dt_cg_tramp_call_clauses() and
dt_cg_tramp_return() calls were inlined, as they are for this provider.
Bypassing the tramp_epilogue*() calls for this provider I think makes
the case for eliminating tramp_epilogue*() for all of them.
> +}
> +
> +static int attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
> +{
> + pid_probe_t *pp = prp->prv_data;
Also define
tp_probe_t *tpp = pp->tp;
and then use tpp in place of pp->tp the three times it appears in the
function. The motivation is not so much to reduce code as to amplify
the similarities with the dtrace and fbt providers.
> +
> + if (!dt_tp_is_attached(pp->tp)) {
> + char *fn;
> + FILE *f;
> + size_t len;
> + int fd, rc = -1;
> +
> + /* add the uprobe */
> + fd = open(UPROBE_EVENTS, O_WRONLY | O_APPEND);
> + if (fd != -1) {
> + rc = dprintf(fd,
> + "%c:" PID_GROUP_FMT "/%s_%s %s:0x%lx\n",
> + prp->desc->prb[0] == 'e' ? 'p' : 'r',
> + PID_GROUP_DATA, prp->desc->fun,
> + prp->desc->prb, pp->fn, pp->off);
> + close(fd);
> + }
> + if (rc == -1)
> + return -ENOENT;
> +
> + /* open format file */
> + len = snprintf(NULL, 0, "%s" PID_GROUP_FMT "/%s_%s/format",
> + EVENTSFS, PID_GROUP_DATA, prp->desc->fun,
> + prp->desc->prb) + 1;
> + fn = dt_alloc(dtp, len);
> + if (fn == NULL)
> + return -ENOENT;
> +
> + snprintf(fn, len, "%s" PID_GROUP_FMT "/%s_%s/format",
> + EVENTSFS, PID_GROUP_DATA, prp->desc->fun,
> + prp->desc->prb);
> + f = fopen(fn, "r");
> + dt_free(dtp, fn);
> + if (f == NULL)
> + return -ENOENT;
> +
> + rc = dt_tp_event_info(dtp, f, 0, pp->tp, NULL, NULL);
> + fclose(f);
I guess rc<0 should be checked. The same comment applies to the dtrace
provider. The fbt provider is already okay.
> + }
> +
> + /* attach BPF program to the probe */
> + return dt_tp_attach(dtp, pp->tp, bpf_fd);
> +}
> +
> +static int probe_info(dtrace_hdl_t *dtp, const dt_probe_t *prp,
> + int *argcp, dt_argdesc_t **argvp)
> +{
> + *argcp = 0; /* no arguments */
The point is not "no arguments" but "no typed arguments". This remark
applies to the similar comment in some other providers.
> + *argvp = NULL;
> +
> + return 0;
> +}
> +
> +/*
> + * Try to clean up system resources that may have been allocated for this
> + * probe.
> + *
> + * If there is an event FD, we close it.
> + *
> + * We also try to remove any uprobe that may have been created for the probe.
> + * This is harmless for probes that didn't get created. If the removal fails
> + * for some reason we are out of luck - fortunately it is not harmful to the
> + * system as a whole.
> + */
> +static void detach(dtrace_hdl_t *dtp, const dt_probe_t *prp)
> +{
> + int fd;
> + pid_probe_t *pp = prp->prv_data;
You define pp, but it is only used as pp->tp. So, instead, define:
tp_probe_t *tpp = ((pid_probe_t *) prp->prv_data)->tp;
and use tpp in place of pp->tp, which will better mimic what you did a
few patches earlier in dtrace and fbt.
> +
> + if (!dt_tp_is_attached(pp->tp))
> + return;
> +
> + dt_tp_detach(dtp, pp->tp);
> +
> + fd = open(UPROBE_EVENTS, O_WRONLY | O_APPEND);
> + if (fd == -1)
> + return;
> +
> + dprintf(fd, "-:" PID_GROUP_FMT "/%s_%s\n", PID_GROUP_DATA,
> + prp->desc->fun, prp->desc->prb);
> + close(fd);
> +}
> +
> +dt_provimpl_t dt_pid = {
> + .name = prvname,
> + .prog_type = BPF_PROG_TYPE_KPROBE,
> + .populate = &populate,
> + .provide_pid = &provide_pid,
> + .trampoline = &trampoline,
> + .attach = &attach,
> + .probe_info = &probe_info,
> + .detach = &detach,
> + .probe_destroy = &probe_destroy,
> +};
> +
> +dt_provimpl_t dt_pid_sub = {
> + .name = prvname,
> + .prog_type = BPF_PROG_TYPE_KPROBE,
> + .populate = &populate,
> + .provide_pid = &provide_pid,
> + .enable = &enable,
> + .trampoline = &trampoline,
> +};
I would think that
dt_pid_sub.name
dt_pid_sub.populate
could/should be NULL.
> diff --git a/libdtrace/dt_provider.h b/libdtrace/dt_provider.h
> index b8ac9c1e..c7a47d6d 100644
> --- a/libdtrace/dt_provider.h
> +++ b/libdtrace/dt_provider.h
> @@ -8,6 +8,7 @@
> #ifndef _DT_PROVIDER_H
> #define _DT_PROVIDER_H
>
> +#include <dtrace/pid.h>
> #include <dt_impl.h>
> #include <dt_ident.h>
> #include <dt_list.h>
> @@ -63,6 +64,8 @@ typedef struct dt_provimpl {
> const dtrace_probedesc_t *pdp);
> void (*enable)(dtrace_hdl_t *dtp, /* enable the given probe */
> struct dt_probe *prp);
> + int (*provide_pid)(dtrace_hdl_t *dtp, /* provide PID probes */
> + const pid_probespec_t *psp);
> void (*trampoline)(dt_pcb_t *pcb); /* generate BPF trampoline */
> int (*attach)(dtrace_hdl_t *dtp, /* attach BPF prog to probe */
> const struct dt_probe *prp, int bpf_fd);
I am not crazy about introducing the new dt_provimpl_t function
provide_pid(). It seems so similar to provide().
Now, *IF* one does want to have both a provide_pid() and a provide():
- In libdtrace/dt_provider.h, provide_pid() and provide() should
appear consecutively. Specifically, provide_pid() should come
before enable(), not after. This would also be more consistent
with what you already do in dt_prov_pid.c.
- If only the profile provider will use provide(), perhaps that
function should receive a more specialized name -- say,
provide_profile() -- akin to the specialized name given to
provide_pid().
But I would lobby instead for a different solution. Use just provide()
for both.
Do not introduce a new provide_pid() function. Instead, make provide()'s
second argument a (void *) that can be whatever the provider needs,
whether a (dtrace_probedesc_t *), a (pid_probespec_t *), or something we
have not yet anticipated.
> @@ -117,6 +121,7 @@ extern void dt_tp_probe_destroy(dtrace_hdl_t *dtp, void *datap);
>
> #define DT_PROVIDER_INTF 0x1 /* provider interface declaration */
> #define DT_PROVIDER_IMPL 0x2 /* provider implementation is loaded */
> +#define DT_PROVIDER_PID 0x4 /* provider is a PID provider */
Those three lines use tabs inconsistently.
> diff --git a/test/unittest/pid/tst.addprobes.sh b/test/unittest/pid/tst.addprobes.sh
> diff --git a/test/unittest/pid/tst.args1.d b/test/unittest/pid/tst.args1.d
> diff --git a/test/unittest/pid/tst.coverage.d b/test/unittest/pid/tst.coverage.d
> diff --git a/test/unittest/pid/tst.float.d b/test/unittest/pid/tst.float.d
> diff --git a/test/unittest/pid/tst.main.sh b/test/unittest/pid/tst.main.sh
> diff --git a/test/unittest/pid/tst.manypids.sh b/test/unittest/pid/tst.manypids.sh
> diff --git a/test/unittest/pid/tst.probemod.sh b/test/unittest/pid/tst.probemod.sh
> diff --git a/test/unittest/pid/tst.provregex1.sh b/test/unittest/pid/tst.provregex1.sh
> diff --git a/test/unittest/usdt/tst.dlclose3.sh b/test/unittest/usdt/tst.dlclose3.sh
Hooray on tests. But:
*) (I didn't check) do we have tests that look at multiple processes
using the same inode/uprobe?
*) There should be some pid-provider equivalent of
test/unittest/providers/tst.dtrace_cleanup.sh
Maybe some
test/unittest/providers/tst.pid_cleanup.sh
Arguably, maybe even some check of TRACEFS for orphaned probes when
./runtest.sh is finishing!
More information about the DTrace-devel
mailing list