[DTrace-devel] [PATCH 5/5] libproc: drop Pgrab() special cases in Ptrace()
Kris Van Hees
kris.van.hees at oracle.com
Sat Dec 7 04:40:03 UTC 2024
On Tue, Dec 03, 2024 at 11:36:10AM +0000, Nick Alcock wrote:
> Way back in 2013, in commit f5f05eb28058f2a62efeefef7c5faeca62b09578, we
> added a special case to Ptrace() causing it to not fail with an error
> if ptrace() failed and Ptrace() was being called by Pgrab().
>
> The need for this is long past: noninvasive tracing provides the semantics
> this change was meant to provide, far less unpleasantly. Worse yet, the
> patch is not threadsafe (even though we can have arbitrarily many threads
> monitoring arbitrarily many processes), and worse yet, the noninvasive
> tracing support in Pgrab() wants to *detect* failure to ptrace() so we
> can switch to tracing noninvasively instead. If the failure is hidden,
> we assume ptrace() has worked, and our first attempt to use this and
> waitpid() on the traced child fails with an -ECHILD and causes us to
> assume the process dead. Since it's not dead, bad things happen:
>
> libproc DEBUG 1733155118: 386060: Ppush_state(): ptrace_count 1, state 1
> libproc DEBUG 1733155118: 386060: Ppop_state(): ptrace_count 2, state 1
> libproc DEBUG 1733155118: Pgrab: grabbed PID 386060.
> [...]
> libproc DEBUG 1733155118: 386060: Activated rtld_db agent.
> libproc DEBUG 1733155118: 386060: link map iteration failed: process is dead.
> libdtrace DEBUG 1733155118: Called dt_attach() with attach_time 0
> libdtrace DEBUG 1733155118: pid 386060: dropping breakpoint on AT_ENTRY
> libproc DEBUG 1733155118: 386060: Ppush_state(): ptrace_count 1, state 4
> libproc DEBUG 1733155118: 386060: Ppop_state(): ptrace_count 2, state 4
> libproc DEBUG 1733155118: 386060: Cannot add breakpoint on ffffffffffffffff: Operation not permitted
> libdtrace DEBUG 1733155118: Cannot drop breakpoint in child process: acting as if evaltime=exec were in force.
>
> (Note that we weren't even logging the fact that Pgrab() had failed, up ther
> before the [...], and the first visible failure happened some time later,
> with entirely inaccurate messages about processes being dead and the like.)
>
> The solution is simple: take out the whole horrible Pgrab() special case,
> and treat invocations of Ptrace() from Pgrab() just like any other
> invocation from anywhere else. Pgrab() already deals with failure-to-grab
> errors perfectly well, if we only let it see the errors at all.
>
> With this in place, test/unittest/usdt/tst.multitrace.sh survives 200+
> invocations with zero failures.
>
> Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>
> ---
> libproc/Pcontrol.c | 17 +++--------------
> 1 file changed, 3 insertions(+), 14 deletions(-)
>
> diff --git a/libproc/Pcontrol.c b/libproc/Pcontrol.c
> index 6a454ef86bc3b..7d9b5055f8201 100644
> --- a/libproc/Pcontrol.c
> +++ b/libproc/Pcontrol.c
> @@ -39,8 +39,6 @@
>
> char procfs_path[PATH_MAX] = "/proc";
>
> -static int Pgrabbing = 0; /* A Pgrab() is underway. */
> -
> static int systemd_system = -1; /* 1 if this is a system running systemd. */
>
> static void Pfree_internal(struct ps_prochandle *P);
> @@ -327,9 +325,7 @@ Pgrab(pid_t pid, int noninvasiveness, int already_ptraced, void *wrap_arg,
> /*
> * Pmemfd() grabbed, try to ptrace().
> */
> - Pgrabbing = 1;
> *perr = Ptrace(P, 1);
> - Pgrabbing = 0;
>
> if (*perr < 0) {
> if (noninvasiveness < 1) {
> @@ -1373,10 +1369,7 @@ Ptrace(struct ps_prochandle *P, int stopped)
>
> if (wrapped_ptrace(P, PTRACE_SEIZE, P->pid, 0, LIBPROC_PTRACE_OPTIONS |
> PTRACE_O_TRACECLONE) < 0) {
> - if (!Pgrabbing)
> - goto err;
> - else
> - goto err2;
> + goto err;
> }
>
> P->ptraced = TRUE;
> @@ -1386,10 +1379,7 @@ Ptrace(struct ps_prochandle *P, int stopped)
>
> if (wrapped_ptrace(P, PTRACE_INTERRUPT, P->pid, 0, 0) < 0) {
> wrapped_ptrace(P, PTRACE_DETACH, P->pid, 0, 0);
> - if (!Pgrabbing)
> - goto err;
> - else
> - goto err2;
> + goto err;
> }
>
> /*
> @@ -1406,14 +1396,13 @@ Ptrace(struct ps_prochandle *P, int stopped)
> if ((P->state != PS_TRACESTOP) &&
> (P->state != PS_STOP)) {
> err = -ECHILD;
> - goto err2;
> + goto err;
> }
> }
>
> return err;
> err:
> err = -errno;
> -err2:
> /*
> * Note a subtlety here: the Ptrace_count may have been reduced, and the state
> * popped to match, by an exec() or other operation within the Pwait().
> --
> 2.47.1.279.g84c5f4e78e
>
More information about the DTrace-devel
mailing list