[DTrace-devel] [PATCH 5/5] libproc: drop Pgrab() special cases in Ptrace()

Kris Van Hees kris.van.hees at oracle.com
Sat Dec 7 04:40:03 UTC 2024


On Tue, Dec 03, 2024 at 11:36:10AM +0000, Nick Alcock wrote:
> Way back in 2013, in commit f5f05eb28058f2a62efeefef7c5faeca62b09578, we
> added a special case to Ptrace() causing it to not fail with an error
> if ptrace() failed and Ptrace() was being called by Pgrab().
> 
> The need for this is long past: noninvasive tracing provides the semantics
> this change was meant to provide, far less unpleasantly.  Worse yet, the
> patch is not threadsafe (even though we can have arbitrarily many threads
> monitoring arbitrarily many processes), and worse yet, the noninvasive
> tracing support in Pgrab() wants to *detect* failure to ptrace() so we
> can switch to tracing noninvasively instead.  If the failure is hidden,
> we assume ptrace() has worked, and our first attempt to use this and
> waitpid() on the traced child fails with an -ECHILD and causes us to
> assume the process dead.  Since it's not dead, bad things happen:
> 
> libproc DEBUG 1733155118: 386060: Ppush_state(): ptrace_count 1, state 1
> libproc DEBUG 1733155118: 386060: Ppop_state(): ptrace_count 2, state 1
> libproc DEBUG 1733155118: Pgrab: grabbed PID 386060.
> [...]
> libproc DEBUG 1733155118: 386060: Activated rtld_db agent.
> libproc DEBUG 1733155118: 386060: link map iteration failed: process is dead.
> libdtrace DEBUG 1733155118: Called dt_attach() with attach_time 0
> libdtrace DEBUG 1733155118: pid 386060: dropping breakpoint on AT_ENTRY
> libproc DEBUG 1733155118: 386060: Ppush_state(): ptrace_count 1, state 4
> libproc DEBUG 1733155118: 386060: Ppop_state(): ptrace_count 2, state 4
> libproc DEBUG 1733155118: 386060: Cannot add breakpoint on ffffffffffffffff: Operation not permitted
> libdtrace DEBUG 1733155118: Cannot drop breakpoint in child process: acting as if evaltime=exec were in force.
> 
> (Note that we weren't even logging the fact that Pgrab() had failed, up ther
> before the [...], and the first visible failure happened some time later,
> with entirely inaccurate messages about processes being dead and the like.)
> 
> The solution is simple: take out the whole horrible Pgrab() special case,
> and treat invocations of Ptrace() from Pgrab() just like any other
> invocation from anywhere else.  Pgrab() already deals with failure-to-grab
> errors perfectly well, if we only let it see the errors at all.
> 
> With this in place, test/unittest/usdt/tst.multitrace.sh survives 200+
> invocations with zero failures.
> 
> Signed-off-by: Nick Alcock <nick.alcock at oracle.com>

Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>

> ---
>  libproc/Pcontrol.c | 17 +++--------------
>  1 file changed, 3 insertions(+), 14 deletions(-)
> 
> diff --git a/libproc/Pcontrol.c b/libproc/Pcontrol.c
> index 6a454ef86bc3b..7d9b5055f8201 100644
> --- a/libproc/Pcontrol.c
> +++ b/libproc/Pcontrol.c
> @@ -39,8 +39,6 @@
>  
>  char	procfs_path[PATH_MAX] = "/proc";
>  
> -static	int Pgrabbing = 0; 	/* A Pgrab() is underway. */
> -
>  static int systemd_system = -1; /* 1 if this is a system running systemd. */
>  
>  static void Pfree_internal(struct ps_prochandle *P);
> @@ -327,9 +325,7 @@ Pgrab(pid_t pid, int noninvasiveness, int already_ptraced, void *wrap_arg,
>  			/*
>  			 * Pmemfd() grabbed, try to ptrace().
>  			 */
> -			Pgrabbing = 1;
>  			*perr = Ptrace(P, 1);
> -			Pgrabbing = 0;
>  
>  			if (*perr < 0) {
>  				if (noninvasiveness < 1) {
> @@ -1373,10 +1369,7 @@ Ptrace(struct ps_prochandle *P, int stopped)
>  
>  	if (wrapped_ptrace(P, PTRACE_SEIZE, P->pid, 0, LIBPROC_PTRACE_OPTIONS |
>  		    PTRACE_O_TRACECLONE) < 0) {
> -		if (!Pgrabbing)
> -			goto err;
> -		else
> -			goto err2;
> +		goto err;
>  	}
>  
>  	P->ptraced = TRUE;
> @@ -1386,10 +1379,7 @@ Ptrace(struct ps_prochandle *P, int stopped)
>  
>  		if (wrapped_ptrace(P, PTRACE_INTERRUPT, P->pid, 0, 0) < 0) {
>  			wrapped_ptrace(P, PTRACE_DETACH, P->pid, 0, 0);
> -			if (!Pgrabbing)
> -				goto err;
> -			else
> -				goto err2;
> +			goto err;
>  		}
>  
>  		/*
> @@ -1406,14 +1396,13 @@ Ptrace(struct ps_prochandle *P, int stopped)
>  		if ((P->state != PS_TRACESTOP) &&
>  		    (P->state != PS_STOP)) {
>  			err = -ECHILD;
> -			goto err2;
> +			goto err;
>  		}
>  	}
>  
>  	return err;
>  err:
>  	err = -errno;
> -err2:
>  	/*
>  	 * Note a subtlety here: the Ptrace_count may have been reduced, and the state
>  	 * popped to match, by an exec() or other operation within the Pwait().
> -- 
> 2.47.1.279.g84c5f4e78e
> 



More information about the DTrace-devel mailing list