[DTrace-devel] [PATCH 2/5] proc: more self-grab improvements
Kris Van Hees
kris.van.hees at oracle.com
Sat Dec 7 04:38:20 UTC 2024
On Tue, Dec 03, 2024 at 11:36:07AM +0000, Nick Alcock wrote:
> The self-grab armouring code is clearly too hard to read: the change just
> reverted broke it entirely and caused DTrace to never take out self-grabs on
> anything (because it misinterpreted processes that were not being debugged
> as processes that *were* being debugged: Ptracer_pid() returns zero if
> no tracer is active).
>
> Refactor it into something more readable via giving some of the conditions
> names. Doing this forces us to think things through properly. Some things
> can never work reliably and should always be blocked:
>
> - grabbing the thread doing the tracing
> - grabbing any other of this DTrace instance's threads, given the
> complexity of the proxy-call back-and-forth
> - grabbing a thread being debugged by someone else (a process cannot have
> two tracers at once)
> - grabbing PID 1 (init)
>
> Some things are just a bit risky and are reasonable to do if the user
> explicitly asks for it via dtrace -p, but not if it's implicitly requested
> by some thread doing a ustack() or something:
>
> - grabbing a system daemon
>
> Grabbing a thread we have already grabbed is probably impossible from this
> location (process-control thread initialization), but if it does happen it's
> fine: you can PTRACE_SEIZE the same thread from the same debugger more than
> once, that's routine and normal operation for a debugger.
>
> So split things up accordingly, and implement the "any other of DTrace's
> threads" case, which was not implemented before: "we grabbed ourself"
> i.e. the PID is ourself or the tgid of ourself and the tgid of the PID we're
> grabbing are the same; and "someone else is debugging us", i.e. there is a
> tracer in force already and it's not the current thread.
>
> This makes the code *ever* so much easier to read, and makes it possible to
> give decent error messages when things go wrong as well.
>
> (The lack of handling of the "any other of our threads" case explains the
> tst.multitrace.sh failure: when a shortlived grab from a ustack() etc hits,
> the initial grab and release request is issued by the main DTrace thread.
> If a shortlived grab hits for the main thread itself, only this case will
> prevent the tracer thread from stopping it and then trying to return to the
> stopped thread, deadlocking forever.)
>
> This has survived a thousand iterations of test/unittest/proc/tst.self-grab.sh
> and test/unittest/usdt/tst.multitrace.sh with no failures (after the other
> patches in this series are applied).
>
> Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>
> ---
> libdtrace/dt_proc.c | 45 ++++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 38 insertions(+), 7 deletions(-)
>
> diff --git a/libdtrace/dt_proc.c b/libdtrace/dt_proc.c
> index a052abbacf204..41b148e807f84 100644
> --- a/libdtrace/dt_proc.c
> +++ b/libdtrace/dt_proc.c
> @@ -929,6 +929,9 @@ dt_proc_control(void *arg)
> dpr->dpr_pid = Pgetpid(dpr->dpr_proc);
> } else {
> int noninvasive = 0;
> + int self_grab = 0;
> + int other_tracer = 0;
> + pid_t tracer_pid;
>
> /*
> * "Shortlived" means that the monitoring of this process is not
> @@ -942,21 +945,49 @@ dt_proc_control(void *arg)
> * else (like another DTrace instance). No death notification
> * is ever sent.
> *
> - * Also, obviously enough, never drop breakpoints in ourself!
> + * Also, obviously enough, never drop breakpoints in ourself:
> + * we define that widely enough that no grabs of any thread of
> + * this DTrace process will be invasive.
> + *
> + * If this is *not* a shortlived grab, simply refuse the grab
> + * if this is being debugged by someone else or is ourself, or
> + * is PID 1: on explicit request, we'll still grab system
> + * daemons (if you use dtrace -p, we assume you actually want to
> + * do what you asked for), but grabs that cannot succeed should
> + * still be refused.
> + *
> + * (If the process is being *debugged* by ourself -- as in
> + * literally this thread -- we can do invasive grabs just fine.)
> */
> - if (datap->dpcd_flags & DTRACE_PROC_SHORTLIVED) {
> - pid_t tracer_pid, tgid;
>
> + tracer_pid = Ptracer_pid(dpr->dpr_pid);
> + self_grab = (dpr->dpr_pid == getpid() ||
> + Ptgid(dpr->dpr_pid) == (Ptgid(getpid())));
> + other_tracer = (tracer_pid != 0 && tracer_pid != getpid());
> +
> + if (datap->dpcd_flags & DTRACE_PROC_SHORTLIVED) {
> noninvasive = 1;
> dpr->dpr_notifiable = 0;
> - tracer_pid = Ptracer_pid(dpr->dpr_pid);
> - tgid = Ptgid(dpr->dpr_pid);
>
> if ((Psystem_daemon(dpr->dpr_pid, dtp->dt_useruid,
> dtp->dt_sysslice) > 0) ||
> - (tracer_pid == getpid()) ||
> - (tgid == getpid()))
> + other_tracer || self_grab)
> noninvasive = 2;
> + } else {
> + const char *reason;
> +
> + if (dpr->dpr_pid == 1 || other_tracer || self_grab) {
> + if (dpr->dpr_pid == 1)
> + reason = "is init";
> + else if (other_tracer)
> + reason = "being traced by someone else";
> + else
> + reason = "PID is ourself";
> +
> + dt_proc_error(dtp, dpr, "not safe to stop pid %li for grabbing: %s\n",
> + (long)dpr->dpr_pid, reason);
> + pthread_exit(NULL);
> + }
> }
>
> if ((dpr->dpr_proc = Pgrab(dpr->dpr_pid, noninvasive, 0,
> --
> 2.47.1.279.g84c5f4e78e
>
More information about the DTrace-devel
mailing list