[DTrace-devel] [PATCH] libproc: make Psystem_daemon() detect modern systemd properly

Sun Jun 29 02:12:00 UTC 2025

I tested the patch and here is what I found:

     OL7  UEK6     (I'm ignoring)
     OL8  UEK6     regression!!!
     OL8  UEK7     (never had a problem)
     OL9  UEK7     issue fixed
     OL9  UEK8     issue fixed
     OL10 UEK8     issue fixed

So why is there a regression for OL8/UEK6?

We call Psystem_daemon(pid, useruid, ":/system.slice/"), which starts by
reading /proc/$pid/cgroup.

We used to look for a line that had ":name=systemd:" in it.
So we found "1:name=systemd:/user.slice/user-1000.slice/session-35.scope".
Going to the second colon, we see ":/user.slice/...",
which does not match ":/system.slice/".
So we decide the process is not a system daemon.
This allows us to trace the process.

With the patch, we look for a line that has ".slice/" in it.
So we find "11:devices:/system.slice/sshd.service".
Going to the second colon, we see ":/system.slice/...",
which does match ":/system.slice/".
So we decide the process is a system daemon.
This means we cannot trace the process.
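
To make the difference concrete, here is a minimal sketch of the two
matching rules as I described them above.  It is not the libproc code:
slice_matches(), second_colon() and sample_cgroup are hypothetical names,
and the sample holds only the two cgroup lines quoted in this mail, in the
order I am assuming they appear in /proc/$pid/cgroup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The two cgroup lines quoted above (order in the real file is assumed). */
static const char *const sample_cgroup[] = {
	"11:devices:/system.slice/sshd.service",
	"1:name=systemd:/user.slice/user-1000.slice/session-35.scope",
};

/* Return the text starting at the second colon of a cgroup line, or NULL. */
static const char *second_colon(const char *line)
{
	const char *p;

	assert(line != NULL);
	p = strchr(line, ':');
	return p ? strchr(p + 1, ':') : NULL;
}

/*
 * Hypothetical helper: take the first line containing `marker`, then check
 * whether the text from its second colon starts with `prefix`.
 * Returns 1 on a match, 0 on a mismatch, -1 if no usable line was found.
 */
static int slice_matches(const char *const *lines, size_t n,
			 const char *marker, const char *prefix)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const char *tail;

		if (strstr(lines[i], marker) == NULL)
			continue;
		tail = second_colon(lines[i]);
		if (tail == NULL)
			return -1;
		return strncmp(tail, prefix, strlen(prefix)) == 0;
	}
	return -1;
}
```

With marker ":name=systemd:" the helper picks the session-35.scope line and
reports no match; with marker ".slice/" it stops at the devices line and
reports a match, which is the regression.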

I do not understand any of this logic, but that's apparently why 
OL8/UEK6 has regressed.

Thanks for looking at this.  The problem causes about a dozen tests to 
fail on OL9 and OL10, both for me and for Sergey.  The fix works for 
those platforms, but then makes OL8/UEK6 fail.

By the way, you separate the systemd_system "yes" and "don't know" code 
paths.  Once "don't know" has been handled, you know systemd_system must 
be either 0 or 1.  So the following code is no longer needed:

                 if (systemd_system < 0)
                         systemd_system = 0;

On 6/19/25 08:00, Nick Alcock via DTrace-devel wrote:
> On 18 Jun 2025, Kris Van Hees verbalised:
>
>> On Fri, Jun 13, 2025 at 05:46:37PM +0100, Nick Alcock wrote:
>>> Psystem_daemon() is used when carrying out shortlived grabs to detect
>>> whether a process is too risky to carry out invasive grabs of (you wouldn't
>>> usually want to stop syslogd or, God forbid, try to ptrace PID 1, unless
>>> explicitly requested via -p: the process just coming up in routine probe
>>> firing is not enough).
>>>
>>> This has two code paths: a reliable one for systemd systems (which checks to
>>> see if the process is in the system slice, which contains precisely and only
>>> system daemons), and an unreliable one for other systems (which does the old
>>> Unix approach of considering anything in the user uid range or with a TTY or
>>> with open standard FDs to TTYs to be not system daemons, and everything else
>>> to possibly be one).
>>>
>>> We were checking to see if a system was systemd by looking for the systemd
>>> cgroup hierarchy name in any of the victim process's cgroups.  This was
>>> reliable back in the days of cgroups v1, but alas in v2 where systemd runs
>>> all the cgroups if it runs any and there are no longer multiple hierarchies,
>>> systemd no longer names its cgroups this way and the test fails, causing us
>>> to fall back to the unreliable pre-systemd approach.
>>>
>>> Use a more reliable approach to detect systemd, the same approach used by
>>> sd_booted() in libsystemd; check for the existence of the
>>> /run/systemd/system directory.  Fix slice detection to work in the absence
>>> of a systemd hierarchy name, and everything else works unchanged.
>> Is /run/systemd/system guaranteed to always be the correct path or is that
>> configurable in systemd and thus could change depending on distro etc?
> It's not configurable. I got the path from the manpage for sd_booted(3),
> which also recommends just doing the check yourself directly :) so it is
> as canonical as anything like this can be, much more so than the guesswork
> we were doing before.
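
For reference, the directory check discussed in the quoted text can be
sketched as below.  This is only an illustration of testing for
/run/systemd/system, not the code from the patch: dir_exists() and
systemd_booted() are hypothetical names, and using lstat() with S_ISDIR()
is my assumption about one reasonable way to do it.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: return 1 if `path` exists and is a directory
 * (without following a final symlink), 0 otherwise.
 */
static int dir_exists(const char *path)
{
	struct stat st;

	return lstat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Return 1 if the system appears to have been booted with systemd. */
static int systemd_booted(void)
{
	return dir_exists("/run/systemd/system");
}
```

The result depends only on the running system, not on the cgroup layout of
the process being examined, so it behaves the same under cgroups v1 and v2.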