[DTrace-devel] [PATCH v2] libproc: make Psystem_daemon() detect modern systemd properly
Eugene Loh
eugene.loh at oracle.com
Tue Jul 15 23:34:05 UTC 2025
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
I didn't check every detail, but read through and it tests fine.
On 7/15/25 15:09, Nick Alcock wrote:
> Psystem_daemon() is used when carrying out short-lived grabs to detect
> whether a process is too risky to carry out invasive grabs of (you wouldn't
> usually want to stop syslogd or, God forbid, try to ptrace PID 1, unless
> explicitly requested via -p: the process just coming up in routine probe
> firing is not enough).
>
> This has two code paths: a reliable one for systemd systems (which checks to
> see if the process is in the system slice, which contains precisely and only
> system daemons), and an unreliable one for other systems (which does the old
> Unix approach of considering anything in the user uid range or with a TTY or
> with open standard FDs to TTYs to be not system daemons, and everything else
> to possibly be one).
>
> We were checking to see if a system was systemd by looking for the systemd
> cgroup hierarchy name in any of the victim process's cgroups. This was
> reliable back in the days of cgroups v1, but alas in v2 where systemd runs
> all the cgroups if it runs any and there are no longer multiple hierarchies,
> systemd no longer names its cgroups this way and the test fails, causing us
> to fall back to the unreliable pre-systemd approach.
>
> Use a more reliable approach to detect systemd, the same approach used by
> sd_booted() in libsystemd; check for the existence of the
> /run/systemd/system directory. Fix slice detection to work in the absence
> of a systemd hierarchy name (but keep it working when a hierarchy name
> *is* present, for older systems), and everything else works unchanged.
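
As a standalone illustration of the check described above, here is a minimal sketch. It is not the patch's code: dir_exists() and booted_with_systemd() are hypothetical names, and the -1/0/1 cache only mirrors the systemd_system variable seen in the diff below.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper (not in the patch): returns 1 if path exists and
 * is a directory, 0 otherwise.
 */
static int
dir_exists(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Cached answer: -1 = not yet known, 0 = not systemd, 1 = systemd. */
static int systemd_system = -1;

/*
 * Hypothetical wrapper: probe once for the /run/systemd/system
 * directory named in the commit message, then reuse the cached result.
 */
static int
booted_with_systemd(void)
{
	if (systemd_system < 0)
		systemd_system = dir_exists("/run/systemd/system");
	return systemd_system;
}
```

Caching the result means the stat() happens at most once per process, however many times the check is made.
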
>
> We also arrange to fall back to the old code for any processes that are
> entirely outside of systemd management: this covers kernel threads,
> the occasional process that is part of systemd itself, and also processes
> running using Delegate= to give over their subtree's cgroup management to
> something else.
>
> Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
> ---
> libproc/Pcontrol.c | 101 +++++++++++++++++++++++++++++++--------------
> 1 file changed, 70 insertions(+), 31 deletions(-)
>
> OK, this doesn't regress with stdin coming from /dev/null on any systemd
> platform I've tried it on, old (cgroups v1) or new. (Non-systemd, we will
> of course mistake most of the tests for system daemons and fail. Don't
> run the testsuite noninteractively on such systems.)
>
> diff --git a/libproc/Pcontrol.c b/libproc/Pcontrol.c
> index 7d9b5055f8201..b5c4e27ef9d29 100644
> --- a/libproc/Pcontrol.c
> +++ b/libproc/Pcontrol.c
> @@ -2927,10 +2927,26 @@ Psystem_daemon(pid_t pid, uid_t useruid, const char *sysslice)
> int fd;
>
> /*
> - * If this is a system running systemd, or we don't know yet, dig out
> - * the systemd cgroup line from /proc/$pid/cgroup.
> + * If we don't know if this system is running systemd, find out.
> */
> - if (systemd_system != 0) {
> + if (systemd_system < 0) {
> + struct stat st;
> +
> + if (stat("/run/systemd/system", &st) < 0 ||
> + !S_ISDIR(st.st_mode))
> + systemd_system = 0;
> + else
> + systemd_system = 1;
> + _dprintf("systemd system.\n");
> + }
> +
> + /*
> + * If this is a system running systemd, dig out the systemd cgroup line
> + * from /proc/$pid/cgroup.
> + */
> + if (systemd_system) {
> + int found = 0;
> +
> snprintf(procname, sizeof(procname), "%s/%d/cgroup",
> procfs_path, pid);
>
> @@ -2941,47 +2957,70 @@ Psystem_daemon(pid_t pid, uid_t useruid, const char *sysslice)
> }
>
> while (getline(&buf, &n, fp) >= 0) {
> + /*
> + * cgroups v2: only one line, 0::-prepended, slice
> + * name always on that line.
> + */
> +
> + if (strncmp(buf, "0::", strlen ("0::")) == 0 &&
> + strstr(buf, ".slice/") != NULL) {
> + found = 1;
> + break;
> + }
> +
> + /*
> + * cgroups v1: find the line with the name=systemd
> + * controller notation.
> + */
> if (strstr(buf, ":name=systemd:") != NULL) {
> - systemd_system = 1;
> + found = 1;
> break;
> }
> }
> fclose(fp);
> - if (systemd_system < 0)
> - systemd_system = 0;
> - }
>
> - /*
> - * We have the systemd cgroup line in buf. Look at our slice name.
> - */
> - if (systemd_system) {
> - char *colon = strchr(buf, ':');
> - if (colon)
> - colon = strchr(colon + 1, ':');
> + /*
> + * We have our slice's cgroup line in buf. Extract the slice
> + * name, skipping over the hierarchy number and controller
> + * fields.
> + */
> + if (found) {
> + char *colon = strchr(buf, ':');
> + if (colon)
> + colon = strchr(colon + 1, ':');
>
> - _dprintf("systemd system: sysslice: %s; colon: %s\n",
> - sysslice, colon ? colon : "(not found)");
> - if (colon &&
> - (strncmp(colon, sysslice, strlen(sysslice)) == 0)) {
> + _dprintf("systemd system: sysslice: %s; colon: %s\n",
> + sysslice, colon ? colon : "(not found)");
> + if (colon &&
> + (strncmp(colon, sysslice, strlen(sysslice)) == 0)) {
> + free(buf);
> + _dprintf("%i is a system daemon process.\n", pid);
> + return 1;
> + }
> free(buf);
> - _dprintf("%i is a system daemon process.\n", pid);
> - return 1;
> + return 0;
> }
> - free(buf);
> - return 0;
> + /*
> + * No idea: this is probably a kernel thread or something
> + * else entirely outside of systemd management or delegated
> + * via Delegate=: at any rate, a system daemon. We can fall
> + * back to the old mechanism in this situation.
> + */
> + _dprintf("%i: probably non-systemd: delegated?\n", pid);
> }
> free(buf);
>
> /*
> - * This is not a systemd system -- we have to guess by looking at the
> - * process's UID, controlling terminal, and the TTYness and/or location
> - * of the files pointed to by its stdin/out/err. (i.e. we first
> - * consider whether something may be a system daemon by consulting its
> - * uid range and controlling TTY, then try to rule it out by looking for
> - * open fds to TTYs and regular files outside particular subtrees.) (As
> - * a consequence of these rules, a process with no standard streams at
> - * all is considered a system daemon -- this is a cheap way of catching
> - * kernel threads.)
> + * This is not a systemd system, or we can't extract the relevant
> + * slice info from it -- we have to guess by looking at the
> + * process's UID, controlling terminal, and the TTYness and/or
> + * location of the files pointed to by its stdin/out/err. (i.e. we
> + * first consider whether something may be a system daemon by
> + * consulting its uid range and controlling TTY, then try to rule it
> + * out by looking for open fds to TTYs and regular files outside
> + * particular subtrees.) (As a consequence of these rules, a
> + * process with no standard streams at all is considered a system
> + * daemon -- this is a cheap way of catching kernel threads.)
> */
> if ((Puid(pid) > useruid) || Phastty(pid))
> return 0;
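
For anyone who wants to try the slice matching outside libproc, the per-line logic in the hunk above can be sketched as a pair of small functions. This is a rough sketch under assumptions, not the patch itself: cgroup_path_field() and classify_cgroup_line() are hypothetical names, the three-way return value is my own simplification, and the ":/system.slice/" value used in the note below is only an assumed example (the real code takes sysslice as a parameter).

```c
#include <string.h>

/*
 * Hypothetical helper: return a pointer to the second ':' in a
 * /proc/$pid/cgroup line ("hierarchy-ID:controller-list:path"), or
 * NULL if the line has fewer than two colons.
 */
static const char *
cgroup_path_field(const char *line)
{
	const char *colon = strchr(line, ':');

	if (colon)
		colon = strchr(colon + 1, ':');
	return colon;
}

/*
 * Hypothetical classifier for a single cgroup line:
 *   1  the line names a slice and it matches sysslice
 *   0  the line names a slice, but a different one
 *  -1  not a line we can use (caller falls back to the old heuristics)
 *
 * As in the diff, sysslice is compared starting at the second colon,
 * so it is assumed to begin with ':'.
 */
static int
classify_cgroup_line(const char *line, const char *sysslice)
{
	const char *path;
	/* cgroups v2: "0::" prefix, slice name on the same line. */
	int v2 = strncmp(line, "0::", strlen("0::")) == 0 &&
	    strstr(line, ".slice/") != NULL;
	/* cgroups v1: the name=systemd hierarchy. */
	int v1 = strstr(line, ":name=systemd:") != NULL;

	if (!v1 && !v2)
		return -1;

	path = cgroup_path_field(line);
	if (path && strncmp(path, sysslice, strlen(sysslice)) == 0)
		return 1;
	return 0;
}
```

With sysslice set to ":/system.slice/", a v2 line such as "0::/system.slice/sshd.service" and a v1 line such as "1:name=systemd:/system.slice/cron.service" both classify as 1, a line under /user.slice/ classifies as 0, and a bare "0::/" (as a kernel thread would typically show) classifies as -1, which corresponds to the fallback path.
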