[DTrace-devel] [PATCH v2] libproc: make Psystem_daemon() detect modern systemd properly

Eugene Loh eugene.loh at oracle.com
Tue Jul 15 23:34:05 UTC 2025


Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
I didn't check every detail, but read through and it tests fine.

On 7/15/25 15:09, Nick Alcock wrote:
> Psystem_daemon() is used when carrying out short-lived grabs to detect
> whether a process is too risky to carry out invasive grabs of (you wouldn't
> usually want to stop syslogd or, God forbid, try to ptrace PID 1, unless
> explicitly requested via -p: the process just coming up in routine probe
> firing is not enough).
>
> This has two code paths: a reliable one for systemd systems (which checks to
> see if the process is in the system slice, which contains precisely and only
> system daemons), and an unreliable one for other systems (which does the old
> Unix approach of considering anything in the user uid range or with a TTY or
> with open standard FDs to TTYs to be not system daemons, and everything else
> to possibly be one).
>
> We were checking to see if a system was systemd by looking for the systemd
> cgroup hierarchy name in any of the victim process's cgroups.  This was
> reliable back in the days of cgroups v1, but alas in v2 where systemd runs
> all the cgroups if it runs any and there are no longer multiple hierarchies,
> systemd no longer names its cgroups this way and the test fails, causing us
> to fall back to the unreliable pre-systemd approach.
>
> Use a more reliable approach to detect systemd, the same approach used by
> sd_booted() in libsystemd; check for the existence of the
> /run/systemd/system directory.  Fix slice detection to work in the absence
> of a systemd hierarchy name (but keep it working when a hierarchy name
> *is* present, for older systems), and everything else works unchanged.
>
> We also arrange to fall back to the old code for any processes that are
> entirely outside of systemd management: this covers kernel threads,
> the occasional process that is part of systemd itself, and also processes
> running using Delegate= to give over their subtree's cgroup management to
> something else.
>
> Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
> ---
>   libproc/Pcontrol.c | 101 +++++++++++++++++++++++++++++++--------------
>   1 file changed, 70 insertions(+), 31 deletions(-)
>
> OK, this doesn't regress with stdin coming from /dev/null on any systemd
> platform I've tried it on, old (cgroups v1) or new.  (Non-systemd, we will
> of course mistake most of the tests for system daemons and fail.  Don't
> run the testsuite noninteractively on such systems.)
>
> diff --git a/libproc/Pcontrol.c b/libproc/Pcontrol.c
> index 7d9b5055f8201..b5c4e27ef9d29 100644
> --- a/libproc/Pcontrol.c
> +++ b/libproc/Pcontrol.c
> @@ -2927,10 +2927,26 @@ Psystem_daemon(pid_t pid, uid_t useruid, const char *sysslice)
>   	int fd;
>   
>   	/*
> -	 * If this is a system running systemd, or we don't know yet, dig out
> -	 * the systemd cgroup line from /proc/$pid/cgroup.
> +	 * If we don't know if this system is running systemd, find out.
>   	 */
> -	if (systemd_system != 0) {
> +	if (systemd_system < 0) {
> +		struct stat st;
> +
> +		if (stat("/run/systemd/system", &st) < 0 ||
> +		    !S_ISDIR(st.st_mode))
> +			systemd_system = 0;
> +		else
> +			systemd_system = 1;
> +		_dprintf("%s system.\n", systemd_system ? "systemd" : "non-systemd");
> +	}
> +
> +	/*
> +	 * If this is a system running systemd, dig out the systemd cgroup line
> +	 * from /proc/$pid/cgroup.
> +	 */
> +	if (systemd_system) {
> +		int found = 0;
> +
>   		snprintf(procname, sizeof(procname), "%s/%d/cgroup",
>   		    procfs_path, pid);
>   
> @@ -2941,47 +2957,70 @@ Psystem_daemon(pid_t pid, uid_t useruid, const char *sysslice)
>   		}
>   
>   		while (getline(&buf, &n, fp) >= 0) {
> +			/*
> +			 * cgroups v2: only one line, 0::-prepended, slice
> +			 * name always on that line.
> +			 */
> +
> +			if (strncmp(buf, "0::", strlen ("0::")) == 0 &&
> +			    strstr(buf, ".slice/") != NULL) {
> +				found = 1;
> +				break;
> +			}
> +
> +			/*
> +			 * cgroups v1: find the line with the name=systemd
> +			 * controller notation.
> +			 */
>   			if (strstr(buf, ":name=systemd:") != NULL) {
> -				systemd_system = 1;
> +				found = 1;
>   				break;
>   			}
>   		}
>   		fclose(fp);
> -		if (systemd_system < 0)
> -			systemd_system = 0;
> -	}
>   
> -	/*
> -	 * We have the systemd cgroup line in buf.  Look at our slice name.
> -	 */
> -	if (systemd_system) {
> -		char *colon = strchr(buf, ':');
> -		if (colon)
> -			colon = strchr(colon + 1, ':');
> +		/*
> +		 * We have our slice's cgroup line in buf.  Extract the slice
> +		 * name, skipping over the hierarchy number and controller
> +		 * fields.
> +		 */
> +		if (found) {
> +			char *colon = strchr(buf, ':');
> +			if (colon)
> +				colon = strchr(colon + 1, ':');
>   
> -		_dprintf("systemd system: sysslice: %s; colon: %s\n",
> -		    sysslice, colon ? colon : "(not found)");
> -		if (colon &&
> -		    (strncmp(colon, sysslice, strlen(sysslice)) == 0)) {
> +			_dprintf("systemd system: sysslice: %s; colon: %s\n",
> +				 sysslice, colon ? colon : "(not found)");
> +			if (colon &&
> +			    (strncmp(colon, sysslice, strlen(sysslice)) == 0)) {
> +				free(buf);
> +				_dprintf("%i is a system daemon process.\n", pid);
> +				return 1;
> +			}
>   			free(buf);
> -			_dprintf("%i is a system daemon process.\n", pid);
> -			return 1;
> +			return 0;
>   		}
> -		free(buf);
> -		return 0;
> +		/*
> +		 * No idea: this is probably a kernel thread or something
> +		 * else entirely outside of systemd management or delegated
> +		 * via Delegate=: at any rate, a system daemon.  We can fall
> +		 * back to the old mechanism in this situation.
> +		 */
> +		_dprintf("%i: probably non-systemd: delegated?\n", pid);
>   	}
>   	free(buf);
>   
>   	/*
> -	 * This is not a systemd system -- we have to guess by looking at the
> -	 * process's UID, controlling terminal, and the TTYness and/or location
> -	 * of the files pointed to by its stdin/out/err.  (i.e. we first
> -	 * consider whether something may be a system daemon by consulting its
> -	 * uid range and controlling TTY, then try to rule it out by looking for
> -	 * open fds to TTYs and regular files outside particular subtrees.)  (As
> -	 * a consequence of these rules, a process with no standard streams at
> -	 * all is considered a system daemon -- this is a cheap way of catching
> -	 * kernel threads.)
> +	 * This is not a systemd system, or we can't extract the relevant
> +	 * slice info from it -- we have to guess by looking at the
> +	 * process's UID, controlling terminal, and the TTYness and/or
> +	 * location of the files pointed to by its stdin/out/err.  (i.e. we
> +	 * first consider whether something may be a system daemon by
> +	 * consulting its uid range and controlling TTY, then try to rule it
> +	 * out by looking for open fds to TTYs and regular files outside
> +	 * particular subtrees.)  (As a consequence of these rules, a
> +	 * process with no standard streams at all is considered a system
> +	 * daemon -- this is a cheap way of catching kernel threads.)
>   	 */
>   	if ((Puid(pid) > useruid) || Phastty(pid))
>   		return 0;
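
For readers following along, the detection step described in the commit
message (the check sd_booted() relies on: does /run/systemd/system exist
and is it a directory?) can be sketched on its own as below. This is only
an illustration under stated assumptions, not the code in the patch: the
helper names is_directory() and booted_with_systemd() are hypothetical,
and the cached static stands in for the systemd_system variable.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper: 1 if path exists and is a directory, else 0. */
static int
is_directory(const char *path)
{
	struct stat st;

	assert(path != NULL);
	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/*
 * Hypothetical wrapper: probe once, then reuse the answer.
 * -1 means "not checked yet", mirroring the tri-state in the patch.
 */
static int
booted_with_systemd(void)
{
	static int cached = -1;

	if (cached < 0)
		cached = is_directory("/run/systemd/system");
	return cached;
}
```

Caching the result means the stat() happens at most once per process,
however many short-lived grabs are attempted.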

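The line matching and slice extraction in the loop above can likewise be
sketched as standalone functions. Again a rough illustration rather than
the patch itself: cgroup_path(), is_systemd_line() and line_in_slice() are
hypothetical names, and unlike the patch (which compares starting at the
second colon) this sketch compares the path after the colon, so the
"/system.slice/" prefix used in the comments is an assumption about what
a caller would pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the path field of a /proc/<pid>/cgroup line
 * ("hierarchy-ID:controller-list:path"), or NULL if the line does not
 * contain two colons.
 */
static const char *
cgroup_path(const char *line)
{
	const char *colon;

	assert(line != NULL);
	colon = strchr(line, ':');
	if (colon != NULL)
		colon = strchr(colon + 1, ':');
	return colon != NULL ? colon + 1 : NULL;
}

/*
 * Nonzero if this line carries systemd's slice information:
 * cgroups v2 ("0::" prefix with a slice in the path) or
 * cgroups v1 (the name=systemd named hierarchy).
 */
static int
is_systemd_line(const char *line)
{
	assert(line != NULL);
	if (strncmp(line, "0::", strlen("0::")) == 0 &&
	    strstr(line, ".slice/") != NULL)
		return 1;
	return strstr(line, ":name=systemd:") != NULL;
}

/* Nonzero if the line's path starts with slice_prefix. */
static int
line_in_slice(const char *line, const char *slice_prefix)
{
	const char *path = cgroup_path(line);

	return path != NULL &&
	    strncmp(path, slice_prefix, strlen(slice_prefix)) == 0;
}
```

A line such as "0::/" (typical for a kernel thread) matches neither test
in is_systemd_line(), which is the case where the patch falls through to
the older UID/TTY heuristics.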

