[DTrace-devel] [PATCH 01/12] proc: use a rawtp for the proc:::exit probe

Eugene Loh eugene.loh at oracle.com
Sat Jan 6 05:50:57 UTC 2024


On 1/5/24 23:17, Kris Van Hees wrote:

> On Fri, Jan 05, 2024 at 02:30:00PM -0500, Eugene Loh via DTrace-devel wrote:
>> On 1/5/24 00:24, Kris Van Hees via DTrace-devel wrote:
>>> diff --git a/libdtrace/dt_prov_proc.c b/libdtrace/dt_prov_proc.c
>>> @@ -323,7 +323,7 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
>>>    		int		rc = 0;
>>>    		uint_t		lbl_out = dt_irlist_label(dlp);
>>> -		emit(dlp,  BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_7, DMST_ARG(1)));
>>> +		emit(dlp,  BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_7, DMST_ARG(0)));
>> I'm confused on this one.  We're switching from
>>          taskstats_exit(tsk, group_dead);
>> to
>>          trace_sched_process_exit(tsk);
>> So it seems we want arg0 in either case.  So the patch is right, but it
>> means (iiuc) that there was a bug in the code.  So if a bug is being fixed,
>> doesn't that mean there should be a test change?
> Sadly, no.  It means we do not have adequate testing for all cases.

Then that inadequate testing is a deficiency in this patch.

First, isn't this what
         test/unittest/proc/tst.exitexit.sh
         test/unittest/proc/tst.exitkilled.sh
are supposed to test?  (The latter is @@xfail.)

Second, a test should be easy to add.  Something patterned after 
test/unittest/io/tst.fds.d:

         /* @@trigger: bogus-ioctl */
         /* @@trigger-timing: after */
         /* @@runtest-opts: -C $_pid */

         #pragma D option destructive
         #pragma D option quiet

         syscall::ioctl:entry
         /pid == $1 && arg0 == -1u/
         {
                 raise(SIGKILL);    /* and another test with SIGUSR1 */
         }

         proc:::exit
         /pid == $1/
         {
                 printf("exit with %s\n", arg0 == CLD_KILLED ? "killed" 
: "other");  /* and EXITED version */
                 exit(0);
         }

with accompanying .r file.  Or something like that.  Anyhow, the patch 
is changing how arg0 is handled, so that should be tested in the patch.  
I guess also something for the dump case.

Third, these tests pass for me even without your patch?  That is, we 
were already handling proc:::exit arg0 correctly for the exit and kill 
cases?  I'm so confused.  Ah.  I see.  The line in question actually has 
nothing to do with tsk->exit_code at all.  IIUC, it is checking if 
group_dead is 0?  Not necessary?  Was incorrect? We're now lost without 
it?  I have no idea.  But this patch changing from arg0 to arg1 is 
either right, wrong, or doing something that deserves explanation.

>>>    		emit(dlp,  BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, exitlbl));
>>>    		if (!cfp)
>>> @@ -342,9 +342,9 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
>>>    		/*
>>>    		 * This implements:
>>>    		 *	%r1 = tsk->exit_code
>>> -		 *	if ((%r1 & 0x7f) == 0) args[0] = 1;
>>> -		 *	else if (%r1 & 0x80) args[0] = 3;
>>> -		 *	else args[0] = 2;
>>> +		 *	if ((%r1 & 0x7f) == 0) args[0] = 1;	// CLD_EXITED
>>> +		 *	else if (%r1 & 0x80) args[0] = 3;	// CLD_DUMPED
>>> +		 *	else args[0] = 2;			// CLD_KILLED
>>>    		 */
>>>    		emit(dlp,  BPF_MOV_REG(BPF_REG_1, BPF_REG_FP));
>>>    		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, DT_STK_SPILL(0)));



More information about the DTrace-devel mailing list