[DTrace-devel] [PATCH 3/4] test: fix tst.depth.sh to use proper trigger
Eugene Loh
eugene.loh at oracle.com
Tue Nov 11 20:50:37 UTC 2025
On 11/10/25 22:00, Kris Van Hees wrote:
> On Mon, Nov 10, 2025 at 07:10:52PM -0500, Eugene Loh wrote:
>> I'm against this patch.
>>
>> The picky stuff is:
>> - the Copyright year should be updated
> Easy to fix - done.
>
>> - the timeout probe could replace
>> tick-1sec /n++ == 10/
>> with simply
>> tick-10sec
> Arguably, the two are not identical. It depends on why one is used over the
> other. Here we could probably go either way, though it might be nice to have
> the tick probe firing every second as a way to generate a bit more activity.
> Given the frequency of the other probes, not really important though as far as
> I can see. Note that I didn't touch the timeout probe primarily because that
> was not the purpose of this patch anyway.
>
>> - the "delay" env var be set for delaydie???
> Why? We are simply using delaydie to have a trigger executable to execute.
> This is a valid use of it (and it is used that same way in other tests also).
>
>> - the ustacks for delaydie are perhaps not very interesting
> That is not the point of the test, and for that matter, I highly doubt that
> ustacks from date would have been any more interesting (aside from the fact
> that you'd never know because the test didn't even work with date).
>
>> A little more serious is that all this patch does is swap one failure mode
>> for another. What for? It's XFAIL anyhow, and we do not check that we're
>> getting a particular, expected failure.
> The difference is that the pre-patch failure was due to a problem with the
> actual test. But since it is XFAIL that was reported as an expected failure,
> which was entirely bogus. After the patch, it still fails but at least it is
> failing for valid reasons. I.e. now we have a failure we should look into.
>
>> The real problem, meanwhile, is that this patch leads to lots of test
>> failures, at least for me. Not every VM, but many. Oh, and I do not mean
>> on this test, because this test remains XFAIL anyhow.
>>
>> Consider a subset of the VMs I run on. I get these failures on the
>> test/unittest/ustack/tst.*.d tests. Why on these? Because they run after
>> tst.depth.sh.
>>
>> +------------------ arm OL8 UEK7
>> | +---------------- x86 OL9 UEK7
>> | | +-------------- arm OL9 UEK7
>> | | | +------------ x86 OL9 UEK8
>> | | | | +---------- arm OL9 UEK8
>> | | | | | +-------- arm OL10 UEK8
>> | | | | | |
>> v v v v v v
>>
>> x x x x x x tst.kthread.d FAIL: timed out.
>> x x x x tst.uaddr.d FAIL: timed out.
>> x x x x tst.uaddr-pid0.d FAIL: timed out.
>> x x tst.ufunc.d FAIL: timed out.
>> x x x tst.ufunc-pid0.d FAIL: timed out.
>> x x tst.umod.d FAIL: timed out.
>> x x tst.ustack25_max95_profile.d FAIL: timed out.
>> x x tst.ustack25_pid.d FAIL: timed out.
>> x x tst.ustack25_profile.d FAIL: timed out.
>> x x tst.ustack95_max25_profile.d FAIL: timed out.
>> x x tst.ustack95_profile.d FAIL: timed out.
>> x x tst.ustack_max25_profile.d FAIL: timed out.
>> x x tst.ustack_profile.d FAIL: timed out.
>> x tst.usym.d FAIL: timed out.
>> x tst.usym-pid0.d FAIL: timed out.
>>
>> I think the problem is that tst.depth.sh -- with this patch, in any case --
>> turns on a huge number of probes. That causes problems for this test, but
>> not in the sense of it FAILing since it is marked XFAIL anyhow. (All this
>> patch does is change the FAIL from one problem to another, but we still do
>> not specify what failure we expect to see nor check that that is what we
>> get.)
>>
>> So, the problem is not with this test, but that the system is left in some
>> whacked state due to having attempted so many probes. As a result, the
>> process "dtrace -o $file -c delaydie" is still around well after the depth
>> test has finished, the load average remains high, the pid0 process does not
>> run, and many tests time out. How long (how many seconds or how many tests)
>> does this effect last? It depends, as one can see above.
>>
>> This raises all sorts of other questions as well, such as whether tests in
>> our test suite should be allowed to depend on running on a somewhat idle
>> system, whether we (dtrace or the test suite) are cleaning up well enough,
>> etc.
> There is of course the aspect that this patch is part of a series, and the
> next patch in the series should resolve the impact issues of this test. If
> it makes you more happy, I'd happily move this patch to the end of the series.
> I don't see how it matters, because this crazy fallout from tst.depth.sh will
> be an issue either way until the per-PID uprohes support is applied.
I don't get it.
First of all, the crazy fallout results with the patch and it doesn't
without the patch. Without the patch the test was basically SKIPing
(admittedly as a mistaken side effect of a test defect).
Second, the above results, with the crazy fallout, included the entire
patch series. That is, the tip of the branch looked like this:
59a49ce9 (tag: 2.0.4) doc: Add blank line before bold text so it is
rendered correctly
61e69762 bpf: fix file descriptor leak
20780963 uprobe: remove unnecessary enable_*() functions
170c4cf9 test: fix tst.depth.sh to use proper trigger
cf7ae35b uprobe: Implement PID-specific uprobes
If there are other bits/patches I should try, let me know.
>> On 11/10/25 10:27, Kris Van Hees wrote:
>>> The test used 'date' as its trigger but that caused the test to XFAIL
>>> for the wrong reason: date does not have symbols and therefore dtrace
>>> failed to enable probes for it.
>>>
>>> Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
>>> ---
>>> test/unittest/ustack/tst.depth.sh | 8 ++++----
>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/test/unittest/ustack/tst.depth.sh b/test/unittest/ustack/tst.depth.sh
>>> index dec16a3a..977d1b42 100755
>>> --- a/test/unittest/ustack/tst.depth.sh
>>> +++ b/test/unittest/ustack/tst.depth.sh
>>> @@ -16,7 +16,7 @@ dtrace=$1
>>> rm -f $file
>>> -$dtrace $dt_flags -o $file -c date -s /dev/stdin <<EOF
>>> +$dtrace $dt_flags -o $file -c test/triggers/delaydie -s /dev/stdin <<EOF
>>> #pragma D option quiet
>>> #pragma D option bufsize=1M
>>> @@ -45,7 +45,7 @@ $dtrace $dt_flags -o $file -c date -s /dev/stdin <<EOF
>>> EOF
>>> status=$?
>>> -if [ "$status" -ne 0 ]; then
>>> +if [ $status -ne 0 ]; then
>>> echo $tst: dtrace failed
>>> rm -f $file
>>> exit $status
>>> @@ -78,8 +78,8 @@ EOF
>>> status=$?
>>> -count=`wc -l $file | cut -f1 -do`
>>> -if [ "$count" -lt 1000 ]; then
>>> +count=`wc -l $file | cut -f1 -d' '`
>>> +if [ $count -lt 1000 ]; then
>>> echo $tst: output was too short
>>> status=1
>>> fi
More information about the DTrace-devel
mailing list