[DTrace-devel] [PATCH v2] test: Wait for output to flush out in enable_pid
Eugene Loh
eugene.loh at oracle.com
Wed Jul 30 22:03:39 UTC 2025
On 7/30/25 10:26, Kris Van Hees wrote:
> On Wed, Jul 30, 2025 at 10:18:17AM -0400, Kris Van Hees via DTrace-devel wrote:
>> I still see intermittent FAILs. I wonder whether there is a potentially unsafe
>> construct at work (see below).
> Hm, no that does not fix it either.
>
> Eugene - can you have a closer look at this and see what might be going on?
I ran the test 40 times on each of my 11 test VMs and could not get a
FAIL with this patch. I took another look at the test, but without a
reproducible failure it is hard to know what to fix.
Are you saying there is missing output (the motivating problem) or that
there is a timeout (I thought you may have said this, but I cannot find
where)? Those would seem to be two separate problems, the latter being
that a trigger does not see 4 signals.
>> On Sat, Jun 28, 2025 at 06:30:50PM -0400, eugene.loh at oracle.com wrote:
>>> From: Eugene Loh <eugene.loh at oracle.com>
>>>
>>> Our luck with this test has been quite good, but it sometimes fails
>>> to show its last lines of output. That is, we send a USR1 to the
>>> trigger processes to set off the final output and we immediately
>>> cat the output files. If there is any delay in handling the signal,
>>> the last output will be missing.
>>>
>>> Have the processes terminate themselves when their last output is
>>> flushed; then wait for those processes. Also, skip testing altogether if
>>> there is only a single processor to run the two, hard-spinning processes.
>>>
>>> Signed-off-by: Eugene Loh <eugene.loh at oracle.com>
>>> ---
>>> test/unittest/usdt/tst.enable_pid.sh | 17 ++++++++++++-----
>>> test/unittest/usdt/tst.enable_pid.x | 8 ++++++++
>>> 2 files changed, 20 insertions(+), 5 deletions(-)
>>> create mode 100755 test/unittest/usdt/tst.enable_pid.x
>>>
>>> diff --git a/test/unittest/usdt/tst.enable_pid.sh b/test/unittest/usdt/tst.enable_pid.sh
>>> index 7f4f68698..296cfb382 100755
>>> --- a/test/unittest/usdt/tst.enable_pid.sh
>>> +++ b/test/unittest/usdt/tst.enable_pid.sh
>>> @@ -33,6 +33,8 @@ EOF
>>> cat > main.c <<EOF
>>> #include <signal.h>
>>> #include <stdio.h>
>>> +#include <stdlib.h>
>>> +#include <string.h>
>>> #include "prov.h"
>>>
>>> /* We check if the is-enabled probe is or is not enabled (or unknown). */
>>> @@ -41,7 +43,7 @@ cat > main.c <<EOF
>>> #define ENABLED_UNK 3
>>>
>>> /* Start with the previous probe "unknown". */
>>> -int prv = ENABLED_UNK;
>>> +int prv = ENABLED_UNK, nepochs_left = 4;
>>> long long num = 0;
>>>
>>> /* Report how many times the previous case was encountered. */
>>> @@ -71,6 +73,9 @@ static void mark_epoch(int sig) {
>>> report();
>>> printf("=== epoch ===\n");
>>> fflush(stdout);
>>> + nepochs_left--;
>>> + if (nepochs_left <= 0)
>>> + exit(0);
>> I wonder whether this could still cause some output to get lost. I would
>> actually move the conditional and exit to the end of the loop in main(),
>> so that we check nepochs_left and exit when <= 0 outside of the interrupt
>> handler.
>>
>> I am running a long sequence of this test to see if that helps/fixes it.
>>
>>> }
>>>
>>> int
>>> @@ -79,7 +84,7 @@ main(int argc, char **argv)
>>> struct sigaction act;
>>>
>>> /* Set USR1 to mark epochs. */
>>> - act.sa_flags = 0;
>>> + memset(&act, 0, sizeof(act));
>>> act.sa_handler = mark_epoch;
>>> if (sigaction(SIGUSR1, &act, NULL)) {
>>> printf("set handler failed\n");
>>> @@ -172,13 +177,15 @@ for pid in 1 $pid1 $pid2 '*'; do
>>> kill -USR1 $pid2
>>> done
>>>
>>> +# Wait for the processes.
>>> +wait $pid1
>>> +wait $pid2
>>> +
>>> +# Dump the output.
>>> echo done
>>> echo "========== out 1"; cat out.1
>>> echo "========== out 2"; cat out.2
>>>
>>> echo success
>>>
>>> -kill -TERM $pid1
>>> -kill -TERM $pid2
>>> -
>>> exit 0
>>> diff --git a/test/unittest/usdt/tst.enable_pid.x b/test/unittest/usdt/tst.enable_pid.x
>>> new file mode 100755
>>> index 000000000..9506674ee
>>> --- /dev/null
>>> +++ b/test/unittest/usdt/tst.enable_pid.x
>>> @@ -0,0 +1,8 @@
>>> +#!/bin/sh
>>> +
>>> +if [ `grep -c ^processor /proc/cpuinfo` -lt 2 ]; then
>>> + echo test should have at least two processors
>>> + exit 2
>>> +fi
>>> +
>>> +exit 0
>>> --
>>> 2.43.5
>>>
>> _______________________________________________
>> DTrace-devel mailing list
>> DTrace-devel at oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/dtrace-devel