[DTrace-devel] [PATCH v3 10/21] usdt: test improvements

Kris Van Hees kris.van.hees at oracle.com
Thu Feb 15 20:07:00 UTC 2024


On Tue, Jan 16, 2024 at 09:13:06PM +0000, Nick Alcock via DTrace-devel wrote:
> This soups up test/unittest/usdt/tst.manyprocs.sh to test dtprobed's
> cleaning up the wreckage of dead processes, by having every other process
> kill itself rather than dying peacefully (and sending a DTRACEHIOC_REMOVE).
> When doing in-tree testing (something we can detect via $dtrace), we can
> also check the DOF stash itself to see whether the wreckage has been cleaned
> up. (This isn't practical when doing systemwide testing because we don't
> know how much DOF the dtprobed might legitimately be hanging on to.)
> 
> tst.multitrace.sh gains the ability to test DOF reparsing on upgrade,
> waiting until the first probe-containing process has started and then
> intentionally overwriting some of the parsed DOF with junk (in lieu of
> actually changing struct dof_parsed on the fly :) ) and forcing a reparse
> via a kill -USR2 (in lieu of communicating the complete argument list of
> dtprobed down to this test just so it can be killed and restarted).
> 
> tst.multitrace.sh is no longer XFAIL: it is meant to pass now, and sometimes
> it does. (It still fails a lot of the time, but this seems to be unrelated
> bugs, not a general problem with doing multiple USDT traces at the same
> time.)

OK, but if this test fails a lot of the time rather than PASSing, it is not
really much use (or else what it is meant to test is broken).  Why is the
test so unstable?  Under what conditions does it fail?  Which bugs are
causing it to fail?

> Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
> ---
>  runtest.sh                           |  2 +-
>  test/unittest/usdt/tst.manyprocs.sh  | 35 +++++++++++++++++++++++-----
>  test/unittest/usdt/tst.multitrace.sh | 20 ++++++++++++----
>  3 files changed, 46 insertions(+), 11 deletions(-)
> 
> diff --git a/runtest.sh b/runtest.sh
> index 92ef6e02463d..f3fb3e343c81 100755
> --- a/runtest.sh
> +++ b/runtest.sh
> @@ -573,7 +573,7 @@ if [[ -z $USE_INSTALLED ]]; then
>      	exit 1
>      fi
>      build/dtprobed $dtprobed_flags &
> -    dtprobed_pid=$!
> +    export dtprobed_pid=$!
>      ZAPTHESE+=($dtprobed_pid)
>  else
>      dtrace="/usr/sbin/dtrace"
> diff --git a/test/unittest/usdt/tst.manyprocs.sh b/test/unittest/usdt/tst.manyprocs.sh
> index 6ebd873672b4..87d3d452bcc3 100755
> --- a/test/unittest/usdt/tst.manyprocs.sh
> +++ b/test/unittest/usdt/tst.manyprocs.sh
> @@ -5,10 +5,8 @@
>  # Licensed under the Universal Permissive License v 1.0 as shown at
>  # http://oss.oracle.com/licenses/upl.
>  #
> -# Verify that dtprobed can handle lots of processes.  We don't check that
> -# it's cleaning them up, and we don't explicitly check that it's not dead,
> -# but if the dead-process-cleanup process fails and kills dtprobed we will
> -# find out when later tests fail due to lack of dtprobed.
> +# Verify that dtprobed can handle lots of processes.  Also check that it is
> +# cleaning up wreckage from old dead processes.
>  #
>  if [ $# != 1 ]; then
>  	echo expected one argument: '<'dtrace-path'>'
> @@ -35,14 +33,27 @@ if [ $? -ne 0 ]; then
>  	exit 1
>  fi
>  
> -cat > test.c <<EOF
> +cat > test.c <<'EOF'
> +#include <signal.h>
> +#include <stdlib.h>
> +#include <unistd.h>
>  #include "prov.h"
>  
>  int
>  main(int argc, char **argv)
>  {
> +	long instance;
> +
>  	TEST_PROV_GO();
>  
> +	instance = strtol(argv[1], NULL, 10);
> +
> +	/*
> +	 * Kill every other instance of ourself, so the DOF destructors
> +	 * never run.
> +	 */
> +	if (instance % 2)
> +		kill(getpid(), SIGKILL);
>  	return 0;
>  }
>  EOF
> @@ -64,7 +75,19 @@ if [ $? -ne 0 ]; then
>  fi
>  
>  for ((i=0; i < 1024; i++)); do
> -    ./test
> +    ./test $i
>  done
>  
> +# When doing in-tree testing, the DOF stash directory
> +# should contain at most five or so DOFs, even though 512
> +# processes left stale DOF around.  (Allow up to ten in
> +# case the most recent cleanup is still underway.)

How about checking (before cleanup is triggered) that a multitude of stale
DOF was indeed left around, so that we can be sure that cleanup really
happened?
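
As a rough sketch of such a check (count_dofs, note_peak and peak are
made-up names, and this assumes the stash can be sampled while the loop is
still running): keep a high-water mark of the number of files in the stash,
so the test can later confirm that stale DOF really did pile up before it
was cleaned away.

```shell
# Hypothetical helper: print the number of regular files under the
# directory given as $1, or 0 if that directory does not exist.
count_dofs() {
    if [ -d "$1" ]; then
        find "$1" -type f | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Hypothetical helper: remember the largest count seen so far in $peak.
peak=0
note_peak() {
    now=$(count_dofs "$1")
    if [ "$now" -gt "$peak" ]; then
        peak=$now
    fi
}
```

The loop could then call note_peak "$DTRACE_OPT_DOFSTASHPATH/stash/dof"
every few iterations, and the final check could additionally fail if $peak
never rose well above the final count.  What threshold is reasonable
depends on when dtprobed runs its cleanup, which this sketch does not
assume anything about.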

> +if [[ $dtrace != "/usr/sbin/dtrace" ]] && [[ -n $DTRACE_OPT_DOFSTASHPATH ]]; then
> +    NUMDOFS="$(find $DTRACE_OPT_DOFSTASHPATH/stash/dof -type f | wc -l)"
> +    if [[ $NUMDOFS -gt 10 ]]; then
> +        echo "DOF stash contains too many old DOFs: $NUMDOFS" >&2
> +        exit 1
> +    fi
> +fi
> +
>  exit 0
> diff --git a/test/unittest/usdt/tst.multitrace.sh b/test/unittest/usdt/tst.multitrace.sh
> index 07ed14a77f80..1a3081fd208f 100755
> --- a/test/unittest/usdt/tst.multitrace.sh
> +++ b/test/unittest/usdt/tst.multitrace.sh
> @@ -8,7 +8,6 @@
>  # Test multiple simultaneous tracers, invoked successively (so there
>  # are multiple dtracers and multiple processes tracing the same probes).
>  #
> -# @@xfail: something up with multiple simultaneous exiting tracers
>  if [ $# != 1 ]; then
>  	echo expected one argument: '<'dtrace-path'>'
>  	exit 2
> @@ -44,7 +43,7 @@ main(int argc, char **argv)
>  {
>  	size_t i;
>  
> -	sleep(5);
> +	sleep(10);
>  	for (i = 0; i < 5; i++) {
>  		if (TEST_MULTITRACE_GO_ENABLED())
>  			TEST_MULTITRACE_GO();
> @@ -73,7 +72,7 @@ if [ $? -ne 0 ]; then
>  fi
>  
>  script() {
> -	$dtrace -qws /dev/stdin $1 $2 $3 <<'EOF'
> +	exec $dtrace -qws /dev/stdin $1 $2 $3 <<'EOF'
>  	int fired[pid_t];
>  	int exited[pid_t];
>  
> @@ -115,11 +114,24 @@ script() {
>  		exit(1);
>  	}
>  EOF
> -	echo tracer $3: exited
>  }
>  
>  ./test 1 &
>  ONE=$!
> +
> +# If doing in-tree testing, force dtprobed to reparse its DOF now, as
> +# if re-executed with a newer version of dtprobed with incompatible
> +# parse state.  Overwrite the parsed DOF with crap first, to force
> +# a failure if it simply doesn't reparse at all.
> +if [[ $dtrace != "/usr/sbin/dtrace" ]] && [[ -n $dtprobed_pid ]]; then
> +    sleep 1
> +    for parsed in $DTRACE_OPT_DOFSTASHPATH/stash/dof-pid/*/*/parsed; do
> +        echo 'a' > $parsed
> +    done
> +    kill -USR2 $dtprobed_pid
> +    sleep 1
> +fi
> +
>  ./test 2 0 &
>  TWO=$!
>  
> -- 
> 2.43.0.272.gce700b77fd
> 
> 


