[DTrace-devel] [PATCH 08/19] Support multiple overlying probes in the uprobe trampoline

Kris Van Hees kris.van.hees at oracle.com
Thu Oct 24 02:42:34 UTC 2024


On Thu, Aug 29, 2024 at 01:25:47AM -0400, eugene.loh at oracle.com wrote:
> From: Eugene Loh <eugene.loh at oracle.com>
> 
> An underlying probe could support all sorts of overlying probes:
>   - pid entry
>   - pid return
>   - pid offset
>   - USDT
>   - USDT is-enabled
> 
> The overlying probes belonging to an underlying probe match the
> underlying probe -- except possibly in pid.  So, an underlying
> probe loops over its overlying probes, looking for a pid match.
> 
> The trampoline would look for only one match.
> 
> However, more than one overlying probe might match.  Therefore,
> change the loop to keep going even after a match has been found.
> 
> Incidentally, it is actually only pid offset probes that could
> "collide" with any other overlying probes for a given pid:
> 
> -)  pid return probes are implemented with uretprobes
>     and so cannot "collide" with any other probes
> 
> -)  no two USDT probes -- is-enabled or not -- can map
>     to the same underlying probe for any pid
> 
> -)  no USDT probe -- is-enabled or not -- can map to
>     to the same underlying probe as a pid entry
> 
> So, possibly one could optimize the trampoline -- e.g., by adding
> BPF code to exit once two matches have been made.
> 
> Incidentally, there is a small error in the patch.  The flag we
> pass in to dt_cg_tramp_copy_args_from_regs() should depend on
> whether the overlying probe is a pid or USDT probe.  We used to
> check PP_IS_FUNCALL, the upp could be for both.  Instead of
> fixing this problem, let us simply pretend it's a pid probe --
> a later patch will move USDT probes to a different mechanism anyhow.

I think it is more clear to just keep the logic that is there for passing the
flag, even if it is not exactly correct.  At least it is not a change from
something that might be wrong to something that is more likely to be wrong.
Either way, you are changing it in a subsequent patch so the change to a
hardcoded 1 is not really helping anything other than making someone perhaps
wonder why you changed it.

Alternatively, you could perhaps change the check for the PP_IS_FUNCALL flag
to checking the prp->prov->impl, and selecting 1 for dt_pid and 0 for dt_usdt?
Yes, you will change that in the subsequent patch, but it would be actually
fixing this to be the correct condition anyway, right?

Summary, wrong -> different-wrong seems silly, wrong -> wrong (keeping it
the same) is reasonable, wrong -> correct is of course preferred.

> 
> Signed-off-by: Eugene Loh <eugene.loh at oracle.com>
> ---
>  libdtrace/dt_prov_uprobe.c          |  53 +++++-------
>  test/unittest/pid/tst.entry_off0.sh | 125 ++++++++++++++++++++++++++++
>  2 files changed, 147 insertions(+), 31 deletions(-)
>  create mode 100755 test/unittest/pid/tst.entry_off0.sh
> 
> diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
> index 70e5c32d..bc5545fb 100644
> --- a/libdtrace/dt_prov_uprobe.c
> +++ b/libdtrace/dt_prov_uprobe.c
> @@ -687,45 +687,23 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
>  	const list_probe_t	*pop;
>  	uint_t			lbl_exit = pcb->pcb_exitlbl;
>  
> -	dt_cg_tramp_prologue(pcb);
> +	dt_cg_tramp_prologue(pcb);                             // call this only once... is PRID set/relocated correctly?

I do not believe this change is needed (adding the comment).  A trampoline can
only ever have a single prologue.

>  
>  	/*
>  	 * After the dt_cg_tramp_prologue() call, we have:
>  	 *				//     (%r7 = dctx->mst)
>  	 *				//     (%r8 = dctx->ctx)
>  	 */
> -
> -	dt_cg_tramp_copy_regs(pcb);
> -	if (upp->flags & PP_IS_RETURN)
> -		dt_cg_tramp_copy_rval_from_regs(pcb);
> -	else
> -		dt_cg_tramp_copy_args_from_regs(pcb,
> -						!(upp->flags & PP_IS_FUNCALL));
> -
> -	/*
> -	 * Retrieve the PID of the process that caused the probe to fire.
> -	 */
> -	emit(dlp,  BPF_CALL_HELPER(BPF_FUNC_get_current_pid_tgid));
> -	emit(dlp,  BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 32));
> +	dt_cg_tramp_copy_regs(pcb);                            // call this only once for all clauses?
>  
>  	/*
> -	 * Generate a composite conditional clause:
> -	 *
> -	 *	if (pid == PID1) {
> -	 *		dctx->mst->prid = PRID1;
> -	 *		< any number of clause calls >
> -	 *		goto exit;
> -	 *	} else if (pid == PID2) {
> -	 *		dctx->mst->prid = PRID2;
> -	 *		< any number of clause calls >
> -	 *		goto exit;
> -	 *	} else if (pid == ...) {
> -	 *		< ... >
> -	 *	}
> +	 * Loop over overlying probes, calling clauses for those that match:
>  	 *
> -	 * It is valid and safe to use %r0 to hold the pid value because there
> -	 * are no assignments to %r0 possible in between the conditional
> -	 * statements.
> +	 *	for overlying probes (that match except possibly for pid)
> +	 *		if (pid matches) {
> +	 *			dctx->mst->prid = PRID1;
> +	 *			< any number of clause calls >
> +	 *		}
>  	 */
>  	for (pop = dt_list_next(&upp->probes); pop != NULL;
>  	     pop = dt_list_next(pop)) {
> @@ -740,6 +718,20 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
>  		idp = dt_dlib_add_probe_var(pcb->pcb_hdl, prp);
>  		assert(idp != NULL);
>  
> +		/*
> +		 * Register copies.  FIXME:  What can be optimized?
> +		 */

I would drop the FIXME.  There is not really much of anything that can be
optimized because we need the register content in case some clause makes use
of regs[].  And arguments need to be copied for argN and args[], unless we
want to implement arg-access using a provider callback during CG, and that
would be an optimization across all providers so definitely not something for
now.  So, no need to have a FIXME.  It's biger than USDT :)

> +		if (upp->flags & PP_IS_RETURN)
> +			dt_cg_tramp_copy_rval_from_regs(pcb);
> +		else
> +			dt_cg_tramp_copy_args_from_regs(pcb, 1);

Per my comment above, I think this should still retain the code that was here
before (well, higher up) to select the correct flag to pass rather than just
hardcoding it as 1 and mentioning it is a bug.  It is easy enough to just keep
what was there as logic, and then change that in a subsequent patch.

Using (prp->prov->impl == &dt_pid ? 1 : 0) would do the trick, right?

> +
> +		/*
> +		 * Retrieve the PID of the process that caused the probe to fire.
> +		 */
> +		emit(dlp,  BPF_CALL_HELPER(BPF_FUNC_get_current_pid_tgid));
> +		emit(dlp,  BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 32));

Wouldn't it be better to retrieve the PID once, and store it in %r6, and then
use it at every branch rather than retrieving the PID for every case?

> +
>  		/*
>  		 * Check whether this pid-provider probe serves the current
>  		 * process, and emit a sequence of clauses for it when it does.
> @@ -747,7 +739,6 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
>  		emit(dlp,  BPF_BRANCH_IMM(BPF_JNE, BPF_REG_0, pid, lbl_next));
>  		emite(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_7, DMST_PRID, prp->desc->id), idp);
>  		dt_cg_tramp_call_clauses(pcb, prp, DT_ACTIVITY_ACTIVE);
> -		emit(dlp,  BPF_JUMP(lbl_exit));
>  		emitl(dlp, lbl_next,
>  			   BPF_NOP());
>  	}
> diff --git a/test/unittest/pid/tst.entry_off0.sh b/test/unittest/pid/tst.entry_off0.sh
> new file mode 100755
> index 00000000..f1a75b6a
> --- /dev/null
> +++ b/test/unittest/pid/tst.entry_off0.sh
> @@ -0,0 +1,125 @@
> +#!/bin/bash
> +#
> +# Oracle Linux DTrace.
> +# Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved.
> +# Licensed under the Universal Permissive License v 1.0 as shown at
> +# http://oss.oracle.com/licenses/upl.
> +#
> +
> +dtrace=$1
> +
> +trig=`pwd`/test/triggers/ustack-tst-basic
> +
> +DIRNAME="$tmpdir/enter_off0.$$.$RANDOM"
> +mkdir -p $DIRNAME
> +cd $DIRNAME
> +
> +# Run DTrace, dumping all probe functions and names, plus PC, in a.out.
> +
> +$dtrace $dt_flags -c $trig -n '
> +pid$target:a.out::
> +{
> +	@[probefunc, probename, uregs[R_PC]] = count();
> +}
> +
> +profile:::tick-1s
> +{
> +	exit(0);
> +}' > D.out
> +if [ $? -ne 0 ]; then
> +	echo ERROR: dtrace
> +	cat D.out
> +	exit 1
> +fi
> +
> +# Generate the expected list of functions for our trigger program.
> +
> +echo main > expected.tmp
> +echo mycallee >> expected.tmp
> +for x in 0 1 2 3 4 5 6 7 8 9 \
> +         a b c d e f g h i j k l m n o p q r s t u v w x y z \
> +         A B C D E F G H I J K L M N O P Q R S T U V W X Y Z; do
> +    echo myfunc_$x >> expected.tmp
> +done
> +sort expected.tmp > expected.txt
> +
> +# Check output for probe name "0" or "entry".
> +
> +awk '$2 == "0" || $2 == "entry"' D.out | awk '
> +{
> +	fun = $1;
> +	prb = $2;
> +	PC = $3;
> +	cnt = $4;
> +}
> +
> +# Check that the count is 1.
> +cnt != 1 {
> +	print "ERROR: count is not 1";
> +	print;
> +	exit(1);
> +}
> +
> +# Check that we have not gotten the same (fun,prb) already.
> +prb == "0" && fun in PC0 {
> +	print "ERROR: already have offset 0 for this func";
> +	print;
> +	exit(1);
> +}
> +prb == "entry" && fun in PCentry {
> +	print "ERROR: already have entry for this func";
> +	print;
> +	exit(1);
> +}
> +
> +# Record the PC.
> +prb ==   "0"   { PC0[fun] = PC; }
> +prb == "entry" { PCentry[fun] = PC; }
> +
> +# Do final matchup.
> +END {
> +	# Walk functions for the offset-0 probes.
> +	for (fun in PC0) {
> +		# Make sure each offset-0 probe has a matching entry probe.
> +		if (!(fun in PCentry)) {
> +			print "ERROR: func", fun, "has offset-0 but no entry";
> +			exit(1);
> +		}
> +
> +		# Make sure the matching probes report the same PC.
> +		if (PC0[fun] != PCentry[fun]) {
> +			print "ERROR: func", fun, "has mismatching PCs for offset-0 and entry:", PC0[fun], PCentry[fun];
> +			exit(1);
> +		}
> +
> +		# Dump the function name and delete these entries.
> +		print fun;
> +		delete PC0[fun];
> +		delete PCentry[fun];
> +	}
> +
> +	# Check if there are any leftover entry probes.
> +	for (fun in PCentry) {
> +		print "ERROR: func", fun, "has entry but no offset-0";
> +		exit(1);
> +	}
> +}
> +' | sort > awk.out
> +
> +# Report any problems.
> +
> +if ! diff -q awk.out expected.txt; then
> +	echo ERROR: diff failure
> +	echo ==== function list
> +	cat awk.out
> +	echo ==== expected function list
> +	cat expected.txt
> +	echo ==== diff
> +	diff awk.out expected.txt
> +	echo ==== DTrace output
> +	cat D.out
> +	exit 1
> +fi
> +
> +echo success
> +exit 0
> -- 
> 2.43.5
> 



More information about the DTrace-devel mailing list