[DTrace-devel] [PATCH v3 3/3] String args to dt_strcmp() need to be copied if not DPTR

Kris Van Hees kris.van.hees at oracle.com
Tue Oct 11 23:33:58 UTC 2022


On Tue, Oct 11, 2022 at 04:45:39PM -0400, Kris Van Hees via DTrace-devel wrote:
> On Mon, Oct 10, 2022 at 03:19:28PM -0400, Eugene Loh via DTrace-devel wrote:
> > Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
> 
> Thanks.
> 
> > though:
> > 
> > *)  I still think one can just use tmp2 directly without needing to check
> > flags and branch (unpleasant for the BPF verifier) simply to access a single
> > byte in t.  As you point out, tmp2 gets overwritten, but that overwriting is
> > easily avoided by doing probe_read(tmp1, ..., tmp1).
> 
> Yes, but I wasn't touching the existing implementation since that was not
> in the scope of this patch.  Based on your feedback, I'll happily make that
> change.

Actually we cannot do this because the verifier will fail to validate the code
as a result of execname potentially not being a valid pointer value (in the
theoretical case that we are unable to access the current task info).  By
having the copying explicitly covered for both strings, the verifier accepts
the code as valid.  And the impact applies to the case where the first string
is a kernel string, which I think is acceptable for now.

A future patch will have to do more work in ensuring that we handle failure
cases in a way that does not cause the BPF verifier to choke.  But for now,
I think it is best to at least ensure that this works - even if we incur a
tiny performance impact in some cases.

> > *)  For that matter, might also make sense to skip the flag check for s as
> > well.  Just always use the helper instead in order to reduce branching. 
> > That philosophy is already clearly present in strcmp() and is what made
> > strcmp() possible.
> 
> I'd rather not because the reason you copy the string earlier in the function
> is mostly because you need to have scratch copies to modify, which is valid,
> but quite different from the need here in case of kernel strings.  DTrace
> managed storage should be accessible through direct and indirect loads and by
> actually using those when possible we enable the BPF verifier to provide better
> validation of the BPF code we generate.  That is a benefit.  Also, a helper
> call *is* more expensive than a regular instruction.
> 
> > *)  The test naming is odd in a couple of respects:
> > 
> >     *)  execname-cmp-bvar?  But execname is a bvar.  How about
> > execname-cmp-prbname?
> 
> The point is that execname is a special bvar.  The bvar we compare with is not
> relevant in terms of exercising the functionality.
> 
> >     *)  execname-cmp-bvar_[lhs|rhs]?  So lhs and rhs refer to what?  How
> > about instead execname-cmp-bvar and bvar-cmp-execname and skip the lhs/rhs
> > labels.
> 
> I think the test itself shows clearly which is which.  I don't expect the name
> to be perfectly self explanatory because for many tests that is simply not
> possible.  I do tend to name tests so that the main thing being tested
> (execname, in this case) is listed first.
> 
> > *)  How about more thorough comparisons and results testing?  If the tests
> > are set up well, adding lots of checks (< <= dtrace dtracd dtracf etc.) is
> > just a matter of adding a few lines, and, in light of all the mistakes that
> > can come up, checking against a results file might be really useful.  If you
> > like, I can post some tests.
> 
> I certainly encourage submitting additional tests for the strcmp() function.
> But this particular patch does not have to do with the actual correctness of
> the strcmp() implementation but rather with the fact that the code fails to
> pass the BPF verifier when kernel strings are used.  I.e. a code generation
> issue, which is why the tests are in unittest/codegen.
> > 
> > On 10/8/22 13:19, Kris Van Hees via DTrace-devel wrote:
> > > The dt_strcmp() implementation makes use of indirect load instructions
> > > to retrieve characters from either string argument.  That only works for
> > > DTrace-managed storage pointers.  So, any argument that is not marked
> > > DT_NF_DPTR must be copied into a tstring before the indirect loads are
> > > done.
> > > 
> > > Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
> > > Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
> > > ---
> > >   bpf/strcmp.S                                  | 52 ++++++++++++++-----
> > >   libdtrace/dt_cg.c                             | 13 ++++-
> > >   .../codegen/tst.execname-cmp-bvar_lhs.d       | 19 +++++++
> > >   .../codegen/tst.execname-cmp-bvar_rhs.d       | 19 +++++++
> > >   .../codegen/tst.execname-cmp-const_lhs.d      | 19 +++++++
> > >   .../codegen/tst.execname-cmp-const_rhs.d      | 19 +++++++
> > >   .../codegen/tst.kernel_str-cmp-bvar_lhs.d     | 19 +++++++
> > >   .../codegen/tst.kernel_str-cmp-bvar_rhs.d     | 19 +++++++
> > >   .../codegen/tst.kernel_str-cmp-const_lhs.d    | 19 +++++++
> > >   .../codegen/tst.kernel_str-cmp-const_rhs.d    | 19 +++++++
> > >   10 files changed, 202 insertions(+), 15 deletions(-)
> > >   create mode 100644 test/unittest/codegen/tst.execname-cmp-bvar_lhs.d
> > >   create mode 100644 test/unittest/codegen/tst.execname-cmp-bvar_rhs.d
> > >   create mode 100644 test/unittest/codegen/tst.execname-cmp-const_lhs.d
> > >   create mode 100644 test/unittest/codegen/tst.execname-cmp-const_rhs.d
> > >   create mode 100644 test/unittest/codegen/tst.kernel_str-cmp-bvar_lhs.d
> > >   create mode 100644 test/unittest/codegen/tst.kernel_str-cmp-bvar_rhs.d
> > >   create mode 100644 test/unittest/codegen/tst.kernel_str-cmp-const_lhs.d
> > >   create mode 100644 test/unittest/codegen/tst.kernel_str-cmp-const_rhs.d
> > > 
> > > diff --git a/bpf/strcmp.S b/bpf/strcmp.S
> > > index 75177469..49d1e64a 100644
> > > --- a/bpf/strcmp.S
> > > +++ b/bpf/strcmp.S
> > > @@ -58,12 +58,13 @@ dt_strcmp_not :
> > >   	.size	dt_strcmp_not, .-dt_strcmp_not
> > >   /*
> > > - * int dt_strcmp(char *s, char *t, char *tmp1, char *tmp2) {
> > > + * int dt_strcmp(char *s, char *t, char *tmp1, char *tmp2, uint64_t flags) {
> > >    *
> > >    *     [%fp-8]=s
> > >    *     [%fp-16]=t
> > >    *     [%fp-24]=tmp1
> > >    *     [%fp-32]=tmp2
> > > + *     r9=flags
> > >    *
> > >    *     r8 = STRSZ
> > >    *
> > > @@ -86,8 +87,14 @@ dt_strcmp_not :
> > >    *     // based on this location, judge if the strings are >, <, or ==
> > >    *     if (r0 > r6) goto Lsame;
> > >    *     if (r0 > r7) goto Lsame;
> > > - *     if (s[r0] > t[r0]) return +1;
> > > - *     if (s[r0] < t[r0]) return +1;
> > > + *
> > > + *     s += r0;
> > > + *     t += r0;
> > > + *
> > > + *     if (flags & 1) { } else { bpf_probe_read(tmp1, 1, s); s = tmp1; }
> > > + *     if (flags & 2) { } else { bpf_probe_read(tmp2, 1, t); t = tmp2; }
> > > + *     if (s[0] > t[0]) return +1;
> > > + *     return -1;
> > >    *
> > >    *     // if all chars are the same, break tie on string length
> > >    * Lsame:
> > > @@ -105,6 +112,7 @@ dt_strcmp :
> > >   	stxdw	[%fp+-16], %r2		/* Spill t */
> > >   	stxdw	[%fp+-24], %r3		/* Spill tmp1 */
> > >   	stxdw	[%fp+-32], %r4		/* Spill tmp2 */
> > > +	mov	%r9, %r5		/* flags */
> > >   	lddw	%r8, STRSZ		/* r8 = STRSZ */
> > > @@ -156,22 +164,38 @@ dt_strcmp :
> > >   	sub	%r0, 1			/* r0 -= 1 */
> > >   	jgt	%r0, %r6, .Lsame	/* if (r0 > r6) goto Lsame */
> > > -	jgt	%r0, %r8, .Lsame	/* if (r0 > r8) goto Lsame */
> > > +	jgt	%r0, %r7, .Lsame	/* if (r0 > r7) goto Lsame */
> > > +
> > > +	ldxdw	%r6, [%fp+-8]
> > > +	add	%r6, %r0		/* s += r0 */
> > > +	ldxdw	%r7, [%fp+-16]
> > > +	add	%r7, %r0		/* t += r0 */
> > > +
> > > +	jset	%r9, 1, .Ls_is_dptr
> > > +	ldxdw	%r1, [%fp+-24]
> > > +	mov	%r2, 1
> > > +	mov	%r3, %r6
> > > +	mov	%r6, %r1		/* s = tmp1 (after the helper call) */
> > > +	call	BPF_FUNC_probe_read	/* bpf_probe_read(tmp1, 1, s) */
> > > +.Ls_is_dptr:
> > > +
> > > +	jset	%r9, 2, .Lt_is_dptr
> > > +	ldxdw	%r1, [%fp+-32]
> > > +	mov	%r2, 1
> > > +	mov	%r3, %r7
> > > +	mov	%r7, %r1		/* t = tmp2 (after the helper call) */
> > > +	call	BPF_FUNC_probe_read	/* bpf_probe_read(tmp2, 1, t) */
> > > +.Lt_is_dptr:
> > > -	ldxdw	%r4, [%fp+-8]
> > > -	add	%r4, %r0
> > > -	ldxb	%r4, [%r4+0]		/* s[r0] */
> > > -	and	%r4, 0xff
> > > +	ldxb	%r6, [%r6+0]		/* s[0] */
> > > +	and	%r6, 0xff
> > > -	ldxdw	%r5, [%fp+-16]
> > > -	add	%r5, %r0
> > > -	ldxb	%r5, [%r5+0]		/* t[r0] */
> > > -	and	%r5, 0xff
> > > +	ldxb	%r7, [%r7+0]		/* t[0] */
> > > +	and	%r7, 0xff
> > > -	jle	%r4, %r5, 2		/* if (s[r0] > t[r0]) return +1 */
> > > +	jle	%r6, %r7, 2		/* if (s[0] > t[0]) return +1 */
> > >   	mov	%r0, 1
> > >   	exit
> > > -	jge	%r4, %r5, 2		/* if (s[r0] < t[r0]) return +1 */
> > >   	mov	%r0, -1
> > >   	exit
> > > diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> > > index c4e49129..ef639161 100644
> > > --- a/libdtrace/dt_cg.c
> > > +++ b/libdtrace/dt_cg.c
> > > @@ -3294,6 +3294,7 @@ dt_cg_compare_op(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp, uint_t op)
> > >   	    dt_node_is_string(dnp->dn_right)) {
> > >   		dt_ident_t *idp;
> > >   		uint_t Lcomp = dt_irlist_label(dlp);
> > > +		uint32_t flags = 0;
> > >   		uint64_t off1, off2;
> > >   		/* if either pointer is NULL, just compare pointers */
> > > @@ -3301,7 +3302,16 @@ dt_cg_compare_op(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp, uint_t op)
> > >   		emit(dlp,  BPF_BRANCH_IMM(BPF_JEQ, dnp->dn_right->dn_reg, 0, Lcomp));
> > >   		/*
> > > -		 * Otherwise, replace pointers with relative string values.
> > > +		 * Set flags to indicate which arguments (if any) are pointers
> > > +		 * to DTrace-managed storage.
> > > +		 */
> > > +		if (dnp->dn_left->dn_flags & DT_NF_DPTR)
> > > +			flags |= 1;
> > > +		if (dnp->dn_right->dn_flags & DT_NF_DPTR)
> > > +			flags |= 2;
> > > +
> > > +		/*
> > > +		 * Replace pointers with relative string values.
> > >   		 * Specifically, replace left with strcmp(left,right)+1 and
> > >   		 * right with 1, so we can use unsigned comparisons.
> > >   		 */
> > > @@ -3318,6 +3328,7 @@ dt_cg_compare_op(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp, uint_t op)
> > >   		emit(dlp,  BPF_MOV_REG(BPF_REG_4, BPF_REG_3));
> > >   		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, off1));
> > >   		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, off2));
> > > +		emit(dlp,  BPF_MOV_IMM(BPF_REG_5, flags));
> > >   		idp = dt_dlib_get_func(yypcb->pcb_hdl, "dt_strcmp");
> > >   		assert(idp != NULL);
> > >   		dt_regset_xalloc(drp, BPF_REG_0);
> > > diff --git a/test/unittest/codegen/tst.execname-cmp-bvar_lhs.d b/test/unittest/codegen/tst.execname-cmp-bvar_lhs.d
> > > new file mode 100644
> > > index 00000000..d36bbf07
> > > --- /dev/null
> > > +++ b/test/unittest/codegen/tst.execname-cmp-bvar_lhs.d
> > > @@ -0,0 +1,19 @@
> > > +/*
> > > + * Oracle Linux DTrace.
> > > + * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
> > > + * Licensed under the Universal Permissive License v 1.0 as shown at
> > > + * http://oss.oracle.com/licenses/upl.
> > > + */
> > > +
> > > +#pragma D option quiet
> > > +
> > > +BEGIN
> > > +{
> > > +	trace(probename == execname);
> > > +	exit(0);
> > > +}
> > > +
> > > +ERROR
> > > +{
> > > +	exit(1);
> > > +}
> > > diff --git a/test/unittest/codegen/tst.execname-cmp-bvar_rhs.d b/test/unittest/codegen/tst.execname-cmp-bvar_rhs.d
> > > new file mode 100644
> > > index 00000000..dcdf3507
> > > --- /dev/null
> > > +++ b/test/unittest/codegen/tst.execname-cmp-bvar_rhs.d
> > > @@ -0,0 +1,19 @@
> > > +/*
> > > + * Oracle Linux DTrace.
> > > + * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
> > > + * Licensed under the Universal Permissive License v 1.0 as shown at
> > > + * http://oss.oracle.com/licenses/upl.
> > > + */
> > > +
> > > +#pragma D option quiet
> > > +
> > > +BEGIN
> > > +{
> > > +	trace(execname == probename);
> > > +	exit(0);
> > > +}
> > > +
> > > +ERROR
> > > +{
> > > +	exit(1);
> > > +}
> > > diff --git a/test/unittest/codegen/tst.execname-cmp-const_lhs.d b/test/unittest/codegen/tst.execname-cmp-const_lhs.d
> > > new file mode 100644
> > > index 00000000..63e8d987
> > > --- /dev/null
> > > +++ b/test/unittest/codegen/tst.execname-cmp-const_lhs.d
> > > @@ -0,0 +1,19 @@
> > > +/*
> > > + * Oracle Linux DTrace.
> > > + * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
> > > + * Licensed under the Universal Permissive License v 1.0 as shown at
> > > + * http://oss.oracle.com/licenses/upl.
> > > + */
> > > +
> > > +#pragma D option quiet
> > > +
> > > +BEGIN
> > > +{
> > > +	trace("dtrace" == execname);
> > > +	exit(0);
> > > +}
> > > +
> > > +ERROR
> > > +{
> > > +	exit(1);
> > > +}
> > > diff --git a/test/unittest/codegen/tst.execname-cmp-const_rhs.d b/test/unittest/codegen/tst.execname-cmp-const_rhs.d
> > > new file mode 100644
> > > index 00000000..e256f3a9
> > > --- /dev/null
> > > +++ b/test/unittest/codegen/tst.execname-cmp-const_rhs.d
> > > @@ -0,0 +1,19 @@
> > > +/*
> > > + * Oracle Linux DTrace.
> > > + * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
> > > + * Licensed under the Universal Permissive License v 1.0 as shown at
> > > + * http://oss.oracle.com/licenses/upl.
> > > + */
> > > +
> > > +#pragma D option quiet
> > > +
> > > +BEGIN
> > > +{
> > > +	trace(execname == "dtrace");
> > > +	exit(0);
> > > +}
> > > +
> > > +ERROR
> > > +{
> > > +	exit(1);
> > > +}
> > > diff --git a/test/unittest/codegen/tst.kernel_str-cmp-bvar_lhs.d b/test/unittest/codegen/tst.kernel_str-cmp-bvar_lhs.d
> > > new file mode 100644
> > > index 00000000..ef95815a
> > > --- /dev/null
> > > +++ b/test/unittest/codegen/tst.kernel_str-cmp-bvar_lhs.d
> > > @@ -0,0 +1,19 @@
> > > +/*
> > > + * Oracle Linux DTrace.
> > > + * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
> > > + * Licensed under the Universal Permissive License v 1.0 as shown at
> > > + * http://oss.oracle.com/licenses/upl.
> > > + */
> > > +
> > > +#pragma D option quiet
> > > +
> > > +BEGIN
> > > +{
> > > +	trace(probename == curthread->comm);
> > > +	exit(0);
> > > +}
> > > +
> > > +ERROR
> > > +{
> > > +	exit(1);
> > > +}
> > > diff --git a/test/unittest/codegen/tst.kernel_str-cmp-bvar_rhs.d b/test/unittest/codegen/tst.kernel_str-cmp-bvar_rhs.d
> > > new file mode 100644
> > > index 00000000..1dd9a4ed
> > > --- /dev/null
> > > +++ b/test/unittest/codegen/tst.kernel_str-cmp-bvar_rhs.d
> > > @@ -0,0 +1,19 @@
> > > +/*
> > > + * Oracle Linux DTrace.
> > > + * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
> > > + * Licensed under the Universal Permissive License v 1.0 as shown at
> > > + * http://oss.oracle.com/licenses/upl.
> > > + */
> > > +
> > > +#pragma D option quiet
> > > +
> > > +BEGIN
> > > +{
> > > +	trace(curthread->comm == probename);
> > > +	exit(0);
> > > +}
> > > +
> > > +ERROR
> > > +{
> > > +	exit(1);
> > > +}
> > > diff --git a/test/unittest/codegen/tst.kernel_str-cmp-const_lhs.d b/test/unittest/codegen/tst.kernel_str-cmp-const_lhs.d
> > > new file mode 100644
> > > index 00000000..e877966e
> > > --- /dev/null
> > > +++ b/test/unittest/codegen/tst.kernel_str-cmp-const_lhs.d
> > > @@ -0,0 +1,19 @@
> > > +/*
> > > + * Oracle Linux DTrace.
> > > + * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
> > > + * Licensed under the Universal Permissive License v 1.0 as shown at
> > > + * http://oss.oracle.com/licenses/upl.
> > > + */
> > > +
> > > +#pragma D option quiet
> > > +
> > > +BEGIN
> > > +{
> > > +	trace("dtrace" == curthread->comm);
> > > +	exit(0);
> > > +}
> > > +
> > > +ERROR
> > > +{
> > > +	exit(1);
> > > +}
> > > diff --git a/test/unittest/codegen/tst.kernel_str-cmp-const_rhs.d b/test/unittest/codegen/tst.kernel_str-cmp-const_rhs.d
> > > new file mode 100644
> > > index 00000000..baebe17f
> > > --- /dev/null
> > > +++ b/test/unittest/codegen/tst.kernel_str-cmp-const_rhs.d
> > > @@ -0,0 +1,19 @@
> > > +/*
> > > + * Oracle Linux DTrace.
> > > + * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
> > > + * Licensed under the Universal Permissive License v 1.0 as shown at
> > > + * http://oss.oracle.com/licenses/upl.
> > > + */
> > > +
> > > +#pragma D option quiet
> > > +
> > > +BEGIN
> > > +{
> > > +	trace(curthread->comm == "dtrace");
> > > +	exit(0);
> > > +}
> > > +
> > > +ERROR
> > > +{
> > > +	exit(1);
> > > +}
> > 
> > _______________________________________________
> > DTrace-devel mailing list
> > DTrace-devel at oss.oracle.com
> > https://oss.oracle.com/mailman/listinfo/dtrace-devel
> 
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list