[DTrace-devel] [PATCH 3/3] String args to dt_strcmp() need to be copied if not DPTR

Kris Van Hees kris.van.hees at oracle.com
Thu Oct 6 23:21:19 UTC 2022


On Thu, Oct 06, 2022 at 02:26:47PM -0400, Eugene Loh via DTrace-devel wrote:
> My biggest complaint here is still minor, but this patch should have at
> least one test.

Oh it does.  I just forgot to actually add them to the commit.  Sorry about
that.  Will post a v2 with that added.

> Also...
> 
> On 10/6/22 12:21, Kris Van Hees via DTrace-devel wrote:
> > The dt_strcmp() implementation makes use of indirect load instructions
> > to retrieve characters from either string argument.  That only works for
> > DTrace-managed storage pointers.  So, any argument that is not marked
> > DT_NF_DPTR must be copied into a tstring before the indirect loads are
> > done.
> > 
> > Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
> > ---
> >   bpf/strcmp.S      | 48 ++++++++++++++++++++++++++++++++++++-----------
> >   libdtrace/dt_cg.c | 11 +++++++++++
> >   2 files changed, 48 insertions(+), 11 deletions(-)
> > 
> > diff --git a/bpf/strcmp.S b/bpf/strcmp.S
> > index 75177469..bdfbb169 100644
> > --- a/bpf/strcmp.S
> > +++ b/bpf/strcmp.S
> > @@ -58,7 +58,7 @@ dt_strcmp_not :
> >   	.size	dt_strcmp_not, .-dt_strcmp_not
> >   /*
> > - * int dt_strcmp(char *s, char *t, char *tmp1, char *tmp2) {
> > + * int dt_strcmp(char *s, char *t, char *tmp1, char *tmp2, uint64_t flags) {
> >    *
> >    *     [%fp-8]=s
> >    *     [%fp-16]=t
> 
> These comments describe where we're spilling/saving the arguments. Now that
> flags is introduced, it should be added to those comments.

Sure.

> > @@ -86,6 +86,8 @@ dt_strcmp_not :
> >    *     // based on this location, judge if the strings are >, <, or ==
> >    *     if (r0 > r6) goto Lsame;
> >    *     if (r0 > r7) goto Lsame;
> > + *     if (flags & 1) { bpf_probe_read_str(tmp1, STRSZ, s); s = tmp1; }
> > + *     if (flags & 2) { bpf_probe_read_str(tmp2, STRSZ, t); t = tmp2; }
> 
> That's a potentially big copy for just needing a single char.  Maybe that's
> okay since work done by helper functions is arguably "cheap" in some sense
> (especially complexity seen by the BPF verifier).  So, I guess okay?

Hm, good point.  And since we only ever need that one character I could just do
a read of size 1.  Other than that, there is no other way to do this.  BPF
will not allow a load from an arbitrary address.

v2 on its way (and it will address the rest as well, though with changed code
as well).

> >    *     if (s[r0] > t[r0]) return +1;
> >    *     if (s[r0] < t[r0]) return +1;
> >    *
> 
> Comments should reflect that we've saved r0 on r8 and are now using r8. 
> That would make these comments consistent with the actual code that follows,
> as well as the comments that accompany that code.
> 
> > @@ -105,6 +107,7 @@ dt_strcmp :
> >   	stxdw	[%fp+-16], %r2		/* Spill t */
> >   	stxdw	[%fp+-24], %r3		/* Spill tmp1 */
> >   	stxdw	[%fp+-32], %r4		/* Spill tmp2 */
> > +	mov	%r9, %r5		/* flags */
> >   	lddw	%r8, STRSZ		/* r8 = STRSZ */
> > @@ -157,21 +160,44 @@ dt_strcmp :
> >   	jgt	%r0, %r6, .Lsame	/* if (r0 > r6) goto Lsame */
> >   	jgt	%r0, %r8, .Lsame	/* if (r0 > r8) goto Lsame */
> > +	mov	%r8, %r0
> > -	ldxdw	%r4, [%fp+-8]
> > -	add	%r4, %r0
> > -	ldxb	%r4, [%r4+0]		/* s[r0] */
> > -	and	%r4, 0xff
> > +	jset	%r9, 1, .Ls_is_dptr
> > +	ldxdw	%r6, [%fp+-24]		/* s = tmp1 */
> 
> The pseudocode/comment you gave earlier put s=tmp1 *after* the copy.  Might
> as well use the same order here for better consistency between comments and
> code.
> 
> > -	ldxdw	%r5, [%fp+-16]
> > -	add	%r5, %r0
> > -	ldxb	%r5, [%r5+0]		/* t[r0] */
> > -	and	%r5, 0xff
> > +	mov	%r1, %r6
> > +	lddw	%r2, STRSZ
> > +	ldxdw	%r3, [%fp+-8]
> > +	call	BPF_FUNC_probe_read_str	/* bpf_probe_read_str(tmp1, r8, s) */
> s/r8/STRSZ/ in comment
> 
> > +	ja	.Lis_t_dptr
> > +.Ls_is_dptr:
> > +	ldxdw	%r6, [%fp+-8]		/* Load s */
> > +
> > +.Lis_t_dptr:
> > +	jset	%r9, 2, .Lt_is_dptr
> > +	ldxdw	%r7, [%fp+-32]		/* t = tmp2 */
> 
> Same comment as for s=tmp1.
> 
> > +	mov	%r1, %r7
> > +	lddw	%r2, STRSZ
> > +	ldxdw	%r3, [%fp+-16]
> > +	call	BPF_FUNC_probe_read_str	/* bpf_probe_read_str(tmp2, r8, t) */
> 
> s/r8/STRSZ/ in comment
> 
> > +	ja	.Lflags_done
> > +.Lt_is_dptr:
> > +	ldxdw	%r7, [%fp+-16]		/* Load t */
> > +.Lflags_done:
> 
> I think these two code sequences would be simpler if each one were
> rearranged.  Currently:
>         jset Ls_is_ptr
>         [...tricky case...]
>         ja Lis_t_ptr
>     Ls_it_ptr:
>         r7 = s
>     Lis_t_ptr:
> How about:
>         r7 = s
>         jset Lsdone
>         [...tricky case...]
>     Lsdone:
> This reduces the number of labels (for the two sequences) from four to two,
> making the code easier to follow IMHO.  The meaning of flags would have to
> be flipped, but that's easy enough.  There is no change in instructions
> listed or executed.
> 
> > +
> > +	add	%r6, %r8
> > +	ldxb	%r6, [%r6+0]		/* s[r8] */
> > +	and	%r6, 0xff
> > +
> > +	add	%r7, %r8
> > +	ldxb	%r7, [%r7+0]		/* t[r8] */
> > +	and	%r7, 0xff
> > -	jle	%r4, %r5, 2		/* if (s[r0] > t[r0]) return +1 */
> > +	jle	%r6, %r7, 2		/* if (s[r8] > t[r8]) return +1 */
> >   	mov	%r0, 1
> >   	exit
> > -	jge	%r4, %r5, 2		/* if (s[r0] < t[r0]) return +1 */
> > +	jge	%r6, %r7, 2		/* if (s[r8] < t[r8]) return +1 */
> >   	mov	%r0, -1
> >   	exit
> > diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> > index c4e49129..62456268 100644
> > --- a/libdtrace/dt_cg.c
> > +++ b/libdtrace/dt_cg.c
> > @@ -3294,12 +3294,22 @@ dt_cg_compare_op(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp, uint_t op)
> >   	    dt_node_is_string(dnp->dn_right)) {
> >   		dt_ident_t *idp;
> >   		uint_t Lcomp = dt_irlist_label(dlp);
> > +		uint32_t flags = 0;
> >   		uint64_t off1, off2;
> >   		/* if either pointer is NULL, just compare pointers */
> >   		emit(dlp,  BPF_BRANCH_IMM(BPF_JEQ, dnp->dn_left->dn_reg, 0, Lcomp));
> >   		emit(dlp,  BPF_BRANCH_IMM(BPF_JEQ, dnp->dn_right->dn_reg, 0, Lcomp));
> > +		/*
> > +		 * Set flags to indicate which arguments (if any) are pointers
> > +		 * to DTrace-managed storage.
> > +		 */
> > +		if (dnp->dn_left->dn_flags & DT_NF_DPTR)
> > +			flags |= 1;
> > +		if (dnp->dn_right->dn_flags & DT_NF_DPTR)
> > +			flags |= 2;
> > +
> >   		/*
> >   		 * Otherwise, replace pointers with relative string values.
> >   		 * Specifically, replace left with strcmp(left,right)+1 and
> 
> "Otherwise"?  Ah, there was an "if", but now the "if" and "Otherwise" are
> separated by new intervening code.  Maybe just drop the "Otherwise" at this
> point.
> 
> > @@ -3318,6 +3328,7 @@ dt_cg_compare_op(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp, uint_t op)
> >   		emit(dlp,  BPF_MOV_REG(BPF_REG_4, BPF_REG_3));
> >   		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, off1));
> >   		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, off2));
> > +		emit(dlp,  BPF_MOV_IMM(BPF_REG_5, flags));
> >   		idp = dt_dlib_get_func(yypcb->pcb_hdl, "dt_strcmp");
> >   		assert(idp != NULL);
> >   		dt_regset_xalloc(drp, BPF_REG_0);
> 
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list