[DTrace-devel] [PATCH v2 3/3] String args to dt_strcmp() need to be copied if not DPTR

Eugene Loh eugene.loh at oracle.com
Mon Oct 10 19:05:44 UTC 2022


On 10/8/22 01:07, Kris Van Hees wrote:

> On Sat, Oct 08, 2022 at 12:31:37AM -0400, Eugene Loh via DTrace-devel wrote:
>> Actually, wait a minute.  Does this patch pass testing?  The new tests check
>> against BPF verifier failures, but are the comparisons being performed
>> correctly?  Do older tests like operators/tst.str_comparison*.d pass?
>> (str_comparison-basic.d has been failing for a while due to NULL-pointer
>> support, but that's a different issue.)  Comments below:
>>
>> On 10/6/22 20:48, Kris Van Hees via DTrace-devel wrote:
>>> diff --git a/bpf/strcmp.S b/bpf/strcmp.S
>>> @@ -86,8 +87,14 @@ dt_strcmp_not :
>>>     *     // based on this location, judge if the strings are >, <, or ==
>>>     *     if (r0 > r6) goto Lsame;
>>>     *     if (r0 > r7) goto Lsame;
>>> - *     if (s[r0] > t[r0]) return +1;
>>> - *     if (s[r0] < t[r0]) return +1;
>>> +
>>> + *     s += r0;
>>> + *     t += r0;
>>> +
>>> + *     if (flags & 1) { } else { bpf_probe_read(tmp1, 1, s); s = tmp1; }
>>> + *     if (flags & 2) { } else { bpf_probe_read(tmp2, 1, t); t = tmp2; }
>>> + *     if (s[0] > t[0]) return +1;
>>> + *     if (s[0] < t[0]) return +1;
>>>     *
>>>     *     // if all chars are the same, break tie on string length
>>>     * Lsame:
>>> @@ -158,20 +166,37 @@ dt_strcmp :
>>>    	jgt	%r0, %r6, .Lsame	/* if (r0 > r6) goto Lsame */
>>>    	jgt	%r0, %r8, .Lsame	/* if (r0 > r8) goto Lsame */
>> Hmm.  The above two lines predate this patch, but presumably the second line
>> is wrong.  Earlier comments claim we should be checking r0>r7, not r0>r8.
> Pre-existing but obviously wrong so fixing it.
>
>>> -	ldxdw	%r4, [%fp+-8]
>>> -	add	%r4, %r0
>>> -	ldxb	%r4, [%r4+0]		/* s[r0] */
>>> -	and	%r4, 0xff
>>> +	ldxdw	%r6, [%fp+-8]
>>> +	add	%r6, %r8		/* s += r0 */
>>> +	ldxdw	%r7, [%fp+-16]
>>> +	add	%r7, %r8		/* t += r0 */
>> The code is doing +=%r8 but the comments claim +=r0.  I think the comments
>> are right.  The code is wrong.  Presumably the problem is the v1 patch
>> stashed r0 in r8, but that is no longer the case in the v2 patch.
> Yes, I typo'd that.  Fixed.  Thanks for catching that.
>
> Doing too many things at once, I did run the tests but sadly did that without
> first actually recompiling the code (so it ran with older code).  Bah.
>
>> By the way, this patch's fancy handling of t (identification of !DPTR via
>> flags and special code to handle that case) is unnecessary.  We already
>> copied t into tmp2 anyhow, and there it still sits.  So, there is no need
>> for this patch's new code in the t case.  Instead of loading t (%fp+-16)
>> just load tmp2 (%fp+-32). Simple fix.
> How so?  Prior to the jgt branches you do:
>
> 	r0 = bpf_probe_read_str(tmp2, r8, tmp1);
>
> which overwrites tmp2 which is therefore no longer t.

Doh!  Okay, good point.

On the other hand, we don't really care about the copied data... only in 
the length r0.  So, one could s/tmp2/tmp1/ in that copy: we'd copy from 
tmp1 to itself simply to get the length.  I tested it and the strcmp 
tests seem to work fine.

So I think we could employ that optimization if we wanted to.

>>> +	jset	%r9, 1, .Ls_is_dptr
>>> +	ldxdw	%r1, [%fp+-24]
>>> +	mov	%r2, 1
>>> +	mov	%r3, %r6
>>> +	mov	%r6, %r1		/* s = tmp1 (after the helper call) */
>>> +	call	BPF_FUNC_probe_read	/* bpf_probe_read(tmp1, r8, s) */
>>> +.Ls_is_dptr:
>>> +
>>> +	jset	%r9, 2, .Lt_is_dptr
>>> +	ldxdw	%r1, [%fp+-32]
>>> +	mov	%r2, 1
>>> +	mov	%r3, %r7
>>> +	mov	%r7, %r1		/* t = tmp2 (after the helper call) */
>>> +	call	BPF_FUNC_probe_read	/* bpf_probe_read(tmp2, r8, t) */
>>> +.Lt_is_dptr:
>>> +
>>> +	ldxb	%r6, [%r6+0]		/* s[0] */
>>> +	and	%r6, 0xff
>>> -	ldxdw	%r5, [%fp+-16]
>>> -	add	%r5, %r0
>>> -	ldxb	%r5, [%r5+0]		/* t[r0] */
>>> -	and	%r5, 0xff
>>> +	ldxb	%r7, [%r7+0]		/* t[0] */
>>> +	and	%r7, 0xff
>>> -	jle	%r4, %r5, 2		/* if (s[r0] > t[r0]) return +1 */
>>> +	jle	%r6, %r7, 2		/* if (s[0] > t[0]) return +1 */
>>>    	mov	%r0, 1
>>>    	exit
>>> -	jge	%r4, %r5, 2		/* if (s[r0] < t[r0]) return +1 */
>>> +	jge	%r6, %r7, 2		/* if (s[0] < t[0]) return +1 */
>>>    	mov	%r0, -1
>>>    	exit
>> This is also a little weird -- in part due to pre-existing oddities and now
>> even more peculiar due to the patch.  If that last branch (jge) jumps, we
>> end up at .Lsame.  (That code doesn't show up here since the patch doesn't
>> change it.)  In .Lsame, we compare %r6 and %r7, but this patch has just
>> overwritten them.  So that suggests this patch is wrong.
> See, this is why I prefer branches to labels rather than immediate values in
> the source assembly code.  Labels are so much easier to follow.

I can see the value in labeling branches (mainly to guard against 
problems when someone inserts instructions), but here it seems to me to 
be a matter of the logic being a little loose.  E.g., we could replace 
the +2 jump with a jump to .Lsame, but the point remains that we're 
unnecessarily jumping to somewhere where the operands have possibly been 
overwritten.

But maybe I'm missing the point of what you're saying.

>> But, the overwriting might not matter.  (Presumably, r0 is the first byte
>> that does not match.  So one we test s[r0]>t[r0] and fall through, there is
>> no further need for testing.  We presumably know s[r0]!=t[r0].  So the
>> pre-existing code could just have returned +1 at this point, without further
>> branching.)
> Well the comments are wrong in the pre-existing code also because it states
> that s[r0] < t[r0] *and* s[r0] > t[r0] both return +1 which is of course wrong
> (and the code does it right).

Agreed.

> So, after the if (s[0] > t[0]) return +1 we can just do a return -1, because
> we indeed know that s[0] != t[0] at that point.  And therefore there will not
> be a fallthrough case, and the .Lsame comparison of r6 and r7 is fine.

Agreed (though I've made enough mistakes on this already I'm losing 
confidence in my judgment).

> v3 on the way with these changes.

R-b on the way.



More information about the DTrace-devel mailing list