[DTrace-devel] [PATCH] Optimize dt_cg_store_val() for string values

Kris Van Hees kris.van.hees at oracle.com
Mon Nov 15 21:02:08 UTC 2021


On Mon, Nov 15, 2021 at 03:41:56PM -0500, Eugene Loh via DTrace-devel wrote:
> On 11/15/21 2:45 PM, Kris Van Hees wrote:
> 
> > On Mon, Nov 15, 2021 at 02:18:13PM -0500, Eugene Loh via DTrace-devel wrote:
> > > My main question is whether the simplified handling of the string-length
> > > prefix handles the prefix properly.  E.g.
> > > 
> > >      $ cat s.d
> > >      #pragma D option rawbytes
> > >      BEGIN {
> > > trace("abcdefghijklmopnqrstuvwxyzABCDEFGHIJKLMOPNQRSTUVWXYZ");
> > >        exit(0);
> > >      }
> > >      $ sudo dtrace -xstrsize=64 -s s.d
> > >      dtrace: script 's.d' matched 1 probe
> > >      CPU     ID                    FUNCTION:NAME
> > >        1      1                           :BEGIN
> > >               0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f 0123456789abcdef
> > >           0: 00 34 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6f .4abcdefghijklmo
> > >          10: 70 6e 71 72 73 74 75 76 77 78 79 7a 41 42 43 44 pnqrstuvwxyzABCD
> > >          20: 45 46 47 48 49 4a 4b 4c 4d 4f 50 4e 51 52 53 54 EFGHIJKLMOPNQRST
> > >          30: 55 56 57 58 59 5a 00 00 00 00 00 00 00 00 00 00 UVWXYZ..........
> > >          40: 00 00 00                                         ...
> > > 
> > >      $ sudo dtrace -xstrsize=8 -s s.d
> > >      dtrace: script 's.d' matched 1 probe
> > >      CPU     ID                    FUNCTION:NAME
> > >        2      1                           :BEGIN
> > >               0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f 0123456789abcdef
> > >           0: 00 34 61 62 63 64 65 66 67 68 00 .4abcdefgh.
> > > 
> > > In the first run, the string has 52=0x34 chars and we see that in the
> > > output.  In the second run, even though the string is truncated to 8 chars,
> > > the prefix is still 0x34.
> > That is a different bug - I am working on that.  It has to do with the fact
> > that the strtab needs to be able to store strings that are longer than strsize
> > but D scripts should not "see" that.
> 
> I'm confused.  This bug is introduced by the proposed patch.  What is the
> plan, for example, for tests like test/unittest/codegen/tst.str_*.d ? 
> Should this patch xfail or skip such tests?

The proposed patch causes a bug in other code to become noticable.  Which is
why I am pairing it with a patch (that I am finishing up right now) that will
fix that bug.  This patch is an optimization, not a bug fix.

> > > I guess my second question is whether we're going to get rid of the
> > > string-length prefix altogether.  If the answer is either "yes" or "maybe,"
> > > I think it'd make sense to do that first before worrying about optimizing
> > > such code.
> > Given that it is not done yet and given that we have not made a final decision
> > on that yet, we should at least move forward with fixing bugs with the current
> > state of things.
> 
> Is there a test that demonstrates which bug is supposed to be fixed with
> this patch?

Again, this patch optimizes code.  It is not a bug fix.  So there is no need
for a test.

> > 
> > > Also...
> > > 
> > > On 11/15/21 12:31 PM, Kris Van Hees via DTrace-devel wrote:
> > > > Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
> > > > ---
> > > >    libdtrace/dt_cg.c | 48 +++++++++++++----------------------------------
> > > >    1 file changed, 13 insertions(+), 35 deletions(-)
> > > > 
> > > > diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> > > > index 455e5440..930355f8 100644
> > > > --- a/libdtrace/dt_cg.c
> > > > +++ b/libdtrace/dt_cg.c
> > > > @@ -958,9 +958,6 @@ dt_cg_store_val(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind,
> > > >    		return 0;
> > > >    	} else if (dt_node_is_string(dnp)) {
> > > > -		uint_t		size_ok = dt_irlist_label(dlp);
> > > > -		int		reg;
> > > > -
> > > >    		dt_cg_check_notnull(dlp, drp, dnp->dn_reg);
> > > >    		TRACE_REGSET("store_val(): Begin ");
> > > > @@ -968,49 +965,30 @@ dt_cg_store_val(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind,
> > > >    				 size + DT_STRLEN_BYTES + 1, 1, pfp, arg);
> > > >    		/*
> > > > -		 * Retrieve the length of the string, limit it to the maximum
> > > > -		 * string size, and store it in the buffer at [%r9 + off].
> > > > +		 * Copy the length of the string from the source string as a
> > > > +		 * half-word (2 bytes) into the buffer at [%r9 + off].
> > > >    		 */
> > > > -		reg = dt_regset_alloc(drp);
> > > > -		if (reg == -1)
> > > > -			longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
> > > > -		dt_cg_strlen(dlp, drp, reg, dnp->dn_reg);
> > > > -		dt_regset_xalloc(drp, BPF_REG_0);
> > > > -		emit(dlp,  BPF_BRANCH_IMM(BPF_JLT, reg, size, size_ok));
> > > FWIW (not that anyone cares about code that is working and isn't likely to
> > > survive in any case), that jump might as well be JLE.
> > Good point and actually it does matter (and this code is likely to survive
> > anyway because the verifier probably needs it).
> 
> I'm again confused.  If the verifier needs it, then why does the patch
> remove it?

My mistake - I didn't realize you were commenting on code being removed.  I
would highly recommend not doing that because it is pointless and confusing.
If the code that was there was wrong or not optimal, that hardly matters
anymore when we are removing it.

> > > > -		emit(dlp,  BPF_MOV_IMM(reg, size));
> > > > -		emitl(dlp, size_ok,
> > > > -			   BPF_MOV_REG(BPF_REG_0, reg));
> > > > -		emit(dlp,  BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 8));
> > > > -		emit(dlp,  BPF_STORE(BPF_B, BPF_REG_9, off, BPF_REG_0));
> > > > -		emit(dlp,  BPF_STORE(BPF_B, BPF_REG_9, off + 1, reg));
> > > > -		dt_regset_free(drp, BPF_REG_0);
> > > > +		emit(dlp, BPF_LOAD(BPF_H, BPF_REG_0, dnp->dn_reg, 0));
> > > > +		emit(dlp, BPF_STORE(BPF_H, BPF_REG_9, off, BPF_REG_0));
> > > I know I've asked this multiple times before (and you've replied) but the
> > > use of %r0 without allocating it still confuses me. Specifically, let's say
> > > that that's a safe practice.  Then why, only a few lines later (and wherever
> > > we make function calls) do we xalloc/free %r0?
> > No function call here - but function call a few lines later.  We use %r0 here
> > as a temp in code that we know does not make a function call so we have total
> > control over its use.  When we face a function call, we reserve %r0 because it
> > will be clobbered for a possible return value (even if there isn't one).
> 
> I simply do not understand.  Both uses modify %r0's contents.  In either
> case, that's not a problem if %r0 is not already in use.  If we know %r0 is
> not in use and can be modified, we should be able to make a function call
> without having to xalloc/free %r0.

The difference is that in the first case the code generator has complete
control over the life of %r0 and its value.  In the second case (function call)
we do not have any control over its value because that will be set by the
function that we called.  In other words, due to the function call we need to
consider %r0 as no longer predictable, and we either use it with the value that
the function assigned, or we should mark it free, and then we could possibly
use it ourselves for other purposes.

> > > >    		/*
> > > > -		 * Copy the string to the output buffer at
> > > > -		 * [%r9 + off + DT_STRLEN_BYTES] because we need to skip the
> > > > -		 * length prefix.
> > > > +		 * Copy the string data (no more than STRSIZE + 1 bytes) to the
> > > > +		 * buffer at [%r9 + off + DT_STRLEN_BYTES].  We (ab)use the
> > > In what sense is this an abuse?  This seems to be totally the sort of thing
> > > probe_read_str() is for.  I suggest s/(ab)use/use/.
> > We are (on purpose) giving a max bytecount that is likely to be larger than we
> > can actually read, and we are depending on probe_read_str() to "save us" by
> > stopping once we hit the NUL byte.
> 
> That seemed to be what probe_read_str() was for, but I guess it doesn't
> matter.
> 
> > To me (given that I could actuallt retrieve
> > the length and probably should - but I don't because this is more efficient),
> > this is a potential abuse although it is making use of a convenient feature.
> > Hence the parantheses.  YMMV
> > 
> > > > +		 * fact that probe_read_str) stops at the terminating NUL byte.
> > > >    		 */
> > > >    		if (dt_regset_xalloc_args(drp) == -1)
> > > >    			longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
> > > > -		emit(dlp,  BPF_MOV_REG(BPF_REG_1, BPF_REG_9));
> > > > -		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, off + DT_STRLEN_BYTES));
> > > > -		emit(dlp,  BPF_MOV_REG(BPF_REG_2, reg));
> > > > -		dt_regset_free(drp, reg);
> > > Incidentally, as far as the pre-existing code goes... free(reg)? Looking a
> > > few lines ahead...
> > Bugs in pre-existing code?  How remarkable :)
> > 
> > > > -		emit(dlp,  BPF_MOV_REG(BPF_REG_3, dnp->dn_reg));
> > > > +		emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_9));
> > > > +		emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, off + DT_STRLEN_BYTES));
> > > > +		emit(dlp, BPF_MOV_IMM(BPF_REG_2, size + 1));
> > > > +		emit(dlp, BPF_MOV_REG(BPF_REG_3, dnp->dn_reg));
> > > >    		dt_regset_free(drp, dnp->dn_reg);
> > > >    		dt_cg_tstring_free(pcb, dnp);
> > > > -		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, DT_STRLEN_BYTES));
> > > > +		emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, DT_STRLEN_BYTES));
> > > >    		dt_regset_xalloc(drp, BPF_REG_0);
> > > Here is the xalloc(%r0), even though we've already used %r0 (without
> > > xalloc'ing it).
> > See above...
> > 
> > > > -		emit(dlp,  BPF_CALL_HELPER(BPF_FUNC_probe_read));
> > > > +		emit(dlp, BPF_CALL_HELPER(BPF_FUNC_probe_read_str));
> > > >    		dt_regset_free_args(drp);
> > > > -
> > > > -		/*
> > > > -		 * Write the NUL terminating byte.
> > > > -		 */
> > > > -		emit(dlp,  BPF_MOV_REG(BPF_REG_0, BPF_REG_9));
> > > > -		emit(dlp,  BPF_ALU64_REG(BPF_ADD, BPF_REG_0, reg));
> > > Here the pre-existing code uses reg, which it has already freed.
> > > 
> > > > -		emit(dlp,  BPF_STORE_IMM(BPF_B, BPF_REG_0, off + DT_STRLEN_BYTES, 0));
> > > >    		dt_regset_free(drp, BPF_REG_0);
> > > >    		TRACE_REGSET("store_val(): End   ");
> > > _______________________________________________
> > > DTrace-devel mailing list
> > > DTrace-devel at oss.oracle.com
> > > https://oss.oracle.com/mailman/listinfo/dtrace-devel
> 
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list