[DTrace-devel] [PATCH] Optimize dt_cg_store_val() for string values

Kris Van Hees kris.van.hees at oracle.com
Mon Nov 15 19:45:26 UTC 2021


On Mon, Nov 15, 2021 at 02:18:13PM -0500, Eugene Loh via DTrace-devel wrote:
> My main question is whether the simplified handling of the string-length
> prefix handles the prefix properly.  E.g.
> 
>     $ cat s.d
>     #pragma D option rawbytes
>     BEGIN {
> trace("abcdefghijklmopnqrstuvwxyzABCDEFGHIJKLMOPNQRSTUVWXYZ");
>       exit(0);
>     }
>     $ sudo dtrace -xstrsize=64 -s s.d
>     dtrace: script 's.d' matched 1 probe
>     CPU     ID                    FUNCTION:NAME
>       1      1                           :BEGIN
>              0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f 0123456789abcdef
>          0: 00 34 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6f .4abcdefghijklmo
>         10: 70 6e 71 72 73 74 75 76 77 78 79 7a 41 42 43 44 pnqrstuvwxyzABCD
>         20: 45 46 47 48 49 4a 4b 4c 4d 4f 50 4e 51 52 53 54 EFGHIJKLMOPNQRST
>         30: 55 56 57 58 59 5a 00 00 00 00 00 00 00 00 00 00 UVWXYZ..........
>         40: 00 00 00                                         ...
> 
>     $ sudo dtrace -xstrsize=8 -s s.d
>     dtrace: script 's.d' matched 1 probe
>     CPU     ID                    FUNCTION:NAME
>       2      1                           :BEGIN
>              0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f 0123456789abcdef
>          0: 00 34 61 62 63 64 65 66 67 68 00 .4abcdefgh.
> 
> In the first run, the string has 52=0x34 chars and we see that in the
> output.  In the second run, even though the string is truncated to 8 chars,
> the prefix is still 0x34.

That is a different bug - I am working on that.  It has to do with the fact
that the strtab needs to be able to store strings that are longer than strsize
but D scripts should not "see" that.

> I guess my second question is whether we're going to get rid of the
> string-length prefix altogether.  If the answer is either "yes" or "maybe,"
> I think it'd make sense to do that first before worrying about optimizing
> such code.

Given that it is not done yet and given that we have not made a final decision
on that yet, we should at least move forward with fixing bugs with the current
state of things.

> Also...
> 
> On 11/15/21 12:31 PM, Kris Van Hees via DTrace-devel wrote:
> > Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
> > ---
> >   libdtrace/dt_cg.c | 48 +++++++++++++----------------------------------
> >   1 file changed, 13 insertions(+), 35 deletions(-)
> > 
> > diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> > index 455e5440..930355f8 100644
> > --- a/libdtrace/dt_cg.c
> > +++ b/libdtrace/dt_cg.c
> > @@ -958,9 +958,6 @@ dt_cg_store_val(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind,
> >   		return 0;
> >   	} else if (dt_node_is_string(dnp)) {
> > -		uint_t		size_ok = dt_irlist_label(dlp);
> > -		int		reg;
> > -
> >   		dt_cg_check_notnull(dlp, drp, dnp->dn_reg);
> >   		TRACE_REGSET("store_val(): Begin ");
> > @@ -968,49 +965,30 @@ dt_cg_store_val(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind,
> >   				 size + DT_STRLEN_BYTES + 1, 1, pfp, arg);
> >   		/*
> > -		 * Retrieve the length of the string, limit it to the maximum
> > -		 * string size, and store it in the buffer at [%r9 + off].
> > +		 * Copy the length of the string from the source string as a
> > +		 * half-word (2 bytes) into the buffer at [%r9 + off].
> >   		 */
> > -		reg = dt_regset_alloc(drp);
> > -		if (reg == -1)
> > -			longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
> > -		dt_cg_strlen(dlp, drp, reg, dnp->dn_reg);
> > -		dt_regset_xalloc(drp, BPF_REG_0);
> > -		emit(dlp,  BPF_BRANCH_IMM(BPF_JLT, reg, size, size_ok));
> 
> FWIW (not that anyone cares about code that is working and isn't likely to
> survive in any case), that jump might as well be JLE.

Good point and actually it does matter (and this code is likely to survive
anyway because the verifier probably needs it).

> > -		emit(dlp,  BPF_MOV_IMM(reg, size));
> > -		emitl(dlp, size_ok,
> > -			   BPF_MOV_REG(BPF_REG_0, reg));
> > -		emit(dlp,  BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 8));
> > -		emit(dlp,  BPF_STORE(BPF_B, BPF_REG_9, off, BPF_REG_0));
> > -		emit(dlp,  BPF_STORE(BPF_B, BPF_REG_9, off + 1, reg));
> > -		dt_regset_free(drp, BPF_REG_0);
> > +		emit(dlp, BPF_LOAD(BPF_H, BPF_REG_0, dnp->dn_reg, 0));
> > +		emit(dlp, BPF_STORE(BPF_H, BPF_REG_9, off, BPF_REG_0));
> 
> I know I've asked this multiple times before (and you've replied) but the
> use of %r0 without allocating it still confuses me. Specifically, let's say
> that that's a safe practice.  Then why, only a few lines later (and wherever
> we make function calls) do we xalloc/free %r0?

No function call here - but function call a few lines later.  We use %r0 here
as a temp in code that we know does not make a function call so we have total
control over its use.  When we face a function call, we reserve %r0 because it
will be clobbered for a possible return value (even if there isn't one).

> >   		/*
> > -		 * Copy the string to the output buffer at
> > -		 * [%r9 + off + DT_STRLEN_BYTES] because we need to skip the
> > -		 * length prefix.
> > +		 * Copy the string data (no more than STRSIZE + 1 bytes) to the
> > +		 * buffer at [%r9 + off + DT_STRLEN_BYTES].  We (ab)use the
> 
> In what sense is this an abuse?  This seems to be totally the sort of thing
> probe_read_str() is for.  I suggest s/(ab)use/use/.

We are (on purpose) giving a max bytecount that is likely to be larger than we
can actually read, and we are depending on probe_read_str() to "save us" by
stopping once we hit the NUL byte.  To me (given that I could actuallt retrieve
the length and probably should - but I don't because this is more efficient),
this is a potential abuse although it is making use of a convenient feature.
Hence the parantheses.  YMMV

> > +		 * fact that probe_read_str) stops at the terminating NUL byte.
> >   		 */
> >   		if (dt_regset_xalloc_args(drp) == -1)
> >   			longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
> > -		emit(dlp,  BPF_MOV_REG(BPF_REG_1, BPF_REG_9));
> > -		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, off + DT_STRLEN_BYTES));
> > -		emit(dlp,  BPF_MOV_REG(BPF_REG_2, reg));
> > -		dt_regset_free(drp, reg);
> 
> Incidentally, as far as the pre-existing code goes... free(reg)? Looking a
> few lines ahead...

Bugs in pre-existing code?  How remarkable :)

> > -		emit(dlp,  BPF_MOV_REG(BPF_REG_3, dnp->dn_reg));
> > +		emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_9));
> > +		emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, off + DT_STRLEN_BYTES));
> > +		emit(dlp, BPF_MOV_IMM(BPF_REG_2, size + 1));
> > +		emit(dlp, BPF_MOV_REG(BPF_REG_3, dnp->dn_reg));
> >   		dt_regset_free(drp, dnp->dn_reg);
> >   		dt_cg_tstring_free(pcb, dnp);
> > -		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, DT_STRLEN_BYTES));
> > +		emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, DT_STRLEN_BYTES));
> >   		dt_regset_xalloc(drp, BPF_REG_0);
> 
> Here is the xalloc(%r0), even though we've already used %r0 (without
> xalloc'ing it).

See above...

> > -		emit(dlp,  BPF_CALL_HELPER(BPF_FUNC_probe_read));
> > +		emit(dlp, BPF_CALL_HELPER(BPF_FUNC_probe_read_str));
> >   		dt_regset_free_args(drp);
> > -
> > -		/*
> > -		 * Write the NUL terminating byte.
> > -		 */
> > -		emit(dlp,  BPF_MOV_REG(BPF_REG_0, BPF_REG_9));
> > -		emit(dlp,  BPF_ALU64_REG(BPF_ADD, BPF_REG_0, reg));
> 
> Here the pre-existing code uses reg, which it has already freed.
> 
> > -		emit(dlp,  BPF_STORE_IMM(BPF_B, BPF_REG_0, off + DT_STRLEN_BYTES, 0));
> >   		dt_regset_free(drp, BPF_REG_0);
> >   		TRACE_REGSET("store_val(): End   ");
> 
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list