[DTrace-devel] [PATCH] Add support for strtok() subroutine

Kris Van Hees kris.van.hees at oracle.com
Fri Nov 19 05:24:03 UTC 2021


On Thu, Nov 18, 2021 at 08:15:04PM -0500, Eugene Loh via DTrace-devel wrote:
> Thanks for the preliminary feedback.
> 
> I've incorporated some of your suggestions.
> 
> I've also made an unrelated change.  The issue is what strtok(NULL, ...)
> means.  I understood it to mean that the string pointer is NULL, which is
> apparently not exactly right.  E.g., strtok(x, ...), where x is a NULL
> pointer, does not qualify. Rather, the first argument should statically be
> NULL.  It is not a run-time condition, which was how I had treated it.
> 
> Finally, I had some comments or questions on other feedback.
> 
> On 11/13/21 12:22 AM, Kris Van Hees wrote:
> 
> > On Mon, Nov 08, 2021 at 02:50:44PM -0500, eugene.loh at oracle.com wrote:
> > > From: Eugene Loh <eugene.loh at oracle.com>
> > > 
> > > Note that the space for tstrings used to allocate an extra tstring at
> > > the end in order to pacify the BPF verifier.  Use this extra space
> > The "used to allocate" makes no sense here because we still do that, and
> > your patch does not change that either.  Just say that the space for tstrings
> > allocates space for an extra string to pacify the BPF verifier.
> 
> I'm having trouble following this.  Pacifying the verifier was what we
> used to do.  With this patch, we need the extra space, which can now
> no longer overlap the stack scratch space.  Putting the strtok() space
> at the end eliminates the need for the pacifying.
> 
> Now, maybe you're saying we still pacify the verifier, but we're also
> using that space for strtok(), but that's not quite right.  E.g., if the
> stack scratch space were very long, the BPF verifier would already
> be pacified, but we would still need to add extra strtok() space.
> 
> Further, we're only adding space for the last tstring.  Each tstring's
> use was already being handled by the following tstring.  We didn't
> allocate space for 8 tstrings; we simply made the 4 tstrings contiguous.
> The one exception was the last tstring.
> 
> Clearly, this is a somewhat nuanced point.  I do not follow what point
> you are making or how to address it.

My point was simply that grammatically, your comment is confusing.  You state
that we 'used to allocate' something and then you mention that you 'use this
extra space'.  The use of the past tense would imply that we no longer allocate
an extra tstring which would also mean that you cannot use that extra space
because you just implied it is no longer being allocated.

There is no need to mention what was done or not done - just state what is
being done with the introduction of this patch and leave it at that.  The
patch itself will make it clear how that differs from what we were doing
before.

> > > for the internal storage needed by strtok() between calls.  Since a
> > > stack() call might occur between strtok() calls, this space can no
> > > longer overlap the stack scratch space.
> > > Compute the tstring length once -- in dt_cg_tstring_reset -- so that
> > > its value does not need to be maintained in multiple places.
> > This should be done differently and in a separate patch.  Really, the only
> > things that needs to be done is put the string size value outside of the loop.
> > There should not be any need to store it in dtp (more on that below).
> > 
> > > diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
> > > @@ -298,21 +298,19 @@ dt_bpf_gmap_create(dtrace_hdl_t *dtp)
> > >   	 *	- maximum trace buffer record size, rounded up to the
> > >   	 *	  multiple of 8
> > >   	 *	- the greater of:
> > > -	 *		+ the maximum stack trace size
> > > -	 *		+ four times the maximum string size (incl. length
> > > -	 *		  and allowing round up to multiple of 8)
> > > -	 *		  plus the maximum string size (to accomodate the BPF
> > > -	 *		  verifier)
> > > +	 *		- the maximum stack trace size
> > > +	 *		- the size for all tstrings
> > > +	 *	- internal state for strtok()
> > > +	 *		- the offset index
> > > +	 *		- the maximum string size
> > > +	 *		- a terminating NULL char
> > >   	 */
> > >   	memsz = roundup(sizeof(dt_mstate_t), 8) +
> > >   		8 +
> > >   		roundup(dtp->dt_maxreclen, 8) +
> > >   		MAX(sizeof(uint64_t) * dtp->dt_options[DTRACEOPT_MAXFRAMES],
> > > -		    DT_TSTRING_SLOTS *
> > > -			roundup(DT_STRLEN_BYTES +
> > > -				dtp->dt_options[DTRACEOPT_STRSIZE], 8) +
> > > -		    dtp->dt_options[DTRACEOPT_STRSIZE] + 1
> > > -		);
> > > +		    DT_TSTRING_SLOTS * dtp->dt_tstringlen) +
> > I don't think a dtp->dt_tstringlen is warranted.  And if it were, it should be
> > dtp->dt_strsize or something like that because tstrings are simply string
> > values.  The size is not specific to tstrings.
> 
> A tstring might be used differently than a string.  In many cases,
> I use tstrings as working space that needs to be a multiple of 8.
> Normal strings don't have that.

A tstring is meant to be a string.  It is a temporary string.  You may use it
differently but that doesn't change what it is :)  If we are enforcing that
the storage for tstrings needs to be a multiple of 8, then it would make sense
to do the same for strings because they really ought to be interchangable.

> > > +		sizeof(uint64_t) + dtp->dt_options[DTRACEOPT_STRSIZE] + 1;
> > So, the temporary source string storage
> 
> Well, the strtok() internal storage.
> 
> > does not need to be rounded up to the
> > nearest multiple of 8?
> 
> Right.
> 
> > And no need to accomodate the trailing NUL byte?  Why?
> 
> ?  It does have the trailing byte.  Hence the +1.

I must have somehow overlooked that.  Sorry.

> > > diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> > > @@ -154,6 +154,11 @@ dt_cg_tramp_prologue_act(dt_pcb_t *pcb, dt_activity_t act)
> > >   	emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, roundup(dtp->dt_maxreclen, 8)));
> > >   	emit(dlp,  BPF_STORE(BPF_DW, BPF_REG_FP, DCTX_FP(DCTX_MEM), BPF_REG_0));
> > > +	/* store -1 ("uninitialized") in the offset prefix of strtok storage */
> > > +	emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, dtp->dt_tstrtok));
> > > +	emit(dlp,  BPF_MOV_IMM(BPF_REG_1, -1));
> > > +	emit(dlp,  BPF_STORE(BPF_DW, BPF_REG_0, 0, BPF_REG_1));
> > It might require an extra instruction, but I think I would like it more if a
> > strtok state pointer were to be added to the mstate instead of having this
> > hardcoded offset approach in the code generator.
> 
> This code is gone.  The "state" (whether strtok internal storage is
> initialized) is now statically known.  It's known at cg time.  It is no
> longer set or read with BPF instructions.

Are you sure you can know this at CG time?  What if you have something like:

	trace(id == 1 ? strtok(x, del) : x);

If id == 1, you will have an initialized strtok state.  If id != 1, you will
not have an initialized state.  A subsequent strtok(NULL, del) would be an
issue then because you do not know whether there is an initialized state.

Also, what if the clause that has the initial strtok() has a predicate that
evaluates as false, and thus the strtok() is not executed, but following
clauses do not have a predicate (or one that is true), and if those have a
strtok(NULL,, ...) it will not have a valid state.

How do you statically determine this?  It is based on runtime behaviour.

> > >   	/*
> > >   	 * Store pointer to BPF map "name" in the DTrace context field "fld" at
> > >   	 * "offset".
> > > @@ -807,6 +812,16 @@ dt_cg_tstring_reset(dtrace_hdl_t *dtp)
> > >   	int		i;
> > >   	dt_tstring_t	*ts;
> > > +	/*
> > > +	 * A tstring might be used to hold a string, along with its length
> > > +	 * prefix.  Alternatively, there may be a terminating NULL char.
> > > +	 *
> > > +	 * Some functions also process strings 8 chars at a time for
> > > +	 * greater efficiency.  So round up to a multiple of 8.
> > > +	 */
> > > +	dtp->dt_tstringlen = roundup(MAX(1, DT_STRLEN_BYTES) +
> > > +				     dtp->dt_options[DTRACEOPT_STRSIZE], 8);
> > This is wrong,  There is no alternatively...  Strings have a size prefix, their
> > character bytes, and a terminating NUL byte.
> 
> Good to know.  It always seemed to me there was some confusing
> ambivalence about whether they included room for a NUL byte beyond
> STRSIZE.  I'm glad there is no such ambiguity.
> 
> > Anyway, as mentioned above, I am
> > posting a patch to put this size calculation outside the loop, but not as a
> > dtp member.  There may be a case to be made for a dtp->dt_strsize since we do
> > use the DT_STRLEN_BYTES + dtp->dt_options[DTRACEOPT_STRSIZE] + 1 value in
> > various places.  All probably should be rounded up to the nearest multiple of 8
> > since some of your functions want to go through the string in blocks of 8 bytes.
> 
> tstrings, for certain usage, need to be multiples of 8, but those usages
> do not require DT_STRLEN_BYTES prefixes.  Normal strings do not (IIUC)
> have the multiple-of-8 requirement.
> 
> But overallocating normal strings so that they can all have the same size
> is fine by me.  I'll go with your dt_strsize suggestion.  This size appears
> to
> be used three times (determining the tstring offsets, computing memsz,
> and computing the offset for the strtok internal state);  the expression for
> this size strikes me as sufficiently complex as to warrant such
> consolidation.
> > > +
> > >   	if (dtp->dt_tstrings == NULL) {
> > >   		dtp->dt_tstrings = dt_calloc(dtp, DT_TSTRING_SLOTS,
> > >   					    sizeof(dt_tstring_t));
> > > @@ -815,9 +830,7 @@ dt_cg_tstring_reset(dtrace_hdl_t *dtp)
> > >   		ts = dtp->dt_tstrings;
> > >   		for (i = 0; i < DT_TSTRING_SLOTS; i++, ts++)
> > > -			ts->offset = i *
> > > -				roundup(DT_STRLEN_BYTES +
> > > -					dtp->dt_options[DTRACEOPT_STRSIZE], 8);
> > > +			ts->offset = i * dtp->dt_tstringlen;
> > >   	}
> > >   	ts = dtp->dt_tstrings;
> > > @@ -3772,6 +3785,99 @@ dt_cg_subr_strstr(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp)
> > >   	TRACE_REGSET("    subr-strstr:End  ");
> > >   }
> > > +static void
> > > +dt_cg_subr_strtok(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp)
> > > +{
> > > +	dtrace_hdl_t	*dtp = yypcb->pcb_hdl;
> > > +	dt_node_t	*s1 = dnp->dn_args;
> > > +	dt_node_t	*s2 = s1->dn_list;
> > > +	dt_ident_t	*idp = dt_dlib_get_func(yypcb->pcb_hdl, "dt_strtok");
> > > +	uint64_t	off;
> > > +	uint_t		Lhave_str = dt_irlist_label(dlp);
> > > +	int		reg;
> > > +
> > > +	assert(idp != NULL);
> > > +
> > > +	TRACE_REGSET("    subr-strtok:Begin");
> > > +
> > > +	/* get string and copy internally if not NULL */
> > > +	/* (the 8-byte prefix is the offset, which we initialize to 0) */
> > > +	dt_cg_node(s1, dlp, drp);
> > > +	emit(dlp,  BPF_BRANCH_IMM(BPF_JEQ, s1->dn_reg, 0, Lhave_str));
> > > +	if (dt_regset_xalloc_args(drp) == -1)
> > > +		longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
> > > +	emit(dlp,  BPF_LOAD(BPF_DW, BPF_REG_1, BPF_REG_FP, DT_STK_DCTX));
> > > +	emit(dlp,  BPF_LOAD(BPF_DW, BPF_REG_1, BPF_REG_1, DCTX_MEM));
> > > +	emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, dtp->dt_tstrtok));
> > See earlier comment about using mstate (similar to the legacy version).
> 
> You mean, every time the trampoline prologue is generated, we write BPF
> instructions to record an offset to strtok internal state even though that
> offset "never changes"?
> 
> Or maybe the confusion is that the earlier mstate reference came up with
> regards to code dealing with strtok state.  I stuffed that as a stowaway in
> the state but, again, it turns out actually to be static state and I've
> changed that accordingly.

OK.  But perhaps you might then as well eliminate dtp->dt_tstrtok and make
that a macro that depends on dtp->dt_strsize because the value you need here
is derived from dtp->dt_strsize and dtp->dt_options[DTRACEOPT_MAXFRAMES]
without any other variables.  No need to add a member to dtp for that.

> > > +	emit(dlp,  BPF_STORE_IMM(BPF_DW, BPF_REG_1, 0, 0));
> > > +	emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8));
> > > +	emit(dlp,  BPF_MOV_IMM(BPF_REG_2, dtp->dt_options[DTRACEOPT_STRSIZE] + 1));
> > > +	emit(dlp,  BPF_MOV_REG(BPF_REG_3, s1->dn_reg));
> > > +	dt_regset_free(drp, s1->dn_reg);
> > > +	if (s1->dn_tstring)
> > > +		dt_cg_tstring_free(yypcb, s1);
> > > +	emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, DT_STRLEN_BYTES));
> > > +	dt_regset_xalloc(drp, BPF_REG_0);
> > > +	emit(dlp,  BPF_CALL_HELPER(BPF_FUNC_probe_read_str));
> > > @@ -5676,17 +5782,21 @@ dt_cg(dt_pcb_t *pcb, dt_node_t *dnp)
> > >   {
> > >   	dt_xlator_t	*dxp = NULL;
> > >   	dt_node_t	*act;
> > > +	dtrace_hdl_t	*dtp = pcb->pcb_hdl;
> > >   	if (pcb->pcb_regs == NULL) {
> > >   		pcb->pcb_regs = dt_regset_create(
> > > -					pcb->pcb_hdl->dt_conf.dtc_difintregs,
> > > +					dtp->dt_conf.dtc_difintregs,
> > >   					dt_cg_spill_store, dt_cg_spill_load);
> > >   		if (pcb->pcb_regs == NULL)
> > >   			longjmp(pcb->pcb_jmpbuf, EDT_NOMEM);
> > >   	}
> > >   	dt_regset_reset(pcb->pcb_regs);
> > > -	dt_cg_tstring_reset(pcb->pcb_hdl);
> > > +	dt_cg_tstring_reset(dtp);
> > > +
> > > +	dtp->dt_tstrtok = MAX(DT_TSTRING_SLOTS * dtp->dt_tstringlen,
> > > +	    sizeof(uint64_t) * dtp->dt_options[DTRACEOPT_MAXFRAMES]);
> > This should not be done for every dt_cg invocation.  It should be done once for
> > all compilations in a single dtrace invocation..

See above about making this a macro.

> > 
> > >   	dt_irlist_destroy(&pcb->pcb_ir);
> > >   	dt_irlist_create(&pcb->pcb_ir);
> > > diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
> > > @@ -296,6 +296,8 @@ struct dtrace_hdl {
> > >   	uint_t dt_maxreclen;	/* largest record size across programs */
> > >   	uint_t dt_maxlvaralloc;	/* largest lvar alloc across pcbs */
> > >   	dt_tstring_t *dt_tstrings; /* temporary string slots */
> > > +	size_t dt_tstringlen;	/* length of a tstring */
> > Not needed.
> > 
> > > +	uint64_t dt_tstrtok;	/* offset to internal state for strtok() calls */
> > I think it would make more sense to put the pointer to strtok data in mstate.
> 
> I'm fuzzy on the organization here.  Do you mean in mstate?  Or in dctx? 
> E.g., just as dctx has pointers to mstate, buf, and mem, you mean add one
> for strtok state as well?

See above - just get rid of dt_tstrtok and use a macro that gives the offset.
Sure, that means it gets calculated a few times but that is hardly an issue.

> > >   	dt_list_t dt_modlist;	/* linked list of dt_module_t's */
> > >   	dt_module_t **dt_mods;	/* hash table of dt_module_t's */
> > >   	uint_t dt_modbuckets;	/* number of module hash buckets */
> 
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list