[DTrace-devel] [PATCH 2/3] Introduce temporary string space for string manipulation functions

Eugene Loh eugene.loh at oracle.com
Thu Sep 2 22:25:25 PDT 2021


On 9/3/21 1:04 AM, Kris Van Hees wrote:

> On Thu, Sep 02, 2021 at 02:22:19PM -0400, Eugene Loh wrote:
>> On 9/2/21 2:41 AM, Kris Van Hees wrote:
>>> String functions that have a string as return value need to store that
>>> string in a temporary location so it can be used in following code.
>>> We need at most 4 temporary strings to support nesting string functions
>>> in an expression.
>> How do you know 4?
> The most amount of stirng arguments to any function is 2, and if the result
> is a string as well, that means we may have 3 temporary strings in use.  But
> since string functions can be nested, we can (at most) end up with one tstring
> (possibly from a nested function) along wth a nested function being processed
> and needing 2 tstrings for its arguments and 1 for its result.  That makes for
> a total of 4.
Such an explanation would be good to have in the code.
>>> diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
>>>    	 */
>>>    	memsz = roundup(sizeof(dt_mstate_t), 8) +
>>>    		8 +
>>>    		roundup(dtp->dt_maxreclen, 8) +
>>>    		MAX(sizeof(uint64_t) * dtp->dt_options[DTRACEOPT_MAXFRAMES],
>>> -		    3 * dtp->dt_options[DTRACEOPT_STRSIZE]);
>>> +		    DT_TSTRING_SLOTS *
>>> +			(DT_STRLEN_BYTES + dtp->dt_options[DTRACEOPT_STRSIZE] +
>>> +		    dtp->dt_options[DTRACEOPT_STRSIZE]
>>> +		));
>> Looks like STRSIZE is in there twice?  Do you want space for a
>> terminating NULL?
> That extra STRSIZE is to deal with the BPF verifier not understanding that two
> values can be strictly related.  E.g. if A + B = STRSIZE, and we copy A chars
> from s1 and B chars from s2, we know that we will never exceed the string size
> limit, yet the BPF verifier will consider A and B independent and with a range
> of [0 .. STRSIZE].  Extending the space by STRSIZE bytes ensures the verifier
> does not complain.
Okay.  An explanation in a comment would be great since the way it is 
just looks like a typo.
>
>>>    	if (create_gmap(dtp, "mem", BPF_MAP_TYPE_PERCPU_ARRAY,
>>>    			sizeof(uint32_t), memsz, 1) == -1)
>>>    		return -1;		/* dt_errno is set for us */
>>> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
>>> +
>>> +/*
>>> + * Initialize the temporary string offsets and mark all not in use.
>>> + */
>>> +static void
>>> +dt_cg_tstring_init(dtrace_hdl_t *dtp)
>>> +{
>>> +	int		i;
>>> +	dt_tstring_t	*ts = dt_cg_tstrings;
>>> +
>>> +	for (i = 0; i < ARRAY_SIZE(dt_cg_tstrings); i++, ts++) {
>>> +		ts->offset = i * (DT_STRLEN_BYTES +
>>> +				  dtp->dt_options[DTRACEOPT_STRSIZE]);
>>> +		ts->in_use = 0;
>>> +	}
>>> +}
>> Here and elsewhere, it would seem that DT_TSTRING_SLOTS would be
>> preferable over ARRAY_SIZE(dt_cg_tstrings).  The former is more concise
>> and direct.  Also, it will work even if one were to allocate
>> dt_cg_tstrings dynamically.
>>
>> Should the offsets be computed with space for terminating NULL chars?
> Possibly, though the code base has never been very good at being consistent
> with whether the maximum string size includes the NUL byte or not.
It seems that that needs to be cleaned up at some point.  E.g., 
strsize=256 should mean a string of 256 chars should work.
> Whether
> to use ARRAY_SIZE or DT_TSTIRNG_SLOTS doesn't matter much if the allocation is
> static.  With dynamic allocation it needs to be the constant of course.
So, DT_TSTRING_SLOTS would be better... and more concise.



More information about the DTrace-devel mailing list