[DTrace-devel] [PATCH v2 11/12] Add support for umod(), usym(), and uaddr()

Eugene Loh eugene.loh at oracle.com
Tue Jun 15 12:47:24 PDT 2021


On 6/15/21 9:41 AM, Kris Van Hees wrote:

> On Fri, Jun 11, 2021 at 09:28:24PM -0400, Eugene Loh wrote:
>> On 6/11/21 8:36 PM, Kris Van Hees wrote:
>>
>>> Found something else that also applies to ustack...  See below...
>>>
>>> On Fri, Jun 11, 2021 at 08:17:57PM -0400, eugene.loh at oracle.com wrote:
>>>> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
>>>> @@ -739,13 +740,34 @@ dt_cg_store_val(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind,
>>>> +	if (kind == DTRACEACT_USYM ||
>>>> +	    kind == DTRACEACT_UMOD ||
>>>> +	    kind == DTRACEACT_UADDR) {
>>>> +		off = dt_rec_add(dtp, dt_cg_fill_gap, kind, 16, 8, NULL, arg);
>>>> +
>>>> +		/* preface the value with the user process tgid */
>>>> +		if (dt_regset_xalloc_args(drp) == -1)
>>>> +			longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
>>>> +		dt_regset_xalloc(drp, BPF_REG_0);
>>>> +		emit(dlp,  BPF_CALL_HELPER(BPF_FUNC_get_current_pid_tgid));
>>>> +		dt_regset_free_args(drp);
>>>> +		emit(dlp, BPF_STORE(BPF_W, BPF_REG_9, off, BPF_REG_0));
>>>> +		dt_regset_free(drp, BPF_REG_0);
>>> The tgid is a 32-bit value that you are writing to a 64-bit wide space in the
>>> output buffer using a BPF_W store instruction, so it only touches 4 bytes.
>>> But that means that the other 8 bytes have undefined data in them.  But then
>>> you read that in (dt_consume functions) as a 64-bit value.  Most of the time
>>> that will work, but I can see situations where eventually you may run out of
>>> luck and there will be non-zero data in those other 4 bytes.
>>>
>>> The easiest solution would be to actually read the tgid as a 32-bit wide
>>> integer, ignoring the 4 padding bytes.
>> Thanks for noticing this.  I'm fuzzy here:  there are no endian issues
>> here, are there?
> Good question... I think there are none right now, but that doesn't mean that
> if we were to support other architectures there might not be any.  After all,
> if we write a 4-byte entity and the consumer is reading it as a 8-byte entity
> the 4 bytes may end up at the wrong end of the 8-byte entity.
>
> I think that the cleanest way would be to have 2 data records for these
> actions (a 4-byte one for the tgid and a 8-byte one for the value), and then
> have the consumer retrieve them based on the record descriptions.  That
> ensures that the (out-of-band) metadata has the correct information.

I'm not familiar with a number of aspects of the consumer, but it looks 
like something like dt_print_ustack() expects to be passed a single 
record with a bunch of uint64_t values, the first of which is the tgid 
and the rest frames.  That's what DTv1 did, and so it can't be "too bad" 
(for some definition of that phrase).  To change things would mean a 
bunch of reorganization.  Meanwhile, the "extra" 4 bytes would be extra 
only if the record landed on an addr&7==4 boundary (less than half the 
time? just a guess) since if we're on an addr&7=0 boundary we're going 
to have to pad that extra 4 bytes anyhow.  Further, those 4 bytes are 
amortized over a stack anyhow... nominally 8*100 bytes.  (That 
amortization argument doesn't hold for umod/usym/uaddr, of course.)

Anyhow, the existing (from DTv1) dt_print_u*() functions are expecting a 
single record.  My sense is that changing that would be relatively 
intrusive... not a task for "right now."

What do you think?

>> I suppose another approach might be to tgid&=0xffffffff?
> Yes, but then you are deliberately using a 64-bit value to store a 32-bit
> value, just because the consumer is reading it as a 64-bit value even though
> it *knows* it is supposed to be a 32-bit value.  Might as well just store it
> as 32-bit since we support different size integers in the output stream.
>
>>> Alternatively (but slightly more complex in code) you could actually have two
>>> records for each U* action, one with a 4-byte data item, and one with a 8-byte
>>> data item.  That is supported (since there are actually quite a few actions
>>> that have multiple data items associated with them), but again - slightly more
>>> complex even if it is 'cleaner'.
>>>
>>> Either way, I am pretty certain that you do need to do something about this
>>> 32-bit vs 64-bit data issue in view of the potential for non-zero data in the
>>> padding bytes.
>> That's interesting.  I think I had (and tested) a BPF_DW version of one
>> of these patches, and it was fine.  Presumably, it *definitely* had
>> nonzero bytes in that padding (because it called the BPF helper function
>> and then wrote all 8 bytes, which the consumer then read). The day might
>> have been saved by the consumer calling a function expecting a pid_t,
>> and so the garbage was coerced to 32 bits and one got shamefully lucky.
>> A dangerous way to live.
>>
>>>> +
>>>> +		/* then store the value */
>>>> +		emit(dlp, BPF_STORE(BPF_DW, BPF_REG_9, off + 8, dnp->dn_reg));
>>>> +		dt_regset_free(drp, dnp->dn_reg);
>>>> +
>>>> +		return 0;
>>>> +	}
>> _______________________________________________
>> DTrace-devel mailing list
>> DTrace-devel at oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list