[DTrace-devel] [PATCH] ERROR probe implementation
Eugene Loh
eugene.loh at oracle.com
Fri Jan 29 22:03:04 PST 2021
On 1/29/21 7:27 PM, Eugene Loh wrote:
> On 1/28/21 1:10 AM, Kris Van Hees wrote:
>> On Thu, Jan 28, 2021 at 12:24:24AM -0500, Eugene Loh wrote:
>>
>>> These tests started to XPASS spuriously -- that is, they actually
>>> "fail".
>>> test/unittest/variables/bvar/tst.caller.d
>>> test/unittest/variables/bvar/tst.errno.d
>>> test/unittest/variables/bvar/tst.ipl.d
>>> test/unittest/variables/bvar/tst.ucaller.d
>>> test/unittest/variables/bvar/tst.ustackdepth.d
>>> test/unittest/variables/bvar/tst.vtimestamp.d
>>> test/unittest/variables/bvar/tst.walltimestamp.d
>>> I do not yet understand what is happening here. Note that "caller" is
>>> not yet implemented. Consider:
>>> # dtrace -n 'BEGIN { trace(caller); }'
>>> dtrace: error on enabled probe ID 2 (ID 1: dtrace:::BEGIN):
>>> illegal operation in action #1
>>> dtrace: processing aborted: Invalid library ERROR action
>>> # echo $?
>>> 1
>>> # dtrace -n 'BEGIN { trace(caller); exit(0) }'
>>> dtrace: error on enabled probe ID 2 (ID 1: dtrace:::BEGIN):
>>> illegal operation in action #1
>>> dtrace: processing aborted: Invalid library ERROR action
>>> # echo $?
>>> 1
>>> # dtrace -n 'BEGIN { trace(caller); exit(1) }'
>>> dtrace: error on enabled probe ID 2 (ID 1: dtrace:::BEGIN):
>>> illegal operation in action #1
>>> dtrace: processing aborted: Invalid library ERROR action
>>> # echo $?
>>> 1
>>> # dtrace -n 'BEGIN { trace(caller); exit(caller != -1 ? 0 :
>>> 1); }'
>>> dtrace: error on enabled probe ID 2 (ID 1: dtrace:::BEGIN):
>>> illegal operation in action #1
>>> # echo $?
>>> 0
>>> The return value looks wrong. If you like, I can look into this some,
>>> but I assume this reflects a bug.
>> This is almost certainly a side effect of the consume bug that I
>> mentioned
>> earlier - where BEGIN, ERROR (and END) are all fired before we even
>> start
>> processing data buffers.
>>
>> If you want to take a look at that consume code (dt_consume_begin and
>> related
>> functions), that would be quite useful. Especially for test cases,
>> this is a
>> situation tht pops up quite often.
> The consume problem is an interesting bug -- and I'm trying to educate
> myself about the consumer, especially dt_consume_begin, et al. -- but
> I think the problem I mentioned is something else. E.g.,
>
> # dtrace -n 'BEGIN { i = 1; j = 0 } tick-1000ms { trace(i / j );
> trace(i / j ); exit(12); } tick-1100ms { exit(34); }'
> dtrace: error on enabled probe ID 3 (ID 91650: profile:::tick-1000ms):
> divide-by-zero in action #1
> CPU ID FUNCTION:NAME
> 2 91651 :tick-1100ms
> # echo !?
> 34
> # dtrace -n ' tick-1000ms {
> trace(caller); exit(12); } tick-1100ms { exit(34); }'
> dtrace: error on enabled probe ID 2 (ID 91650: profile:::tick-1000ms):
> illegal operation in action #1
> dtrace: processing aborted: Invalid library ERROR action
> # echo !?
> 1
> # dtrace -n ' tick-1000ms { trace(caller);
> trace(caller); exit(12); } tick-1100ms { exit(34); }'
> dtrace: error on enabled probe ID 2 (ID 91650: profile:::tick-1000ms):
> illegal operation in action #1
> dtrace: error on enabled probe ID 1 (ID 3: dtrace:::ERROR): illegal
> operation in action #1
> dtrace: error on enabled probe ID 1 (ID 3: dtrace:::ERROR): unknown
> fault in action #-1 at DIF offset 12
> # echo !?
> 0
>
> Also, there seem to be some funny records in the perf_event buffer.
> So, I think something is wrong on the producer side with get_bvar.o
> errors. I'll poke around.
I assume the problem is that we call get_bvar() to get caller and then
write that value to the buffer and execute the rest of the clause. I
mean, that's great, that's what's supposed to happen, but if get_bvar()
encountered an error, then we should record the error and abort the
clause. Currently, it seems we do not abort the clause. It'd be nice
to have a doc that says what is expected to happen.
More information about the DTrace-devel
mailing list