[DTrace-devel] [PATCH v2 02/17] cg: add dt_cg_trace

Nick Alcock nick.alcock at oracle.com
Thu Mar 24 19:57:39 UTC 2022


On 23 Mar 2022, Eugene Loh via DTrace-devel spake thusly:

> Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
> assuming the following prove insignificant...
>
> I did try to build this.  I got:
>     % make clean
>     [...]
>     % make
>     [...]
>     CC: test/utils/badioctl.c
>     LINK: badioctl
>     CC: test/utils/showUSDT.c
>     LINK: showUSDT
>     CC: test/utils/print-stack-layout.c
>     LINK: print-stack-layout
>     % make clean
>     [...]
>     % make debugging=yes
>     [...]
>     GENDOF: test/triggers/usdt-tst-forker-prov.d
>     LINK: usdt-tst-forker
>     GENHDR: test/triggers/usdt-tst-special-prov.d
>     CC: test/triggers/usdt-tst-special.c
>     GENDOF: test/triggers/usdt-tst-special-prov.d
>     dtrace: failed to link script test/triggers/usdt-tst-special-prov.d:
>         an error was encountered while processing dtrace-user/build/test-triggers--usdt-tst-special.o
>     make: *** [Makerules:31: dtrace-user/build/test-triggers--usdt-tst-special-prov.o] Error 1
> I have not tried to troubleshoot this.

It's unrelated, something wrong in dt_link I think.

> Incidentally, I tried running this (ignoring the test make error), but then operations on trace_pipe hung when I went to look at the
> output.  Any idea what I'm doing wrong?

You have to cat the trace_pipe before you start dtrace -- it doesn't
buffer anything, so start it in another terminal.

> Also, the "(counter, value)" interface strikes me as odd.  Since bpf_trace_printfk() allows three arguments (if you count  the right 
> way), how about dt_cg_trace(dlp, drp, isreg_flags, val0, val1, val2) where
>     if (isreg_flags & 1) val0 is a reg; else it's an IMM
>     if (isreg_flags & 2) val1 is a reg; else it's an IMM
>     if (isreg_flags & 4) val2 is a reg; else it's an IMM
> For someone who is interested in fewer values, it's easy enough to feed a 0 value and not set the corresponding isreg bit.

Sure, but I don't see any way to make this *readable*. I mean
do you really think this:

dt_cg_trace(dlp, drp, (1 >> 1) & (1 >> 2), 1, b, c);

(to print the values of regs b and c with the immediate value 1 used as
a call-site counter) or even worse this:

dt_cg_trace(dlp, drp, 6, 1, b, c);

(which I got wrong twice, yes, I'm that bad at bit-shifting in my head)
is actually better than

dt_cg_trace(dlp, drp, 1, 1, b);
dt_cg_trace(dlp, drp, 1, 2, c);

... hm. Of course that's still pretty nasty!

How about fusing ISREG and COUNTER? The real problem with this interface
is that it's got two numeric args next to each other and they frankly
are easy to mix up. But if you have an ISREG which can be any value but
in which the lower bit is used as the ISREG check, you can say:

dt_cg_trace(dlp, drp, 0, b);
dt_cg_trace(dlp, drp, 1, c);
dt_cg_trace(dlp, drp, 3, d);

In the above, b is an immediate value: c and d are registers whose
values should be printed. Odd values of ISREG mean the arg is a
register. Even values mean it's an immediate. ISREG is also used as a
counter (which is fine, the counter value is literally arbitrary, it's
just used to distinguish call sites from each other).

Bingo, no bit twiddling, just "is it odd or even" which even I can do in
my head :P you don't get to emit three values at once but frankly for
debugging this is pretty pointless, and it does add a lot of complexity
trying to specify whether each of those values is a reg or not
independently. If you can think of a nice syntax to specify it (not
manual bit twiddling!) I'm happy, but right now I can't.

> On 3/14/22 5:30 PM, Nick Alcock wrote:
>> A call to
>>
>>    dt_cg_trace(dlp, drp, 0, 666, foo);
>>
>> will trace the immediate value of FOO at codegen time.
>>
>> Under debugging=no this call has no effect and is compiled out.
>
>
> My preference would be a shorter commit message.  E.g.
>         Provide a function dt_cg_trace() to insert BPF code to dump an immediate
>         value or else a register's contents to /sys/kernel/debug/tracing/trace_pipe
>         when dtrace is compiled with debugging=yes.
> Notably, the semantics of the call are more clearly explained in the code comments.

I'm trying to make it easy to figure out how to use this stuff by people
who are just looking at the git log, not digging through the code. This
seems quite likely for people trying to debug stuff.

>> +override CFLAGS += -O0 -g -DDEBUGGING
>> +override BPFCFLAGS += -DDEBUGGING
>>   endif
>>     ifneq ($(coverage),no)
>
> The "DEBUGGING" name strikes me as awfully generic, especially when
> there are already DEBUG and YYDEBUG in the source code and there are
> multiple debugging features I could imagine we'll ultimately want to
> turn on individually.

It is intentionally generic. It doesn't mean 'dt_cg_trace is active', it
means 'we ran make with debugging=yes'. It just so happens that we only
turn on tracing if you do that. There could be other things hanging off
it too.

>> diff --git a/bpf/Build b/bpf/Build
>> index 062381b8cdd0..5f1881f14061 100644
>> --- a/bpf/Build
>> +++ b/bpf/Build
>> @@ -38,7 +38,8 @@ bpf_dlib_SOURCES = \
>>   	strlen.c \
>>   	strrchr.S \
>>   	strtok.S \
>> -	substr.S
>> +	substr.S \
>> +	trace_ptr.c
>>     bpf-check: $(objdir)/include/.dir.stamp
>>   	$(BPFC) $(BPFCPPFLAGS) $(bpf_dlib_CPPFLAGS) $(BPFCFLAGS) -S \
>> diff --git a/bpf/trace_ptr.c b/bpf/trace_ptr.c
>> new file mode 100644
>> index 000000000000..9dcbc720a115
>> --- /dev/null
>> +++ b/bpf/trace_ptr.c
>> @@ -0,0 +1,23 @@
>> +#ifdef DEBUGGING
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved.
>
> 2022?

Oh dammit. Fixed.

>> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
>> index eca50b29a500..b5152bd63caa 100644
>> --- a/libdtrace/dt_cg.c
>> +++ b/libdtrace/dt_cg.c
>> @@ -25,6 +25,9 @@
>>   #include <bpf_asm.h>
>>     static void dt_cg_node(dt_node_t *, dt_irlist_t *, dt_regset_t *);
>> +static void _dt_unused_
>> +dt_cg_trace(dt_irlist_t *dlp _dt_unused_, dt_regset_t *drp _dt_unused_,
>> +	    int isreg _dt_unused_, int counter _dt_unused_, uint64_t val _dt_unused_);
>
> Why do we prototype this?  We do not normally do that for static functions.

So it can be called from anywhere in the file without having its
location nailed to the top of the file.

(I am very much of the opinion that functions should be sorted in
functional groups, with forwards used as necessary, rather than strictly
sorting things in reverse dependency order. All that does is makes the
file needlessly hard to grasp on first reading, since that order is
likely to be completely different from any order that makes it easy to
understand what is actually going on.)

>>   +/*
>> + * Trace an immediate counter and a value to the log at
>> + * /sys/kernel/debug/tracing/trace_pipe.  If ISREG, VAL is a register number:
>> + * otherwise, it's an immediate value.
>
> I'd prefer s/:/./.  And then capitalize "otherwise".

Really? "Otherwise" is a connective... I'd call that mildly
ungrammatical (though commonplace and not actually a problem).

Changed, anyway.

>> + */
>> +static void _dt_unused_
>> +dt_cg_trace(dt_irlist_t *dlp _dt_unused_, dt_regset_t *drp _dt_unused_,
>> +	    int isreg _dt_unused_, int counter _dt_unused_, uint64_t val _dt_unused_)
>
> Why all the _dt_unused_?  Is this for the "#ifndef DEBUGGING" case? I thought the usual way this sort of thing is handled is with:

Yep.

>     #ifndef DEBUGGING
>     #define dt_cg_trace(dlp, drp, isreg, counter, val)
>     #else
>     ... function definition ...
>     #fi

Seriously?! This approach has been deprecated for literally *decades*,
ever since people started getting bitten by POSIX saying you could do
just that. It leads to identifiers that can sometimes be used as
function pointers but sometimes not, depending on whether they're a
macro right now or not. (Sure, nobody would use dt_cg_trace in that way,
but this is a matter of consistency: it's easier to just say that *no*
functions should be diked out by turning them into macros, ever.)

Empty static functions are eliminated by the compiler (the Linux kernel
will literally not work with a compiler that doesn't do that), and even
if they weren't, unconditionally calling a function is literally free
these days (branch predictors are that good).

Plus, of course, the *other* problem with the macro approach is that
means your program gets properly typechecked only sometimes -- you can
pass in anything you like as parameters except when debugging is on
when suddenly your calls get typechecked and things break.

What actual *advantage* does this have? I mean yes it's probably
harmless for this specific case, but in the general case I can see
absolutely no benefit in doing this. I mean yes it means you have a few
less _dt_unused_, but there is a real cost to that.

>> +{
>> +#ifdef DEBUGGING
>> +	dt_ident_t	*idp = dt_dlib_get_func(yypcb->pcb_hdl, "dt_trace_ptr");
>
> Typically we follow idp=dt_dlib_get_func() with an assert(idp!=NULL).

Oops! Fixed. (File audited for further instances, none found.)

>> +	if (dt_regset_xalloc_args(drp) == -1)
>> +		longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
>> +	dt_regset_xalloc(drp, BPF_REG_0);
>
> Move the xalloc(%r0) to immediately before the BPF_CALL_FUNC().

Really? ... ok, yes, that's quite consistent. Audited for further
instances, fixed (a good few instances of inconsistency here, all fixed:
will push as a separate commit).

>> +	emit(dlp,  BPF_MOV_IMM(BPF_REG_1, counter));
>> +	if (isreg)
>> +		emit(dlp,  BPF_MOV_REG(BPF_REG_2, val));
>> +	else
>> +		emit(dlp,  BPF_MOV_IMM(BPF_REG_2, val));
>
> One weird case is where val is %r1.  (Not so weird, actually, if one
> is debugging function args.)  In that case, we would first overwrite
> %r1 with "counter" and then put that copy in %r2.  I think this
> problem is fixed if you first write %r2 and then write %r1.

Adjusted.

There is a bigger problem with this stuff later in the series. The
bounds checkers have args like

static void
dt_cg_check_bounds(dt_irlist_t *dlp, dt_regset_t *drp, int regptr, int basereg,
		   int reg, int size, int lenreg, int sizemax, int lenmax)

where REG and LENREG are registers, and SIZE may be (iff SIZEMAX is <
0).  The common case of calling this (from dt_cg_check_scratch_bounds)
allocates two regs before calling this function, and inside this
function we allocate more on some code paths.  But... nothing stops
these allocations spilling REG or LENREG! Spilling LENREG is survivable,
with extra code (which I'm not sure how to write), but if we spill REG
we a) don't detect it and b) are in serious trouble, because the whole
point of this function is to assign bounds to REG!

I'm tempted to add some sort of guarding mechanism to the dt_regset so
you can say "We are citing this register by number, do not spill it".
Something like dt_regset_protect(dlp, reg); and then
dt_regset_unprotect(dlp, reg) later on, and this flips a bit in a
bitfield and bits which are on are not spilled.

Does this seem sensible to anyone else?

-- 
NULL && (void)



More information about the DTrace-devel mailing list