[DTrace-devel] [PATCH 01/14] Fix stack-skip counts for caller and stackdepth
Kris Van Hees
kris.van.hees at oracle.com
Thu Jun 19 16:20:22 UTC 2025
On Thu, Jun 19, 2025 at 02:03:56PM +0100, Nick Alcock wrote:
> On 16 Jun 2025, Eugene Loh verbalised:
>
> > On 6/13/25 10:33, Nick Alcock wrote:
> >
> >>> Note that we declare the skip count volatile. The compiler might
> >>> optimize code that uses the STACK_SKIP value, but we will subsequently
> >>> perform relocations that adjust this value.
> >> ... why doesn't this apply to every other extern global variable in
> >> get_bvar()? They're all similarly relocated...
> >
> > Right. There is potentially a broader problem. But we simply do not have evidence of misbehavior in other cases. Ruggedizing
> > other cases could be the subject of a different patch.
>
> Aha, OK. I was just wondering if there was some extra reason.
>
> > The problem in this case is that the compiler seems to assume &symbol!=0, which is reasonable except that we violate that behavior
> > for our relocation tricks.
>
> I wonder where the code for that is... plenty of symbols have value
> zero.
>
> But, really...
>
> > Consider the C code:
> >
> > uint64_t dt_bvar_stackdepth(const dt_dctx_t *dctx)
> > {
> > uint32_t bufsiz = (uint32_t) (uint64_t) (&STKSIZ);
> > char *buf = dctx->mem + (uint64_t)(&STACK_OFF);
>
> Hm...
>
> extern uint64_t STACK_SKIP;
>
> So we encode information about the stack size and skip value by encoding
> it in the *address* of the variable? Is there some reason we don't use
> its value? unlike the stack offset, we're *using* it as a value, not an
> address...
The issue seems to be (and perhaps this is a cross compiler problem) that e.g.
extern uint64_t PC;
and then code accessing the value of PC (e.g. foo(PC) as a call argument) will
yield:
50: 18 02 00 00 00 00 00 00 lddw %r2,0
58: 00 00 00 00 00 00 00 00
50: R_BPF_INSN_64 PC
60: 79 22 00 00 00 00 00 00 ldxdw %r2,[%r2+0]
which shows that it is interpreting PC as an address to a symbol, because it
loads the address of the symbol and then dereferences it with offset 0. So,
we cannot plug in the value during relocation because the only value we can
put there would be an address where the vlaue can be found. To get around
this, we "use" Tthe address as the entity to store the value in, knowing that
we *never* will interpret it as an address for these specific externs.
> > uint64_t retv;
> > volatile uint64_t skip = (uint64_t)(&STACK_SKIP);
> >
> > retv = bpf_get_stack(dctx->ctx,
> > buf,
> > bufsiz,
> > skip & BPF_F_SKIP_FIELD_MASK);
> >
> > if (skip)
> > return retv / sizeof(uint64_t) - 1; // branch "A"
> > return retv / sizeof(uint64_t); // branch "B"
> > }
> >
> > If you omit "volatile", the compiler assumes &STACK_SKIP!=0. The emitted code has:
>
> (which is a reasonable assumption if not freestanding, I'd say. Why
> don't we compile BPF code with -ffreestanding? BPF is almost the
> *definition* of a freestanding environment...)
>
> > *) no run-time "if (skip)" check
> >
> > *) no code for branch "B"
> >
> > *) only code for branch "A"
> >
> > If you include "volatile", however, the compiler caches &STACK_SKIP on
> > the BPF stack and later performs a run-time check on its value to
> > correctly execute either branch "A" or branch "B".
>
> This feels very mucvh like a workaround to me. Does compiling BPF with
> -ffreestanding help?
>
> I mean it fixes a bug, so I suppose it should go in if nothing else
> works, but using volatile is almost always a desperate sticking plaster
> and this feels like one of those occasions to me.
>
> --
> NULL && (void)
More information about the DTrace-devel
mailing list