[DTrace-devel] [PATCH 2/8] Reduce stack depth if kernel returns NULL frames
Kris Van Hees
kris.van.hees at oracle.com
Wed Aug 28 20:37:39 UTC 2024
On Wed, Aug 28, 2024 at 04:23:09PM -0400, Eugene Loh via DTrace-devel wrote:
> On 8/28/24 16:17, Kris Van Hees wrote:
>
> > On Wed, Aug 28, 2024 at 04:11:29PM -0400, Eugene Loh wrote:
> > > It's been a while, but if I remember correctly it's actually just the other
> > > way around. That is, to make stack and depth consistent, we're forced into
> > > this "depth reduction" patch. Put another way, the stack simply does not
> > > include NULL pointers -- there is no way to remove them.
> > >
> > > E.g., let's say we have a buffer of 8 pointers and we ask for the stack and
> > > get back:
> > >
> > > 0xdead 0xbeef 0xfeed 0xface NULL NULL NULL NULL
> > >
> > > Looks like 4 pointers. But we don't count them. We just use the return
> > > value from the helper function. If it tells us "4", then everything is
> > > consistent. But for some reason (I haven't looked at the code to figure out
> > > why), it can be 5 or even 6. So this patch bumps that value down -- to
> > > attain the consistency I think you're asking about.
> > So you are saying that the number of values that are actually filled in is
> > not consistent with the return value of the bpf_get_stack() helper? That
> > would sound like a kernel bug.
>
> I think there might be a kernel bug. For a variety of practical reasons, I went with a DTrace workaround (and haven't looked at the kernel code).
I'll look around to see if I can figure out what is going on and whether it is
a kernel bug (which of course may already have been fixed in some kernel
version). In general, we should mention in patches like this that it seems to
be a kernel bug (or first determine if it is) because otherwise we're going
to implement workarounds without ever making sure the kernel bug gets fixed.
Or if it is not a kernel bug, we need to look elsewhere for why this goes wrong.
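
For reference, the depth reduction discussed above can be sketched roughly as
follows. This is only an illustration of the idea, not the code from the
patch: the function name trim_null_frames() is hypothetical, and it assumes
the reported depth has already been converted to a frame count and that
frames are stored as 64-bit values.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): given the frame buffer and the
 * depth reported for it, drop trailing NULL frames so that the depth matches
 * the number of frames that were actually filled in.
 */
static size_t trim_null_frames(const uint64_t *frames, size_t depth)
{
	/* Walk back from the reported end while the last frame is NULL. */
	while (depth > 0 && frames[depth - 1] == 0)
		depth--;

	return depth;
}
```

With the 8-entry buffer from the example (0xdead 0xbeef 0xfeed 0xface followed
by NULLs), a reported depth of 5 or 6 would be reduced to 4, while a reported
depth of 4 is left unchanged.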