[DTrace-devel] [PATCH] bpf: allocate the buffers BPF map to fit highest CPU id

Fri Dec 12 04:36:28 UTC 2025

On Thu, Dec 11, 2025 at 06:16:37PM -0500, Eugene Loh wrote:
> On 12/11/25 17:22, Kris Van Hees via DTrace-devel wrote:
> 
> > Even when less than the possible number of CPUs are online, the 'buffers'
> > BPF map should be allocated based on the highest possible CPU id because
> > probe data is written to the bufer
> 
> s/bufer/buffer/

Thanks.

> I guess a problem here is that we do not test this case.  It would be hard,
> but I suppose we should change this situation?

I am not entirely sure we *can* test this properly specialized configs.  The
problem came to light on a configuration that conveniently demonstrated the
problem because of a large gap of unavailable CPUs (0-127 possible, but only
0-7,100-127 were online).  Because in that situation, it is quite likely that
the CPU on which the events we are interested in occur falls beyond the end
of the buffers BPF map.

> What are the chances of our at least writing out what is per online CPU and
> what is per possible CPU id?  That way, a future developer would (hopefully)
> not have to reverse engineer this stuff.

What do you mean?  What is not clear about the current implementation?  I
do not think that there was an issue here of things not being documented
sufficiently - I simply made a silly mistake in the code.

> It seems to me that this patch is an improvement over the status quo.  On
> the other hand, given the limitations on what we're doing -- notably, the
> lack of regular, automated testing -- that should probably at least be
> acknowledged in the commit message.

I again do not know what you mean by this.  It is not an improvement over a
status quo, it is a fix for a genuine bug.

I also do not think it is fair to say that we lack regular, automated testing.
That is something we actually *are* doing (i.e. the build-and-test robots we
use internally).  This is a situation that primarily has not been detected
until now because we don't have all unusual configurations available for
testing at all times.  A system that has this gap of CPU ids in the list of
online CPUs is not common.  We can look into building a system just for that,
if we can, but it is the type of situation that hardly any automated testing
would uncover.  You simply cannot test everything.

I can add more details to the commit message about what the exact problem is
that is being solved.

> > that corresponds to a given CPU id,
> > which could be part of non-sequential CPU id configurations.
> > 
> > Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
> > ---
> >   libdtrace/dt_bpf.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
> > index 0a57b7d2..6568a572 100644
> > --- a/libdtrace/dt_bpf.c
> > +++ b/libdtrace/dt_bpf.c
> > @@ -755,7 +755,7 @@ gmap_create_buffers(dtrace_hdl_t *dtp)
> >   {
> >   	return create_gmap(dtp, "buffers", BPF_MAP_TYPE_PERF_EVENT_ARRAY,
> >   			   sizeof(uint32_t), sizeof(uint32_t),
> > -			   dtp->dt_conf.num_online_cpus);
> > +			   dtp->dt_conf.max_cpuid);
> >   }
> >   /*