[DTrace-devel] [PATCH 1/2] Add a cpuinfos BPF map

Eugene Loh eugene.loh at oracle.com
Tue Apr 1 22:54:29 UTC 2025


Here is a proposal.  First, two observations:

1.  (As Alan pointed out to me in a facepalm moment), one can write a 
simple D script to check enqueue_task_*()'s rq->cpu against the current 
CPU.  He and I both find that the two CPUs are generally -- but not 
always -- the same.  So, the strictly correct thing to do is use the 
rq->cpu value, even though you can just use the current CPU and be 
correct "99%" of the time.

2.  A BPF program can access per-cpu-array values on other CPUs. Well, I 
guess you need commit 0734311 ("bpf: add bpf_map_lookup_percpu_elem for 
percpu map").  That's in 5.18. That is, UEK9.

So my proposal is to leave the per-cpu cpuinfo BPF map alone. Perform a 
runtime test whether bpf_map_lookup_percpu_elem() is available.  If so, 
do that cross-CPU lookup -- the 2/2 patch I posted -- but using the new 
helper function.  If not, use a simpler on-CPU lookup, which should be 
right "99%" of the time. (I have a simple patch that uses the current 
CPU.  Pretty simple.)

On 4/1/25 01:36, Kris Van Hees wrote:

> This is not the way to go about this.  If, in order to implement the cpuinfo_t
> argument to sched probes, a regular BPF array map is needed so that cpuinfo
> data can be accessed for any given CPU id, then the existing map should be
> replaced with the new one, and its use updated to access the new one.  That
> way you can also keep the name of the map, etc...
>
> Introducing this new map with exactly the same data, and then hoping to
> deprecate the old one later is making things more messy.
>
> On Mon, Mar 31, 2025 at 05:45:00PM -0400, eugene.loh--- via DTrace-devel wrote:
>> From: Eugene Loh <eugene.loh at oracle.com>
>>
>> The cpuinfo BPF map is a per-CPU map that has CPU information
>> on each CPU for that CPU.
>>
>> Add a cpuinfos BPF map that allows any CPU to access information
>> for any other CPU.
>>
>> For now, we retain the older per-CPU map.  If desired, a future
>> patch can migrate existing uses of the per-CPU map to the new
>> map, decommissioning the old one.  This would include map set up:
>>
>> *)  libdtrace/dt_dlibs.c:   DT_BPF_SYMBOL(cpuinfo, DT_IDENT_PTR),
>>
>> *)  libdtrace/dt_impl.h:    int dt_cpumap_fd;
>>
>> *)  libdtrace/dt_bpf.c:     dtp->dt_cpumap_fd = ...
>>      libdtrace/dt_bpf.c:     CREATE_MAP(cpuinfo)
>>
>> and map use:
>>
>> *)  bpf/get_agg.c
>> *)  bpf/get_bvar.c
>> *)  libdtrace/dt_cg.c
>> *)  libdtrace/dt_prov_lockstat.c
>>
>> Signed-off-by: Eugene Loh <eugene.loh at oracle.com>
>> ---
>>   libdtrace/dt_bpf.c   | 13 +++++++++++++
>>   libdtrace/dt_dlibs.c |  1 +
>>   libdtrace/dt_impl.h  |  1 +
>>   3 files changed, 15 insertions(+)
>>
>> diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
>> index 6d42a96c7..8da51d6b9 100644
>> --- a/libdtrace/dt_bpf.c
>> +++ b/libdtrace/dt_bpf.c
>> @@ -786,7 +786,20 @@ gmap_create_cpuinfo(dtrace_hdl_t *dtp)
>>   	if (dtp->dt_cpumap_fd == -1)
>>   		return -1;
>>   
>> +	dtp->dt_cpusmap_fd = create_gmap(dtp, "cpuinfos",
>> +					BPF_MAP_TYPE_HASH,
>> +					sizeof(uint32_t),
>> +					sizeof(dt_bpf_cpuinfo_t), ncpus);
>> +	if (dtp->dt_cpusmap_fd == -1)
>> +		return -1;
>> +
>>   	rc = dt_bpf_map_update(dtp->dt_cpumap_fd, &key, data);
>> +
>> +	for (i = 0, ci = &conf->cpus[0]; i < ncpus && rc != -1; i++, ci++) {
>> +		key = ci->cpu_id;
>> +		rc = dt_bpf_map_update(dtp->dt_cpusmap_fd, &key, &data[ci->cpu_id]);
>> +	}
>> +
>>   	dt_free(dtp, data);
>>   	if (rc == -1)
>>   		return dt_bpf_error(dtp,
>> diff --git a/libdtrace/dt_dlibs.c b/libdtrace/dt_dlibs.c
>> index 21df22a8a..0f19f3566 100644
>> --- a/libdtrace/dt_dlibs.c
>> +++ b/libdtrace/dt_dlibs.c
>> @@ -61,6 +61,7 @@ static const dt_ident_t		dt_bpf_symbols[] = {
>>   	DT_BPF_SYMBOL(agggen, DT_IDENT_PTR),
>>   	DT_BPF_SYMBOL(buffers, DT_IDENT_PTR),
>>   	DT_BPF_SYMBOL(cpuinfo, DT_IDENT_PTR),
>> +	DT_BPF_SYMBOL(cpuinfos, DT_IDENT_PTR),
>>   	DT_BPF_SYMBOL(dvars, DT_IDENT_PTR),
>>   	DT_BPF_SYMBOL(gvars, DT_IDENT_PTR),
>>   	DT_BPF_SYMBOL(lvars, DT_IDENT_PTR),
>> diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
>> index 68fb8ec53..a5e42801c 100644
>> --- a/libdtrace/dt_impl.h
>> +++ b/libdtrace/dt_impl.h
>> @@ -390,6 +390,7 @@ struct dtrace_hdl {
>>   	int dt_aggmap_fd;	/* file descriptor for the 'aggs' BPF map */
>>   	int dt_genmap_fd;	/* file descriptor for the 'agggen' BPF map */
>>   	int dt_cpumap_fd;	/* file descriptor for the 'cpuinfo' BPF map */
>> +	int dt_cpusmap_fd;	/* file descriptor for the 'cpuinfos' BPF map */
>>   	int dt_usdt_pridsmap_fd; /* file descriptor for the 'usdt_prids' BPF map */
>>   	int dt_usdt_namesmap_fd; /* file descriptor for the 'usdt_names' BPF map */
>>   	dtrace_handle_err_f *dt_errhdlr; /* error handler, if any */
>> -- 
>> 2.43.5
>>
>>
>> _______________________________________________
>> DTrace-devel mailing list
>> DTrace-devel at oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list