[DTrace-devel] [PATCH 30/61] Add a block of zeroes to use to initialize BPF maps

Kris Van Hees kris.van.hees at oracle.com
Tue Aug 2 15:52:16 UTC 2022


On Tue, Aug 02, 2022 at 01:20:48AM -0400, Kris Van Hees via DTrace-devel wrote:
> On Fri, Jul 08, 2022 at 10:45:14AM -0400, eugene.loh--- via DTrace-devel wrote:
> > From: Eugene Loh <eugene.loh at oracle.com>
> > 
> > There are cases in BPF code where we want to read a contiguous block of
> > zeroes.  Since the block is read-only, it can be shared among all CPUs and
> > among multiple purposes to save BPF map space (locked memory).  Create such
> > a block.
> > 
> > The block might as well reside in an already existing BPF map to consolidate
> > allocations.  The BPF_MAP_TYPE_ARRAY candidates are arguably:
> > 
> >               key type    val type    # elems
> > 
> >     "state"   uint32_t    uint32_t    3
> >     "strtab"  uint32_t    N / A       1
> >     "gvars"   uint32_t    N / A       1
> > 
> > The "state" map is an odd choice architecturally, but it has a lot of
> > spare room.
> 
> The 'state' map is a very odd choice simply because the value size is uint32_t
> and in order for an element in that map to be used as a block of zeros, you
> need to define the map with a value size equal to the size of the block of
> zroes, which means that all elements in that map suddenly become that size.
> 
> I would not mention that map here as a candidate because it really does not
> make sense to even consider it.
> 
> > The "gvars" map is arguably a decent choice;  we can think of there being a
> > big "global variable" of rather arbitrary size that we reference by pointer.
> 
> Correct.
> 
> > The "strtab" map, for completeness, is another choice.
> 
> You might want to mention here that strtab already contains a nice chunk of
> zeros at the end, which ...
> > 
> > Let us use gvars.
> 
> ... would make me be inclined to argue that 'strtab' is the better choice.
> Why would we create a block of zeros in 'gvars' when we already have one in
> 'strtab' (and it is perfectly fine to make that one larger).
> 
> > For code generation, we will need to know the offset of this region.
> > This patch supports deferring the specification of this offset until
> > relocation, but I suspect we might actually know the offset at cg time
> > since it's only dependent on dt_idhash_datasize(dtp->dt_globals).
> 
> This is equally simple when using 'strtab' because the offset of the zero
> is dtp->dt_strlen.

I think you should also add a cg function that generates the code sequence to
get a pointer to the zero block, to be used by all code that needs it.

> > Signed-off-by: Eugene Loh <eugene.loh at oracle.com>
> > ---
> >  libdtrace/dt_bpf.c   | 5 +++--
> >  libdtrace/dt_bpf.h   | 1 +
> >  libdtrace/dt_cc.c    | 3 +++
> >  libdtrace/dt_cg.c    | 2 +-
> >  libdtrace/dt_dlibs.c | 1 +
> >  libdtrace/dt_impl.h  | 2 ++
> >  6 files changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
> > index 87cc0814..ee9aa846 100644
> > --- a/libdtrace/dt_bpf.c
> > +++ b/libdtrace/dt_bpf.c
> > @@ -351,8 +351,9 @@ dt_bpf_gmap_create(dtrace_hdl_t *dtp)
> >  	if (pr_mapfd == -1)
> >  		return -1;		/* dt_errno is set for us */
> >  
> > -	/* global variables */
> > -	sz = P2ROUNDUP(dt_idhash_datasize(dtp->dt_globals), 8);
> > +	/* global variables, including a region of zeroes */
> > +	dtp->dt_zerooffset = P2ROUNDUP(dt_idhash_datasize(dtp->dt_globals), 8);
> > +	sz = dtp->dt_zerooffset + dtp->dt_zerosize;
> >  	if (sz > 0 &&
> >  	    create_gmap(dtp, "gvars", BPF_MAP_TYPE_ARRAY,
> >  			sizeof(uint32_t), sz, 1) == -1)
> > diff --git a/libdtrace/dt_bpf.h b/libdtrace/dt_bpf.h
> > index 12b9fce3..40cbcf43 100644
> > --- a/libdtrace/dt_bpf.h
> > +++ b/libdtrace/dt_bpf.h
> > @@ -36,6 +36,7 @@ extern "C" {
> >  #define DT_CONST_TASK_COMM	16
> >  #define DT_CONST_MUTEX_OWNER	17
> >  #define DT_CONST_RWLOCK_CNTS	18
> > +#define DT_CONST_ZERO_OFF	19
> >  
> >  extern int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
> >  			   int group_fd, unsigned long flags);
> > diff --git a/libdtrace/dt_cc.c b/libdtrace/dt_cc.c
> > index e3166647..5d85eb82 100644
> > --- a/libdtrace/dt_cc.c
> > +++ b/libdtrace/dt_cc.c
> > @@ -2454,6 +2454,9 @@ dt_link_construct(dtrace_hdl_t *dtp, const dt_probe_t *prp, dtrace_difo_t *dp,
> >  				nrp->dofr_data = total_offset;
> >  				continue;
> >  			}
> > +			case DT_CONST_ZERO_OFF:
> > +				nrp->dofr_data = dtp->dt_zerooffset;
> > +				continue;
> >  			default:
> >  				/* probe name -> value is probe id */
> >  				if (strchr(idp->di_name, ':') != NULL)
> > diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> > index a84359ea..f48452ee 100644
> > --- a/libdtrace/dt_cg.c
> > +++ b/libdtrace/dt_cg.c
> > @@ -208,7 +208,7 @@ dt_cg_tramp_prologue_act(dt_pcb_t *pcb, dt_activity_t act)
> >  		DT_CG_STORE_MAP_PTR("scratchmem", DCTX_SCRATCHMEM);
> >  	if (dt_idhash_datasize(dtp->dt_aggs) > 0)
> >  		DT_CG_STORE_MAP_PTR("aggs", DCTX_AGG);
> > -	if (dt_idhash_datasize(dtp->dt_globals) > 0)
> > +	if (dt_idhash_datasize(dtp->dt_globals) > 0 || dtp->dt_zerosize > 0)
> >  		DT_CG_STORE_MAP_PTR("gvars", DCTX_GVARS);
> >  	if (dtp->dt_maxlvaralloc > 0)
> >  		DT_CG_STORE_MAP_PTR("lvars", DCTX_LVARS);
> > diff --git a/libdtrace/dt_dlibs.c b/libdtrace/dt_dlibs.c
> > index 89ea062d..41ef28d9 100644
> > --- a/libdtrace/dt_dlibs.c
> > +++ b/libdtrace/dt_dlibs.c
> > @@ -89,6 +89,7 @@ static const dt_ident_t		dt_bpf_symbols[] = {
> >  	DT_BPF_SYMBOL_ID(TASK_COMM, DT_IDENT_SCALAR, DT_CONST_TASK_COMM),
> >  	DT_BPF_SYMBOL_ID(MUTEX_OWNER, DT_IDENT_SCALAR, DT_CONST_MUTEX_OWNER),
> >  	DT_BPF_SYMBOL_ID(RWLOCK_CNTS, DT_IDENT_SCALAR, DT_CONST_RWLOCK_CNTS),
> > +	DT_BPF_SYMBOL_ID(ZERO_OFF, DT_IDENT_SCALAR, DT_CONST_ZERO_OFF),
> >  
> >  	/* End-of-list marker */
> >  	{ NULL, }
> > diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
> > index 37088a5d..149f8b18 100644
> > --- a/libdtrace/dt_impl.h
> > +++ b/libdtrace/dt_impl.h
> > @@ -295,6 +295,8 @@ struct dtrace_hdl {
> >  	uint_t dt_maxdvarsize;	/* largest dynamic variable across programs */
> >  	uint_t dt_maxtuplesize;	/* largest tuple across programs */
> >  	uint_t dt_maxlvaralloc;	/* largest lvar alloc across pcbs */
> > +	uint_t dt_zerosize;	/* zero region, size */
> > +	uint_t dt_zerooffset;	/* zero region, offset */
> >  	dt_tstring_t *dt_tstrings; /* temporary string slots */
> >  	dt_list_t dt_modlist;	/* linked list of dt_module_t's */
> >  	dt_htab_t *dt_mods;	/* hash table of dt_module_t's */
> > -- 
> > 2.18.4
> > 
> > 
> > _______________________________________________
> > DTrace-devel mailing list
> > DTrace-devel at oss.oracle.com
> > https://oss.oracle.com/mailman/listinfo/dtrace-devel
> 
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list