[DTrace-devel] [PATCH 1/6] Change storage for global variables

Eugene Loh eugene.loh at oracle.com
Wed Mar 31 22:39:05 PDT 2021


Thanks for the reviews.  I've incorporated your feedback so far (patches 
1-4), but I have two questions, one with regards to some future work and 
one with how to deal with a FIXME...

On 3/31/21 1:27 PM, Kris Van Hees wrote:
> I am a bit concerned that the fact that we're needing to load the gvars
> pointer using two instructions for every access (get or set) for a global
> variable.  We may need to look at a way to optimize that, but that is
> definitely for future work.

I would like to understand this concern better.  (No rush since this is 
future work.)  With the new scheme, we set a gvar like this:

     lddw %rtmp, [%fp+DT_STK_DCTX]
     lddw %rtmp, [%rtmp+DCTX_GVARS]
     stdw [%rtmp+$offset], %rsrc

and we get a gvar like this:

     lddw %rdst, [%fp+DT_STK_DCTX]
     lddw %rdst, [%rdst+DCTX_GVARS]
     lddw %rdst, [%rdst+$offset]

There may be opportunities for improvement, but those instructions 
contrast with what we used to do, which was many more instructions, plus 
function calls, plus register pressure due to those calls.

For example, here is what we used to do for setting a gvar:

     mov  %r2, %rsrc
     mov  %r1, $id
     call dt_set_gvar

Not only are those three instructions no better than the new 
implementation, they also exert greater register pressure due to the 
function call.  Also, the function call entails execution of quite a few 
more instructions.  It looks like a one-liner:

     dt_set_gvar(uint32_t id, uint64_t val) {
         bpf_map_update_elem(&gvars, &id, &val, BPF_ANY);
     }

but that compiles to:

     stw  [%fp-4], %r1
     stdw [%fp-16], %r2
     mov  %r4, 0
     mov  %r3, %fp
     add  %r3, -16
     mov  %r2, %fp
     add  %r2, -4
     lddw %r1, &gvars
     call bpf_map_update_elem
     exit

Or, to get a gvar, we used to do this:

     mov  %r1, $id
     call dt_get_gvar
     lddw %rtmp, [%fp+DT_STK_DCTX]   | dt_cg_check_fault
     lddw %rtmp, [%rtmp+DCTX_MST]    | dt_cg_check_fault
     lddw %rtmp, [%rtmp+DMST_FAULT]  | dt_cg_check_fault
     jeq  %rtmp, 0, 2                | dt_cg_check_fault
     mov  %r0, 0                     | dt_cg_check_fault
     exit                            | dt_cg_check_fault
     mov  %rdst, %r0

which is already quite a bit worse than the three instructions we have 
with the new patch.  And, again, there is also the function call (and 
accompanying register pressure) from

     dt_get_gvar(uint32_t id) {
         uint64_t *val = bpf_map_lookup_elem(&gvars, &id);
         return val ? *val : 0;
     }

which compiles to

     stw  [%fp-4], %r1
     mov  %r2, %fp
     add  %r2, -4
     lddw %r1, &gvars
     call bpf_map_lookup_elem
     jeq  %r0, 0, 1
     lddw %r0, [%r0+0]
     exit

Further optimization of the new implementation is, of course, just fine, 
but it seems like the new patch already gets us "90%" (or whatever) of 
the way we can go.

> On Fri, Mar 19, 2021 at 01:45:29PM -0400, eugene.loh at oracle.com wrote:
>> diff --git a/libdtrace/dt_parser.c b/libdtrace/dt_parser.c
>> @@ -1661,6 +1661,10 @@ dt_node_decl(void)
>>   			if (idp == NULL)
>>   				longjmp(yypcb->pcb_jmpbuf, EDT_NOMEM);
>>   
>> +			/* FIXME: do not do this for aggregations; their size set in dt_cg.c */
>> +			dt_ident_set_storage(idp, 8,
>> +					     ctf_type_size(dtt.dtt_ctfp, type));
>> +
> I am confused... Does this mean that you are setting the storage size here but
> really shouldn't be doing it?  If you shouldn't be doing this here for aggs,
> then wouldn't you just put in a conditional clause for it?  That is what you
> do in the next part of the patch...

Alas, yes, you're completely right.  The rub is that this is kind of my 
first foray into the parser.  At the latter site, I use 
idkind!=DT_IDENT_AGG, where idkind is an argument to the function 
dt_xcook_ident().  At the current site, no such thing.  I mean, there is 
an idkind, but it's

                 idkind = assc ? DT_IDENT_ARRAY : DT_IDENT_SCALAR;

So, I can use a suggestion:  how do I test for aggs?  Otherwise, I'll 
stare more at this tomorrow.

>
>>   			dt_ident_type_assign(idp, dtt.dtt_ctfp, dtt.dtt_type);
>>   
>>   			/*
>> @@ -2751,6 +2755,10 @@ dt_xcook_ident(dt_node_t *dnp, dt_idhash_t *dhp, uint_t idkind, int create)
>>   		if (idp == NULL)
>>   			longjmp(yypcb->pcb_jmpbuf, EDT_NOMEM);
>>   
>> +		/* aggregation storage size is set later, in dt_cg.c */
>> +		if (idkind != DT_IDENT_AGG)
>> +			dt_ident_set_storage(idp, 8, 8);
>> +
>>   		/*
>>   		 * Arrays and aggregations are not cooked individually. They
>>   		 * have dynamic types and must be referenced using operator [].



More information about the DTrace-devel mailing list