[DTrace-devel] [PATCH 1/6] Change storage for global variables
Eugene Loh
eugene.loh at oracle.com
Wed Mar 31 22:39:05 PDT 2021
Thanks for the reviews. I've incorporated your feedback so far (patches
1-4), but I have two questions, one with regards to some future work and
one with how to deal with a FIXME...
On 3/31/21 1:27 PM, Kris Van Hees wrote:
> I am a bit concerned that the fact that we're needing to load the gvars
> pointer using two instructions for every access (get or set) for a global
> variable. We may need to look at a way to optimize that, but that is
> definitely for future work.
I would like to understand this concern better. (No rush since this is
future work.) With the new scheme, we set a gvar like this:
lddw %rtmp, [%fp+DT_STK_DCTX]
lddw %rtmp, [%rtmp+DCTX_GVARS]
stdw [%rtmp+$offset], %rsrc
and we get a gvar like this:
lddw %rdst, [%fp+DT_STK_DCTX]
lddw %rdst, [%rdst+DCTX_GVARS]
lddw %rdst, [%rdst+$offset]
There may be opportunities for improvement, but those instructions
contrast with what we used to do, which was many more instructions, plus
function calls, plus register pressure due to those calls.
For example, here is what we used to do for setting a gvar:
mov %r2, %rsrc
mov %r1, $id
call dt_set_gvar
Not only are those three instructions no better than the new
implementation, they also exert greater register pressure due to the
function call. Also, the function call entails execution of quite a few
more instructions. It looks like a one-liner:
dt_set_gvar(uint32_t id, uint64_t val) {
bpf_map_update_elem(&gvars, &id, &val, BPF_ANY);
}
but that compiles to:
stw [%fp-4], %r1
stdw [%fp-16], %r2
mov %r4, 0
mov %r3, %fp
add %r3, -16
mov %r2, %fp
add %r2, -4
lddw %r1, &gvars
call bpf_map_update_elem
exit
Or, to get a gvar, we used to do this:
mov %r1, $id
call dt_get_gvar
lddw %rtmp, [%fp+DT_STK_DCTX] | dt_cg_check_fault
lddw %rtmp, [%rtmp+DCTX_MST] | dt_cg_check_fault
lddw %rtmp, [%rtmp+DMST_FAULT] | dt_cg_check_fault
jeq %rtmp, 0, 2 | dt_cg_check_fault
mov %r0, 0 | dt_cg_check_fault
exit | dt_cg_check_fault
mov %rdst, %r0
which is already quite a bit worse than the three instructions we have
with the new patch. And, again, there is also the function call (and
accompanying register pressure) from
dt_get_gvar(uint32_t id) {
uint64_t *val = bpf_map_lookup_elem(&gvars, &id);
return val ? *val : 0;
}
which compiles to
stw [%fp-4], %r1
mov %r2, %fp
add %r2, -4
lddw %r1, &gvars
call bpf_map_lookup_elem
jeq %r0, 0, 1
lddw %r0, [%r0+0]
exit
Further optimization of the new implementation is, of course, just fine,
but it seems like the new patch already gets us "90%" (or whatever) of
the way we can go.
> On Fri, Mar 19, 2021 at 01:45:29PM -0400, eugene.loh at oracle.com wrote:
>> diff --git a/libdtrace/dt_parser.c b/libdtrace/dt_parser.c
>> @@ -1661,6 +1661,10 @@ dt_node_decl(void)
>> if (idp == NULL)
>> longjmp(yypcb->pcb_jmpbuf, EDT_NOMEM);
>>
>> + /* FIXME: do not do this for aggregations; their size set in dt_cg.c */
>> + dt_ident_set_storage(idp, 8,
>> + ctf_type_size(dtt.dtt_ctfp, type));
>> +
> I am confused... Does this mean that you are setting the storage size here but
> really shouldn't be doing it? If you shouldn't be doing this here for aggs,
> then wouldn't you just put in a conditional clause for it? That is what you
> do in the next part of the patch...
Alas, yes, you're completely right. The rub is that this is kind of my
first foray into the parser. At the latter site, I use
idkind!=DT_IDENT_AGG, where idkind is an argument to the function
dt_xcook_ident(). At the current site, no such thing. I mean, there is
an idkind, but it's
idkind = assc ? DT_IDENT_ARRAY : DT_IDENT_SCALAR;
So, I can use a suggestion: how do I test for aggs? Otherwise, I'll
stare more at this tomorrow.
>
>> dt_ident_type_assign(idp, dtt.dtt_ctfp, dtt.dtt_type);
>>
>> /*
>> @@ -2751,6 +2755,10 @@ dt_xcook_ident(dt_node_t *dnp, dt_idhash_t *dhp, uint_t idkind, int create)
>> if (idp == NULL)
>> longjmp(yypcb->pcb_jmpbuf, EDT_NOMEM);
>>
>> + /* aggregation storage size is set later, in dt_cg.c */
>> + if (idkind != DT_IDENT_AGG)
>> + dt_ident_set_storage(idp, 8, 8);
>> +
>> /*
>> * Arrays and aggregations are not cooked individually. They
>> * have dynamic types and must be referenced using operator [].
More information about the DTrace-devel
mailing list