[DTrace-devel] min() and max() aggregation map initialization
Eugene Loh
eugene.loh at oracle.com
Fri Dec 4 13:25:01 PST 2020
Sorry for the spam, but this message might have been sent before I finished
typing. So... finishing here:
On 12/04/2020 01:15 PM, Eugene Loh wrote:
> Per discussions with Kris, this code should probably go into
> dt_aggregate_go(). After the aggsz<=0 check, add something like
>
> dt_idhash_iter(dtp->dt_aggs, (dt_idhash_f *) init_minmax, NULL);
>
> And then just stick the callback function before dt_aggregate_go().
> Maybe something like:
>
> static int init_minmax(dt_idhash_t *dhp, dt_ident_t *idp, void *arg)
> {
>     dt_ident_t *id = idp->di_iarg;
>     uint64_t value;
>
>     assert(idp->di_kind == DT_IDENT_AGG);
>     assert(id);
>     if (id->di_id == DT_AGG_MIN)
>         value = INT64_MAX;
>     else if (id->di_id == DT_AGG_MAX)
>         value = INT64_MIN;
>     else
>         return 0;
>
>     ...write "value" at offset idp->di_offset...
>
>     return 0;
> }
>
> For stuff like dtp->dt_aggmap_fd, you might need Kris's latest branch.
>
> Writing directly to the aggregation map from the callback function
> might be tricky. It's all a single key=0. So, I *think* you have to
> update the entire map. If so, allocate aggsz memory in
> dt_aggregate_go(). Write "value" at offset idp->di_offset within the
> callback function. When you're done iterating over all the
> aggregations, then do a single dt_bpf_map_update(dtp->dt_aggmap_fd,
> &key, ptr).
>
>
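The "fill one buffer, then update the map once" idea above can be sketched in isolation. This is only a sketch under stated assumptions: `struct agg_desc`, `enum agg_func`, and `build_agg_init()` are hypothetical stand-ins invented for illustration (they are not libdtrace names), and the loop stands in for what dt_idhash_iter() would do through the callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the real idents and DT_AGG_* ids live in libdtrace. */
enum agg_func { AGG_COUNT, AGG_MIN, AGG_MAX };

struct agg_desc {
	enum agg_func	func;	/* stands in for the di_id of idp->di_iarg */
	size_t		offset;	/* stands in for idp->di_offset */
};

/*
 * Allocate a zeroed buffer of aggsz bytes and seed every min()/max() slot.
 * Returns NULL on allocation failure or if a slot would not fit in the
 * buffer.  The caller frees the result.
 */
static void *build_agg_init(const struct agg_desc *aggs, size_t n, size_t aggsz)
{
	unsigned char *buf = calloc(1, aggsz);
	size_t i;

	if (buf == NULL)
		return NULL;

	for (i = 0; i < n; i++) {
		int64_t value;

		if (aggs[i].func == AGG_MIN)
			value = INT64_MAX;	/* any real sample is <= this */
		else if (aggs[i].func == AGG_MAX)
			value = INT64_MIN;	/* any real sample is >= this */
		else
			continue;		/* other functions start at 0 */

		/* Refuse to write past the end of the buffer. */
		if (aggs[i].offset > aggsz ||
		    aggsz - aggs[i].offset < sizeof(value)) {
			free(buf);
			return NULL;
		}
		memcpy(buf + aggs[i].offset, &value, sizeof(value));
	}
	return buf;
}
```

In the real code the returned pointer would presumably be what gets passed as "ptr" to the single dt_bpf_map_update(dtp->dt_aggmap_fd, &key, ptr) call mentioned above, and then freed.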
> On 12/03/2020 03:15 PM, Eugene Loh wrote:
>> Incidentally, my answer from last night was not quite right.
>>
>> I was suggesting iterating over the aggregations with dt_idhash_iter().
>> Then, there would be some callback function with a dt_ident_t pointer,
>> call it idp. I think idp->di_kind should be DT_IDENT_AGG for each one.
>>
>> But what kind of aggregation function does it use? I suggested looking
>> at idp->di_id, but the problem with that is it's the ID that identifies
>> the aggregation, not the aggregation function it uses. E.g., if you
>> have "@a = count()", then idp->di_id identifies "@a", while you want to
>> know about "count()".
>>
>> Apparently, you can still get at the aggregation function. I mean, you
>> have to be able to since "@a = count(); @a = max(1)" is illegal. So, we
>> have to be able to check that the aggregation function associated with
>> an aggregation is always consistent. The code for that is in
>> dt_parser.c; look for the string "aggregation redefined". It gives
>> away the dirty secret: di_iarg! Oh, so counterintuitive. Anyhow,
>> you can:
>>
>> dt_ident_t *aggfunc = idp->di_iarg;
>>
>> Then, you can check that the aggfunc is min or max by comparing
>> aggfunc->di_id to DT_AGG_MIN or DT_AGG_MAX (or comparing
>> aggfunc->di_name to "min" or "max", but that's probably a bad idea!).
>>
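To make the di_id versus di_iarg distinction concrete, here is a small mock. Everything in it is an assumption made for illustration: `struct mock_ident` is a cut-down stand-in for dt_ident_t, and the `MOCK_*` constants are made-up values, not the real DT_IDENT_* or DT_AGG_* definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up values for illustration only. */
enum { MOCK_IDENT_AGG = 1, MOCK_IDENT_AGGFUNC = 2 };
enum { MOCK_AGG_COUNT = 100, MOCK_AGG_MAX = 101, MOCK_AGG_MIN = 102 };

/* Cut-down stand-in for dt_ident_t with only the fields discussed here. */
struct mock_ident {
	const char	*di_name;
	int		di_kind;
	unsigned int	di_id;
	void		*di_iarg;	/* for an aggregation: its function ident */
};

/*
 * Return 1 if the aggregation uses min() or max(), else 0.
 * The aggregation's own di_id only names the aggregation ("@a"), so the
 * check has to go through di_iarg to reach the aggregation function.
 */
static int is_minmax(const struct mock_ident *idp)
{
	const struct mock_ident *aggfunc;

	if (idp->di_kind != MOCK_IDENT_AGG)
		return 0;

	aggfunc = idp->di_iarg;
	if (aggfunc == NULL)
		return 0;

	return aggfunc->di_id == MOCK_AGG_MIN ||
	       aggfunc->di_id == MOCK_AGG_MAX;
}
```

Note that an aggregation whose own di_id happens to equal one of the function ids is still classified by its function, which is the mistake the earlier suggestion would have made.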
>> On 12/02/2020 09:01 PM, Eugene Loh wrote:
>>> Totally guessing here, but maybe it'll get you started. Maybe where you
>>> describe, you can dt_idhash_iter(dtp->dt_aggs,...). Then check the
>>> di_kind for each ident and look for DT_IDENT_AGG. If so, check di_id for
>>> what kind of aggregation it is. If it's MIN or MAX, you can get the
>>> offset from di_offset, and then initialize both values (di_offset and
>>> di_offset + 8). I've never done any of this, so I have no idea if it's
>>> right.
>>>
>>>
>>> On 12/02/2020 07:46 PM, david.mclean at oracle.com wrote:
>>>> I noticed during my testing that min() and max() need to be initialized
>>>> to non-zero values before they are used.
>>>> For example, it is a problem if the initial value for max() is zero and
>>>> after that only negative values are fed to max(): the result will be
>>>> zero, so the final mapped value is incorrectly large.
>>>>
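The failure described above is easy to reproduce outside DTrace. The following is a minimal sketch; `running_max()` is a hypothetical helper written for this illustration, not a function from the code base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fold a sequence of samples into a running maximum, starting from
 * "init" the way an aggregation slot starts from its initial contents.
 */
static int64_t running_max(int64_t init, const int64_t *vals, size_t n)
{
	int64_t cur = init;
	size_t i;

	for (i = 0; i < n; i++) {
		if (vals[i] > cur)
			cur = vals[i];
	}
	return cur;
}
```

With the samples -5, -2, -9, a slot that starts at 0 never changes, while a slot that starts at INT64_MIN ends up at -2, the true maximum.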
>>>> Eugene pointed me to using dt_bpf_map_update() to populate these
>>>> initialization values, but I don't know where in the code base I would
>>>> want to call the function to initialize the values. I've been spending
>>>> my time in the aggs and _impl functions and I don't know my way around
>>>> enough to easily find the preferred place for the call.
>>>>
>>>> My guess is I should add something toward the end of
>>>> dt_bpf_gmap_create(), maybe right after the other dt_bpf_map_update()
>>>> call, and find a way to detect which maps belong to min() or max()
>>>> aggregations.
>>>>
>>>> I figure for min() I would want to initialize to the highest value
>>>> (0x7FFFFFFFFFFFFFFF) and for max() I should initialize to the lowest
>>>> value (0x8000000000000000).
>>>>
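For reference, those two hex constants are the 64-bit two's-complement bit patterns of INT64_MAX and INT64_MIN from <stdint.h>. A short sketch follows; the helper names `min_seed_bits()` and `max_seed_bits()` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit pattern to seed a min() slot: the largest signed 64-bit value. */
static uint64_t min_seed_bits(void)
{
	return (uint64_t)INT64_MAX;
}

/*
 * Bit pattern to seed a max() slot: the smallest signed 64-bit value.
 * Converting a negative value to uint64_t is defined modulo 2^64, so
 * this yields 0x8000000000000000.
 */
static uint64_t max_seed_bits(void)
{
	return (uint64_t)INT64_MIN;
}
```

This is also why storing INT64_MIN in an unsigned 64-bit variable before copying it into the map writes the same bytes as a signed variable would.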
>>>> Some guidance to speed this up would be appreciated.
>>>>
>>>> _______________________________________________
>>>> DTrace-devel mailing list
>>>> DTrace-devel at oss.oracle.com
>>>> https://oss.oracle.com/mailman/listinfo/dtrace-devel