[DTrace-devel] min() and max() aggregation map initialization

Eugene Loh eugene.loh at oracle.com
Fri Dec 4 13:25:01 PST 2020


Sorry for the spam, but this message might have been sent before I finished 
typing.  So... finishing here:


On 12/04/2020 01:15 PM, Eugene Loh wrote:
> Per discussions with Kris, this code should probably go into 
> dt_aggregate_go().  After the aggsz<=0 check, add something like
>
>     dt_idhash_iter(dtp->dt_aggs, (dt_idhash_f *) init_minmax, NULL);
>
> And then just stick the callback function before dt_aggregate_go().  
> Maybe something like:
>
>     static int init_minmax(dt_idhash_t *dhp, dt_ident_t *idp, void *arg) {
>         dt_ident_t *id = idp->di_iarg;
>         int64_t value;
>
>         assert(idp->di_kind == DT_IDENT_AGG);
>         assert(id);
>         if (id->di_id == DT_AGG_MIN)
>             value = INT64_MAX;
>         else if (id->di_id == DT_AGG_MAX)
>             value = INT64_MIN;
>         else
>             return 0;
>
>         ...write "value" at offset idp->di_offset...
>
>         return 0;
>     }
>
> For stuff like dtp->dt_aggmap_fd, you might need Kris's latest branch.
>
> Writing directly to the aggregation map from the callback function 
> might be tricky.  It's all a single key=0.  So, I *think* you have to 
> update the entire map.  If so, allocate aggsz memory in 
> dt_aggregate_go(). Write "value" at offset idp->di_offset within the 
> callback function.  When you're done iterating over all the 
> aggregations, then do a single dt_bpf_map_update(dtp->dt_aggmap_fd, 
> &key, ptr).
>
>
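The "fill a buffer, then update the map once" idea above can be sketched roughly as follows.  This is only an illustration under stated assumptions: struct agg_desc, enum agg_func, and build_initial_aggs() are hypothetical stand-ins, not libdtrace names, and the final dt_bpf_map_update() call is left out because it needs a real map fd.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the libdtrace definitions. */
enum agg_func { AGG_COUNT, AGG_MIN, AGG_MAX };

struct agg_desc {
    enum agg_func func;    /* aggregation function in use */
    size_t        offset;  /* byte offset of its value (cf. di_offset) */
};

/*
 * Build the initial contents for the single-key aggregation map:
 * a zero-filled buffer of aggsz bytes in which every min() slot holds
 * INT64_MAX and every max() slot holds INT64_MIN.  The caller is
 * assumed to have rejected aggsz == 0 already and must free() the
 * result.  Returns NULL if allocation fails.
 */
static char *build_initial_aggs(const struct agg_desc *aggs, size_t n,
                                size_t aggsz)
{
    char   *buf = calloc(1, aggsz);
    size_t  i;

    if (buf == NULL)
        return NULL;

    for (i = 0; i < n; i++) {
        int64_t value;

        if (aggs[i].func == AGG_MIN)
            value = INT64_MAX;
        else if (aggs[i].func == AGG_MAX)
            value = INT64_MIN;
        else
            continue;       /* other functions keep the zero fill */

        /* The slot must lie entirely inside the buffer. */
        assert(aggs[i].offset + sizeof(value) <= aggsz);
        memcpy(buf + aggs[i].offset, &value, sizeof(value));
    }

    return buf;
}
```

Once the loop is done, the whole buffer would be handed to a single map update for key 0, as described above.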
> On 12/03/2020 03:15 PM, Eugene Loh wrote:
>> Incidentally, my answer from last night was not quite right.
>>
>> I was suggesting iterating over the aggregations with dt_idhash_iter().
>> Then, there would be some callback function with a dt_ident_t pointer,
>> call it idp.  I think idp->di_kind should be DT_IDENT_AGG for each one.
>>
>> But what kind of aggregation function does it use?  I suggested looking
>> at idp->di_id, but the problem with that is it's the ID that identifies
>> the aggregation, not the aggregation function it uses.  E.g., if you
>> have "@a = count()", then idp->di_id identifies "@a", while you want to
>> know about "count()".
>>
>> Apparently, you can still get at the aggregation function.  I mean, you
>> have to be able to since "@a = count(); @a = max(1)" is illegal.  So, we
>> have to be able to check that the aggregation function associated with
>> an aggregation is always consistent.  The code for that is in
>> dt_parser.c;  look for the string "aggregation redefined".  It gives
>> away the dirty secret:  di_iarg!  Oh, so counterintuitive.  Anyhow, you can:
>>
>>           dt_ident_t *aggfunc = idp->di_iarg;
>>
>> Then, you can check that the aggfunc is min or max by comparing
>> aggfunc->di_id to DT_AGG_MIN or DT_AGG_MAX (or comparing
>> aggfunc->di_name to "min" or "max", but that's probably a bad idea!).
>>
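To make the di_id versus di_iarg distinction concrete, here is a small self-contained sketch.  It assumes a cut-down struct ident with fields that merely mirror di_kind, di_id, and di_iarg; the struct, the KIND_/FUNC_ constants, and agg_is_minmax() are all hypothetical and are not the libdtrace definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the libdtrace types/constants. */
enum { KIND_AGG = 1 };
enum { FUNC_COUNT = 1, FUNC_MIN, FUNC_MAX };

struct ident {
    int   kind;   /* stands in for di_kind */
    int   id;     /* stands in for di_id */
    void *iarg;   /* stands in for di_iarg */
};

/*
 * Return 1 if the aggregation uses min() or max(), else 0.
 * The aggregation's own id only names the aggregation (e.g. "@a");
 * the function it uses is found through iarg.
 */
static int agg_is_minmax(const struct ident *idp)
{
    const struct ident *aggfunc = idp->iarg;

    assert(idp->kind == KIND_AGG);
    assert(aggfunc != NULL);

    return aggfunc->id == FUNC_MIN || aggfunc->id == FUNC_MAX;
}
```

In the sketch, an aggregation whose own id happens to equal FUNC_MAX is still not treated as max() unless the ident behind iarg says so.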
>> On 12/02/2020 09:01 PM, Eugene Loh wrote:
>>> Totally guessing here, but maybe it'll get you started.  Maybe where you
>>> describe, you can dt_idhash_iter(dtp->dt_aggs,...).  Then check the
>>> di_kind for each ident and look for DT_IDENT_AGG.  If so, check di_id for
>>> what kind of aggregation it is.  If it's MIN or MAX, you can get the
>>> offset from di_offset, and then initialize both values (di_offset and
>>> di_offset + 8).  I've never done any of this, so I have no idea if it's
>>> right.
>>>
>>>
>>> On 12/02/2020 07:46 PM, david.mclean at oracle.com wrote:
>>>> I noticed during my testing that min() and max() need to be initialized
>>>> to non-zero values before they are used.
>>>> For example, it is a problem if the initial value is zero for max()
>>>> and after that only negative values are fed to max(): the result will
>>>> be zero, which is larger than every value actually supplied, so the
>>>> final mapped value is incorrectly large.
>>>>
>>>> Eugene pointed me to using dt_bpf_map_update() to populate these
>>>> initialization values, but I don't know where in the code base I would
>>>> want to issue the function to initialize the values.  I've been spending
>>>> my time in the aggs and _impl functions and I don't know my way around
>>>> enough to easily find the preferred place for the call.
>>>>
>>>> My guess is I should add something toward the end of
>>>> dt_bpf_gmap_create(), maybe right after the other dt_bpf_map_update()
>>>> call, and find a way to detect which maps belong to min() or max()
>>>> aggregations.
>>>>
>>>> I figure for min() I would want to initialize to the highest value
>>>> (0x7FFFFFFFFFFFFFFF) and for max() I should initialize to the lowest
>>>> value (0x8000000000000000).
>>>>
>>>> Some guidance to speed this up would be appreciated.
>>>>
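The zero-initialization problem described in the original report can be reproduced with a few lines of plain C.  This sketch has nothing to do with the BPF map itself; running_max() and running_min() are hypothetical helpers that only show why the starting value matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold vals[] into a running maximum, starting from init. */
static int64_t running_max(int64_t init, const int64_t *vals, size_t n)
{
    int64_t cur = init;
    size_t  i;

    assert(vals != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (vals[i] > cur)
            cur = vals[i];

    return cur;
}

/* Fold vals[] into a running minimum, starting from init. */
static int64_t running_min(int64_t init, const int64_t *vals, size_t n)
{
    int64_t cur = init;
    size_t  i;

    assert(vals != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (vals[i] < cur)
            cur = vals[i];

    return cur;
}
```

With the inputs -5, -2, -9, starting running_max() from 0 yields 0, while starting it from INT64_MIN yields -2.  The mirror-image problem affects running_min() when only positive values are supplied and the starting value is 0 rather than INT64_MAX.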
>>>> _______________________________________________
>>>> DTrace-devel mailing list
>>>> DTrace-devel at oss.oracle.com
>>>> https://oss.oracle.com/mailman/listinfo/dtrace-devel
>>
>
