[DTrace-devel] min() and max() aggregation map initialization

Eugene Loh eugene.loh at oracle.com
Fri Dec 4 13:15:34 PST 2020


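Per discussions with Kris, this code should probably go into
dt_aggregate_go().  After the aggsz<=0 check, add something like

     dt_idhash_iter(dtp->dt_aggs, (dt_idhash_f *) init_minmax, dtp);

And then just stick the callback function before dt_aggregate_go().
Maybe something like:

     static int init_minmax(dt_idhash_t *dhp, dt_ident_t *idp, void *arg) {
         dtrace_hdl_t *dtp = arg;            /* handle passed to dt_idhash_iter() */
         dt_ident_t *aggfunc = idp->di_iarg; /* aggregation function, e.g. min() */
         int64_t value;

         assert(idp->di_kind == DT_IDENT_AGG);
         assert(aggfunc);
         if (aggfunc->di_id == DT_AGG_MIN)
             value = INT64_MAX;   /* any real value will be smaller */
         else if (aggfunc->di_id == DT_AGG_MAX)
             value = INT64_MIN;   /* any real value will be larger */
         else
             return 0;            /* nothing to initialize */

         /*
          * Unfinished: key and ptr still have to be set up so that value
          * lands at idp->di_offset in the aggregation data.
          */
         dt_bpf_map_update(dtp->dt_aggmap_fd, &key, ptr);
         printf("EUGENE: offset %d\n", idp->di_offset);
         return 0;
     }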
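
To see why those two sentinel values matter, here is a small standalone
sketch.  It does not use any DTrace code; run_max() and run_min() are
made-up helpers that just mimic what a max() or min() aggregation does to
its stored value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a max() aggregation: fold samples into init. */
static int64_t run_max(int64_t init, const int64_t *samples, size_t n)
{
    int64_t cur = init;

    assert(samples != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (samples[i] > cur)
            cur = samples[i];
    return cur;
}

/* Hypothetical stand-in for a min() aggregation: fold samples into init. */
static int64_t run_min(int64_t init, const int64_t *samples, size_t n)
{
    int64_t cur = init;

    assert(samples != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (samples[i] < cur)
            cur = samples[i];
    return cur;
}
```

With only negative samples, run_max() started at 0 never moves off 0,
while started at INT64_MIN it returns the largest sample.  The mirror
case holds for run_min() with only positive samples and INT64_MAX.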

For stuff like dtp->dt_aggmap_fd, you might need Kris's latest branch.

The map-update stuff is tricky.  All of the aggregation data sits under a
single key=0.  So, I *think* you have to update the entire map value in one
go.  If so, you have to allocate some memory to hold it.
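
A rough sketch of the "allocate and fill" part, under some assumptions:
struct minmax_slot and build_agg_init() are invented for illustration and
are not DTrace types or functions, the buffer is assumed to be aggsz bytes
and otherwise zero, and writing the value at both offset and offset + 8
follows the guess made earlier in this thread rather than anything verified.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one min()/max() slot; not a DTrace type. */
struct minmax_slot {
    size_t  offset;  /* byte offset of the slot in the aggregation data */
    int64_t init;    /* INT64_MAX for min(), INT64_MIN for max() */
};

/*
 * Build a zeroed buffer of aggsz bytes and store each slot's initial
 * value at offset and offset + 8.  Returns NULL if allocation fails or
 * a slot does not fit.  The caller frees the buffer.
 */
static void *build_agg_init(size_t aggsz, const struct minmax_slot *slots,
                            size_t nslots)
{
    unsigned char *buf;

    assert(slots != NULL || nslots == 0);
    buf = calloc(1, aggsz);
    if (buf == NULL)
        return NULL;

    for (size_t i = 0; i < nslots; i++) {
        size_t off = slots[i].offset;

        /* Each slot needs room for two 8-byte values. */
        if (off > aggsz || aggsz - off < 2 * sizeof(int64_t)) {
            free(buf);
            return NULL;
        }
        memcpy(buf + off, &slots[i].init, sizeof(int64_t));
        memcpy(buf + off + sizeof(int64_t), &slots[i].init, sizeof(int64_t));
    }
    return buf;
}
```

The idea would be to collect the slots while iterating over dtp->dt_aggs,
build the buffer once, hand it to dt_bpf_map_update() with key 0 as the
value, and then free it.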


On 12/03/2020 03:15 PM, Eugene Loh wrote:
> Incidentally, my answer from last night was not quite right.
>
> I was suggesting iterating over the aggregations with dt_idhash_iter().
> Then, there would be some callback function with a dt_ident_t pointer,
> call it idp.  I think idp->di_kind should be DT_IDENT_AGG for each one.
>
> But what kind of aggregation function does it use?  I suggested looking
> at idp->di_id, but the problem with that is it's the ID that identifies
> the aggregation, not the aggregation function it uses.  E.g., if you
> have "@a = count()", then idp->di_id identifies "@a", while you want to
> know about "count()".
>
> Apparently, you can still get at the aggregation function.  I mean, you
> have to be able to since "@a = count(); @a = max(1)" is illegal.  So, we
> have to be able to check that the aggregation function associated with
> an aggregation is always consistent.  The code for that is in
> dt_parser.c;  look for the string "aggregation redefined".  It gives
> away the dirty secret:  di_iarg!  Oh, so counterintuitive.  Anyhow, you can:
>
>           dt_ident_t *aggfunc = idp->di_iarg;
>
> Then, you can check that the aggfunc is min or max by comparing
> aggfunc->di_id to DT_AGG_MIN or DT_AGG_MAX (or comparing
> aggfunc->di_name to "min" or "max", but that's probably a bad idea!).
>
> On 12/02/2020 09:01 PM, Eugene Loh wrote:
>> Totally guessing here, but maybe it'll get you started.  Maybe where you
>> describe, you can call dt_idhash_iter(dtp->dt_aggs,...). Then check the
>> di_kind for each ident and look for DT_IDENT_AGG. If so, check di_id for
>> what kind of aggregation it is.  If it's MIN or MAX, you can get the
>> offset from di_offset, and then initialize both values (di_offset and
>> di_offset + 8).  I've never done any of this, so I have no idea if it's
>> right.
>>
>>
>> On 12/02/2020 07:46 PM, david.mclean at oracle.com wrote:
>>> I noticed during my testing that min() and max() need to be initialized
>>> to non-zero values before they are used.
>>> For example, it is a problem if the initialized value is zero for max()
>>> and after that only negative values are fed to max() -- the result will
>>> be zero, which is larger than any value actually recorded, so the final
>>> mapped values are wrong.
>>>
>>> Eugene pointed me to using dt_bpf_map_update() to populate these
>>> initialization values, but I don't know where in the code base I would
>>> want to issue the function to initialize the values.  I've been spending
>>> my time in the aggs and _impl functions and I don't know my way around
>>> enough to easily find the preferred place for the call.
>>>
>>> My guess is I should add something toward the end of
>>> dt_bpf_gmap_create(), maybe right after the other dt_bpf_map_update()
>>> call, and find a way to detect which maps belong to min() or max()
>>> aggregations.
>>>
>>> I figure for min() I would want to initialize to the highest value
>>> (0x7FFFFFFFFFFFFFFF) and for max() I should initialize to the lowest
>>> value (0x8000000000000000).
>>>
>>> Some guidance to speed this up would be appreciated.
>>>
>>> _______________________________________________
>>> DTrace-devel mailing list
>>> DTrace-devel at oss.oracle.com
>>> https://oss.oracle.com/mailman/listinfo/dtrace-devel