[DTrace-devel] [DTrace] DTrace userspace branch dm/2.0-branch-dev-aggs updated. 2ffdae6376cbe280267577edbfc5dc93a776398f

Eugene Loh eugene.loh at oracle.com
Wed Nov 25 17:05:09 PST 2020


----- david.mclean at oracle.com wrote:

> On 11/25/20 9:41 AM, Eugene Loh wrote:
> > On 11/24/2020 08:57 PM, nick.alcock at oracle.com wrote:
> > 
> >> commit 2ffdae6376cbe280267577edbfc5dc93a776398f
> >> Author: David  Mc Lean <david.mclean at oracle.com>
> >> Date:   Thu Nov 19 12:04:39 2020 -0800
> >>
> >>       aggregations max(), min(), sum(), and stddev() WIP
> >>
> > 2.  The aggregation functions started with this foo() and foo_impl()
> > approach, where foo_impl() would get called twice for each update (since
> > there were two copies to update).  With the *quantize() functions, which
> > have "complicated" conversions from values to bin numbers, there is a
> > third function.  We call that other function once and the foo_impl()
> > function twice.  In your case, I would imagine this new function to
> > determine the 128-bit square, placing this result in a pair of
> > registers, and to have the std_impl() function simply do the +1, +val,
> > and +valsquared aggregation.
> 
> So, still the two functions I have but with the heavy lifting in the 
> other function -- if I follow what you are conveying.

I hadn't thought of it like that, but I guess so.

The reason I hadn't thought of it like that was that things
like quantize() have three functions:  the "parent" function,
the _impl() function (which is called twice), and the "heavy
lifting" function.  The "parent" and the "heavy lifter" are
different functions for two reasons:

1)  They do rather different things.  The "parent" gets and
checks arguments (at least for lquantize and llquantize),
which is pretty hairy.  The heavy lifter implements a bunch
of stuff in BPF.

2)  The parent does usual C stuff.  The heavy lifter ends up
as BPF:  either we emit the BPF instructions directly, or we
implement it in C, to be compiled by the BPF cross compiler.

For standard deviation... don't know.  Combining so you're
left with a grand total of only two functions is perhaps the
way to go.
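
To make the split concrete, here is a rough sketch in plain C
(not BPF, and not the code in the tree).  All names and the
data layout are made up for illustration, and the 128-bit
square leans on the GCC/Clang unsigned __int128 extension just
to keep the sketch short.  One function computes the square
once;  the _impl() function does only the +1, +val, and
+valsquared on one copy;  the "parent" calls the former once
and the latter twice.

```c
#include <stdint.h>

/* Hypothetical layout of one copy of the stddev() data. */
struct std_data {
    uint64_t count;     /* number of values seen */
    int64_t  sum;       /* running sum of values */
    uint64_t sumsq_lo;  /* low 64 bits of the sum of squares */
    uint64_t sumsq_hi;  /* high 64 bits of the sum of squares */
};

/* "Heavy lifter": compute val * val as a (hi, lo) pair, once. */
static void std_square(int64_t val, uint64_t *hi, uint64_t *lo)
{
    /* Square the magnitude; the sign does not affect the square. */
    uint64_t mag = val < 0 ? 0 - (uint64_t)val : (uint64_t)val;
    unsigned __int128 sq = (unsigned __int128)mag * mag;

    *lo = (uint64_t)sq;
    *hi = (uint64_t)(sq >> 64);
}

/* _impl(): only the +1, +val, +valsquared on a single copy. */
static void std_impl(struct std_data *d, int64_t val,
                     uint64_t hi, uint64_t lo)
{
    d->count += 1;
    d->sum += val;
    d->sumsq_lo += lo;
    /* Carry into the high word if the low word wrapped. */
    d->sumsq_hi += hi + (d->sumsq_lo < lo);
}

/* "Parent": square once, then update both copies. */
static void std_update(struct std_data copies[2], int64_t val)
{
    uint64_t hi, lo;

    std_square(val, &hi, &lo);
    std_impl(&copies[0], val, hi, lo);
    std_impl(&copies[1], val, hi, lo);
}
```

Whether std_square() stays separate or gets folded into the
parent is exactly the two-versus-three-function question.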

> > Incidentally, the DTv1 code has that implementation in terms of
> > dtrace_[multiply|shift|add]_128 in dtrace_probe_ctx.c.  Did you decide
> > not to go that route?  It's intended to be more general (multiply two
> > different factors rather than squaring just one; shift by an arbitrary
> > amount rather than a known 32), so it could be simplified a lot before
> > implementing in BPF.
> 
> I could use some more detail about how I would go that route.  I've been 
> just looking at how to get the job done with this BPF code, maybe there 
> is a different way that would be better.

I'm unclear what more detail you'd like.  Again, the code is in
dtrace_probe_ctx.c.  The functions are dtrace_[multiply|shift|add]_128.
The code can be simplified for the case at hand (both factors are the
same, the shift is fixed, etc.).  Largely the same as what you're doing,
though the "carry over" is not just one nibble.
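
For what it's worth, here is roughly what I mean by the
simplified version, as a plain C sketch.  This is not the DTv1
code;  the name square_128() is made up, and I assume the value
has already been made non-negative.  Split the value into
32-bit halves a and b, so val^2 = a^2 * 2^64 + 2ab * 2^32 + b^2.
With both factors the same there is only one cross term, and
the shift is a fixed 33 (the 32 plus the factor of 2).

```c
#include <stdint.h>

/*
 * Square a non-negative 64-bit value into a 128-bit (hi, lo) pair
 * using only 64-bit arithmetic.  Hypothetical helper for illustration.
 */
static void square_128(uint64_t val, uint64_t *hi, uint64_t *lo)
{
    uint64_t a = val >> 32;          /* upper 32 bits */
    uint64_t b = val & 0xffffffffu;  /* lower 32 bits */
    uint64_t cross = a * b;          /* fits in 64 bits */
    uint64_t cross_lo = cross << 33; /* low word of 2ab * 2^32 */

    *hi = a * a;
    *lo = b * b;

    /* The cross term spills up to 33 bits into the high word. */
    *hi += cross >> 31;

    /* Add its low part, carrying into the high word on wraparound. */
    *lo += cross_lo;
    if (*lo < cross_lo)
        *hi += 1;
}
```

For example, square_128(0xffffffffffffffff, &hi, &lo) leaves
hi = 0xfffffffffffffffe and lo = 1.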


