[DTrace-devel] [PATCH] Action clear() should clear only one aggregation

Thu Aug 15 01:06:04 UTC 2024

On Wed, Aug 14, 2024 at 07:05:44PM -0400, Eugene Loh wrote:
> On 8/14/24 18:28, Kris Van Hees wrote:
> 
> > On Wed, Aug 14, 2024 at 05:50:27PM -0400, Eugene Loh wrote:
> > > On 8/14/24 11:57, Kris Van Hees wrote:
> > > 
> > > > On Tue, Aug 13, 2024 at 10:59:25PM -0400, eugene.loh at oracle.com wrote:
> > > > > From: Eugene Loh <eugene.loh at oracle.com>
> > > > > 
> > > > > FIXME:  Is libdtrace/libdtrace.ver correctly updated?
> > > Incidentally, what has to be done in this file?
> > > 
> > > > I think that keeping dtrace_aggregate_walk() as-is would be better, along with
> > > > adding a function dtrace_aggregate_clear_one(dtp, aid) that performs the
> > > > lookup from aid to dtrae_aggdata_t *agd, and then calls
> > > > dt__aggregate_clear_one(agd, dtp) to do the actual clearing.  There is no need
> > > > to go through a walker function when we already know we need to operate on
> > > > just one aggregation.
> > > I'm not convinced.
> > > 
> > > "Just one aggregation"?  Yes, just one aggid, but possibly many keys.  To
> > > narrow down to which agd we want, we need to know the key -- and clear()
> > > doesn't give us the key, plus there will generally be more than one key
> > > anyhow.
> > Ah true.  I blanked on the fact that clear(@a) clears that aggregation, and
> > any indexed aggregations with the same name (and thus same aid).
> > 
> > Anyway, given that dtrace_aggregate_walk() is an API function that is exported
> > by libdtrace, we should not change it.
> > 
> > So, I'd propose that instead of introducing a new function, we can make use
> > of the existing dtrace_aggregate_walk() by passing both dtp and the aid as an
> > arg struct.  E.g.
> > 
> > typedef struct dt_clear_arg {
> > 	dtrace_hdl_t	*dtp;
> > 	dtrace_aggid_t	aid;
> > } dt_clear_arg_t;
> 
> Hmm.  Then both dt_consume.c and dt_aggregate.c need to know about this new
> struct.  So, copy it to both files?  Introduce a new dt_aggregate.h?  Ick.

It would be appropriate to add it to dt_aggregate.h because it is a struct that
defines the callback argument.  There is enough precedent for that - in fact,
it is the way this is typically done.

> Maybe the thing to do is use dtrace_aggregate_walk() with a callback in
> dt_consume.c.  The callback can filter on the agg id and call
> dt_aggregate_clear_one() accordingly... I *think* that works.  Let me know
> if you're on board with this approach.

I think I'd rather see the dt_clear_arg_t thing.  I don't think a callback in
dt_consume.c would work without this struct argument because you still would
not be able to get both dtp and aid passed to that callback.

> > void
> > dtrace_aggregate_clear(dtrace_hdl_t *dtp)
> > {
> > 	dt_clear_arg_t	arg = { dtp, aid };
> > 
> >          dtrace_aggregate_walk(dtp, dt_aggregate_clear_one, &arg);
> > }
> > 
> > nd then dt_aggregate_clear_one() can check the aid and return if the aid
> > is not IDNOE and does not match the one for the aggregation it is called for.
> > That means you also need to change dt_aggregate_snap_one() but there you can
> > also define that arg and set aid as IDNONE in that because we know that if we
> > call dt_aggregate_clear_one() from here weknow it is for an aggregation that
> > needs to be cleared.
> > 
> > > Now, one can argue that in the current implementation and its current hash
> > > function, the aggid tells us at least which hash bucket to search.  So we
> > > could write a special, reduced loop.  Maybe so, but that would be wed to
> > > this particular choice of hash function, and I'd argue the current choice is
> > > a bad one.  E.g., consider:
> > > 
> > >          tick-10 { @pc[uregs[R_PC]] = count() }
> > >          syscall::write:entry { @writeentry = count() }
> > > 
> > > Lots of agd's will be sharing one bucket and another agd is sitting all by
> > > itself.  We ought to change the hash function, and then we'd have to undo
> > > the "search just one bucket" optimization.
> > > 
> > > PS  Here is an oddity:
> > > 
> > >      # /usr/sbin/dtrace -n 'BEGIN { @a[1] = count(); @a[2] = count();
> > > clear(@a[1]); exit(0); }'
> > >      dtrace: description 'BEGIN ' matched 1 probe
> > >       CPU     ID                    FUNCTION:NAME
> > >        34      1                           :BEGIN
> > > 
> > >              1                0
> > >              2                0
> > > 
> > > You *can* specify the key in the clear()!  But dtrace ignores the key.
> > > Silently.  Same behavior in Solaris.