[DTrace-devel] [PATCH 4/6] cg: add argument remapping in the trampoline

Tue Oct 29 14:51:59 UTC 2024

On Tue, Oct 29, 2024 at 01:09:08PM +0000, Nick Alcock wrote:
> On 25 Oct 2024, Kris Van Hees said:
> 
> > On Fri, Oct 25, 2024 at 04:56:26PM +0100, Nick Alcock wrote:
> >> > However, I think it might be a good idea to have this function actually rewrite
> >> > the mapping data to become a pure i -> i mapping.
> >> 
> >> Can we guarantee that trampoline() is only ever invoked once, and that
> >> it's invoked after probe_info is called? If it changes the remapping
> >> array, any subsequent invocation will do the wrong thing. If it's
> >> invoked before probe_info, it becomes impossible for probe_info to
> >> adjust the xlated types appropriately.
> >> 
> >> This all seems like way too much coupling to me. At least done
> >> this way trampoline() is idempotent and doesn't depend on being called
> >> in the right (undocumented) order wrt probe_info.
> >
> > If the actual probe arguments (argN) are being reshuffled (which is what you
> > are doing by shuffling the arguments in the trampoline), then the mapping
> > needs to be updated to become an identity mapping i -> i).  Otherwise there
> > could be other code at some point that tries to apply the argument mapping
> > and that would be really bad.
> 
> OK so I tried to implement this and it all fell over as soon as I
> thought about it in depth. The problem is -- and you might not have
> realised this because it was different when I started and I had to move
> things around before anything would work -- trampoline() and
> probe_info() are functions defined on *different probes*. trampoline()
> is called on the underlying probes, and emits code for them that
> reshuffles the args according to the mapping info (while ignoring the
> types entirely); but probe_info() is called on the *overlying* probes.
> It digs down to *one* underlying probe to get the dt_uprobe_t, but it
> has no idea which one (if there is more than one): it's basically
> arbitrary.

Yes, that is correct.  It is also not really important in the sense that by
their very nature the underlying probes *must* be identical in terms of the
argument data.  After all, the argument data is defined at the level of the
meta-probe (the probe as defined in the provider declaration that gets
processed with dtrace -G at build tie to generate the DOF).  So it is always
associated with all the tracepoints for that probe (and each tracepoint has
its own uprobe and thus also its own underlying probe).

None of this matters though in terms of how the argument data is stored.  If
you can store it in raw format, you can also store it in dt_argdesc_t format.

> So we cannot possibly have the trampoline() for the underlying probe go
> changing the dt_uprobe_t's mapping unless we can be absolutely sure that
> the trampoline() for all the underlying probes is called before the
> probe_info() for *any* of the overlying probes. Are we really sorting
> things like that? I thought this was a TODO, if that...

Changing how you store the data does not change the logic you already have in
place to handling the mapping of arguments, etc.  It is just a change in
how things are stored - not a change in behaviour.

> I tried storing the arg info in dt_argdesc_t structures in the
> dt_uprobe_t too and it rapidly turned into *another* nightmare, because
> now instead of one allocation for all the args at once we have one
> allocation per arg, plus up to two more for its arg type strings -- but
> we can't just hand the argdesc back to probe_info()s caller wholesale
> because it frees it after it's done with it! So we'd actually have to
> loop through them and copy the whole thing (including all the arg
> strings) and then do the *same thing* in the underlying probe freeing
> function, and all this complexity buys us what? The ability to save four
> lines in probe_info() going through the nargv so we can index it like an
> array (which we'd have to do the nargc *and* xargc in your scheme).

Really?  Your current patch stores the argument data in dt_uprobe_t in
create_underlying() using 3 allocations each with a memcpy.  The alternative
could use a single dt_calloc(dtp, psp->pps_xargc, sizeof(dt_argdesc_t)) and
a loop to populate the elements.  It would have to do 2 strdups per argument
or you could be fancy and copy all type strings as a single memcpy() into the
first argument and store pointers into that for the rest for the other args.
As long as we have our own code to free all that, such an opitmization can
certainly be used.

The loop to find the strings in the combined memory blobs therefore gets done
here (once) instead of in probe_info (every time that gets called).

You already need to allocate an dt_argdesc_t array and populate it within
probe_info because that is what the caller expects, so the work to do that
does not change, other than that you are making a deep copy of the dt_argdesc_t
array already available from the underlying probe rather than needing to
construct one from raw data for every call to probe_info.

> Sorry, this seems again to have zero benefits and a great deal of extra
> complexity.

Pretty much the same complexity, and you gain the benefit of actually storing
the argument data in the structure format that exists for that very reason.