[DTrace-devel] [PATCH v3 06/21] dtprobed: add the DOF stash

Thu Feb 15 12:26:43 UTC 2024

On 14 Feb 2024, Kris Van Hees stated:
>> I did that at first, but it persisted in not working.
>> 
>> Alas, it turns out the output of the parser *is* PID-specific despite
>> the DOF being the same for every mapping of a given, uh, mapping,
>> because the parsing process depends on the dof_helper_t :( in
>> particular, the dofhp_addr is address-space-specific, so more or less
>> process-specific. For now, I think this is the best we can do. (In
>> future, maybe we can pass the dofhp_addr on along with the relocs it is
>> used with, and relocate the raw DOF later on in dtrace proper, but for
>> now parsing isn't so expensive that a parse per ioctl is likely to be a
>> performance problem. We can usually parse many DOFs per parser process,
>> in a stream...)
>
> I am not following this...  The parser can surely be adjusted to not resolve
> the relocations (because we are also storing, per pid, the data needed to do
> that relocation when we actually need it - in dtrace itself).  And that makes
> it that we can reuse the parsed form without any problem, right?

Yeah. My current idea is to record the file offset of every addr in the
stream of dof_parser_t records, then emit all the relocs into the stream
as well and relocate them as we split them up into probes, in
dof_stash.c. This is actually pretty easy :)
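
A minimal sketch of that idea, not the actual dof_stash.c code. It
assumes each recorded addr is a 64-bit value in the serialized stream,
that the reloc list is simply an array of byte offsets into that stream,
and that relocating means adding a per-process base (standing in for
dofhp_addr). The name relocate_stream() and its signature are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: add 'base' to every 64-bit address found at the
 * byte offsets listed in 'offs'.  All offsets are validated before any
 * byte is changed, so a bad reloc list leaves the stream untouched.
 * Returns 0 on success, -1 if any offset falls outside the buffer.
 */
static int
relocate_stream(unsigned char *buf, size_t len, const size_t *offs,
		size_t noffs, uint64_t base)
{
	size_t i;

	assert(buf != NULL || len == 0);

	/* Pass 1: bounds-check every recorded offset. */
	for (i = 0; i < noffs; i++) {
		if (offs[i] > len || len - offs[i] < sizeof(uint64_t))
			return -1;
	}

	/* Pass 2: apply the base.  memcpy avoids unaligned access. */
	for (i = 0; i < noffs; i++) {
		uint64_t addr;

		memcpy(&addr, buf + offs[i], sizeof(addr));
		addr += base;
		memcpy(buf + offs[i], &addr, sizeof(addr));
	}
	return 0;
}
```

With something of this shape the parsed stream stays identical for every
process mapping the same DOF, and only the final relocation pass needs
the per-process address.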

(Assuming we need this code at all. From your comments elsewhere that
it's only used for userspace symbol relocs which are more or less
entirely unimplemented elsewhere, I think I'll try simply ripping out
the reloc code in the parser wholesale, seeing if things still work, and
if they do simply moving back to a parse-once-per-mapping approach.
First thing, see if any relocs are ever processed: if you are right,
none will be, and this is dead code.)

>> > Then I would add a file:
>> >
>> > 	.../dof-pid/$pid/$dev-$ino.dh
>> >
>> > which contains the dof_helper_t data for that mapping in this particular task
>> > (by pid).  Together with the data in .../dof-pid/$pid/$dev-$ino (which is
>> > .../dof/$dev-$ino), you have enough to identify everything we need to know
>> > about the USDT probes in a particular task.
>> 
>> ... in the world you thought we were in, in which parsing is not
>> affected by per-process state, that is practical. In this one, alas no.
>> I had something similar to this design originally, but it doesn't work
>> :(
>> 
>> [snipped bits which are only relevant if the redesign above went
>>  through, which alas... I wish.]
>
> Again, I do not see where the problem lies.

It's possible that I had two simultaneous bugs that misled me into
believing that parsed DOF was per-PID (something I considered at the
time but wrote off as implausible after doing parsing per-PID fixed
everything). Will reinvestigate.

>> >>  *
>> >>  * /run/dtrace/probes: Per-probe info, written by dtprobed, read by DTrace.
>> >>  *
>> >>  *    .../$pid/$prv$pid/$mod/$fun/$prb: Hardlink from $prv$pid:$mod:$fun:$prb
>> >>  *    above; parsed representation of one probe in a given process. Removed by
>> >>  *    dtprobed when the process dies, or if all mappings containing the probe
>> >>  *    are unmapped.  Used by DTrace for tracing by PID.
>> >
>> > Why not use .../$pid/$prv/$mod/$fun/$prb?  Since you are using globbing to
>> > search for probe name matches, it doesn't matter where the pid is placed, and
>> > there is no need to have it listed twice.  DTrace can still register the probe
>> > internally as $prv$pid after its definition has been found.
>> 
>> Because the provider name DTrace is dealing with already has the PID as
>> part of the name, and slicing them in half again would prevent us from
>> ever putting USDT probes in programs with numbers at the end of their
>> names (something which is very common! I have 300+ of these in /usr/bin
>> here.)
>
> Why does this matter?  You are not slicing anything in half that has to do
> with the executable name because provider name != executable name.  The only
> issue is where the provider name ends in numeric digits so this would already
> have been an issue with DTrace overall?

Oh true! I always forget that :)

Using $prv$pid is *still* beneficial because it means we can do matching
in DTrace using nothing but glob(), without needing to cut anything
apart, but the benefit is marginal because we cut things apart in dtrace
*anyway*. Hmm.
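
To illustrate the glob-only matching, here is a rough sketch. It assumes
the .../$pid/$prv$pid/$mod/$fun/$prb layout from the comment quoted
above, treats an empty probe description component as "*", and uses
fnmatch(3) on path strings so that it needs no real files (glob(3)
applies the same pattern syntax to the filesystem). probe_pattern() and
probe_path_matches() are hypothetical names, not existing DTrace
functions.

```c
#include <fnmatch.h>
#include <stdio.h>

/* An empty or missing component matches anything. */
static const char *
or_star(const char *s)
{
	return (s == NULL || *s == '\0') ? "*" : s;
}

/*
 * Build a pattern for the per-probe directory layout.  'prv' is the
 * provider name exactly as the user wrote it, PID suffix (or wildcard)
 * included, so nothing has to be split into name and PID.
 * Returns 0 on success, -1 if the pattern does not fit in 'buf'.
 */
static int
probe_pattern(char *buf, size_t len, const char *prv, const char *mod,
	      const char *fun, const char *prb)
{
	int n = snprintf(buf, len, "/run/dtrace/probes/*/%s/%s/%s/%s",
			 or_star(prv), or_star(mod), or_star(fun),
			 or_star(prb));

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* FNM_PATHNAME keeps '*' from matching across '/' boundaries. */
static int
probe_path_matches(const char *pattern, const char *path)
{
	return fnmatch(pattern, path, FNM_PATHNAME) == 0;
}
```

The provider component goes into the pattern untouched, which is the
whole of the benefit: the same string works as a directory name and as a
glob.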

-- 
NULL && (void)