[DTrace-devel] [PATCH v3 06/21] dtprobed: add the DOF stash

Nick Alcock nick.alcock at oracle.com
Wed Feb 14 13:53:42 UTC 2024


On 14 Feb 2024, Kris Van Hees told this:

> On Mon, Feb 12, 2024 at 10:00:37PM +0000, Nick Alcock wrote:
>> On 12 Feb 2024, Kris Van Hees spake thusly:
>> 
>> > Initial comments (mostly on the comments, that are pretty crucial for people
>> > to understand what is going on here).
>> 
>> ... here's the new top-of-file comment block, which hopefully make a bit
>> more sense now:
>> 
>> /*
>>  * Oracle Linux DTrace; DOF state storage management.
>>  * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved.
>>  * Licensed under the Universal Permissive License v 1.0 as shown at
>>  * http://oss.oracle.com/licenses/upl.
>>  *
>>  * The DOF stash is the principal state preservation mechanism used by dtprobed.
>>  * Keeping DOF on disk keeps memory management complexity low, makes it
>>  * possible for dtprobed to track DOF across daemon restarts and upgrades,
>>  * and gives the rest of dtrace a place to acquire the DOF contributed by
>>  * processes on the system for USDT probe operations.
>>  *
>>  * The state storage structure is as follows:
>>  *
>>  * /run/dtrace/stash: Things private to dtprobed.
>
> Perhaps use a trailing / for all entries here and below that are directories,
> so it is immediately obvious what are files and what are directories?

Ack! good idea, particularly given the mistakes I'd already made there :)

>>  *
>>  *    .../dof/$dev-$ino: DOF contributed by particular mappings, in raw form
>>  *    (as received from some probe-containing program).
>
> I would think that this is a directory that contains:
>
> 	   ../raw				- raw DOF
> 	   ../parsed				- parsed DOF
> 	   ../parsed/$prv:$mod:$fun:$prb	- per-probe parsed DOF
>
> because this data is not task (or PID) specific.  Granted, dtrace will use
> a modified provider name ($prv$pid) when it creates probes for a specific task,
> but at this level there is nothing task specific in this data.

I did that at first, but it persisted in not working.

Alas, it turns out the output of the parser *is* PID-specific despite
the DOF being the same for every mapping of a given, uh, mapping,
because the parsing process depends on the dof_helper_t :( in
particular, the dofhp_addr is address-space-specific, so more or less
process-specific. For now, I think this is the best we can do. (In
future, maybe we can pass the dofhp_addr on along with the relocs it is
used with, and relocate the raw DOF later on in dtrace proper, but for
now parsing isn't so expensive that a parse per ioctl is likely to be a
performance problem. We can usually parse many DOFs per parser process,
in a stream...)

>>  *
>>  *    .../dof-pid/$pid/$dev-$ino: contains everything relating to DOF
>>  *    contributed by a particular USDT-containing ELF object within a given
>>  *    process (henceforth "DOF source").  Each DTRACEHIOC_ADDDOF ioctl call
>>  *    creates one of these.
>
> I would make this a hardlink to .../dof/$dev-$ino.

... you just said you wanted this to be a directory :) can't hardlink
those, which breaks the "use the link count" refcounting method I'm
using: so I'd need something else, probably another per-mapping file
under dof-pid hardlinked to something (the raw DOF?) under the $dev-$ino
directory. However...

> Then I would add a file:
>
> 	.../dof-pid/$pid/$dev-$ino.dh
>
> which contains the dof_helper_t data for that mapping in this particular task
> (by pid).  Together with the data in .../dof-pid/$pid/$dev-$ino (which is
> .../dof/$dev-$ino), you have enough to identify everything we need to know
> about the USDT probes in a particular task.

... in the world you thought we were in, in which parsing is not
affected by per-process state, that is practical. In this one, alas no.
I had something similar to this design originally, but it doesn't work
:(

[snipped bits which are only relevant if the redesign above went
 through, which alas... I wish.]

>>  *
>>  * /run/dtrace/probes: Per-probe info, written by dtprobed, read by DTrace.
>>  *
>>  *    .../$pid/$prv$pid/$mod/$fun/$prb: Hardlink from $prv$pid:$mod:$fun:$prb
>>  *    above; parsed representation of one probe in a given process. Removed by
>>  *    dtprobed when the process dies, or if all mappings containing the probe
>>  *    are unmmapped.  Used by DTrace for tracing by PID.
>
> Why not use .../$pid/$prv/$mod/$fun/$prb?  Since you are using globbing to
> search for probe name matches, it doesn't matter where the pid is placed, and
> there is no need to have it listed twice.  DTrace can still register the probe
> internally as $prv$pid after its definition has been found.

Because the provider name DTrace is dealing with already has the PID as
part of the name, and slicing them in half again would prevent us from
ever putting USDT probes in programs with numbers at the end of their
names (something which is very common! I have 300+ of these in /usr/bin
here.)

(And we need the pid to be separate because there are things we need to
do when processes go away which require us to remove the entire
directory, and it's far easier to do that if they're all grouped
together properly under one dir.)

-- 
NULL && (void)



More information about the DTrace-devel mailing list