[DTrace-devel] [PATCH v3 06/21] dtprobed: add the DOF stash
Nick Alcock
nick.alcock at oracle.com
Thu Feb 15 21:55:53 UTC 2024
On 15 Feb 2024, Kris Van Hees said:
> On Thu, Feb 15, 2024 at 01:13:17PM +0000, Nick Alcock wrote:
>> On 15 Feb 2024, Kris Van Hees uttered the following:
>> >> + /*
>> >> + * Probe: make (mod, fun, probe) dirs as needed: close
>> >> + * out file, open new one, make hardlink to it.
>> >> + */
>> >> + case DIT_PROBE: {
>> >> + assert(state >= 0);
>> >
>> > This is ok, but DIT_TRACEPOINT uses:
>> >
>> > assert(state == DIT_PROBE || state == DIT_TRACEPOINT);
>> >
>> > which is much better, right? So why not use that here?
>>
>> We're not checking the same thing. DIT_TRACEPOINT needs to validate that
>> the tracepoint is preceded by a probe (or a probe and a string of
>> tracepoints): DIT_PROBE only needs there to have been at least one
>> provider (which means the state might be *anything* this state machine
>> can reach, but cannot be -1, the before-we-started state).
>
> Maybe make it explicitly list all three possibilities? That makes it more
> clear I think, and avoids problems if the state machine might need to be
> expanded in the future to include other cases where DIT_PROBE is not valid
> after just any previous state.
OK, done. I was thinking that it would be an advantage to have it work
even after DIT_* was expanded, but that is obvious rubbish! Now we get
informed instantly if we expand DIT_* and fail to update this switch
accordingly.
>> (This is all outright asserts because it can't fire unless dof_parser.c
>> is buggy anyway.)
>>
>> >> + /*
>> >> + * This rather weird pattern carries out all closes and
>> >> + * then errors if any of them fail.
>> >> + */
>> >> + err = maybe_close(mod_dir);
>> >> + err |= maybe_close(fun_dir);
>> >> + err |= maybe_close(fd);
>> >
>> > Why not do this at the beginning of the DIT_PROBE processing? Obviously, if
>> > a previous DIT_PROBE existed, this new one ends the previous one. There does
>> > not seem to be any point in not doing these closes immediately. Even if the
>> > processing of this probe fails somehow, the previous one is known to be done.
>>
>> I did that originally, but it meant duplicating all of this three times,
>> at the start of DIT_PROBE, after the loop, and in the error handlers.
>> There's too much error handling and closing noise in this function already...
>
> You already have it three times in this function. I merely mean that this
> instance can be moved to the beginning of the DIT_PROBE case, which seems
> better. It looks odd that other work is done first, even though it does not
> in any way affect whether these closes are done, and it also does not depend
> on them not being done yet.
... am I? Oh dammit you're quite right. OK, moved to the top, right
after 'state' is set. I thought it was clearer to have them closed right
before they were used, but if you'd rather do it further away (and keep
all the initialization in one place, I suppose), that's fine with me.
Doesn't mean we can get rid of any of the other closes though :(
>> >> + /*
>> >> + * Write out the raw DOF, if not already known, and link it into its
>> >> + * per-PID dir: write the helper directly out. dof_stash_write tells us
>> >> + * whether the DOF actually needed writing or not (via a straight size
>> >> + * comparison, which suffices to spot almost everything other than the
>> >> + * pathological case of stale DOF from a killed process whose inode is
>> >> + * unlinked and replaced with another identically-numbered inode with
>> >> + * DOF of the *exact same length*: this seems too unlikely to be worth
>> >> + * bothering with.)
>> >> + */
>> >> + new_dof = 1;
>> >> + switch (dof_stash_write(dof_dir, dof_name, dof, size)) {
>> >> + case 0: new_dof = 0; break;
>> >> + case -1: goto err_unlink_nomsg; break;
>> >> + }
>> >
>> > Why this odd switch statement? For one, the -1 case doesn't need a break
>> > because it has a goto right before that break. Couldn't this be simply
>> > written as:
>> >
>> > new_dof = 1;
>> > if (dof_stash_write(dof_dir, dof_name, dof, size) == -1)
>> > goto err_unlink_nomsg;
>> > new_dof = 0;
>> >
>> > or even:
>> >
>> > new_dof = dof_stash_write(dof_dir, dof_name, dof, size);
>> > if (new_dof == -1)
>> > goto err_unlink_nomsg;
>>
>> dof_stash_write_raw() has a tripartite return value:
>>
>> -1: error
>> >0: DOF written, size given
>> 0: *no DOF written*.
>>
>> The case statement is handling that, much as it would do for fork()
>> (which I modelled this on, return-value-wise).
>>
>> I've augmented the comment above it to indicate that.
>
> Ah, my fault. I was looking at the wrong function when checking what the
> return values of dof_stash_write() are. My stupidity - sorry.
My fault for using such a terrible name for it! (Fixed in the tree I
pushed a few minutes ago.)
>> >> + *
>> >> + * The first entry must be a file: the next must be directories. If this is
>> >> + * violated, just return without unlinking anything (but do log an error).
>> >
>> > First entry of what? Next of what? Where is this validated?
>>
>> I kind of assumed you could look at its single use:
>>
>> const char *dirnames[] = {pid_name, prov, mod, fun, prb, NULL};
>> if (unlinkat_many(probes_dir, dirnames) < 0) {
>
> Sure, but since that conflicted with the comment, I was confused.
Yeah, mea culpa, fixed comment...
>> (I generalized it because I just bet we'll be needing this again in
>> dtprobed when we get systemwide tracing, for the non-pid-containing
>> /run/dtrace/probes/$prov/$mod/$fun/$prb dir.)
>
> Since you are generalizing anyway, why not handle the case where the last
> element is a directory as well? So this could be used to get rid of an
> empty directory as well? It seems odd to make a generalized function that
> has such a specific condition on it that seems unnecessary. Removing a chain
> of directories of which the end is an empty directory seems useful.
... hm, it actually makes the code simpler! I was avoiding doing it
because I thought it would increase complexity for something I wasn't
using :)
> And you already pass an argument dirnames which would imply that its elements
> are directory names. The fact that the last element (as you say below is to
> be a file is rather counter-intuitive for something names dirnames.
Yeah, I've changed the comment too:
* Act like rm -rd, removing a file and some number of empty directories above
* it, as given by the names argument: the last element may be a directory or a
* file. [...]
... which is hopefully a bit clearer?
--
NULL && (void)
More information about the DTrace-devel
mailing list