[DTrace-devel] restoring how DTrace adds probes

Mon Jun 8 12:25:06 PDT 2020

We should change DTrace's loading of probes to respond to which probes 
are requested by the user.  That is, we are currently populating probes 
in a fixed manner without regard for which probe descriptions were 
specified by the user, while legacy DTrace would load probes as 
requested by the user.  Restoring the legacy behavior and taking user 
requests into account is important for some probes -- like profile-* and 
tick-* -- that do not exist until the user requests them.

BACKGROUND.

Note that in the dtrace command-line tool, there are five passes through 
the command-line arguments.  Important events are:
   - the call to dtrace_open()
   - the second pass over arguments, collecting program specifications
   - the fourth pass, compiling these specifications

In legacy DTrace (e.g., the master branch), probes are looked up -- or 
created if necessary -- during that fourth-pass compilation.  That is, 
we call
dt_compile_one_clause(),
which calls dt_setcontext() to look for a "representative probe",
which calls dt_probe_info(),
which calls dtrace_probe_iter(),
which calls dt_ioctl(dtp, DTRACEIOC_PROBEMATCH, &pd) to look up probes.

Then, in the kernel, dtrace_ioctl() has:
     /*
      * Before we attempt to match this probe, we want to give
      * all providers the opportunity to provide it.
      */
     dtrace_probe_provide(&desc, NULL);

That function loops over providers.  Each provider can provide probes to 
satisfy the specified probe description.

In contrast, in the most recent implementation of DTrace (e.g., the 
2.0-branch-dev branch), near the end of dtrace_open() -- strictly 
speaking, dt_vopen() -- we add:
     /*
      * Initialize the collection of probes that is made available by the
      * known providers.
      */
     dt_probe_init(dtp);
     for (i = 0; i < ARRAY_SIZE(dt_providers); i++)
                 dt_providers[i]->populate(dtp);

That is, we loop over providers to populate probes.  This is done before 
looking at any of the probe descriptions in command-line arguments or 
specified scripts.  The probes are provided from set lists, such as 
predefined dtrace probes (e.g., BEGIN and END) or probes found in lists 
in the tracefs file system.

Later, for example in dt_probe_iter(), the list of probes having been 
defined, we simply look up whether a probe is known or not. We do not 
currently allow a user to specify a probe that we did not already know.

I don't think I've thoroughly captured how the kernel handles probe 
queries in legacy DTrace;  the main point is simply that DTrace used to 
provide probes "on demand" while we now pre-define probe lists.

PROPOSAL.

Instead of the current "populate()" approach, in which each provider 
populates the list of probes without knowledge of the user's probe 
descriptions, let us revert to legacy DTrace's on-demand "provide()" 
approach, in which we track probes exclusively in response to 
user-specified probe descriptions. Specifically,

*)  In dt_vopen(), remove the new code that loops over providers calling 
their populate() functions.

*)  In dt_probe_iter(), allow providers to provide a probe.
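
The following is only a rough sketch of the on-demand flow proposed 
above, not the actual libdtrace code.  Every name in it (probe_t, 
provider_t, probe_find(), and so on) is hypothetical and exists only for 
illustration, and probes are reduced to a provider name and a probe 
name.  The point is the ordering: the lookup first gives each provider 
the opportunity to provide the requested probe, and only then matches.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PROBES 64

/* Hypothetical, simplified probe record: provider and probe name only. */
typedef struct probe {
	char prv[32];
	char name[32];
} probe_t;

static probe_t probes[MAX_PROBES];
static int nprobes;

/* Return the id (index + 1) of an exact match, or 0 if none is known. */
static int probe_lookup(const char *prv, const char *name)
{
	for (int i = 0; i < nprobes; i++)
		if (strcmp(probes[i].prv, prv) == 0 &&
		    strcmp(probes[i].name, name) == 0)
			return i + 1;
	return 0;
}

/* Append a probe and return its id, or 0 if the table is full. */
static int probe_insert(const char *prv, const char *name)
{
	if (nprobes == MAX_PROBES)
		return 0;
	snprintf(probes[nprobes].prv, sizeof(probes[nprobes].prv), "%s", prv);
	snprintf(probes[nprobes].name, sizeof(probes[nprobes].name), "%s", name);
	return ++nprobes;
}

/* Hypothetical provider interface: provide() may create the probe. */
typedef struct provider {
	const char *name;
	void (*provide)(const char *prv, const char *name);
} provider_t;

/* The dtrace provider only ever offers a fixed set of probes. */
static void dtrace_provide(const char *prv, const char *name)
{
	if (strcmp(prv, "dtrace") != 0)
		return;
	if (strcmp(name, "BEGIN") != 0 && strcmp(name, "END") != 0)
		return;
	if (probe_lookup(prv, name) == 0)
		probe_insert(prv, name);
}

/*
 * The profile provider creates tick-* and profile-* probes on request.
 * A real provider would also validate what follows the prefix.
 */
static void profile_provide(const char *prv, const char *name)
{
	if (strcmp(prv, "profile") != 0)
		return;
	if (strncmp(name, "tick-", 5) != 0 && strncmp(name, "profile-", 8) != 0)
		return;
	if (probe_lookup(prv, name) == 0)
		probe_insert(prv, name);
}

static const provider_t providers[] = {
	{ "dtrace", dtrace_provide },
	{ "profile", profile_provide },
};

/* Give every provider the opportunity to provide, then match. */
static int probe_find(const char *prv, const char *name)
{
	for (size_t i = 0; i < sizeof(providers) / sizeof(providers[0]); i++)
		providers[i].provide(prv, name);
	return probe_lookup(prv, name);
}
```

With this shape, nothing is in the probe table until a description asks 
for it: probe_find("profile", "tick-1234") creates that probe on first 
use, and a second call finds the existing one rather than adding a 
duplicate.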

DISCUSSION.

*)  We could use both mechanisms.  That is, during dtrace_open(), we can 
use the "populate()" model to populate probes without regard to 
user-specified probe descriptions.  Then later, when compiling probe 
descriptions, add probes as necessary.  But there is no value in having 
two mechanisms, and there is no sense in adding possibly tens of 
thousands of probes even in the relatively common case where the user 
specifies fewer than a dozen.

*)  We should clean up these two functions:
         - dtrace_probe_iter()
         -     dt_probe_iter()
Once again, we should revert somewhat to the legacy implementation.  
That is, dt_probe_iter() used to be a static function that was used 
simply as a callback within dtrace_probe_iter().  In the current 
version, dt_probe_iter() is no longer a static function.  On the other 
hand, it is still used only by dtrace_probe_iter().  And 
dtrace_probe_iter() has become simply a wrapper for dt_probe_iter().  
Therefore:
     - move probe iteration from dt_probe_iter() into dtrace_probe_iter()
     - if dt_probe_iter() is still needed, make it static once again

*)  DTrace has the IMHO peculiar practice of listing probes multiple 
times.  Consider:
     dtrace -n END -l -n BEGIN -n BEGIN
        ID   PROVIDER    MODULE    FUNCTION NAME
         2     dtrace                       END
         1     dtrace                       BEGIN
         1     dtrace                       BEGIN
This behavior is seen in both legacy and current DTrace.  Leaving 
this behavior is fine.

PROFILE PROVIDER.

While many providers can benefit from this proposal, it was motivated by 
the immediate desire to add a profile provider.  The following 
discussion is in that specific context.

*)  While users can specify their own profile probes, the provider 
should nonetheless provide some suggested probes, following legacy 
DTrace's lead.  E.g., from the legacy kernel file dtrace/profile_dev.c:
     profile_rates[] = { 97, 199, 499, 997, 1999, 4001, 4999, ... };
     profile_ticks[] = { 1, 10, 100, 500, 1000, 5000, ...};

*)  A user probe description such as "profile:::" would clearly refer to 
all existing profile probes rather than to all conceivable profile probes.

*)  Legacy DTrace added profile probes persistently.  That is, if a user 
specified profile-1234, that probe would persist -- even to other DTrace 
invocations -- until the profile kernel module was unloaded.  The 
current implementation need not mimic such persistent behavior.
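
As a rough illustration of the first two points above, here is one way 
a profile provider could decide whether a probe name identifies a single 
probe that it may create on demand.  This is a sketch under assumptions, 
not the legacy or proposed code: profile_parse() and the PROF_* 
constants are hypothetical names, and the sketch accepts only a bare 
decimal number after the prefix, which is a deliberate simplification.  
An empty or wildcard name, as in "profile:::", yields PROF_NONE, so such 
a description would only be matched against probes that already exist.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical probe kinds for this sketch. */
enum { PROF_NONE = 0, PROF_PROFILE, PROF_TICK };

/*
 * Parse "profile-<n>" or "tick-<n>".  On success, return the probe kind
 * and store <n> in *rate.  Return PROF_NONE if the name does not
 * identify exactly one probe (empty name, glob, missing, zero, or
 * non-numeric value).
 * Assumption: only a bare decimal number is accepted after the prefix.
 */
static int profile_parse(const char *name, unsigned long *rate)
{
	int kind;
	const char *p;
	char *end;

	if (strncmp(name, "profile-", 8) == 0) {
		kind = PROF_PROFILE;
		p = name + 8;
	} else if (strncmp(name, "tick-", 5) == 0) {
		kind = PROF_TICK;
		p = name + 5;
	} else {
		return PROF_NONE;
	}

	/* Reject "profile-", "profile-*", "tick--5", and the like. */
	if (!isdigit((unsigned char)*p))
		return PROF_NONE;

	*rate = strtoul(p, &end, 10);
	if (*end != '\0' || *rate == 0)
		return PROF_NONE;

	return kind;
}
```

A provider could call something like this from its provide() hook: a 
name such as "profile-1234" parses and the probe is created if missing, 
while "profile-*" or an empty name falls through to ordinary matching 
against the suggested probes and any probes created earlier.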