[DTrace-devel] [PATCH 03/04] rawtp: report lockmem issues when determining rawtp argument count

Kris Van Hees kris.van.hees at oracle.com
Wed Nov 29 21:12:55 UTC 2023


On Wed, Nov 29, 2023 at 03:51:55PM -0500, Eugene Loh via DTrace-devel wrote:
> On 11/27/23 16:28, Kris Van Hees wrote:
> 
> > On Mon, Nov 27, 2023 at 04:07:51PM -0500, Eugene Loh wrote:
> > > On 11/22/23 22:22, Kris Van Hees wrote:
> > > 
> > > > On Wed, Nov 22, 2023 at 10:01:37PM -0500, Eugene Loh via DTrace-devel wrote:
> > > > > On 11/22/23 10:49, Kris Van Hees via DTrace-devel wrote:
> > > > > 
> > > > > > diff --git a/libdtrace/dt_prov_rawtp.c b/libdtrace/dt_prov_rawtp.c
> > > > > > @@ -181,7 +181,9 @@ static int probe_info(dtrace_hdl_t *dtp, const dt_probe_t *prp,
> > > > > >     		dif.dtdo_len = 2;
> > > > > >     		bpf_fd = dt_bpf_prog_load(dt_rawtp.prog_type, &dif, 0, NULL, 0);
> > > > > > -		if (bpf_fd == -1)
> > > > > > +		if (bpf_fd == -EPERM)
> > > > > Is that right?  I think the only negative value is -1.  Might it be that you
> > > > > want to test errno==+EPERM?  And do we know that EPERM really means
> > > > > lockmem?  Or are we simply guessing that that's the most likely explanation?
> > > > It is actually correct, or as correct as we can get.  A result value of
> > > > -EPERM is almost always a lockmem issue, based on the reading of the kernel
> > > > code that relates to this.  And -EPERM *is* -1.
> > > > 
> > > > But other error codes can be returned also, for other failures.
> > > What I wrote might have been pretty confusing.  Let me try again. The return
> > > value is either a legal value or else it is -1 (which coincidentally is
> > > -EPERM).  In order to distinguish among different errors, we cannot test the
> > > return value;  we have to test errno.
> > Ah true, I meant to ensure that when dt_bpf_prog_load() fails, it would return
> > -errno but never made that change (but thought I did).  Duh.
> > 
> > Let me look at this - I probably will make that change anyway and then this
> > will be correct, because it makes sense to return -errno on error and a valid
> > fd on success.
> 
> Okay, I guess.  The dt_bpf_prog_load() does have other call sites. Some
> expect the return value to be -errno (which would be correct if you make the
> change you propose) while another checks errno explicitly (believing that
> EPERM means a helper function is unavailable rather than being a symptom of
> lockmem problems).

See v2 patches (being posted right now).
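
For illustration only (this is not the v2 code), here is a minimal sketch of the convention discussed above: return a valid fd on success and -errno on failure, so that a caller can tell EPERM apart from other errors by looking at the return value alone.  The names fake_raw_load(), load_neg_errno() and classify() are hypothetical and are not part of libdtrace; the fake loader just stands in for a call that returns -1 and sets errno.

```c
#include <errno.h>

enum load_result { LOAD_OK, LOAD_MAYBE_LOCKMEM, LOAD_OTHER_ERROR };

/*
 * Hypothetical stand-in for a raw loader: returns a valid fd (>= 0) on
 * success, or -1 with errno set on failure.  'fail_with' selects the
 * simulated errno value (0 means success).
 */
static int fake_raw_load(int fail_with)
{
	if (fail_with != 0) {
		errno = fail_with;
		return -1;
	}

	return 3;		/* pretend file descriptor */
}

/*
 * Wrapper following the proposed convention: a valid fd on success,
 * -errno on failure.  errno is read right after the failing call,
 * before anything else can overwrite it.
 */
static int load_neg_errno(int fail_with)
{
	int	fd = fake_raw_load(fail_with);

	return fd < 0 ? -errno : fd;
}

/*
 * Caller side: with the -errno convention, comparing against -EPERM is
 * meaningful.  Treating EPERM as a likely lockmem problem is a
 * heuristic, as noted in the thread.
 */
static enum load_result classify(int rc)
{
	if (rc >= 0)
		return LOAD_OK;
	if (rc == -EPERM)
		return LOAD_MAYBE_LOCKMEM;

	return LOAD_OTHER_ERROR;
}
```

With a loader that only ever returns -1 on failure, the "rc == -EPERM" test would match every failure, because EPERM is 1 on Linux.  That is the ambiguity Eugene pointed out.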

> > > > > # dtrace -lv > out.default
> > > > > # dtrace -xlockmem=1 -lv > out.small
> > > > > # wc -l out.*
> > > > >    1445230 out.default
> > > > >    1380822 out.small
> > > > >    2826052 total
> > > > > # grep lockmem out.small
> > > > > # grep Cannot out.small
> > > > > #
> > > > > 
> > > > > Are things working as expected?
> > > I'm still curious about this point.  I thought the point of the patch was
> > > that if there is a lockmem issue, that issue would be reported.  When I run
> > > with -xlockmem=1, the output is truncated significantly, but no telltale
> > > error message is printed.
> > That is the behaviour you see with just this patch (and not having the later
> > patch in the series applied)?  That is odd.  On what OL version and kernel?
> 
> OL7.9
> 5.4.17
> 
> FWIW, lockmem stuff is relevant only on these older kernels if one believes,
> e.g., test/unittest/misc/tst.lockmem-cmdline.x and the like:
>     # Somehow, UEKR6 (5.4.17) has problems with the locked-memory limit,
>     # but UEKR7 (5.15.0) does not
>     echo "no locked-memory limit on newer kernels?"
> 
> Anyhow, one question is whether one is detecting the lockmem issues
> correctly;  that is the EPERM stuff discussed above.  The other question is
> what happens when we pass the lockmem errmsg up the call stack.  I think
> this part of cmd/dtrace.c is pertinent:
> 
>         case DMODE_LIST:
>                 for (i = 0; i < g_cmdc; i++)
>                         list_prog(&g_cmdv[i]);
>                 if (g_cmdc == 0)
>                         dtrace_probe_iter(g_dtp, NULL, list_probe, NULL);
> 
> It looks like list_prog() -- specifically, its callee list_stmt() --
> delivers the errmsg to the user while the g_cmdc==0 code path here does
> not.  So "dtrace -xlockmem=1 -lvP rawtp" should deliver the lockmem message,
> while "dtrace -xlockmem=1 -lv" will not deliver the errmsg in question.
> 
> Let me know if you'd like me to submit a patch or if you want to throw it
> into your lockmem/rawtp series.


