[DTrace-devel] [PATCH] ERROR probe implementation

Thu Jan 21 09:46:38 PST 2021

On Thu, Jan 21, 2021 at 12:25:56PM -0500, Eugene Loh wrote:
> On 1/21/21 2:04 AM, Kris Van Hees wrote:
> 
> > On Wed, Jan 20, 2021 at 11:23:59PM -0500, Eugene Loh wrote:
> >> The test .r results files are missing "at DIF offset" clauses.  Is this
> >> a loss of functionality?  The commit message explicitly mentions offset
> >> as being something we can record.
> > The commit message documents what the ERROR probe supports in terms of design
> > (i.e. what it supports as part of DTrace's design).  Right now, we cannot give
> > instruction offsets yet for the place where the ERROR probe is triggered, so
> > we pass in -1 which means there is no declared offset.  So, the .r files are
> > not missing the "at DIF offset" information inasmuch as they do not mention it
> > because the tests are not expected to emit it.  It does not invalidate the
> > test.  It can be seen as a minor loss of functionality but the legacy code
> > allows for this type of situation already.  I do plan on being able to report
> > the instruction offset at some point in the future - it is a WIP though.
> 
> Hmm.  I'm a little nervous that the tests are going in the wrong 
> direction here, but I guess I'll just assume you'll manage the WIP and 
> we'll end up in the right place eventually.

The problem with the4 tests is that the offset isd not an entity that we should
be validating because it depends on the code generation implementation where a
change in any one of a series of tokens can cause the offset to change.  We can
have tests to validate the offset but they are much more complex than the tests
we have now because they will have to include dumping the disassemblt and
validating the offset against the disassembly to ensure that the correct
instruction offset is reported.  That is not the purpose of these basic tests,
and so the offset is irrelevant here.  When we start reporting it, we should
amend these tests to strip out that information or at least make it be
substituted by a token (e.g. replace the offset value with #).

> >> Since both dt_cg_tramp_epilogue() and dt_cg_tramp_epilogue_advance()
> >> have grown -- and in exactly the same way -- perhaps it makes sense to
> >> implement the former by calling the latter.  This could also simplify
> >> dt_prov_dtrace.c, the only caller of dt_cg_tramp_epilogue_advance().
> > The problem is that they both implement the same code at their beginning, then
> > one implements extra code, and then they again share code.  So I cannot just
> > call one from the other.
> 
> Right, but the amount of code they're sharing has grown. Meanwhile, 
> conditionalizing the "extra code" that's in the middle is easy enough.  
> There are different ways of doing this, but if one is okay with using 
> "if (act != DT_ACTIVITY_ACTIVE)" then not only is dt_cg_tramp_epilogue() 
> just a call to dt_cg_tramp_epilogue_advance(), but the "activity" code 
> in dt_prov_dtrace.c's trampoline() function could get cleaned up a 
> little as well.

Hm, yes, but I really prefer the providers to be able to just call
dt_cg_tramp_epilogue() because dt_cg_tramp_epilogue_advance() is a special
case that only exists because of the BEGIN and END probes.

> >> In bpf/get_bvar.c, the "#define error()" strikes me as funny.  To me, it
> >> does not feel like it buys much.  Anyhow, that's an opinion thing.
> >> FWIW, if we are okay with the ({...}) construct, then the following
> >> advice in CODING-STYLE should be updated:
> >>           Macros with multiple statements should
> >>           be enclosed in a do-while(0) block
> > Well, it makes it possible to write "return error(...)" which is similar to
> > the typical "return dt_probe_error(...)" that one would use.  To make that
> > possible we need a macro expression using the ({ ... }) syntax.  That is not
> > the same as a multi-statement code block.
> 
> My point on ({...}) was that if we are willing to use this construct 
> (which is available in "newer" gcc compilers, I thought), then our 
> advice in CODING-STYLES regarding multiple-statement macros should be 
> updated.  I can update CODING-STYLE (and our use of "do {} while(0)") in 
> a separate patch if you like.

Sure.  I see no reason to not use that construct, though I do think it should
be used sparingly.

> >> Why was dt_cg_node()'s DT_TOK_PTR/DT_TOK_DOT case handling a NULL
> >> pointer specially, and why has this handling been changed?  I realize
> >> that NULL pointers are bad addresses, but they are certainly not the
> >> only ones.  This is only one very special case.
> > BPF requires that pointers are validated to not be NULL.  And since a NULL
> > pointer is bad news, it is appropriate to report it as a BADADDR fault.
> 
> Okay.  So, for me, it'd be nice if the NULL check were accompanied by a 
> short comment that says this is to appease the BPF verifier.  But, maybe 
> that's just me.

It is not jsut for the BPF verifier but it is one of the considerations.
In other words, the BPF verifier needs it *and* NULL is a very specific case of
an address that is guaranteed to be bad.  The legacy code had special handling
for NULL as well at the kernel level because a NULL pointer deref would cause
a major issue as compared to a regular bad pointer just being reported with a
error-indicating return value of various memory access functions.

Special handling to avoid NULL pointer deref is common - I don't think a
specific comment is warranted.