[DTrace-devel] [PATCH] ERROR probe implementation

Mon Jan 25 11:43:31 PST 2021

On 1/21/21 2:04 AM, Kris Van Hees wrote:

> On Wed, Jan 20, 2021 at 11:23:59PM -0500, Eugene Loh wrote:
>> Why was dt_cg_node()'s DT_TOK_PTR/DT_TOK_DOT case handling a NULL
>> pointer specially, and why has this handling been changed?  I realize
>> that NULL pointers are bad addresses, but they are certainly not the
>> only ones.  This is only one very special case.
> BPF requires that pointers are validated to not be NULL.  And since a NULL
> pointer is bad news, it is appropriate to report it as a BADADDR fault.
>
> Sure, there are many other addresses that are bad addresses as well, but we
> currently do not have a mechanism to validate them.  As we implement proper
> kernel memory accesses vs BPF program data accesses, support for more generic
> BADADDR faults will be added.
>
>> Somewhat related to the above, why are tests changed from
>>       -    *(char *)NULL;
>>       +    trace(((struct task_struct *)NULL)->pid);
>> I guess part of the issue is that some of these constructs will be
>> caught by the BPF verifier.
> Very simple...  *(char *)NULL generates code that will not pass the BPF
> verifier because we do not generate correct code for it (yet) whereas we
> do generate valid code for ((struct task_struct *)NULL)->pid.  The test is
> about the ERROR probe firing with a BADADDR fault, so the mechanism for
> triggering that is not relevant - picking one that actually works for the
> test is prudent.

I wanted to revisit these issues.

FWIW our docs say that "*(char *)NULL" should trigger the ERROR probe. 
https://docs.oracle.com/en/operating-systems/oracle-linux/dtrace-guide/dt_prov.html#dt_error_prov 
The "*(char *)NULL" construct is the quintessential example -- both in 
our docs and in our test suite -- of how to trigger ERROR.  But it 
doesn't work that way now because the BPF verifier complains.

But the BPF verifier complaint is only because of the way we handle this 
D code.  Just as with DT_TOK_DOT and DT_TOK_PTR, we could (and for back 
compatibility should) handle NULL dereference ourselves.

What do you think about adding such a change?

         case DT_TOK_DEREF:
                 dt_cg_node(dnp->dn_child, dlp, drp);
                 dnp->dn_reg = dnp->dn_child->dn_reg;

                 if (!(dnp->dn_flags & DT_NF_REF)) {
                         uint_t ubit;
+                       uint_t ok = dt_irlist_label(dlp);

                         /*
                          * Save and restore DT_NF_USERLAND across 
dt_cg_load():
                          * we need the sign bit from dnp and the user 
bit from
                          * dnp->dn_child in order to get the proper opcode.
                          */
                         ubit = dnp->dn_flags & DT_NF_USERLAND;
                         dnp->dn_flags |=
                             (dnp->dn_child->dn_flags & DT_NF_USERLAND);

+                       /* if NULL pointer, report BADADDR */
+                       ok = dt_irlist_label(dlp);
+                       emit(dlp,  BPF_BRANCH_IMM(BPF_JNE, dnp->dn_reg, 
0, ok));
+                       dt_cg_probe_error(yypcb, -1, DTRACEFLT_BADADDR, 0);
+                       emitl(dlp, ok,
+                                  BPF_NOP());
+
                         /* FIXME: Does not handled signed or userland */
                         emit(dlp, BPF_LOAD(dt_cg_load(dnp, ctfp, 
dnp->dn_type), dnp->dn_reg, dnp->dn_reg, 0));

                         dnp->dn_flags &= ~DT_NF_USERLAND;
                         dnp->dn_flags |= ubit;
                 }
                 break;