[DTrace-devel] [PATCH v3 13/19] alloca: support null pointers

Mon Mar 28 17:44:00 UTC 2022

On 28 Mar 2022, Kris Van Hees stated:
> On Thu, Mar 24, 2022 at 06:24:39PM +0000, Nick Alcock via DTrace-devel wrote:
>> Variables used to store pointers obtained from alloca() cannot be reused
>> for any other purpose -- but there is one purpose people are almost
>> certain to want, which is assigning NULL to them.  You don't want that
>> to elicit an error message!
>> 
>> Making this work is quite tricky.  We can't just stuff a 0 into
>> variables and load it, because that would lead to us subtracting
>> scratchbot from it and then adding scratchbot back afterwards, giving
>> null pointers a huge random value in the variables.  So we detect
>> the specific case of var = NULL at codegen time by literal matching the
>> parse tree and turn it into a store of a sentinel value,
>> DT_CG_ALLOCA_NULLPTR (~(1<<30)).  We can then detect that at load time
>> and turn it back into a NULL.
>
> I do not follow this...  If you store 0 in a variable, and then load it and
> end up subtracting scratchbot from it and then adding scratchbot back, you
> better end up with 0 again.  Which you can define to be an invalid scratchmem
> offset by decree (i.e. enforce that allocation starts at offset 1).  So why
> is that not sufficient?

That's possible in theory, but has the horrible consequence (to me,
anyway) that ((struct foo *)NULL)->bar) is not trapped (because that is
not at offset zero). This would fail on any normal system, so I consider
it a QoI matter that we fail here. Storing 0 (in the register) as a
unique value that cannot appear as a valid scratchmem offset does that
with only a couple of instructions of overhead. (It used to be much
worse, but since the null pointer space nightmare mentioned in relic
comments below turned out to be unnecessary, it's actually quite easy.)

>> But that's not enough.  We might say "it's up to the user to not
>> dereference it", but the verifier forces us to ensure that this can't
>> happen.  We can't leave a 0 in there either because it's not a pointer
>> value and the verifier's tracking of what happens if a 0 is dereferenced
>> hits map_value-versus-literal problems: since every *other* value we
>> can hold in an alloca parse tree node is a pointer, we should arrange
>> for this to be one too.
>
> Where exactly is this causing issues with the verifier in terms of map_value
> vs plain integer value?
> 
>> certainly can't be a pointer to NULL -- that's not a map_value!  So we
>> introduce a new "null pointer space", in the shape of a BPF array twice
>> the size of scratch space, and set null pointers to point halfway
>> through that at load time.  Then it's just a matter of having the
>> bounds-checkers check for both 0 and the null pointer space value and
>> raise suitable out-of-bounds errors in both cases.  (As usual,
>> in-scratch and out-of-scratch bounds checkers diverge: *both* consider
>> null pointers out of bounds, but the in-scratch checker only checks for
>> the null-pointer-space version so it can raise a suitable error, since
>> null pointers that are still 0 raise the right error anyway.  This means
>> null pointer detection adds relatively little code except to bcopy.)
>
> This really seems overly complicated.  So why can't we just use 0 (which is in
> fact scratchbot, which is a valid pointer i.e. map_value)?  Any adjustment to
> that value (add or subtract) due to e.g. indexing would then simply need to be
> bounds checked against the bounds of the scratch space, i.e. offset greater
> than 0 and less than the scratch space size - 1.

Sorry, this stuff is residual commit message junk I should have cleaned
up. The null pointer space has been gone for some time :) null pointers
are encoded as DT_CG_ALLOCA_NULLPTR in variables, and represented when
not in variables as 0 (with +/- scratchlen also considered a null
pointer).

> I really do not see where we need a 'null pointer space' rather than 0 being
> the null pointer, and not-0 nit being the null pointer.

You are quite right!

> Given that the scratch space may need to be pretty big, doubling it just to be
> able to handle a NULL pointer seems rather extreme as well.

Well, we have to double the scratch space in *any* case -- with the
latter half unused -- but that's for an entirely different reason: that
you might choose to do a 1-long bcopy() from the last byte in scratch
space, but that size is a value in a register, and the verifier cannot
use values in registers to determine bounds. So we must use a worst-case
static bound, which is that for a bcopy() into the first valid scratch
address, running up to the last: thus we need twice the space we would
otherwise need. (Well, twice less one byte, but honestly it makes
reasoning about this easier to just double it.)

>> -	emit(dlp,  BPF_BRANCH_IMM(BPF_JNE, reg, 0, lbl_notnull));
>> -	dt_cg_probe_error(yypcb, DT_LBL_NONE, DTRACEFLT_BADADDR, 0);
>> +	uint_t	lbl_null = dt_irlist_label(dlp);
>> +
>> +	emit(dlp,  BPF_BRANCH_IMM(BPF_JEQ, reg, 0, lbl_null));
>> +	emit(dlp,  BPF_BRANCH_IMM(BPF_JSGT, reg, opt_scratchsize,
>> +				  lbl_notnull));
>> +	emit(dlp,  BPF_BRANCH_IMM(BPF_JSLT, reg, -opt_scratchsize,
>> +				  lbl_notnull));
>> +	dt_cg_probe_error(yypcb, lbl_null, DTRACEFLT_BADADDR, 0);
>
> This is a problem because we're doing this for every not-NULL check, even
> if alloca is not involved.

As noted above, I consider this a QoI matter, since every other null
pointer check in every implementation out there will correct diagnose
(by crashing) pointers *near* null, as well as pointers *at* null. The
verifer is happy because we have a null check too.

> E.g. a simple dtrace -n 'BEGIN { s = probename; trace(s); exit(0); }' 
>
> now gets these extra conditionals added that are not necessary.

Yes, but conversely,

trace((struct foo *)NULL)->bar);

is now correctly diagnosed as a null pointer dereference. (This really
happens, and quite a lot too. Even the kernel itself has done it
internally, and indeed this has been the source of security holes in the
past...)

We should certainly do it at least for alloca pointers, given that they
are often likely to be pointers to structures or arrays -- but it's only
two insns, dammit: are we so short of insns that this is actually
causing a problem?