[DTrace-devel] [PATCH v2] cg: fix store/load for struct and union members for alloca pointers

Thu Aug 31 18:33:10 UTC 2023

On Thu, Aug 31, 2023 at 07:09:23PM +0100, Nick Alcock wrote:
> On 31 Aug 2023, Kris Van Hees via DTrace-devel verbalised:
> 
> > @@ -3051,7 +3051,7 @@ dt_cg_store(dt_node_t *src, dt_irlist_t *dlp, dt_regset_t *drp, dt_node_t *dst)
> >  	int		dreg = dst->dn_reg;
> >  
> >  	/*
> > -	 * If we're loading a bit-field, the size of our store is found by
> > +	 * If we're storing into a bit-field, the size of our store is found by
> >  	 * rounding dst's cte_bits up to a byte boundary and then finding the
> >  	 * nearest power of two to this value (see clp2(), above).
> >  	 */
> > @@ -3062,13 +3062,15 @@ dt_cg_store(dt_node_t *src, dt_irlist_t *dlp, dt_regset_t *drp, dt_node_t *dst)
> >  		size = dt_node_type_size(dst);
> >  
> >  	/*
> > -	 * If we're loading a writable non-alloca lvalue, and it's a
> > -	 * dereference, and *its* child is an alloca pointer, then this is a
> > -	 * dereferenced alloca pointer and needs bounds-checking (which could
> > -	 * not be done at deref time due to not knowing the size of the write).
> > +	 * If we're storing into a writable lvalue that is a dereference of an
> > +	 * alloca pointer or if we are storing into a writable alloca lvalue,
> > +	 * we need to do bounds-checking and turn the offset value into a real
> > +	 * pointer.
> 
> Really stupid comment errors... but the deref thing is because I
> thought that struct derefs for writable lvalues were always converted
> into (*foo).bar by the DT_TOK_PTR code -- but looking at it now, if that
> code ever existed, it is gone I think (and this doesn't allow for array
> derefs at all, for instance: whoops, didn't think of that).

Array derefs are different (converted into *(ptr + offset) elsewhere).

Since *ptr = val does not place an ALLOCA taint on *ptr, the code you wrote
is needed to handle this case.

> > -	if (dst->dn_flags & DT_NF_WRITABLE && dst->dn_flags & DT_NF_LVALUE
> > -	    && dst->dn_op == DT_TOK_DEREF && dst->dn_child->dn_flags & DT_NF_ALLOCA) {
> > +	if (dst->dn_flags & DT_NF_WRITABLE && dst->dn_flags & DT_NF_LVALUE &&
> > +	    ((dst->dn_op == DT_TOK_DEREF &&
> > +	      dst->dn_child->dn_flags & DT_NF_ALLOCA) ||
> > +	     dst->dn_flags & DT_NF_ALLOCA)) {
> 
> This looks good, assuming that something somewhere is propagating taint
> across the -> (and I think that should happen mostly automatically,
> though DT_TOK_PTR and DT_TOK_DOT aren't explicitly doing it).

Well, yes, otherwise this code wouldn't work and the tests would fail.

> >  		assert(!(dst->dn_flags & DT_NF_BITFIELD));
> >  
> >  		if ((dreg = dt_regset_alloc(drp)) == -1)
> > @@ -6473,7 +6475,7 @@ dt_cg_node(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp)
> >  		}
> >  
> >  		if (m.ctm_offset != 0)
> > -			emit(dlp, BPF_ALU64_IMM(BPF_ADD, dnp->dn_left->dn_reg, m.ctm_offset / NBBY));
> > +			emit(dlp, BPF_ALU64_IMM(BPF_ADD, dnp->dn_reg, m.ctm_offset / NBBY));
> 
> ... um, a few lines below this we see:
> 
>         dnp->dn_reg = dnp->dn_left->dn_reg;
> 
> Has it even been initialized at this point? If not, I think we should be
> using dnp->dn_left still.

Well, of course.  The call to dt_cg_node() for dnp->dn_left does that, right?

> >  		if (!(dnp->dn_flags & DT_NF_REF)) {
> >  			uint_t	op;
> > @@ -6481,8 +6483,20 @@ dt_cg_node(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp)
> >  
> >  			op = dt_cg_ldsize(dnp, ctfp, m.ctm_type, &size);
> >  
> > +			/*
> > +			 * If the left-hand side of PTR or DOT is an alloca
> > +			 * pointer, bounds-check the pointer reference.
> > +			 */
> > +			if (dnp->dn_left->dn_flags & DT_NF_ALLOCA) {
> > +				assert(dnp->dn_flags & DT_NF_ALLOCA);
> > +				dt_cg_alloca_access_check(dlp, drp, dnp->dn_reg,
> > +							  DT_ISIMM, size);
> > +				dt_cg_alloca_ptr(dlp, drp, dnp->dn_reg,
> > +						 dnp->dn_reg);
> > +			}
> 
> dnp->dn_reg is probably still just as uninitialized here. dnp->dn_left?
> (Maybe the code has changed from what I'm looking at, and dnp->dn_reg
> is initialized higher up now. In that case, this code is fine.)

See above comment.

> The LHS access check is definitely necessary iff we're not always
> getting foo->bar and foo[bar] transformed into (*foo).bar etc as I was
> assuming we were, But I'm wondering why that isn't happening... I'd
> swear it used to.

No, foo->bar is converted for translator members into the code that will
generate tha value of the member, and foo[bar] is converted into
*(foo + offsetof(foo, bar))).  There is nothing that converts foo->bar, since,
well, that is done here (and that is fortunate for us because otherwise this
patch would be much more complex).

> > +
> >  			if (dnp->dn_left->dn_flags & (DT_NF_ALLOCA | DT_NF_DPTR))
> > -				emit(dlp, BPF_LOAD(op, dnp->dn_left->dn_reg, dnp->dn_left->dn_reg, 0));
> > +				emit(dlp, BPF_LOAD(op, dnp->dn_reg, dnp->dn_reg, 0));
> 
> Ditto.

See comment above.

> > diff --git a/test/unittest/funcs/alloca/tst.alloca-array-load.d b/test/unittest/funcs/alloca/tst.alloca-array-load.d
> 
> Nice tests.