[DTrace-devel] [PATCH v2 08/17] alloca: load and store

Nick Alcock nick.alcock at oracle.com
Mon Mar 14 21:30:16 UTC 2022


With alloca() and the parser taint code in place, we can handle storing
alloca()ed values into variables and loading them back out again.  We
detect these cases easily enough: stores from nodes with DT_NF_ALLOCA
set into identifiers with DT_IDFLG_ALLOCA set are derived from alloca()
and need reduction to scalar offsets by subtracting the DCTX_SCRATCHMEM
value; loads from identifiers with DT_IDFLG_ALLOCA set need that value
added back again.
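
In effect, the transformation looks like this (a hedged C sketch of the
intent, not the emitted BPF; the names here are illustrative only):

  #include <stdint.h>

  /* Store path: reduce the alloca pointer to a scratchmem-relative offset. */
  static uint64_t
  alloca_store_xform(char *ptr, char *scratch_base)
  {
          return (uint64_t)(ptr - scratch_base);
  }

  /* Load path: rebase the stored offset against the current scratch base.
     The real code bounds-checks the offset before and after the addition. */
  static char *
  alloca_load_xform(uint64_t off, char *scratch_base)
  {
          return scratch_base + off;
  }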

The store case is fairly easy (a straight subtraction, hanging on to the
un-subtracted value for use as the return value of the assignment
itself), with one wrinkle: we detect stores that mix alloca and
non-alloca use of the same identifier (DT_NF_ALLOCA on the node stored
versus DT_IDFLG_NONALLOCA on the identifier) and complain, because this
indicates that someone is reusing a variable used to store alloca()ed
pointers for non-alloca purposes too.  We can't have that, since we are
modifying the variable's value at both load and store time!

The load case is trickier, and is pushed into a new dt_cg_load_alloca
function to keep it out of the way.  Converting the offset back into a
map_value pointer requires quite a few contortions: we have to
bounds-check everything, because most of the values we need start out
as scalars that we are trying to add to a map_value, yet the verifier
only uses conditionals to adjust its idea of bounds if the conditional
is == or != (not useful for us), or if one side of the conditional is a
constant.  So we have to come up with constant upper bounds for things
to satisfy the verifier before checking the *real* bounds in
conditionals the verifier will always accept.
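
The shape of that dance, in hedged C terms (lenmax is a cg-time
constant the verifier can learn bounds from; scratchlen is a runtime
value it cannot):

  #include <stdint.h>

  static int
  bounds_ok(int64_t off, uint64_t size, uint64_t scratchlen, uint64_t lenmax)
  {
          if (off < 0 || (uint64_t)off >= lenmax)
                  return 0;    /* constant bounds: verifier-visible */
          if ((uint64_t)off + size > scratchlen)
                  return 0;    /* real bound: reg-vs-reg, verifier ignores */
          return 1;
  }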

In particular, the scratchlen turns up as an unbounded scalar: we bound
it by ANDing it with one less than the next power of two above the
scratchsize option's value.  This is guaranteed not to change the actual
value, which must be smaller than that, but it does bound the scratchlen
in the verifier's eyes.
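
clp2() here is assumed to be the usual round-up-to-a-power-of-two bit
trick (cf. Hacker's Delight), along these lines:

  #include <stdint.h>

  /* Assumed shape of clp2(): smallest power of two >= x. */
  static uint64_t
  clp2(uint64_t x)
  {
          x--;
          x |= x >> 1;
          x |= x >> 2;
          x |= x >> 4;
          x |= x >> 8;
          x |= x >> 16;
          x |= x >> 32;
          return x + 1;
  }

Masking with clp2(scratchsize + 1) - 1 therefore leaves any in-range
length (at most scratchsize) unchanged.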

The actual bounds checking is shuffled off into a new function,
dt_cg_check_bounds: among other things it is passed the reg to
bounds-check, the actual scratchlen to check against, and a constant
maximum scratch upper bound for the verifier.  That maximum is twice the
actual scratchlen, so that when we are checking real accesses with
nonzero lengths, rather than simple pointers, we don't need to worry
about accesses right at the end of the allocated space: we can just let
the verifier assume that all accesses might be max-length, and they will
still fit.  The reg comes back suitably bounded to do arithmetic on (to
convert it back from an offset to a pointer), and suitably
bounds-checked to dereference.
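
For concreteness, the two call shapes dt_cg_load_alloca uses below:

  /* Offset check: BASEREG == -1, so the reg stays a plain offset. */
  dt_cg_check_bounds(dlp, drp, 0, -1, dst->dn_reg, 0, scratchlen,
                     -1, opt_scratchsize * 2);

  /* Pointer check: the reg is reduced by scratchbot, checked, and
     turned back into a pointer by re-adding scratchbot. */
  dt_cg_check_bounds(dlp, drp, 1, scratchbot, dst->dn_reg, 0, scratchlen,
                     -1, opt_scratchsize * 2);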

dt_cg_check_bounds is significantly more flexible than we need here: it
can check pointers as well as offsets (subtracting them from a base
register, i.e. the scratch base), and it can check ranges, not just
single pointers, so we can also use it to make sure that accesses to
scratch space are in range.

There is no support for TLS storage yet, nor even diagnosis of attempts
to use it.  I don't know what that would involve.  Help solicited: it's
probably either next to impossible or dead easy :)

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
---
 libdtrace/dt_cg.c                             | 220 +++++++++++++++++-
 .../unittest/funcs/alloca/tst.string-alloca.d |  24 ++
 .../unittest/funcs/alloca/tst.string-alloca.r |   1 +
 3 files changed, 240 insertions(+), 5 deletions(-)
 create mode 100644 test/unittest/funcs/alloca/tst.string-alloca.d
 create mode 100644 test/unittest/funcs/alloca/tst.string-alloca.r

diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
index 9c9da7299b59..1eae3c381bc1 100644
--- a/libdtrace/dt_cg.c
+++ b/libdtrace/dt_cg.c
@@ -2104,6 +2104,155 @@ dt_cg_load(dt_node_t *dnp, ctf_file_t *ctfp, ctf_id_t type)
 #endif
 }
 
+/*
+ * For an alloca'ed allocation, verify that a read/write op on REG of size SIZE
+ * will not exceed the bounds given by 0..LENREG; LENREG must not exceed LENMAX
+ * (a cg-time constant value).  If REGPTR is nonzero, REG is a pointer, and is
+ * reduced by BASEREG before comparison with LENREG.  On return, REG is
+ * bounds-checked, and converted to a pointer if it was not already one (by
+ * addition of BASEREG).  If SIZEMAX is >-1, SIZE is a register, with a
+ * max allowed value given by SIZEMAX.
+ *
+ * If BASEREG is -1, it is replaced with an immediate zero value (so you can use
+ * this to check if REG is between 0..LENREG without turning REG into a pointer
+ * at all).
+ */
+static void
+dt_cg_check_bounds(dt_irlist_t *dlp, dt_regset_t *drp, int regptr, int basereg,
+		   int reg, int size, int lenreg, int sizemax, int lenmax)
+{
+	uint_t	lbl_ok = dt_irlist_label(dlp);
+	uint_t	lbl_size_err = dt_irlist_label(dlp);
+	uint_t	lbl_err = dt_irlist_label(dlp);
+
+	if (sizemax < 0) {
+		emit(dlp,  BPF_BRANCH_IMM(BPF_JLT, lenreg, size, lbl_err));
+	} else {
+		emit(dlp,  BPF_BRANCH_REG(BPF_JLT, lenreg, size, lbl_err));
+	}
+
+	/*
+	 * Check for accesses below/above scratch space by reduction to offsets
+	 * and comparison, to allow the verifier to bounds-check.  Satisfy the
+	 * verifier after reduction via masking.
+	 */
+
+	if (regptr && basereg > -1) {
+		emit(dlp,  BPF_BRANCH_REG(BPF_JLT, reg, basereg, lbl_err));
+		emit(dlp, BPF_ALU64_REG(BPF_SUB, reg, basereg));
+	}
+
+	emit(dlp,  BPF_BRANCH_IMM(BPF_JSLT, reg, 0, lbl_err));
+	emit(dlp,  BPF_BRANCH_IMM(BPF_JGE, reg, lenmax, lbl_err));
+	emit(dlp,  BPF_ALU64_IMM(BPF_AND, reg, clp2(lenmax) - 1));
+
+	/*
+	 * Check the read/write.  Two distinct code paths, one for the case when
+	 * the size is in a register, the other for the case when it is not.  In
+	 * both cases we first do a runtime check of the read/write actually
+	 * carried out (which the verifier will ignore, because it's a reg/reg
+	 * test not using EQ or NE), then do a test against the actual size of
+	 * scratch space, including a max-size buffer at the end specifically to
+	 * allow dynamically-sized writes to succeed without exceeding the
+	 * bound.
+	 */
+
+	if (sizemax < 0) {
+
+		/* Immediate size.  */
+
+		emit(dlp, BPF_ALU64_IMM(BPF_ADD, reg, size));
+		emit(dlp, BPF_BRANCH_REG(BPF_JGT, reg, lenreg, lbl_size_err));
+
+		/* Verifier placation.  */
+		emit(dlp, BPF_BRANCH_IMM(BPF_JGE, reg, lenmax, lbl_size_err));
+		emit(dlp, BPF_ALU64_IMM(BPF_SUB, reg, size));
+	} else {
+		/* Size in a reg.  */
+
+		emit(dlp, BPF_ALU64_REG(BPF_ADD, reg, size));
+		emit(dlp, BPF_BRANCH_REG(BPF_JGT, reg, lenreg, lbl_size_err));
+		emit(dlp, BPF_ALU64_REG(BPF_SUB, reg, size));
+
+		/* Verifier placation.  */
+		emit(dlp, BPF_ALU64_IMM(BPF_ADD, reg, sizemax));
+		emit(dlp, BPF_BRANCH_IMM(BPF_JGE, reg, lenmax, lbl_size_err));
+		emit(dlp, BPF_ALU64_IMM(BPF_SUB, reg, sizemax));
+	}
+	emit(dlp,  BPF_JUMP(lbl_ok));
+
+	dt_cg_probe_error_regval(yypcb, lbl_err, -1, DTRACEFLT_BADADDR, reg);
+	if (sizemax < 0)
+		dt_cg_probe_error(yypcb, lbl_size_err, -1, DTRACEFLT_BADSIZE, size);
+	else
+		dt_cg_probe_error_regval(yypcb, lbl_size_err, -1, DTRACEFLT_BADSIZE,
+					 size);
+
+	if (basereg > -1)
+		emitl(dlp,  lbl_ok,
+			    BPF_ALU64_REG(BPF_ADD, reg, basereg));
+	else
+		emitl(dlp,  lbl_ok,
+			    BPF_NOP());
+
+	dt_cg_check_fault(yypcb);
+}
+
+/* Handle loading a pointer to alloca'ed space.  */
+
+static void
+dt_cg_load_alloca(dt_node_t *dst, dt_irlist_t *dlp, dt_regset_t *drp,
+    dt_ident_t *idp)
+{
+	/*
+	 * Loads from identifiers with DT_IDFLG_ALLOCA set and DT_NF_ALLOCA set
+	 * on the target load DCTX_SCRATCHMEM + the value in the map, converting
+	 * the scratchmem-relative offset back into a properly-bounded pointer;
+	 * the value is bounds-checked before addition and bounded by the
+	 * expected access size.  The pointer is checked again when
+	 * dereferenced, because it is perfectly possible for users to add or
+	 * subtract from it, taking it out of bounds again.
+	 */
+	if ((idp->di_flags & DT_IDFLG_ALLOCA) && (dst->dn_flags & DT_NF_ALLOCA)) {
+		int	opt_scratchsize = yypcb->pcb_hdl->dt_options[DTRACEOPT_SCRATCHSIZE];
+		int	scratchbot, scratchlen;
+
+		/*
+		 * Get the scratch length and check it.
+		 */
+		if ((scratchlen = dt_regset_alloc(drp)) == -1)
+			longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
+
+		emit(dlp,  BPF_LOAD(BPF_DW, scratchlen, BPF_REG_FP,
+				    DT_STK_DCTX));
+		emit(dlp,  BPF_LOAD(BPF_DW, scratchlen, scratchlen, DCTX_MST));
+		emit(dlp,  BPF_LOAD(BPF_DW, scratchlen, scratchlen,
+				    DMST_SCRATCH_TOP));
+		emit(dlp, BPF_ALU64_IMM(BPF_AND, scratchlen, clp2(opt_scratchsize + 1) - 1));
+
+		dt_cg_check_bounds(dlp, drp, 0, -1, dst->dn_reg, 0, scratchlen,
+				   -1, opt_scratchsize * 2);
+		/*
+		 * Turn the offset into a pointer, mask to bound it (apparently
+		 * necessary because the check above fails to induce bounds in
+		 * the verifier), then re-check bounds afterwards to deal with
+		 * the addition.
+		 */
+		if ((scratchbot = dt_regset_alloc(drp)) == -1)
+			longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
+
+		emit(dlp, BPF_ALU64_IMM(BPF_AND, dst->dn_reg, clp2(opt_scratchsize + 1) - 1));
+		emit(dlp, BPF_LOAD(BPF_DW, scratchbot, BPF_REG_FP, DT_STK_DCTX));
+		emit(dlp, BPF_LOAD(BPF_DW, scratchbot, scratchbot, DCTX_SCRATCHMEM));
+		emit(dlp, BPF_ALU64_REG(BPF_ADD, dst->dn_reg, scratchbot));
+		dt_cg_check_bounds(dlp, drp, 1, scratchbot, dst->dn_reg, 0, scratchlen,
+				   -1, opt_scratchsize * 2);
+
+		dt_regset_free(drp, scratchbot);
+		dt_regset_free(drp, scratchlen);
+	}
+}
+
 static void
 dt_cg_load_var(dt_node_t *dst, dt_irlist_t *dlp, dt_regset_t *drp)
 {
@@ -2126,15 +2275,17 @@ dt_cg_load_var(dt_node_t *dst, dt_irlist_t *dlp, dt_regset_t *drp)
 			emit(dlp, BPF_LOAD(BPF_DW, dst->dn_reg, dst->dn_reg, DCTX_GVARS));
 
 		/* load the variable value or address */
-		if (dst->dn_flags & DT_NF_REF)
+		if (dst->dn_flags & DT_NF_REF) {
+			assert(!(dst->dn_flags & DT_NF_ALLOCA));
 			emit(dlp, BPF_ALU64_IMM(BPF_ADD, dst->dn_reg, idp->di_offset));
-		else {
+		} else {
 			size_t	size = dt_node_type_size(dst);
 
 			assert(size > 0 && size <= 8 &&
 			       (size & (size - 1)) == 0);
 
 			emit(dlp, BPF_LOAD(ldstw[size], dst->dn_reg, dst->dn_reg, idp->di_offset));
+			dt_cg_load_alloca(dst, dlp, drp, idp);
 		}
 
 		return;
@@ -2196,6 +2347,8 @@ dt_cg_load_var(dt_node_t *dst, dt_irlist_t *dlp, dt_regset_t *drp)
 	emite(dlp, BPF_CALL_FUNC(idp->di_id), idp);
 	dt_regset_free_args(drp);
 
+	/* TODO: alloca loads for TLS vars (once rebased on top of the TLS work). */
+
 	dt_cg_check_fault(yypcb);
 
 	if ((dst->dn_reg = dt_regset_alloc(drp)) == -1)
@@ -2426,9 +2579,60 @@ dt_cg_store_var(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp,
 	uint_t	varid, lbl_done;
 	int	reg;
 	size_t	size;
+	int	store_reg = dnp->dn_reg;
 
 	idp->di_flags |= DT_IDFLG_DIFW;
 
+	dt_cg_check_fault(yypcb);
+
+	/*
+	 * Stores of DT_NF_ALLOCA nodes into identifiers with DT_IDFLG_NONALLOCA
+	 * set indicate that an identifier has been reused for both alloca and
+	 * non-alloca purposes.  Block this, since it prevents us knowing whether
+	 * to apply an offset at load time.
+	 */
+	if (dnp->dn_flags & DT_NF_ALLOCA && idp->di_flags & DT_IDFLG_NONALLOCA) {
+		xyerror(D_ALLOCA_INCOMPAT, "%s: cannot reuse the "
+			"same identifier for both alloca and "
+			"non-alloca allocations\n",
+			idp->di_name);
+	}
+
+	/*
+	 * Stores of DT_NF_ALLOCA nodes to identifiers with DT_IDFLG_ALLOCA set
+	 * reduce the value stored by the value of DCTX_SCRATCHMEM first,
+	 * turning it into a scratchmem-relative offset.  Literal nulls load in
+	 * an otherwise-invalid value, statically distinguishable from all valid
+	 * ones.
+	 *
+	 * This is all done in a temporary register, to avoid disturbing the
+	 * return value of =.
+	 */
+	if (dnp->dn_flags & DT_NF_ALLOCA && idp->di_flags & DT_IDFLG_ALLOCA) {
+		int scratchbot, isnull = 0;
+
+		if ((store_reg = dt_regset_alloc(drp)) == -1)
+			longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
+
+		if (dnp->dn_kind == DT_NODE_OP2 && dnp->dn_op == DT_TOK_ASGN &&
+		    dnp->dn_right && dnp->dn_right->dn_kind == DT_NODE_INT &&
+		    dnp->dn_right->dn_value == 0)
+			isnull = 1;
+
+		if (!isnull) {
+			if ((scratchbot = dt_regset_alloc(drp)) == -1)
+				longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
+
+			emit(dlp, BPF_LOAD(BPF_DW, scratchbot, BPF_REG_FP, DT_STK_DCTX));
+			emit(dlp, BPF_LOAD(BPF_DW, scratchbot, scratchbot, DCTX_SCRATCHMEM));
+			emit(dlp, BPF_MOV_REG(store_reg, dnp->dn_reg));
+			emit(dlp, BPF_ALU64_REG(BPF_SUB, store_reg, scratchbot));
+
+			dt_regset_free(drp, scratchbot);
+		} else
+			emit(dlp, BPF_MOV_IMM(store_reg, DT_CG_ALLOCA_NULLPTR));
+	}
+
 	/* global and local variables (that is, not thread-local) */
 	if (!(idp->di_flags & DT_IDFLG_TLS)) {
 		if ((reg = dt_regset_alloc(drp)) == -1)
@@ -2455,21 +2659,22 @@ dt_cg_store_var(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp,
 			srcsz = dt_node_type_size(dnp->dn_right);
 			size = MIN(srcsz, idp->di_size);
 
-			dt_cg_memcpy(dlp, drp, reg, dnp->dn_reg, size);
+			dt_cg_memcpy(dlp, drp, reg, store_reg, size);
 		} else {
 			size = idp->di_size;
 
 			assert(size > 0 && size <= 8 &&
 			       (size & (size - 1)) == 0);
 
-			emit(dlp, BPF_STORE(ldstw[size], reg, idp->di_offset, dnp->dn_reg));
+			emit(dlp, BPF_STORE(ldstw[size], reg, idp->di_offset, store_reg));
 		}
 
 		dt_regset_free(drp, reg);
-		return;
+		goto out;
 	}
 
 	/* TLS var */
+	/* XXX implement for alloca */
 	varid = idp->di_id - DIF_VAR_OTHER_UBASE;
 	size = idp->di_size;
 
@@ -2485,6 +2690,7 @@ dt_cg_store_var(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp,
 	dt_regset_xalloc(drp, BPF_REG_0);
 	emite(dlp, BPF_CALL_FUNC(idp->di_id), idp);
 	dt_regset_free_args(drp);
+
 	lbl_done = dt_irlist_label(dlp);
 	emit(dlp,  BPF_BRANCH_IMM(BPF_JEQ, dnp->dn_reg, 0, lbl_done));
 
@@ -2518,6 +2724,10 @@ dt_cg_store_var(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp,
 
 	emitl(dlp, lbl_done,
 		   BPF_NOP());
+
+out:
+	if (store_reg != dnp->dn_reg)
+		dt_regset_free(drp, store_reg);
 }
 
 /*
diff --git a/test/unittest/funcs/alloca/tst.string-alloca.d b/test/unittest/funcs/alloca/tst.string-alloca.d
new file mode 100644
index 000000000000..ae1da7484b69
--- /dev/null
+++ b/test/unittest/funcs/alloca/tst.string-alloca.d
@@ -0,0 +1,24 @@
+/*
+ * Oracle Linux DTrace.
+ * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
+ * Licensed under the Universal Permissive License v 1.0 as shown at
+ * http://oss.oracle.com/licenses/upl.
+ */
+
+/*
+ * ASSERTION: You can copy a string into an alloca'ed region and read
+ *	      it out again.
+ *
+ * SECTION: Actions and Subroutines/alloca()
+ */
+
+#pragma D option quiet
+#pragma D option scratchsize=512
+
+BEGIN
+{
+	x = (string *)alloca(sizeof(string) + 1);
+	*x = "abc";
+	trace(*x);
+	exit(0);
+}
diff --git a/test/unittest/funcs/alloca/tst.string-alloca.r b/test/unittest/funcs/alloca/tst.string-alloca.r
new file mode 100644
index 000000000000..8baef1b4abc4
--- /dev/null
+++ b/test/unittest/funcs/alloca/tst.string-alloca.r
@@ -0,0 +1 @@
+abc
-- 
2.35.1.261.g8402f930ba.dirty



