[DTrace-devel] [PATCH 4/4] spec: correct semantics of specsize overflow

Kris Van Hees kris.van.hees at oracle.com
Thu May 11 20:30:00 UTC 2023


On Thu, May 11, 2023 at 02:44:47PM -0400, Eugene Loh via DTrace-devel wrote:
> Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
> 
> with usual comments/questions.
> 
> On 5/9/23 03:19, Kris Van Hees via DTrace-devel wrote:
> > The speculations implementation (commit 7fe6ce9a "consume, cg: implement
> > speculations") did not detect speculation data exceeding the specsize
> > limit until the consumer added data fragments to a speculation buffer.
> > As a result, all statements in clauses were always executed.
> > 
> > The legacy behaviour is to abort clause execution once a data recording
> > action causes the speculation buffer to overflow.  That causes any
> > further statementa in the clause to be skipped.  The testsuite contains
> 
> statementa
> ->
> statements
> 
> > an explicit test for this: tst.SpecSizeVariations4.d.  Unfortunately,
> > the test (and expected results) were amended to match the (incorrect)
> > new semantics.
> > 
> > This patch restores the legacy behaviour by tracking the size of the
> > speculation across clauses that contribute to a given speculation, so
> > that we can detect speculation buffer overflows as they occur due to
> > a specific data recording action.
> > 
> > Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
> > 
> > diff --git a/bpf/speculation.c b/bpf/speculation.c
> > index 9f3c0e6c..04b1d3f0 100644
> > --- a/bpf/speculation.c
> > +++ b/bpf/speculation.c
> > @@ -105,29 +105,32 @@ noinline uint32_t dt_speculation(void)
> >    * Begin a speculation given an already-assigned ID.
> >    *
> >    * We consider a speculation ID usable only if it exists in the speculation map
> > - * (indicating that speculation() has returned it) with a a zero read value
> > + * (indicating that speculation() has returned it) and it is not being drained
> >    * (indicating that neither commit() nor discard() have been called for it).
> >    * We bump the written value by one to indicate that another speculative buffer
> >    * is (or will soon be, once this clause terminates and its epilogue runs)
> >    * available for draining.
> > + *
> > + * We return the speculation entry in the map or 0.
> >    */
> 
> How about NULL instead of 0?  Also, since the function is so short and well
> documented, the lead comment block might be unnecessary.

NULL is not known as a symbol in these C source files that are compiled into
BPF, so 0 is the way to go.  Unless we define NULL ourselves somewhere, but
since we haven't yet, I go with what we have been doing.

> > -noinline int32_t
> > +noinline dt_bpf_specs_t *
> >   dt_speculation_speculate(uint32_t id)
> >   {
> >   	dt_bpf_specs_t *spec;
> >   	if ((spec = bpf_map_lookup_elem(&specs, &id)) == NULL)
> > -		return -1;
> > +		return 0;
> 
> How about NULL instead of 0?

Same.

> >   	/*
> >   	 * Spec already being drained: do not continue to emit new
> >   	 * data into it.
> >   	 */
> >   	if (spec->draining)
> > -		return -1;
> > +		return 0;
> 
> How about NULL instead of 0?

Same.

> >   	atomic_add(&spec->written, 1);
> > -	return 0;
> > +
> > +	return spec;
> >   }
> >   /*
> > diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> > @@ -813,7 +815,8 @@ dt_cg_prologue(dt_pcb_t *pcb, dt_node_t *pred)
> >   	TRACE_REGSET("Prologue: End  ");
> >   	/*
> > -	 * Account for 32-bit epid (at offset 0) and 32-bit tag (at offset 4).
> > +	 * Account for 32-BIt EPID (at offset 0) and 32-bit speculation ID (at
> > +	 * offset 4).
> >   	 */
> >   	pcb->pcb_bufoff += 2 * sizeof(uint32_t);
> >   }
> 
> Capitalized BIt?

Thanks.

> > @@ -994,8 +1012,6 @@ dt_cg_check_fault(dt_pcb_t *pcb)
> >   	emit(dlp,  BPF_LOAD(BPF_DW, reg, reg, DCTX_MST));
> >   	emit(dlp,  BPF_LOAD(BPF_DW, reg, reg, DMST_FAULT));
> >   	emit(dlp,  BPF_BRANCH_IMM(BPF_JEQ, reg, 0, lbl_ok));
> > -	emit(dlp,  BPF_MOV_IMM(BPF_REG_0, 0));
> > -	emit(dlp,  BPF_RETURN());
> >   	emitl(dlp, lbl_ok,
> >   		   BPF_NOP());
> >   	dt_regset_free(drp, reg);
> 
> What?  Is this change a mistake?

Oh darn - so *that* is where the code vanished.  I'll get rid of this portion
(since it is obviously wrong), and adjust the later patch to change this into
a simple jne %r0, 0, pcb->pcb_exitlbl.

> > @@ -1407,10 +1424,48 @@ dt_cg_store_val(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind,
> > +ok:
> > +	/*
> > +	 * If this clause contains a speculate() statement, we need to
> > +	 * generate code to verify that the amount of data recorded for this
> > +	 * speculation (incl. what was just added) does not exceed the
> > +	 * specsize limit.  If it does, further execution of this clause
> > +	 * should be aborted.
> > +	 */
> > +	if (cflags & DT_CLSFLAG_SPECULATE) {
> > +		uint_t	lbl_ok = dt_irlist_label(dlp);
> > +		size_t	specsize = dtp->dt_options[DTRACEOPT_SPECSIZE];
> > +
> > +		/*
> > +		 * if (*((uint32_t *)&buf[DBUF_SPECID]) != 0) {
> > +		 *     if (dctx->dmst->specsize + off + size >
> 
> dctx->dmst->specsize + off + size
> ->
> dctx->dmst->specsize + (off + size)

What does that gain us?

> > +		 *	   dtp->dt_options[DTRACEOPT_SPECSIZE])
> > +		 *         return;
> > +		 * }
> > +		 */
> > +		dt_regset_xalloc(drp, BPF_REG_0);
> 
> Why hardwire to %r0???  Why not just let dt_regset_alloc() find one for you?
> 
> > +		emit(dlp,  BPF_LOAD(BPF_W, BPF_REG_0, BPF_REG_9, DBUF_SPECID));
> > +		emit(dlp,  BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, lbl_ok));
> > +
> > +		emit(dlp,  BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_FP, DT_STK_DCTX));
> > +		emit(dlp,  BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_0, DCTX_MST));
> > +		emit(dlp,  BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_0, DMST_SPECSIZE));
> > +		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, off + size));
> > +		emit(dlp,  BPF_BRANCH_IMM(BPF_JLE, BPF_REG_0, specsize, lbl_ok));
> 
> How about changing
>     if (reg <= specsize) goto lbl_ok
> to
>     if (reg > specsize) goto pcb_exitlbl
> 
> That would be simpler and map better to epilogue and your check_fault()
> patch.

Good idea.

> > +		emit(dlp,  BPF_MOV_IMM(BPF_REG_0, 0));
> > +		emit(dlp,  BPF_RETURN());
> > +		dt_regset_free(drp, BPF_REG_0);
> > +
> > +		emitl(dlp, lbl_ok,
> > +			   BPF_NOP());
> 
> Are the regset_free() and lbl_ok out of order?  E.g., let's say you jump to
> lbl_ok.  Then, if there were any reg fill code, you would miss it.

Won't matter anymore after the change you suggested above.  But yes.

> > +	}
> > +
> > +	return 0;
> >   }
> >   static void
> > @@ -1973,14 +2029,16 @@ dt_cg_act_speculate(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
> >   		longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
> >   	/*
> > -	 *	rc = dt_speculation_speculate(spec);
> > +	 *	dt_bpf_specs_t *spec = dt_speculation_speculate(specid);
> >   	 *				// lddw %r1, %dn_reg
> >   	 *				// call dt_speculation_speculate
> >   	 *				//     (%r1 ... %r5 clobbered)
> >   	 *				//     (%r0 = error return)
> 
> %r0=error comment needs updating or removal (comment within a comment)

Yes, thanks.

> > -	 *	if (rc != 0)		// jne %r0, 0, lbl_exit
> > +	 *	if (spec == 0)		// jne %r0, 0, lbl_exit
> 
> jne comment needs updating or removal (comment within a comment)

Yes, thanks.

> >   	 *		goto exit;
> > -	 *	*((uint32_t *)&buf[4]) = spec;	// mov [%r9 + 4], %dn_reg
> > +	 *	*((uint32_t *)&buf[DBUF_SPECID]) = specid;
> > +	 *				// mov [%r9 + DBUF_SPECID], %dn_reg
> > +	 *	dctx->dmst->specsize = spec->size;
> >   	 *	exit:			// nop
> >   	 */
> > diff --git a/libdtrace/dt_dctx.h b/libdtrace/dt_dctx.h
> > @@ -74,6 +76,21 @@ typedef struct dt_dctx {
> >   #define DCTX_SIZE	((int16_t)sizeof(dt_dctx_t))
> > +/*
> > + * The dctx->buf pointer references a block of memory that contains:
> > + *
> > + *                       +----------------+
> > + *                  0 -> | EPID           |
> > + *                       +----------------+
> > + *		    4 -> | Speculation ID |
> 
> Tab is inconsistent with the style of this comment block.

Fixed.

> > + *                       +----------------+
> > + *                       | Trace Data     |
> > + *                       |      ...       |
> > + *                       +----------------+
> > + */
> > +#define DBUF_EPID	0
> > +#define DBUF_SPECID	4
> > +
> >   /*
> >    * The dctx->mem pointer references a block of memory that contains:
> >    *
> > diff --git a/test/unittest/speculation/tst.SpecSizeVariations4.d b/test/unittest/speculation/tst.SpecSizeVariations4.d
> 
> Good, but:
> 
> 1)  The test (fwiw) was passing.  Then in the last patch it failed. Now it's
> passing again (this time properly).  From a testing perspective, there was a
> regression in this test in the previous patch.  Ideally, that would not be
> so.  Maybe xfail this test in the previous patch, even if that xfail is
> short-lived?

The expected results file was wrong.  The test was passing because it was
looking for the wrong output.  This patch provides the correct implementation
*and* ensures that the test is looking for the correct output.

> 2)  The long format strings are deceptive.  They suggest that they are the
> ones responsible for the overflow.  How about getting rid of them.  Or:
> 
>         printf("%lld\n", x);    /* Lots of data... */
>         printf("%lld\n", x);    /* ... has to be crammed into this buffer...
> */
>         printf("%lld\n", x);    /* ... until it overflows... */
>         printf("%lld\n", x);    /* ... and causes flops. */

What exactly causes the overflow of data is an implementation detail that I
think the test does not need to care about, as long as we generate enough
data to cause the overflow.  That said, this is a test we important from the
legacy version and I'd prefer not to make changes we do not need.  That makes
it easier to ascertain we are essentially using the same test.

So yes, I see your point (and I expect that the test may have been written as
it was due to perhaps an earlier implementation of DTrace doing things slightly
differently), but I do not think it is a problem to keep the test as much as
we can as it was in the legacy version.

> 3)  The test is weak in that it checks only speculations in a single
> clause.  How about spreading the speculative data over multiple clauses?

But the whole point of the test *is* to exercise the semantics of an overflow
occuring in a clause.

But yes, we do need more (and better) tests for the various speculation buffer
sizing effects vs the amount of data recorded in a speculation.  It seems like
they used to exist or were planned to be written (see the comment block in the
err.BufSizeVariations1.d test).

This particular patch is addressing a specific problem - but I do agree we need
to add more tests.



More information about the DTrace-devel mailing list