[DTrace-devel] [PATCH v2 4/5] Implement the support code for generating aggregation data

Mon Nov 30 21:48:42 PST 2020

On 11/30/2020 08:11 PM, Kris Van Hees wrote:

> On Mon, Nov 30, 2020 at 06:42:29PM -0800, Eugene Loh wrote:
>> On 11/24/2020 01:50 PM, Kris Van Hees wrote:
>>
>>> This patch adds support for aggregations to the D compiler and provides
>>> an implementation for the count() and lquantize() aggregation functions
>>> as examples on how to use the aggegation support.
>>> The DT_CG_AGG_IMPL macro will:
>>>       1. Prepare the first data copy for writing and redirect the consumer
>>>          to read from the second copy.  A register 'dreg' will be reserved
>>>          to hold a pointer to the data to be updated.
>>>       2. Call f(dlp, drp, dreg, ...) to perform the data update.
>>>       3. Prepare the second data copy for writing and redirect the consumer
>>>          to read from the first copy.  A register 'dreg' will be reserved
>>>          to hold a pointer to the data to be updated.
>>>       4. Call f(dlp, drp, dreg, ...) to perform the data update.
>> It would be clearer if one just said that steps 1 and 2 are repeated for
>> the other buffer than to copy and paste them as steps 3 and 4. Also, the
>> difference between "first" and "second" is immaterial. They're just two,
>> equivalent buffers, with the producer writing one and the consumer
>> reading the other.
> Actually, there is a minor difference: the latch sequence number is only
> ever updated in the first copy

You seem to make the same point in this response below, but this makes 
no sense to me.  The latch sequence number has to be updated when each 
copy is updated, and that sequence number is not particularly associated 
with either copy;  it's associated with the CPU.  Each time, the per-CPU 
latch sequence number is then used to determine which copy to use.  The 
two copies are quite symmetric with respect to the seq#.

> (which is why it is referred to as primary
> copy in the code comments.)  I don't think there is much gain from replacing
> the 4 lines that comprise points 3 and 4 with a single line that sayd that
> 1 and 2 are to be repeated for the other buffer because points 1 and 3 both
> refer to two buffers.  I think being overly verbose here serves the purpose
> of being abundantly clear about what the macro does.
>
>>> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
>>> +/*
>>> + * Prepare the aggregation buffer for updating for a specific aggregation, and
>>> + * return a register that holds a pointer to the aggregation data to be
>>> + * updated.
>>> + *
>>> + * We update the latch seqcount (first value in the aggregation buffer) to
>>> + * signal that reads should be directed to the alternate copy of the data.  We
>>> + * then determine the location of data for the given aggregation that can be
>>> + * updated.  This value is stored in the register returned from this function.
>>> + */
>>> +static int
>>> +dt_cg_agg_buf_prepare(dt_ident_t *aid, int size, dt_irlist_t *dlp,
>>> +		      dt_regset_t *drp)
>>> +{
>>> +	int		rx, ry;
>>> +	struct bpf_insn	instr;
>>> +
>>> +	TRACE_REGSET("            Prep: Begin");
>>> +
>>> +	dt_regset_xalloc(drp, BPF_REG_0);
>>> +	rx = dt_regset_alloc(drp);
>>> +	ry = dt_regset_alloc(drp);
>>> +	assert(rx != -1 && ry != -1);
>>> +
>>> +	/*
>>> +	 *				//     (%rX = register for agg ptr)
>>> +	 *				//     (%rY = register for agg data)
>>> +	 *	agd = dctx->agg;	// lddw %r0, [%fp + DT_STK_DCTX]
>>> +	 *				// lddw %rX, [%r0 + DCTX_AGG]
>>> +	 */
>>> +	instr = BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_FP, DT_STK_DCTX);
>>> +	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
>>> +	instr = BPF_LOAD(BPF_DW, rx, BPF_REG_0, DCTX_AGG);
>>> +	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
>>> +
>>> +	/*
>>> +	 *	off = (*agd % 2) * size	// lddw %rY, [%rX + 0]
>>> +	 *	      + aid->di_offset	// and %rY, 1
>>> +	 *	      + sizeof(uint64_t);
>>> +	 *				// mul %rY, size
>>> +	 *				// add %rY, aid->di_offset +
>>> +	 *				//		sizeof(uint64_t)
>>> +	 *	(*agd)++;		// mov %r0, 1
>>> +	 *				// xadd [%rX + 0], %r0
>>> +	 *	agd += off;		// add %rX, %rY
>>> +	 */
>>> +	instr = BPF_LOAD(BPF_DW, ry, rx, 0);
>>> +	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
>>> +	instr = BPF_ALU64_IMM(BPF_AND, ry, 1);
>>> +	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
>>> +	instr = BPF_ALU64_IMM(BPF_MUL, ry, size);
>>> +	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
>>> +	instr = BPF_ALU64_IMM(BPF_ADD, ry, aid->di_offset + sizeof(uint64_t));
>>> +	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
>>> +	instr = BPF_MOV_IMM(BPF_REG_0, 1);
>>> +	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
>>> +	instr = BPF_XADD_REG(BPF_DW, rx, 0, BPF_REG_0);
>>> +	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
>>> +	instr = BPF_ALU64_REG(BPF_ADD, rx, ry);
>>> +	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
>>> +
>>> +	dt_regset_free(drp, ry);
>>> +	dt_regset_free(drp, BPF_REG_0);
>>> +
>>> +	TRACE_REGSET("            Prep: End  ");
>>> +
>>> +	return rx;
>>> +}
>>> +
>>> +#define DT_CG_AGG_IMPL(aid, sz, dlp, drp, f, ...) \
>>> +	do {								\
>>> +		int	dreg;						\
>>> +									\
>>> +		TRACE_REGSET("        Upd: Begin ");			\
>>> +									\
>>> +		/*							\
>>> +		 * Redirect reading to secondary copy.  The register	\
>>> +		 * returned holds the base pointer to the primary copy.	\
>>> +		 */							\
>> The nomenclature "primary" and "secondary" is introduced here and only
>> here, it seems.  Why?  It suggests one of the buffers is more important
>> or permanent or something and the other is a back-up, but there is
>> really no difference between the two.
> "first data copy" and "second data copy" vs "primary copy" and "secondary
> copy"...  I am not sure it really matters or that it would seem confusing.
> As I mention above, the first data copy (primary copy, first buffer, ...)
> is slightly different because it stores the latch sequence number.

No.  Neither copy stores the seq#.  The seq# (in the current 
implementation) is shared by all copies of all aggregations (on a 
per-CPU basis).

>>> +		dreg = dt_cg_agg_buf_prepare((aid), (sz), (dlp), (drp));\
>>> +									\
>>> +		/*							\
>>> +		 * Call the update implementation.  Aggregation data is \
>>> +		 * accessed through register 'reg'.			\
>>> +		 */							\
>>> +		(f)((dlp), (drp), dreg, ## __VA_ARGS__);		\
>>> +		dt_regset_free((drp), dreg);				\
>>> +									\
>>> +		TRACE_REGSET("        Upd: Switch");			\
>>> +									\
>>> +		/*							\
>>> +		 * Redirect reading to primary copy.  The register	\
>>> +		 * returned holds the base pointer to the secondary	\
>>> +		 * copy.						\
>>> +		 */							\
>>> +		dreg = dt_cg_agg_buf_prepare((aid), (sz), (dlp), (drp));\
>>> +									\
>>> +		/*							\
>>> +		 * Call the update implementation.  Aggregation data is \
>>> +		 * accessed through register 'reg'.			\
>>> +		 */							\
>>> +		(f)((dlp), (drp), dreg, ## __VA_ARGS__);		\
>>> +		dt_regset_free((drp), dreg);				\
>>> +									\
>>> +		TRACE_REGSET("        Upd: End   ");			\
>>> +	} while (0)
>>> +