[DTrace-devel] [PATCH v2] Implement new dtrace_eprobedesc_t and dtrace_datadesc_t design

Eugene Loh eugene.loh at oracle.com
Thu Apr 9 11:49:26 PDT 2020


Just took a relatively quick look.  A little feedback below, basically 
around comments.

On 04/09/2020 12:19 AM, Kris Van Hees wrote:

> In DTrace v1, the dtrace_eprobedesc_t structure was used to obtain
> metadata information about the probe data stream from the kernel.
> Each EPID was also associated with a speciifc dtrace_probedesc_t,

specific

> identifying the probe for which the EPID represents an enabling.
> Each EPID identified a unique dtrace_eprobedesc_t while the
> dtrace_probedesc_t could be shared between multiple EPIDs.
>
>      EPID -[1-to-1]-> dtrace_eprobedesc_t
>      EPID -[n-to-1]-> dtrace_probedesc_t
>
> In the DTrace v2 design, where everything resides in userspace, we
> can create that metadata directly.  We can also make use of the fact
> that when a single program is attached to multiple probes, we do not
> need more than one copy of the metadata.  We also make use of the fact
> that the EPID identifies a unique (probe, program) tuple.
>
>      EPID -[n-to-1]-> dtrace_datadesc_t
>      EPID -[n-to-1]-> dtrace_probedesc_t
>
> The probe data description consists of records that each describe one
> data item in the probe data.  Records are added as D code is being
> compiled, and they specify the type, size, and alignment of the data
> item.  The information from the records is used when consuming data
> from the probe data buffer.
>
> The dt_rec_add() function is used to register a data item as a new
> record in the data description.  It will take care of updating the
> pcb_bufoff value that is used to know where the next data item is to
> be stored in the data buffer.  The offset of the data item being
> registered is returned so it can be used during code generation.
>
> The dt_rec_add() function will ensure that data items are aligned
> properly in the data buffer by inserting padding (zeros) as needed
> into the buffer between the previously stored item and the one we
> are going to store.  E.g. if a 32-bit value wasu stored at offset 16,

wasu -> was

This explanation is fine, though it is a little weird to talk both about 
bits and bytes.  How about 4-byte instead of 32-bit and 8-byte instead 
of 64-bit.  They're synonymous in any reader's mind, so not a big deal.

But then the intent is clear enough anyhow;  so you could just delete 
the "E.g., if a..." sentence entirely.  Indeed, how about replacing the 
two dt_rec_add() paragraphs with simply:

         The dt_rec_add() function is used to register a data item as a new
         record in the data description.  It will advance pcb_bufoff to the
         proper alignment and call a code-generation callback function to
         fill in any needed zero padding.

> and we want to store a 64-bit value next, code will be generated to
> store a 4-byte zero value as padding at offset 20 so that the 64-bit
> value will be stored at offset 24 (which is properly aligned for
> 8-byte data items).
>
> This commit also disables the (incomplete) implementation of the
> trace() action for string values.  Where before it would simply end
> up putting bogus values into the data buffer, it may now cause actual
> corruption of the data buffer.
>
> Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
> ---
>   include/dtrace/metadesc.h |  39 +++---
>   libdtrace/dt_aggregate.c  |  10 +-
>   libdtrace/dt_bpf.c        |  14 +--
>   libdtrace/dt_cc.c         |   5 +
>   libdtrace/dt_cg.c         | 189 +++++++++++++++-------------
>   libdtrace/dt_consume.c    |  96 ++++++++------
>   libdtrace/dt_handle.c     |  26 ++--
>   libdtrace/dt_impl.h       |  14 ++-
>   libdtrace/dt_map.c        | 257 +++++++++++++++++++++++++-------------
>   libdtrace/dt_pcb.h        |   4 +-
>   libdtrace/dt_program.c    |  18 ++-
>   libdtrace/dtrace.h        |   8 +-
>   12 files changed, 407 insertions(+), 273 deletions(-)
>
> diff --git a/include/dtrace/metadesc.h b/include/dtrace/metadesc.h
> index 7d800e97..10c6c2a5 100644
> --- a/include/dtrace/metadesc.h
> +++ b/include/dtrace/metadesc.h
> @@ -21,16 +21,16 @@
>    * DTrace separates the trace data stream from the metadata stream.  The only
>    * metadata tokens placed in the data stream are enabled probe identifiers
>    * (EPIDs) or (in the case of aggregations) aggregation identifiers.  In order
> - * to determine the structure of the data, DTrace consumers pass the token to
> - * the kernel, and receive in return a corresponding description of the enabled
> - * probe (via the dtrace_eprobedesc structure) or the aggregation (via the
> - * dtrace_aggdesc structure).  Both of these structures are expressed in terms
> - * of record descriptions (via the dtrace_recdesc structure) that describe the
> - * exact structure of the data.  Some record descriptions may also contain a
> - * format identifier; this additional bit of metadata can be retrieved from the
> - * kernel, for which a format description is returned via the dtrace_fmtdesc
> - * structure.  Note that all four of these structures must be bitness-neutral
> - * to allow for a 32-bit DTrace consumer on a 64-bit kernel.
> + * to determine the structure of the data, DTrace uses the token to perform a
> + * lookup to retrieve the corresponding description of the enabled probe (via
> + * the dtrace_datadesc structure) or the aggregation (via the dtrace_aggdesc
> + * structure).
> + *
> + * Both of these structures are expressed in terms of record descriptions (via
> + * the dtrace_recdesc structure) that describe the exact structure of the data.
> + * Some record descriptions may also contain a format identifier; this
> + * additional bit of metadata refers to a format description described via a
> + * dtrace_fmtdesc structure.
>    */
>   typedef struct dtrace_recdesc {
>   	dtrace_actkind_t dtrd_action;		/* kind of action */
> @@ -42,14 +42,13 @@ typedef struct dtrace_recdesc {
>   	uint64_t dtrd_uarg;			/* user argument */
>   } dtrace_recdesc_t;
>   
> -typedef struct dtrace_eprobedesc {
> -	dtrace_epid_t dtepd_epid;		/* enabled probe ID */
> -	dtrace_id_t dtepd_probeid;		/* probe ID */
> -	uint64_t dtepd_uarg;			/* library argument */
> -	uint32_t dtepd_size;			/* total size */
> -	int dtepd_nrecs;			/* number of records */
> -	dtrace_recdesc_t dtepd_rec[1];		/* recods themselves */
> -} dtrace_eprobedesc_t;
> +typedef struct dtrace_datadesc {
> +	uint64_t dtdd_uarg;			/* library argument */
> +	uint32_t dtdd_size;			/* total size */
> +	int dtdd_nrecs;				/* number of records */
> +	dtrace_recdesc_t *dtdd_recs;		/* records themselves */
> +	int dtdd_refcnt;			/* reference count */
> +} dtrace_datadesc_t;
>   
>   typedef struct dtrace_aggdesc {
>   	DTRACE_PTR(char, dtagd_name);		/* not filled in by kernel */
> @@ -70,8 +69,8 @@ typedef struct dtrace_fmtdesc {
>   } dtrace_fmtdesc_t;
>   
>   #define DTRACE_SIZEOF_EPROBEDESC(desc)				\
> -	(sizeof (dtrace_eprobedesc_t) + ((desc)->dtepd_nrecs ?  \
> -	(((desc)->dtepd_nrecs - 1) * sizeof (dtrace_recdesc_t)) : 0))
> +	(sizeof (dtrace_eprobedesc_t) + ((desc)->dtdd_nrecs ?  \
> +	(((desc)->dtdd_nrecs - 1) * sizeof (dtrace_recdesc_t)) : 0))
>   
>   #define	DTRACE_SIZEOF_AGGDESC(desc)			       \
>   	(sizeof (dtrace_aggdesc_t) + ((desc)->dtagd_nrecs ?     \
> diff --git a/libdtrace/dt_aggregate.c b/libdtrace/dt_aggregate.c
> index b545cb32..908dc6d2 100644
> --- a/libdtrace/dt_aggregate.c
> +++ b/libdtrace/dt_aggregate.c
> @@ -644,8 +644,8 @@ hashnext:
>   		aggdata->dtada_size = size;
>   		aggdata->dtada_desc = agg;
>   		aggdata->dtada_handle = dtp;
> -		(void) dt_epid_lookup(dtp, agg->dtagd_epid,
> -		    &aggdata->dtada_edesc, &aggdata->dtada_pdesc);
> +		dt_epid_lookup(dtp, agg->dtagd_epid, &aggdata->dtada_ddesc,
> +			       &aggdata->dtada_pdesc);
>   		aggdata->dtada_normal = 1;
>   
>   		h->dtahe_hashval = hashval;
> @@ -1631,9 +1631,9 @@ dtrace_aggregate_walk_joined(dtrace_hdl_t *dtp, dtrace_aggvarid_t *aggvars,
>   				aggdata->dtada_size = agg->dtagd_size;
>   				aggdata->dtada_desc = agg;
>   				aggdata->dtada_handle = dtp;
> -				(void) dt_epid_lookup(dtp, agg->dtagd_epid,
> -				    &aggdata->dtada_edesc,
> -				    &aggdata->dtada_pdesc);
> +				dt_epid_lookup(dtp, agg->dtagd_epid,
> +					       &aggdata->dtada_ddesc,
> +					       &aggdata->dtada_pdesc);
>   				aggdata->dtada_normal = 1;
>   				zaggdata[i].dtahe_hashval = 0;
>   				zaggdata[i].dtahe_size = agg->dtagd_size;
> diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
> index cacfa07a..2c90411b 100644
> --- a/libdtrace/dt_bpf.c
> +++ b/libdtrace/dt_bpf.c
> @@ -249,15 +249,16 @@ dt_bpf_reloc_prog(dtrace_hdl_t *dtp, const dt_probe_t *prp,
>    */
>   int
>   dt_bpf_load_prog(dtrace_hdl_t *dtp, const dt_probe_t *prp,
> -		 const dtrace_difo_t *dp)
> +		 dtrace_stmtdesc_t *sdp)
>   {
>   	struct bpf_load_program_attr	attr;
>   	dtrace_epid_t			epid;
> +	const dtrace_difo_t		*dp = sdp->dtsd_action->dtad_difo;
>   	int				logsz = BPF_LOG_BUF_SIZE;
>   	char				*log;
>   	int				rc;
>   
> -	epid = dt_epid_add(dtp, prp->desc->id, 0);
> +	epid = dt_epid_add(dtp, sdp->dtsd_ddesc, prp->desc->id);
>   
>   	/*
>   	 * Check whether there are any probe-specific relocations to be
> @@ -360,14 +361,7 @@ dt_bpf_stmt(dtrace_hdl_t *dtp, dtrace_prog_t *pgp, dtrace_stmtdesc_t *sdp,
>   	if (!prp)
>   		return dt_set_errno(dtp, ESRCH);
>   
> -	/*
> -	 * In the new implementation, every statement has a single action,
> -	 * which is the entire program.  At a future code reworking, we should
> -	 * combine the action into the statement and do away with the action
> -	 * list.
> -	 */
> -	assert(ap == sdp->dtsd_action_last);
> -	fd = dt_bpf_load_prog(dtp, prp, ap->dtad_difo);
> +	fd = dt_bpf_load_prog(dtp, prp, sdp);
>   	if (fd < 0)
>   		return fd;
>   
> diff --git a/libdtrace/dt_cc.c b/libdtrace/dt_cc.c
> index cc299b93..ab987fc7 100644
> --- a/libdtrace/dt_cc.c
> +++ b/libdtrace/dt_cc.c
> @@ -259,6 +259,11 @@ dt_stmt_append(dtrace_stmtdesc_t *sdp, const dt_node_t *dnp)
>   			datarec = 1;
>   	}
>   
> +	/*
> +	 * Finalize the probe data description for the statement.
> +	 */
> +	dt_datadesc_finalize(yypcb->pcb_hdl, sdp->dtsd_ddesc);
> +
>   	if (dtrace_stmt_add(yypcb->pcb_hdl, yypcb->pcb_prog, sdp) != 0)
>   		longjmp(yypcb->pcb_jmpbuf, dtrace_errno(yypcb->pcb_hdl));
>   
> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> index 4945dead..f71fc62c 100644
> --- a/libdtrace/dt_cg.c
> +++ b/libdtrace/dt_cg.c
> @@ -126,7 +126,10 @@ dt_cg_prologue(dt_pcb_t *pcb)
>   	instr = BPF_STORE_IMM(BPF_W, BPF_REG_9, 4, 0);
>   	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
>   
> -	pcb->pcb_bufoff += sizeof(uint64_t);
> +	/*
> +	 * Account for epid (at offset 0) and 4-byte padding (at offset 4).
> +	 */
> +	pcb->pcb_bufoff += 2 * sizeof(uint32_t);
>   }

The comments preceding and embedded in dt_cg_prologue in some places 
still refer to the datum after EPID as "padding".  Can these references 
be changed to refer to "tag"?  (After all, now with dt_cg_fill_gap, 
inserting a gap manually would be a waste of effort and space:  the next 
datum might not even need 8-byte alignment.) Those references are:

/*
  * Generate the function prologue:
  *      [...]
  *      6. Store the epid and 4 padding byes at the beginning of the output
  *         buffer
  * [...]
  */

         /*
          * We read epid from dctx (struct dt_bpf_context) and store it 
in the
          * first 4 bytes of the output buffer.  The next 4 bytes are padded
          * with 0s so that the next entry will be at a 8-byte boundary.
          * [...]
          * Account for epid (at offset 0) and 4-byte padding (at offset 4).
          */
>   /*
> @@ -187,26 +190,51 @@ dt_cg_epilogue(dt_pcb_t *pcb)
>   	dt_irlist_append(dlp, dt_cg_node_alloc(pcb->pcb_exitlbl, instr));
>   }
>   
> -static size_t
> +/*
> + * Generate an instruction sequence to fill a gap in the output buffer with 0
> + * values.  This is used to ensure that there are no uninitialized bytes in the
> + * output buffer that could result from alignment requirements for data
> + * records.
> + */
> +static void
> +dt_cg_fill_gap(dt_pcb_t *pcb, int gap)
> +{
> +	struct bpf_insn	instr;
> +	dt_irlist_t	*dlp = &pcb->pcb_ir;
> +	uint_t		off = pcb->pcb_bufoff;
> +
> +	if (gap & 1) {
> +		instr = BPF_STORE_IMM(BPF_B, BPF_REG_9, off, 0);
> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> +		off += 1;
> +	}
> +	if (gap & 2) {
> +		instr = BPF_STORE_IMM(BPF_H, BPF_REG_9, off, 0);
> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> +		off += 2;
> +	}
> +	if (gap & 4) {
> +		instr = BPF_STORE_IMM(BPF_W, BPF_REG_9, off, 0);
> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> +	}
> +}
> +
> +static void
>   dt_cg_act_breakpoint(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
>   	dnerror(dnp, D_UNKNOWN, "breakpoint() is not implemented (yet)\n");
>   	/* FIXME: Needs implementation */
> -
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_chill(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
>   	dt_cg_node(dnp->dn_args, &pcb->pcb_ir, pcb->pcb_regs);
>   	dnerror(dnp, D_UNKNOWN, "chill() is not implemented (yet)\n");
>   	/* FIXME: Needs implementation */
> -
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_clear(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
>   	dt_node_t *anp;
> @@ -236,8 +264,6 @@ dt_cg_act_clear(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   	 * DEPENDS ON: How aggregations are implemented using eBPF (hashmap?).
>   	 * AGGID = aid->di_id
>   	 */
> -
> -	return 0;
>   }
>   
>   /*
> @@ -247,22 +273,24 @@ dt_cg_act_clear(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>    * Stores:
>    *	integer (4 bytes)		-- speculation ID
>    */
> -static size_t
> +static void
>   dt_cg_act_commit(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
>   	struct bpf_insn instr;
>   	dt_irlist_t *dlp = &pcb->pcb_ir;
> -	uint_t off = pcb->pcb_bufoff;
> +	uint_t off;
>   
>   	dt_cg_node(dnp->dn_args, &pcb->pcb_ir, pcb->pcb_regs);
>   
> +	off = dt_rec_add(pcb->pcb_hdl, dt_cg_fill_gap, DTRACEACT_COMMIT,
> +			 sizeof(uint64_t), sizeof(uint64_t), NULL,
> +			 DT_ACT_COMMIT);
> +
>   	instr = BPF_STORE(BPF_DW, BPF_REG_9, off, BPF_REG_0);	/* FIXME */
>   	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> -
> -	return sizeof(uint64_t);
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_denormalize(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
>   	dt_node_t *anp;
> @@ -292,8 +320,6 @@ dt_cg_act_denormalize(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   	 * DEPENDS ON: How aggregations are implemented using eBPF (hashmap?).
>   	 * AGGID = aid->di_id
>   	 */
> -
> -	return 0;
>   }
>   
>   /*
> @@ -303,19 +329,21 @@ dt_cg_act_denormalize(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>    * Stores:
>    *	integer (4 bytes)		-- speculation ID
>    */
> -static size_t
> +static void
>   dt_cg_act_discard(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	struct bpf_insn instr;
> -	dt_irlist_t *dlp = &pcb->pcb_ir;
> -	uint_t off = pcb->pcb_bufoff;
> +	struct bpf_insn	instr;
> +	dt_irlist_t	*dlp = &pcb->pcb_ir;
> +	uint_t		off;
>   
>   	dt_cg_node(dnp->dn_args, &pcb->pcb_ir, pcb->pcb_regs);
>   
> +	off = dt_rec_add(pcb->pcb_hdl, dt_cg_fill_gap, DTRACEACT_DISCARD,
> +			 sizeof(uint64_t), sizeof(uint64_t), NULL,
> +			 DT_ACT_DISCARD);
> +
>   	instr = BPF_STORE(BPF_DW, BPF_REG_9, off, BPF_REG_0);	/* FIXME */
>   	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> -
> -	return sizeof(uint64_t);
>   }
>   
>   /*
> @@ -324,83 +352,74 @@ dt_cg_act_discard(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>    * Stores:
>    *	integer (4 bytes)		-- return code
>    */
> -static size_t
> +static void
>   dt_cg_act_exit(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	struct bpf_insn instr;
> -	dt_irlist_t *dlp = &pcb->pcb_ir;
> -	uint_t off = pcb->pcb_bufoff;
> +	struct bpf_insn	instr;
> +	dt_irlist_t	*dlp = &pcb->pcb_ir;
> +	uint_t		off;
>   
>   	dt_cg_node(dnp->dn_args, &pcb->pcb_ir, pcb->pcb_regs);
>   
> +	off = dt_rec_add(pcb->pcb_hdl, dt_cg_fill_gap, DTRACEACT_EXIT,
> +			 sizeof(uint64_t), sizeof(uint64_t), NULL,
> +			 DT_ACT_EXIT);
> +
>   	instr = BPF_STORE(BPF_DW, BPF_REG_9, off, BPF_REG_0);	/* FIXME */
>   	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> -
> -	return sizeof(uint64_t);
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_freopen(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_ftruncate(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_jstack(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_normalize(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_panic(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_pcap(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_printa(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_printf(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_raise(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
>   	dt_cg_node(dnp->dn_args, &pcb->pcb_ir, pcb->pcb_regs);
>   	dnerror(dnp, D_UNKNOWN, "raise() is not implemented (yet)\n");
>   	/* FIXME: Needs implementation */
> -
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_setopt(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
>   /*
> @@ -410,22 +429,24 @@ dt_cg_act_setopt(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>    * Stores:
>    *	integer (4 bytes)		-- speculation ID
>    */
> -static size_t
> +static void
>   dt_cg_act_speculate(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	struct bpf_insn instr;
> -	dt_irlist_t *dlp = &pcb->pcb_ir;
> -	uint_t off = pcb->pcb_bufoff;
> +	struct bpf_insn	instr;
> +	dt_irlist_t	*dlp = &pcb->pcb_ir;
> +	uint_t		off;
>   
>   	dt_cg_node(dnp->dn_args, &pcb->pcb_ir, pcb->pcb_regs);
>   
> +	off = dt_rec_add(pcb->pcb_hdl, dt_cg_fill_gap, DTRACEACT_SPECULATE,
> +			 sizeof(uint64_t), sizeof(uint64_t), NULL,
> +			 DT_ACT_SPECULATE);
> +
>   	instr = BPF_STORE(BPF_DW, BPF_REG_9, off, BPF_REG_0);	/* FIXME */
>   	dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> -
> -	return sizeof(uint64_t);
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_stack(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
>   	dt_node_t *arg = dnp->dn_args;
> @@ -439,38 +460,32 @@ dt_cg_act_stack(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   
>   		nframes = (uint32_t)arg->dn_value;
>   	}
> -
> -	return 0; /* FIXME */
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_stop(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
>   	dnerror(dnp, D_UNKNOWN, "stop() is not implemented (yet)\n");
>   	/* FIXME: Needs implementation */
> -
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_symmod(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_system(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_trace(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	struct bpf_insn instr;
> -	dt_irlist_t *dlp = &pcb->pcb_ir;
> -	char n[DT_TYPE_NAMELEN];
> -	uint_t off = pcb->pcb_bufoff;
> +	struct bpf_insn	instr;
> +	dt_irlist_t	*dlp = &pcb->pcb_ir;
> +	char		n[DT_TYPE_NAMELEN];
> +	uint_t		off;
>   
>   	if (dt_node_is_void(dnp->dn_args)) {
>   		dnerror(dnp->dn_args, D_TRACE_VOID,
> @@ -486,11 +501,14 @@ dt_cg_act_trace(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   	dt_cg_node(dnp->dn_args, &pcb->pcb_ir, pcb->pcb_regs);
>   
>   	if (dt_node_is_scalar(dnp->dn_args)) {
> +		off = dt_rec_add(pcb->pcb_hdl, dt_cg_fill_gap,
> +				 DTRACEACT_DIFEXPR, sizeof(uint64_t),
> +				 sizeof(uint64_t), NULL, DT_ACT_TRACE);
> +
>   		instr = BPF_STORE(BPF_DW, BPF_REG_9, off, dnp->dn_args->dn_reg);
>   		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
>   		dt_regset_free(pcb->pcb_regs, dnp->dn_args->dn_reg);
> -
> -		return sizeof(uint64_t);
> +#if 0
>   	} else if (dt_node_is_string(dnp->dn_args)) {
>   		size_t sz = dt_node_type_size(dnp->dn_args);
>   
> @@ -513,28 +531,25 @@ dt_cg_act_trace(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   		dt_regset_free(pcb->pcb_regs, dnp->dn_args->dn_reg);
>   
>   		return sz + sizeof(uint64_t);
> +#endif
>   	} else
>   		dnerror(dnp->dn_args, D_PROTO_ARG,
>   			"trace( ) argument #1 is incompatible with prototype:\n"
>   			"\tprototype: scalar or string\n\t argument: %s\n",
>   			dt_node_type_name(dnp->dn_args, n, sizeof (n)));
> -
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_tracemem(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_trunc(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
> -	return 0;
>   }
>   
> -static size_t
> +static void
>   dt_cg_act_ustack(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   {
>   	dt_node_t *arg0 = dnp->dn_args;
> @@ -562,11 +577,9 @@ dt_cg_act_ustack(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>   
>   		strsize = (uint32_t)arg1->dn_value;
>   	}
> -
> -	return 0; /* FIXME */
>   }
>   
> -typedef size_t dt_cg_action_f(dt_pcb_t *, dt_node_t *, dtrace_actkind_t);
> +typedef void dt_cg_action_f(dt_pcb_t *, dt_node_t *, dtrace_actkind_t);
>   
>   typedef struct dt_cg_actdesc {
>   	dt_cg_action_f *fun;
> @@ -2719,13 +2732,9 @@ dt_cg(dt_pcb_t *pcb, dt_node_t *dnp)
>   
>   				idp = act->dn_expr->dn_ident;
>   				actdp = &_dt_cg_actions[DT_ACT_IDX(idp->di_id)];
> -				if (actdp->fun) {
> -					size_t	sz = 0;
> -
> -					sz = actdp->fun(pcb, act->dn_expr,
> -							actdp->kind);
> -					pcb->pcb_bufoff += sz;
> -				}
> +				if (actdp->fun)
> +					actdp->fun(pcb, act->dn_expr,
> +						   actdp->kind);
>   			} else {
>   				dt_cg_node(act->dn_expr, &pcb->pcb_ir,
>   					   pcb->pcb_regs);
> diff --git a/libdtrace/dt_consume.c b/libdtrace/dt_consume.c
> index 62e8e9cb..a376ecd6 100644
> --- a/libdtrace/dt_consume.c
> +++ b/libdtrace/dt_consume.c
> @@ -366,7 +366,7 @@ dt_flowindent(dtrace_hdl_t *dtp, dtrace_probedata_t *data, dtrace_epid_t last,
>       dtrace_bufdesc_t *buf, size_t offs)
>   {
>   	dtrace_probedesc_t *pd = data->dtpda_pdesc, *npd;
> -	dtrace_eprobedesc_t *epd = data->dtpda_edesc, *nepd;
> +	dtrace_datadesc_t *dd = data->dtpda_ddesc, *ndd;
>   	dtrace_flowkind_t flow = DTRACEFLOW_NONE;
>   	const char *p = pd->prv;
>   	const char *n = pd->prb;
> @@ -376,7 +376,7 @@ dt_flowindent(dtrace_hdl_t *dtp, dtrace_probedata_t *data, dtrace_epid_t last,
>   	static const char *r_str[2] = { " <- ", " <= " };
>   	static const char *ent = "entry", *ret = "return";
>   	static int entlen = 0, retlen = 0;
> -	dtrace_epid_t next, id = epd->dtepd_epid;
> +	dtrace_epid_t next, id = data->dtpda_epid;
>   	int rval;
>   
>   	if (entlen == 0) {
> @@ -421,7 +421,7 @@ dt_flowindent(dtrace_hdl_t *dtp, dtrace_probedata_t *data, dtrace_epid_t last,
>   	 * _next_ EPID.
>   	 */
>   	if (flow == DTRACEFLOW_RETURN) {
> -		offs += epd->dtepd_size;
> +		offs += dd->dtdd_size;
>   
>   		do {
>   			if (offs >= buf->dtbd_size) {
> @@ -442,7 +442,7 @@ dt_flowindent(dtrace_hdl_t *dtp, dtrace_probedata_t *data, dtrace_epid_t last,
>   				offs += sizeof (id);
>   		} while (next == DTRACE_EPIDNONE);
>   
> -		if ((rval = dt_epid_lookup(dtp, next, &nepd, &npd)) != 0)
> +		if ((rval = dt_epid_lookup(dtp, next, &ndd, &npd)) != 0)
>   			return (rval);
>   
>   		if (next != id && npd->id == pd->id)
> @@ -1867,7 +1867,7 @@ dt_consume_cpu(dtrace_hdl_t *dtp, FILE *fp, int cpu, dtrace_bufdesc_t *buf,
>   
>   again:
>   	for (offs = start; offs < end; ) {
> -		dtrace_eprobedesc_t *epd;
> +		dtrace_datadesc_t *dd;
>   
>   		/*
>   		 * We're guaranteed to have an ID.
> @@ -1883,14 +1883,14 @@ again:
>   			continue;
>   		}
>   
> -		if ((rval = dt_epid_lookup(dtp, id, &data.dtpda_edesc,
> -		    &data.dtpda_pdesc)) != 0)
> +		if ((rval = dt_epid_lookup(dtp, id, &data.dtpda_ddesc,
> +					   &data.dtpda_pdesc)) != 0)
>   			return (rval);
>   
> -		epd = data.dtpda_edesc;
> +		dd = data.dtpda_ddesc;
>   		data.dtpda_data = buf->dtbd_data + offs;
>   
> -		if (data.dtpda_edesc->dtepd_uarg != DT_ECB_DEFAULT) {
> +		if (data.dtpda_ddesc->dtdd_uarg != DT_ECB_DEFAULT) {
>   			rval = dt_handle(dtp, &data);
>   
>   			if (rval == DTRACE_CONSUME_NEXT)
> @@ -1919,8 +1919,8 @@ again:
>   		if (rval != DTRACE_CONSUME_THIS)
>   			return (dt_set_errno(dtp, EDT_BADRVAL));
>   
> -		for (i = 0; i < epd->dtepd_nrecs; i++) {
> -			dtrace_recdesc_t *rec = &epd->dtepd_rec[i];
> +		for (i = 0; i < dd->dtdd_nrecs; i++) {
> +			dtrace_recdesc_t *rec = &dd->dtdd_recs[i];
>   			dtrace_actkind_t act = rec->dtrd_action;
>   
>   			data.dtpda_data = buf->dtbd_data + offs +
> @@ -1956,7 +1956,7 @@ again:
>   					continue;
>   
>   				case DT_ACT_NORMALIZE:
> -					if (i == epd->dtepd_nrecs - 1)
> +					if (i == dd->dtdd_nrecs - 1)
>   						return (dt_set_errno(dtp,
>   						    EDT_BADNORMAL));
>   
> @@ -1974,12 +1974,12 @@ again:
>   					caddr_t val;
>   					int rv;
>   
> -					if (i == epd->dtepd_nrecs - 1) {
> +					if (i == dd->dtdd_nrecs - 1) {
>   						return (dt_set_errno(dtp,
>   						    EDT_BADSETOPT));
>   					}
>   
> -					valrec = &epd->dtepd_rec[++i];
> +					valrec = &dd->dtdd_recs[++i];
>   					valsize = valrec->dtrd_size;
>   
>   					if (valrec->dtrd_action != act ||
> @@ -2009,7 +2009,7 @@ again:
>   				}
>   
>   				case DT_ACT_TRUNC:
> -					if (i == epd->dtepd_nrecs - 1)
> +					if (i == dd->dtdd_nrecs - 1)
>   						return (dt_set_errno(dtp,
>   						    EDT_BADTRUNC));
>   
> @@ -2109,7 +2109,7 @@ again:
>   				}
>   
>   				n = (*func)(dtp, fp, fmtdata, &data,
> -				    rec, epd->dtepd_nrecs - i,
> +				    rec, dd->dtdd_nrecs - i,
>   				    (uchar_t *)buf->dtbd_data + offs,
>   				    buf->dtbd_size - offs);
>   
> @@ -2126,7 +2126,7 @@ nofmt:
>   				dtrace_print_aggdata_t pd;
>   				dtrace_aggvarid_t *aggvars;
>   				int j, naggvars = 0;
> -				size_t size = ((epd->dtepd_nrecs - i) *
> +				size_t size = ((dd->dtdd_nrecs - i) *
>   				    sizeof (dtrace_aggvarid_t));
>   
>   				if ((aggvars = dt_alloc(dtp, size)) == NULL)
> @@ -2138,11 +2138,11 @@ nofmt:
>   				 * forward through the records until we find
>   				 * a record from a different statement.
>   				 */
> -				for (j = i; j < epd->dtepd_nrecs; j++) {
> +				for (j = i; j < dd->dtdd_nrecs; j++) {
>   					dtrace_recdesc_t *nrec;
>   					caddr_t naddr;
>   
> -					nrec = &epd->dtepd_rec[j];
> +					nrec = &dd->dtdd_recs[j];
>   
>   					if (nrec->dtrd_uarg != rec->dtrd_uarg)
>   						break;
> @@ -2191,7 +2191,7 @@ nofmt:
>   
>   			if (act == DTRACEACT_TRACEMEM) {
>   				n = dt_print_tracemem(dtp, fp, rec,
> -				    epd->dtepd_nrecs - i,
> +				    dd->dtdd_nrecs - i,
>   				    buf->dtbd_data + offs);
>   
>   				if (n < 0)
> @@ -2253,7 +2253,7 @@ nextrec:
>   		 */
>   		rval = (*rfunc)(&data, NULL, arg);
>   nextepid:
> -		offs += epd->dtepd_size;
> +		offs += dd->dtdd_size;
>   		last = id;
>   	}
>   
> @@ -2293,7 +2293,7 @@ dt_consume_one(dtrace_hdl_t *dtp, FILE *fp, int cpu, char *buf,
>   	if (hdr->type == PERF_RECORD_SAMPLE) {
>   		char			*ptr = data;
>   		uint32_t		size, epid, tag;
> -		int			i, nrecs;
> +		int			i;
>   		dtrace_probedata_t	pdat;
>   
>   		/*
> @@ -2311,6 +2311,13 @@ dt_consume_one(dtrace_hdl_t *dtp, FILE *fp, int cpu, char *buf,
>   		if (ptr > buf + hdr->size)
>   			return -1;
>   
> +		/*
> +		 * Clear the probe data, and fill in data independent fields.
> +		 */
> +		memset(&pdat, 0, sizeof(pdat));
> +		pdat.dtpda_handle = dtp;
> +		pdat.dtpda_cpu = cpu;
> +
>   		size = *(uint32_t *)data;
>   		data += sizeof(size);
>   		ptr += sizeof(size) + size;
> @@ -2320,21 +2327,20 @@ dt_consume_one(dtrace_hdl_t *dtp, FILE *fp, int cpu, char *buf,
>   		data += sizeof(uint32_t);		/* skip padding */
>   		size -= sizeof(uint32_t);
>   
> -		epid = *(uint32_t *)data;
> -		data += sizeof(uint32_t);
> -		size -= sizeof(uint32_t);
> +		epid = ((uint32_t *)data)[0];
> +		tag = ((uint32_t *)data)[1];
>   
> -		tag = *(uint32_t *)data;
> -		data += sizeof(uint32_t);
> -		size -= sizeof(uint32_t);
> +		/*
> +		 * Fill in the epid and address of the epid in the buffer.  We
> +		 * need to pass this to the efunc.
> +		 */
> +		pdat.dtpda_epid = epid;
> +		pdat.dtpda_data = data;
>   
> -		memset(&pdat, 0, sizeof(pdat));
> -		pdat.dtpda_handle = dtp;
> -		pdat.dtpda_cpu = cpu;
> -		rval = dt_epid_lookup(dtp, epid, &pdat.dtpda_edesc,
> +		rval = dt_epid_lookup(dtp, epid, &pdat.dtpda_ddesc,
>   						 &pdat.dtpda_pdesc);
>   		if (rval != 0)
> -			return (rval);
> +			return rval;
>   
>   #if 0
>   		if (flow)
> @@ -2362,14 +2368,26 @@ dt_consume_one(dtrace_hdl_t *dtp, FILE *fp, int cpu, char *buf,
>   		/*
>   		 * FIXME: This code is temporary.
>   		 */
> -		nrecs = size / sizeof(uint64_t);
> -		for (i = 0; i < nrecs; i++) {
> -			int	n;
> +		for (i = 0; i < pdat.dtpda_ddesc->dtdd_nrecs; i++) {
> +			int			n;
> +			dtrace_recdesc_t	*rec;
> +
> +			rec = &pdat.dtpda_ddesc->dtdd_recs[i];
> +
> +			pdat.dtpda_data = data + rec->dtrd_offset;
> +			rval = (*rfunc)(&pdat, rec, arg);
> +
> +			if (rval == DTRACE_CONSUME_NEXT)
> +				continue;
> +
> +			if (rval == DTRACE_CONSUME_ABORT)
> +				return dt_set_errno(dtp, EDT_DIRABORT);
> +
> +			if (rval != DTRACE_CONSUME_THIS)
> +				return dt_set_errno(dtp, EDT_BADRVAL);
>   
>   			n = dt_printf(dtp, fp, quiet ? "%lld" : " %16lld",
> -				      *(uint64_t *)data);
> -			data += sizeof(uint64_t);
> -			size -= sizeof(uint64_t);
> +				      *(int64_t *)pdat.dtpda_data);
>   
>   			if (n < 0)
>   				return -1;
> diff --git a/libdtrace/dt_handle.c b/libdtrace/dt_handle.c
> index ead5e198..d567a265 100644
> --- a/libdtrace/dt_handle.c
> +++ b/libdtrace/dt_handle.c
> @@ -126,12 +126,12 @@ dtrace_handle_setopt(dtrace_hdl_t *dtp, dtrace_handle_setopt_f *hdlr,
>   }
>   
>   #define	DT_REC(type, ndx) *((type *)((uintptr_t)data->dtpda_data + \
> -    epd->dtepd_rec[(ndx)].dtrd_offset))
> +    dd->dtdd_recs[(ndx)].dtrd_offset))
>   
>   static int
>   dt_handle_err(dtrace_hdl_t *dtp, dtrace_probedata_t *data)
>   {
> -	dtrace_eprobedesc_t *epd = data->dtpda_edesc, *errepd;
> +	dtrace_datadesc_t *dd = data->dtpda_ddesc, *errdd;
>   	dtrace_probedesc_t *pd = data->dtpda_pdesc, *errpd;
>   	dtrace_errdata_t err;
>   	dtrace_epid_t epid;
> @@ -144,9 +144,9 @@ dt_handle_err(dtrace_hdl_t *dtp, dtrace_probedata_t *data)
>   	char *str;
>   	int len;
>   
> -	assert(epd->dtepd_uarg == DT_ECB_ERROR);
> +	assert(dd->dtdd_uarg == DT_ECB_ERROR);
>   
> -	if (epd->dtepd_nrecs != 5 || strcmp(pd->prv, "dtrace") != 0 ||
> +	if (dd->dtdd_nrecs != 5 || strcmp(pd->prv, "dtrace") != 0 ||
>   	    strcmp(pd->prb, "ERROR") != 0)
>   		return (dt_set_errno(dtp, EDT_BADERROR));
>   
> @@ -156,10 +156,10 @@ dt_handle_err(dtrace_hdl_t *dtp, dtrace_probedata_t *data)
>   	 */
>   	epid = (uint32_t)DT_REC(uint64_t, 0);
>   
> -	if (dt_epid_lookup(dtp, epid, &errepd, &errpd) != 0)
> +	if (dt_epid_lookup(dtp, epid, &errdd, &errpd) != 0)
>   		return (dt_set_errno(dtp, EDT_BADERROR));
>   
> -	err.dteda_edesc = errepd;
> +	err.dteda_ddesc = errdd;
>   	err.dteda_pdesc = errpd;
>   	err.dteda_cpu = data->dtpda_cpu;
>   	err.dteda_action = (int)DT_REC(uint64_t, 1);
> @@ -225,7 +225,7 @@ dt_handle_liberr(dtrace_hdl_t *dtp, const dtrace_probedata_t *data,
>   	char *str;
>   	int len;
>   
> -	err.dteda_edesc = data->dtpda_edesc;
> +	err.dteda_ddesc = data->dtpda_ddesc;
>   	err.dteda_pdesc = errpd;
>   	err.dteda_cpu = data->dtpda_cpu;
>   	err.dteda_action = -1;
> @@ -238,10 +238,10 @@ dt_handle_liberr(dtrace_hdl_t *dtp, const dtrace_probedata_t *data,
>   
>   	str = alloca(len);
>   
> -	snprintf(str, len, "error on enabled probe ID %u (ID %u: %s:%s:%s:%s): "
> -			   "%s\n",
> -		 data->dtpda_edesc->dtepd_epid, errpd->id, errpd->prv,
> -		 errpd->mod, errpd->fun, errpd->prb, faultstr);
> +	snprintf(str, len,
> +		 "error on enabled probe ID %u (ID %u: %s:%s:%s:%s): %s\n",
> +		 data->dtpda_epid, errpd->id, errpd->prv, errpd->mod,
> +		 errpd->fun, errpd->prb, faultstr);
>   
>   	err.dteda_msg = str;
>   
> @@ -443,10 +443,10 @@ dt_handle_setopt(dtrace_hdl_t *dtp, dtrace_setoptdata_t *data)
>   int
>   dt_handle(dtrace_hdl_t *dtp, dtrace_probedata_t *data)
>   {
> -	dtrace_eprobedesc_t *epd = data->dtpda_edesc;
> +	dtrace_datadesc_t *dd = data->dtpda_ddesc;
>   	int rval;
>   
> -	switch (epd->dtepd_uarg) {
> +	switch (dd->dtdd_uarg) {
>   	case DT_ECB_ERROR:
>   		rval = dt_handle_err(dtp, data);
>   		break;
> diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
> index c20d30c9..536b1745 100644
> --- a/libdtrace/dt_impl.h
> +++ b/libdtrace/dt_impl.h
> @@ -290,7 +290,7 @@ struct dtrace_hdl {
>   	ctf_id_t dt_type_usymaddr; /* cached CTF ident. for _usymaddr type */
>   	dtrace_epid_t dt_nextepid; /* next enabled probe ID to assign */
>   	size_t dt_maxprobe;	/* max enabled probe ID */
> -	dtrace_eprobedesc_t **dt_edesc; /* enabled probe descriptions */
> +	dtrace_datadesc_t **dt_ddesc; /* probe data descriptions */
>   	dtrace_probedesc_t **dt_pdesc; /* probe descriptions for enabled prbs */
>   	size_t dt_maxagg;	/* max aggregation ID */
>   	dtrace_aggdesc_t **dt_aggdesc; /* aggregation descriptions */
> @@ -709,10 +709,16 @@ extern int dt_aggregate_go(dtrace_hdl_t *);
>   extern int dt_aggregate_init(dtrace_hdl_t *);
>   extern void dt_aggregate_destroy(dtrace_hdl_t *);
>   
> -extern dtrace_epid_t dt_epid_add(dtrace_hdl_t *, dtrace_id_t, int);
> -extern int dt_epid_lookup(dtrace_hdl_t *, dtrace_epid_t,
> -    dtrace_eprobedesc_t **, dtrace_probedesc_t **);
> +extern dtrace_datadesc_t *dt_datadesc_create(dtrace_hdl_t *);
> +extern int dt_datadesc_finalize(dtrace_hdl_t *, dtrace_datadesc_t *);
> +extern dtrace_epid_t dt_epid_add(dtrace_hdl_t *, dtrace_datadesc_t *,
> +				 dtrace_id_t);
> +extern int dt_epid_lookup(dtrace_hdl_t *, dtrace_epid_t, dtrace_datadesc_t **,
> +			  dtrace_probedesc_t **);
>   extern void dt_epid_destroy(dtrace_hdl_t *);
> +typedef void (*dt_cg_gap_f)(dt_pcb_t *, int);
> +extern uint32_t dt_rec_add(dtrace_hdl_t *, dt_cg_gap_f, dtrace_actkind_t,
> +			   uint32_t, uint16_t, const char *, uint64_t);
>   extern int dt_aggid_lookup(dtrace_hdl_t *, dtrace_aggid_t, dtrace_aggdesc_t **);
>   extern void dt_aggid_destroy(dtrace_hdl_t *);
>   
> diff --git a/libdtrace/dt_map.c b/libdtrace/dt_map.c
> index 48ffa15d..af7451c6 100644
> --- a/libdtrace/dt_map.c
> +++ b/libdtrace/dt_map.c
> @@ -11,32 +11,95 @@
>   #include <assert.h>
>   
>   #include <dt_impl.h>
> +#include <dt_pcb.h>
>   #include <dt_probe.h>
>   #include <dt_printf.h>
>   
> +static void
> +dt_datadesc_hold(dtrace_datadesc_t *ddp)
> +{
> +	ddp->dtdd_refcnt++;
> +}
> +
> +static void
> +dt_datadesc_release(dtrace_hdl_t *dtp, dtrace_datadesc_t *ddp)
> +{
> +	if (--ddp->dtdd_refcnt > 0)
> +		return;
> +
> +	dt_free(dtp, ddp->dtdd_recs);
> +	dt_free(dtp, ddp);
> +}
> +
> +dtrace_datadesc_t *
> +dt_datadesc_create(dtrace_hdl_t *dtp)
> +{
> +	dtrace_datadesc_t	*ddp;
> +
> +	ddp = dt_zalloc(dtp, sizeof(dtrace_datadesc_t));
> +	if (ddp == NULL) {
> +		dt_set_errno(dtp, EDT_NOMEM);
> +		return NULL;
> +	}
> +
> +	dt_datadesc_hold(ddp);
> +
> +	return ddp;
> +}
> +
> +int
> +dt_datadesc_finalize(dtrace_hdl_t *dtp, dtrace_datadesc_t *ddp)
> +{
> +	dt_pcb_t	*pcb = dtp->dt_pcb;
> +
> +	/*
> +	 * If the number of allocated data records is greater than the actual
> +	 * number needed, we create a copy with the right number of records and
> +	 * free the oversized one.
> +	 */
> +	if (pcb->pcb_nrecs < pcb->pcb_maxrecs) {
> +		dtrace_recdesc_t	*nrecs;
> +
> +		nrecs = dt_calloc(dtp, pcb->pcb_nrecs,
> +				  sizeof(dtrace_recdesc_t));
> +		if (nrecs == NULL)
> +			return dt_set_errno(dtp, EDT_NOMEM);
> +
> +		memcpy(nrecs, ddp->dtdd_recs,
> +		       pcb->pcb_nrecs * sizeof(dtrace_recdesc_t));
> +		dt_free(dtp, ddp->dtdd_recs);
> +		ddp->dtdd_recs = nrecs;
> +		pcb->pcb_maxrecs = pcb->pcb_nrecs;
> +	}
> +
> +	ddp->dtdd_nrecs = pcb->pcb_nrecs;
> +
> +	return 0;
> +}
> +
>   #if 0
>   static int
>   dt_epid_add(dtrace_hdl_t *dtp, dtrace_epid_t id)
>   {
>   	dtrace_id_t max;
>   	int rval, i, maxformat;
> -	dtrace_eprobedesc_t *enabled, *nenabled;
> +	dtrace_datadesc_t *dd, *ndd;
>   	dtrace_probedesc_t *probe;
>   
>   	while (id >= (max = dtp->dt_maxprobe) || dtp->dt_pdesc == NULL) {
>   		dtrace_id_t new_max = max ? (max << 1) : 1;
>   		size_t nsize = new_max * sizeof (void *);
>   		dtrace_probedesc_t **new_pdesc;
> -		dtrace_eprobedesc_t **new_edesc;
> +		dtrace_datadesc_t **new_ddesc;
>   
>   		if ((new_pdesc = malloc(nsize)) == NULL ||
> -		    (new_edesc = malloc(nsize)) == NULL) {
> +		    (new_ddesc = malloc(nsize)) == NULL) {
>   			free(new_pdesc);
>   			return (dt_set_errno(dtp, EDT_NOMEM));
>   		}
>   
>   		memset(new_pdesc, 0, nsize);
> -		memset(new_edesc, 0, nsize);
> +		memset(new_ddesc, 0, nsize);
>   
>   		if (dtp->dt_pdesc != NULL) {
>   			size_t osize = max * sizeof (void *);
> @@ -44,69 +107,68 @@ dt_epid_add(dtrace_hdl_t *dtp, dtrace_epid_t id)
>   			memcpy(new_pdesc, dtp->dt_pdesc, osize);
>   			free(dtp->dt_pdesc);
>   
> -			memcpy(new_edesc, dtp->dt_edesc, osize);
> -			free(dtp->dt_edesc);
> +			memcpy(new_ddesc, dtp->dt_ddesc, osize);
> +			free(dtp->dt_ddesc);
>   		}
>   
>   		dtp->dt_pdesc = new_pdesc;
> -		dtp->dt_edesc = new_edesc;
> +		dtp->dt_ddesc = new_ddesc;
>   		dtp->dt_maxprobe = new_max;
>   	}
>   
>   	if (dtp->dt_pdesc[id] != NULL)
>   		return (0);
>   
> -	if ((enabled = malloc(sizeof (dtrace_eprobedesc_t))) == NULL)
> +	if ((dd = malloc(sizeof (dtrace_datadesc_t))) == NULL)
>   		return (dt_set_errno(dtp, EDT_NOMEM));
>   
> -	memset(enabled, 0, sizeof (dtrace_eprobedesc_t));
> -	enabled->dtepd_epid = id;
> -	enabled->dtepd_nrecs = 1;
> +	memset(dd, 0, sizeof (dtrace_datadesc_t));
> +	dd->dtdd_epid = id;
> +	dd->dtdd_nrecs = 1;
>   
> -	if (dt_ioctl(dtp, DTRACEIOC_EPROBE, enabled) == -1) {
> +	if (dt_ioctl(dtp, DTRACEIOC_EPROBE, dd) == -1) {
>   		rval = dt_set_errno(dtp, errno);
> -		free(enabled);
> +		free(dd);
>   		return (rval);
>   	}
>   
> -	if (DTRACE_SIZEOF_EPROBEDESC(enabled) != sizeof (*enabled)) {
> +	if (DTRACE_SIZEOF_EPROBEDESC(dd) != sizeof (*dd)) {
>   		/*
>   		 * There must be more than one action.  Allocate the
>   		 * appropriate amount of space and try again.
>   		 */
> -		if ((nenabled =
> -		    malloc(DTRACE_SIZEOF_EPROBEDESC(enabled))) != NULL)
> -			memcpy(nenabled, enabled, sizeof (*enabled));
> +		if ((ndd = malloc(DTRACE_SIZEOF_EPROBEDESC(dd))) != NULL)
> +			memcpy(ndd, dd, sizeof (*dd));
>   
> -		free(enabled);
> +		free(dd);
>   
> -		if ((enabled = nenabled) == NULL)
> +		if ((dd = ndd) == NULL)
>   			return (dt_set_errno(dtp, EDT_NOMEM));
>   
> -		rval = dt_ioctl(dtp, DTRACEIOC_EPROBE, enabled);
> +		rval = dt_ioctl(dtp, DTRACEIOC_EPROBE, dd);
>   
>   		if (rval == -1) {
>   			rval = dt_set_errno(dtp, errno);
> -			free(enabled);
> +			free(dd);
>   			return (rval);
>   		}
>   	}
>   
>   	if ((probe = malloc(sizeof (dtrace_probedesc_t))) == NULL) {
> -		free(enabled);
> +		free(dd);
>   		return (dt_set_errno(dtp, EDT_NOMEM));
>   	}
>   
> -	probe->id = enabled->dtepd_probeid;
> +	probe->id = dd->dtdd_probeid;
>   
>   	if (dt_ioctl(dtp, DTRACEIOC_PROBES, probe) == -1) {
>   		rval = dt_set_errno(dtp, errno);
>   		goto err;
>   	}
>   
> -	for (i = 0; i < enabled->dtepd_nrecs; i++) {
> +	for (i = 0; i < dd->dtdd_nrecs; i++) {
>   		dtrace_fmtdesc_t fmt;
> -		dtrace_recdesc_t *rec = &enabled->dtepd_rec[i];
> +		dtrace_recdesc_t *rec = &dd->dtdd_rec[i];
>   
>   		if (!DTRACEACT_ISPRINTFLIKE(rec->dtrd_action))
>   			continue;
> @@ -173,7 +235,7 @@ dt_epid_add(dtrace_hdl_t *dtp, dtrace_epid_t id)
>   	}
>   
>   	dtp->dt_pdesc[id] = probe;
> -	dtp->dt_edesc[id] = enabled;
> +	dtp->dt_ddesc[id] = dd;
>   
>   	return (0);
>   
> @@ -184,70 +246,56 @@ err:
>   	 * we have already allocated.  This is okay; these formats are
>   	 * hanging off of dt_formats and will therefore not be leaked.
>   	 */
> -	free(enabled);
> +	free(dd);
>   	free(probe);
>   	return (rval);
>   }
>   #else
> +/*
> + * Associate a probe data description and probe description with an enabled
> + * probe ID.  This means that the given ID refers to the program matching the
> + * probe data description being attached to the probe that matches the probe
> + * description.
> + */
>   dtrace_epid_t
> -dt_epid_add(dtrace_hdl_t *dtp, dtrace_id_t prid, int nrecs)
> +dt_epid_add(dtrace_hdl_t *dtp, dtrace_datadesc_t *ddp, dtrace_id_t prid)
>   {
> -	dtrace_id_t		max;
> +	dtrace_id_t		max = dtp->dt_maxprobe;
>   	dtrace_epid_t		epid;
> -	dtrace_eprobedesc_t	*eprobe;
>   
>   	epid = dtp->dt_nextepid++;
> -	max = dtp->dt_maxprobe;
> -	if (epid >= max || dtp->dt_edesc == NULL) {
> +	if (epid >= max || dtp->dt_ddesc == NULL) {
>   		dtrace_id_t		nmax = max ? (max << 1) : 2;
> -		dtrace_eprobedesc_t	**nedesc;
> +		dtrace_datadesc_t	**nddesc;
>   		dtrace_probedesc_t	**npdesc;
>   
> -		nedesc = dt_calloc(dtp, nmax, sizeof(void *));
> +		nddesc = dt_calloc(dtp, nmax, sizeof(void *));
>   		npdesc = dt_calloc(dtp, nmax, sizeof(void *));
> -		if (nedesc == NULL || npdesc == NULL) {
> -			dt_free(dtp, nedesc);
> +		if (nddesc == NULL || npdesc == NULL) {
> +			dt_free(dtp, nddesc);
>   			dt_free(dtp, npdesc);
>   			return dt_set_errno(dtp, EDT_NOMEM);
>   		}
>   
> -		if (dtp->dt_edesc != NULL) {
> +		if (dtp->dt_ddesc != NULL) {
>   			size_t	osize = max * sizeof(void *);
>   
> -			memcpy(nedesc, dtp->dt_edesc, osize);
> -			dt_free(dtp, dtp->dt_edesc);
> +			memcpy(nddesc, dtp->dt_ddesc, osize);
> +			dt_free(dtp, dtp->dt_ddesc);
>   			memcpy(npdesc, dtp->dt_pdesc, osize);
>   			dt_free(dtp, dtp->dt_pdesc);
>   		}
>   
> -		dtp->dt_edesc = nedesc;
> +		dtp->dt_ddesc = nddesc;
>   		dtp->dt_pdesc = npdesc;
>   		dtp->dt_maxprobe = nmax;
>   	}
>   
> -	if (dtp->dt_edesc[epid] != NULL)
> +	if (dtp->dt_ddesc[epid] != NULL)
>   		return epid;
>   
> -	/*
> -	 * This is a bit odd, but we want to make sure that nrecs is not a
> -	 * negative value while also taking into consideration that there is
> -	 * always rooms for one record in dtrace_eprobedesc_t, so for our
> -	 * allocation purposes we account for that default record slot.
> -	 */
> -	assert(nrecs >= 0);
> -	if (nrecs == 0)
> -		nrecs = 1;
> -
> -	eprobe = dt_zalloc(dtp, sizeof(dtrace_eprobedesc_t) +
> -				(nrecs - 1) * sizeof(dtrace_recdesc_t));
> -	if (eprobe == NULL)
> -		return dt_set_errno(dtp, EDT_NOMEM);
> -
> -	eprobe->dtepd_epid = epid;
> -	eprobe->dtepd_probeid = prid;
> -	eprobe->dtepd_nrecs = nrecs;
> -
> -	dtp->dt_edesc[epid] = eprobe;
> +	dt_datadesc_hold(ddp);
> +	dtp->dt_ddesc[epid] = ddp;
>   	dtp->dt_pdesc[epid] = (dtrace_probedesc_t *)dtp->dt_probes[prid]->desc;
>   
>   	return epid;
> @@ -255,22 +303,14 @@ dt_epid_add(dtrace_hdl_t *dtp, dtrace_id_t prid, int nrecs)
>   #endif
>   
>   int
> -dt_epid_lookup(dtrace_hdl_t *dtp, dtrace_epid_t epid,
> -    dtrace_eprobedesc_t **epdp, dtrace_probedesc_t **pdp)
> +dt_epid_lookup(dtrace_hdl_t *dtp, dtrace_epid_t epid, dtrace_datadesc_t **ddp,
> +	       dtrace_probedesc_t **pdp)
>   {
> -	int rval;
> -
> -#if 0
> -	if (epid >= dtp->dt_maxprobe || dtp->dt_pdesc[epid] == NULL) {
> -		if ((rval = dt_epid_add(dtp, epid)) != 0)
> -			return (rval);
> -	}
> -#endif
> -
>   	assert(epid < dtp->dt_maxprobe);
> -	assert(dtp->dt_edesc[epid] != NULL);
> +	assert(dtp->dt_ddesc[epid] != NULL);
>   	assert(dtp->dt_pdesc[epid] != NULL);
> -	*epdp = dtp->dt_edesc[epid];
> +
> +	*ddp = dtp->dt_ddesc[epid];
>   	*pdp = dtp->dt_pdesc[epid];
>   
>   	return (0);
> @@ -281,33 +321,84 @@ dt_epid_destroy(dtrace_hdl_t *dtp)
>   {
>   	size_t i;
>   
> -	assert((dtp->dt_pdesc != NULL && dtp->dt_edesc != NULL &&
> +	assert((dtp->dt_pdesc != NULL && dtp->dt_ddesc != NULL &&
>   	    dtp->dt_maxprobe > 0) || (dtp->dt_pdesc == NULL &&
> -	    dtp->dt_edesc == NULL && dtp->dt_maxprobe == 0));
> +	    dtp->dt_ddesc == NULL && dtp->dt_maxprobe == 0));
>   
>   	if (dtp->dt_pdesc == NULL)
>   		return;
>   
>   	for (i = 0; i < dtp->dt_maxprobe; i++) {
> -		if (dtp->dt_edesc[i] == NULL) {
> +		if (dtp->dt_ddesc[i] == NULL) {
>   			assert(dtp->dt_pdesc[i] == NULL);
>   			continue;
>   		}
>   
> +		dt_datadesc_release(dtp, dtp->dt_ddesc[i]);
>   		assert(dtp->dt_pdesc[i] != NULL);
> -		free(dtp->dt_edesc[i]);
> -		free(dtp->dt_pdesc[i]);
>   	}
>   
>   	free(dtp->dt_pdesc);
>   	dtp->dt_pdesc = NULL;
>   
> -	free(dtp->dt_edesc);
> -	dtp->dt_edesc = NULL;
> +	free(dtp->dt_ddesc);
> +	dtp->dt_ddesc = NULL;
>   	dtp->dt_nextepid = 0;
>   	dtp->dt_maxprobe = 0;
>   }
>   
> +uint32_t
> +dt_rec_add(dtrace_hdl_t *dtp, dt_cg_gap_f gapf, dtrace_actkind_t kind,
> +	   uint32_t size, uint16_t alignment, const char *fmt, uint64_t arg)
> +{
> +	dt_pcb_t		*pcb = dtp->dt_pcb;
> +	uint32_t		off;
> +	uint32_t		gap;
> +	dtrace_datadesc_t	*ddp = pcb->pcb_stmt->dtsd_ddesc;
> +	dtrace_recdesc_t	*rec;
> +	int			cnt, max;
> +
> +	assert(gapf);
> +
> +	cnt = pcb->pcb_nrecs + 1;
> +	max = pcb->pcb_maxrecs;
> +	if (cnt >= max) {
> +		int			nmax = max ? (max << 1) : cnt;
> +		dtrace_recdesc_t	*nrecs;
> +
> +		nrecs = dt_calloc(dtp, cnt, sizeof(dtrace_recdesc_t));
> +		if (nrecs == NULL)
> +			longjmp(pcb->pcb_jmpbuf, EDT_NOMEM);
> +
> +		if (ddp->dtdd_recs != NULL) {
> +			size_t	osize = max * sizeof(dtrace_recdesc_t);
> +
> +			memcpy(nrecs, ddp->dtdd_recs, osize);
> +			dt_free(dtp, ddp->dtdd_recs);
> +		}
> +
> +		ddp->dtdd_recs = nrecs;
> +		pcb->pcb_maxrecs = nmax;
> +	}
> +
> +	rec = &ddp->dtdd_recs[pcb->pcb_nrecs++];
> +	off = (pcb->pcb_bufoff + (alignment - 1)) & ~(alignment - 1);
> +	rec->dtrd_action = kind;
> +	rec->dtrd_size = size;
> +	rec->dtrd_offset = off;
> +	rec->dtrd_alignment = alignment;
> +	rec->dtrd_format = 0;	/* FIXME */
> +	rec->dtrd_arg = arg;
> +
> +	gap = off - pcb->pcb_bufoff;
> +	if (gap > 0)
> +		(*gapf)(pcb, gap);
> +
> +	pcb->pcb_bufoff = off + size;
> +
> +	return off;
> +}
> +
>   void *
>   dt_format_lookup(dtrace_hdl_t *dtp, int format)
>   {
> @@ -338,7 +429,6 @@ static int
>   dt_aggid_add(dtrace_hdl_t *dtp, dtrace_aggid_t id)
>   {
>   	dtrace_id_t max;
> -	dtrace_epid_t epid;
>   	int rval;
>   
>   	while (id >= (max = dtp->dt_maxagg) || dtp->dt_aggdesc == NULL) {
> @@ -438,8 +528,7 @@ dt_aggid_add(dtrace_hdl_t *dtp, dtrace_aggid_t id)
>   }
>   
>   int
> -dt_aggid_lookup(dtrace_hdl_t *dtp, dtrace_aggid_t aggid,
> -    dtrace_aggdesc_t **adp)
> +dt_aggid_lookup(dtrace_hdl_t *dtp, dtrace_aggid_t aggid, dtrace_aggdesc_t **adp)
>   {
>   	int rval;
>   
> diff --git a/libdtrace/dt_pcb.h b/libdtrace/dt_pcb.h
> index 8a73e7da..95d2fc8d 100644
> --- a/libdtrace/dt_pcb.h
> +++ b/libdtrace/dt_pcb.h
> @@ -41,7 +41,7 @@ typedef struct dt_pcb {
>   	dt_idhash_t *pcb_idents; /* current hash table of ambiguous idents */
>   	dt_idhash_t *pcb_pragmas; /* current hash table of pending pragmas */
>   	dt_regset_t *pcb_regs;	/* register set for code generation */
> -	uint_t pcb_bufoff;	/* output buffer offset (for DFUNCs) */
> +	uint32_t pcb_bufoff;	/* output buffer offset (for DFUNCs) */
>   	dt_irlist_t pcb_ir;	/* list of unrelocated IR instructions */
>   	uint_t pcb_exitlbl;	/* label for exit of program */
>   	uint_t pcb_asvidx;	/* assembler vartab index (see dt_as.c) */
> @@ -50,6 +50,8 @@ typedef struct dt_pcb {
>   	const dtrace_probedesc_t *pcb_pdesc; /* probedesc for current context */
>   	struct dt_probe *pcb_probe; /* probe associated with current context */
>   	dtrace_probeinfo_t pcb_pinfo; /* info associated with current context */
> +	int pcb_nrecs;		/* number of data record descriptions */
> +	int pcb_maxrecs;	/* alloc'd number of data record descriptions */
>   	dtrace_attribute_t pcb_amin; /* stability minimum for compilation */
>   	dt_node_t *pcb_dret;	/* node containing return type for assembler */
>   	dtrace_difo_t *pcb_difo; /* intermediate DIF object made by assembler */
> diff --git a/libdtrace/dt_program.c b/libdtrace/dt_program.c
> index 5555b176..b154668d 100644
> --- a/libdtrace/dt_program.c
> +++ b/libdtrace/dt_program.c
> @@ -220,17 +220,26 @@ dt_ecbdesc_create(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
>   dtrace_stmtdesc_t *
>   dtrace_stmt_create(dtrace_hdl_t *dtp, dtrace_ecbdesc_t *edp)
>   {
> -	dtrace_stmtdesc_t *sdp;
> +	dtrace_stmtdesc_t	*sdp;
> +	dtrace_datadesc_t	*ddp;
>   
> -	if ((sdp = dt_zalloc(dtp, sizeof (dtrace_stmtdesc_t))) == NULL)
> -		return (NULL);
> +	sdp = dt_zalloc(dtp, sizeof(dtrace_stmtdesc_t));
> +	if (sdp == NULL)
> +		return NULL;
> +
> +	ddp = dt_datadesc_create(dtp);
> +	if (ddp == NULL) {
> +		dt_free(dtp, sdp);
> +		return NULL;
> +	}
>   
>   	dt_ecbdesc_hold(edp);
>   	sdp->dtsd_ecbdesc = edp;
> +	sdp->dtsd_ddesc = ddp;
>   	sdp->dtsd_descattr = _dtrace_defattr;
>   	sdp->dtsd_stmtattr = _dtrace_defattr;
>   
> -	return (sdp);
> +	return sdp;
>   }
>   
>   dtrace_actdesc_t *
> @@ -346,6 +355,7 @@ dtrace_stmt_destroy(dtrace_hdl_t *dtp, dtrace_stmtdesc_t *sdp)
>   		dt_printf_destroy(sdp->dtsd_fmtdata);
>   
>   	dt_ecbdesc_release(dtp, sdp->dtsd_ecbdesc);
> +	dt_free(dtp, sdp->dtsd_ddesc);
>   	dt_free(dtp, sdp);
>   }
>   
> diff --git a/libdtrace/dtrace.h b/libdtrace/dtrace.h
> index 13844324..0b502f34 100644
> --- a/libdtrace/dtrace.h
> +++ b/libdtrace/dtrace.h
> @@ -135,6 +135,7 @@ typedef struct dtrace_stmtdesc {
>   	dtrace_ecbdesc_t *dtsd_ecbdesc;		/* ECB description */
>   	dtrace_actdesc_t *dtsd_action;		/* action list */
>   	dtrace_actdesc_t *dtsd_action_last;	/* last action in action list */
> +	dtrace_datadesc_t *dtsd_ddesc;		/* probe data description */
>   	void *dtsd_aggdata;			/* aggregation data */
>   	void *dtsd_fmtdata;			/* type-specific output data */
>   	void (*dtsd_callback)();		/* callback function for EPID */
> @@ -173,7 +174,8 @@ typedef enum {
>   
>   typedef struct dtrace_probedata {
>   	dtrace_hdl_t *dtpda_handle;		/* handle to DTrace library */
> -	dtrace_eprobedesc_t *dtpda_edesc;	/* enabled probe description */
> +	dtrace_epid_t dtpda_epid;		/* enabled probe ID */
> +	dtrace_datadesc_t *dtpda_ddesc;		/* probe data description */
>   	dtrace_probedesc_t *dtpda_pdesc;	/* probe description */
>   	unsigned int dtpda_cpu;			/* CPU for data */
>   	caddr_t dtpda_data;			/* pointer to raw data */
> @@ -262,7 +264,7 @@ extern dtrace_workstatus_t dtrace_work(dtrace_hdl_t *dtp, FILE *fp,
>   
>   typedef struct dtrace_errdata {
>   	dtrace_hdl_t *dteda_handle;		/* handle to DTrace library */
> -	dtrace_eprobedesc_t *dteda_edesc;	/* enabled probe inducing err */
> +	dtrace_datadesc_t *dteda_ddesc;		/* probe data inducing err */
>   	dtrace_probedesc_t *dteda_pdesc;	/* probe inducing error */
>   	unsigned int dteda_cpu;			/* CPU of error */
>   	int dteda_action;			/* action inducing error */
> @@ -355,7 +357,7 @@ extern int dtrace_handle_setopt(dtrace_hdl_t *dtp,
>   struct dtrace_aggdata {
>   	dtrace_hdl_t *dtada_handle;		/* handle to DTrace library */
>   	dtrace_aggdesc_t *dtada_desc;		/* aggregation description */
> -	dtrace_eprobedesc_t *dtada_edesc;	/* enabled probe description */
> +	dtrace_datadesc_t *dtada_ddesc;		/* probe data description */
>   	dtrace_probedesc_t *dtada_pdesc;	/* probe description */
>   	caddr_t dtada_data;			/* pointer to raw data */
>   	uint64_t dtada_normal;			/* the normal -- 1 for denorm */




More information about the DTrace-devel mailing list