[DTrace-devel] [PATCH] Fix size of string data in the trace output buffer

Eugene Loh eugene.loh at oracle.com
Tue Sep 7 21:49:12 PDT 2021


I guess this is okay, particularly since we'll probably be revisiting 
all this soon anyhow if/when the length prefix is removed.  So:
     Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
but questions/comments below.

To start with, isn't there a common case where the source is already a 
"limited" string, in which case one can just do a single straight copy, 
length prefix and all?

Also, does dt_cg_store_var() have the same issues?

On 9/7/21 8:37 PM, Kris Van Hees wrote:
> The size of string data items in the trace output buffer was not taking
> into account that we use a 2-byte length prefix and it also did not
> include the terminating NUL byte.
>
> Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
> ---
>   libdtrace/dt_cg.c                         | 27 +++++++++++++++--------
>   test/unittest/codegen/tst.str_data_size.d | 21 ++++++++++++++++++
>   test/unittest/codegen/tst.str_data_size.r |  4 ++++
>   3 files changed, 43 insertions(+), 9 deletions(-)
>   create mode 100644 test/unittest/codegen/tst.str_data_size.d
>   create mode 100644 test/unittest/codegen/tst.str_data_size.r
>
> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> index a8756581..acb7c6fa 100644
> --- a/libdtrace/dt_cg.c
> +++ b/libdtrace/dt_cg.c
> @@ -939,39 +939,48 @@ dt_cg_store_val(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind,
>   		int		reg = dt_regset_alloc(drp);
>   
>   		off = dt_rec_add(pcb->pcb_hdl, dt_cg_fill_gap, kind,
> -				 size, 1, pfp, arg);
> +				 size + DT_STRLEN_BYTES + 1, 1, pfp, arg);
>   
>   		/*
>   		 * Retrieve the length of the string, limit it to the maximum
> -		 * string size, and adjust for the terminating NUL byte and the
> -		 * length prefix.
> +		 * string size, and store it in the buffer at [%r9 + off].
>   		 */
>   		dt_cg_strlen(dlp, drp, reg, dnp->dn_reg);
>   		emit(dlp,  BPF_BRANCH_IMM(BPF_JLT, reg, size, size_ok));

Okay, though LT->LE probably makes more sense.

>   		emit(dlp,  BPF_MOV_IMM(reg, size));
>   		emitl(dlp, size_ok,
> -			   BPF_ALU64_IMM(BPF_ADD, reg, 1 + DT_STRLEN_BYTES));
> +			   BPF_MOV_REG(BPF_REG_0, reg));
> +		emit(dlp,  BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 8));
> +		emit(dlp,  BPF_STORE(BPF_B, BPF_REG_9, off, BPF_REG_0));
> +		emit(dlp,  BPF_STORE(BPF_B, BPF_REG_9, off + 1, reg));
>   
>   		/*
> -		 * Copy string data to the output buffer at [%r9 + off].  The
> -		 * amount of bytes copied is the lesser of the data size and
> -		 * the maximum string size.
> +		 * Copy the string to the output buffer at
> +		 * [%r9 + off + DT_STRLEN_BYTES] because we need to skip the
> +		 * length prefix.
>   		 */
>   		if (dt_regset_xalloc_args(drp) == -1)
>   			longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
>   
>   		emit(dlp,  BPF_MOV_REG(BPF_REG_1, BPF_REG_9));
> -		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, off));
> +		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, off + DT_STRLEN_BYTES));
>   		emit(dlp,  BPF_MOV_REG(BPF_REG_2, reg));
>   		dt_regset_free(drp, reg);
>   		emit(dlp,  BPF_MOV_REG(BPF_REG_3, dnp->dn_reg));
>   		dt_regset_free(drp, dnp->dn_reg);
> +		emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, DT_STRLEN_BYTES));
>   		if (dnp->dn_tstring)
>   			dt_cg_tstring_free(pcb, dnp);

How about adding the +DT_STRLEN_BYTES instruction before the 
regset_free(dnp->dn_reg)?  That way, instruction generation (for the two 
%r3 instructions) is consolidated and code-generator resource management 
(reg and tstring) is consolidated.

>   		dt_regset_xalloc(drp, BPF_REG_0);
>   		emit(dlp,  BPF_CALL_HELPER(BPF_FUNC_probe_read));
>   		dt_regset_free_args(drp);
> -		dt_regset_free(drp, BPF_REG_0);
> +
> +		/*
> +		 * Write the NUL terminating byte in case we are truncating.
> +		 */
> +		emit(dlp,  BPF_MOV_REG(BPF_REG_0, BPF_REG_9));
> +		emit(dlp,  BPF_ALU64_REG(BPF_ADD, BPF_REG_0, reg));
> +		emit(dlp,  BPF_STORE_IMM(BPF_B, BPF_REG_0, off + DT_STRLEN_BYTES, 0));

I sure do not understand the register management here.  Earlier, you use 
%r0 for writing the length prefix, but never xalloc'ing the register.  
Here, you xalloc(r0) but you removed the free(r0).  Isn't that wrong?  
But I tried iterating on this and it appears someone else is freeing the 
register?  Who?  I'm confused.

>   
>   		return 0;
>   	}
> diff --git a/test/unittest/codegen/tst.str_data_size.d b/test/unittest/codegen/tst.str_data_size.d
> new file mode 100644
> index 00000000..a928e829
> --- /dev/null
> +++ b/test/unittest/codegen/tst.str_data_size.d
> @@ -0,0 +1,21 @@
> +/*
> + * Oracle Linux DTrace.
> + * Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved.
> + * Licensed under the Universal Permissive License v 1.0 as shown at
> + * http://oss.oracle.com/licenses/upl.
> + */
> +
> +#pragma D option rawbytes
> +#pragma D option strsize=5
> +#pragma D option quiet
> +
> +BEGIN
> +{
> +	trace(probeprov);
> +	exit(0);
> +}
> +
> +ERROR
> +{
> +	exit(1);
> +}
> diff --git a/test/unittest/codegen/tst.str_data_size.r b/test/unittest/codegen/tst.str_data_size.r
> new file mode 100644
> index 00000000..60483d74
> --- /dev/null
> +++ b/test/unittest/codegen/tst.str_data_size.r
> @@ -0,0 +1,4 @@
> +
> +             0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
> +         0: 00 05 64 74 72 61 63 00                          ..dtrac.
> +



More information about the DTrace-devel mailing list