[DTrace-devel] [PATCH] Ensure string constants that are too long are truncated correctly

Eugene Loh eugene.loh at oracle.com
Fri Nov 19 23:18:44 UTC 2021


I confess I don't really understand the code.

Wouldn't it be easier to hack the strtab to clean long strings up before 
loading the strtab into the BPF map in the first place? When we load the 
strtab, we already know STRSIZE.  So we can walk the strings.  If a 
length prefix is too long, we rewrite it and we write a NUL char at the 
right spot.  Then we would not need the BPF runtime code, which would 
conceivably be copying a string to tstring each of many probe firings.

It seems as though we're adding new uses for tstrings.  Does this mean 
that the number of tstrings has to be increased?  E.g., do we want to 
support all these cases?
     # dtrace -qn 'BEGIN {trace(strjoin(probename, probename)); exit(0)}'
     BEGINBEGIN
     # dtrace -qn 'BEGIN {trace(strjoin(probename, strjoin(probename, 
probename))); exit(0)}'
     BEGINBEGINBEGIN
     # dtrace -qn 'BEGIN {trace(strjoin(probename, strjoin(probename, 
strjoin(probename, probename)))); exit(0)}'
     dtrace: libdtrace/dt_cg.c:844: dt_cg_tstring_xalloc: Assertion `i < 
DT_TSTRING_SLOTS' failed.
FWIW, without this patch, things are better, but only some.  If you run 
the above scripts and keep growing them in the same pattern, without the 
patch you get:
     BEGINBEGIN
     BEGINBEGINBEGIN
     BEGINBEGINBEGINBEGIN
     BEGINBEGINBEGINBEGINBEGIN
     BEGINBEGINBEGINBEGINBEGINBEGIN
     BEGINBEGINBEGINBEGIN
and beyond that the BPF verifier complains.  [PS  Right now, I'm just 
reporting problems.  I'm happy to look into any of them.]

Why do we have DT_STRLEN_BYTES at all?  Alternatively, what decisions 
have to be made before dropping that prefix?  We seem to be doing a lot 
of work that is still supporting a length prefix that was introduced for 
now outmoded reasons.

Perhaps this is outside the scope of the present patch, but wouldn't it 
make sense to wrap the BPF instructions in
         dt_cg_tstring_alloc(yypcb, dnp);
         emit(dlp,  BPF_LOAD(BPF_DW, dnp->dn_reg, BPF_REG_FP, DT_STK_DCTX));
         emit(dlp,  BPF_LOAD(BPF_DW, dnp->dn_reg, dnp->dn_reg, DCTX_MEM));
         emit(dlp,  BPF_ALU64_IMM(BPF_ADD, dnp->dn_reg, 
dnp->dn_tstring->dn_value));
into dt_cg_tstring_alloc()?  These instructions are emitted each time 
dt_cg_tstring_alloc() is called.

Two more...

On 11/19/21 12:38 AM, Kris Van Hees via DTrace-devel wrote:
> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> @@ -4305,7 +4352,9 @@ dt_cg_node(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp)
>   		break;
>   	}
>   
> -	case DT_TOK_STRING:
> +	case DT_TOK_STRING: {
> +		size_t	size = yypcb->pcb_hdl->dt_options[DTRACEOPT_STRSIZE];

Since size is used only once, how about just inlining its expression 
where it is used?

> +
>   		if ((dnp->dn_reg = dt_regset_alloc(drp)) == -1)
>   			longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
>   
> @@ -4319,13 +4368,24 @@ dt_cg_node(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp)
>   			longjmp(yypcb->pcb_jmpbuf, EDT_STR2BIG);
>   
>   		/*
> -		 * Calculate the address of the string data in the 'strtab' BPF
> -		 * map.
> +		 * If the string constant is longer than the maximum string
> +		 * size, we create a truncated copy in a tstring and use the
> +		 * adress of the tstring.
> +		 *
> +		 * If the string constant is not longer than the maximum string
> +		 * size, we can just use the address of the string constant in
> +		 * the 'strtab' BPF  map.
>   		 */
> -		emit(dlp, BPF_LOAD(BPF_DW, dnp->dn_reg, BPF_REG_FP, DT_STK_DCTX));
> -		emit(dlp, BPF_LOAD(BPF_DW, dnp->dn_reg, dnp->dn_reg, DCTX_STRTAB));
> -		emit(dlp, BPF_ALU64_IMM(BPF_ADD, dnp->dn_reg, stroff));
> +		if (dt_node_type_size(dnp) > size) {

I think dt_node_type_size(dnp) includes the terminating NUL while size 
does not.  So don't we want to add 1 to the right side of the >?

> +			emit(dlp, BPF_MOV_IMM(dnp->dn_reg, stroff));
> +			dt_cg_strconst_to_tstring(dlp, drp, dnp);
> +		} else {
> +			emit(dlp, BPF_LOAD(BPF_DW, dnp->dn_reg, BPF_REG_FP, DT_STK_DCTX));
> +			emit(dlp, BPF_LOAD(BPF_DW, dnp->dn_reg, dnp->dn_reg, DCTX_STRTAB));
> +			emit(dlp, BPF_ALU64_IMM(BPF_ADD, dnp->dn_reg, stroff));
> +		}
>   		break;
> +	}
>   
>   	case DT_TOK_IDENT:
>   		/*



More information about the DTrace-devel mailing list