[DTrace-devel] [PATCH 01/10] Reduce register pressure in substr()
Eugene Loh
eugene.loh at oracle.com
Fri Mar 18 20:30:13 UTC 2022
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>
with some questions and comments...
On 3/18/22 3:04 PM, Kris Van Hees via DTrace-devel wrote:
> Have substr() return the result string pointer. This means that we can
> delay allocating the register to hold the result until after the call to
> dt_substr() has been made.
Good idea but this raises the question whether we should be doing this
sort of thing more universally. Or do we do this on a case-by-case
basis... "reactively"? Is it worth having a test case for this instance?
> Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
> ---
> bpf/substr.S | 3 ++-
> libdtrace/dt_cg.c | 29 ++++++++++++++++-------------
> 2 files changed, 18 insertions(+), 14 deletions(-)
>
> diff --git a/bpf/substr.S b/bpf/substr.S
> index 8d501faa..9bd2400d 100644
> --- a/bpf/substr.S
> +++ b/bpf/substr.S
> @@ -6,7 +6,7 @@
> #define BPF_FUNC_probe_read_str 45
>
> /*
> - * void dt_substr(char *dst, const char *src, int32_t idx, int32_t cnt,
> + * char *dt_substr(char *dst, const char *src, int32_t idx, int32_t cnt,
> * uint64_t argc)
> *
> * %r1 = dst, %r2 = src, %r3 = idx, %r4 = cnt, %r5 = argc
> @@ -124,6 +124,7 @@ dt_substr :
> * &src[idx]);
> */
>
> + mov %r0, %r9
> exit
>
> .Lempty:
I confess I do not understand why there is no similar %r0=%r9 on the
.Lempty code path. If we jump to .Lempty, %r0 is "not properly" set and
then we return. The clause function then picks this %r0 up and goes on
its way. That seems to me to be wrong, but I cannot demonstrate an
error as easily as I would have thought. Is the %r0=%r9 not needed on
.Lempty?
> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> index 6f95260c..2cf48994 100644
> --- a/libdtrace/dt_cg.c
> +++ b/libdtrace/dt_cg.c
> @@ -3997,26 +3997,20 @@ dt_cg_subr_substr(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp)
> dt_cg_node(cnt, dlp, drp);
>
> /*
> - * Allocate the result register and associate it with a temporary
> - * string slot.
> + * Allocate a temporary string slot for the result.
> */
> - dnp->dn_reg = dt_regset_alloc(drp);
> - if (dnp->dn_reg == -1)
> - longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
> dt_cg_tstring_alloc(yypcb, dnp);
>
> - emit(dlp, BPF_LOAD(BPF_DW, dnp->dn_reg, BPF_REG_FP, DT_STK_DCTX));
> - emit(dlp, BPF_LOAD(BPF_DW, dnp->dn_reg, dnp->dn_reg, DCTX_MEM));
> - emit(dlp, BPF_ALU64_IMM(BPF_ADD, dnp->dn_reg, dnp->dn_tstring->dn_value));
> -
> if (dt_regset_xalloc_args(drp) == -1)
> longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
>
> - emit(dlp, BPF_MOV_REG(BPF_REG_1, dnp->dn_reg));
> - emit(dlp, BPF_MOV_REG(BPF_REG_2, str->dn_reg));
> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_1, BPF_REG_FP, DT_STK_DCTX));
> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_1, BPF_REG_1, DCTX_MEM));
> + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, dnp->dn_tstring->dn_value));
Might as well change those spaces to tabs since this code is being
touched anyhow.
> + emit(dlp, BPF_MOV_REG(BPF_REG_2, str->dn_reg));
> dt_regset_free(drp, str->dn_reg);
> dt_cg_tstring_free(yypcb, str);
> - emit(dlp, BPF_MOV_REG(BPF_REG_3, idx->dn_reg));
> + emit(dlp, BPF_MOV_REG(BPF_REG_3, idx->dn_reg));
> dt_regset_free(drp, idx->dn_reg);
> if (cnt != NULL) {
> emit(dlp, BPF_MOV_REG(BPF_REG_4, cnt->dn_reg));
> @@ -4030,8 +4024,17 @@ dt_cg_subr_substr(dt_node_t *dnp, dt_irlist_t *dlp, dt_regset_t *drp)
> idp = dt_dlib_get_func(yypcb->pcb_hdl, "dt_substr");
> assert(idp != NULL);
> emite(dlp, BPF_CALL_FUNC(idp->di_id), idp);
> - dt_regset_free_args(drp);
> +
> + /*
> + * Allocate the result register, and assign the result to it..
> + */
> + dnp->dn_reg = dt_regset_alloc(drp);
> + if (dnp->dn_reg == -1)
> + longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
> +
> + emit(dlp, BPF_MOV_REG(dnp->dn_reg, BPF_REG_0));
> dt_regset_free(drp, BPF_REG_0);
> + dt_regset_free_args(drp);
I think the dt_regset_free_args(drp) call should remain tight up against
the BPF_CALL_FUNC(). After all, as soon as that function call is made,
the regs are toast. Or, say dnp->dn_reg is assigned one of %r1-%r5 and
therefore spills that register. (Does that cause the BPF verifier to
complain?) Then we free_args() and fill the register, overwriting what
was just there.
> TRACE_REGSET(" subr-substr:End ");
> }
More information about the DTrace-devel
mailing list