[DTrace-devel] [PATCH 01/10] Reduce register pressure in substr()

Eugene Loh eugene.loh at oracle.com
Fri Mar 18 20:31:59 UTC 2022


On 3/18/22 4:30 PM, Eugene Loh wrote:

> Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>
Er I mean:  Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
> with some questions and comments...
>
> On 3/18/22 3:04 PM, Kris Van Hees via DTrace-devel wrote:
>> Have substr() return the result string pointer.  This means that we can
>> delay allocating the register to hold the result until after the call to
>> dt_substr() has been made.
> Good idea but this raises the question whether we should be doing this 
> sort of thing more universally.  Or do we do this on a case-by-case 
> basis... "reactively"?  Is it worth having a test case for this instance?
>> Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
>> ---
>>   bpf/substr.S      |  3 ++-
>>   libdtrace/dt_cg.c | 29 ++++++++++++++++-------------
>>   2 files changed, 18 insertions(+), 14 deletions(-)
>>
>> diff --git a/bpf/substr.S b/bpf/substr.S
>> index 8d501faa..9bd2400d 100644
>> --- a/bpf/substr.S
>> +++ b/bpf/substr.S
>> @@ -6,7 +6,7 @@
>>   #define BPF_FUNC_probe_read_str    45
>>     /*
>> - * void dt_substr(char *dst, const char *src, int32_t idx, int32_t cnt,
>> + * char *dt_substr(char *dst, const char *src, int32_t idx, int32_t 
>> cnt,
>>    *          uint64_t argc)
>>    *
>>    * %r1 = dst, %r2 = src, %r3 = idx, %r4 = cnt, %r5 = argc
>> @@ -124,6 +124,7 @@ dt_substr :
>>                        *               &src[idx]);
>>                        */
>>   +    mov    %r0, %r9
>>       exit
>>     .Lempty:
>
> I confess I do not understand why there is no similar %r0=%r9 on the 
> .Lempty code path.  If we jump to .Lempty, %r0 is "not properly" set 
> and then we return.  The clause function then picks this %r0 up and 
> goes on its way.  That seems to me to be wrong, but I cannot 
> demonstrate an error as easily as I would have thought.  Is the 
> %r0=%r9 not needed on .Lempty?
>
>> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
>> index 6f95260c..2cf48994 100644
>> --- a/libdtrace/dt_cg.c
>> +++ b/libdtrace/dt_cg.c
>> @@ -3997,26 +3997,20 @@ dt_cg_subr_substr(dt_node_t *dnp, dt_irlist_t 
>> *dlp, dt_regset_t *drp)
>>           dt_cg_node(cnt, dlp, drp);
>>         /*
>> -     * Allocate the result register and associate it with a temporary
>> -     * string slot.
>> +     * Allocate a temporary string slot for the result.
>>        */
>> -    dnp->dn_reg = dt_regset_alloc(drp);
>> -    if (dnp->dn_reg == -1)
>> -        longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
>>       dt_cg_tstring_alloc(yypcb, dnp);
>>   -        emit(dlp,  BPF_LOAD(BPF_DW, dnp->dn_reg, BPF_REG_FP, 
>> DT_STK_DCTX));
>> -        emit(dlp,  BPF_LOAD(BPF_DW, dnp->dn_reg, dnp->dn_reg, 
>> DCTX_MEM));
>> -        emit(dlp,  BPF_ALU64_IMM(BPF_ADD, dnp->dn_reg, 
>> dnp->dn_tstring->dn_value));
>> -
>>       if (dt_regset_xalloc_args(drp) == -1)
>>           longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
>>   -    emit(dlp, BPF_MOV_REG(BPF_REG_1, dnp->dn_reg));
>> -    emit(dlp, BPF_MOV_REG(BPF_REG_2, str->dn_reg));
>> +        emit(dlp,  BPF_LOAD(BPF_DW, BPF_REG_1, BPF_REG_FP, 
>> DT_STK_DCTX));
>> +        emit(dlp,  BPF_LOAD(BPF_DW, BPF_REG_1, BPF_REG_1, DCTX_MEM));
>> +        emit(dlp,  BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 
>> dnp->dn_tstring->dn_value));
> Might as well change those spaces to tabs since this code is being 
> touched anyhow.
>> +    emit(dlp,  BPF_MOV_REG(BPF_REG_2, str->dn_reg));
>>       dt_regset_free(drp, str->dn_reg);
>>       dt_cg_tstring_free(yypcb, str);
>> -    emit(dlp, BPF_MOV_REG(BPF_REG_3, idx->dn_reg));
>> +    emit(dlp,  BPF_MOV_REG(BPF_REG_3, idx->dn_reg));
>>       dt_regset_free(drp, idx->dn_reg);
>>       if (cnt != NULL) {
>>           emit(dlp, BPF_MOV_REG(BPF_REG_4, cnt->dn_reg));
>> @@ -4030,8 +4024,17 @@ dt_cg_subr_substr(dt_node_t *dnp, dt_irlist_t 
>> *dlp, dt_regset_t *drp)
>>       idp = dt_dlib_get_func(yypcb->pcb_hdl, "dt_substr");
>>       assert(idp != NULL);
>>       emite(dlp,  BPF_CALL_FUNC(idp->di_id), idp);
>> -    dt_regset_free_args(drp);
>> +
>> +    /*
>> +     * Allocate the result register, and assign the result to it..
>> +     */
>> +    dnp->dn_reg = dt_regset_alloc(drp);
>> +    if (dnp->dn_reg == -1)
>> +        longjmp(yypcb->pcb_jmpbuf, EDT_NOREG);
>> +
>> +    emit(dlp, BPF_MOV_REG(dnp->dn_reg, BPF_REG_0));
>>       dt_regset_free(drp, BPF_REG_0);
>> +    dt_regset_free_args(drp);
> I think the dt_regset_free_args(drp) call should remain tight up 
> against the BPF_CALL_FUNC().  After all, as soon as that function call 
> is made, the regs are toast.  Or, say dnp->dn_reg is assigned one of 
> %r1-%r5 and therefore spills that register.  (Does that cause the BPF 
> verifier to complain?)  Then we free_args() and fill the register, 
> overwriting what was just there.
>
>>       TRACE_REGSET("    subr-substr:End ");
>>   }



More information about the DTrace-devel mailing list