[DTrace-devel] [PATCH 11/16] Unify common argument retrieval for providers
Eugene Loh
eugene.loh at oracle.com
Fri Mar 26 22:45:25 PDT 2021
On 3/23/21 10:24 PM, Kris Van Hees wrote:
> On Tue, Mar 23, 2021 at 07:06:10PM -0400, Eugene Loh wrote:
>> On 3/19/21 12:54 AM, Kris Van Hees wrote:
>>> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
>>> @@ -161,6 +170,103 @@ dt_cg_tramp_prologue(dt_pcb_t *pcb)
>>> dt_cg_tramp_prologue_act(pcb, DT_ACTIVITY_ACTIVE);
>>> }
>>>
>>> +/*
>>> + * Copy arguments from a dt_pt_regs structure referenced by the 'rp' argument.
>>> + *
>>> + * The caller must ensure that %r7 and %r8 contain the values set by the
>>> + * dt_cg_tramp_prologue*() functions.
>>> + */
>> This comment doesn't make complete sense, since the function actually
>> does not explicitly depend on %r8. It relies instead on argument rp.
>> So, do you want to eliminate the rp argument and use %r8 instead?
>> Another option is to pass in not only %r8 but also %r7, making clearer
>> what is going on (though making cloudier the assumptions about which
>> regs are being used where).
> Correct - it no longer depends on %r8 because the register set is found in
> different places in the different BPF contexts we need to work with.
I can't tell if we're on the same or different page here. The commit
message advertises that the tramp prologue will set %r7 and %r8 up
specially, and basically the whole point of this patch is that this is
what will make the function dt_cg_tramp_copy_args_from_regs() work --
that is, having %r7 and %r8 set up properly before entering this
function. The thing is, while the function explicitly relies on %r7 as
advertised, the callers pass %r8 in as the function argument rp. So,
this function no longer has explicit dependence on %r8. So, why does
its introductory comment still talk about %r8. E.g. the caller could
put the pointer to the BPF context in some other register (like %r6 or
something) and tell dt_cg_tramp_copy_args_from_regs() to use that
register instead.
>> In any case, referring to rp as a dt_pt_regs is confusing since this
>> pointer is otherwise referred to as a BPF context. I understand how
>> this pointer can be understood to point both to a BPF context and to a
>> dt_pt_regs, but one might as well just be consistent how one talks about
>> these things.
> Where is this 'rp' argument referred to as a BPF context? It is most
> definitely the register number that holds a pointer to the dt_pr_regs struct.
We pass %r8 in as rp, and %r8 is typically referred to as a pointer to
the BPF context.
The commit message says:
The trampoline prologue will now
set up %r7 as a pointer to the mstate and %r8 as a pointer to
the BPF
context.
This point is reiterated several times in the comments in dt_cg.c for
dt_cg_tramp_prologue_act().
So, for example, you can go to dt_prov_fbt.c, where the trampoline says:
dt_cg_tramp_prologue(pcb);
/*
* After the dt_cg_tramp_prologue() call, we have:
* // (%r7 = dctx->mst)
* // (%r8 = dctx->ctx)
*/
dt_cg_tramp_copy_args_from_regs(pcb, BPF_REG_8);
So, that second argument to dt_cg_tramp_copy_args_from_regs() -- namely,
rp -- is the "pointer to the BPF context."
>>> +void
>>> +dt_cg_tramp_copy_args_from_regs(dt_pcb_t *pcb, int rp)
>>> +{
>>> + dt_irlist_t *dlp = &pcb->pcb_ir;
>>> + int i;
>>> + uint_t lbl_store = dt_irlist_label(dlp);
>>> +
>>> + /*
>>> + * for (i = 0; i < PT_REGS_ARGC; i++)
>>> + * dctx->mst->argv[i] = PT_REGS_PARAMi((dt_pt_regs *)rp);
>>
>> Would make more sense if PARAMi -> ARGi.
> Well, no, it ought to be PT_REGS_BPF_ARGi(...) since that one is actually
> defined for this purpose.
I'm not sure what you're saying here. If you look at the code (right
below) that this comment is describing, both the actual emit(BPF_*())
code and the pseudo-assembly in the comments have PT_REGS_ARG*.
>>> + * // lddw %r0, [%rp + PT_REGS_ARGi]
>>> + * // stdw [%r7 + DMST_ARG(i)], %r0
>>> + */
>>> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, rp, PT_REGS_ARG0));
>>> + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_0));
>>> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, rp, PT_REGS_ARG1));
>>> + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(1), BPF_REG_0));
>>> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, rp, PT_REGS_ARG2));
>>> + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(2), BPF_REG_0));
>>> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, rp, PT_REGS_ARG3));
>>> + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(3), BPF_REG_0));
>>> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, rp, PT_REGS_ARG4));
>>> + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(4), BPF_REG_0));
>>> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, rp, PT_REGS_ARG5));
>>> + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(5), BPF_REG_0));
>>> +#ifdef PT_REGS_ARG6
>>> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, rp, PT_REGS_ARG6));
>>> + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(6), BPF_REG_0));
>>> +#endif
>>> +#ifdef PT_REGS_ARG7
>>> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, rp, PT_REGS_ARG7));
>>> + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(7), BPF_REG_0));
>>> +#endif
>>> +
>>> + /*
>>> + * Generate code to read the remainder of the arguments from the stack
>>> + * (if possible). We (ab)use the %r0 spill slot on the stack to read
>>> + * values using bpf_probe_read() because we know that it cannot be in
>>> + * use at this point.
>>> + *
>>> + * for (i = PT_REGS_ARGC;
>>> + * i < ARRAY_SIZE(((dt_mstate_t *)0)->argv); i++) {
>>> + * uint64_t val, tmp;
>>> + * int rc;
>>> + * uint64_t *sp;
>>> + *
>>> + * sp = (uint64_t *)(((dt_pt_regs *)rp)->sp;
>>> + * rc = bpf_probe_read(&tmp, sizeof(tmp),
>>> + * &sp[i - PT_REGS_ARGC +
>>> + * PT_REGS_ARGSTKBASE]);
>>> + * // mov %r1, %fp
>>> + * // add %r1, DT_STK_SPILL(0)
>>> + * // mov %r2, sizeof(uint64_t)
>>> + * // lddw %r3, [%rp + PT_REGS_SP]
>>> + * // mov %r0, i - PT_REGS_ARGC +
>>> + * PT_REGS_ARGSTKBASE
>>> + * // mul %r0, sizeof(uint64_t)
>>> + * // add %r3, %r0
>>> + * // call bpf_probe_read
>>> + * val = 0;
>>> + * // mov %r1, 0
>>> + * if (rc != 0)
>>> + * goto store:
>>> + * // jne %r0, 0, lbl_store
>>> + * val = tmp;
>>> + * // lddw %r1, [%rp + DT_STK_SPILL(0)]
>>> + *
>>> + * store:
>>> + * dctx->mst->argv[i] = val;
>>> + * // store:
>>> + * // stdw [%r7 + DMST_ARG(i)], %r1
>>> + * }
>>> + *
>>> + */
>>> + for (i = PT_REGS_ARGC; i < ARRAY_SIZE(((dt_mstate_t *)0)->argv); i++) {
>>> + emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_FP));
>>> + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, DT_STK_SPILL(0)));
>>> + emit(dlp, BPF_MOV_IMM(BPF_REG_2, sizeof(uint64_t)));
>>> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_3, rp, PT_REGS_SP));
>>> + emit(dlp, BPF_MOV_IMM(BPF_REG_0, i - PT_REGS_ARGC + PT_REGS_ARGSTKBASE));
>>> + emit(dlp, BPF_ALU64_IMM(BPF_MUL, BPF_REG_0, sizeof(uint64_t)));
>>> + emit(dlp, BPF_ALU64_REG(BPF_ADD, BPF_REG_3, BPF_REG_0));
>>> + emit(dlp, BPF_CALL_HELPER(BPF_FUNC_probe_read));
>>> + emit(dlp, BPF_MOV_IMM(BPF_REG_1, 1234));
>>> + emit(dlp, BPF_BRANCH_IMM(BPF_JNE, BPF_REG_0, 0, lbl_store));
>>> + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_1, BPF_REG_FP, DT_STK_SPILL(0)));
>>> + emitl(dlp, lbl_store,
>>> + BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(i), BPF_REG_1));
>>> + }
>>> +}
>>> +
>> As you note elsewhere, the 1234 is wrong.
>>
>> You also have these instructions:
>> BPF_MOV_IMM(BPF_REG_0, i - PT_REGS_ARGC + PT_REGS_ARGSTKBASE);
>> BPF_ALU64_IMM(BPF_MUL, BPF_REG_0, sizeof(uint64_t));
>> BPF_ALU64_REG(BPF_ADD, BPF_REG_3, BPF_REG_0);
>> How about
>> BPF_ALU64_IMM(BPF_ADD, BPF_REG_3,
>> (i - PT_REGS_ARGC + PT_REGS_ARGSTKBASE) *
>> sizeof(uint64_t));
>> to save two instructions inside this loop. The expression could even be
>> simplified a little by redefining PT_REGS_ARGSTKBASE since it is
>> currently only ever used in the combination PT_REGS_ARGSTKBASE -
>> PT_REGS_ARGC anyhow.
> Yes, we can save two instructions there - that makes sense. Redefining
> PT_REGS_ARGSTKBASE doies not seem like a good idea because then you end
> up making this expression slightly less clear. Though perhaps *BASE might
> be better named *BIAS. Not sure which is more clear. Thoughts?
Not a big deal and no strong thoughts. I'm not used to thinking about
this stuff. I guess *BIAS is fine, having the longer expression is
fine, and it's going to take some brainpower to track all this anyhow.
My main concern was the two instructions.
>>> typedef struct {
>>> dt_irlist_t *dlp;
>>> dt_activity_t act;
>>> diff --git a/libdtrace/dt_pt_regs.h b/libdtrace/dt_pt_regs.h
>>> index 72e065c1..0bbb1be2 100644
>>> --- a/libdtrace/dt_pt_regs.h
>>> +++ b/libdtrace/dt_pt_regs.h
>>> @@ -26,7 +26,10 @@ typedef struct pt_regs dt_pt_regs;
>>> # define PT_REGS_ARG3 offsetof(dt_pt_regs, rcx)
>>> # define PT_REGS_ARG4 offsetof(dt_pt_regs, r8)
>>> # define PT_REGS_ARG5 offsetof(dt_pt_regs, r9)
>>> +# define PT_REGS_ARGC 6
>>> +# define PT_REGS_ARGSTKBASE 1
>>> # define PT_REGS_IP offsetof(dt_pt_regs, rip)
>>> +# define PT_REGS_SP offsetof(dt_pt_regs, rsp)
>>>
>>> # define PT_REGS_BPF_ARG0(r) ((r)->rdi)
>>> # define PT_REGS_BPF_ARG1(r) ((r)->rsi)
>>> @@ -35,6 +38,7 @@ typedef struct pt_regs dt_pt_regs;
>>> # define PT_REGS_BPF_ARG4(r) ((r)->r8)
>>> # define PT_REGS_BPF_ARG5(r) ((r)->r9)
>>> # define PT_REGS_BPF_IP(r) ((r)->rip)
>>> +# define PT_REGS_BPF_SP(r) ((r)->rsp)
>>> #elif defined(__aarch64__)
>>> typedef struct user_pt_regs dt_pt_regs;
>>> # define PT_REGS_ARG0 offsetof(dt_pt_regs, regs[0])
>>> @@ -43,7 +47,12 @@ typedef struct user_pt_regs dt_pt_regs;
>>> # define PT_REGS_ARG3 offsetof(dt_pt_regs, regs[3])
>>> # define PT_REGS_ARG4 offsetof(dt_pt_regs, regs[4])
>>> # define PT_REGS_ARG5 offsetof(dt_pt_regs, regs[5])
>>> +# define PT_REGS_ARG6 offsetof(dt_pt_regs, regs[6])
>>> +# define PT_REGS_ARG7 offsetof(dt_pt_regs, regs[7])
>>> +# define PT_REGS_ARGC 8
>>> +# define PT_REGS_ARGSTKBASE 2
>>> # define PT_REGS_IP offsetof(dt_pt_regs, pc)
>>> +# define PT_REGS_SP offsetof(dt_pt_regs, sp)
>>>
>>> # define PT_REGS_BPF_ARG0(r) ((r)->regs[0])
>>> # define PT_REGS_BPF_ARG1(r) ((r)->regs[1])
>>> @@ -51,7 +60,10 @@ typedef struct user_pt_regs dt_pt_regs;
>>> # define PT_REGS_BPF_ARG3(r) ((r)->regs[3])
>>> # define PT_REGS_BPF_ARG4(r) ((r)->regs[4])
>>> # define PT_REGS_BPF_ARG5(r) ((r)->regs[5])
>>> +# define PT_REGS_BPF_ARG5(r) ((r)->regs[6])
>>> +# define PT_REGS_BPF_ARG5(r) ((r)->regs[7])
>>> # define PT_REGS_BPF_IP(r) ((r)->pc)
>>> +# define PT_REGS_BPF_SP(r) ((r)->sp)
>>> #else
>>> # error ISA not supported
>>> #endif
>> Actually, just remove all (that is, both new and pre-existing)
>> PT_REGS_BPF_* definitions since they are unused.
> For now I think keeping them is better because it helps with understanding
> the comment block about populating the argument array.
But we're not using PT_REGS_BPF_*. And the PT_REGS_BPF_* definitions
explain nothing more than the PT_REGS_* definitions right above them.
More information about the DTrace-devel
mailing list