[DTrace-devel] [PATCH v3 03/08] Record CPU that BEGIN and END probes run on in the 'state' BPF map

Kris Van Hees kris.van.hees at oracle.com
Thu Sep 17 23:14:37 PDT 2020


On Thu, Sep 17, 2020 at 05:25:37PM -0700, Eugene Loh wrote:
> On 09/17/2020 03:06 PM, Kris Van Hees wrote:
> 
> > On Thu, Sep 17, 2020 at 02:50:46PM -0700, Eugene Loh wrote:
> >> On 09/16/2020 04:11 PM, Kris Van Hees wrote:
> >>> diff --git a/libdtrace/dt_prov_dtrace.c b/libdtrace/dt_prov_dtrace.c
> >>> @@ -102,6 +113,47 @@ static void trampoline(dt_pcb_t *pcb)
> >>>    
> >>>    	dt_cg_tramp_prologue_act(pcb, lbl_exit, act);
> >>>    
> >>> +	if (key) {
> >>> +		/*
> >>> +		 *     key = DT_STATE_BEGANON;
> >>> +		 *			// stw [%fp + DT_STK_SPILL(0)],
> >>> +		 *			//			DT_STATE_BEGANON
> >> Incidentally, there is something else I do not understand.  You use
> >> spill slots to stage arguments for function calls.  (E.g., something
> >> similar in staging the exit() action activity state.)  But don't we need
> >> to worry about if those spill slots are in use for some registers?
> > There is no code that can be using spill slots at this point because we're
> > generating the trampoline code.  At this point it is guaranteed that no
> > register spilling can have occured.
> 
> Fair enough, but I'd still like to understand the bigger picture.

OK, but let's not have that hold up the review of the patch.

> There are a number of times we want to look up or update map elements.  
> E.g.,
> 
> *) dt_cg_tramp_prologue_act() has to look up the activity state
> 
> *) dt_cg_tramp_prologue_act() has to look up mem
> 
> *) dt_cg_tramp_epilogue_advance() has to look up the activity state (for 
> the sneaky purpose of incrementing in-place)
> 
> *) dt_cg_act_exit() has to update the activity state
> 
> *) (dtrace provider) trampoline() has to update the activity state
> 
> In these cases, the keys (and values, for update) are known at code 
> generation but have to be passed to a helper function by reference. So 
> we have to write them somewhere and pass the address(es) of where we 
> wrote them.  Apparently, the selection of that temporary memory is made 
> differently at different times.  That is,
> 
> *) dt_cg_tramp_*() uses %fp+DCTX_FP(DCTX_MST)
> 
> *) dt_cg_act_exit() uses %fp+DT_STK_SPILL(*)
> 
> *) (dtrace provider) trampoline() uses %fp+DT_STK_SPILL(*)

Yes.  Unfortunately, BPF requires key and value to come from the stack
(actually, it *is* possible that the value might not be limited to just the
stack - I cannot remember offhand right now but that is not relevant here).
The difference in where we place things on the stack (of course) depends on
the stack layout for the function we're executing in.

The locations chosen for the different cases are deliberate choices because 
they are guaranteed to be safe.

> Two things make me uncomfortable:
> 
> *)  We're expecting the cg programmer to get creative and find a place 
> to write these temporary values each time.  The programmer has to think, 
> "What space is not being used at this moment?"

Well yes, that is essentially the joy of low level (assembler level)
programming.  We don't have higher level constructs at our disposal, so we
need to know our stack layout and make use of it.

> *)  I'm not sure whether it's okay for exit() to be using those spill 
> slots.  Maybe it's fine, but in any case the reasoning for "trampoline" 
> does not apply for exit().

Actually, similar reasoning can be applied here.  Each action is a statement
rather than an expression so there is no residual register allocation after an
action has generated its code.  The way the DTrace compiler works, actions are
generated as independent compilation units and thus their boundary implies that
there cannot be any spilled registers in existance.  The code that generates
the argument to the exit() action cannot have side effects that result in any
register remaining allocated aside from the register that holds the final
value, and there cannot be any spilled registers left one the expression has
been generated.  As such, it is safe to use spill slots here.
 
> Do those concerns make sense?

The concerns make sense when you are not yet familiar with this code and with
how DTrace's D compiler operates.  Were we to rewrite it from scratch, things
might start looking different but then we'd be rewriting the code generation
code as well.

I am looking at a more uniform way to handle map access in terms of where to
store the key and value because I don't like having to think about the stack
layout all the time either.  But that is still a work in progress because

1. it is never going to apply to all cases
2. it may result in needing to pre-allocate slots on the stack, and we don't
   have much stack space to play to begin with (and it is a hard limit - not
   a per-frame limit)

> >>> +		 *     val = bpf_get_smp_processor_id();
> >>> +		 *			// call bpf_get_smp_processor_id
> >>> +		 *			// stw [%fp + DT_STK_SPILL(1)], %r0
> >>> +		 *     bpf_map_update_elem(state, &key, &val, BPF_ANY);
> >>> +		 *			// lddw %r1, &state
> >>> +		 *			// mov %r2, %fp
> >>> +		 *			// add %r2, DT_STK_SPILL(0)
> >>> +		 *			// mov %r3, %fp
> >>> +		 *			// add %r3, DT_STK_SPILL(1)
> >>> +		 *			// mov %r4, BPF_ANY
> >>> +		 *			// call bpf_map_update_elem
> >>> +		 */
> >>> +		dt_ident_t	*state = dt_dlib_get_map(pcb->pcb_hdl, "state");
> >>> +
> >>> +		instr = BPF_STORE_IMM(BPF_W, BPF_REG_FP, DT_STK_SPILL(0), key);
> >>> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> >>> +		instr = BPF_CALL_HELPER(BPF_FUNC_get_smp_processor_id);
> >>> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> >>> +		instr = BPF_STORE(BPF_W, BPF_REG_FP, DT_STK_SPILL(1),
> >>> +				  BPF_REG_0);
> >>> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> >>> +		dt_cg_xsetx(dlp, state, DT_LBL_NONE, BPF_REG_1, state->di_id);
> >>> +		instr = BPF_MOV_REG(BPF_REG_2, BPF_REG_FP);
> >>> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> >>> +		instr = BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, DT_STK_SPILL(0));
> >>> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> >>> +		instr = BPF_MOV_REG(BPF_REG_3, BPF_REG_FP);
> >>> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> >>> +		instr = BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, DT_STK_SPILL(1));
> >>> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> >>> +		instr = BPF_MOV_IMM(BPF_REG_4, BPF_ANY);
> >>> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> >>> +		instr = BPF_CALL_HELPER(BPF_FUNC_map_update_elem);
> >>> +		dt_irlist_append(dlp, dt_cg_node_alloc(DT_LBL_NONE, instr));
> >>> +	}
> 
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list