[DTrace-devel] Reviewed: Do not hard-code the temp register in aggregation functions

Wed Nov 25 22:43:49 PST 2020

On Wed, Nov 25, 2020 at 10:14:03PM -0800, Eugene Loh wrote:
> On 11/25/2020 09:28 PM, Kris Van Hees wrote:
> 
> > On Wed, Nov 25, 2020 at 05:34:48PM -0800, Eugene Loh wrote:
> >> On 11/20/2020 02:15 PM, kris.van.hees at oracle.com wrote:
> >>> [Commit 15efba57c50081ecba70bc236cb25f619c3f1c48]
> >>>> Do not hard-code the temp register in aggregation functions
> >>>>
> >>>> A few aggregation functions were hard-coding the temp register
> >>>> to BPF_REG_0.  Let dt_regset_alloc() pick the register.
> >>> Actually, for code that needs to use a short-term temporary register, %r0
> >>> is an ideal choice as long as you do not make any function calls or call
> >>> dt_cg_node() to generate code for a subtree.  So, using it as a scratch
> >>> register is perfectly fine, and often preferred because it reduces the
> >>> pressure on the dynamically allocated registers.
> >> I'm confused.  Are you saying you're against this patch? Presumably, in
> >> these functions, we do not know which registers are currently in use.
> >> Presumably, that is why we explicitly dt_regset_xalloc(BPF_REG_0) here.
> >> (In contrast, there are places we assume we know what the register usage
> >> is, and we hardcode register usage without regset alloc/free at all.)
> >> So, it seems to me the code would be more robust if we ask
> >> regset_alloc() to give us a free register rather than specifying which
> >> register to give us.  That induces no register pressure since it's going
> >> to give us a register that's free.  Our reasoning about which registers
> >> are free could be faulty;  in contrast, regset knows which registers are
> >> free.
> > Yes, I am against this patch.  While we indeed have dynamic allocation of
> > registers using regset alloc, there are cases where we know that %r0 can be
> > use without conflict.  These are cases where we know %r0 will not be in use
> > prior to the code getting called,
> 
> Thanks for the explanations, but -- regardless of the fate of this 
> particular patch -- I'd like to understand these issues.
> 
> How is it that we know that %r0 will not be in use prior to the code 
> getting called?

Well, for one, because we wrote this code.  More specifically, the construct
<agg> = <aggfun>(...) is parsed into an aggregation node that has the
aggregtion function (and arguments) associated with it, and we know that this
node must be a top level node, i.e. a statement.  This means that before and
after this node there cannot be any registers in use (besides the reserved %r9
register).  (Caveat: in the future we may want to hold the value of variables
in registers across statement boundaries, but even then we would only be able
to do so for registers %r6 through %r8 due to the BPF function calling
convention.)

So, since we know that registers cannot be in use, we know that %r0 cannot be
in use.

> In any case, even if it is possible to know that %r0 is not in use 
> (depending on how smart the developer is, with you being smarter than 
> me), what is the problem with just using any free register? That just 
> seems like a safer, more robust programming practice, even if a smarter 
> developer like you is able to get by without such a safety net.  (The 
> you/me thing is just that *I* don't think I'll always assess correctly 
> whether %r0 is okay in some situation.  I don't even understand this 
> situation.)

The problem with using just any register is that not all registers are the
same.  The %r0 register is special (as return value register), %r1 through
%r5 are special (function arguments, and considered clobbered even if we do
not use all of them for function arguments), %r6 through %r8 are general
purpose, and %r9 is reserved by us as pointer to the output buffer.

It is therefore not a bad idea (and it is actually common practice) to use
%r0 for temporary values e.g. when a value needs to be provided in a
register like with the value in xadd.  It means that the rest of the register
set (allocated from higher to lower) is available for e.g. more complex
expressions.

> > and we do not call any other codegen
> > functions (which means that nothing else could possibly use %r0 as well).
> > While we could use a dynamically allocated register in those cases, it is
> > perfectly fine to use %r0 when we know it is safe.  It is an ideal scratch
> > register for temporary values, and that is a common use in BPF code actually.
> >
> > We register its use simply because that provides a concenient check that
> > indeed nothing should have reserved it prior to its use here,
> 
> I don't think it checks that %r0 is not reserved.  If %r0 is in use and 
> we specifically ask for it, then %r0 is spilled and we're allowed to use 
> it.  So, if a developer (presumably someone like me rather than someone 
> like you) makes a mistake and explicitly asks for %r0 when it's already 
> in use, we get unnecessary spills and fills.  In contrast, had we asked 
> for "a free register," the unnecessary spills are avoided.

OK, I expressed myself poorly here.  It provides *me* with a convenient
check that nothing has reserved it.  If %r0 is already in use indeed it will
be spilled to stack and that will be noted in the regset trace output.  That
is what I use to detect these issues.  And that is an issue because it means
that the prior code did not free %r0 as it should have.

> > and nothing
> > should reserve it until we free it.  It also ensures that it shows up in
> > reg usage tracing output.
> >
> > ANd yes, there are cases where we hardcode register usage without using
> > alloc/free at all, and as I have communicated already, those need to be
> > addressed soon to rectify that oversight.