[DTrace-devel] Reviewed: Do not hard-code the temp register in aggregation functions
Kris Van Hees
kris.van.hees at oracle.com
Wed Nov 25 22:43:49 PST 2020
On Wed, Nov 25, 2020 at 10:14:03PM -0800, Eugene Loh wrote:
> On 11/25/2020 09:28 PM, Kris Van Hees wrote:
>
> > On Wed, Nov 25, 2020 at 05:34:48PM -0800, Eugene Loh wrote:
> >> On 11/20/2020 02:15 PM, kris.van.hees at oracle.com wrote:
> >>> [Commit 15efba57c50081ecba70bc236cb25f619c3f1c48]
> >>>> Do not hard-code the temp register in aggregation functions
> >>>>
> >>>> A few aggregation functions were hard-coding the temp register
> >>>> to BPF_REG_0. Let dt_regset_alloc() pick the register.
> >>> Actually, for code that needs to use a short-term temporary register, %r0
> >>> is an ideal choice as long as you do not make any function calls or call
> >>> dt_cg_node() to generate code for a subtree. So, using it as a scratch
> >>> register is perfectly fine, and often preferred because it reduces the
> >>> pressure on the dynamically allocated registers.
> >> I'm confused. Are you saying you're against this patch? Presumably, in
> >> these functions, we do not know which registers are currently in use.
> >> Presumably, that is why we explicitly dt_regset_xalloc(BPF_REG_0) here.
> >> (In contrast, there are places we assume we know what the register usage
> >> is, and we hardcode register usage without regset alloc/free at all.)
> >> So, it seems to me the code would be more robust if we ask
> >> regset_alloc() to give us a free register rather than specifying which
> >> register to give us. That induces no register pressure since it's going
> >> to give us a register that's free. Our reasoning about which registers
> >> are free could be faulty; in contrast, regset knows which registers are
> >> free.
> > Yes, I am against this patch. While we indeed have dynamic allocation of
> > registers using regset alloc, there are cases where we know that %r0 can be
> > use without conflict. These are cases where we know %r0 will not be in use
> > prior to the code getting called,
>
> Thanks for the explanations, but -- regardless of the fate of this
> particular patch -- I'd like to understand these issues.
>
> How is it that we know that %r0 will not be in use prior to the code
> getting called?
Well, for one, because we wrote this code. More specifically, the construct
<agg> = <aggfun>(...) is parsed into an aggregation node that has the
aggregtion function (and arguments) associated with it, and we know that this
node must be a top level node, i.e. a statement. This means that before and
after this node there cannot be any registers in use (besides the reserved %r9
register). (Caveat: in the future we may want to hold the value of variables
in registers across statement boundaries, but even then we would only be able
to do so for registers %r6 through %r8 due to the BPF function calling
convention.)
So, since we know that registers cannot be in use, we know that %r0 cannot be
in use.
> In any case, even if it is possible to know that %r0 is not in use
> (depending on how smart the developer is, with you being smarter than
> me), what is the problem with just using any free register? That just
> seems like a safer, more robust programming practice, even if a smarter
> developer like you is able to get by without such a safety net. (The
> you/me thing is just that *I* don't think I'll always assess correctly
> whether %r0 is okay in some situation. I don't even understand this
> situation.)
The problem with using just any register is that not all registers are the
same. The %r0 register is special (as return value register), %r1 through
%r5 are special (function arguments, and considered clobbered even if we do
not use all of them for function arguments), %r6 through %r8 are general
purpose, and %r9 is reserved by us as pointer to the output buffer.
It is therefore not a bad idea (and it is actually common practice) to use
%r0 for temporary values e.g. when a value needs to be provided in a
register like with the value in xadd. It means that the rest of the register
set (allocated from higher to lower) is available for e.g. more complex
expressions.
> > and we do not call any other codegen
> > functions (which means that nothing else could possibly use %r0 as well).
> > While we could use a dynamically allocated register in those cases, it is
> > perfectly fine to use %r0 when we know it is safe. It is an ideal scratch
> > register for temporary values, and that is a common use in BPF code actually.
> >
> > We register its use simply because that provides a concenient check that
> > indeed nothing should have reserved it prior to its use here,
>
> I don't think it checks that %r0 is not reserved. If %r0 is in use and
> we specifically ask for it, then %r0 is spilled and we're allowed to use
> it. So, if a developer (presumably someone like me rather than someone
> like you) makes a mistake and explicitly asks for %r0 when it's already
> in use, we get unnecessary spills and fills. In contrast, had we asked
> for "a free register," the unnecessary spills are avoided.
OK, I expressed myself poorly here. It provides *me* with a convenient
check that nothing has reserved it. If %r0 is already in use indeed it will
be spilled to stack and that will be noted in the regset trace output. That
is what I use to detect these issues. And that is an issue because it means
that the prior code did not free %r0 as it should have.
> > and nothing
> > should reserve it until we free it. It also ensures that it shows up in
> > reg usage tracing output.
> >
> > ANd yes, there are cases where we hardcode register usage without using
> > alloc/free at all, and as I have communicated already, those need to be
> > addressed soon to rectify that oversight.
More information about the DTrace-devel
mailing list