[DTrace-devel] Reviewed: Do not use BPF_REG_0 in dt_cg_agg_buf_prepare()

Wed Nov 25 23:19:57 PST 2020

On Wed, Nov 25, 2020 at 08:08:10PM -0800, Eugene Loh wrote:
> I'm confused...
> 
> On 11/20/2020 02:15 PM, kris.van.hees at oracle.com wrote:
> 
> > [Commit 8a11eab0d47aee9d48a55506113801af4f59c3fb]
> >> Do not use BPF_REG_0 in dt_cg_agg_buf_prepare()
> >>
> >> The function dt_cg_agg_buf_prepare() uses BPF_REG_0.  Perhaps this is
> >> all right, but it is not ideal.  The specific register is hardcoded in,
> >> and we do not check locally if the register is in use.
> > It is correct to use %r0 and I do not understand why you think it is not ideal.
> 
> Not ideal because it involves reasoning about correctness that someone 
> like me could get wrong.  In contrast, one can just ask regset_alloc() 
> for a free register and be safe.

See my reply to your other email in terms of not all registers being equal.

> > Since it is the de facto return value register it is actually a good choice for
> > scratch register when a temporary value is needed.  It should not be used when
> > the code generation calls dt_cg_node() of course, since that may cause it to be
> > (re)used incorrectly.  That's why we should always allocate it before we use it
> > (and I still need to add that in some places, just to be consistent).
> 
> Well, since one is allocating a register, might as well ask for a free 
> register rather than for a particular register.

Again, not all registers are equal, and we have a really limit set we can
work with.  Right now, we also do not allow more than 1 level of spilling.

Again, we are not just allocating a register.  Often we need a register for
a very short time (very few instructions) in a context where it is perfectly
safe to use %r0.  It is a good choice, and leave the higher registers
available for use in generating code for more complex expressions.

> > But, as you show in the code below, we actually do not need to use %r0 here,
> > and I agree there is no need to use more registers than needed.
> >
> >> Further, the comments suggest that the sequence latch should be
> >> incremented before computing the pointer.
> > Actually, no.
> 
> "No" as in "No that's not what supposed to happen" or as in "No that's 
> not what the comments say"?  The comments for the function seem to say 
> "update the latch, then determine the offset."  As for what's *supposed* 
> to happen, it would seem like that's kind of arbitrary in that the 
> consumer is not yet written.  I would think that, given a latch number, 
> the consumer is equally happy using either latch&1 or the other one.  
> Again, kind of arbitrary.

The sequence latch is incremented before compuitng the pointer, but the
pointer *must* be computed based on the old value of the sequence latch (i.e.
the value before the increment).

It is not arbitrary where the consumer reads from.  It *must* read from the
copy of the data that is *not* being updated.  Therefore, the sequence latch
(seq & 1) indicates the copy the consumer can safely read from, and the other
copy is the one being updated.  In other words, the data pointer can never be
calculated based on the current value of the sequence latch.

Now, we can indeed calculate the data pointer based on the current sequence
latch value, and then update the sequence latch value, because we do not
start updating the actual data until both the pointer is computed and the
sequence latch has been updated, but I chose to do the latch update first
since that immediately redirects the consumer to the alternate copy, and then
our producer context becomes the copy where data can be updated.

> > The sequence latch is incremented to signal the consuemr that
> > it should now read from the 'other' copy.  But the pointer is to be computed
> > based on the old value because the pointer indicates the copy to which we are
> > going to apply the update.  So:
> > 	new value -> where the consumer should read from
> > 	old value -> where we will apply the update
> 
> I'm confused.  The producer is running behind the consumer?  Or maybe 
> the use of "new" and "old" is confusing?

Look at it this way (seq is always read and updated in copy A only; I use C(v1)
for consumer reading value v1, P(v2) for producer updating from v1 to v2):

  Seq    0  1      2      3      4      ...

  Copy A .  P(v1)  C(v1)  P(v2)  C(v2)  ...

  Copy B .  C(0)   P(v1)  C(v1)  P(v2)  ...

So, in other words, the producer will write the new value in the buffer copy
that the consumer was reading from before.  Or even differently put, we move
the consumer to look at the alternate copy by incrementing the seq so that the
producer can update the copy that the consumer was previously looking at.  And
then we do the reverse to ensure that both copies will now have the new data.

The main driving logic is that the sequence latch (seq) tells the consumer
which copy it should read from.  That choice drives the rest of the logic.