[DTrace-devel] [PATCH 12/20] Implement variable length integer support

Kris Van Hees kris.van.hees at oracle.com
Mon Jun 7 06:51:08 PDT 2021


On Fri, Jun 04, 2021 at 12:58:21PM -0400, Eugene Loh wrote:
> Shouldn't there be some unit (internals?) testing?

There can be - I decided not to explicitly add unit tests for this because
it is an integral component of the string support, and that provides the
testing.  As in, string support cannot exist without this, and this is not
used by anything other than string support (for now - more on that below).

I may reconsider, but the burden of providing adequate unit testing for both
the C-to-native code and the C-to-BPF code (i.e. writing custom code to
exercise just this code) seems of limited value given that the same testing
is being done by the string support tests.

> In the commit message, replace
>          a given unsigned integers
> with
>          a given unsigned integer
> 
> Actually, I suggest rewriting the commit message altogether. Simply say 
> why you're doing this.  Information about the implementation should go 
> in the source code, where interested parties can more easily find it.  
> E.g., the entire commit message could simply be:
> 
>      Implement variable length integer support
> 
>      Strings will be prefixed with their lengths to eliminate frequent
>      recomputation of those lengths.  Since relatively short strings
>      are common, variable-length integers are used to save space.

I am still thinking about this - what I would prefer to do is put a good
reference to the algorithm in the commit message, but I haven't found one
that is sufficiently clear.  I don't think there is any issue with providing
details in the commit message, especially because it helps explain which
variant of variable-length integer support we're going to be using here
(since there are quite a few different approaches out there).

> In libdtrace/dt_varint.h, I see
>      #define VARINT_9_PLIM        0xff
>      #define VARINT_9_SHIFT        0
>      #define VARINT_9_MAX        0xffffffffffffffff
> but they make no sense (and are never used).  Drop them.

Those were added for completeness.  They are harmless and complete the set of
constants that define (and explain) the varint implementation.

> Are all the BPF vint functions used?

They will be.

> But my main reaction is that this patch is a whole lot of code and 
> complexity for rather meaningless optimization.  I can imagine 
> applications where one might want to optimize for variable integers, but 
> here we simply want to prefix a string with its length.  Now, if you 
> have very many, very short strings, then a short prefix makes sense.  If 
> you have even just one long string, the benefit gets lost in the noise.  
> The patch tries to optimize very aggressively, even though one can 
> cheaply pick off the low-hanging fruit.
> 
> E.g., let's say you have a string that requires a three-byte varint 
> prefix.  Apparently, the string is 16 Kbytes to 2 Mbytes long.  So we do 
> not care if the prefix uses 3 bytes or 8.  A five-byte varint is 
> necessarily accompanied by a string that is 256-Mbyte to 32-Gbyte; we 
> simply do not care about saving 8-5 = 3 bytes.  Let's not optimize for that.
> 
> Same with subtracting those VARINT_X_MIN values.  That's really trying 
> to shoehorn as many integers as possible into these varints;  that is a 
> solution to someone else's problem.  It makes no sense here.  The 
> savings we're pursuing are miniscule.
> 
> What's the cost?  Hard to say.  One can argue (hope) that no one will 
> ever have to look at this code again and that dragging hundreds of lines 
> of this code along doesn't matter.  On the other hand, loading BPF code 
> into the kernel and handing branching code to the BPF verifier is not to 
> be taken lightly.  I don't know how to assess that cost, but here are 
> some metrics:
> 
>      wc -l bpf/*.c            bpf-unknown-none-objdump -d build/bpf--*.o | wc -l
> 
>       14 bpf/map_tvar.c
>       21 bpf/get_tvar.c         17 build/bpf--get_tvar.o
>       18 bpf/set_tvar.c         19 build/bpf--set_tvar.o
>       40 bpf/probe_error.c      21 build/bpf--probe_error.o
>       24 bpf/agg_lqbin.c        21 build/bpf--agg_lqbin.o
>       40 bpf/agg_qbin.c         71 build/bpf--agg_qbin.o
>      104 bpf/memcpy.c          119 build/bpf--memcpy.o
>      191 bpf/get_bvar.c        191 build/bpf--get_bvar.o
>       82 bpf/strnlen.c         233 build/bpf--strnlen.o
>      199 bpf/varint.c          257 build/bpf--varint.o
> 
> So, varint is the size leader in those metrics.  Does that matter?  I 
> don't know, but I suppose the cost is clearly affordable in some cases, 
> while in other cases we will probably wish we cost less.
> 
> Here is a different proposal.  How about if the leading bit is 0, then 
> the next 7 bits give the length.  If the leading bit is 1, then the next 
> 31 (or 63?) bits give the length.  This optimizes well for the 
> short-string case yet is quite simple.  The only down side is that it 
> gives up a few bytes in those cases where the accompanying strings are huge.

Your assumption in all this is that we're providing a custom approach to
encoding the length of strings.  That is not the purpose here, even if we
may in the end only use it for that.  First, we're using a fairly standard
varint implementation, which means it is a known entity rather than
entirely custom code.  Second, it allows for future uses where the usage
characteristics may differ from the strings-only use case.

Your assumption that we are optimizing where no optimization is necessary
does not quite hold given that we're simply using an existing algorithm, so
no effort was wasted there.  Your suggested approach is very much a customized
approach based on a single use case, and it still incurs pretty much the same
cost.  You still need to decode the length of the string from a byte sequence,
unless you require longer strings to be 32-bit aligned in storage, which means
you need to support different alignments for different string sizes, or align
all strings on 32-bit boundaries, which impacts very short strings quite a bit
more.  There are always costs and tradeoffs.

The complexity of the code is hardly an issue, as is the size of the code in
number of source lines, because the runtime execution profile is what really
matters, and the number of instructions executed for a particular case is
quite optimal in this implementation (and resembles what you propose anyway).

As for potential impact on the BPF verifier: if that is an issue then we can
deal with it later, preferably by implementing a workaround *and* proposing
a fix for the BPF verifier to be better at validating code with branches.
One such workaround would be to simply make the various branches conditional
upon the max string size (a constant), which will allow the BPF verifier to
mark those branches as unreachable code and therefore eliminate them.  Or we
can do that ourselves.  But there is no need to do that just yet (and there
may never be a need for it).

> I can hold off reviewing the rest of this patch until we've discussed 
> this much.
> 
> On 6/2/21 1:48 AM, Kris Van Hees wrote:
> > This patch provides an implementation of variable-length integers,
> > both in libdtrace and as precompiled BPF code.  It is based on the
> > PrefixVarint algorithm.  Unsigned integers are encoded as follows:
> >
> >    0x0 - 0x7f		1 byte:  0xxxxxxx
> >    0x80 - 0x407f		2 bytes: 10xxxxxx xxxxxxxx
> >    0x4080 - 0x20407f	3 bytes: 110xxxxx xxxxxxxx xxxxxxxx
> >
> > and so on.  This algorithm ensures that there is exactly one
> > representation for a given unsigned integers.  It also ensures that
> > the number of bytes in the variable length representation can be
> > determined based on the first byte.  This makes decoding simpler.
> 
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list