[DTrace-devel] [PATCH 12/20] Implement variable length integer support

Fri Jun 4 09:58:21 PDT 2021

Shouldn't there be some unit (internals?) testing?

In the commit message, replace
         a given unsigned integers
with
         a given unsigned integer

Actually, I suggest rewriting the commit message altogether. Simply say 
why you're doing this.  Information about the implementation should go 
in the source code, where interested parties can more easily find it.  
E.g., the entire commit message could simply be:

     Implement variable length integer support

     Strings will be prefixed with their lengths to eliminate frequent
     recomputation of those lengths.  Since relatively short strings
     are common, variable-length integers are used to save space.

In libdtrace/dt_varint.h, I see
     #define VARINT_9_PLIM        0xff
     #define VARINT_9_SHIFT        0
     #define VARINT_9_MAX        0xffffffffffffffff
but they make no sense (and are never used).  Drop them.

Are all the BPF vint functions used?

But my main reaction is that this patch is a whole lot of code and 
complexity for rather meaningless optimization.  I can imagine 
applications where one might want to optimize for variable integers, but 
here we simply want to prefix a string with its length.  Now, if you 
have very many, very short strings, then a short prefix makes sense.  If 
you have even just one long string, the benefit gets lost in the noise.  
The patch tries to optimize very aggressively, even though one can 
cheaply pick off the low-hanging fruit.

E.g., let's say you have a string that requires a three-byte varint 
prefix.  Apparently, the string is 16 Kbytes to 2 Mbytes long.  So we do 
not care if the prefix uses 3 bytes or 8.  A five-byte varint is 
necessarily accompanied by a string that is 256-Mbyte to 32-Gbyte; we 
simply do not care about saving 8-5 = 3 bytes.  Let's not optimize for that.

Same with subtracting those VARINT_X_MIN values.  That's really trying 
to shoehorn as many integers as possible into these varints;  that is a 
solution to someone else's problem.  It makes no sense here.  The 
savings we're pursuing are miniscule.

What's the cost?  Hard to say.  One can argue (hope) that no one will 
ever have to look at this code again and that dragging hundreds of lines 
of this code along doesn't matter.  On the other hand, loading BPF code 
into the kernel and handing branching code to the BPF verifier is not to 
be taken lightly.  I don't know how to assess that cost, but here are 
some metrics:

     wc -l bpf/*.c bpf-unknown-none-objdump -d build/bpf--*.o | wc -l

      14 bpf/map_tvar.c
      21 bpf/get_tvar.c         17 build/bpf--get_tvar.o
      18 bpf/set_tvar.c         19 build/bpf--set_tvar.o
      40 bpf/probe_error.c      21 build/bpf--probe_error.o
      24 bpf/agg_lqbin.c        21 build/bpf--agg_lqbin.o
      40 bpf/agg_qbin.c         71 build/bpf--agg_qbin.o
     104 bpf/memcpy.c          119 build/bpf--memcpy.o
     191 bpf/get_bvar.c        191 build/bpf--get_bvar.o
      82 bpf/strnlen.c         233 build/bpf--strnlen.o
     199 bpf/varint.c          257 build/bpf--varint.o

So, varint is the size leader in those metrics.  Does that matter?  I 
don't know, but I suppose the cost is clearly affordable in some cases, 
while in other cases we will probably wish we cost less.

Here is a different proposal.  How about if the leading bit is 0, then 
the next 7 bits give the length.  If the leading bit is 1, then the next 
31 (or 63?) bits give the length.  This optimizes well for the 
short-string case yet is quite simple.  The only down side is that it 
gives up a few bytes in those cases where the accompanying strings are huge.

I can hold off reviewing the rest of this patch until we've discussed 
this much.

On 6/2/21 1:48 AM, Kris Van Hees wrote:
> This patch provides an implementation of variable-length integers,
> both in libdtrace and as precompiled BPF code.  It is based on the
> PrefixVarint algorithm.  Unsigned integers are encoded as follows:
>
>    0x0 - 0x7f		1 byte:  0xxxxxxx
>    0x80 - 0x407f		2 bytes: 10xxxxxx xxxxxxxx
>    0x4080 - 0x20407f	3 bytes: 110xxxxx xxxxxxxx xxxxxxxx
>
> and so on.  This algorithm ensures that there is exactly one
> representation for a given unsigned integers.  It also ensures that
> the number of bytes in the variable length representation can be
> determined based on the first byte.  This makes decoding simpler.