[DTrace-devel] [PATCH v4] Implement TLS variables
Eugene Loh
eugene.loh at oracle.com
Tue Nov 30 23:03:38 UTC 2021
We should mention in the commit message that the value size in the TLS
BPF map is the largest TLS variable size, which has drawbacks.
I'm curious about the tvars->dvars renaming. If the use of this BPF map
will expand beyond just TLS variables, will a different kind of key be
needed? Something other than ((varid<<32)|pid)?
I had some questions about "store only nonzero" for TLS variables.
* E.g., what happens when I load an unassigned TLS variable?
I guess dt_cg_load_var() calls get_tvar(varid, 1), which
looks up the key and presumably gets a NULL value pointer.
Then -- and here is what puzzles me -- we create a hash entry
with the default value (0) and return a pointer to the newly
created zero value. Isn't that wrong? Why are we allocating
something in the hash table?
Results are correct, but that's because we're allocating a bunch of
zeroes in the BPF map. Consider:
$ git diff
diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
@@ -2076,7 +2076,7 @@ dt_cg_load_var(dt_node_t *dst,
dt_irlist_t *dlp, dt_regset_t *drp)
dt_regset_free_args(drp);
emit(dlp, BPF_BRANCH_IMM(BPF_JNE, BPF_REG_0, 0,
lbl_notnull));
- emit(dlp, BPF_MOV_IMM(dst->dn_reg, 0));
+ emit(dlp, BPF_MOV_IMM(dst->dn_reg, 12345678));
emit(dlp, BPF_JUMP(lbl_done));
emitl(dlp, lbl_notnull,
BPF_MOV_REG(dst->dn_reg, BPF_REG_0));
$ dtrace -n 'self x; BEGIN { trace(self->x); exit(0) }'
dtrace: description 'self x' matched 1 probe
CPU ID FUNCTION:NAME
1 1 :BEGIN 0
That is, I hack the code to return 12345678 when a variable is
unassigned, but the D script returns 0.
Or, using unhacked bits, consider:
dtrace -n 'self x, y, z; BEGIN { trace(self->x); }'
dtrace -n 'self x, y, z; BEGIN { trace(self->x); trace(self->y); }'
dtrace -n 'self x, y, z; BEGIN { trace(self->x);
trace(self->y); trace(self->z); }'
None of them exits, so I can run:
bpftool map dump id `bpftool map | awk '/name dvars/ {print
int($1)}'`
I get
key: 00 00 00 00 00 00 00 00 value: 00 00 00 00
key: 00 00 00 00 09 66 00 00 value: 00 00 00 00
Found 2 elements
then
key: 00 00 00 00 00 00 00 00 value: 00 00 00 00
key: 01 00 00 00 15 66 00 00 value: 00 00 00 00
key: 00 00 00 00 15 66 00 00 value: 00 00 00 00
Found 3 elements
then
key: 00 00 00 00 00 00 00 00 value: 00 00 00 00
key: 02 00 00 00 2d 66 00 00 value: 00 00 00 00
key: 01 00 00 00 2d 66 00 00 value: 00 00 00 00
key: 00 00 00 00 2d 66 00 00 value: 00 00 00 00
Found 4 elements
That is, the dvars map shows that unassigned variables are being
allocated in the BPF map.
Here is another way of seeing the problem. Although
dtrace -xdynvarsize=15 -n 'self x, y, z, w; BEGIN {
trace(self->x); trace(self->y); trace(self->z);
trace(self->w); }'
correctly shows the values are 0, that's because it's allocating
zeroes in the BPF map until it runs out of room and then
interprets "ran out of room" as zero values. Hmm.
So next we try:
dtrace -xdynvarsize=15 -n 'self x, y, z, w; BEGIN {
trace(self->x); trace(self->y); trace(self->z); self->w =
1234; trace(self->w); }'
dtrace: error on enabled probe ID 2 (ID 1: dtrace:::BEGIN):
invalid address (0x0) in action #1
This exposes our having run out of room.
* What happens when I store a zero value? I guess dt_cg_store_var()
calls get_tvar(varid, dnp->dn_reg). The secret sauce is when
the contents of dnp->dn_reg are 0. Does that mean the deletion
only occurs when, both, we store by value AND the value is 0?
E.g., the following does not work; should it?
# cat z.d
#pragma D option dynvarsize=15
struct foo {
int x;
};
self struct foo a, b, c, d;
BEGIN {
self->a.x = 0;
self->b.x = 0;
self->c.x = 0;
self->d.x = 0;
exit(0);
}
ERROR {
printf("ugh\n");
exit(1);
}
# dtrace -s z.d
dtrace: script 'z.d' matched 2 probes
dtrace: error on enabled probe ID 3 (ID 1: dtrace:::BEGIN):
invalid address (0x0) in action #1
CPU ID FUNCTION:NAME
3 3 :ERROR ugh
In bpf/get_tvar.c, the variable name not_zero strikes me as clumsy, but
I'm not sure if you're going to be restructuring this code. So I'm
reluctant to recommend anything else.
In dt_bpf_gmap_create(), defining strdatasz (and using it only once) is
okay, but its original reason for being is gone with the new sizing of
TLS variables using dt_maxtlslen. So, it was unclear to me if continued
use of this new one-use-only variable was intentional.
Should one be able to run multiple DTrace jobs simultaneously? E.g.,
# dtrace -n 'BEGIN { x = 1; trace(x); }' &
[1] 8152
dtrace: description 'BEGIN ' matched 1 probe
CPU ID FUNCTION:NAME
3 1 :BEGIN 1
# dtrace -n 'BEGIN { x = 1; trace(x); }' &
[2] 8155
dtrace: description 'BEGIN ' matched 1 probe
CPU ID FUNCTION:NAME
3 1 :BEGIN 1
So two DTrace sessions are running simultaneously. But with TLS:
# dtrace -n 'BEGIN { self->x = 1; trace(self->x); }' &
[1] 8129
dtrace: description 'BEGIN ' matched 1 probe
CPU ID FUNCTION:NAME
3 1 :BEGIN 1
# dtrace -n 'BEGIN { self->x = 1; trace(self->x); }' &
[2] 8132
dtrace: description 'BEGIN ' matched 1 probe
dtrace: could not enable tracing: failed to create BPF map 'dvars':
Operation not permitted
In dt_cg_load_var(), the comment
/* otherwise, handle thread-local and built-in variables */
is a little funny. With this patch, tvar and bvar no longer really
share any more code. So change the comment to talk only about tvars.
Then, bvars are what's left. That is, instead of
/* handle tvars and bvars */
if (tvars) {
tvar stuff
return
} else {
bvar part 1
}
bvar part 2
just make it
/* tvars */
if (tvars) {
tvar stuff
return
}
/* bvars */
bvar stuff
That should make the code clearer.
In dt_cg_store_var(), we should generate lbl_done only for the tvar
case. We generate it now even for gvar/lvar (and then do not use it).
E.g., dt_cg_load_var() properly has the narrower scope for lbl_done.
In dt_dis_bpf_args(), why change the comment from %r1 to %r0? I think
it should still be %r1.
There needs to be a more graceful failure (and a test) for the case that
dt_options[DTRACEOPT_DYNVARSIZE] < dt_maxtlslen . Currently,
# cat z.d
#pragma D option dynvarsize=15
struct foo {
int x1, x2, x3, x4;
};
self struct foo a;
BEGIN {
trace(self->a.x1);
printf("good\n");
exit(0);
}
ERROR {
printf("ugh\n");
exit(1);
}
# dtrace -s z.d
dtrace: script 'z.d' matched 2 probes
BPF: fd -1 is not pointing to valid bpf_map
BPF: verification time 18 usec
BPF: stack depth
BPF: processed 0 insns (limit 1000000) max_states_per_insn 0
total_states 0 peak_states 0 mark_read 0
dtrace: could not enable tracing: BPF program load for
'dtrace:::BEGIN' failed: Bad file descriptor
I cannot seem to get test/unittest/variables/tvar/tst.struct.d to run on
either Solaris or legacy DTrace.
In
test/unittest/variables/tvar/err.D_OP_INCOMPAT.default_int.d
test/unittest/variables/tvar/tst.default_int.d
change s/thresd/thread/.
In
test/unittest/variables/tvar/err.limited_space.d
test/unittest/variables/tvar/tst.store_zero_deletes.d:
two changes:
- Remove the strsize line. The strsize has nothing to do with this test.
If anything, default strsize=256 is a more interesting test.
- Change dynvarsize from 31 to 15.
In
test/unittest/variables/tvar/tst.post_inc_lvar.d
test/unittest/variables/tvar/tst.pre_inc_lvar.d
how about
- * ASSERTION: Test that post-increment on an local variable evaluates
to the
+ * ASSERTION: Test that post-increment on a thread-local variable
evaluates to the
- * old value of the local variable.
+ * old value of the variable.
In test/unittest/variables/tvar/tst.str-size.d, use strsize=4. The test
is not interesting if the strsize is bigger.
Finally (not sure about this one), maybe a test that stores only zeroes
to very many variables (and yet does not run out of room). This would be
a complement to tst.store_zero_deletes.d, which stores a zero to delete
an existing element.
More information about the DTrace-devel
mailing list