[DTrace-devel] [oracle/dtrace-utils] 94d952: Tweak testsuite to account for libctf / libdtrace-...

Nick Alcock noreply at github.com
Wed Dec 1 15:02:09 UTC 2021


  Branch: refs/heads/dev
  Home:   https://github.com/oracle/dtrace-utils
  Commit: 94d95280489a69fef9c975a8a9df9fb670697d64
      https://github.com/oracle/dtrace-utils/commit/94d95280489a69fef9c975a8a9df9fb670697d64
  Author: Kris Van Hees <kris.van.hees at oracle.com>
  Date:   2021-11-19 (Fri, 19 Nov 2021)

  Changed paths:
    A test/unittest/enum/err.D_UNKNOWN.RepeatIdentifiers.r.p
    M test/unittest/funcs/err.inet_ntoabadaddr.d
    M test/unittest/funcs/err.inet_ntopbadaddr.d
    M test/unittest/funcs/err.inet_ntopbadarg.d
    A test/unittest/offsetof/err.D_UNKNOWN.OffsetofNULL.r.p
    A test/unittest/types/err.D_UNKNOWN.dupenum.r.p
    A test/unittest/types/err.D_UNKNOWN.dupstruct.r.p
    A test/unittest/union/err.D_DECL_INCOMPLETE.circular.r.p
    A test/unittest/union/err.D_DECL_INCOMPLETE.order.r.p
    A test/unittest/union/err.D_DECL_INCOMPLETE.simple.r.p
    M test/utils/Build
    A test/utils/libctf.r.p

  Log Message:
  -----------
  Tweak testsuite to account for libctf / libdtrace-ctf differences

Some error messages changed in libctf while libdtrace-ctf is still using
the old messages.  The most common change is the addition of a trailing
period for error messages.

Forward declarations were printed as 'struct FOO' with libdtrace-ctf
even if they were for a union or enum.  With libctf they are printed
correctly - the testsuite tests affected by this now support both cases.

Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
Reviewed-by: Nick Alcock <nick.alcock at oracle.com>


  Commit: cd179d39f759cff7526da7e53a7f35d8f4ec30a4
      https://github.com/oracle/dtrace-utils/commit/cd179d39f759cff7526da7e53a7f35d8f4ec30a4
  Author: Eugene Loh <eugene.loh at oracle.com>
  Date:   2021-11-19 (Fri, 19 Nov 2021)

  Changed paths:
    M libdtrace/dt_cg.c

  Log Message:
  -----------
  Ensure proper validation checks are done before freeing a tstring

We want to call dt_cg_tstring_free() if a tstring is in use.  Two
styles of checks were being used:
    if (dnp->dn_kind == DT_NODE_FUNC && dnp->dn_tstring)
        dt_cg_tstring_free(pcb, dnp);
and
    if (dnp->dn_tstring)
        dt_cg_tstring_free(pcb, dnp);
The problem with the second style is that if dn_kind is not DT_NODE_FUNC,
then dnp->dn_tstring might still be non-NULL since it shares a union with
other members.  This leads to calling dt_cg_tstring_free() with an
invalid address and an assertion failure.

Add the check for dnp->dn_kind == DT_NODE_FUNC and dnp->dn_tstring into
dt_cg_tstring_free() to ensure the checks are performed at all times and
to simplify the code.
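
A minimal sketch of the idea, not the libdtrace code: the struct, enum and
counter below are hypothetical stand-ins for dt_node_t and the real tstring
bookkeeping.  It only illustrates why the kind check has to happen before the
union member is read, and how moving it into the free routine protects every
caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified parser node; NOT the real dt_node_t. */
enum node_kind { NODE_INT, NODE_FUNC };

struct node {
    enum node_kind kind;
    union {
        void *tstring;             /* only meaningful when kind == NODE_FUNC */
        unsigned long long value;  /* only meaningful when kind == NODE_INT */
    } u;
};

static int frees_done;             /* counts releases, for illustration only */

/* The kind check lives inside the free routine, so callers cannot skip it. */
static void tstring_free(struct node *np)
{
    if (np->kind != NODE_FUNC || np->u.tstring == NULL)
        return;
    frees_done++;                  /* a real implementation would release the slot */
    np->u.tstring = NULL;
}
```

With this shape, a node of another kind whose union happens to hold non-zero
bits is simply ignored rather than being treated as a tstring address.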

Signed-off-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: 40a9fe53f27e5b5d5914fed190dbcf86ebd2f7c1
      https://github.com/oracle/dtrace-utils/commit/40a9fe53f27e5b5d5914fed190dbcf86ebd2f7c1
  Author: Kris Van Hees <kris.van.hees at oracle.com>
  Date:   2021-11-19 (Fri, 19 Nov 2021)

  Changed paths:
    M libdtrace/dt_cg.c
    M libdtrace/dt_parser.h
    A test/unittest/codegen/tst.tstring_asgn_expr.d
    A test/unittest/codegen/tst.tstring_asgn_expr.r
    A test/unittest/codegen/tst.tstring_ternary.d
    A test/unittest/codegen/tst.tstring_ternary.r

  Log Message:
  -----------
  Make sure assignment expressions work correctly with tstrings

The node kinds that can hold a tstring have been expanded from just
DT_NODE_FUNC to DT_NODE_OP1, DT_NODE_OP2, DT_NODE_OP3, DT_NODE_DEXPR.
This is necessary in order to properly manage the lifetime of tstrings.

The dt_cg_store_var() function no longer frees the (possible) tstring
associated with the assignment itself.

Support has been added to handle tstrings when generating code for a
ternary (?:) operator where the values are strings.

Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>


  Commit: b4baa3e83f8b338ab5f92f5df419b81be222e099
      https://github.com/oracle/dtrace-utils/commit/b4baa3e83f8b338ab5f92f5df419b81be222e099
  Author: Kris Van Hees <kris.van.hees at oracle.com>
  Date:   2021-11-19 (Fri, 19 Nov 2021)

  Changed paths:
    M bpf/substr.S
    A test/unittest/funcs/substr/tst.substr-multi-const-idx-neg-no-cnt.d
    A test/unittest/funcs/substr/tst.substr-multi-const-idx-neg-no-cnt.r
    A test/unittest/funcs/substr/tst.substr-multi-const-idx-neg-too-far-no-cnt.d
    A test/unittest/funcs/substr/tst.substr-multi-const-idx-neg-too-far-no-cnt.r
    A test/unittest/funcs/substr/tst.substr-multi-const-idx-neg-too-far.d
    A test/unittest/funcs/substr/tst.substr-multi-const-idx-neg-too-far.r
    A test/unittest/funcs/substr/tst.substr-multi-const-idx-pos-no-cnt.d
    A test/unittest/funcs/substr/tst.substr-multi-const-idx-pos-no-cnt.r
    A test/unittest/funcs/substr/tst.substr-multi-var-idx-neg-no-cnt.d
    A test/unittest/funcs/substr/tst.substr-multi-var-idx-neg-no-cnt.r
    A test/unittest/funcs/substr/tst.substr-multi-var-idx-neg-too-far-no-cnt.d
    A test/unittest/funcs/substr/tst.substr-multi-var-idx-neg-too-far-no-cnt.r
    A test/unittest/funcs/substr/tst.substr-multi-var-idx-neg-too-far.d
    A test/unittest/funcs/substr/tst.substr-multi-var-idx-neg-too-far.r
    A test/unittest/funcs/substr/tst.substr-multi-var-idx-pos-no-cnt.d
    A test/unittest/funcs/substr/tst.substr-multi-var-idx-pos-no-cnt.r
    A test/unittest/funcs/substr/tst.substr-stored-len.d
    A test/unittest/funcs/substr/tst.substr-stored-len.r
    A test/unittest/funcs/substr/tst.substr-strsize.d
    A test/unittest/funcs/substr/tst.substr-strsize.r

  Log Message:
  -----------
  Fix length stored by substr() and optimize the implementation

In certain cases, substr() would store an incorrect length for the
result string.  This problem is resolved with this patch.

Better comments have been added as well to make the code easier to
follow.

The code handling the case where (idx < 0) failed to recognize that
when the adjusted idx value was more negative than the value of cnt,
the result would always be the empty string.  This has been corrected,
which also improves code performance.

Another important change can be found in label .Lcheck_idx (formerly
.Ladjust_cnt).  The original four conditionals were overkill, and the
case for (cnt > 0) was jumping to .Lcnt_pos incorrectly (causing the
incorrect string length value).  It now correctly jumps to .Lcopy.
In fact, the .Lcnt_pos code was unnecessary.

Copying the substring now makes use of the probe_read_str() helper
rather than the probe_read() helper.

10 new tests are added to test various special conditions.

Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>


  Commit: 25d31f840253aaabdeb4b2ce11e8b8099d87bbe2
      https://github.com/oracle/dtrace-utils/commit/25d31f840253aaabdeb4b2ce11e8b8099d87bbe2
  Author: Kris Van Hees <kris.van.hees at oracle.com>
  Date:   2021-11-19 (Fri, 19 Nov 2021)

  Changed paths:
    M libdtrace/dt_bpf.c
    M libdtrace/dt_cg.c

  Log Message:
  -----------
  Fix tstring length

The tstring area was being allocated without accounting for the NUL byte
at the end of strings.

The tstring reset code was calculating the allocation size per string at
every iteration rather than once.
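
As a rough illustration only: the sketch below assumes each temporary string
slot holds a length prefix, up to strsize characters, and a terminating NUL.
LEN_PREFIX, slot_size() and reset_slots() are hypothetical names, and zeroing
the slots stands in for whatever the real reset does.  The point is the "+ 1"
for the NUL byte and computing the per-slot size once, outside the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LEN_PREFIX 2    /* assumed size of a length prefix; illustrative only */

/* Bytes needed per temporary string: prefix + characters + terminating NUL. */
static size_t slot_size(size_t strsize)
{
    return LEN_PREFIX + strsize + 1;
}

/* Reset every slot, computing the per-slot size once rather than per iteration. */
static void reset_slots(char *area, size_t nslots, size_t strsize)
{
    const size_t sz = slot_size(strsize);

    for (size_t i = 0; i < nslots; i++)
        memset(area + i * sz, 0, sz);
}
```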

Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>


  Commit: cf1e9df28b9585a1e8ca149d72d26a7ddee3070f
      https://github.com/oracle/dtrace-utils/commit/cf1e9df28b9585a1e8ca149d72d26a7ddee3070f
  Author: Kris Van Hees <kris.van.hees at oracle.com>
  Date:   2021-11-19 (Fri, 19 Nov 2021)

  Changed paths:
    M libdtrace/dt_cg.c

  Log Message:
  -----------
  Optimize dt_cg_store_val() for string values

The dt_cg_store_val() implementation was doing more than just copying
a value to the output buffer when dealing with strings.  It was
checking the size of the string to ensure that it was not beyond the
maximum string size, and if it was, it would truncate the string.

That turns out to pose issues because it hides the fact that some of
the string handling code was not ensuring that strings were stored
with the correct string length.  It also hid the fact that string
constants can be longer than the maximum string length, and therefore
string functions were being presented with strings of an unacceptable
length.

This patch causes several tests in the testsuite to fail.  This is
expected behaviour and will require bugfix patches to string handling
code to ensure that all strings used in D code have a length that is
the maximum string length or less.

Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>


  Commit: ace8cbfd4d742e1bf06712a3f330addb0b5425c5
      https://github.com/oracle/dtrace-utils/commit/ace8cbfd4d742e1bf06712a3f330addb0b5425c5
  Author: Kris Van Hees <kris.van.hees at oracle.com>
  Date:   2021-11-19 (Fri, 19 Nov 2021)

  Changed paths:
    M bpf/strjoin.S
    A test/unittest/funcs/strjoin/tst.strjoin-bordercases.d
    A test/unittest/funcs/strjoin/tst.strjoin-bordercases.r

  Log Message:
  -----------
  Fix length stored by strjoin() and optimise the implementation

The length that was stored for strjoin() was the sum of the lengths of
the two argument strings.  That means that if the sum was larger than
STRSIZE, we would still store that larger length even though the string
itself got truncated.
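
The intended behaviour can be sketched in plain C.  This is not the BPF
assembly in bpf/strjoin.S; join_clamped() is a hypothetical helper, and the
destination is assumed to have room for strsize characters plus a NUL.  The
stored length is the length of what was actually copied, never the raw sum.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Join s1 and s2 into dst (capacity strsize + 1 bytes) and return the
 * length actually stored, which is clamped to strsize.
 */
static size_t join_clamped(char *dst, size_t strsize,
                           const char *s1, const char *s2)
{
    size_t l1 = strlen(s1), l2 = strlen(s2);

    if (l1 > strsize)
        l1 = strsize;
    if (l2 > strsize - l1)
        l2 = strsize - l1;

    memcpy(dst, s1, l1);
    memcpy(dst + l1, s2, l2);
    dst[l1 + l2] = '\0';
    return l1 + l2;     /* never larger than strsize */
}
```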

Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>


  Commit: 6cf4dedf1301d3721413870bae30874472d83568
      https://github.com/oracle/dtrace-utils/commit/6cf4dedf1301d3721413870bae30874472d83568
  Author: Kris Van Hees <kris.van.hees at oracle.com>
  Date:   2021-11-30 (Tue, 30 Nov 2021)

  Changed paths:
    M libdtrace/dt_bpf.c
    M libdtrace/dt_impl.h
    M libdtrace/dt_open.c
    M libdtrace/dt_subr.c
    A test/unittest/codegen/tst.str_const_length.d
    A test/unittest/codegen/tst.str_const_length.r

  Log Message:
  -----------
  Ensure string constants that are too long are truncated correctly

The string table can contain strings that are longer than the maximum
string size for the program being compiled.  They need to be truncated
before they are used to ensure that string operations work correctly.

Since string constants can be used verbatim or as values of built-in
variables, it is easiest to just pre-process the string table before it
is loaded into the 'strtab' BPF map.  Every string that is longer than
the maximum string size has its length prefix set to the maximum string
size and the string itself is truncated to that length.
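
A sketch of such a pre-processing pass, under assumptions that are not taken
from the commit: each table entry is laid out as a 2-byte big-endian length
prefix, the characters, and a NUL, and the table is well formed.  PREFIX and
strtab_clamp() are hypothetical names.  Truncating in place keeps the offsets
of all later entries unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PREFIX 2    /* assumed: 2-byte big-endian length prefix per entry */

/* Walk [prefix][chars][NUL] entries and clamp over-long ones in place. */
static void strtab_clamp(uint8_t *tab, size_t tablen, size_t strsize)
{
    size_t off = 0;

    while (off + PREFIX < tablen) {
        size_t len = ((size_t)tab[off] << 8) | tab[off + 1];
        uint8_t *str = tab + off + PREFIX;

        if (len > strsize) {
            tab[off] = (uint8_t)(strsize >> 8);
            tab[off + 1] = (uint8_t)(strsize & 0xff);
            str[strsize] = '\0';    /* truncate; later offsets are unaffected */
        }
        off += PREFIX + len + 1;    /* advance using the original length */
    }
}
```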

Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>


  Commit: 361d3443578901483c233f0bf8c7c478d13da39c
      https://github.com/oracle/dtrace-utils/commit/361d3443578901483c233f0bf8c7c478d13da39c
  Author: Eugene Loh <eugene.loh at oracle.com>
  Date:   2021-11-30 (Tue, 30 Nov 2021)

  Changed paths:
    M libdtrace/dt_cg.c
    M test/unittest/dif/rand.d
    M test/unittest/funcs/tst.rand.d
    A test/unittest/funcs/tst.rand_inter.sh
    A test/unittest/funcs/tst.rand_intra.sh

  Log Message:
  -----------
  Add support for rand() subroutine

Signed-off-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: c194ba4d332730fb1632b99411cb199448e4bf6a
      https://github.com/oracle/dtrace-utils/commit/c194ba4d332730fb1632b99411cb199448e4bf6a
  Author: Eugene Loh <eugene.loh at oracle.com>
  Date:   2021-11-30 (Tue, 30 Nov 2021)

  Changed paths:
    M test/unittest/actions/symmod/tst.symmod.sh

  Log Message:
  -----------
  Add test for func(), which is an alias for sym()

Signed-off-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: ab83f64d8e7c0dad643ec131d1c454811eede995
      https://github.com/oracle/dtrace-utils/commit/ab83f64d8e7c0dad643ec131d1c454811eede995
  Author: Eugene Loh <eugene.loh at oracle.com>
  Date:   2021-11-30 (Tue, 30 Nov 2021)

  Changed paths:
    M libdtrace/dt_cg.c
    M libdtrace/dt_consume.c
    M test/unittest/funcs/tst.ftruncate.sh

  Log Message:
  -----------
  Add support for the ftruncate() action

Signed-off-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: 44f72b19509198733b53973e96502c6678fb67fc
      https://github.com/oracle/dtrace-utils/commit/44f72b19509198733b53973e96502c6678fb67fc
  Author: Nick Alcock <nick.alcock at oracle.com>
  Date:   2021-12-01 (Wed, 01 Dec 2021)

  Changed paths:
    M cmd/Build
    R cmd/ctf_module_dump.c
    M dtrace.spec

  Log Message:
  -----------
  cmd: delete ctf_module_dump

This tool was useful back in the day because it knew how to read CTF
that was dumped into kernel modules, digging out the shared CTF
dict and built-in module info from the fake ctf.ko module and
passing it to ctf_dump.

But it is 2021.  We haven't linked CTF into in-tree kernel modules since
UEK4 in 2017 (4.1.12-113, libdtrace-ctf 0.7).  Every tool needed for
this is obsolete, including libdtrace-ctf's ctf_dump; and objdump does
just as good a job of dumping modules as this tool (or, actually, given
that it can't dump raw .ctfa files yet, just as bad a job; but this will
be fixed soon and is easily workable around with a single objcopy call
even now).  Also ctf_module_dump is getting in the way of
representational improvements to the kernel module tracking.

So drop ctf_module_dump.

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Suggested-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: 9b8733bbcc873d9092d887af54bcba8a5e0d9396
      https://github.com/oracle/dtrace-utils/commit/9b8733bbcc873d9092d887af54bcba8a5e0d9396
  Author: Nick Alcock <nick.alcock at oracle.com>
  Date:   2021-12-01 (Wed, 01 Dec 2021)

  Changed paths:
    M libdtrace/dt_impl.h
    M libdtrace/dt_module.c
    M libdtrace/dt_open.c

  Log Message:
  -----------
  modules: purge linked-in shared ctf.ko module support

The original design for CTF in DTrace for Linux had all CTF linked into
kernel modules in the .ctf section. This worked fine except that a
lot of CTF has no corresponding module file on disk: built-in modules,
the core kernel, and shared CTF.

So we linked all of those into a fake module named ctf.ko, with
section names akin to .ctf.$MODULE.

This led to a pile of complexity in dt_module, since one ctf.ko,
uniquely among ELF files loaded for CTF, could have multiple dt_modules
associated with it, with section names that were constructed on the fly
and hence needed dynamic allocation.  No less than two new module flags
were needed to track all of this, and it was a perennial source of
rarely-spotted bugs (since out-of-tree modules, in particular, are
rarely tested).

Now that all in-kernel CTF has been stored in vmlinux.ctfa for almost
five years (the last kernel to support the ctf.ko fake module was
4.1-era, and no such ancient kernel has a hope of having good enough BPF
support to work with DTrace) we can rip all this out.

Support for loading shared CTF from anywhere other than vmlinux.ctfa is
removed in the process: the only other source of shared CTF is .ctf in
userspace, and if we ever support that it will be loaded by ctf_open and
we won't need to do anything special at all.

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: 4c66cfeeb596795c94fbac7be8606d408c94515e
      https://github.com/oracle/dtrace-utils/commit/4c66cfeeb596795c94fbac7be8606d408c94515e
  Author: Nick Alcock <nick.alcock at oracle.com>
  Date:   2021-12-01 (Wed, 01 Dec 2021)

  Changed paths:
    M libdtrace/dt_htab.c
    M libdtrace/dt_htab.h

  Log Message:
  -----------
  htab: add dt_htab_entries

This returns the number of entries added to the hashtable.

We also fix a bug in the per-bucket entry count (used only by
dt_htab_stats): it was being incremented on addition but never
decremented on removal, even though dt_htab_ops.del had better be an
inverse of dt_htab_ops.add, and nentries was being incremented on
addition -- and we don't have tombstones in the bucket chain list, so
the number of nentries in a bucket really does fall on removal.
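
The bookkeeping can be illustrated with a toy chained hash table.  None of the
types or functions below are the dt_htab implementation; they are hypothetical
and exist only to show add and del acting as inverses on both the per-bucket
count and the table-wide count.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 4

struct entry { struct entry *next; int key; };
struct bucket { struct entry *head; size_t nentries; };
struct htab { struct bucket b[NBUCKETS]; size_t nentries; };

static void htab_add(struct htab *h, struct entry *e)
{
    struct bucket *bp = &h->b[(unsigned)e->key % NBUCKETS];

    e->next = bp->head;
    bp->head = e;
    bp->nentries++;
    h->nentries++;
}

/* Returns 1 if the entry was found and removed, 0 otherwise. */
static int htab_del(struct htab *h, struct entry *e)
{
    struct bucket *bp = &h->b[(unsigned)e->key % NBUCKETS];

    for (struct entry **pp = &bp->head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == e) {
            *pp = e->next;
            bp->nentries--;     /* the inverse of the increment in htab_add */
            h->nentries--;
            return 1;
        }
    }
    return 0;
}

static size_t htab_entries(const struct htab *h)
{
    return h->nentries;
}
```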

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: c5b9b89c9f50410eb46d21fee1ec5e1651b5efa9
      https://github.com/oracle/dtrace-utils/commit/c5b9b89c9f50410eb46d21fee1ec5e1651b5efa9
  Author: Nick Alcock <nick.alcock at oracle.com>
  Date:   2021-12-01 (Wed, 01 Dec 2021)

  Changed paths:
    M libdtrace/dt_htab.c

  Log Message:
  -----------
  htab: have dt_htab_destroy del all the elements

This means that a dt_htab_destroy of a non-empty htab doesn't leak memory
and leave all its former elements full of wild pointers.
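
In outline, a destroy routine that owns its elements looks something like the
following.  This is a hypothetical sketch using a single chain rather than a
real hash table, and the names are not from dt_htab.c; the detail worth noting
is that the next pointer is saved before the element's destructor runs.

```c
#include <assert.h>
#include <stdlib.h>

struct elem { struct elem *next; int *payload; };

typedef void elem_del_fn(struct elem *e);

struct list_table { struct elem *head; elem_del_fn *del; };

static int destroyed;   /* counts destructor calls, for illustration only */

/* Per-element destructor: frees what the element owns, then the element. */
static void elem_del(struct elem *e)
{
    free(e->payload);
    free(e);
    destroyed++;
}

/* Destroy the table and every element still in it. */
static void table_destroy(struct list_table *t)
{
    struct elem *e, *next;

    for (e = t->head; e != NULL; e = next) {
        next = e->next;     /* save before del frees e */
        t->del(e);
    }
    free(t);
}
```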

Also add some comments to dt_htab_delete given how many mistakes I made
reading it.

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: bead3613b9c5e98bdb8a3cf8bebcfc55b60042f6
      https://github.com/oracle/dtrace-utils/commit/bead3613b9c5e98bdb8a3cf8bebcfc55b60042f6
  Author: Nick Alcock <nick.alcock at oracle.com>
  Date:   2021-12-01 (Wed, 01 Dec 2021)

  Changed paths:
    M libdtrace/dt_consume.c
    M libdtrace/dt_impl.h

  Log Message:
  -----------
  speculations: use a destructor rather than hand-freeing

Combined with the member-destroying dt_htab_destroy in the previous
commit, this lets us do away with the dt_spec_bufs list and just have
the dt_specs_byid hash own its own members.  (We rename that hash
to dt_spec_bufs because that's just a better name.)

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: fc0e5e778d45d028422f1402ae7ca8640e4a9ce2
      https://github.com/oracle/dtrace-utils/commit/fc0e5e778d45d028422f1402ae7ca8640e4a9ce2
  Author: Nick Alcock <nick.alcock at oracle.com>
  Date:   2021-12-01 (Wed, 01 Dec 2021)

  Changed paths:
    M libdtrace/dt_kernel_module.c
    M libdtrace/dt_kernel_module.h
    M libdtrace/libdtrace.ver

  Log Message:
  -----------
  kernpath: delete the dt_kernpath 'public API wrappers'

These were used by the old module dumper as an escape hatch to get
inside libdtrace and use its facilities to locate kernel modules.

Now the old module dumper is dead, we don't need this any more.

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: 1b022e4d61e25a5a9cde4019dd37e85c907a07fd
      https://github.com/oracle/dtrace-utils/commit/1b022e4d61e25a5a9cde4019dd37e85c907a07fd
  Author: Nick Alcock <nick.alcock at oracle.com>
  Date:   2021-12-01 (Wed, 01 Dec 2021)

  Changed paths:
    M libdtrace/dt_impl.h
    M libdtrace/dt_kernel_module.c
    M libdtrace/dt_kernel_module.h
    M libdtrace/dt_open.c

  Log Message:
  -----------
  htab reduction: kernpath

Replace the special-purpose kernpath hash table with the generic
dt_htab.  This is a simple case which serves as a template for the
more complex ones to follow.

This has very little effect on performance, because even though the htab
is terribly poorly sized, it is sized in the wrong direction: the
kernpath hash only contains out-of-tree modules, and there won't be
hundreds of those.  The main improvement here is in clarity.

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: 06ee5547fe940dd45ac0f89709f3f06243b94aea
      https://github.com/oracle/dtrace-utils/commit/06ee5547fe940dd45ac0f89709f3f06243b94aea
  Author: Nick Alcock <nick.alcock at oracle.com>
  Date:   2021-12-01 (Wed, 01 Dec 2021)

  Changed paths:
    M libdtrace/dt_consume.c
    M libdtrace/dt_htab.c
    M libdtrace/dt_htab.h
    M libdtrace/dt_kernel_module.c

  Log Message:
  -----------
  htab: add an iterator

This commit defines an iterator in the libctf _next style because it's
*so much* nicer to use than the function-calling _iter form:

extern void *dt_htab_next(const dt_htab_t *htab, dt_htab_next_t **it);
extern void dt_htab_next_destroy(dt_htab_next_t *i);

Call dt_htab_next with the htab to iterate over and a dt_htab_next_t
initialized to NULL to allocate an iterator and return the first
element.  Subsequent calls with the allocated iterator will return
further elements from the hash until iteration is complete, at which
time the iterator is freed and reset to NULL, ready for another
iteration cycle.

There are no restrictions whatsoever on what you can do inside iteration
(iterate over other things, iterate over the same thing, fork, longjmp
out of it, etc) except that you should not delete hash entries that the
iterator has not yet returned: if you insert new ones, they may or may
not be returned by this iteration cycle.  If you need to exit early or
longjmp out, dt_htab_next_destroy frees the iterator for you.
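
To show the calling convention without depending on libdtrace, here is a
self-contained toy with the same shape.  table_next() and
table_next_destroy() are hypothetical analogues of dt_htab_next and
dt_htab_next_destroy over a made-up fixed-size table; the real iterator's
internals may differ.  In this sketch an allocation failure is
indistinguishable from end of iteration.

```c
#include <assert.h>
#include <stdlib.h>

#define NBUCKETS 4

struct item { struct item *next; int value; };
struct table { struct item *bucket[NBUCKETS]; };
struct table_next { size_t idx; struct item *cur; };   /* iterator state */

/*
 * Pass *it == NULL to start.  Returns the next element, or NULL when done,
 * at which point the iterator has been freed and *it reset to NULL.
 */
static void *table_next(const struct table *t, struct table_next **it)
{
    struct table_next *i = *it;

    if (i == NULL) {
        i = calloc(1, sizeof(*i));
        if (i == NULL)
            return NULL;
        i->cur = t->bucket[0];
        *it = i;
    }
    /* Skip over empty buckets until an element or the end is found. */
    while (i->cur == NULL && ++i->idx < NBUCKETS)
        i->cur = t->bucket[i->idx];
    if (i->cur == NULL) {
        free(i);
        *it = NULL;
        return NULL;
    }
    struct item *ret = i->cur;
    i->cur = ret->next;     /* remember where to resume on the next call */
    return ret;
}

/* For early exit: release an iterator that has not run to completion. */
static void table_next_destroy(struct table_next *i)
{
    free(i);
}
```

A caller loops with `while ((p = table_next(&t, &it)) != NULL)` and only needs
table_next_destroy() when leaving the loop before it finishes.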

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: d7b29de37c7385c1458293785788b570ef768bfa
      https://github.com/oracle/dtrace-utils/commit/d7b29de37c7385c1458293785788b570ef768bfa
  Author: Nick Alcock <nick.alcock at oracle.com>
  Date:   2021-12-01 (Wed, 01 Dec 2021)

  Changed paths:
    M libdtrace/dt_dof.c
    M libdtrace/dt_impl.h
    M libdtrace/dt_open.c
    M libdtrace/dt_pcb.c
    M libdtrace/dt_probe.c
    M libdtrace/dt_program.c
    M libdtrace/dt_provider.c
    M libdtrace/dt_provider.h

  Log Message:
  -----------
  htab reduction: providers

Providers were stored in fixed-size hash tables and companion
lists.  The fixed-size tables generally were grossly oversized;
with USDT or pid, they were terribly undersized.

Eliminate the lists and replace the hash tables with dt_htab.

We can move to using destructors (allowing deletion of htab elements
without needing a specialized _destroy function, and straightforward
freeing of the dt_provs htab on dtrace_close), but this is a little
troublesome because the dt_provlist is separate from dt_provs, and since
dt_provs element destructors don't have access to a dtp, the destructors
also can't get at the dt_provlist to unchain dt_provs elements that are
being deleted.

We can fix all this by dropping the dt_provlist and replacing it with
dt_htab_next-based iteration over dt_provs. The iteration order is
different, but that's all that changes.

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: 7c20540a1d4326227ff92d39a91574aef9c24400
      https://github.com/oracle/dtrace-utils/commit/7c20540a1d4326227ff92d39a91574aef9c24400
  Author: Nick Alcock <nick.alcock at oracle.com>
  Date:   2021-12-01 (Wed, 01 Dec 2021)

  Changed paths:
    M libdtrace/dt_impl.h
    M libdtrace/dt_module.c
    M libdtrace/dt_module.h
    M libdtrace/dt_open.c

  Log Message:
  -----------
  htab reduction: modules

This one is nice and simple, just the same as all the others.

We can ditch nmods in favour of dt_htab_entries, but cannot remove the
modlist (because the order of modules matters, so htab iteration cannot
replace it).

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


  Commit: 8151fb850b033e30dad98b6db26a6c22ee5df56a
      https://github.com/oracle/dtrace-utils/commit/8151fb850b033e30dad98b6db26a6c22ee5df56a
  Author: Nick Alcock <nick.alcock at oracle.com>
  Date:   2021-12-01 (Wed, 01 Dec 2021)

  Changed paths:
    M libdtrace/dt_impl.h
    M libdtrace/dt_module.c
    M libdtrace/dt_open.c
    M libdtrace/dt_symtab.c
    M libdtrace/dt_symtab.h

  Log Message:
  -----------
  htab reduction: symtab

The symtab is a major performance sink.  The symtabs in a running DTrace
vary from tiny (most modules) to huge (20,000+ entries in the vmlinux
symtab), but all of them are stuffed into fixed-size htabs of size 211,
leading to hash chains hundreds long.  Combine that with the exponential
explosion of symtab lookups at startup time (something to be fixed in a
later commit), and huge amounts of time get spent traversing bucket
chains on every DTrace startup (and a lot of time doubtless gets wasted
at runtime too).  Symbol lookup is also slow: it does one lookup per
loaded module until it finds the right one.

This is silly: most lookups are done over all modules, and for those
that aren't, we can intern all symbol names into a global
(per-dtrace_hdl) htab and walk matching names until we find one in the
right module (most names are unique across all modules, and for all such
names, as well as for names where we only want the first matching one,
there is no need to walk at all). This requires the storage of a module
back-pointer in the symbol so we can populate the module name in
dtrace_lookup_by_name, but that's easy enough.  We also store the hval
of the name so we don't need to recompute it over and over again, since
it can never change.
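
The lookup scheme described above might be sketched as follows.  Everything
here is hypothetical (a tiny fixed-size table, a simple string hash, made-up
struct names) and is not the dt_symtab or dt_htab code; it only shows one
shared name table, a cached hash value per symbol, and a module back-pointer
used to pick the right match when a specific module is requested.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 8      /* tiny fixed size, for the sketch only */

struct module { const char *name; };

struct symbol {
    struct symbol *next;        /* bucket chain */
    const char *name;
    unsigned long hval;         /* cached hash of name */
    const struct module *mod;   /* back-pointer to the owning module */
};

struct symtab { struct symbol *bucket[NBUCKETS]; };

static unsigned long str_hash(const char *s)
{
    unsigned long h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Append at the chain tail so the first-inserted match is found first. */
static void sym_insert(struct symtab *t, struct symbol *sym)
{
    struct symbol **pp;

    sym->hval = str_hash(sym->name);
    sym->next = NULL;
    for (pp = &t->bucket[sym->hval % NBUCKETS]; *pp != NULL; pp = &(*pp)->next)
        ;
    *pp = sym;
}

/* mod == NULL means "any module": return the first symbol with that name. */
static struct symbol *sym_lookup(const struct symtab *t, const char *name,
                                 const struct module *mod)
{
    unsigned long hval = str_hash(name);

    for (struct symbol *s = t->bucket[hval % NBUCKETS]; s != NULL; s = s->next) {
        if (s->hval != hval || strcmp(s->name, name) != 0)
            continue;           /* cheap hval test first, then the string */
        if (mod == NULL || s->mod == mod)
            return s;
    }
    return NULL;
}
```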

We can simplify the dt_symbol a bit, throwing out explicit storage of
the offset pointer and explicit checking of the DT_ST_PACKED flag on
every name lookup, and instead just replace the name pointer in each
symbol with a pointer into the strtab when packing happens.  We still
need to check the packed flag at destroy time to see whether we need to
free the string, but this simplifies all lookups and in particular makes
it easier than it would otherwise be to look up the symbol name when
doing a htab comparison.  Since the dt_htab naturally supports multiple
entries with the same name (returning the first inserted by default,
unless the caller explicitly walks through them, as we do when a symbol
from a specific module is requested) this lets us throw out the entirety
of the dt_symtab_purge complexity: should it turn out to be actually
important (if there are vast numbers of symbols with the same name
in a lot of modules), it's much easier to implement by just walking
the htab links now, just as is done in dt_module_symbol_by_name.

The only extra bit of fragility this adds is that destruction of the
centralized dt_kernsym_t makes all symbol name lookups fail, since we no
longer have any per-symbol hashtabs.  So we add a new dt_module_fini
that ensures that modules are destroyed right after the kernsym hashtab,
so there is no room for any name lookups to sneak in.

Speeds up startup from ~3.8s/invocation to ~0.9s/invocation.  Speeds up
a full test run by about 20 minutes, which is less than I would expect
given that we have a thousand-odd tests: I suspect cache effects or
something. Still, it's faster than it was.

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>


Compare: https://github.com/oracle/dtrace-utils/compare/ce6bb1ed4631...8151fb850b03


