[DTrace-devel] [PATCH v2 11/11] htab reduction: symtab
Eugene Loh
eugene.loh at oracle.com
Mon Oct 25 11:19:03 PDT 2021
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
And...
On 10/20/21 2:54 PM, Nick Alcock wrote:
> The symtab is a major performance sink. The symtabs in a running DTrace
> vary from tiny (most modules) to huge (20,000+ entries in the vmlinux
> symtab), but all of them are stuffed into fixed-size htabs of size 211,
> leading to hash chains hundreds long. Combine that with the exponential
I already asked about this. Why exponential? You said it's O(nm). That
is not exponential. Exponential would be O(2^n)... or so.
> explosion of symtab lookups at startup time (something to be fixed in a
> later commit), and huge amounts of time get spent traversing bucket
> chains on every DTrace startup (and a lot of time doubtless gets wasted
> at runtime too). Symbol lookup is also slow: it does one lookup per
> loaded module until it finds the right one.
>
> The only extra bit of fragility this adds is that destruction of the
> centralized dt_kernsym_t makes all symbol name lookups fail, since we no
> longer have any per-symbol hashtabs. So we add a new dt_module_fini
> that ensures that modules are destroyed right after the kernsym hashtab,
> so there is no room for any name lookups to sneak in.
I do not understand this. Nor do I see a dt_module_fini().
> diff --git a/libdtrace/dt_symtab.c b/libdtrace/dt_symtab.c
> @@ -29,14 +29,13 @@
>
> struct dt_symbol {
> dt_list_t dts_list; /* list forward/back pointers */
Not a part of this patch, but so far as I can tell dts_list has been
orphaned since its introduction a decade ago in 9f987758bc9a .
> - union {
> - char *str; /* symbol name */
> - size_t off; /* symbol offset in strtab */
> - } dts_name;
> + char *dts_name; /* symbol name */
> GElf_Addr dts_addr; /* symbol address */
> GElf_Xword dts_size; /* symbol size */
> unsigned char dts_info; /* ELF symbol information */
> - struct dt_symbol *dts_next; /* hash chain pointer */
> + uint32_t dts_hval; /* cached hash value (for speed) */
> + dt_module_t *dts_dmp; /* module this symbol is contained within */
> + struct dt_hentry dts_he; /* htab links */
> };
>
> /*
> @@ -57,8 +56,6 @@ typedef struct dt_symrange {
>
> struct dt_symtab {
> dt_list_t dtst_symlist; /* symbol list */
I do not understand why dtst_symlist is still included. Can't it be
replaced by the iterator?
> - dt_symbol_t **dtst_syms_by_name;/* symbol name->addr hash buckets */
> - uint_t dtst_symbuckets; /* number of buckets */
> char *dtst_strtab; /* string table of symbol names */
> dt_symrange_t *dtst_ranges; /* range->symbol mapping */
> uint_t dtst_num_range; /* - number of ranges */
More information about the DTrace-devel
mailing list