[DTrace-devel] [PATCH 6/7] htab reduction: symtab

Eugene Loh eugene.loh at oracle.com
Mon Oct 11 13:02:12 PDT 2021


I read through this and I confess I kept losing sight of the forest for 
the trees, but
     Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
And picky comments as usual:

On 10/5/21 9:38 AM, Nick Alcock wrote:
> The symtab is a major performance sink.  The symtabs in a running DTrace
> vary from tiny (most modules) to huge (20,000+ entries in the vmlinux
> symtab), but all of them are stuffed into fixed-size htabs of size 211,
> leading to hash chains hundreds long.  Combine that with the exponential
I'm not familiar with this.  In what way is it exponential?
> explosion of symtab lookups at startup time (something to be fixed in a
> later commit), and huge amounts of time get spent traversing bucket
> chains on every DTrace startup (and a lot of time doubtless gets wasted
> at runtime too).  Symbol lookup is also slow: it does one lookup per
> loaded module until it finds the right one.
>
> diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c
> @@ -1453,9 +1462,6 @@ dt_module_from_object(dtrace_hdl_t *dtp, const char *object)
>    * Only the st_info, st_value, and st_size fields of the GElf_Sym are guaranteed
>    * to be populated: the st_shndx is populated but its only meaningful value is
>    * SHN_UNDEF versus !SHN_UNDEF.
> - *
> - * XXX profile this: possibly introduce a symbol->name mapping for
> - * first-matching-symbol across all modules.
>    */
>   int
>   dtrace_lookup_by_name(dtrace_hdl_t *dtp, const char *object, const char *name,
s/XXX/FIXME/
> diff --git a/libdtrace/dt_symtab.c b/libdtrace/dt_symtab.c
> @@ -145,8 +142,24 @@ static int dt_symbol_search_cmp(const void *lp, const void *rp)
> +static uint32_t
> +dt_symtab_hval(const dt_symbol_t *sym, void *arg)
> +{
> +	return sym->dts_hval ? sym->dts_hval : str2hval(sym->dts_name, 0);
> +}
How about caching the computed value?  E.g.,
         if (sym->dts_hval == 0)
             sym->dts_hval = str2hval(sym->dts_name, 0);
         return sym->dts_hval;
Perhaps such caching is unnecessary, but we're checking sym->dts_hval 
anyhow.

> @@ -155,12 +168,10 @@ dt_symtab_create(void)
>   
>   	memset(symtab, 0, sizeof(struct dt_symtab));
>   
> -	symtab->dtst_symbuckets = _dtrace_strbuckets;
> -	symtab->dtst_syms_by_name = calloc(symtab->dtst_symbuckets,
> -	    sizeof(struct dt_symbol *));
> +	if (!dtp->dt_kernsyms)
> +		dtp->dt_kernsyms = dt_htab_create(dtp, &dt_symtab_htab_ops);
>   
> -	if (symtab->dtst_syms_by_name == NULL) {
> -		free(symtab->dtst_syms_by_name);
> +	if (dtp->dt_kernsyms == NULL) {
>   		free(symtab);
>   		return NULL;
>   	}
We check twice if dtp->dt_kernsyms is NULL:  once to see if 
dt_htab_create() should be called and again to see if the call was 
successful.  How about using syntactically similar expressions to check 
semantically identical checks?

> @@ -207,36 +229,28 @@ dt_symbol_insert(dt_symtab_t *symtab, const char *name,
>   
>   	if ((dtsp = malloc(sizeof(dt_symbol_t))) == NULL)
> -		return NULL;
> +		goto oom;
>   
>   	if (symtab->dtst_num_range >= symtab->dtst_num_range_alloc)
> -		if (dt_symtab_grow_ranges(symtab) == NULL) {
> -			free(dtsp);
> -			return NULL;
> -		}
> +		if (dt_symtab_grow_ranges(symtab) == NULL)
> +			goto oom;
oom?  So if the dtsp=malloc() is successful but the 
dt_symtab_grow_ranges() is not, we go to oom, where we free the 
uninitialized pointer dtsp->dts_name.  Not a good idea.  I don't think 
there's much value in channeling all those code paths through a common 
oom exit path.  The various "return NULL" statements seem fine to me.
> -	if (size > 0) {
> -		symtab->dtst_ranges[symtab->dtst_num_range].dtsr_sym = dtsp;
> -		symtab->dtst_num_range++;
> -	}
> @@ -244,36 +258,48 @@ dt_symbol_insert(dt_symtab_t *symtab, const char *name,
> +	if (size > 0) {
> +		symtab->dtst_ranges[symtab->dtst_num_range].dtsr_sym = dtsp;
> +		symtab->dtst_num_range++;
> +	}
Okay, so that code is moving, but while we're at it might as well 
perhaps change to a more common idiom:
     if (size > 0)
         symtab->dtst_range[symtab->dtst_num_range++].dtsr_sym = dtsp;



More information about the DTrace-devel mailing list