[DTrace-devel] [PATCH v2 11/11] htab reduction: symtab

Eugene Loh eugene.loh at oracle.com
Mon Oct 25 11:19:03 PDT 2021


Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
And...

On 10/20/21 2:54 PM, Nick Alcock wrote:
> The symtab is a major performance sink.  The symtabs in a running DTrace
> vary from tiny (most modules) to huge (20,000+ entries in the vmlinux
> symtab), but all of them are stuffed into fixed-size htabs of size 211,
> leading to hash chains hundreds long.  Combine that with the exponential

I already asked about this.  Why exponential?  You said it's O(nm). That 
is not exponential.  Exponential would be O(2^n)... or so.

> explosion of symtab lookups at startup time (something to be fixed in a
> later commit), and huge amounts of time get spent traversing bucket
> chains on every DTrace startup (and a lot of time doubtless gets wasted
> at runtime too).  Symbol lookup is also slow: it does one lookup per
> loaded module until it finds the right one.
>
> The only extra bit of fragility this adds is that destruction of the
> centralized dt_kernsym_t makes all symbol name lookups fail, since we no
> longer have any per-symbol hashtabs.  So we add a new dt_module_fini
> that ensures that modules are destroyed right after the kernsym hashtab,
> so there is no room for any name lookups to sneak in.

I do not understand this.  Nor do I see a dt_module_fini().

> diff --git a/libdtrace/dt_symtab.c b/libdtrace/dt_symtab.c
> @@ -29,14 +29,13 @@
>   
>   struct dt_symbol {
>   	dt_list_t dts_list;		/* list forward/back pointers */

Not a part of this patch, but so far as I can tell dts_list has been 
orphaned since its introduction a decade ago in 9f987758bc9a .

> -	union {
> -		char *str;		/* symbol name */
> -		size_t off;		/* symbol offset in strtab */
> -	} dts_name;
> +	char *dts_name;			/* symbol name */
>   	GElf_Addr dts_addr;		/* symbol address */
>   	GElf_Xword dts_size;		/* symbol size */
>   	unsigned char dts_info;		/* ELF symbol information */
> -	struct dt_symbol *dts_next;	/* hash chain pointer */
> +	uint32_t dts_hval;		/* cached hash value (for speed) */
> +	dt_module_t *dts_dmp;		/* module this symbol is contained within */
> +	struct dt_hentry dts_he;	/* htab links */
>   };
>   
>   /*
> @@ -57,8 +56,6 @@ typedef struct dt_symrange {
>   
>   struct dt_symtab {
>   	dt_list_t dtst_symlist;		/* symbol list */

I do not understand why dtst_symlist is still included.  Can't it be 
replaced by the iterator?

> -	dt_symbol_t **dtst_syms_by_name;/* symbol name->addr hash buckets */
> -	uint_t dtst_symbuckets;		/* number of buckets */
>   	char *dtst_strtab;		/* string table of symbol names */
>   	dt_symrange_t *dtst_ranges;	/* range->symbol mapping */
>   	uint_t dtst_num_range;		/*   - number of ranges */



More information about the DTrace-devel mailing list