[DTrace-devel] [PATCH 3/7] htab: add dt_htab_insert_unique
Eugene Loh
eugene.loh at oracle.com
Thu Oct 7 20:16:30 PDT 2021
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>
Incidentally, an alternative to introducing new_bucket is, instead of
setting new_bucket=1, just set no_dups=0. That is, the "no dups" flag
becomes irrelevant if you're introducing a new bucket anyhow. Or,
branch to "add:" only if no_dups==0. It just seems weird to be using
both new_bucket and no_dups. But not a big deal. So ignore these
suggestions if they strike you as worse than the problem.
Also, though this is "unrelated" to the patch, I was wondering if you
could explain the pre-existing code. We enter the insert function. We
look for a match. If none, we'll add a bucket and therefore check the
htab size. If needed, we resize the htab. At that point, we "retry."
Why? If we failed to find a bucket the first time, why should we find
one now? Or, are we rerunning the search merely for the side effect of
recomputing idx?
On 10/5/21 9:38 AM, Nick Alcock wrote:
> Some uses of htabs want to insert items only if they don't already exist
> (duplicates being rather hard to fish out of the htab again). This is
> doable by doing a lookup first, but when inserting large numbers of
> items, this is a bit of a waste of time, given that insertion already
> knows whether an item matching the key already exists (because it has
> to create a new bucket otherwise).
>
> So split dt_htab_insert into an internal function that can insert dups
> or not, then introduce a new dt_htab_insert_unique that calls it.
>
> Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
> ---
> libdtrace/dt_htab.c | 26 ++++++++++++++++++++++++--
> libdtrace/dt_htab.h | 1 +
> 2 files changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/libdtrace/dt_htab.c b/libdtrace/dt_htab.c
> index 8e99987d06f1..ff80738f17e6 100644
> --- a/libdtrace/dt_htab.c
> +++ b/libdtrace/dt_htab.c
> @@ -119,12 +119,14 @@ static int resize(dt_htab_t *htab)
>
> /*
> * Add an entry to the hashtable. Resize if necessary, and allocate a new
> - * bucket if necessary.
> + * bucket if necessary. Optionally skip insertion if this bucket (-> key)
> + * is already present.
> */
> -int dt_htab_insert(dt_htab_t *htab, void *entry)
> +static int dt_htab_insert_internal(dt_htab_t *htab, void *entry, int no_dups)
> {
> uint32_t hval = htab->ops->hval(entry);
> int idx;
> + int new_bucket = 0;
> dt_hbucket_t *bucket;
>
> retry:
> @@ -144,6 +146,7 @@ retry:
> goto retry;
> }
>
> + new_bucket = 1;
> bucket = malloc(sizeof(dt_hbucket_t));
> if (!bucket)
> return -ENOMEM;
> @@ -156,12 +159,31 @@ retry:
> htab->nbuckets++;
>
> add:
> + if (!new_bucket && no_dups)
> + return 0;
> +
> bucket->head = htab->ops->add(bucket->head, entry);
> bucket->nentries++;
>
> return 0;
> }
>
> +/*
> + * Add an entry to the hashtable.
> + */
> +int dt_htab_insert(dt_htab_t *htab, void *entry)
> +{
> + return dt_htab_insert_internal(htab, entry, 0);
> +}
> +
> +/*
> + * Add an entry to the hashtable, adding only if not already present.
> + */
> +int dt_htab_insert_unique(dt_htab_t *htab, void *entry)
> +{
> + return dt_htab_insert_internal(htab, entry, 1);
> +}
> +
> /*
> * Find an entry in the hashtable.
> */
> diff --git a/libdtrace/dt_htab.h b/libdtrace/dt_htab.h
> index 3a716357362f..9da72163a9dd 100644
> --- a/libdtrace/dt_htab.h
> +++ b/libdtrace/dt_htab.h
> @@ -87,6 +87,7 @@ typedef struct dt_htab dt_htab_t;
> extern dt_htab_t *dt_htab_create(struct dtrace_hdl *dtp, dt_htab_ops_t *ops);
> extern void dt_htab_destroy(struct dtrace_hdl *dtp, dt_htab_t *htab);
> extern int dt_htab_insert(dt_htab_t *htab, void *entry);
> +extern int dt_htab_insert_unique(dt_htab_t *htab, void *entry);
> extern void *dt_htab_lookup(const dt_htab_t *htab, const void *entry);
> extern int dt_htab_delete(dt_htab_t *htab, void *entry);
> extern void dt_htab_stats(const char *name, const dt_htab_t *htab);
More information about the DTrace-devel
mailing list