[DTrace-devel] [PATCH v3 08/11] htab: add an iterator

Thu Nov 4 13:28:59 UTC 2021

On 2 Nov 2021, Eugene Loh told this:

> On 11/1/21 10:29 AM, Nick Alcock wrote:
>> This commit defines an iterator in the libctf _next style because it's
>> *so much* nicer to use than the function-calling _iter form:
>>
>> extern void *dt_htab_next(const dt_htab_t *htab, dt_htab_next_t **it);
>> extern void dt_htab_next_destroy(dt_htab_next_t *i);
>>
>> Call dt_htab_next with the htab to iterate over and a dt_htab_next_t
>> initialized to NULL. It allocates an iterator, returns new elements from
>> the hash until iteration is complete, then frees the iterator and
>> re-annuls it, ready for another iteration cycle.
>
> I suppose all of that is true, but it's a little deceptive since the 
> call does different subsets of those things in different calls.  How about:
>
>      Call dt_htab_next with the htab to iterate over and a dt_htab_next_t
>      initialized to NULL to allocate an iterator and return the first 
> element.
>      Subsequent calls with the allocated iterator will return further 
> elements
>      from the hash until iteration is complete, at which time the 
> iterator is
>      freed and reset to NULL, ready for another iteration cycle.

Oh that's nice! (Except for the staircasing, which your mailer seems to
be doing a lot :) either it should be inserting linefeeds or you should
be doing that by hand, but not both.)

>> There are no restrictions whatsoever on what you can do inside iteration
>> (iterate over other things, fork, longjmp out of it, you name it) except
>> that you should not delete hash entries that the iterator has not yet
>> returned: if you insert new ones, they may or may not be returned by
>> this iteration cycle.
>
> Phrases like "whatsoever" and "you name it" suggest one message, while 
> "except" etc. suggest another.  How about:

It's saying "you can do anything, with these limitations".

>      You can delete any entry already returned this iteration cycle.
>      If you insert entries, they may or may not be returned in this
>      iteration cycle.

I dropped the "you name it" but didn't adopt ths, because it drops the
fairly important point that iteration loops are *just loops*. (I put
that in because users over in binutils were worrying about whether you
can even *call functions* from inside iteration, let alone do other
iterations or fork or the like. Hell, you can even start another
iteration over a hash you're already in the middle of iterating over.)

I rephrased it a bit and moved some of the previous para down into it.

>> diff --git a/libdtrace/dt_consume.c b/libdtrace/dt_consume.c
>> @@ -409,6 +409,7 @@ static dt_htab_ops_t dt_spec_buf_htab_ops = {
>>   	.hval = (htab_hval_fn)dt_spec_buf_hval,
>>   	.cmp = (htab_cmp_fn)dt_spec_buf_cmp,
>>   	.add = (htab_add_fn)dt_spec_buf_add,
>> +	.next = (htab_next_fn)dt_spec_buf_next,
>>   	.del = (htab_del_fn)dt_spec_buf_del_buf
>>   };
>
> Sorry if I muddled my earlier comment on this.  In dt_htab.h, next 
> consistently follows del.  (In dt_htab.c, there is similar ordering, 
> even if admittedly the details are different.)  Anyhow, stick next 
> *after* del in dt_consume.c just for consistency.

Ah! Hm, which is most consistently followed... dt_htab.h is the odd man
out, but it's also the nicest ordering. Adjusted all the others.

I also added .next to the top-of-dt_htab.c comment.

>> +	/*
>> +	 * Capture the value we will return after advancing the iterator, and
>> +	 * handle end-of-iteration.
>> +	 */
>> +
>> +	ret = i->ent;
>> +	if (i->exhausted) {
>> +		free(i);
>> +		*it = NULL;
>> +		return NULL;
>> +	}
>
> This is okay but how about doing the i->exhausted check first. If it 
> fails, then it makes sense to set ret=i->ent.

I vacillated on this repeatedly, and then picked the wrong one :) fixed.

-- 
NULL && (void)