DPDK patches and discussions
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: Michel Machado <michel@digirati.com.br>,
	"Fu, Qiaobin" <qiaobinf@bu.edu>,
	 "Richardson, Bruce" <bruce.richardson@intel.com>,
	"De Lara Guarch, Pablo" <pablo.de.lara.guarch@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"Doucette, Cody, Joseph" <doucette@bu.edu>,
	 "Wang, Yipeng1" <yipeng1.wang@intel.com>,
	"Wiles, Keith" <keith.wiles@intel.com>,
	"Gobriel, Sameh" <sameh.gobriel@intel.com>,
	"Tai, Charlie" <charlie.tai@intel.com>,
	Stephen Hemminger <stephen@networkplumber.org>, nd <nd@arm.com>
Subject: Re: [dpdk-dev] [PATCH v2] hash table: add an iterator over conflicting entries
Date: Fri, 17 Aug 2018 19:41:17 +0000
Message-ID: <AM6PR08MB3672A461DF5F8D29F607846C983D0@AM6PR08MB3672.eurprd08.prod.outlook.com>
In-Reply-To: <5e809298-ee0e-f03f-e83a-59b764e3a9b8@digirati.com.br>



-----Original Message-----
From: Michel Machado <michel@digirati.com.br> 
Sent: Friday, August 17, 2018 8:35 AM
To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Fu, Qiaobin <qiaobinf@bu.edu>; Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
Cc: dev@dpdk.org; Doucette, Cody, Joseph <doucette@bu.edu>; Wang, Yipeng1 <yipeng1.wang@intel.com>; Wiles, Keith <keith.wiles@intel.com>; Gobriel, Sameh <sameh.gobriel@intel.com>; Tai, Charlie <charlie.tai@intel.com>; Stephen Hemminger <stephen@networkplumber.org>; nd <nd@arm.com>
Subject: Re: [dpdk-dev] [PATCH v2] hash table: add an iterator over conflicting entries

On 08/16/2018 10:33 PM, Honnappa Nagarahalli wrote:
> +/* Get the primary bucket index given the precomputed hash value. */
> +static inline uint32_t
> +rte_hash_get_primary_bucket(const struct rte_hash *h, hash_sig_t sig)
> +{
> +	return sig & h->bucket_bitmask;
> +}
> +
> +/* Get the secondary bucket index given the precomputed hash value. */
> +static inline uint32_t
> +rte_hash_get_secondary_bucket(const struct rte_hash *h, hash_sig_t sig)
> +{
> +	return rte_hash_secondary_hash(sig) & h->bucket_bitmask;
> +}
> +
> IMO, to keep the code consistent, we do not need to have the above 2 functions.

    Ok.

> +int32_t __rte_experimental
> +rte_hash_iterate_conflict_entries(struct rte_conflict_iterator_state *state,
> +	const void **key, const void **data) {
> +	struct rte_hash_iterator_conflict_entries_state *__state;
> +
> +	RETURN_IF_TRUE(((state == NULL) || (key == NULL) ||
> +		(data == NULL)), -EINVAL);
> +
> +	__state = (struct rte_hash_iterator_conflict_entries_state *)state;
> +
> +	while (__state->vnext < RTE_HASH_BUCKET_ENTRIES * 2) {
> +		uint32_t bidx = (__state->vnext < RTE_HASH_BUCKET_ENTRIES) ?
> +			__state->primary_bidx : __state->secondary_bidx;
> +		uint32_t next = __state->vnext & (RTE_HASH_BUCKET_ENTRIES - 1);
> +		uint32_t position = __state->h->buckets[bidx].key_idx[next];
> +		struct rte_hash_key *next_key;
> +		/*
> +		 * The test below is unlikely because this iterator is meant
> +		 * to be used after a failed insert.
> +		 * */
> +		if (unlikely(position == EMPTY_SLOT))
> +			goto next;
> +
> +		/* Get the entry in key table. */
> +		next_key = (struct rte_hash_key *) (
> +			(char *)__state->h->key_store +
> +			position * __state->h->key_entry_size);
> +		/* Return key and data. */
> +		*key = next_key->key;
> +		*data = next_key->pdata;
> +
> +next:
> +		/* Increment iterator. */
> +		__state->vnext++;
> +
> +		if (likely(position != EMPTY_SLOT))
> +			return position - 1;
> +	}
> +
> +	return -ENOENT;
> +}
> 
> 
> I think, we can make this API similar to 'rte_hash_iterate'. I suggest the following API signature:
> 
> int32_t
> rte_hash_iterate_conflict_entries(const struct rte_hash *h,
> 	const void **key, void **data, hash_sig_t sig, uint32_t *next)

    The goal of our interface is to support changing the underlying hash table algorithm without requiring changes in applications. As Yipeng1 Wang exemplified in the discussion of the first version of this patch, "in future, rte_hash may use three hash functions, or as I mentioned each bucket may have an additional linked list or even a second level hash table, or if the hopscotch hash replaces cuckoo hash as the new algorithm." These new algorithms may require more state than sig and next can efficiently provide in order to browse the conflicting entries.

Thank you for your explanation. I think 64B is a good size for the state. The same should apply to the 'rte_hash_iterate' API as well: it currently has 4B of state (if 'sig' is kept out) and depends on the current hash algorithm.

Can you elaborate more on using 'struct rte_conflict_iterator_state' as the argument for the API?

If the API signature is changed to 'rte_hash_iterate_conflict_entries(const struct rte_hash *h, void **key, void **data, const hash_sig_t sig, struct rte_conflict_iterator_state *state)', it will be in line with the existing APIs. The contents of 'state' must be initialized to 0 before the first call. This also avoids the need for a separate 'rte_hash_iterator_conflict_entries_init' API.
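
To make the calling convention concrete, here is a minimal, self-contained sketch of an iterator whose opaque state is zeroed by the caller instead of being set up by an init function. Everything prefixed with toy_ or TOY_ is hypothetical and only stands in for the real rte_hash internals; the key/data output arguments are left out for brevity, and the secondary hash is an arbitrary placeholder, not DPDK's function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUCKET_ENTRIES 4u /* stand-in for RTE_HASH_BUCKET_ENTRIES */
#define TOY_NUM_BUCKETS    8u /* must be a power of two */
#define TOY_EMPTY_SLOT     0u

/* Opaque, caller-allocated state: 64 bytes, as in the patch. */
struct toy_iterator_state {
	uint8_t space[64];
};

/* Private layout; only the library side interprets the bytes. */
struct toy_private_state {
	uint32_t vnext; /* next virtual slot to visit; 0 means "start" */
};

static_assert(sizeof(struct toy_private_state) <=
	sizeof(struct toy_iterator_state), "private state must fit");

struct toy_hash {
	uint32_t bucket_bitmask;
	uint32_t key_idx[TOY_NUM_BUCKETS][TOY_BUCKET_ENTRIES];
};

/* Placeholder secondary hash (assumption, not the DPDK one). */
static uint32_t
toy_secondary_hash(uint32_t sig)
{
	return sig ^ (((sig >> 12) + 1u) * 0x5bd1e995u);
}

/*
 * Return the index of the next conflicting entry (position - 1),
 * -ENOENT when both buckets are exhausted, -EINVAL on bad arguments.
 */
static int32_t
toy_iterate_conflict_entries(const struct toy_hash *h, uint32_t sig,
	struct toy_iterator_state *state)
{
	struct toy_private_state priv;
	int32_t ret = -ENOENT;

	if (h == NULL || state == NULL)
		return -EINVAL;

	/* Copy in and out with memcpy to avoid aliasing the byte array. */
	memcpy(&priv, state->space, sizeof(priv));

	while (priv.vnext < TOY_BUCKET_ENTRIES * 2u) {
		uint32_t hash = (priv.vnext < TOY_BUCKET_ENTRIES) ?
			sig : toy_secondary_hash(sig);
		uint32_t bidx = hash & h->bucket_bitmask;
		uint32_t slot = priv.vnext % TOY_BUCKET_ENTRIES;
		uint32_t position = h->key_idx[bidx][slot];

		priv.vnext++;
		if (position != TOY_EMPTY_SLOT) {
			ret = (int32_t)(position - 1u);
			break;
		}
	}

	memcpy(state->space, &priv, sizeof(priv));
	return ret;
}

/* Caller side: zero the state once, then iterate until an error. */
static int
toy_count_conflicts(const struct toy_hash *h, uint32_t sig)
{
	struct toy_iterator_state state;
	int n = 0;

	memset(&state, 0, sizeof(state)); /* replaces a separate init call */
	while (toy_iterate_conflict_entries(h, sig, &state) >= 0)
		n++;
	return n;
}
```

The trade-off in this sketch is that an all-zero state has to be a valid starting point for every future algorithm, which constrains how the private layout can evolve.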


> I also suggest changing the API name to 'rte_hash_iterate_bucket_entries' - 'bucket' is a well understood term in the context of hash algorithms.

    It's a matter of semantics here. rte_hash_iterate_conflict_entries() may cross more than one bucket. In fact, the first version of this patch tried to do exactly that, but it exposes the underlying algorithm. In addition, future algorithms may stretch what is being browsed even further.

I agree it is a matter of semantics. From the user/application point of view, the algorithm implemented should not matter. 'conflict_entries' definitely conveys the meaning; I think this is nothing but 'entries in a bucket' in the context of a hash table. Maybe Yipeng can reconsider his comment?

> Do we also need to have 'rte_hash_iterate_conflict_entries_with_hash' API?

    I may not have understood the question. We are already working with the hash (i.e. sig). Did you mean something else?

Let me elaborate. For the API 'rte_hash_lookup', there are multiple variations such as 'rte_hash_lookup_with_hash', 'rte_hash_lookup_data', 'rte_hash_lookup_with_hash_data' etc. We do not need to create similar variations for 'rte_hash_iterate_conflict_entries' API right now. But the naming of the API should be such that these variations can be created in the future.

> diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h 
> index f71ca9fbf..7ecb6a7eb 100644
> --- a/lib/librte_hash/rte_hash.h
> +++ b/lib/librte_hash/rte_hash.h
> @@ -61,6 +61,11 @@ struct rte_hash_parameters {
>   /** @internal A hash table structure. */
>   struct rte_hash;
>   
> +/** @internal A hash table conflict iterator state structure. */ 
> +struct rte_conflict_iterator_state {
> +	uint8_t space[64];
> +};
> +
Needs aligning to cache line.
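
As a rough illustration of what I mean, the sketch below uses standard C11 alignas on the opaque state and checks the size and alignment at compile time. The sketch_ names and the 64-byte line size are assumptions for the example only; in DPDK itself the __rte_cache_aligned attribute and RTE_CACHE_LINE_SIZE would be the usual way to express this.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Assumed cache line size for this sketch only. */
#define SKETCH_CACHE_LINE_SIZE 64

/* Opaque iterator state, aligned so it never straddles two cache lines. */
struct sketch_iterator_state {
	alignas(SKETCH_CACHE_LINE_SIZE) uint8_t space[64];
};

/* Compile-time checks: exactly one cache line, cache-line aligned. */
static_assert(sizeof(struct sketch_iterator_state) == 64,
	"state should occupy exactly one cache line");
static_assert(alignof(struct sketch_iterator_state) ==
	SKETCH_CACHE_LINE_SIZE, "state should be cache-line aligned");

/* Return 1 if the given state object starts on a cache-line boundary. */
static int
sketch_state_is_aligned(const struct sketch_iterator_state *s)
{
	return ((uintptr_t)s % SKETCH_CACHE_LINE_SIZE) == 0;
}
```

With the alignment on the type, a state declared as a local variable is aligned by the compiler, so the application still avoids a memory allocation.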

> 
> The size depends on the current size of the state, which is subject to change with the algorithm used.

    We chose a size that should be robust for any future underlying algorithm. Do you have a suggestion on how to go about it? We chose to have a simple struct to enable applications to allocate a state as a local variable and avoid a memory allocation.

This looks fine after your explanation. The structure name can be changed to 'rte_iterator_state' so that it can be used in other iterator APIs too.

[ ]'s
Michel Machado
