DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Wang, Yipeng1" <yipeng1.wang@intel.com>
To: Thomas Monjalon <thomas@monjalon.net>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	"Lilijun (Jerry)" <jerry.lilijun@huawei.com>
Cc: "'dev@dpdk.org'" <dev@dpdk.org>,
	"stable@dpdk.org" <stable@dpdk.org>,
	wangyunjian <wangyunjian@huawei.com>,
	xudingke <xudingke@huawei.com>
Subject: Re: [dpdk-dev] [dpdk-stable] 答复:  [PATCH] lib/librte_hash: add rte_hash_del_key_fixed without compact
Date: Wed, 24 Mar 2021 23:25:31 +0000	[thread overview]
Message-ID: <BYAPR11MB3494C7002F911BC94E7806D5C3639@BYAPR11MB3494.namprd11.prod.outlook.com> (raw)
In-Reply-To: <5797361.OYrOJzcSuM@thomas>

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Wednesday, March 24, 2021 2:33 PM
> To: Wang, Yipeng1 <yipeng1.wang@intel.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>
> Cc: 'dev@dpdk.org' <dev@dpdk.org>; stable@dpdk.org; wangyunjian
> <wangyunjian@huawei.com>; xudingke <xudingke@huawei.com>
> Subject: Re: [dpdk-stable] 答复: [dpdk-dev] [PATCH] lib/librte_hash: add
> rte_hash_del_key_fixed without compact
> 
> Sorry for not following the full thread.
> What is the conclusion please?
> 
[Wang, Yipeng] 
Hi, I reviewed my previous comment, I recall that this patch is an RFC for a new API.

The proposed API is not directly appliable for now since it discloses the internals of the implementation to user.

I am waiting for Jerry to post a new RFC, and more details of the use case to motivate the change.
Thomas, you could change the status accordingly.


> 13/05/2020 03:28, Lilijun (Jerry):
> >
> > > -----邮件原件-----
> > > 发件人: Wang, Yipeng1 [mailto:yipeng1.wang@intel.com]
> > > 发送时间: 2020年5月13日 7:41
> > > 收件人: Lilijun (Jerry) <jerry.lilijun@huawei.com>; Honnappa
> > > Nagarahalli <Honnappa.Nagarahalli@arm.com>; 'dev@dpdk.org'
> > > <dev@dpdk.org>
> > > 抄送: wangyunjian <wangyunjian@huawei.com>; xudingke
> > > <xudingke@huawei.com>; 'stable@dpdk.org' <stable@dpdk.org>; nd
> > > <nd@arm.com>; nd <nd@arm.com>
> > > 主题: RE: [dpdk-dev] [PATCH] lib/librte_hash: add
> > > rte_hash_del_key_fixed without compact
> > >
> > > > -----Original Message-----
> > > > From: Lilijun (Jerry) <jerry.lilijun@huawei.com>
> > > > Sent: Tuesday, May 5, 2020 6:10 PM
> > > > To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> > > > 'dev@dpdk.org' <dev@dpdk.org>
> > > > Cc: wangyunjian <wangyunjian@huawei.com>; xudingke
> > > > <xudingke@huawei.com>; 'stable@dpdk.org' <stable@dpdk.org>; nd
> > > > <nd@arm.com>; Wang, Yipeng1 <yipeng1.wang@intel.com>; nd
> > > <nd@arm.com>
> > > > Subject: 答复: [dpdk-dev] [PATCH] lib/librte_hash: add
> > > > rte_hash_del_key_fixed without compact
> > > >
> > > > > -----邮件原件-----
> > > > > 发件人: Honnappa Nagarahalli
> [mailto:Honnappa.Nagarahalli@arm.com]
> > > > > 发送时间: 2020年5月6日 7:18
> > > > > 收件人: Lilijun (Jerry) <jerry.lilijun@huawei.com>; 'dev@dpdk.org'
> > > > > <dev@dpdk.org>
> > > > > 抄送: wangyunjian <wangyunjian@huawei.com>; xudingke
> > > > > <xudingke@huawei.com>; 'stable@dpdk.org' <stable@dpdk.org>; nd
> > > > > <nd@arm.com>; Honnappa Nagarahalli
> > > <Honnappa.Nagarahalli@arm.com>;
> > > > > yipeng1.wang@intel.com; nd <nd@arm.com>
> > > > > 主题: RE: [dpdk-dev] [PATCH] lib/librte_hash: add
> > > > > rte_hash_del_key_fixed without compact
> > > > >
> > > > > <snip>
> > > > >
> > > > > Adding Yipeng, maintainer for hash library
> > > > >
> > > > > >
> > > > > > Thanks for your reply.
> > > > > >
> > > > > > Using rte_hash iterate and delete keys is to free the related
> > > > > > data's
> > > > memory.
> > > > > > There are two reasons why rte_hash_reset() is not properly:
> > > > > > 1)  the reset function just clear all keys, the key's related
> > > > > > data are
> > > leaked.
> > > > > That is a good point. I think this should be documented in the API.
> > >
> > > [Yipeng]
> > > By leaking, do you mean that you keep data in separate places and
> > > the pointers to them are missed from table after reset? Can you keep
> > > data in an array and iterate that array instead?
> > [Lilijun (Jerry)]
> > Yes, the data pointers in rte hash table are missed after reset.
> > The solution using an external array to keep all data can avoid this problem,
> but it may introduce some extra memories cost.
> > It's better rte_hash can support iterate when inserting or deleting.
> > > > >
> > > > > > 2)  In some cases, I don't need delete all keys. Just some
> > > > > > selected keys and data are deleted and released.
> > > [Yipeng]
> > > Could you keep a candidate list of keys you want to delete in
> > > another data structure so you don’t need to iterate the whole hash table?
> > >
> > [Lilijun (Jerry)]
> > I need iterate the hash table at first and decide which key are candidate,
> then delete the key/data instantly.
> > Of course, here I can keep those candidate keys and datas in a temporary
> list and delete them after the iterate is finished.
> > This may be a second choice for me although it became more complicated.
> > > > > I understand the problem you have pointed out and understand how
> > > > > to reproduce it. But, the use case is not clear to me. Can you
> > > > > please explain the use case?
> > > > [Lilijun (Jerry)]
> > > >
> > > > As you know, the dpdk rte_hash use a fixed size table to store all
> > > keys/datas.
> > > > The memory used by hash table is related with this fixed size.
> > > > In my case, normally the count of keys is about 100,000 but
> > > > sometimes the count may burst up to 30,000,000.
> > > > In order to save memory usage, I create a small hash table with
> > > > 100,000 size and replace to a bigger one with 30,000,000 size when
> > > > there are more keys to be stored. Also when the key's count
> > > > reduced to less than 100,000, I replace the hash table with a
> > > > small one to save the
> > > memory.
> > > [Yipeng]
> > > Could you tell me more on the use case? Since insertion would also
> > > invalidate the Iterator, do you insert keys only to new table during
> resizing?
> > >
> > [Lilijun (Jerry)]
> > Yes, Insert only to new table. Because the resize process need take a write
> lock and the old table's key insertion are prevented by the lock now.
> > Do you mean the key insertion may change other key's position by cuckoo
> hash algorithm and invalidate the iterator?
> > That maybe a new question I haven't met yet.
> > > >
> > > > >
> > > > > >
> > > > > > Jerry.
> > > > > >
> > > > > > -----邮件原件-----
> > > > > > 发件人: Honnappa Nagarahalli
> > > > > > [mailto:Honnappa.Nagarahalli@arm.com]
> > > > > > 发送时间: 2020年4月29日 4:46
> > > > > > 收件人: Lilijun (Jerry) <jerry.lilijun@huawei.com>; 'dev@dpdk.org'
> > > > > > <dev@dpdk.org>
> > > > > > 抄送: wangyunjian <wangyunjian@huawei.com>; xudingke
> > > > > > <xudingke@huawei.com>; 'stable@dpdk.org' <stable@dpdk.org>;
> nd
> > > > > > <nd@arm.com>; Honnappa Nagarahalli
> > > > <Honnappa.Nagarahalli@arm.com>;
> > > > > nd
> > > > > > <nd@arm.com>
> > > > > > 主题: RE: [dpdk-dev] [PATCH] lib/librte_hash: add
> > > > > > rte_hash_del_key_fixed without compact
> > > > > >
> > > > > > <snip>
> > > > > >
> > > > > > Hi Jerry,
> > > > > > 	Few questions inline.
> > > > > >
> > > > > > > Subject: [dpdk-dev] [PATCH] lib/librte_hash: add
> > > > > > > rte_hash_del_key_fixed without compact
> > > > > > >
> > > > > > > The keys idx are stored in rte_hash main bucket key slots
> > > > > > > and extend bucket key stots.
> > > > > > > We iterate every no empty Keys in h->buckets and
> > > > > > > h->buckets_ext from start to last.
> > > > > > > When deleting keys the function __rte_hash_compact_ll() may
> > > > > > > move last_bkt's key to previous bucket in order to compact
> > > > > > > extend bucket
> > > list.
> > > > > > > If the previous bucket has been iterated, the moved key may
> > > > > > > be missed for users.
> > > > > > > Then those missed keys are leaked and rte_hash table can't
> > > > > > > be
> > > cleanup.
> > > > > > > So we add a new API rte_hash_del_key_fixed() used in iterate
> > > > > > > loop to avoid this bugs.
> > > > > > >
> > > > > > > ---
> > > > > > >  lib/librte_hash/rte_cuckoo_hash.c    | 19 ++++++++++++++-----
> > > > > > >  lib/librte_hash/rte_hash.h           |  5 +++++
> > > > > > >  lib/librte_hash/rte_hash_version.map |  1 +
> > > > > > >  3 files changed, 20 insertions(+), 5 deletions(-)
> > > > > > >
> > > > > > > diff --git a/lib/librte_hash/rte_cuckoo_hash.c
> > > > > > > b/lib/librte_hash/rte_cuckoo_hash.c
> > > > > > > index b52630b..2da3c1d 100644
> > > > > > > --- a/lib/librte_hash/rte_cuckoo_hash.c
> > > > > > > +++ b/lib/librte_hash/rte_cuckoo_hash.c
> > > > > > > @@ -1523,7 +1523,7 @@ search_and_remove(const struct
> > > > > > > rte_hash *h, const void *key,
> > > > > > >
> > > > > > >  static inline int32_t
> > > > > > >  __rte_hash_del_key_with_hash(const struct rte_hash *h,
> > > > > > > const void
> > > > > *key,
> > > > > > > -						hash_sig_t sig)
> > > > > > > +						hash_sig_t sig,
> > > uint8_t
> > > > > > > compact)
> > > > > > >  {
> > > > > > >  	uint32_t prim_bucket_idx, sec_bucket_idx;
> > > > > > >  	struct rte_hash_bucket *prim_bkt, *sec_bkt, *prev_bkt,
> > > > > > > *last_bkt;
> > > > > > @@
> > > > > > > -1541,7 +1541,8 @@ __rte_hash_del_key_with_hash(const struct
> > > > > > > rte_hash *h, const void *key,
> > > > > > >  	/* look for key in primary bucket */
> > > > > > >  	ret = search_and_remove(h, key, prim_bkt, short_sig, &pos);
> > > > > > >  	if (ret != -1) {
> > > > > > > -		__rte_hash_compact_ll(h, prim_bkt, pos);
> > > > > > > +		if (compact)
> > > > > > > +			__rte_hash_compact_ll(h, prim_bkt, pos);
> > > > > > >  		last_bkt = prim_bkt->next;
> > > > > > >  		prev_bkt = prim_bkt;
> > > > > > >  		goto return_bkt;
> > > > > > > @@ -1553,7 +1554,8 @@ __rte_hash_del_key_with_hash(const
> > > struct
> > > > > > > rte_hash *h, const void *key,
> > > > > > >  	FOR_EACH_BUCKET(cur_bkt, sec_bkt) {
> > > > > > >  		ret = search_and_remove(h, key, cur_bkt, short_sig,
> > > &pos);
> > > > > > >  		if (ret != -1) {
> > > > > > > -			__rte_hash_compact_ll(h, cur_bkt, pos);
> > > > > > > +			if (compact)
> > > > > > > +				__rte_hash_compact_ll(h, cur_bkt,
> > > pos);
> > > > > > >  			last_bkt = sec_bkt->next;
> > > > > > >  			prev_bkt = sec_bkt;
> > > > > > >  			goto return_bkt;
> > > > > > > @@ -1607,14 +1609,21 @@ rte_hash_del_key_with_hash(const
> > > struct
> > > > > > > rte_hash *h,
> > > > > > >  			const void *key, hash_sig_t sig)  {
> > > > > > >  	RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
> > > > > > > -	return __rte_hash_del_key_with_hash(h, key, sig);
> > > > > > > +	return __rte_hash_del_key_with_hash(h, key, sig, 1);
> > > > > > >  }
> > > > > > >
> > > > > > >  int32_t
> > > > > > >  rte_hash_del_key(const struct rte_hash *h, const void *key)  {
> > > > > > >  	RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
> > > > > > > -	return __rte_hash_del_key_with_hash(h, key,
> > > rte_hash_hash(h,
> > > > > key));
> > > > > > > +	return __rte_hash_del_key_with_hash(h, key,
> > > rte_hash_hash(h,
> > > > > key),
> > > > > > > 1);
> > > > > > > +}
> > > > > > > +
> > > > > > > +int32_t
> > > > > > > +rte_hash_del_key_fixed(const struct rte_hash *h, const void
> > > > > > > +*key)
> > > {
> > > > > > > +	RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
> > > > > > > +	return __rte_hash_del_key_with_hash(h, key,
> > > rte_hash_hash(h,
> > > > > key),
> > > > > > > 0);
> > > > > > >  }
> > > > > > >
> > > > > > >  int
> > > > > > > diff --git a/lib/librte_hash/rte_hash.h
> > > > > > > b/lib/librte_hash/rte_hash.h index eceb365..9b71d8a 100644
> > > > > > > --- a/lib/librte_hash/rte_hash.h
> > > > > > > +++ b/lib/librte_hash/rte_hash.h
> > > > > > > @@ -297,6 +297,11 @@ rte_hash_add_key_with_hash(const
> struct
> > > > > > rte_hash
> > > > > > > *h, const void *key, hash_sig_t  int32_t
> > > > > > > rte_hash_del_key(const struct rte_hash *h, const void *key);
> > > > > > >
> > > > > > > +
> > > > > > > +/* for without compact */
> > > > > > > +int32_t
> > > > > > > +rte_hash_del_key_fixed(const struct rte_hash *h, const void
> > > > > > > +*key);
> > > > > > > +
> > > > > > >  /**
> > > > > > >   * Remove a key from an existing hash table.
> > > > > > >   * This operation is not multi-thread safe diff --git
> > > > > > > a/lib/librte_hash/rte_hash_version.map
> > > > > > > b/lib/librte_hash/rte_hash_version.map
> > > > > > > index 30cc086..1941d17 100644
> > > > > > > --- a/lib/librte_hash/rte_hash_version.map
> > > > > > > +++ b/lib/librte_hash/rte_hash_version.map
> > > > > > > @@ -11,6 +11,7 @@ DPDK_20.0 {
> > > > > > >  	rte_hash_count;
> > > > > > >  	rte_hash_create;
> > > > > > >  	rte_hash_del_key;
> > > > > > > +	rte_hash_del_key_fixed;
> > > > > > >  	rte_hash_del_key_with_hash;
> > > > > > >  	rte_hash_find_existing;
> > > > > > >  	rte_hash_free;
> > > > > > > --
> > > > > > > 2.19.1
> > > > > > >
> > > > > > > -----邮件原件-----
> > > > > > > 发件人: Lilijun (Jerry)
> > > > > > > 发送时间: 2020年4月18日 18:00
> > > > > > > 收件人: 'dev@dpdk.org' <dev@dpdk.org>; 'stable@dpdk.org'
> > > > > > > <stable@dpdk.org>
> > > > > > > 主题: rte_hash bug: can't iterate all entries when deleting
> > > > > > > keys in rte_hash iterate loop.
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > >     In my test, entries can't be cleanup in rte_hash table
> > > > > > > when deleting keys in rte_hash iterate loop. The test steps:
> > > > > > >     1.  create a hash table table1 with limit 30000, ext
> > > > > > > bucket enabled,  and insert 30000 entries into this hash table.
> > > > > > >     2.  create a larger hash table table2 with limit 60000,
> > > > > > > , ext bucket
> > > > > > enabled.
> > > > > > >     3.  iterate all entries of table1 and insert them to the table2.
> > > > > > > Insert new
> > > > > > > 10000 entries to this table2.
> > > > > > >     4.  Then flush all entries from table2 by deleting keys
> > > > > > > in rte_hash iterate loop. But there are still some keys leaked in
> table2.
> > > > > > Is there any reason for flushing table2 in this manner?
> > > > > > Is it possible to use 'rte_hash_reset' instead?
> > > > > >
> > > > > > >
> > > > > > >     From my analysis, the keys idx are stored in rte_hash
> > > > > > > main bucket key slots and extend bucket key stots.
> > > > > > >     We iterate every no empty Keys in h->buckets and
> > > > > > > h->buckets_ext from start to last.
> > > > > > >     When deleting keys the function __rte_hash_compact_ll()
> > > > > > > may move last_bkt's key to previous bucket in order to
> > > > > > > compact extend
> > > > bucket list.
> > > > > > >     If the previous bucket has been iterated, the moved key
> > > > > > > may be missed for users.
> > > > > > >     Then those missed keys are leaked and rte_hash table
> > > > > > > can't be
> > > > cleanup.
> > > > > > >
> > > > > > >     Now I retry the iterate and delete keys, that can avoid this bug.
> > > > > > >
> > > > > > >     Is there any ideas or solutions on this bug?   Thanks.
> > > > > > >
> > > > > > > Jerry.
> > > [Wang, Yipeng]
> > > Thanks Honnappa for adding me to the recipient list, I am sorry that
> > > I missed this patch previously.
> > > In future you could add me to the recipient list for hash related
> > > patches/questions! Thanks!
> > >
> > > Although it is not a bug to me, I think this is a valid issue and
> > > thanks for bringing it up.
> > > I see the iterate function as like the iterator from C++ stl. Any
> > > insertion/delete should invalidate the current iterator (insertion
> > > too, since insertion could move things around in cuckoo path).
> > > From this point of view, by design there is no guarantee that the "next"
> > > pointer is still valid after any insertion/deletion during the
> > > iteration. We at least should clarify this in the documentation.
> > >
> > > That said, in c++ stl there is indeed a way to iterate and delete
> > > since it provides an erase function returning the next valid iterator.
> > > I think we don’t have such capability in rte_hash now as you pointed
> > > out, and this is a gap you want to fill.
> > >
> > > However, the proposed API " rte_hash_del_key_fixed" exposes the
> > > internals of the implementation.
> > > The internal of hash lib is supposed to be a black box to the user.
> > > Although the issue is caused by "compaction", the API should not
> > > expose that. I am thinking a function like the "erase" from stl
> > > which returns the next valid iterator should be a better way. What do you
> think?
> > >
> > [Lilijun (Jerry)]
> > Yes, I am agree with your views. Some "erase" function may be better for
> this cases.
> >
> > > Resizing is an important feature that we always think to add into rte_hash.
> > > Feel free to propose resizing Features too into rte_hash as you are
> > > already doing similar thing right now.
> > >
> > [Lilijun (Jerry)]
> > Yes, the memory cost by rte hash may be waste when the keys usage is not
> high in common.
> > Resize features can figure out that problem in rte_hash. Thanks for your
> advice.
> >
> > > BTW, you can add a [RFC] prefix to the subject line for future RFC patches.
> > >
> > > Other questions/comments are inlined.
> 
> 


  reply	other threads:[~2021-03-24 23:25 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-27  2:28 [dpdk-dev] " Lilijun (Jerry)
2020-04-28 20:46 ` Honnappa Nagarahalli
2020-04-29  1:07   ` [dpdk-dev] 答复: " Lilijun (Jerry)
2020-05-05 23:17     ` [dpdk-dev] " Honnappa Nagarahalli
2020-05-06  1:09       ` [dpdk-dev] 答复: " Lilijun (Jerry)
2020-05-12 23:41         ` [dpdk-dev] " Wang, Yipeng1
2020-05-13  1:28           ` [dpdk-dev] 答复: " Lilijun (Jerry)
2020-05-14 17:44             ` [dpdk-dev] " Wang, Yipeng1
2021-03-24 21:33             ` [dpdk-dev] [dpdk-stable] 答复: " Thomas Monjalon
2021-03-24 23:25               ` Wang, Yipeng1 [this message]
2021-03-25  8:04                 ` Thomas Monjalon
2020-05-13 19:27         ` [dpdk-dev] " Honnappa Nagarahalli
2020-05-14  0:55           ` [dpdk-dev] 答复: " Lilijun (Jerry)
2020-05-14  1:21             ` [dpdk-dev] " Honnappa Nagarahalli
2020-05-14  1:51               ` [dpdk-dev] 答复: " Lilijun (Jerry)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BYAPR11MB3494C7002F911BC94E7806D5C3639@BYAPR11MB3494.namprd11.prod.outlook.com \
    --to=yipeng1.wang@intel.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=dev@dpdk.org \
    --cc=jerry.lilijun@huawei.com \
    --cc=stable@dpdk.org \
    --cc=thomas@monjalon.net \
    --cc=wangyunjian@huawei.com \
    --cc=xudingke@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).