From: "Wang, Yipeng1" <yipeng1.wang@intel.com>
To: "Lilijun (Jerry)" <jerry.lilijun@huawei.com>,
Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
"'dev@dpdk.org'" <dev@dpdk.org>
Cc: wangyunjian <wangyunjian@huawei.com>,
xudingke <xudingke@huawei.com>,
"'stable@dpdk.org'" <stable@dpdk.org>, nd <nd@arm.com>,
nd <nd@arm.com>
Subject: Re: [dpdk-dev] [PATCH] lib/librte_hash: add rte_hash_del_key_fixed without compact
Date: Tue, 12 May 2020 23:41:20 +0000 [thread overview]
Message-ID: <MW3PR11MB46846D04E1223D7708388898C3BE0@MW3PR11MB4684.namprd11.prod.outlook.com> (raw)
In-Reply-To: <40280F65B1B0B44E8089ED31C01616EBA49A19CE@dggeml529-mbx.china.huawei.com>
> -----Original Message-----
> From: Lilijun (Jerry) <jerry.lilijun@huawei.com>
> Sent: Tuesday, May 5, 2020 6:10 PM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> 'dev@dpdk.org' <dev@dpdk.org>
> Cc: wangyunjian <wangyunjian@huawei.com>; xudingke
> <xudingke@huawei.com>; 'stable@dpdk.org' <stable@dpdk.org>; nd
> <nd@arm.com>; Wang, Yipeng1 <yipeng1.wang@intel.com>; nd
> <nd@arm.com>
> Subject: 答复: [dpdk-dev] [PATCH] lib/librte_hash: add
> rte_hash_del_key_fixed without compact
>
> > -----邮件原件-----
> > 发件人: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> > 发送时间: 2020年5月6日 7:18
> > 收件人: Lilijun (Jerry) <jerry.lilijun@huawei.com>; 'dev@dpdk.org'
> > <dev@dpdk.org>
> > 抄送: wangyunjian <wangyunjian@huawei.com>; xudingke
> > <xudingke@huawei.com>; 'stable@dpdk.org' <stable@dpdk.org>; nd
> > <nd@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> > yipeng1.wang@intel.com; nd <nd@arm.com>
> > 主题: RE: [dpdk-dev] [PATCH] lib/librte_hash: add rte_hash_del_key_fixed
> > without compact
> >
> > <snip>
> >
> > Adding Yipeng, maintainer for hash library
> >
> > >
> > > Thanks for your reply.
> > >
> > > Using rte_hash iterate and delete keys is to free the related data's
> memory.
> > > There are two reasons why rte_hash_reset() is not properly:
> > > 1) the reset function just clear all keys, the key's related data are leaked.
> > That is a good point. I think this should be documented in the API.
[Yipeng]
By leaking, do you mean that you keep data in separate places and the pointers
to them are missed from table after reset? Can you keep data in an array and iterate that array instead?
> >
> > > 2) In some cases, I don't need delete all keys. Just some selected
> > > keys and data are deleted and released.
[Yipeng]
Could you keep a candidate list of keys you want to delete in another data structure
so you don’t need to iterate the whole hash table?
> > I understand the problem you have pointed out and understand how to
> > reproduce it. But, the use case is not clear to me. Can you please
> > explain the use case?
> [Lilijun (Jerry)]
>
> As you know, the dpdk rte_hash use a fixed size table to store all keys/datas.
> The memory used by hash table is related with this fixed size.
> In my case, normally the count of keys is about 100,000 but sometimes the
> count may burst up to 30,000,000.
> In order to save memory usage, I create a small hash table with 100,000 size
> and replace to a bigger one with 30,000,000 size when there are more keys to
> be stored. Also when the key's count reduced to less than 100,000, I replace
> the hash table with a small one to save the memory.
[Yipeng]
Could you tell me more on the use case? Since insertion would also invalidate the
Iterator, do you insert keys only to new table during resizing?
>
> >
> > >
> > > Jerry.
> > >
> > > -----邮件原件-----
> > > 发件人: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> > > 发送时间: 2020年4月29日 4:46
> > > 收件人: Lilijun (Jerry) <jerry.lilijun@huawei.com>; 'dev@dpdk.org'
> > > <dev@dpdk.org>
> > > 抄送: wangyunjian <wangyunjian@huawei.com>; xudingke
> > > <xudingke@huawei.com>; 'stable@dpdk.org' <stable@dpdk.org>; nd
> > > <nd@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>;
> > nd
> > > <nd@arm.com>
> > > 主题: RE: [dpdk-dev] [PATCH] lib/librte_hash: add
> > > rte_hash_del_key_fixed without compact
> > >
> > > <snip>
> > >
> > > Hi Jerry,
> > > Few questions inline.
> > >
> > > > Subject: [dpdk-dev] [PATCH] lib/librte_hash: add
> > > > rte_hash_del_key_fixed without compact
> > > >
> > > > The keys idx are stored in rte_hash main bucket key slots and
> > > > extend bucket key stots.
> > > > We iterate every no empty Keys in h->buckets and h->buckets_ext
> > > > from start to last.
> > > > When deleting keys the function __rte_hash_compact_ll() may move
> > > > last_bkt's key to previous bucket in order to compact extend bucket list.
> > > > If the previous bucket has been iterated, the moved key may be
> > > > missed for users.
> > > > Then those missed keys are leaked and rte_hash table can't be cleanup.
> > > > So we add a new API rte_hash_del_key_fixed() used in iterate loop
> > > > to avoid this bugs.
> > > >
> > > > ---
> > > > lib/librte_hash/rte_cuckoo_hash.c | 19 ++++++++++++++-----
> > > > lib/librte_hash/rte_hash.h | 5 +++++
> > > > lib/librte_hash/rte_hash_version.map | 1 +
> > > > 3 files changed, 20 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/lib/librte_hash/rte_cuckoo_hash.c
> > > > b/lib/librte_hash/rte_cuckoo_hash.c
> > > > index b52630b..2da3c1d 100644
> > > > --- a/lib/librte_hash/rte_cuckoo_hash.c
> > > > +++ b/lib/librte_hash/rte_cuckoo_hash.c
> > > > @@ -1523,7 +1523,7 @@ search_and_remove(const struct rte_hash *h,
> > > > const void *key,
> > > >
> > > > static inline int32_t
> > > > __rte_hash_del_key_with_hash(const struct rte_hash *h, const void
> > *key,
> > > > - hash_sig_t sig)
> > > > + hash_sig_t sig, uint8_t
> > > > compact)
> > > > {
> > > > uint32_t prim_bucket_idx, sec_bucket_idx;
> > > > struct rte_hash_bucket *prim_bkt, *sec_bkt, *prev_bkt,
> > > > *last_bkt;
> > > @@
> > > > -1541,7 +1541,8 @@ __rte_hash_del_key_with_hash(const struct
> > > > rte_hash *h, const void *key,
> > > > /* look for key in primary bucket */
> > > > ret = search_and_remove(h, key, prim_bkt, short_sig, &pos);
> > > > if (ret != -1) {
> > > > - __rte_hash_compact_ll(h, prim_bkt, pos);
> > > > + if (compact)
> > > > + __rte_hash_compact_ll(h, prim_bkt, pos);
> > > > last_bkt = prim_bkt->next;
> > > > prev_bkt = prim_bkt;
> > > > goto return_bkt;
> > > > @@ -1553,7 +1554,8 @@ __rte_hash_del_key_with_hash(const struct
> > > > rte_hash *h, const void *key,
> > > > FOR_EACH_BUCKET(cur_bkt, sec_bkt) {
> > > > ret = search_and_remove(h, key, cur_bkt, short_sig, &pos);
> > > > if (ret != -1) {
> > > > - __rte_hash_compact_ll(h, cur_bkt, pos);
> > > > + if (compact)
> > > > + __rte_hash_compact_ll(h, cur_bkt, pos);
> > > > last_bkt = sec_bkt->next;
> > > > prev_bkt = sec_bkt;
> > > > goto return_bkt;
> > > > @@ -1607,14 +1609,21 @@ rte_hash_del_key_with_hash(const struct
> > > > rte_hash *h,
> > > > const void *key, hash_sig_t sig) {
> > > > RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
> > > > - return __rte_hash_del_key_with_hash(h, key, sig);
> > > > + return __rte_hash_del_key_with_hash(h, key, sig, 1);
> > > > }
> > > >
> > > > int32_t
> > > > rte_hash_del_key(const struct rte_hash *h, const void *key) {
> > > > RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
> > > > - return __rte_hash_del_key_with_hash(h, key, rte_hash_hash(h,
> > key));
> > > > + return __rte_hash_del_key_with_hash(h, key, rte_hash_hash(h,
> > key),
> > > > 1);
> > > > +}
> > > > +
> > > > +int32_t
> > > > +rte_hash_del_key_fixed(const struct rte_hash *h, const void *key) {
> > > > + RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
> > > > + return __rte_hash_del_key_with_hash(h, key, rte_hash_hash(h,
> > key),
> > > > 0);
> > > > }
> > > >
> > > > int
> > > > diff --git a/lib/librte_hash/rte_hash.h
> > > > b/lib/librte_hash/rte_hash.h index eceb365..9b71d8a 100644
> > > > --- a/lib/librte_hash/rte_hash.h
> > > > +++ b/lib/librte_hash/rte_hash.h
> > > > @@ -297,6 +297,11 @@ rte_hash_add_key_with_hash(const struct
> > > rte_hash
> > > > *h, const void *key, hash_sig_t int32_t rte_hash_del_key(const
> > > > struct rte_hash *h, const void *key);
> > > >
> > > > +
> > > > +/* for without compact */
> > > > +int32_t
> > > > +rte_hash_del_key_fixed(const struct rte_hash *h, const void
> > > > +*key);
> > > > +
> > > > /**
> > > > * Remove a key from an existing hash table.
> > > > * This operation is not multi-thread safe diff --git
> > > > a/lib/librte_hash/rte_hash_version.map
> > > > b/lib/librte_hash/rte_hash_version.map
> > > > index 30cc086..1941d17 100644
> > > > --- a/lib/librte_hash/rte_hash_version.map
> > > > +++ b/lib/librte_hash/rte_hash_version.map
> > > > @@ -11,6 +11,7 @@ DPDK_20.0 {
> > > > rte_hash_count;
> > > > rte_hash_create;
> > > > rte_hash_del_key;
> > > > + rte_hash_del_key_fixed;
> > > > rte_hash_del_key_with_hash;
> > > > rte_hash_find_existing;
> > > > rte_hash_free;
> > > > --
> > > > 2.19.1
> > > >
> > > > -----邮件原件-----
> > > > 发件人: Lilijun (Jerry)
> > > > 发送时间: 2020年4月18日 18:00
> > > > 收件人: 'dev@dpdk.org' <dev@dpdk.org>; 'stable@dpdk.org'
> > > > <stable@dpdk.org>
> > > > 主题: rte_hash bug: can't iterate all entries when deleting keys in
> > > > rte_hash iterate loop.
> > > >
> > > > Hi all,
> > > >
> > > > In my test, entries can't be cleanup in rte_hash table when
> > > > deleting keys in rte_hash iterate loop. The test steps:
> > > > 1. create a hash table table1 with limit 30000, ext bucket
> > > > enabled, and insert 30000 entries into this hash table.
> > > > 2. create a larger hash table table2 with limit 60000, , ext
> > > > bucket
> > > enabled.
> > > > 3. iterate all entries of table1 and insert them to the table2.
> > > > Insert new
> > > > 10000 entries to this table2.
> > > > 4. Then flush all entries from table2 by deleting keys in
> > > > rte_hash iterate loop. But there are still some keys leaked in table2.
> > > Is there any reason for flushing table2 in this manner?
> > > Is it possible to use 'rte_hash_reset' instead?
> > >
> > > >
> > > > From my analysis, the keys idx are stored in rte_hash main
> > > > bucket key slots and extend bucket key stots.
> > > > We iterate every no empty Keys in h->buckets and
> > > > h->buckets_ext from start to last.
> > > > When deleting keys the function __rte_hash_compact_ll() may
> > > > move last_bkt's key to previous bucket in order to compact extend
> bucket list.
> > > > If the previous bucket has been iterated, the moved key may be
> > > > missed for users.
> > > > Then those missed keys are leaked and rte_hash table can't be
> cleanup.
> > > >
> > > > Now I retry the iterate and delete keys, that can avoid this bug.
> > > >
> > > > Is there any ideas or solutions on this bug? Thanks.
> > > >
> > > > Jerry.
[Wang, Yipeng]
Thanks Honnappa for adding me to the recipient list, I am sorry that I missed this patch previously.
In future you could add me to the recipient list for hash related patches/questions! Thanks!
Although it is not a bug to me, I think this is a valid issue and thanks for bringing it up.
I see the iterate function as like the iterator from C++ stl. Any insertion/delete should invalidate
the current iterator (insertion too, since insertion could move things around in cuckoo path).
From this point of view, by design there is no guarantee that the "next" pointer is still valid
after any insertion/deletion during the iteration. We at least should clarify this in the documentation.
That said, in c++ stl there is indeed a way to iterate and delete since it provides an erase function returning the next valid iterator.
I think we don’t have such capability in rte_hash now as you pointed out, and this is a gap you want to fill.
However, the proposed API " rte_hash_del_key_fixed" exposes the internals of the implementation.
The internal of hash lib is supposed to be a black box to the user. Although the issue is caused by "compaction",
the API should not expose that. I am thinking a function like the "erase" from stl which returns the next valid
iterator should be a better way. What do you think?
Resizing is an important feature that we always think to add into rte_hash. Feel free to propose resizing
Features too into rte_hash as you are already doing similar thing right now.
BTW, you can add a [RFC] prefix to the subject line for future RFC patches.
Other questions/comments are inlined.
next prev parent reply other threads:[~2020-05-12 23:41 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-27 2:28 Lilijun (Jerry)
2020-04-28 20:46 ` Honnappa Nagarahalli
2020-04-29 1:07 ` [dpdk-dev] 答复: " Lilijun (Jerry)
2020-05-05 23:17 ` [dpdk-dev] " Honnappa Nagarahalli
2020-05-06 1:09 ` [dpdk-dev] 答复: " Lilijun (Jerry)
2020-05-12 23:41 ` Wang, Yipeng1 [this message]
2020-05-13 1:28 ` Lilijun (Jerry)
2020-05-14 17:44 ` [dpdk-dev] " Wang, Yipeng1
2021-03-24 21:33 ` [dpdk-dev] [dpdk-stable] 答复: " Thomas Monjalon
2021-03-24 23:25 ` Wang, Yipeng1
2021-03-25 8:04 ` Thomas Monjalon
2020-05-13 19:27 ` [dpdk-dev] " Honnappa Nagarahalli
2020-05-14 0:55 ` [dpdk-dev] 答复: " Lilijun (Jerry)
2020-05-14 1:21 ` [dpdk-dev] " Honnappa Nagarahalli
2020-05-14 1:51 ` [dpdk-dev] 答复: " Lilijun (Jerry)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=MW3PR11MB46846D04E1223D7708388898C3BE0@MW3PR11MB4684.namprd11.prod.outlook.com \
--to=yipeng1.wang@intel.com \
--cc=Honnappa.Nagarahalli@arm.com \
--cc=dev@dpdk.org \
--cc=jerry.lilijun@huawei.com \
--cc=nd@arm.com \
--cc=stable@dpdk.org \
--cc=wangyunjian@huawei.com \
--cc=xudingke@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).