* [dpdk-dev] [PATCH] net/mlx5: fix port dead lock when port shutdown
@ 2018-05-24 15:20 Xueming Li
2018-05-24 18:29 ` Yongseok Koh
0 siblings, 1 reply; 5+ messages in thread
From: Xueming Li @ 2018-05-24 15:20 UTC (permalink / raw)
To: Shahaf Shuler, Yongseok Koh; +Cc: Xueming Li, dev
This patch add check to avoid init device MR data twcie which caused
priv data linked to itself in list.
Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: yskoh@mellanox.com
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
drivers/net/mlx5/mlx5_ethdev.c | 2 ++
drivers/net/mlx5/mlx5_mr.c | 1 +
2 files changed, 3 insertions(+)
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index c52ec6fb7..7ffd17ff3 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -398,6 +398,8 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
* deadlock. If it overflows, lookup should be done by searching MR list
* linearly, which is slow.
*/
+ if (priv->mr.cache.table)
+ return 0;
if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
dev->device->numa_node)) {
/* rte_errno is already set. */
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index abb1f5179..08105a443 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
rte_errno = EINVAL;
return -rte_errno;
}
+ assert(!bt->table && !bt->size);
memset(bt, 0, sizeof(*bt));
bt->table = rte_calloc_socket("B-tree table",
n, sizeof(struct mlx5_mr_cache),
--
2.13.3
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dpdk-dev] [PATCH] net/mlx5: fix port dead lock when port shutdown
2018-05-24 15:20 [dpdk-dev] [PATCH] net/mlx5: fix port dead lock when port shutdown Xueming Li
@ 2018-05-24 18:29 ` Yongseok Koh
2018-05-24 18:33 ` Yongseok Koh
2018-05-25 6:04 ` Xueming(Steven) Li
0 siblings, 2 replies; 5+ messages in thread
From: Yongseok Koh @ 2018-05-24 18:29 UTC (permalink / raw)
To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev, Adrien Mazarguil
> On May 24, 2018, at 8:20 AM, Xueming Li <xuemingl@mellanox.com> wrote:
>
> This patch add check to avoid init device MR data twcie which caused
> priv data linked to itself in list.
A typo.
> Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> Cc: yskoh@mellanox.com
>
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
> drivers/net/mlx5/mlx5_ethdev.c | 2 ++
> drivers/net/mlx5/mlx5_mr.c | 1 +
> 2 files changed, 3 insertions(+)
>
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
> index c52ec6fb7..7ffd17ff3 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -398,6 +398,8 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
> * deadlock. If it overflows, lookup should be done by searching MR list
> * linearly, which is slow.
> */
> + if (priv->mr.cache.table)
> + return 0;
Thanks, nice catch!
Adrien submitted a fix [1] for the similar issue. Adding device to the memory event
callback list is moved to mlx5_pci_probe() as it should be done once. So, how about
moving the rest (mlx5_mr_btree_init()) to mlx5_pci_probe() as well? I mean, instead
of adding this check, you can move the btree_init() above the line where Adrien moved
LIST_INSERT_HEAD() to. You'll need to rebase on Adrien's patch and add a comment to
Shahaf there's dependency.
[1]
net/mlx5: fix crash when configure is not called
http://dpdk.org/dev/patchwork/patch/40404/
Yongseok
> if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
> dev->device->numa_node)) {
> /* rte_errno is already set. */
> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> index abb1f5179..08105a443 100644
> --- a/drivers/net/mlx5/mlx5_mr.c
> +++ b/drivers/net/mlx5/mlx5_mr.c
> @@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
> rte_errno = EINVAL;
> return -rte_errno;
> }
> + assert(!bt->table && !bt->size);
> memset(bt, 0, sizeof(*bt));
> bt->table = rte_calloc_socket("B-tree table",
> n, sizeof(struct mlx5_mr_cache),
> --
> 2.13.3
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dpdk-dev] [PATCH] net/mlx5: fix port dead lock when port shutdown
2018-05-24 18:29 ` Yongseok Koh
@ 2018-05-24 18:33 ` Yongseok Koh
2018-05-25 6:04 ` Xueming(Steven) Li
1 sibling, 0 replies; 5+ messages in thread
From: Yongseok Koh @ 2018-05-24 18:33 UTC (permalink / raw)
To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev, Adrien Mazarguil
> On May 24, 2018, at 11:29 AM, Yongseok Koh <yskoh@mellanox.com> wrote:
>
>>
>> On May 24, 2018, at 8:20 AM, Xueming Li <xuemingl@mellanox.com> wrote:
>>
>> This patch add check to avoid init device MR data twcie which caused
>> priv data linked to itself in list.
>
> A typo.
>
>> Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
>> Cc: yskoh@mellanox.com
>>
>> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
>> ---
>> drivers/net/mlx5/mlx5_ethdev.c | 2 ++
>> drivers/net/mlx5/mlx5_mr.c | 1 +
>> 2 files changed, 3 insertions(+)
>>
>> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
>> index c52ec6fb7..7ffd17ff3 100644
>> --- a/drivers/net/mlx5/mlx5_ethdev.c
>> +++ b/drivers/net/mlx5/mlx5_ethdev.c
>> @@ -398,6 +398,8 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
>> * deadlock. If it overflows, lookup should be done by searching MR list
>> * linearly, which is slow.
>> */
>> + if (priv->mr.cache.table)
>> + return 0;
>
> Thanks, nice catch!
>
> Adrien submitted a fix [1] for the similar issue. Adding device to the memory event
> callback list is moved to mlx5_pci_probe() as it should be done once. So, how about
> moving the rest (mlx5_mr_btree_init()) to mlx5_pci_probe() as well? I mean, instead
> of adding this check, you can move the btree_init() above the line where Adrien moved
> LIST_INSERT_HEAD() to. You'll need to rebase on Adrien's patch and add a comment to
> Shahaf there's dependency.
>
>
> [1]
> net/mlx5: fix crash when configure is not called
> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdpdk.org%2Fdev%2Fpatchwork%2Fpatch%2F40404%2F&data=02%7C01%7Cyskoh%40mellanox.com%7Ca731aadfe6834a7a435f08d5c1a45fcf%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636627834121490391&sdata=D8HjhDMzMfRucGCTSWCieRkNmptGXLJHo8it0Py%2F3gY%3D&reserved=0
And you might want to write the same patch for mlx4.
Thanks,
Yongseok
>> if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
>> dev->device->numa_node)) {
>> /* rte_errno is already set. */
>> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
>> index abb1f5179..08105a443 100644
>> --- a/drivers/net/mlx5/mlx5_mr.c
>> +++ b/drivers/net/mlx5/mlx5_mr.c
>> @@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
>> rte_errno = EINVAL;
>> return -rte_errno;
>> }
>> + assert(!bt->table && !bt->size);
>> memset(bt, 0, sizeof(*bt));
>> bt->table = rte_calloc_socket("B-tree table",
>> n, sizeof(struct mlx5_mr_cache),
>> --
>> 2.13.3
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dpdk-dev] [PATCH] net/mlx5: fix port dead lock when port shutdown
2018-05-24 18:29 ` Yongseok Koh
2018-05-24 18:33 ` Yongseok Koh
@ 2018-05-25 6:04 ` Xueming(Steven) Li
2018-05-25 6:40 ` Xueming(Steven) Li
1 sibling, 1 reply; 5+ messages in thread
From: Xueming(Steven) Li @ 2018-05-25 6:04 UTC (permalink / raw)
To: Yongseok Koh; +Cc: Shahaf Shuler, dev, Adrien Mazarguil
> -----Original Message-----
> From: Yongseok Koh
> Sent: Friday, May 25, 2018 2:30 AM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Subject: Re: [PATCH] net/mlx5: fix port dead lock when port shutdown
>
>
> > On May 24, 2018, at 8:20 AM, Xueming Li <xuemingl@mellanox.com> wrote:
> >
> > This patch add check to avoid init device MR data twcie which caused
> > priv data linked to itself in list.
>
> A typo.
>
> > Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> > Cc: yskoh@mellanox.com
> >
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> > drivers/net/mlx5/mlx5_ethdev.c | 2 ++
> > drivers/net/mlx5/mlx5_mr.c | 1 +
> > 2 files changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> > b/drivers/net/mlx5/mlx5_ethdev.c index c52ec6fb7..7ffd17ff3 100644
> > --- a/drivers/net/mlx5/mlx5_ethdev.c
> > +++ b/drivers/net/mlx5/mlx5_ethdev.c
> > @@ -398,6 +398,8 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
> > * deadlock. If it overflows, lookup should be done by searching MR list
> > * linearly, which is slow.
> > */
> > + if (priv->mr.cache.table)
> > + return 0;
>
> Thanks, nice catch!
>
> Adrien submitted a fix [1] for the similar issue. Adding device to the memory event callback list is
> moved to mlx5_pci_probe() as it should be done once. So, how about moving the rest
> (mlx5_mr_btree_init()) to mlx5_pci_probe() as well? I mean, instead of adding this check, you can move
> the btree_init() above the line where Adrien moved
> LIST_INSERT_HEAD() to. You'll need to rebase on Adrien's patch and add a comment to Shahaf there's
> dependency.
>
Adrien's patch seems resolved the dead lock issue as well :)
Moving mlx5_mr_btree_init() not related to it, I'll start a new patch.
>
> [1]
> net/mlx5: fix crash when configure is not called http://dpdk.org/dev/patchwork/patch/40404/
>
>
> Yongseok
>
> > if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
> > dev->device->numa_node)) {
> > /* rte_errno is already set. */
> > diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> > index abb1f5179..08105a443 100644
> > --- a/drivers/net/mlx5/mlx5_mr.c
> > +++ b/drivers/net/mlx5/mlx5_mr.c
> > @@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
> > rte_errno = EINVAL;
> > return -rte_errno;
> > }
> > + assert(!bt->table && !bt->size);
> > memset(bt, 0, sizeof(*bt));
> > bt->table = rte_calloc_socket("B-tree table",
> > n, sizeof(struct mlx5_mr_cache),
> > --
> > 2.13.3
> >
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dpdk-dev] [PATCH] net/mlx5: fix port dead lock when port shutdown
2018-05-25 6:04 ` Xueming(Steven) Li
@ 2018-05-25 6:40 ` Xueming(Steven) Li
0 siblings, 0 replies; 5+ messages in thread
From: Xueming(Steven) Li @ 2018-05-25 6:40 UTC (permalink / raw)
To: Yongseok Koh; +Cc: Shahaf Shuler, dev, Adrien Mazarguil
Please bypass this patch, new patch submitted at:
http://www.dpdk.org/dev/patchwork/patch/40412/
> -----Original Message-----
> From: Xueming(Steven) Li
> Sent: Friday, May 25, 2018 2:04 PM
> To: Yongseok Koh <yskoh@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Subject: RE: [PATCH] net/mlx5: fix port dead lock when port shutdown
>
>
>
> > -----Original Message-----
> > From: Yongseok Koh
> > Sent: Friday, May 25, 2018 2:30 AM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Adrien
> > Mazarguil <adrien.mazarguil@6wind.com>
> > Subject: Re: [PATCH] net/mlx5: fix port dead lock when port shutdown
> >
> >
> > > On May 24, 2018, at 8:20 AM, Xueming Li <xuemingl@mellanox.com> wrote:
> > >
> > > This patch add check to avoid init device MR data twcie which caused
> > > priv data linked to itself in list.
> >
> > A typo.
> >
> > > Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> > > Cc: yskoh@mellanox.com
> > >
> > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > ---
> > > drivers/net/mlx5/mlx5_ethdev.c | 2 ++
> > > drivers/net/mlx5/mlx5_mr.c | 1 +
> > > 2 files changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> > > b/drivers/net/mlx5/mlx5_ethdev.c index c52ec6fb7..7ffd17ff3 100644
> > > --- a/drivers/net/mlx5/mlx5_ethdev.c
> > > +++ b/drivers/net/mlx5/mlx5_ethdev.c
> > > @@ -398,6 +398,8 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
> > > * deadlock. If it overflows, lookup should be done by searching MR list
> > > * linearly, which is slow.
> > > */
> > > + if (priv->mr.cache.table)
> > > + return 0;
> >
> > Thanks, nice catch!
> >
> > Adrien submitted a fix [1] for the similar issue. Adding device to the
> > memory event callback list is moved to mlx5_pci_probe() as it should
> > be done once. So, how about moving the rest
> > (mlx5_mr_btree_init()) to mlx5_pci_probe() as well? I mean, instead of
> > adding this check, you can move the btree_init() above the line where
> > Adrien moved
> > LIST_INSERT_HEAD() to. You'll need to rebase on Adrien's patch and add
> > a comment to Shahaf there's dependency.
> >
>
> Adrien's patch seems resolved the dead lock issue as well :) Moving mlx5_mr_btree_init() not related
> to it, I'll start a new patch.
>
> >
> > [1]
> > net/mlx5: fix crash when configure is not called
> > http://dpdk.org/dev/patchwork/patch/40404/
> >
> >
> > Yongseok
> >
> > > if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
> > > dev->device->numa_node)) {
> > > /* rte_errno is already set. */
> > > diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> > > index abb1f5179..08105a443 100644
> > > --- a/drivers/net/mlx5/mlx5_mr.c
> > > +++ b/drivers/net/mlx5/mlx5_mr.c
> > > @@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
> > > rte_errno = EINVAL;
> > > return -rte_errno;
> > > }
> > > + assert(!bt->table && !bt->size);
> > > memset(bt, 0, sizeof(*bt));
> > > bt->table = rte_calloc_socket("B-tree table",
> > > n, sizeof(struct mlx5_mr_cache),
> > > --
> > > 2.13.3
> > >
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2018-05-25 6:40 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-24 15:20 [dpdk-dev] [PATCH] net/mlx5: fix port dead lock when port shutdown Xueming Li
2018-05-24 18:29 ` Yongseok Koh
2018-05-24 18:33 ` Yongseok Koh
2018-05-25 6:04 ` Xueming(Steven) Li
2018-05-25 6:40 ` Xueming(Steven) Li
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).