* [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init
@ 2018-05-25  6:35 Xueming Li
  2018-05-25  6:38 ` Xueming(Steven) Li
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Xueming Li @ 2018-05-25  6:35 UTC (permalink / raw)
  To: Shahaf Shuler, Yongseok Koh; +Cc: Xueming Li, dev
This patch moved MR cache init from device configuration function to
probe function to make sure init only once.
Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: yskoh@mellanox.com
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5.c        | 11 +++++++++++
 drivers/net/mlx5/mlx5_ethdev.c | 11 -----------
 drivers/net/mlx5/mlx5_mr.c     |  1 +
 3 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index dae847493..77ed8e01f 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1193,6 +1193,17 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			goto port_error;
 		}
 		priv->config.max_verbs_prio = verb_priorities;
+		/*
+		 * Once the device is added to the list of memory event
+		 * callback, its global MR cache table cannot be expanded
+		 * on the fly because of deadlock. If it overflows, lookup
+		 * should be done by searching MR list linearly, which is slow.
+		 */
+		err = -mlx5_mr_btree_init(&priv->mr.cache,
+					  MLX5_MR_BTREE_CACHE_N * 2,
+					  eth_dev->device->numa_node);
+		if (err)
+			goto port_error;
 		/* Add device to memory callback list. */
 		rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
 		LIST_INSERT_HEAD(&mlx5_shared_data->mem_event_cb_list,
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index f6cebae41..90488af33 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -392,17 +392,6 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
 		if (++j == rxqs_n)
 			j = 0;
 	}
-	/*
-	 * Once the device is added to the list of memory event callback, its
-	 * global MR cache table cannot be expanded on the fly because of
-	 * deadlock. If it overflows, lookup should be done by searching MR list
-	 * linearly, which is slow.
-	 */
-	if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
-			       dev->device->numa_node)) {
-		/* rte_errno is already set. */
-		return -rte_errno;
-	}
 	return 0;
 }
 
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index abb1f5179..08105a443 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
 		rte_errno = EINVAL;
 		return -rte_errno;
 	}
+	assert(!bt->table && !bt->size);
 	memset(bt, 0, sizeof(*bt));
 	bt->table = rte_calloc_socket("B-tree table",
 				      n, sizeof(struct mlx5_mr_cache),
-- 
2.13.3
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init
  2018-05-25  6:35 [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init Xueming Li
@ 2018-05-25  6:38 ` Xueming(Steven) Li
  2018-05-25 10:19 ` Yongseok Koh
  2018-05-26 13:27 ` [dpdk-dev] [PATCH v2] " Xueming Li
  2 siblings, 0 replies; 11+ messages in thread
From: Xueming(Steven) Li @ 2018-05-25  6:38 UTC (permalink / raw)
  To: Xueming(Steven) Li, Shahaf Shuler, Yongseok Koh; +Cc: dev, Adrien Mazarguil
Hi Shahaf,
Please note that this patch depends on Adrien's patch:
	http://dpdk.org/dev/patchwork/patch/40404/
> -----Original Message-----
> From: Xueming Li <xuemingl@mellanox.com>
> Sent: Friday, May 25, 2018 2:36 PM
> To: Shahaf Shuler <shahafs@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> Cc: Xueming(Steven) Li <xuemingl@mellanox.com>; dev@dpdk.org
> Subject: [PATCH] net/mlx5: fix memory region cache init
> 
> This patch moved MR cache init from device configuration function to probe function to make sure init
> only once.
> 
> Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> Cc: yskoh@mellanox.com
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5.c        | 11 +++++++++++
>  drivers/net/mlx5/mlx5_ethdev.c | 11 -----------
>  drivers/net/mlx5/mlx5_mr.c     |  1 +
>  3 files changed, 12 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index dae847493..77ed8e01f 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -1193,6 +1193,17 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>  			goto port_error;
>  		}
>  		priv->config.max_verbs_prio = verb_priorities;
> +		/*
> +		 * Once the device is added to the list of memory event
> +		 * callback, its global MR cache table cannot be expanded
> +		 * on the fly because of deadlock. If it overflows, lookup
> +		 * should be done by searching MR list linearly, which is slow.
> +		 */
> +		err = -mlx5_mr_btree_init(&priv->mr.cache,
> +					  MLX5_MR_BTREE_CACHE_N * 2,
> +					  eth_dev->device->numa_node);
> +		if (err)
> +			goto port_error;
>  		/* Add device to memory callback list. */
>  		rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
>  		LIST_INSERT_HEAD(&mlx5_shared_data->mem_event_cb_list,
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c index
> f6cebae41..90488af33 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -392,17 +392,6 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
>  		if (++j == rxqs_n)
>  			j = 0;
>  	}
> -	/*
> -	 * Once the device is added to the list of memory event callback, its
> -	 * global MR cache table cannot be expanded on the fly because of
> -	 * deadlock. If it overflows, lookup should be done by searching MR list
> -	 * linearly, which is slow.
> -	 */
> -	if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
> -			       dev->device->numa_node)) {
> -		/* rte_errno is already set. */
> -		return -rte_errno;
> -	}
>  	return 0;
>  }
> 
> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c index abb1f5179..08105a443 100644
> --- a/drivers/net/mlx5/mlx5_mr.c
> +++ b/drivers/net/mlx5/mlx5_mr.c
> @@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
>  		rte_errno = EINVAL;
>  		return -rte_errno;
>  	}
> +	assert(!bt->table && !bt->size);
>  	memset(bt, 0, sizeof(*bt));
>  	bt->table = rte_calloc_socket("B-tree table",
>  				      n, sizeof(struct mlx5_mr_cache),
> --
> 2.13.3
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init
  2018-05-25  6:35 [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init Xueming Li
  2018-05-25  6:38 ` Xueming(Steven) Li
@ 2018-05-25 10:19 ` Yongseok Koh
  2018-05-25 13:18   ` Xueming(Steven) Li
  2018-05-26 13:27 ` [dpdk-dev] [PATCH v2] " Xueming Li
  2 siblings, 1 reply; 11+ messages in thread
From: Yongseok Koh @ 2018-05-25 10:19 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev
> On May 24, 2018, at 11:35 PM, Xueming Li <xuemingl@mellanox.com> wrote:
> 
> This patch moved MR cache init from device configuration function to
> probe function to make sure init only once.
> 
> Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> Cc: yskoh@mellanox.com
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
> drivers/net/mlx5/mlx5.c        | 11 +++++++++++
> drivers/net/mlx5/mlx5_ethdev.c | 11 -----------
> drivers/net/mlx5/mlx5_mr.c     |  1 +
> 3 files changed, 12 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index dae847493..77ed8e01f 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -1193,6 +1193,17 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> 			goto port_error;
> 		}
> 		priv->config.max_verbs_prio = verb_priorities;
> +		/*
> +		 * Once the device is added to the list of memory event
> +		 * callback, its global MR cache table cannot be expanded
> +		 * on the fly because of deadlock. If it overflows, lookup
> +		 * should be done by searching MR list linearly, which is slow.
> +		 */
> +		err = -mlx5_mr_btree_init(&priv->mr.cache,
> +					  MLX5_MR_BTREE_CACHE_N * 2,
> +					  eth_dev->device->numa_node);
> +		if (err)
> +			goto port_error;
A nit.
Like mlx5_flow_create_drop_queue(), please store rte_errno to err (err = rte_errno;)
instead of putting a minus sign to the function.
With that being fixed, you can put my acked-by tag when you submit v2.
 
Thanks,
Yongseok
> 		/* Add device to memory callback list. */
> 		rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
> 		LIST_INSERT_HEAD(&mlx5_shared_data->mem_event_cb_list,
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
> index f6cebae41..90488af33 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -392,17 +392,6 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
> 		if (++j == rxqs_n)
> 			j = 0;
> 	}
> -	/*
> -	 * Once the device is added to the list of memory event callback, its
> -	 * global MR cache table cannot be expanded on the fly because of
> -	 * deadlock. If it overflows, lookup should be done by searching MR list
> -	 * linearly, which is slow.
> -	 */
> -	if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
> -			       dev->device->numa_node)) {
> -		/* rte_errno is already set. */
> -		return -rte_errno;
> -	}
> 	return 0;
> }
> 
> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> index abb1f5179..08105a443 100644
> --- a/drivers/net/mlx5/mlx5_mr.c
> +++ b/drivers/net/mlx5/mlx5_mr.c
> @@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
> 		rte_errno = EINVAL;
> 		return -rte_errno;
> 	}
> +	assert(!bt->table && !bt->size);
> 	memset(bt, 0, sizeof(*bt));
> 	bt->table = rte_calloc_socket("B-tree table",
> 				      n, sizeof(struct mlx5_mr_cache),
> -- 
> 2.13.3
> 
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init
  2018-05-25 10:19 ` Yongseok Koh
@ 2018-05-25 13:18   ` Xueming(Steven) Li
  2018-05-25 15:41     ` Yongseok Koh
  0 siblings, 1 reply; 11+ messages in thread
From: Xueming(Steven) Li @ 2018-05-25 13:18 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: Shahaf Shuler, dev
> -----Original Message-----
> From: Yongseok Koh
> Sent: Friday, May 25, 2018 6:20 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH] net/mlx5: fix memory region cache init
> 
> > On May 24, 2018, at 11:35 PM, Xueming Li <xuemingl@mellanox.com> wrote:
> >
> > This patch moved MR cache init from device configuration function to
> > probe function to make sure init only once.
> >
> > Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> > Cc: yskoh@mellanox.com
> >
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> > drivers/net/mlx5/mlx5.c        | 11 +++++++++++
> > drivers/net/mlx5/mlx5_ethdev.c | 11 -----------
> > drivers/net/mlx5/mlx5_mr.c     |  1 +
> > 3 files changed, 12 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> > dae847493..77ed8e01f 100644
> > --- a/drivers/net/mlx5/mlx5.c
> > +++ b/drivers/net/mlx5/mlx5.c
> > @@ -1193,6 +1193,17 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> > 			goto port_error;
> > 		}
> > 		priv->config.max_verbs_prio = verb_priorities;
> > +		/*
> > +		 * Once the device is added to the list of memory event
> > +		 * callback, its global MR cache table cannot be expanded
> > +		 * on the fly because of deadlock. If it overflows, lookup
> > +		 * should be done by searching MR list linearly, which is slow.
> > +		 */
> > +		err = -mlx5_mr_btree_init(&priv->mr.cache,
> > +					  MLX5_MR_BTREE_CACHE_N * 2,
> > +					  eth_dev->device->numa_node);
> > +		if (err)
> > +			goto port_error;
> 
> A nit.
> Like mlx5_flow_create_drop_queue(), please store rte_errno to err (err = rte_errno;) instead of
> putting a minus sign to the function.
Rte_errno is set inside mlx5_mr_btree_init() if any error, that's why I simply the code.
> 
> With that being fixed, you can put my acked-by tag when you submit v2.
> 
> Thanks,
> Yongseok
> 
> > 		/* Add device to memory callback list. */
> > 		rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
> > 		LIST_INSERT_HEAD(&mlx5_shared_data->mem_event_cb_list,
> > diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> > b/drivers/net/mlx5/mlx5_ethdev.c index f6cebae41..90488af33 100644
> > --- a/drivers/net/mlx5/mlx5_ethdev.c
> > +++ b/drivers/net/mlx5/mlx5_ethdev.c
> > @@ -392,17 +392,6 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
> > 		if (++j == rxqs_n)
> > 			j = 0;
> > 	}
> > -	/*
> > -	 * Once the device is added to the list of memory event callback, its
> > -	 * global MR cache table cannot be expanded on the fly because of
> > -	 * deadlock. If it overflows, lookup should be done by searching MR list
> > -	 * linearly, which is slow.
> > -	 */
> > -	if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
> > -			       dev->device->numa_node)) {
> > -		/* rte_errno is already set. */
> > -		return -rte_errno;
> > -	}
> > 	return 0;
> > }
> >
> > diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> > index abb1f5179..08105a443 100644
> > --- a/drivers/net/mlx5/mlx5_mr.c
> > +++ b/drivers/net/mlx5/mlx5_mr.c
> > @@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
> > 		rte_errno = EINVAL;
> > 		return -rte_errno;
> > 	}
> > +	assert(!bt->table && !bt->size);
> > 	memset(bt, 0, sizeof(*bt));
> > 	bt->table = rte_calloc_socket("B-tree table",
> > 				      n, sizeof(struct mlx5_mr_cache),
> > --
> > 2.13.3
> >
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init
  2018-05-25 13:18   ` Xueming(Steven) Li
@ 2018-05-25 15:41     ` Yongseok Koh
  2018-05-26  2:31       ` Xueming(Steven) Li
  0 siblings, 1 reply; 11+ messages in thread
From: Yongseok Koh @ 2018-05-25 15:41 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: Shahaf Shuler, dev, Nélio Laranjeiro, Adrien Mazarguil
> On May 25, 2018, at 6:18 AM, Xueming(Steven) Li <xuemingl@mellanox.com> wrote:
> 
> 
> 
>> -----Original Message-----
>> From: Yongseok Koh
>> Sent: Friday, May 25, 2018 6:20 PM
>> To: Xueming(Steven) Li <xuemingl@mellanox.com>
>> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
>> Subject: Re: [PATCH] net/mlx5: fix memory region cache init
>> 
>>> On May 24, 2018, at 11:35 PM, Xueming Li <xuemingl@mellanox.com> wrote:
>>> 
>>> This patch moved MR cache init from device configuration function to
>>> probe function to make sure init only once.
>>> 
>>> Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
>>> Cc: yskoh@mellanox.com
>>> 
>>> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
>>> ---
>>> drivers/net/mlx5/mlx5.c        | 11 +++++++++++
>>> drivers/net/mlx5/mlx5_ethdev.c | 11 -----------
>>> drivers/net/mlx5/mlx5_mr.c     |  1 +
>>> 3 files changed, 12 insertions(+), 11 deletions(-)
>>> 
>>> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
>>> dae847493..77ed8e01f 100644
>>> --- a/drivers/net/mlx5/mlx5.c
>>> +++ b/drivers/net/mlx5/mlx5.c
>>> @@ -1193,6 +1193,17 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>>>            goto port_error;
>>>        }
>>>        priv->config.max_verbs_prio = verb_priorities;
>>> +        /*
>>> +         * Once the device is added to the list of memory event
>>> +         * callback, its global MR cache table cannot be expanded
>>> +         * on the fly because of deadlock. If it overflows, lookup
>>> +         * should be done by searching MR list linearly, which is slow.
>>> +         */
>>> +        err = -mlx5_mr_btree_init(&priv->mr.cache,
>>> +                      MLX5_MR_BTREE_CACHE_N * 2,
>>> +                      eth_dev->device->numa_node);
>>> +        if (err)
>>> +            goto port_error;
>> 
>> A nit.
>> Like mlx5_flow_create_drop_queue(), please store rte_errno to err (err = rte_errno;) instead of
>> putting a minus sign to the function.
> 
> Rte_errno is set inside mlx5_mr_btree_init() if any error, that's why I simply the code.
I understood your intention and I know that’s not faulty either. 
However, here consistency is more important than simplicity. In the probe func, we have been fixing quite a few bugs due to code inconsistencies. I don’t want to put a minus sign but follow the way as we did for mlx5_flow_create_drop_queue() in the same function. Putting a minus sign unlike other code around it could be another buggy point if someone else makes changes there in the future. Please make it aligned because both approaches are same anyway. 
Thanks 
Yongseok 
>> 
>> With that being fixed, you can put my acked-by tag when you submit v2.
>> 
>> Thanks,
>> Yongseok
>> 
>>>        /* Add device to memory callback list. */
>>>        rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
>>>        LIST_INSERT_HEAD(&mlx5_shared_data->mem_event_cb_list,
>>> diff --git a/drivers/net/mlx5/mlx5_ethdev.c
>>> b/drivers/net/mlx5/mlx5_ethdev.c index f6cebae41..90488af33 100644
>>> --- a/drivers/net/mlx5/mlx5_ethdev.c
>>> +++ b/drivers/net/mlx5/mlx5_ethdev.c
>>> @@ -392,17 +392,6 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
>>>        if (++j == rxqs_n)
>>>            j = 0;
>>>    }
>>> -    /*
>>> -     * Once the device is added to the list of memory event callback, its
>>> -     * global MR cache table cannot be expanded on the fly because of
>>> -     * deadlock. If it overflows, lookup should be done by searching MR list
>>> -     * linearly, which is slow.
>>> -     */
>>> -    if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
>>> -                   dev->device->numa_node)) {
>>> -        /* rte_errno is already set. */
>>> -        return -rte_errno;
>>> -    }
>>>    return 0;
>>> }
>>> 
>>> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
>>> index abb1f5179..08105a443 100644
>>> --- a/drivers/net/mlx5/mlx5_mr.c
>>> +++ b/drivers/net/mlx5/mlx5_mr.c
>>> @@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
>>>        rte_errno = EINVAL;
>>>        return -rte_errno;
>>>    }
>>> +    assert(!bt->table && !bt->size);
>>>    memset(bt, 0, sizeof(*bt));
>>>    bt->table = rte_calloc_socket("B-tree table",
>>>                      n, sizeof(struct mlx5_mr_cache),
>>> --
>>> 2.13.3
>>> 
> 
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init
  2018-05-25 15:41     ` Yongseok Koh
@ 2018-05-26  2:31       ` Xueming(Steven) Li
  2018-05-26  5:01         ` Yongseok Koh
  0 siblings, 1 reply; 11+ messages in thread
From: Xueming(Steven) Li @ 2018-05-26  2:31 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: Shahaf Shuler, dev, Nélio Laranjeiro, Adrien Mazarguil
> -----Original Message-----
> From: Yongseok Koh
> Sent: Friday, May 25, 2018 11:41 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Nélio Laranjeiro <nelio.laranjeiro@6wind.com>;
> Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Subject: Re: [PATCH] net/mlx5: fix memory region cache init
> 
> 
> > On May 25, 2018, at 6:18 AM, Xueming(Steven) Li <xuemingl@mellanox.com> wrote:
> >
> >
> >
> >> -----Original Message-----
> >> From: Yongseok Koh
> >> Sent: Friday, May 25, 2018 6:20 PM
> >> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> >> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> >> Subject: Re: [PATCH] net/mlx5: fix memory region cache init
> >>
> >>> On May 24, 2018, at 11:35 PM, Xueming Li <xuemingl@mellanox.com> wrote:
> >>>
> >>> This patch moved MR cache init from device configuration function to
> >>> probe function to make sure init only once.
> >>>
> >>> Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> >>> Cc: yskoh@mellanox.com
> >>>
> >>> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> >>> ---
> >>> drivers/net/mlx5/mlx5.c        | 11 +++++++++++
> >>> drivers/net/mlx5/mlx5_ethdev.c | 11 -----------
> >>> drivers/net/mlx5/mlx5_mr.c     |  1 +
> >>> 3 files changed, 12 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> >>> dae847493..77ed8e01f 100644
> >>> --- a/drivers/net/mlx5/mlx5.c
> >>> +++ b/drivers/net/mlx5/mlx5.c
> >>> @@ -1193,6 +1193,17 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> >>>            goto port_error;
> >>>        }
> >>>        priv->config.max_verbs_prio = verb_priorities;
> >>> +        /*
> >>> +         * Once the device is added to the list of memory event
> >>> +         * callback, its global MR cache table cannot be expanded
> >>> +         * on the fly because of deadlock. If it overflows, lookup
> >>> +         * should be done by searching MR list linearly, which is slow.
> >>> +         */
> >>> +        err = -mlx5_mr_btree_init(&priv->mr.cache,
> >>> +                      MLX5_MR_BTREE_CACHE_N * 2,
> >>> +                      eth_dev->device->numa_node);
> >>> +        if (err)
> >>> +            goto port_error;
> >>
> >> A nit.
> >> Like mlx5_flow_create_drop_queue(), please store rte_errno to err
> >> (err = rte_errno;) instead of putting a minus sign to the function.
> >
> > Rte_errno is set inside mlx5_mr_btree_init() if any error, that's why I simply the code.
> 
> I understood your intention and I know that’s not faulty either.
> 
> However, here consistency is more important than simplicity. In the probe func, we have been fixing
> quite a few bugs due to code inconsistencies. I don’t want to put a minus sign but follow the way as
> we did for mlx5_flow_create_drop_queue() in the same function. Putting a minus sign unlike other code
> around it could be another buggy point if someone else makes changes there in the future. Please make
> it aligned because both approaches are same anyway.
Such design is confusing, error code just used to test error happens or not, in functions we spend many 
code to save errno to both rte_errno and return value, and then also caller side. If people intend to 
have error info saved in rte_errno, then maybe changing functions returning void could be better - this 
something also anti-design, like errno, rte_errno is used to indicate detail root cause when error happens,
for people who want to know detail, that can’t necessarily mean that we should pass it precisely in each
caller. Just personal concern.
> 
> Thanks
> Yongseok
> >>
> >> With that being fixed, you can put my acked-by tag when you submit v2.
> >>
> >> Thanks,
> >> Yongseok
> >>
> >>>        /* Add device to memory callback list. */
> >>>        rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
> >>>        LIST_INSERT_HEAD(&mlx5_shared_data->mem_event_cb_list,
> >>> diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> >>> b/drivers/net/mlx5/mlx5_ethdev.c index f6cebae41..90488af33 100644
> >>> --- a/drivers/net/mlx5/mlx5_ethdev.c
> >>> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> >>> @@ -392,17 +392,6 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
> >>>        if (++j == rxqs_n)
> >>>            j = 0;
> >>>    }
> >>> -    /*
> >>> -     * Once the device is added to the list of memory event callback, its
> >>> -     * global MR cache table cannot be expanded on the fly because of
> >>> -     * deadlock. If it overflows, lookup should be done by searching MR list
> >>> -     * linearly, which is slow.
> >>> -     */
> >>> -    if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
> >>> -                   dev->device->numa_node)) {
> >>> -        /* rte_errno is already set. */
> >>> -        return -rte_errno;
> >>> -    }
> >>>    return 0;
> >>> }
> >>>
> >>> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> >>> index abb1f5179..08105a443 100644
> >>> --- a/drivers/net/mlx5/mlx5_mr.c
> >>> +++ b/drivers/net/mlx5/mlx5_mr.c
> >>> @@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
> >>>        rte_errno = EINVAL;
> >>>        return -rte_errno;
> >>>    }
> >>> +    assert(!bt->table && !bt->size);
> >>>    memset(bt, 0, sizeof(*bt));
> >>>    bt->table = rte_calloc_socket("B-tree table",
> >>>                      n, sizeof(struct mlx5_mr_cache),
> >>> --
> >>> 2.13.3
> >>>
> >
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init
  2018-05-26  2:31       ` Xueming(Steven) Li
@ 2018-05-26  5:01         ` Yongseok Koh
  0 siblings, 0 replies; 11+ messages in thread
From: Yongseok Koh @ 2018-05-26  5:01 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: Shahaf Shuler, dev, Nélio Laranjeiro, Adrien Mazarguil
> On May 25, 2018, at 7:32 PM, Xueming(Steven) Li <xuemingl@mellanox.com> wrote:
> 
> 
> 
>> -----Original Message-----
>> From: Yongseok Koh
>> Sent: Friday, May 25, 2018 11:41 PM
>> To: Xueming(Steven) Li <xuemingl@mellanox.com>
>> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Nélio Laranjeiro <nelio.laranjeiro@6wind.com>;
>> Adrien Mazarguil <adrien.mazarguil@6wind.com>
>> Subject: Re: [PATCH] net/mlx5: fix memory region cache init
>> 
>> 
>>> On May 25, 2018, at 6:18 AM, Xueming(Steven) Li <xuemingl@mellanox.com> wrote:
>>> 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Yongseok Koh
>>>> Sent: Friday, May 25, 2018 6:20 PM
>>>> To: Xueming(Steven) Li <xuemingl@mellanox.com>
>>>> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
>>>> Subject: Re: [PATCH] net/mlx5: fix memory region cache init
>>>> 
>>>>> On May 24, 2018, at 11:35 PM, Xueming Li <xuemingl@mellanox.com> wrote:
>>>>> 
>>>>> This patch moved MR cache init from device configuration function to
>>>>> probe function to make sure init only once.
>>>>> 
>>>>> Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
>>>>> Cc: yskoh@mellanox.com
>>>>> 
>>>>> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
>>>>> ---
>>>>> drivers/net/mlx5/mlx5.c        | 11 +++++++++++
>>>>> drivers/net/mlx5/mlx5_ethdev.c | 11 -----------
>>>>> drivers/net/mlx5/mlx5_mr.c     |  1 +
>>>>> 3 files changed, 12 insertions(+), 11 deletions(-)
>>>>> 
>>>>> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
>>>>> dae847493..77ed8e01f 100644
>>>>> --- a/drivers/net/mlx5/mlx5.c
>>>>> +++ b/drivers/net/mlx5/mlx5.c
>>>>> @@ -1193,6 +1193,17 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>>>>>           goto port_error;
>>>>>       }
>>>>>       priv->config.max_verbs_prio = verb_priorities;
>>>>> +        /*
>>>>> +         * Once the device is added to the list of memory event
>>>>> +         * callback, its global MR cache table cannot be expanded
>>>>> +         * on the fly because of deadlock. If it overflows, lookup
>>>>> +         * should be done by searching MR list linearly, which is slow.
>>>>> +         */
>>>>> +        err = -mlx5_mr_btree_init(&priv->mr.cache,
>>>>> +                      MLX5_MR_BTREE_CACHE_N * 2,
>>>>> +                      eth_dev->device->numa_node);
>>>>> +        if (err)
>>>>> +            goto port_error;
>>>> 
>>>> A nit.
>>>> Like mlx5_flow_create_drop_queue(), please store rte_errno to err
>>>> (err = rte_errno;) instead of putting a minus sign to the function.
>>> 
>>> Rte_errno is set inside mlx5_mr_btree_init() if any error, that's why I simply the code.
>> 
>> I understood your intention and I know that’s not faulty either.
>> 
>> However, here consistency is more important than simplicity. In the probe func, we have been fixing
>> quite a few bugs due to code inconsistencies. I don’t want to put a minus sign but follow the way as
>> we did for mlx5_flow_create_drop_queue() in the same function. Putting a minus sign unlike other code
>> around it could be another buggy point if someone else makes changes there in the future. Please make
>> it aligned because both approaches are same anyway.
> 
> Such design is confusing, error code just used to test error happens or not, in functions we spend many 
> code to save errno to both rte_errno and return value, and then also caller side. If people intend to 
> have error info saved in rte_errno, then maybe changing functions returning void could be better - this 
> something also anti-design, like errno, rte_errno is used to indicate detail root cause when error happens,
> for people who want to know detail, that can’t necessarily mean that we should pass it precisely in each
> caller. Just personal concern.
Of course, you can submit the idea to make such an enhancement. Please come up with a separate patch for 18.08. 
For now, let’s focus on fixing the bug for the current release. Schedule is really tight for 18.05. Please send out a new version. We don’t have much time.
Thanks
Yongseok 
>>>> 
>>>> With that being fixed, you can put my acked-by tag when you submit v2.
>>>> 
>>>> Thanks,
>>>> Yongseok
>>>> 
>>>>>       /* Add device to memory callback list. */
>>>>>       rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
>>>>>       LIST_INSERT_HEAD(&mlx5_shared_data->mem_event_cb_list,
>>>>> diff --git a/drivers/net/mlx5/mlx5_ethdev.c
>>>>> b/drivers/net/mlx5/mlx5_ethdev.c index f6cebae41..90488af33 100644
>>>>> --- a/drivers/net/mlx5/mlx5_ethdev.c
>>>>> +++ b/drivers/net/mlx5/mlx5_ethdev.c
>>>>> @@ -392,17 +392,6 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
>>>>>       if (++j == rxqs_n)
>>>>>           j = 0;
>>>>>   }
>>>>> -    /*
>>>>> -     * Once the device is added to the list of memory event callback, its
>>>>> -     * global MR cache table cannot be expanded on the fly because of
>>>>> -     * deadlock. If it overflows, lookup should be done by searching MR list
>>>>> -     * linearly, which is slow.
>>>>> -     */
>>>>> -    if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
>>>>> -                   dev->device->numa_node)) {
>>>>> -        /* rte_errno is already set. */
>>>>> -        return -rte_errno;
>>>>> -    }
>>>>>   return 0;
>>>>> }
>>>>> 
>>>>> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
>>>>> index abb1f5179..08105a443 100644
>>>>> --- a/drivers/net/mlx5/mlx5_mr.c
>>>>> +++ b/drivers/net/mlx5/mlx5_mr.c
>>>>> @@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
>>>>>       rte_errno = EINVAL;
>>>>>       return -rte_errno;
>>>>>   }
>>>>> +    assert(!bt->table && !bt->size);
>>>>>   memset(bt, 0, sizeof(*bt));
>>>>>   bt->table = rte_calloc_socket("B-tree table",
>>>>>                     n, sizeof(struct mlx5_mr_cache),
>>>>> --
>>>>> 2.13.3
>>>>> 
>>> 
^ permalink raw reply	[flat|nested] 11+ messages in thread
* [dpdk-dev] [PATCH v2] net/mlx5: fix memory region cache init
  2018-05-25  6:35 [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init Xueming Li
  2018-05-25  6:38 ` Xueming(Steven) Li
  2018-05-25 10:19 ` Yongseok Koh
@ 2018-05-26 13:27 ` Xueming Li
  2018-05-26 17:08   ` Yongseok Koh
  2 siblings, 1 reply; 11+ messages in thread
From: Xueming Li @ 2018-05-26 13:27 UTC (permalink / raw)
  To: Shahaf Shuler, Yongseok Koh; +Cc: Xueming Li, dev
This patch moved MR cache init from device configuration function to
probe function to make sure init only once.
Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: yskoh@mellanox.com
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 drivers/net/mlx5/mlx5.c        | 13 +++++++++++++
 drivers/net/mlx5/mlx5_ethdev.c | 11 -----------
 drivers/net/mlx5/mlx5_mr.c     |  1 +
 3 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index dae847493..3ef02e2d2 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1193,6 +1193,19 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			goto port_error;
 		}
 		priv->config.max_verbs_prio = verb_priorities;
+		/*
+		 * Once the device is added to the list of memory event
+		 * callback, its global MR cache table cannot be expanded
+		 * on the fly because of deadlock. If it overflows, lookup
+		 * should be done by searching MR list linearly, which is slow.
+		 */
+		err = mlx5_mr_btree_init(&priv->mr.cache,
+					 MLX5_MR_BTREE_CACHE_N * 2,
+					 eth_dev->device->numa_node);
+		if (err) {
+			err = rte_errno;
+			goto port_error;
+		}
 		/* Add device to memory callback list. */
 		rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
 		LIST_INSERT_HEAD(&mlx5_shared_data->mem_event_cb_list,
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index f6cebae41..90488af33 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -392,17 +392,6 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
 		if (++j == rxqs_n)
 			j = 0;
 	}
-	/*
-	 * Once the device is added to the list of memory event callback, its
-	 * global MR cache table cannot be expanded on the fly because of
-	 * deadlock. If it overflows, lookup should be done by searching MR list
-	 * linearly, which is slow.
-	 */
-	if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
-			       dev->device->numa_node)) {
-		/* rte_errno is already set. */
-		return -rte_errno;
-	}
 	return 0;
 }
 
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index abb1f5179..08105a443 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
 		rte_errno = EINVAL;
 		return -rte_errno;
 	}
+	assert(!bt->table && !bt->size);
 	memset(bt, 0, sizeof(*bt));
 	bt->table = rte_calloc_socket("B-tree table",
 				      n, sizeof(struct mlx5_mr_cache),
-- 
2.13.3
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [dpdk-dev] [PATCH v2] net/mlx5: fix memory region cache init
  2018-05-26 13:27 ` [dpdk-dev] [PATCH v2] " Xueming Li
@ 2018-05-26 17:08   ` Yongseok Koh
  2018-05-27  5:05     ` Shahaf Shuler
  0 siblings, 1 reply; 11+ messages in thread
From: Yongseok Koh @ 2018-05-26 17:08 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev
> On May 26, 2018, at 6:28 AM, Xueming Li <xuemingl@mellanox.com> wrote:
> 
> This patch moved MR cache init from device configuration function to
> probe function to make sure init only once.
> 
> Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> Cc: yskoh@mellanox.com
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Thanks
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [dpdk-dev] [PATCH v2] net/mlx5: fix memory region cache init
  2018-05-26 17:08   ` Yongseok Koh
@ 2018-05-27  5:05     ` Shahaf Shuler
  2018-05-27  5:08       ` Shahaf Shuler
  0 siblings, 1 reply; 11+ messages in thread
From: Shahaf Shuler @ 2018-05-27  5:05 UTC (permalink / raw)
  To: Yongseok Koh, Xueming(Steven) Li; +Cc: dev
Saturday, May 26, 2018 8:08 PM, Yongseok Koh:
> Subject: Re: [PATCH v2] net/mlx5: fix memory region cache init
> 
> 
> > On May 26, 2018, at 6:28 AM, Xueming Li <xuemingl@mellanox.com>
> wrote:
> >
> > This patch moved MR cache init from device configuration function to
> > probe function to make sure init only once.
> >
> > Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> > Cc: yskoh@mellanox.com
> >
> > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > ---
> Acked-by: Yongseok Koh <yskoh@mellanox.com>
Applied to next-net-mlx, thanks. 
> 
> Thanks
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [dpdk-dev] [PATCH v2] net/mlx5: fix memory region cache init
  2018-05-27  5:05     ` Shahaf Shuler
@ 2018-05-27  5:08       ` Shahaf Shuler
  0 siblings, 0 replies; 11+ messages in thread
From: Shahaf Shuler @ 2018-05-27  5:08 UTC (permalink / raw)
  To: Shahaf Shuler, Yongseok Koh, Xueming(Steven) Li, Ferruh Yigit; +Cc: dev
Ferruh,
Sunday, May 27, 2018 8:05 AM, Shahaf Shuler:
> Subject: Re: [dpdk-dev] [PATCH v2] net/mlx5: fix memory region cache init
> 
> Saturday, May 26, 2018 8:08 PM, Yongseok Koh:
> > Subject: Re: [PATCH v2] net/mlx5: fix memory region cache init
> >
> >
> > > On May 26, 2018, at 6:28 AM, Xueming Li <xuemingl@mellanox.com>
> > wrote:
> > >
> > > This patch moved MR cache init from device configuration function to
> > > probe function to make sure init only once.
> > >
> > > Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> > > Cc: yskoh@mellanox.com
> > >
> > > Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > > ---
> > Acked-by: Yongseok Koh <yskoh@mellanox.com>
> 
> Applied to next-net-mlx, thanks.
This is yet another critical fix for a bug caught recently. 
This patch is to prevent deadlock when restarting the port.  
Hope you will be able to pull this one in. 
> 
> >
> > Thanks
^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2018-05-27  5:08 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-25  6:35 [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init Xueming Li
2018-05-25  6:38 ` Xueming(Steven) Li
2018-05-25 10:19 ` Yongseok Koh
2018-05-25 13:18   ` Xueming(Steven) Li
2018-05-25 15:41     ` Yongseok Koh
2018-05-26  2:31       ` Xueming(Steven) Li
2018-05-26  5:01         ` Yongseok Koh
2018-05-26 13:27 ` [dpdk-dev] [PATCH v2] " Xueming Li
2018-05-26 17:08   ` Yongseok Koh
2018-05-27  5:05     ` Shahaf Shuler
2018-05-27  5:08       ` Shahaf Shuler
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).