Re: [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init

DPDK patches and discussions
 help / color / mirror / Atom feed

From: Yongseok Koh <yskoh@mellanox.com>
To: "Xueming(Steven) Li" <xuemingl@mellanox.com>
Cc: "Shahaf Shuler" <shahafs@mellanox.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Nélio Laranjeiro" <nelio.laranjeiro@6wind.com>,
	"Adrien Mazarguil" <adrien.mazarguil@6wind.com>
Subject: Re: [dpdk-dev] [PATCH] net/mlx5: fix memory region cache init
Date: Sat, 26 May 2018 05:01:28 +0000	[thread overview]
Message-ID: <BEC85FF7-CB39-4A61-BE30-DEF6C761B6D6@mellanox.com> (raw)
In-Reply-To: <VI1PR05MB16783ED3FD9B087962423975AC680@VI1PR05MB1678.eurprd05.prod.outlook.com>


> On May 25, 2018, at 7:32 PM, Xueming(Steven) Li <xuemingl@mellanox.com> wrote:
> 
> 
> 
>> -----Original Message-----
>> From: Yongseok Koh
>> Sent: Friday, May 25, 2018 11:41 PM
>> To: Xueming(Steven) Li <xuemingl@mellanox.com>
>> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org; Nélio Laranjeiro <nelio.laranjeiro@6wind.com>;
>> Adrien Mazarguil <adrien.mazarguil@6wind.com>
>> Subject: Re: [PATCH] net/mlx5: fix memory region cache init
>> 
>> 
>>> On May 25, 2018, at 6:18 AM, Xueming(Steven) Li <xuemingl@mellanox.com> wrote:
>>> 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Yongseok Koh
>>>> Sent: Friday, May 25, 2018 6:20 PM
>>>> To: Xueming(Steven) Li <xuemingl@mellanox.com>
>>>> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
>>>> Subject: Re: [PATCH] net/mlx5: fix memory region cache init
>>>> 
>>>>> On May 24, 2018, at 11:35 PM, Xueming Li <xuemingl@mellanox.com> wrote:
>>>>> 
>>>>> This patch moved MR cache init from device configuration function to
>>>>> probe function to make sure init only once.
>>>>> 
>>>>> Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
>>>>> Cc: yskoh@mellanox.com
>>>>> 
>>>>> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
>>>>> ---
>>>>> drivers/net/mlx5/mlx5.c        | 11 +++++++++++
>>>>> drivers/net/mlx5/mlx5_ethdev.c | 11 -----------
>>>>> drivers/net/mlx5/mlx5_mr.c     |  1 +
>>>>> 3 files changed, 12 insertions(+), 11 deletions(-)
>>>>> 
>>>>> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
>>>>> dae847493..77ed8e01f 100644
>>>>> --- a/drivers/net/mlx5/mlx5.c
>>>>> +++ b/drivers/net/mlx5/mlx5.c
>>>>> @@ -1193,6 +1193,17 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>>>>>           goto port_error;
>>>>>       }
>>>>>       priv->config.max_verbs_prio = verb_priorities;
>>>>> +        /*
>>>>> +         * Once the device is added to the list of memory event
>>>>> +         * callback, its global MR cache table cannot be expanded
>>>>> +         * on the fly because of deadlock. If it overflows, lookup
>>>>> +         * should be done by searching MR list linearly, which is slow.
>>>>> +         */
>>>>> +        err = -mlx5_mr_btree_init(&priv->mr.cache,
>>>>> +                      MLX5_MR_BTREE_CACHE_N * 2,
>>>>> +                      eth_dev->device->numa_node);
>>>>> +        if (err)
>>>>> +            goto port_error;
>>>> 
>>>> A nit.
>>>> Like mlx5_flow_create_drop_queue(), please store rte_errno to err
>>>> (err = rte_errno;) instead of putting a minus sign to the function.
>>> 
>>> Rte_errno is set inside mlx5_mr_btree_init() if any error, that's why I simply the code.
>> 
>> I understood your intention and I know that’s not faulty either.
>> 
>> However, here consistency is more important than simplicity. In the probe func, we have been fixing
>> quite a few bugs due to code inconsistencies. I don’t want to put a minus sign but follow the way as
>> we did for mlx5_flow_create_drop_queue() in the same function. Putting a minus sign unlike other code
>> around it could be another buggy point if someone else makes changes there in the future. Please make
>> it aligned because both approaches are same anyway.
> 
> Such design is confusing, error code just used to test error happens or not, in functions we spend many 
> code to save errno to both rte_errno and return value, and then also caller side. If people intend to 
> have error info saved in rte_errno, then maybe changing functions returning void could be better - this 
> something also anti-design, like errno, rte_errno is used to indicate detail root cause when error happens,
> for people who want to know detail, that can’t necessarily mean that we should pass it precisely in each
> caller. Just personal concern.

Of course, you can submit the idea to make such an enhancement. Please come up with a separate patch for 18.08. 

For now, let’s focus on fixing the bug for the current release. Schedule is really tight for 18.05. Please send out a new version. We don’t have much time.

Thanks
Yongseok 


>>>> 
>>>> With that being fixed, you can put my acked-by tag when you submit v2.
>>>> 
>>>> Thanks,
>>>> Yongseok
>>>> 
>>>>>       /* Add device to memory callback list. */
>>>>>       rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
>>>>>       LIST_INSERT_HEAD(&mlx5_shared_data->mem_event_cb_list,
>>>>> diff --git a/drivers/net/mlx5/mlx5_ethdev.c
>>>>> b/drivers/net/mlx5/mlx5_ethdev.c index f6cebae41..90488af33 100644
>>>>> --- a/drivers/net/mlx5/mlx5_ethdev.c
>>>>> +++ b/drivers/net/mlx5/mlx5_ethdev.c
>>>>> @@ -392,17 +392,6 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
>>>>>       if (++j == rxqs_n)
>>>>>           j = 0;
>>>>>   }
>>>>> -    /*
>>>>> -     * Once the device is added to the list of memory event callback, its
>>>>> -     * global MR cache table cannot be expanded on the fly because of
>>>>> -     * deadlock. If it overflows, lookup should be done by searching MR list
>>>>> -     * linearly, which is slow.
>>>>> -     */
>>>>> -    if (mlx5_mr_btree_init(&priv->mr.cache, MLX5_MR_BTREE_CACHE_N * 2,
>>>>> -                   dev->device->numa_node)) {
>>>>> -        /* rte_errno is already set. */
>>>>> -        return -rte_errno;
>>>>> -    }
>>>>>   return 0;
>>>>> }
>>>>> 
>>>>> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
>>>>> index abb1f5179..08105a443 100644
>>>>> --- a/drivers/net/mlx5/mlx5_mr.c
>>>>> +++ b/drivers/net/mlx5/mlx5_mr.c
>>>>> @@ -191,6 +191,7 @@ mlx5_mr_btree_init(struct mlx5_mr_btree *bt, int n, int socket)
>>>>>       rte_errno = EINVAL;
>>>>>       return -rte_errno;
>>>>>   }
>>>>> +    assert(!bt->table && !bt->size);
>>>>>   memset(bt, 0, sizeof(*bt));
>>>>>   bt->table = rte_calloc_socket("B-tree table",
>>>>>                     n, sizeof(struct mlx5_mr_cache),
>>>>> --
>>>>> 2.13.3
>>>>> 
>>>

next prev parent reply	other threads:[~2018-05-26  5:01 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-25  6:35 Xueming Li
2018-05-25  6:38 ` Xueming(Steven) Li
2018-05-25 10:19 ` Yongseok Koh
2018-05-25 13:18   ` Xueming(Steven) Li
2018-05-25 15:41     ` Yongseok Koh
2018-05-26  2:31       ` Xueming(Steven) Li
2018-05-26  5:01         ` Yongseok Koh [this message]
2018-05-26 13:27 ` [dpdk-dev] [PATCH v2] " Xueming Li
2018-05-26 17:08   ` Yongseok Koh
2018-05-27  5:05     ` Shahaf Shuler
2018-05-27  5:08       ` Shahaf Shuler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BEC85FF7-CB39-4A61-BE30-DEF6C761B6D6@mellanox.com \
    --to=yskoh@mellanox.com \
    --cc=adrien.mazarguil@6wind.com \
    --cc=dev@dpdk.org \
    --cc=nelio.laranjeiro@6wind.com \
    --cc=shahafs@mellanox.com \
    --cc=xuemingl@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).