patches for DPDK stable branches
 help / color / mirror / Atom feed
* [dpdk-stable] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer
@ 2019-01-09  8:54 Yongseok Koh
  2019-01-09  9:38 ` [dpdk-stable] [dpdk-dev] " David Marchand
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Yongseok Koh @ 2019-01-09  8:54 UTC (permalink / raw)
  To: shahafs; +Cc: dev, stable

On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
to be accessed as it is static and easily calculated from the mbuf address.
Accessing the mbuf content causes unnecessary load stall and it is worsened
on ARM.

Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fda7004e2d..ced5547307 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
 		return;
 	}
 	for (i = 0; i < n; ++i) {
-		wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr +
-					      RTE_PKTMBUF_HEADROOM);
+		uintptr_t buf_addr =
+			(uintptr_t)elts[i] + sizeof(struct rte_mbuf) +
+			rte_pktmbuf_priv_size(rxq->mp) + RTE_PKTMBUF_HEADROOM;
+
+		assert(buf_addr == (uintptr_t)elts[i]->buf_addr);
+		wq[i].addr = rte_cpu_to_be_64(buf_addr);
 		/* If there's only one MR, no need to replace LKey in WQE. */
 		if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1))
 			wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-stable] [dpdk-dev] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer
  2019-01-09  8:54 [dpdk-stable] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer Yongseok Koh
@ 2019-01-09  9:38 ` David Marchand
  2019-01-09  9:52   ` Olivier Matz
       [not found] ` <20190109131908.4949-1-yskoh@mellanox.com>
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: David Marchand @ 2019-01-09  9:38 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: shahafs, dev, stable

On Wed, Jan 9, 2019 at 9:54 AM Yongseok Koh <yskoh@mellanox.com> wrote:

> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
> to be accessed as it is static and easily calculated from the mbuf address.
> Accessing the mbuf content causes unnecessary load stall and it is worsened
> on ARM.
>
> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
> Cc: stable@dpdk.org
>
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h
> b/drivers/net/mlx5/mlx5_rxtx_vec.h
> index fda7004e2d..ced5547307 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data
> *rxq, uint16_t n)
>                 return;
>         }
>         for (i = 0; i < n; ++i) {
> -               wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr
> +
> -                                             RTE_PKTMBUF_HEADROOM);
> +               uintptr_t buf_addr =
> +                       (uintptr_t)elts[i] + sizeof(struct rte_mbuf) +
> +                       rte_pktmbuf_priv_size(rxq->mp) +
> RTE_PKTMBUF_HEADROOM;
> +
> +               assert(buf_addr == (uintptr_t)elts[i]->buf_addr);
> +               wq[i].addr = rte_cpu_to_be_64(buf_addr);
>                 /* If there's only one MR, no need to replace LKey in WQE.
> */
>                 if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) >
> 1))
>                         wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
> --
> 2.11.0
>
>
How about having a macro / inline in the mbuf api to get this information
in a consistent/unique way ?
I can see we have this calculation at least in rte_pktmbuf_init() and
rte_pktmbuf_detach().


-- 
David Marchand

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-stable] [dpdk-dev] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer
  2019-01-09  9:38 ` [dpdk-stable] [dpdk-dev] " David Marchand
@ 2019-01-09  9:52   ` Olivier Matz
  2019-01-09  9:56     ` Yongseok Koh
  0 siblings, 1 reply; 14+ messages in thread
From: Olivier Matz @ 2019-01-09  9:52 UTC (permalink / raw)
  To: David Marchand; +Cc: Yongseok Koh, shahafs, dev, stable

On Wed, Jan 09, 2019 at 10:38:07AM +0100, David Marchand wrote:
> On Wed, Jan 9, 2019 at 9:54 AM Yongseok Koh <yskoh@mellanox.com> wrote:
> 
> > On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
> > to be accessed as it is static and easily calculated from the mbuf address.
> > Accessing the mbuf content causes unnecessary load stall and it is worsened
> > on ARM.
> >
> > Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h
> > b/drivers/net/mlx5/mlx5_rxtx_vec.h
> > index fda7004e2d..ced5547307 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> > @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data
> > *rxq, uint16_t n)
> >                 return;
> >         }
> >         for (i = 0; i < n; ++i) {
> > -               wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr
> > +
> > -                                             RTE_PKTMBUF_HEADROOM);
> > +               uintptr_t buf_addr =
> > +                       (uintptr_t)elts[i] + sizeof(struct rte_mbuf) +
> > +                       rte_pktmbuf_priv_size(rxq->mp) +
> > RTE_PKTMBUF_HEADROOM;
> > +
> > +               assert(buf_addr == (uintptr_t)elts[i]->buf_addr);
> > +               wq[i].addr = rte_cpu_to_be_64(buf_addr);
> >                 /* If there's only one MR, no need to replace LKey in WQE.
> > */
> >                 if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) >
> > 1))
> >                         wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
> > --
> > 2.11.0
> >
> >
> How about having a macro / inline in the mbuf api to get this information
> in a consistent/unique way ?
> I can see we have this calculation at least in rte_pktmbuf_init() and
> rte_pktmbuf_detach().

Agree. Maybe rte_mbuf_default_buf_addr(m) ?

Side note, is the assert() correct in the patch? I'd say there's a
difference of RTE_PKTMBUF_HEADROOM between the 2 values.

Olivier

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-stable] [dpdk-dev] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer
  2019-01-09  9:52   ` Olivier Matz
@ 2019-01-09  9:56     ` Yongseok Koh
  2019-01-09 10:05       ` David Marchand
  0 siblings, 1 reply; 14+ messages in thread
From: Yongseok Koh @ 2019-01-09  9:56 UTC (permalink / raw)
  To: Olivier Matz; +Cc: David Marchand, Shahaf Shuler, dev, stable


> On Jan 9, 2019, at 1:52 AM, Olivier Matz <olivier.matz@6wind.com> wrote:
> 
> On Wed, Jan 09, 2019 at 10:38:07AM +0100, David Marchand wrote:
>> On Wed, Jan 9, 2019 at 9:54 AM Yongseok Koh <yskoh@mellanox.com> wrote:
>> 
>>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
>>> to be accessed as it is static and easily calculated from the mbuf address.
>>> Accessing the mbuf content causes unnecessary load stall and it is worsened
>>> on ARM.
>>> 
>>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
>>> Cc: stable@dpdk.org
>>> 
>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>> ---
>>> drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++--
>>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h
>>> b/drivers/net/mlx5/mlx5_rxtx_vec.h
>>> index fda7004e2d..ced5547307 100644
>>> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
>>> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
>>> @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data
>>> *rxq, uint16_t n)
>>>                return;
>>>        }
>>>        for (i = 0; i < n; ++i) {
>>> -               wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr
>>> +
>>> -                                             RTE_PKTMBUF_HEADROOM);
>>> +               uintptr_t buf_addr =
>>> +                       (uintptr_t)elts[i] + sizeof(struct rte_mbuf) +
>>> +                       rte_pktmbuf_priv_size(rxq->mp) +
>>> RTE_PKTMBUF_HEADROOM;
>>> +
>>> +               assert(buf_addr == (uintptr_t)elts[i]->buf_addr);
>>> +               wq[i].addr = rte_cpu_to_be_64(buf_addr);
>>>                /* If there's only one MR, no need to replace LKey in WQE.
>>> */
>>>                if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) >
>>> 1))
>>>                        wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
>>> --
>>> 2.11.0
>>> 
>>> 
>> How about having a macro / inline in the mbuf api to get this information
>> in a consistent/unique way ?
>> I can see we have this calculation at least in rte_pktmbuf_init() and
>> rte_pktmbuf_detach().
> 
> Agree. Maybe rte_mbuf_default_buf_addr(m) ?

I'm also okay to add. Will come up with a new patch.

> Side note, is the assert() correct in the patch? I'd say there's a
> difference of RTE_PKTMBUF_HEADROOM between the 2 values.

Oops, my fault. Thanks for the catch, you saved a crash. :-)

Thanks,
Yongseok

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-stable] [dpdk-dev] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer
  2019-01-09  9:56     ` Yongseok Koh
@ 2019-01-09 10:05       ` David Marchand
  2019-01-09 10:11         ` Yongseok Koh
  0 siblings, 1 reply; 14+ messages in thread
From: David Marchand @ 2019-01-09 10:05 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: Olivier Matz, Shahaf Shuler, dev, stable

On Wed, Jan 9, 2019 at 10:56 AM Yongseok Koh <yskoh@mellanox.com> wrote:

>
> > On Jan 9, 2019, at 1:52 AM, Olivier Matz <olivier.matz@6wind.com> wrote:
> >
> > On Wed, Jan 09, 2019 at 10:38:07AM +0100, David Marchand wrote:
> >> On Wed, Jan 9, 2019 at 9:54 AM Yongseok Koh <yskoh@mellanox.com> wrote:
> >>
> >>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't
> needed
> >>> to be accessed as it is static and easily calculated from the mbuf
> address.
> >>> Accessing the mbuf content causes unnecessary load stall and it is
> worsened
> >>> on ARM.
> >>>
> >>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
> >>> Cc: stable@dpdk.org
> >>>
> >>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>> ---
> >>> drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++--
> >>> 1 file changed, 6 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h
> >>> b/drivers/net/mlx5/mlx5_rxtx_vec.h
> >>> index fda7004e2d..ced5547307 100644
> >>> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> >>> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> >>> @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data
> >>> *rxq, uint16_t n)
> >>>                return;
> >>>        }
> >>>        for (i = 0; i < n; ++i) {
> >>> -               wq[i].addr =
> rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr
> >>> +
> >>> -                                             RTE_PKTMBUF_HEADROOM);
> >>> +               uintptr_t buf_addr =
> >>> +                       (uintptr_t)elts[i] + sizeof(struct rte_mbuf) +
> >>> +                       rte_pktmbuf_priv_size(rxq->mp) +
> >>> RTE_PKTMBUF_HEADROOM;
> >>> +
> >>> +               assert(buf_addr == (uintptr_t)elts[i]->buf_addr);
> >>> +               wq[i].addr = rte_cpu_to_be_64(buf_addr);
> >>>                /* If there's only one MR, no need to replace LKey in
> WQE.
> >>> */
> >>>                if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) >
> >>> 1))
> >>>                        wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
> >>> --
> >>> 2.11.0
> >>>
> >>>
> >> How about having a macro / inline in the mbuf api to get this
> information
> >> in a consistent/unique way ?
> >> I can see we have this calculation at least in rte_pktmbuf_init() and
> >> rte_pktmbuf_detach().
> >
> > Agree. Maybe rte_mbuf_default_buf_addr(m) ?
>
> I'm also okay to add. Will come up with a new patch.
>
> > Side note, is the assert() correct in the patch? I'd say there's a
> > difference of RTE_PKTMBUF_HEADROOM between the 2 values.
>
> Oops, my fault. Thanks for the catch, you saved a crash. :-)
>

Is this assert really necessary if we have a common macro ?
I was under the impression that this assert is there to catch misalignement
between the mbuf api and the driver.


-- 
David Marchand

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-stable] [dpdk-dev] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer
  2019-01-09 10:05       ` David Marchand
@ 2019-01-09 10:11         ` Yongseok Koh
  0 siblings, 0 replies; 14+ messages in thread
From: Yongseok Koh @ 2019-01-09 10:11 UTC (permalink / raw)
  To: David Marchand; +Cc: Olivier Matz, Shahaf Shuler, dev, stable



> On Jan 9, 2019, at 2:05 AM, David Marchand <david.marchand@redhat.com> wrote:
> 
> 
> 
> On Wed, Jan 9, 2019 at 10:56 AM Yongseok Koh <yskoh@mellanox.com> wrote:
> 
> > On Jan 9, 2019, at 1:52 AM, Olivier Matz <olivier.matz@6wind.com> wrote:
> > 
> > On Wed, Jan 09, 2019 at 10:38:07AM +0100, David Marchand wrote:
> >> On Wed, Jan 9, 2019 at 9:54 AM Yongseok Koh <yskoh@mellanox.com> wrote:
> >> 
> >>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
> >>> to be accessed as it is static and easily calculated from the mbuf address.
> >>> Accessing the mbuf content causes unnecessary load stall and it is worsened
> >>> on ARM.
> >>> 
> >>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
> >>> Cc: stable@dpdk.org
> >>> 
> >>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>> ---
> >>> drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++--
> >>> 1 file changed, 6 insertions(+), 2 deletions(-)
> >>> 
> >>> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h
> >>> b/drivers/net/mlx5/mlx5_rxtx_vec.h
> >>> index fda7004e2d..ced5547307 100644
> >>> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> >>> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> >>> @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data
> >>> *rxq, uint16_t n)
> >>>                return;
> >>>        }
> >>>        for (i = 0; i < n; ++i) {
> >>> -               wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr
> >>> +
> >>> -                                             RTE_PKTMBUF_HEADROOM);
> >>> +               uintptr_t buf_addr =
> >>> +                       (uintptr_t)elts[i] + sizeof(struct rte_mbuf) +
> >>> +                       rte_pktmbuf_priv_size(rxq->mp) +
> >>> RTE_PKTMBUF_HEADROOM;
> >>> +
> >>> +               assert(buf_addr == (uintptr_t)elts[i]->buf_addr);
> >>> +               wq[i].addr = rte_cpu_to_be_64(buf_addr);
> >>>                /* If there's only one MR, no need to replace LKey in WQE.
> >>> */
> >>>                if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) >
> >>> 1))
> >>>                        wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
> >>> --
> >>> 2.11.0
> >>> 
> >>> 
> >> How about having a macro / inline in the mbuf api to get this information
> >> in a consistent/unique way ?
> >> I can see we have this calculation at least in rte_pktmbuf_init() and
> >> rte_pktmbuf_detach().
> > 
> > Agree. Maybe rte_mbuf_default_buf_addr(m) ?
> 
> I'm also okay to add. Will come up with a new patch.
> 
> > Side note, is the assert() correct in the patch? I'd say there's a
> > difference of RTE_PKTMBUF_HEADROOM between the 2 values.
> 
> Oops, my fault. Thanks for the catch, you saved a crash. :-)
> 
> Is this assert really necessary if we have a common macro ?
> I was under the impression that this assert is there to catch misalignement between the mbuf api and the driver.

It is still good to have. This can catch corruption of mbuf content which sometimes
happens due to wrong mbuf handling in PMD or potential HW memory corruption.


Thanks,
Yongseok

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-stable] [PATCH v2 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer
       [not found] ` <20190109131908.4949-1-yskoh@mellanox.com>
@ 2019-01-09 13:19   ` Yongseok Koh
  0 siblings, 0 replies; 14+ messages in thread
From: Yongseok Koh @ 2019-01-09 13:19 UTC (permalink / raw)
  To: olivier.matz, shahafs; +Cc: dev, david.marchand, stable

On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
to be accessed as it is static and easily calculated from the mbuf address.
Accessing the mbuf content causes unnecessary load stall and it is worsened
on ARM.

Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---

v2:
* use the newly introduced API - rte_mbuf_buf_addr_default()
* fix error in assert

 drivers/net/mlx5/mlx5_rxtx_vec.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fda7004e2d..154934d92d 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -102,8 +102,10 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
 		return;
 	}
 	for (i = 0; i < n; ++i) {
-		wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr +
-					      RTE_PKTMBUF_HEADROOM);
+		void *buf_addr = rte_mbuf_buf_addr_default(elts[i], rxq->mp);
+
+		assert(buf_addr == elts[i]->buf_addr);
+		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr + RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKey in WQE. */
 		if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1))
 			wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-stable] [PATCH v3 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer
       [not found] ` <20190110183528.42503-1-yskoh@mellanox.com>
@ 2019-01-10 18:35   ` Yongseok Koh
  2019-01-10 19:10     ` [dpdk-stable] [dpdk-dev] " Shahaf Shuler
  0 siblings, 1 reply; 14+ messages in thread
From: Yongseok Koh @ 2019-01-10 18:35 UTC (permalink / raw)
  To: olivier.matz, shahafs; +Cc: dev, roszenrami, david.marchand, stable

On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
to be accessed as it is static and easily calculated from the mbuf address.
Accessing the mbuf content causes unnecessary load stall and it is worsened
on ARM.

Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---

v3:
* rte_mbuf_buf_addr_default() -> rte_mbuf_buf_addr()

v2:
* use the newly introduced API - rte_mbuf_buf_addr_default()
* fix error in assert

 drivers/net/mlx5/mlx5_rxtx_vec.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fda7004e2d..989a1fdce5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -102,8 +102,10 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
 		return;
 	}
 	for (i = 0; i < n; ++i) {
-		wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr +
-					      RTE_PKTMBUF_HEADROOM);
+		void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
+
+		assert(buf_addr == elts[i]->buf_addr);
+		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr + RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKey in WQE. */
 		if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1))
 			wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-stable] [dpdk-dev] [PATCH v3 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer
  2019-01-10 18:35   ` [dpdk-stable] [PATCH v3 " Yongseok Koh
@ 2019-01-10 19:10     ` Shahaf Shuler
  0 siblings, 0 replies; 14+ messages in thread
From: Shahaf Shuler @ 2019-01-10 19:10 UTC (permalink / raw)
  To: Yongseok Koh, olivier.matz; +Cc: dev, roszenrami, david.marchand, stable

Thursday, January 10, 2019 8:35 PM, Yongseok Koh:
> Subject: [dpdk-dev] [PATCH v3 2/2] net/mlx5: fix instruction hotspot on
> replenishing Rx buffer
> 
> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed to
> be accessed as it is static and easily calculated from the mbuf address.
> Accessing the mbuf content causes unnecessary load stall and it is worsened on
> ARM.
> 
> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

Acked-by: Shahaf Shuler <shahafs@mellanox.com>

> ---
> 
> v3:
> * rte_mbuf_buf_addr_default() -> rte_mbuf_buf_addr()
> 
> v2:
> * use the newly introduced API - rte_mbuf_buf_addr_default()
> * fix error in assert
> 
>  drivers/net/mlx5/mlx5_rxtx_vec.h | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h
> b/drivers/net/mlx5/mlx5_rxtx_vec.h
> index fda7004e2d..989a1fdce5 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> @@ -102,8 +102,10 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data
> *rxq, uint16_t n)
>  		return;
>  	}
>  	for (i = 0; i < n; ++i) {
> -		wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr +
> -					      RTE_PKTMBUF_HEADROOM);
> +		void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
> +
> +		assert(buf_addr == elts[i]->buf_addr);
> +		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
> +RTE_PKTMBUF_HEADROOM);
>  		/* If there's only one MR, no need to replace LKey in WQE. */
>  		if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1))
>  			wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
> --
> 2.11.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-stable] [PATCH v4 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer
       [not found] ` <20190110224030.2671-1-yskoh@mellanox.com>
@ 2019-01-10 22:40   ` Yongseok Koh
  0 siblings, 0 replies; 14+ messages in thread
From: Yongseok Koh @ 2019-01-10 22:40 UTC (permalink / raw)
  To: olivier.matz, shahafs; +Cc: dev, roszenrami, david.marchand, stable

On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
to be accessed as it is static and easily calculated from the mbuf address.
Accessing the mbuf content causes unnecessary load stall and it is worsened
on ARM.

Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---

v4:
* no change

v3:
* rte_mbuf_buf_addr_default() -> rte_mbuf_buf_addr()

v2:
* use the newly introduced API - rte_mbuf_buf_addr_default()
* fix error in assert

 drivers/net/mlx5/mlx5_rxtx_vec.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fda7004e2d..989a1fdce5 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -102,8 +102,10 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
 		return;
 	}
 	for (i = 0; i < n; ++i) {
-		wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr +
-					      RTE_PKTMBUF_HEADROOM);
+		void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
+
+		assert(buf_addr == elts[i]->buf_addr);
+		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr + RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKey in WQE. */
 		if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1))
 			wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-stable] [PATCH v5 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer
       [not found] ` <20190114211622.6900-1-yskoh@mellanox.com>
@ 2019-01-14 21:16   ` Yongseok Koh
  2019-02-06 15:54     ` [dpdk-stable] [dpdk-dev] " Kevin Traynor
  0 siblings, 1 reply; 14+ messages in thread
From: Yongseok Koh @ 2019-01-14 21:16 UTC (permalink / raw)
  To: olivier.matz, shahafs; +Cc: dev, arybchenko, roszenrami, david.marchand, stable

On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
to be accessed as it is static and easily calculated from the mbuf address.
Accessing the mbuf content causes unnecessary load stall and it is worsened
on ARM.

Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
---

v5:
* no change

v4:
* no change

v3:
* rte_mbuf_buf_addr_default() -> rte_mbuf_buf_addr()

v2:
* use the newly introduced API - rte_mbuf_buf_addr_default()
* fix error in assert

 drivers/net/mlx5/mlx5_rxtx_vec.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fda7004e2d..5df8e291e6 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -102,7 +102,10 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
 		return;
 	}
 	for (i = 0; i < n; ++i) {
-		wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr +
+		void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
+
+		assert(buf_addr == elts[i]->buf_addr);
+		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
 					      RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKey in WQE. */
 		if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1))
-- 
2.11.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-stable] [dpdk-dev] [PATCH v5 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer
  2019-01-14 21:16   ` [dpdk-stable] [PATCH v5 " Yongseok Koh
@ 2019-02-06 15:54     ` Kevin Traynor
  2019-02-21 19:10       ` Kevin Traynor
  0 siblings, 1 reply; 14+ messages in thread
From: Kevin Traynor @ 2019-02-06 15:54 UTC (permalink / raw)
  To: Yongseok Koh, olivier.matz, shahafs
  Cc: dev, arybchenko, roszenrami, david.marchand, stable

On 01/14/2019 09:16 PM, Yongseok Koh wrote:
> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
> to be accessed as it is static and easily calculated from the mbuf address.
> Accessing the mbuf content causes unnecessary load stall and it is worsened
> on ARM.
> 
> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
> Cc: stable@dpdk.org
> 

This is using the API introduced in 1/2, so it's not really suitable for
backport. Maybe you want to send an alternative for stable?

> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
> ---

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-stable] [dpdk-dev] [PATCH v5 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer
  2019-02-06 15:54     ` [dpdk-stable] [dpdk-dev] " Kevin Traynor
@ 2019-02-21 19:10       ` Kevin Traynor
  2019-03-08  2:05         ` Yongseok Koh
  0 siblings, 1 reply; 14+ messages in thread
From: Kevin Traynor @ 2019-02-21 19:10 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: olivier.matz, shahafs, dev, arybchenko, roszenrami,
	david.marchand, stable

Hi Yongseok,

Can you let me know how you want to proceed with the below. I think we
could just drop as it's a performance optimization, or maybe you have a
different idea.

thanks,
Kevin.

On 06/02/2019 15:54, Kevin Traynor wrote:
> On 01/14/2019 09:16 PM, Yongseok Koh wrote:
>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
>> to be accessed as it is static and easily calculated from the mbuf address.
>> Accessing the mbuf content causes unnecessary load stall and it is worsened
>> on ARM.
>>
>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
>> Cc: stable@dpdk.org
>>
> 
> This is using the API introduced in 1/2, so it's not really suitable for
> backport. Maybe you want to send an alternative for stable?
> 
>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
>> ---

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-stable] [dpdk-dev] [PATCH v5 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer
  2019-02-21 19:10       ` Kevin Traynor
@ 2019-03-08  2:05         ` Yongseok Koh
  0 siblings, 0 replies; 14+ messages in thread
From: Yongseok Koh @ 2019-03-08  2:05 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: Olivier Matz, Shahaf Shuler, dev, Andrew Rybchenko, Rami Rosen,
	David Marchand, stable

Oops, I missed this email somehow,
probably out of mind during my long vacation. :-)

I've also encountered this issue with 17.11.6.
http://git.dpdk.org/dpdk-stable/commit/?h=17.11&id=63f06f3fccc87c55adb33248e5c68a0175d213f1

I'll send a backport to you.


Sorry for late reply.
Yongseok

> On Feb 21, 2019, at 11:10 AM, Kevin Traynor <ktraynor@redhat.com> wrote:
> 
> Hi Yongseok,
> 
> Can you let me know how you want to proceed with the below. I think we
> could just drop as it's a performance optimization, or maybe you have a
> different idea.
> 
> thanks,
> Kevin.
> 
> On 06/02/2019 15:54, Kevin Traynor wrote:
>> On 01/14/2019 09:16 PM, Yongseok Koh wrote:
>>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
>>> to be accessed as it is static and easily calculated from the mbuf address.
>>> Accessing the mbuf content causes unnecessary load stall and it is worsened
>>> on ARM.
>>> 
>>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
>>> Cc: stable@dpdk.org
>>> 
>> 
>> This is using the API introduced in 1/2, so it's not really suitable for
>> backport. Maybe you want to send an alternative for stable?
>> 
>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
>>> ---
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-03-08  2:05 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-09  8:54 [dpdk-stable] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer Yongseok Koh
2019-01-09  9:38 ` [dpdk-stable] [dpdk-dev] " David Marchand
2019-01-09  9:52   ` Olivier Matz
2019-01-09  9:56     ` Yongseok Koh
2019-01-09 10:05       ` David Marchand
2019-01-09 10:11         ` Yongseok Koh
     [not found] ` <20190109131908.4949-1-yskoh@mellanox.com>
2019-01-09 13:19   ` [dpdk-stable] [PATCH v2 2/2] " Yongseok Koh
     [not found] ` <20190110183528.42503-1-yskoh@mellanox.com>
2019-01-10 18:35   ` [dpdk-stable] [PATCH v3 " Yongseok Koh
2019-01-10 19:10     ` [dpdk-stable] [dpdk-dev] " Shahaf Shuler
     [not found] ` <20190110224030.2671-1-yskoh@mellanox.com>
2019-01-10 22:40   ` [dpdk-stable] [PATCH v4 " Yongseok Koh
     [not found] ` <20190114211622.6900-1-yskoh@mellanox.com>
2019-01-14 21:16   ` [dpdk-stable] [PATCH v5 " Yongseok Koh
2019-02-06 15:54     ` [dpdk-stable] [dpdk-dev] " Kevin Traynor
2019-02-21 19:10       ` Kevin Traynor
2019-03-08  2:05         ` Yongseok Koh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).