* [dpdk-stable] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer @ 2019-01-09 8:54 Yongseok Koh 2019-01-09 9:38 ` [dpdk-stable] [dpdk-dev] " David Marchand ` (4 more replies) 0 siblings, 5 replies; 14+ messages in thread From: Yongseok Koh @ 2019-01-09 8:54 UTC (permalink / raw) To: shahafs; +Cc: dev, stable On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed to be accessed as it is static and easily calculated from the mbuf address. Accessing the mbuf content causes unnecessary load stall and it is worsened on ARM. Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com> --- drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h index fda7004e2d..ced5547307 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec.h +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n) return; } for (i = 0; i < n; ++i) { - wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr + - RTE_PKTMBUF_HEADROOM); + uintptr_t buf_addr = + (uintptr_t)elts[i] + sizeof(struct rte_mbuf) + + rte_pktmbuf_priv_size(rxq->mp) + RTE_PKTMBUF_HEADROOM; + + assert(buf_addr == (uintptr_t)elts[i]->buf_addr); + wq[i].addr = rte_cpu_to_be_64(buf_addr); /* If there's only one MR, no need to replace LKey in WQE. */ if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1)) wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]); -- 2.11.0 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer 2019-01-09 8:54 [dpdk-stable] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer Yongseok Koh @ 2019-01-09 9:38 ` David Marchand 2019-01-09 9:52 ` Olivier Matz [not found] ` <20190109131908.4949-1-yskoh@mellanox.com> ` (3 subsequent siblings) 4 siblings, 1 reply; 14+ messages in thread From: David Marchand @ 2019-01-09 9:38 UTC (permalink / raw) To: Yongseok Koh; +Cc: shahafs, dev, stable On Wed, Jan 9, 2019 at 9:54 AM Yongseok Koh <yskoh@mellanox.com> wrote: > On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed > to be accessed as it is static and easily calculated from the mbuf address. > Accessing the mbuf content causes unnecessary load stall and it is worsened > on ARM. > > Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") > Cc: stable@dpdk.org > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com> > --- > drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h > b/drivers/net/mlx5/mlx5_rxtx_vec.h > index fda7004e2d..ced5547307 100644 > --- a/drivers/net/mlx5/mlx5_rxtx_vec.h > +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h > @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data > *rxq, uint16_t n) > return; > } > for (i = 0; i < n; ++i) { > - wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr > + > - RTE_PKTMBUF_HEADROOM); > + uintptr_t buf_addr = > + (uintptr_t)elts[i] + sizeof(struct rte_mbuf) + > + rte_pktmbuf_priv_size(rxq->mp) + > RTE_PKTMBUF_HEADROOM; > + > + assert(buf_addr == (uintptr_t)elts[i]->buf_addr); > + wq[i].addr = rte_cpu_to_be_64(buf_addr); > /* If there's only one MR, no need to replace LKey in WQE. > */ > if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > > 1)) > wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]); > -- > 2.11.0 > > How about having a macro / inline in the mbuf api to get this information in a consistent/unique way ? I can see we have this calculation at least in rte_pktmbuf_init() and rte_pktmbuf_detach(). -- David Marchand ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer 2019-01-09 9:38 ` [dpdk-stable] [dpdk-dev] " David Marchand @ 2019-01-09 9:52 ` Olivier Matz 2019-01-09 9:56 ` Yongseok Koh 0 siblings, 1 reply; 14+ messages in thread From: Olivier Matz @ 2019-01-09 9:52 UTC (permalink / raw) To: David Marchand; +Cc: Yongseok Koh, shahafs, dev, stable On Wed, Jan 09, 2019 at 10:38:07AM +0100, David Marchand wrote: > On Wed, Jan 9, 2019 at 9:54 AM Yongseok Koh <yskoh@mellanox.com> wrote: > > > On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed > > to be accessed as it is static and easily calculated from the mbuf address. > > Accessing the mbuf content causes unnecessary load stall and it is worsened > > on ARM. > > > > Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") > > Cc: stable@dpdk.org > > > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com> > > --- > > drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++-- > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h > > b/drivers/net/mlx5/mlx5_rxtx_vec.h > > index fda7004e2d..ced5547307 100644 > > --- a/drivers/net/mlx5/mlx5_rxtx_vec.h > > +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h > > @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data > > *rxq, uint16_t n) > > return; > > } > > for (i = 0; i < n; ++i) { > > - wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr > > + > > - RTE_PKTMBUF_HEADROOM); > > + uintptr_t buf_addr = > > + (uintptr_t)elts[i] + sizeof(struct rte_mbuf) + > > + rte_pktmbuf_priv_size(rxq->mp) + > > RTE_PKTMBUF_HEADROOM; > > + > > + assert(buf_addr == (uintptr_t)elts[i]->buf_addr); > > + wq[i].addr = rte_cpu_to_be_64(buf_addr); > > /* If there's only one MR, no need to replace LKey in WQE. > > */ > > if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > > > 1)) > > wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]); > > -- > > 2.11.0 > > > > > How about having a macro / inline in the mbuf api to get this information > in a consistent/unique way ? > I can see we have this calculation at least in rte_pktmbuf_init() and > rte_pktmbuf_detach(). Agree. Maybe rte_mbuf_default_buf_addr(m) ? Side note, is the assert() correct in the patch? I'd say there's a difference of RTE_PKTMBUF_HEADROOM between the 2 values. Olivier ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer 2019-01-09 9:52 ` Olivier Matz @ 2019-01-09 9:56 ` Yongseok Koh 2019-01-09 10:05 ` David Marchand 0 siblings, 1 reply; 14+ messages in thread From: Yongseok Koh @ 2019-01-09 9:56 UTC (permalink / raw) To: Olivier Matz; +Cc: David Marchand, Shahaf Shuler, dev, stable > On Jan 9, 2019, at 1:52 AM, Olivier Matz <olivier.matz@6wind.com> wrote: > > On Wed, Jan 09, 2019 at 10:38:07AM +0100, David Marchand wrote: >> On Wed, Jan 9, 2019 at 9:54 AM Yongseok Koh <yskoh@mellanox.com> wrote: >> >>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed >>> to be accessed as it is static and easily calculated from the mbuf address. >>> Accessing the mbuf content causes unnecessary load stall and it is worsened >>> on ARM. >>> >>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") >>> Cc: stable@dpdk.org >>> >>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com> >>> --- >>> drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++-- >>> 1 file changed, 6 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h >>> b/drivers/net/mlx5/mlx5_rxtx_vec.h >>> index fda7004e2d..ced5547307 100644 >>> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h >>> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h >>> @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data >>> *rxq, uint16_t n) >>> return; >>> } >>> for (i = 0; i < n; ++i) { >>> - wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr >>> + >>> - RTE_PKTMBUF_HEADROOM); >>> + uintptr_t buf_addr = >>> + (uintptr_t)elts[i] + sizeof(struct rte_mbuf) + >>> + rte_pktmbuf_priv_size(rxq->mp) + >>> RTE_PKTMBUF_HEADROOM; >>> + >>> + assert(buf_addr == (uintptr_t)elts[i]->buf_addr); >>> + wq[i].addr = rte_cpu_to_be_64(buf_addr); >>> /* If there's only one MR, no need to replace LKey in WQE. >>> */ >>> if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > >>> 1)) >>> wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]); >>> -- >>> 2.11.0 >>> >>> >> How about having a macro / inline in the mbuf api to get this information >> in a consistent/unique way ? >> I can see we have this calculation at least in rte_pktmbuf_init() and >> rte_pktmbuf_detach(). > > Agree. Maybe rte_mbuf_default_buf_addr(m) ? I'm also okay to add. Will come up with a new patch. > Side note, is the assert() correct in the patch? I'd say there's a > difference of RTE_PKTMBUF_HEADROOM between the 2 values. Oops, my fault. Thanks for the catch, you saved a crash. :-) Thanks, Yongseok ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer 2019-01-09 9:56 ` Yongseok Koh @ 2019-01-09 10:05 ` David Marchand 2019-01-09 10:11 ` Yongseok Koh 0 siblings, 1 reply; 14+ messages in thread From: David Marchand @ 2019-01-09 10:05 UTC (permalink / raw) To: Yongseok Koh; +Cc: Olivier Matz, Shahaf Shuler, dev, stable On Wed, Jan 9, 2019 at 10:56 AM Yongseok Koh <yskoh@mellanox.com> wrote: > > > On Jan 9, 2019, at 1:52 AM, Olivier Matz <olivier.matz@6wind.com> wrote: > > > > On Wed, Jan 09, 2019 at 10:38:07AM +0100, David Marchand wrote: > >> On Wed, Jan 9, 2019 at 9:54 AM Yongseok Koh <yskoh@mellanox.com> wrote: > >> > >>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't > needed > >>> to be accessed as it is static and easily calculated from the mbuf > address. > >>> Accessing the mbuf content causes unnecessary load stall and it is > worsened > >>> on ARM. > >>> > >>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") > >>> Cc: stable@dpdk.org > >>> > >>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com> > >>> --- > >>> drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++-- > >>> 1 file changed, 6 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h > >>> b/drivers/net/mlx5/mlx5_rxtx_vec.h > >>> index fda7004e2d..ced5547307 100644 > >>> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h > >>> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h > >>> @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data > >>> *rxq, uint16_t n) > >>> return; > >>> } > >>> for (i = 0; i < n; ++i) { > >>> - wq[i].addr = > rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr > >>> + > >>> - RTE_PKTMBUF_HEADROOM); > >>> + uintptr_t buf_addr = > >>> + (uintptr_t)elts[i] + sizeof(struct rte_mbuf) + > >>> + rte_pktmbuf_priv_size(rxq->mp) + > >>> RTE_PKTMBUF_HEADROOM; > >>> + > >>> + assert(buf_addr == (uintptr_t)elts[i]->buf_addr); > >>> + wq[i].addr = rte_cpu_to_be_64(buf_addr); > >>> /* If there's only one MR, no need to replace LKey in > WQE. > >>> */ > >>> if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > > >>> 1)) > >>> wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]); > >>> -- > >>> 2.11.0 > >>> > >>> > >> How about having a macro / inline in the mbuf api to get this > information > >> in a consistent/unique way ? > >> I can see we have this calculation at least in rte_pktmbuf_init() and > >> rte_pktmbuf_detach(). > > > > Agree. Maybe rte_mbuf_default_buf_addr(m) ? > > I'm also okay to add. Will come up with a new patch. > > > Side note, is the assert() correct in the patch? I'd say there's a > > difference of RTE_PKTMBUF_HEADROOM between the 2 values. > > Oops, my fault. Thanks for the catch, you saved a crash. :-) > Is this assert really necessary if we have a common macro ? I was under the impression that this assert is there to catch misalignement between the mbuf api and the driver. -- David Marchand ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer 2019-01-09 10:05 ` David Marchand @ 2019-01-09 10:11 ` Yongseok Koh 0 siblings, 0 replies; 14+ messages in thread From: Yongseok Koh @ 2019-01-09 10:11 UTC (permalink / raw) To: David Marchand; +Cc: Olivier Matz, Shahaf Shuler, dev, stable > On Jan 9, 2019, at 2:05 AM, David Marchand <david.marchand@redhat.com> wrote: > > > > On Wed, Jan 9, 2019 at 10:56 AM Yongseok Koh <yskoh@mellanox.com> wrote: > > > On Jan 9, 2019, at 1:52 AM, Olivier Matz <olivier.matz@6wind.com> wrote: > > > > On Wed, Jan 09, 2019 at 10:38:07AM +0100, David Marchand wrote: > >> On Wed, Jan 9, 2019 at 9:54 AM Yongseok Koh <yskoh@mellanox.com> wrote: > >> > >>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed > >>> to be accessed as it is static and easily calculated from the mbuf address. > >>> Accessing the mbuf content causes unnecessary load stall and it is worsened > >>> on ARM. > >>> > >>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") > >>> Cc: stable@dpdk.org > >>> > >>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com> > >>> --- > >>> drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++-- > >>> 1 file changed, 6 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h > >>> b/drivers/net/mlx5/mlx5_rxtx_vec.h > >>> index fda7004e2d..ced5547307 100644 > >>> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h > >>> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h > >>> @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data > >>> *rxq, uint16_t n) > >>> return; > >>> } > >>> for (i = 0; i < n; ++i) { > >>> - wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr > >>> + > >>> - RTE_PKTMBUF_HEADROOM); > >>> + uintptr_t buf_addr = > >>> + (uintptr_t)elts[i] + sizeof(struct rte_mbuf) + > >>> + rte_pktmbuf_priv_size(rxq->mp) + > >>> RTE_PKTMBUF_HEADROOM; > >>> + > >>> + assert(buf_addr == (uintptr_t)elts[i]->buf_addr); > >>> + wq[i].addr = rte_cpu_to_be_64(buf_addr); > >>> /* If there's only one MR, no need to replace LKey in WQE. > >>> */ > >>> if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > > >>> 1)) > >>> wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]); > >>> -- > >>> 2.11.0 > >>> > >>> > >> How about having a macro / inline in the mbuf api to get this information > >> in a consistent/unique way ? > >> I can see we have this calculation at least in rte_pktmbuf_init() and > >> rte_pktmbuf_detach(). > > > > Agree. Maybe rte_mbuf_default_buf_addr(m) ? > > I'm also okay to add. Will come up with a new patch. > > > Side note, is the assert() correct in the patch? I'd say there's a > > difference of RTE_PKTMBUF_HEADROOM between the 2 values. > > Oops, my fault. Thanks for the catch, you saved a crash. :-) > > Is this assert really necessary if we have a common macro ? > I was under the impression that this assert is there to catch misalignement between the mbuf api and the driver. It is still good to have. This can catch corruption of mbuf content which sometimes happens due to wrong mbuf handling in PMD or potential HW memory corruption. Thanks, Yongseok ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20190109131908.4949-1-yskoh@mellanox.com>]
* [dpdk-stable] [PATCH v2 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer [not found] ` <20190109131908.4949-1-yskoh@mellanox.com> @ 2019-01-09 13:19 ` Yongseok Koh 0 siblings, 0 replies; 14+ messages in thread From: Yongseok Koh @ 2019-01-09 13:19 UTC (permalink / raw) To: olivier.matz, shahafs; +Cc: dev, david.marchand, stable On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed to be accessed as it is static and easily calculated from the mbuf address. Accessing the mbuf content causes unnecessary load stall and it is worsened on ARM. Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com> --- v2: * use the newly introduced API - rte_mbuf_buf_addr_default() * fix error in assert drivers/net/mlx5/mlx5_rxtx_vec.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h index fda7004e2d..154934d92d 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec.h +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h @@ -102,8 +102,10 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n) return; } for (i = 0; i < n; ++i) { - wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr + - RTE_PKTMBUF_HEADROOM); + void *buf_addr = rte_mbuf_buf_addr_default(elts[i], rxq->mp); + + assert(buf_addr == elts[i]->buf_addr); + wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr + RTE_PKTMBUF_HEADROOM); /* If there's only one MR, no need to replace LKey in WQE. */ if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1)) wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]); -- 2.11.0 ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20190110183528.42503-1-yskoh@mellanox.com>]
* [dpdk-stable] [PATCH v3 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer [not found] ` <20190110183528.42503-1-yskoh@mellanox.com> @ 2019-01-10 18:35 ` Yongseok Koh 2019-01-10 19:10 ` [dpdk-stable] [dpdk-dev] " Shahaf Shuler 0 siblings, 1 reply; 14+ messages in thread From: Yongseok Koh @ 2019-01-10 18:35 UTC (permalink / raw) To: olivier.matz, shahafs; +Cc: dev, roszenrami, david.marchand, stable On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed to be accessed as it is static and easily calculated from the mbuf address. Accessing the mbuf content causes unnecessary load stall and it is worsened on ARM. Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com> --- v3: * rte_mbuf_buf_addr_default() -> rte_mbuf_buf_addr() v2: * use the newly introduced API - rte_mbuf_buf_addr_default() * fix error in assert drivers/net/mlx5/mlx5_rxtx_vec.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h index fda7004e2d..989a1fdce5 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec.h +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h @@ -102,8 +102,10 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n) return; } for (i = 0; i < n; ++i) { - wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr + - RTE_PKTMBUF_HEADROOM); + void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp); + + assert(buf_addr == elts[i]->buf_addr); + wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr + RTE_PKTMBUF_HEADROOM); /* If there's only one MR, no need to replace LKey in WQE. */ if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1)) wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]); -- 2.11.0 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH v3 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer 2019-01-10 18:35 ` [dpdk-stable] [PATCH v3 " Yongseok Koh @ 2019-01-10 19:10 ` Shahaf Shuler 0 siblings, 0 replies; 14+ messages in thread From: Shahaf Shuler @ 2019-01-10 19:10 UTC (permalink / raw) To: Yongseok Koh, olivier.matz; +Cc: dev, roszenrami, david.marchand, stable Thursday, January 10, 2019 8:35 PM, Yongseok Koh: > Subject: [dpdk-dev] [PATCH v3 2/2] net/mlx5: fix instruction hotspot on > replenishing Rx buffer > > On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed to > be accessed as it is static and easily calculated from the mbuf address. > Accessing the mbuf content causes unnecessary load stall and it is worsened on > ARM. > > Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") > Cc: stable@dpdk.org > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com> > --- > > v3: > * rte_mbuf_buf_addr_default() -> rte_mbuf_buf_addr() > > v2: > * use the newly introduced API - rte_mbuf_buf_addr_default() > * fix error in assert > > drivers/net/mlx5/mlx5_rxtx_vec.h | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h > b/drivers/net/mlx5/mlx5_rxtx_vec.h > index fda7004e2d..989a1fdce5 100644 > --- a/drivers/net/mlx5/mlx5_rxtx_vec.h > +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h > @@ -102,8 +102,10 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data > *rxq, uint16_t n) > return; > } > for (i = 0; i < n; ++i) { > - wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr + > - RTE_PKTMBUF_HEADROOM); > + void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp); > + > + assert(buf_addr == elts[i]->buf_addr); > + wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr + > +RTE_PKTMBUF_HEADROOM); > /* If there's only one MR, no need to replace LKey in WQE. */ > if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1)) > wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]); > -- > 2.11.0 ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20190110224030.2671-1-yskoh@mellanox.com>]
* [dpdk-stable] [PATCH v4 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer [not found] ` <20190110224030.2671-1-yskoh@mellanox.com> @ 2019-01-10 22:40 ` Yongseok Koh 0 siblings, 0 replies; 14+ messages in thread From: Yongseok Koh @ 2019-01-10 22:40 UTC (permalink / raw) To: olivier.matz, shahafs; +Cc: dev, roszenrami, david.marchand, stable On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed to be accessed as it is static and easily calculated from the mbuf address. Accessing the mbuf content causes unnecessary load stall and it is worsened on ARM. Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com> --- v4: * no change v3: * rte_mbuf_buf_addr_default() -> rte_mbuf_buf_addr() v2: * use the newly introduced API - rte_mbuf_buf_addr_default() * fix error in assert drivers/net/mlx5/mlx5_rxtx_vec.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h index fda7004e2d..989a1fdce5 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec.h +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h @@ -102,8 +102,10 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n) return; } for (i = 0; i < n; ++i) { - wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr + - RTE_PKTMBUF_HEADROOM); + void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp); + + assert(buf_addr == elts[i]->buf_addr); + wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr + RTE_PKTMBUF_HEADROOM); /* If there's only one MR, no need to replace LKey in WQE. */ if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1)) wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]); -- 2.11.0 ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20190114211622.6900-1-yskoh@mellanox.com>]
* [dpdk-stable] [PATCH v5 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer [not found] ` <20190114211622.6900-1-yskoh@mellanox.com> @ 2019-01-14 21:16 ` Yongseok Koh 2019-02-06 15:54 ` [dpdk-stable] [dpdk-dev] " Kevin Traynor 0 siblings, 1 reply; 14+ messages in thread From: Yongseok Koh @ 2019-01-14 21:16 UTC (permalink / raw) To: olivier.matz, shahafs; +Cc: dev, arybchenko, roszenrami, david.marchand, stable On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed to be accessed as it is static and easily calculated from the mbuf address. Accessing the mbuf content causes unnecessary load stall and it is worsened on ARM. Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com> --- v5: * no change v4: * no change v3: * rte_mbuf_buf_addr_default() -> rte_mbuf_buf_addr() v2: * use the newly introduced API - rte_mbuf_buf_addr_default() * fix error in assert drivers/net/mlx5/mlx5_rxtx_vec.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h index fda7004e2d..5df8e291e6 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec.h +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h @@ -102,7 +102,10 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n) return; } for (i = 0; i < n; ++i) { - wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr + + void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp); + + assert(buf_addr == elts[i]->buf_addr); + wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr + RTE_PKTMBUF_HEADROOM); /* If there's only one MR, no need to replace LKey in WQE. */ if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1)) -- 2.11.0 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH v5 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer 2019-01-14 21:16 ` [dpdk-stable] [PATCH v5 " Yongseok Koh @ 2019-02-06 15:54 ` Kevin Traynor 2019-02-21 19:10 ` Kevin Traynor 0 siblings, 1 reply; 14+ messages in thread From: Kevin Traynor @ 2019-02-06 15:54 UTC (permalink / raw) To: Yongseok Koh, olivier.matz, shahafs Cc: dev, arybchenko, roszenrami, david.marchand, stable On 01/14/2019 09:16 PM, Yongseok Koh wrote: > On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed > to be accessed as it is static and easily calculated from the mbuf address. > Accessing the mbuf content causes unnecessary load stall and it is worsened > on ARM. > > Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") > Cc: stable@dpdk.org > This is using the API introduced in 1/2, so it's not really suitable for backport. Maybe you want to send an alternative for stable? > Signed-off-by: Yongseok Koh <yskoh@mellanox.com> > Acked-by: Shahaf Shuler <shahafs@mellanox.com> > --- ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH v5 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer 2019-02-06 15:54 ` [dpdk-stable] [dpdk-dev] " Kevin Traynor @ 2019-02-21 19:10 ` Kevin Traynor 2019-03-08 2:05 ` Yongseok Koh 0 siblings, 1 reply; 14+ messages in thread From: Kevin Traynor @ 2019-02-21 19:10 UTC (permalink / raw) To: Yongseok Koh Cc: olivier.matz, shahafs, dev, arybchenko, roszenrami, david.marchand, stable Hi Yongseok, Can you let me know how you want to proceed with the below. I think we could just drop as it's a performance optimization, or maybe you have a different idea. thanks, Kevin. On 06/02/2019 15:54, Kevin Traynor wrote: > On 01/14/2019 09:16 PM, Yongseok Koh wrote: >> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed >> to be accessed as it is static and easily calculated from the mbuf address. >> Accessing the mbuf content causes unnecessary load stall and it is worsened >> on ARM. >> >> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") >> Cc: stable@dpdk.org >> > > This is using the API introduced in 1/2, so it's not really suitable for > backport. Maybe you want to send an alternative for stable? > >> Signed-off-by: Yongseok Koh <yskoh@mellanox.com> >> Acked-by: Shahaf Shuler <shahafs@mellanox.com> >> --- ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH v5 2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer 2019-02-21 19:10 ` Kevin Traynor @ 2019-03-08 2:05 ` Yongseok Koh 0 siblings, 0 replies; 14+ messages in thread From: Yongseok Koh @ 2019-03-08 2:05 UTC (permalink / raw) To: Kevin Traynor Cc: Olivier Matz, Shahaf Shuler, dev, Andrew Rybchenko, Rami Rosen, David Marchand, stable Oops, I missed this email somehow, probably out of mind during my long vacation. :-) I've also encountered this issue with 17.11.6. http://git.dpdk.org/dpdk-stable/commit/?h=17.11&id=63f06f3fccc87c55adb33248e5c68a0175d213f1 I'll send a backport to you. Sorry for late reply. Yongseok > On Feb 21, 2019, at 11:10 AM, Kevin Traynor <ktraynor@redhat.com> wrote: > > Hi Yongseok, > > Can you let me know how you want to proceed with the below. I think we > could just drop as it's a performance optimization, or maybe you have a > different idea. > > thanks, > Kevin. > > On 06/02/2019 15:54, Kevin Traynor wrote: >> On 01/14/2019 09:16 PM, Yongseok Koh wrote: >>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed >>> to be accessed as it is static and easily calculated from the mbuf address. >>> Accessing the mbuf content causes unnecessary load stall and it is worsened >>> on ARM. >>> >>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") >>> Cc: stable@dpdk.org >>> >> >> This is using the API introduced in 1/2, so it's not really suitable for >> backport. Maybe you want to send an alternative for stable? >> >>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com> >>> Acked-by: Shahaf Shuler <shahafs@mellanox.com> >>> --- > ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2019-03-08 2:05 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-01-09 8:54 [dpdk-stable] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer Yongseok Koh 2019-01-09 9:38 ` [dpdk-stable] [dpdk-dev] " David Marchand 2019-01-09 9:52 ` Olivier Matz 2019-01-09 9:56 ` Yongseok Koh 2019-01-09 10:05 ` David Marchand 2019-01-09 10:11 ` Yongseok Koh [not found] ` <20190109131908.4949-1-yskoh@mellanox.com> 2019-01-09 13:19 ` [dpdk-stable] [PATCH v2 2/2] " Yongseok Koh [not found] ` <20190110183528.42503-1-yskoh@mellanox.com> 2019-01-10 18:35 ` [dpdk-stable] [PATCH v3 " Yongseok Koh 2019-01-10 19:10 ` [dpdk-stable] [dpdk-dev] " Shahaf Shuler [not found] ` <20190110224030.2671-1-yskoh@mellanox.com> 2019-01-10 22:40 ` [dpdk-stable] [PATCH v4 " Yongseok Koh [not found] ` <20190114211622.6900-1-yskoh@mellanox.com> 2019-01-14 21:16 ` [dpdk-stable] [PATCH v5 " Yongseok Koh 2019-02-06 15:54 ` [dpdk-stable] [dpdk-dev] " Kevin Traynor 2019-02-21 19:10 ` Kevin Traynor 2019-03-08 2:05 ` Yongseok Koh
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).