* [dpdk-stable] [PATCH 1/2] net/mlx5: optimize inline mbuf freeing
       [not found] <1608311697-31529-1-git-send-email-viacheslavo@nvidia.com>
@ 2020-12-18 17:14 ` Viacheslav Ovsiienko
       [not found] ` <1609922063-13716-1-git-send-email-viacheslavo@nvidia.com>
       [not found] ` <1611335529-26503-1-git-send-email-viacheslavo@nvidia.com>
  2 siblings, 0 replies; 7+ messages in thread
From: Viacheslav Ovsiienko @ 2020-12-18 17:14 UTC
  To: dev; +Cc: rasland, matan, orika, thomas, stable

The mlx5 PMD supports packet data inlining by pushing data
to the transmit descriptor. If packet is short enough and all
data are inline, the mbuf is not needed for data send anymore
and can be freed.

The mbuf free was performed in the most inner loop building
the transmit descriptors. This patch postpones the mbuf free
transaction to the tx_burst routine exit, optimizing the loop
and allowing the bulk freeing for the multiple mbufs in single
pool API call.

Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5_rxtx.c | 38 ++++++++++++++++++++++++++++++++++----
 drivers/net/mlx5/mlx5_rxtx.h |  1 +
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index d12d746..e8c8783 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1940,6 +1940,17 @@ enum mlx5_txcmp_code {
 		}
 	}
 }
+/*
+ * No inline version to free buffers for optimal call
+ * on the tx_burst completion.
+ */
+static __rte_noinline void
+__mlx5_tx_free_mbuf(struct rte_mbuf **__rte_restrict pkts,
+		    unsigned int pkts_n,
+		    unsigned int olx __rte_unused)
+{
+	mlx5_tx_free_mbuf(pkts, pkts_n, olx);
+}
 
 /**
  * Free the mbuf from the elts ring buffer till new tail.
@@ -4392,10 +4403,25 @@ enum mlx5_txcmp_code {
 			MLX5_ASSERT(room >= tlen);
 			room -= tlen;
 			/*
-			 * Packet data are completely inlined,
-			 * free the packet immediately.
+			 * Packet data are completely inline,
+			 * we can try to free the packet.
+			 */
+			if (likely(loc->pkts_sent == loc->mbuf_free)) {
+				/*
+				 * All the packets from the burst beginning
+				 * are inline, we can free mbufs directly
+				 * from the origin array on tx_burst exit().
+				 */
+				loc->mbuf_free++;
+				goto next_mbuf;
+			}
+			/*
+			 * In order no to call rte_pktmbuf_free_seg() here,
+			 * in the most inner loop (that might be very
+			 * expensive) we just save the mbuf in elts.
 			 */
-			rte_pktmbuf_free_seg(loc->mbuf);
+			txq->elts[txq->elts_head++ & txq->elts_m] = loc->mbuf;
+			loc->elts_free--;
 			goto next_mbuf;
 pointer_empw:
 			/*
@@ -4417,6 +4443,7 @@ enum mlx5_txcmp_code {
 			mlx5_tx_dseg_ptr(txq, loc, dseg, dptr, dlen, olx);
 			/* We have to store mbuf in elts.*/
 			txq->elts[txq->elts_head++ & txq->elts_m] = loc->mbuf;
+			loc->elts_free--;
 			room -= MLX5_WQE_DSEG_SIZE;
 			/* Ring buffer wraparound is checked at the loop end.*/
 			++dseg;
@@ -4426,7 +4453,6 @@ enum mlx5_txcmp_code {
 			slen += dlen;
 #endif
 			loc->pkts_sent++;
-			loc->elts_free--;
 			pkts_n--;
 			if (unlikely(!pkts_n || !loc->elts_free)) {
 				/*
@@ -4880,6 +4906,8 @@ enum mlx5_txcmp_code {
 	MLX5_ASSERT(txq->wqe_s >= (uint16_t)(txq->wqe_ci - txq->wqe_pi));
 	if (unlikely(!pkts_n))
 		return 0;
+	if (MLX5_TXOFF_CONFIG(INLINE))
+		loc.mbuf_free = 0;
 	loc.pkts_sent = 0;
 	loc.pkts_copy = 0;
 	loc.wqe_last = NULL;
@@ -5143,6 +5171,8 @@ enum mlx5_txcmp_code {
 	/* Increment sent packets counter. */
 	txq->stats.opackets += loc.pkts_sent;
 #endif
+	if (MLX5_TXOFF_CONFIG(INLINE) && loc.mbuf_free)
+		__mlx5_tx_free_mbuf(pkts, loc.mbuf_free, olx);
 	return loc.pkts_sent;
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 7989a50..fc5cc2e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -217,6 +217,7 @@ struct mlx5_txq_local {
 	uint16_t wqe_free; /* available wqe remain. */
 	uint16_t mbuf_off; /* data offset in current mbuf. */
 	uint16_t mbuf_nseg; /* number of remaining mbuf. */
+	uint16_t mbuf_free; /* number of inline mbufs to free. */
 };
 
 /* TX queue descriptor. */
-- 
1.8.3.1
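[Editor's illustration, not part of the thread.] The deferral described in the commit message can be sketched outside the driver. The following is a minimal toy model, assuming hypothetical stand-ins (`toy_buf`, `toy_loc`, `toy_txq`, `toy_bulk_free`, `toy_tx_burst`, `TOY_RING_SIZE`) that are not DPDK or mlx5 definitions. It only mirrors the counting idea from the patch: while every packet since the start of the burst has been fully inlined, a counter is bumped and the buffers are released in one call at burst exit; once the run is broken, later buffers are parked in the ring instead of being freed inside the loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are DPDK or mlx5 definitions. */
struct toy_buf { bool freed; };

struct toy_loc {
	unsigned int pkts_sent; /* packets handled so far in this burst */
	unsigned int mbuf_free; /* leading fully-inlined packets to free at exit */
};

#define TOY_RING_SIZE 8u /* power of two, so masking gives the ring index */

struct toy_txq {
	struct toy_buf *elts[TOY_RING_SIZE];
	unsigned int elts_head;
};

static unsigned int toy_bulk_calls; /* counts calls to toy_bulk_free() */

/* Stand-in for a bulk free: releases n buffers in a single call. */
static void
toy_bulk_free(struct toy_buf **pkts, unsigned int n)
{
	toy_bulk_calls++;
	for (unsigned int i = 0; i < n; i++)
		pkts[i]->freed = true;
}

/*
 * Toy burst routine. fully_inline[i] says whether packet i had all of
 * its data copied into the descriptor. Nothing is freed inside the loop.
 */
static unsigned int
toy_tx_burst(struct toy_txq *txq, struct toy_buf **pkts,
	     const bool *fully_inline, unsigned int n)
{
	struct toy_loc loc = { 0, 0 };

	for (unsigned int i = 0; i < n; i++) {
		if (fully_inline[i] && loc.pkts_sent == loc.mbuf_free) {
			/* Unbroken run from the burst start: just count it. */
			loc.mbuf_free++;
		} else {
			/* Park the buffer in the ring; it is released later. */
			txq->elts[txq->elts_head++ & (TOY_RING_SIZE - 1)] = pkts[i];
		}
		loc.pkts_sent++;
	}
	assert(loc.mbuf_free <= loc.pkts_sent);
	if (loc.mbuf_free)
		toy_bulk_free(pkts, loc.mbuf_free); /* one call at burst exit */
	return loc.pkts_sent;
}
```

With a burst whose inline flags are {true, true, false, true}, the first two buffers are released by a single bulk call at exit, while the third and fourth end up in the ring. The fourth is fully inlined too, but the run from the burst start was already broken.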
[parent not found: <1609922063-13716-1-git-send-email-viacheslavo@nvidia.com>]
* [dpdk-stable] [PATCH v2 1/2] net/mlx5: optimize inline mbuf freeing
       [not found] ` <1609922063-13716-1-git-send-email-viacheslavo@nvidia.com>
@ 2021-01-06  8:34   ` Viacheslav Ovsiienko
  0 siblings, 0 replies; 7+ messages in thread
From: Viacheslav Ovsiienko @ 2021-01-06  8:34 UTC
  To: dev; +Cc: rasland, matan, orika, thomas, akozyrev, stable

The mlx5 PMD supports packet data inlining by pushing data
to the transmit descriptor. If packet is short enough and all
data are inline, the mbuf is not needed for data send anymore
and can be freed.

The mbuf free was performed in the most inner loop building
the transmit descriptors. This patch postpones the mbuf free
transaction to the tx_burst routine exit, optimizing the loop
and allowing the bulk freeing for the multiple mbufs in single
pool API call.

Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5_rxtx.c | 38 ++++++++++++++++++++++++++++++++++----
 drivers/net/mlx5/mlx5_rxtx.h |  1 +
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 65a1f99..ee56a72 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1990,6 +1990,17 @@ enum mlx5_txcmp_code {
 		}
 	}
 }
+/*
+ * No inline version to free buffers for optimal call
+ * on the tx_burst completion.
+ */
+static __rte_noinline void
+__mlx5_tx_free_mbuf(struct rte_mbuf **__rte_restrict pkts,
+		    unsigned int pkts_n,
+		    unsigned int olx __rte_unused)
+{
+	mlx5_tx_free_mbuf(pkts, pkts_n, olx);
+}
 
 /**
  * Free the mbuf from the elts ring buffer till new tail.
@@ -4408,10 +4419,25 @@ enum mlx5_txcmp_code {
 			MLX5_ASSERT(room >= tlen);
 			room -= tlen;
 			/*
-			 * Packet data are completely inlined,
-			 * free the packet immediately.
+			 * Packet data are completely inline,
+			 * we can try to free the packet.
+			 */
+			if (likely(loc->pkts_sent == loc->mbuf_free)) {
+				/*
+				 * All the packets from the burst beginning
+				 * are inline, we can free mbufs directly
+				 * from the origin array on tx_burst exit().
+				 */
+				loc->mbuf_free++;
+				goto next_mbuf;
+			}
+			/*
+			 * In order no to call rte_pktmbuf_free_seg() here,
+			 * in the most inner loop (that might be very
+			 * expensive) we just save the mbuf in elts.
 			 */
-			rte_pktmbuf_free_seg(loc->mbuf);
+			txq->elts[txq->elts_head++ & txq->elts_m] = loc->mbuf;
+			loc->elts_free--;
 			goto next_mbuf;
 pointer_empw:
 			/*
@@ -4433,6 +4459,7 @@ enum mlx5_txcmp_code {
 			mlx5_tx_dseg_ptr(txq, loc, dseg, dptr, dlen, olx);
 			/* We have to store mbuf in elts.*/
 			txq->elts[txq->elts_head++ & txq->elts_m] = loc->mbuf;
+			loc->elts_free--;
 			room -= MLX5_WQE_DSEG_SIZE;
 			/* Ring buffer wraparound is checked at the loop end.*/
 			++dseg;
@@ -4442,7 +4469,6 @@ enum mlx5_txcmp_code {
 			slen += dlen;
 #endif
 			loc->pkts_sent++;
-			loc->elts_free--;
 			pkts_n--;
 			if (unlikely(!pkts_n || !loc->elts_free)) {
 				/*
@@ -4892,6 +4918,8 @@ enum mlx5_txcmp_code {
 	MLX5_ASSERT(txq->wqe_s >= (uint16_t)(txq->wqe_ci - txq->wqe_pi));
 	if (unlikely(!pkts_n))
 		return 0;
+	if (MLX5_TXOFF_CONFIG(INLINE))
+		loc.mbuf_free = 0;
 	loc.pkts_sent = 0;
 	loc.pkts_copy = 0;
 	loc.wqe_last = NULL;
@@ -5155,6 +5183,8 @@ enum mlx5_txcmp_code {
 	/* Increment sent packets counter. */
 	txq->stats.opackets += loc.pkts_sent;
 #endif
+	if (MLX5_TXOFF_CONFIG(INLINE) && loc.mbuf_free)
+		__mlx5_tx_free_mbuf(pkts, loc.mbuf_free, olx);
 	return loc.pkts_sent;
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 1e9345a..af47839 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -217,6 +217,7 @@ struct mlx5_txq_local {
 	uint16_t wqe_free; /* available wqe remain. */
 	uint16_t mbuf_off; /* data offset in current mbuf. */
 	uint16_t mbuf_nseg; /* number of remaining mbuf. */
+	uint16_t mbuf_free; /* number of inline mbufs to free. */
 };
 
 /* TX queue descriptor. */
-- 
1.8.3.1
[parent not found: <1611335529-26503-1-git-send-email-viacheslavo@nvidia.com>]
* [dpdk-stable] [PATCH v3 1/2] net/mlx5: optimize inline mbuf freeing
       [not found] ` <1611335529-26503-1-git-send-email-viacheslavo@nvidia.com>
@ 2021-01-22 17:12   ` Viacheslav Ovsiienko
  2021-01-27 12:44     ` Ferruh Yigit
  0 siblings, 1 reply; 7+ messages in thread
From: Viacheslav Ovsiienko @ 2021-01-22 17:12 UTC
  To: dev; +Cc: rasland, matan, orika, thomas, akozyrev, stable

The mlx5 PMD supports packet data inlining by pushing data
to the transmit descriptor. If packet is short enough and all
data are inline, the mbuf is not needed for data send anymore
and can be freed.

The mbuf free was performed in the most inner loop building
the transmit descriptors. This patch postpones the mbuf free
transaction to the tx_burst routine exit, optimizing the loop
and allowing the bulk freeing for the multiple mbufs in single
pool API call.

Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5_rxtx.c | 38 ++++++++++++++++++++++++++++++++++----
 drivers/net/mlx5/mlx5_rxtx.h |  1 +
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 3497765..97912dd 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1990,6 +1990,17 @@ enum mlx5_txcmp_code {
 		}
 	}
 }
+/*
+ * No inline version to free buffers for optimal call
+ * on the tx_burst completion.
+ */
+static __rte_noinline void
+__mlx5_tx_free_mbuf(struct rte_mbuf **__rte_restrict pkts,
+		    unsigned int pkts_n,
+		    unsigned int olx __rte_unused)
+{
+	mlx5_tx_free_mbuf(pkts, pkts_n, olx);
+}
 
 /**
  * Free the mbuf from the elts ring buffer till new tail.
@@ -4408,10 +4419,25 @@ enum mlx5_txcmp_code {
 			MLX5_ASSERT(room >= tlen);
 			room -= tlen;
 			/*
-			 * Packet data are completely inlined,
-			 * free the packet immediately.
+			 * Packet data are completely inline,
+			 * we can try to free the packet.
+			 */
+			if (likely(loc->pkts_sent == loc->mbuf_free)) {
+				/*
+				 * All the packets from the burst beginning
+				 * are inline, we can free mbufs directly
+				 * from the origin array on tx_burst exit().
+				 */
+				loc->mbuf_free++;
+				goto next_mbuf;
+			}
+			/*
+			 * In order no to call rte_pktmbuf_free_seg() here,
+			 * in the most inner loop (that might be very
+			 * expensive) we just save the mbuf in elts.
 			 */
-			rte_pktmbuf_free_seg(loc->mbuf);
+			txq->elts[txq->elts_head++ & txq->elts_m] = loc->mbuf;
+			loc->elts_free--;
 			goto next_mbuf;
 pointer_empw:
 			/*
@@ -4433,6 +4459,7 @@ enum mlx5_txcmp_code {
 			mlx5_tx_dseg_ptr(txq, loc, dseg, dptr, dlen, olx);
 			/* We have to store mbuf in elts.*/
 			txq->elts[txq->elts_head++ & txq->elts_m] = loc->mbuf;
+			loc->elts_free--;
 			room -= MLX5_WQE_DSEG_SIZE;
 			/* Ring buffer wraparound is checked at the loop end.*/
 			++dseg;
@@ -4442,7 +4469,6 @@ enum mlx5_txcmp_code {
 			slen += dlen;
 #endif
 			loc->pkts_sent++;
-			loc->elts_free--;
 			pkts_n--;
 			if (unlikely(!pkts_n || !loc->elts_free)) {
 				/*
@@ -4892,6 +4918,8 @@ enum mlx5_txcmp_code {
 	MLX5_ASSERT(txq->wqe_s >= (uint16_t)(txq->wqe_ci - txq->wqe_pi));
 	if (unlikely(!pkts_n))
 		return 0;
+	if (MLX5_TXOFF_CONFIG(INLINE))
+		loc.mbuf_free = 0;
 	loc.pkts_sent = 0;
 	loc.pkts_copy = 0;
 	loc.wqe_last = NULL;
@@ -5155,6 +5183,8 @@ enum mlx5_txcmp_code {
 	/* Increment sent packets counter. */
 	txq->stats.opackets += loc.pkts_sent;
 #endif
+	if (MLX5_TXOFF_CONFIG(INLINE) && loc.mbuf_free)
+		__mlx5_tx_free_mbuf(pkts, loc.mbuf_free, olx);
 	return loc.pkts_sent;
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 7756ed3..9dac408 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -209,6 +209,7 @@ struct mlx5_txq_local {
 	uint16_t wqe_free; /* available wqe remain. */
 	uint16_t mbuf_off; /* data offset in current mbuf. */
 	uint16_t mbuf_nseg; /* number of remaining mbuf. */
+	uint16_t mbuf_free; /* number of inline mbufs to free. */
 };
 
 /* TX queue descriptor. */
-- 
1.8.3.1
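[Editor's illustration, not part of the thread.] The commit message mentions freeing multiple mbufs "in single pool API call". One plausible way to get that effect is to return each run of consecutive buffers that share a pool with one call. The sketch below is an assumption about the general technique, not the body of `mlx5_tx_free_mbuf()`, which the thread does not show. `toy_pool`, `toy_mbuf`, `toy_pool_put_bulk` and `toy_free_bulk` are hypothetical names, not DPDK APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the DPDK mempool or mbuf types. */
struct toy_pool {
	unsigned int put_calls; /* how many bulk-return calls were made */
	unsigned int returned;  /* how many buffers came back in total */
};

struct toy_mbuf {
	struct toy_pool *pool; /* pool this buffer was allocated from */
};

/* Return n buffers to one pool with a single call. */
static void
toy_pool_put_bulk(struct toy_pool *pool, struct toy_mbuf **bufs,
		  unsigned int n)
{
	(void)bufs; /* a real pool would store the pointers */
	pool->put_calls++;
	pool->returned += n;
}

/*
 * Free an array of buffers, issuing one pool call per run of
 * consecutive buffers that belong to the same pool.
 */
static void
toy_free_bulk(struct toy_mbuf **pkts, unsigned int n)
{
	unsigned int start = 0;

	assert(pkts != NULL || n == 0);
	while (start < n) {
		struct toy_pool *pool = pkts[start]->pool;
		unsigned int end = start + 1;

		/* Extend the run while the next buffer shares the pool. */
		while (end < n && pkts[end]->pool == pool)
			end++;
		toy_pool_put_bulk(pool, &pkts[start], end - start);
		start = end;
	}
}
```

When all buffers in a burst come from the same pool, which is a common setup, the whole array goes back in one call instead of one call per packet.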
* Re: [dpdk-stable] [PATCH v3 1/2] net/mlx5: optimize inline mbuf freeing
  2021-01-22 17:12   ` [dpdk-stable] [PATCH v3 " Viacheslav Ovsiienko
@ 2021-01-27 12:44     ` Ferruh Yigit
  2021-01-27 12:48       ` [dpdk-stable] [dpdk-dev] " Ferruh Yigit
  2021-01-28  9:14       ` [dpdk-stable] " Slava Ovsiienko
  0 siblings, 2 replies; 7+ messages in thread
From: Ferruh Yigit @ 2021-01-27 12:44 UTC
  To: Viacheslav Ovsiienko, dev; +Cc: rasland, matan, orika, thomas, akozyrev, stable

On 1/22/2021 5:12 PM, Viacheslav Ovsiienko wrote:
> The mlx5 PMD supports packet data inlining by pushing data
> to the transmit descriptor. If packet is short enough and all
> data are inline, the mbuf is not needed for data send anymore
> and can be freed.
>
> The mbuf free was performed in the most inner loop building
> the transmit descriptors. This patch postpones the mbuf free
> transaction to the tx_burst routine exit, optimizing the loop
> and allowing the bulk freeing for the multiple mbufs in single
> pool API call.
>
> Cc: stable@dpdk.org
>

Hi Slava,

This patch is optimization for inline mbufs, right, it is not a fix, should it
be backported?

cc'ed LTS maintainers.

I am dropping the stable to for now in the next-net, can add it later based on
discussion result.

> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
* Re: [dpdk-stable] [dpdk-dev] [PATCH v3 1/2] net/mlx5: optimize inline mbuf freeing
  2021-01-27 12:44     ` Ferruh Yigit
@ 2021-01-27 12:48       ` Ferruh Yigit
  2021-01-28  9:14       ` [dpdk-stable] " Slava Ovsiienko
  1 sibling, 0 replies; 7+ messages in thread
From: Ferruh Yigit @ 2021-01-27 12:48 UTC
  To: Viacheslav Ovsiienko, dev
  Cc: rasland, matan, orika, thomas, akozyrev, stable, Kevin Traynor,
	Luca Boccassi

On 1/27/2021 12:44 PM, Ferruh Yigit wrote:
> On 1/22/2021 5:12 PM, Viacheslav Ovsiienko wrote:
>> The mlx5 PMD supports packet data inlining by pushing data
>> to the transmit descriptor. If packet is short enough and all
>> data are inline, the mbuf is not needed for data send anymore
>> and can be freed.
>>
>> The mbuf free was performed in the most inner loop building
>> the transmit descriptors. This patch postpones the mbuf free
>> transaction to the tx_burst routine exit, optimizing the loop
>> and allowing the bulk freeing for the multiple mbufs in single
>> pool API call.
>>
>> Cc: stable@dpdk.org
>>
>
> Hi Slava,
>
> This patch is optimization for inline mbufs, right, it is not a fix, should it
> be backported?
>
> cc'ed LTS maintainers.
>

cc'ed now.

> I am dropping the stable to for now in the next-net, can add it later based on
> discussion result.
>
>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
>
* Re: [dpdk-stable] [PATCH v3 1/2] net/mlx5: optimize inline mbuf freeing
  2021-01-27 12:44     ` Ferruh Yigit
  2021-01-27 12:48       ` [dpdk-stable] [dpdk-dev] " Ferruh Yigit
@ 2021-01-28  9:14       ` Slava Ovsiienko
  2021-01-28  9:34         ` Thomas Monjalon
  1 sibling, 1 reply; 7+ messages in thread
From: Slava Ovsiienko @ 2021-01-28  9:14 UTC
  To: Ferruh Yigit, dev
  Cc: Raslan Darawsheh, Matan Azrad, Ori Kam,
	NBU-Contact-Thomas Monjalon, Alexander Kozyrev, stable

Hi, Ferruh

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> Sent: Wednesday, January 27, 2021 14:45
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; dev@dpdk.org
> Cc: Raslan Darawsheh <rasland@nvidia.com>; Matan Azrad
> <matan@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-Thomas
> Monjalon <thomas@monjalon.net>; Alexander Kozyrev
> <akozyrev@nvidia.com>; stable@dpdk.org
> Subject: Re: [dpdk-stable] [PATCH v3 1/2] net/mlx5: optimize inline mbuf
> freeing
>
> On 1/22/2021 5:12 PM, Viacheslav Ovsiienko wrote:
> > The mlx5 PMD supports packet data inlining by pushing data to the
> > transmit descriptor. If packet is short enough and all data are
> > inline, the mbuf is not needed for data send anymore and can be freed.
> >
> > The mbuf free was performed in the most inner loop building the
> > transmit descriptors. This patch postpones the mbuf free transaction
> > to the tx_burst routine exit, optimizing the loop and allowing the
> > bulk freeing for the multiple mbufs in single pool API call.
> >
> > Cc: stable@dpdk.org
> >
>
> Hi Slava,
>
> This patch is optimization for inline mbufs, right, it is not a fix, should it be
> backported?

Not critical, but nice to have this small optimization in LTS.

>
> cc'ed LTS maintainers.
>
> I am dropping the stable to for now in the next-net, can add it later based on
> discussion result.

OK, let's consider this backporting in dedicated way, thank you.

With best regards, Slava
* Re: [dpdk-stable] [PATCH v3 1/2] net/mlx5: optimize inline mbuf freeing
  2021-01-28  9:14       ` [dpdk-stable] " Slava Ovsiienko
@ 2021-01-28  9:34         ` Thomas Monjalon
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Monjalon @ 2021-01-28  9:34 UTC
  To: Slava Ovsiienko
  Cc: Ferruh Yigit, dev, Raslan Darawsheh, Matan Azrad, Ori Kam,
	Alexander Kozyrev, stable, bluca, kevin.traynor,
	Christian Ehrhardt

28/01/2021 10:14, Slava Ovsiienko:
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> > On 1/22/2021 5:12 PM, Viacheslav Ovsiienko wrote:
> > > The mlx5 PMD supports packet data inlining by pushing data to the
> > > transmit descriptor. If packet is short enough and all data are
> > > inline, the mbuf is not needed for data send anymore and can be freed.
> > >
> > > The mbuf free was performed in the most inner loop building the
> > > transmit descriptors. This patch postpones the mbuf free transaction
> > > to the tx_burst routine exit, optimizing the loop and allowing the
> > > bulk freeing for the multiple mbufs in single pool API call.
> > >
> > > Cc: stable@dpdk.org
> > >
> >
> > Hi Slava,
> >
> > This patch is optimization for inline mbufs, right, it is not a fix, should it be
> > backported?
> Not critical, but nice to have this small optimization in LTS.
>
> >
> > cc'ed LTS maintainers.
> >
> > I am dropping the stable to for now in the next-net, can add it later based on
> > discussion result.
>
> OK, let's consider this backporting in dedicated way, thank you.

Consensus from techboard is to reject optimizations in LTS for now.
Some acceptance guidelines will be written soon.
Not sure this one will be considered.