From: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
To: Feifei Wang <Feifei.Wang2@arm.com>,
Konstantin Ananyev <konstantin.ananyev@huawei.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>,
Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
Ruifeng Wang <Ruifeng.Wang@arm.com>,
Yuying Zhang <Yuying.Zhang@intel.com>,
Beilei Xing <beilei.xing@intel.com>
Subject: Re: [PATCH v11 2/4] net/i40e: implement mbufs recycle mode
Date: Sat, 23 Sep 2023 21:40:51 +0100 [thread overview]
Message-ID: <a1abe048-29b3-b61c-61e1-6d272764662a@yandex.ru> (raw)
In-Reply-To: <AS8PR08MB7718602AA2ABB7141CA3FC8DC8FEA@AS8PR08MB7718.eurprd08.prod.outlook.com>
>>>>>>>>>>>>>> Define specific function implementation for i40e driver.
>>>>>>>>>>>>>> Currently, mbufs recycle mode can support 128bit
>>>>>>>>>>>>>> vector path and avx2 path. And can be enabled both
>>>>>>>>>>>>>> in fast free and no fast free mode.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>>>>>>>>>>>>>> Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
>>>>>>>>>>>>>> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>>>>>>>>>>>>>> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>> drivers/net/i40e/i40e_ethdev.c | 1 +
>>>>>>>>>>>>>> drivers/net/i40e/i40e_ethdev.h | 2 +
>>>>>>>>>>>>>> .../net/i40e/i40e_recycle_mbufs_vec_common.c | 147 ++++++++++++++++++
>>>>>>>>>>>>>> drivers/net/i40e/i40e_rxtx.c | 32 ++++
>>>>>>>>>>>>>> drivers/net/i40e/i40e_rxtx.h | 4 +
>>>>>>>>>>>>>> drivers/net/i40e/meson.build | 1 +
>>>>>>>>>>>>>> 6 files changed, 187 insertions(+)
>>>>>>>>>>>>>> create mode 100644 drivers/net/i40e/i40e_recycle_mbufs_vec_common.c
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/drivers/net/i40e/i40e_ethdev.c
>>>>>>>>>>>>>> b/drivers/net/i40e/i40e_ethdev.c
>>>>>>>>>>>>>> index 8271bbb394..50ba9aac94 100644
>>>>>>>>>>>>>> --- a/drivers/net/i40e/i40e_ethdev.c
>>>>>>>>>>>>>> +++ b/drivers/net/i40e/i40e_ethdev.c
>>>>>>>>>>>>>> @@ -496,6 +496,7 @@ static const struct eth_dev_ops i40e_eth_dev_ops = {
>>>>>>>>>>>>>> .flow_ops_get = i40e_dev_flow_ops_get,
>>>>>>>>>>>>>> .rxq_info_get = i40e_rxq_info_get,
>>>>>>>>>>>>>> .txq_info_get = i40e_txq_info_get,
>>>>>>>>>>>>>> + .recycle_rxq_info_get = i40e_recycle_rxq_info_get,
>>>>>>>>>>>>>> .rx_burst_mode_get = i40e_rx_burst_mode_get,
>>>>>>>>>>>>>> .tx_burst_mode_get = i40e_tx_burst_mode_get,
>>>>>>>>>>>>>> .timesync_enable = i40e_timesync_enable,
>>>>>>>>>>>>>> diff --git a/drivers/net/i40e/i40e_ethdev.h
>>>>>>>>>>>>>> b/drivers/net/i40e/i40e_ethdev.h
>>>>>>>>>>>>>> index 6f65d5e0ac..af758798e1 100644
>>>>>>>>>>>>>> --- a/drivers/net/i40e/i40e_ethdev.h
>>>>>>>>>>>>>> +++ b/drivers/net/i40e/i40e_ethdev.h
>>>>>>>>>>>>>> @@ -1355,6 +1355,8 @@ void i40e_rxq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
>>>>>>>>>>>>>> struct rte_eth_rxq_info *qinfo);
>>>>>>>>>>>>>> void i40e_txq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
>>>>>>>>>>>>>> struct rte_eth_txq_info *qinfo);
>>>>>>>>>>>>>> +void i40e_recycle_rxq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
>>>>>>>>>>>>>> + struct rte_eth_recycle_rxq_info *recycle_rxq_info);
>>>>>>>>>>>>>> int i40e_rx_burst_mode_get(struct rte_eth_dev *dev, uint16_t queue_id,
>>>>>>>>>>>>>> struct rte_eth_burst_mode *mode);
>>>>>>>>>>>>>> int i40e_tx_burst_mode_get(struct rte_eth_dev *dev, uint16_t queue_id,
>>>>>>>>>>>>>> diff --git a/drivers/net/i40e/i40e_recycle_mbufs_vec_common.c
>>>>>>>>>>>>>> b/drivers/net/i40e/i40e_recycle_mbufs_vec_common.c
>>>>>>>>>>>>>> new file mode 100644
>>>>>>>>>>>>>> index 0000000000..5663ecccde
>>>>>>>>>>>>>> --- /dev/null
>>>>>>>>>>>>>> +++ b/drivers/net/i40e/i40e_recycle_mbufs_vec_common.c
>>>>>>>>>>>>>> @@ -0,0 +1,147 @@
>>>>>>>>>>>>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>>>>>>>>>>>>> + * Copyright (c) 2023 Arm Limited.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +#include <stdint.h>
>>>>>>>>>>>>>> +#include <ethdev_driver.h>
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +#include "base/i40e_prototype.h"
>>>>>>>>>>>>>> +#include "base/i40e_type.h"
>>>>>>>>>>>>>> +#include "i40e_ethdev.h"
>>>>>>>>>>>>>> +#include "i40e_rxtx.h"
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +#pragma GCC diagnostic ignored "-Wcast-qual"
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>> +i40e_recycle_rx_descriptors_refill_vec(void *rx_queue, uint16_t nb_mbufs)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> + struct i40e_rx_queue *rxq = rx_queue;
>>>>>>>>>>>>>> + struct i40e_rx_entry *rxep;
>>>>>>>>>>>>>> + volatile union i40e_rx_desc *rxdp;
>>>>>>>>>>>>>> + uint16_t rx_id;
>>>>>>>>>>>>>> + uint64_t paddr;
>>>>>>>>>>>>>> + uint64_t dma_addr;
>>>>>>>>>>>>>> + uint16_t i;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + rxdp = rxq->rx_ring + rxq->rxrearm_start;
>>>>>>>>>>>>>> + rxep = &rxq->sw_ring[rxq->rxrearm_start];
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + for (i = 0; i < nb_mbufs; i++) {
>>>>>>>>>>>>>> + /* Initialize rxdp descs. */
>>>>>>>>>>>>>> + paddr = (rxep[i].mbuf)->buf_iova + RTE_PKTMBUF_HEADROOM;
>>>>>>>>>>>>>> + dma_addr = rte_cpu_to_le_64(paddr);
>>>>>>>>>>>>>> + /* flush desc with pa dma_addr */
>>>>>>>>>>>>>> + rxdp[i].read.hdr_addr = 0;
>>>>>>>>>>>>>> + rxdp[i].read.pkt_addr = dma_addr;
>>>>>>>>>>>>>> + }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + /* Update the descriptor initializer index */
>>>>>>>>>>>>>> + rxq->rxrearm_start += nb_mbufs;
>>>>>>>>>>>>>> + rx_id = rxq->rxrearm_start - 1;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + if (unlikely(rxq->rxrearm_start >= rxq->nb_rx_desc)) {
>>>>>>>>>>>>>> + rxq->rxrearm_start = 0;
>>>>>>>>>>>>>> + rx_id = rxq->nb_rx_desc - 1;
>>>>>>>>>>>>>> + }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + rxq->rxrearm_nb -= nb_mbufs;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + rte_io_wmb();
>>>>>>>>>>>>>> + /* Update the tail pointer on the NIC */
>>>>>>>>>>>>>> + I40E_PCI_REG_WRITE_RELAXED(rxq->qrx_tail, rx_id);
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +uint16_t
>>>>>>>>>>>>>> +i40e_recycle_tx_mbufs_reuse_vec(void *tx_queue,
>>>>>>>>>>>>>> + struct rte_eth_recycle_rxq_info *recycle_rxq_info)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> + struct i40e_tx_queue *txq = tx_queue;
>>>>>>>>>>>>>> + struct i40e_tx_entry *txep;
>>>>>>>>>>>>>> + struct rte_mbuf **rxep;
>>>>>>>>>>>>>> + int i, n;
>>>>>>>>>>>>>> + uint16_t nb_recycle_mbufs;
>>>>>>>>>>>>>> + uint16_t avail = 0;
>>>>>>>>>>>>>> + uint16_t mbuf_ring_size = recycle_rxq_info->mbuf_ring_size;
>>>>>>>>>>>>>> + uint16_t mask = recycle_rxq_info->mbuf_ring_size - 1;
>>>>>>>>>>>>>> + uint16_t refill_requirement = recycle_rxq_info->refill_requirement;
>>>>>>>>>>>>>> + uint16_t refill_head = *recycle_rxq_info->refill_head;
>>>>>>>>>>>>>> + uint16_t receive_tail = *recycle_rxq_info->receive_tail;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + /* Get available recycling Rx buffers. */
>>>>>>>>>>>>>> + avail = (mbuf_ring_size - (refill_head - receive_tail)) & mask;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + /* Check Tx free thresh and Rx available space. */
>>>>>>>>>>>>>> + if (txq->nb_tx_free > txq->tx_free_thresh || avail <= txq->tx_rs_thresh)
>>>>>>>>>>>>>> + return 0;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + /* check DD bits on threshold descriptor */
>>>>>>>>>>>>>> + if ((txq->tx_ring[txq->tx_next_dd].cmd_type_offset_bsz &
>>>>>>>>>>>>>> + rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) !=
>>>>>>>>>>>>>> + rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
>>>>>>>>>>>>>> + return 0;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + n = txq->tx_rs_thresh;
>>>>>>>>>>>>>> + nb_recycle_mbufs = n;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + /* Mbufs recycle mode can only support no ring buffer wrapping around.
>>>>>>>>>>>>>> + * Two case for this:
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * case 1: The refill head of Rx buffer ring needs to be aligned with
>>>>>>>>>>>>>> + * mbuf ring size. In this case, the number of Tx freeing buffers
>>>>>>>>>>>>>> + * should be equal to refill_requirement.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * case 2: The refill head of Rx ring buffer does not need to be aligned
>>>>>>>>>>>>>> + * with mbuf ring size. In this case, the update of refill head can not
>>>>>>>>>>>>>> + * exceed the Rx mbuf ring size.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> + if (refill_requirement != n ||
>>>>>>>>>>>>>> + (!refill_requirement && (refill_head + n > mbuf_ring_size)))
>>>>>>>>>>>>>> + return 0;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + /* First buffer to free from S/W ring is at index
>>>>>>>>>>>>>> + * tx_next_dd - (tx_rs_thresh-1).
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> + txep = &txq->sw_ring[txq->tx_next_dd - (n - 1)];
>>>>>>>>>>>>>> + rxep = recycle_rxq_info->mbuf_ring;
>>>>>>>>>>>>>> + rxep += refill_head;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + if (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) {
>>>>>>>>>>>>>> + /* Avoid txq contains buffers from unexpected mempool. */
>>>>>>>>>>>>>> + if (unlikely(recycle_rxq_info->mp != txep[0].mbuf->pool))
>>>>>>>>>>>>>> + return 0;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + /* Directly put mbufs from Tx to Rx. */
>>>>>>>>>>>>>> + for (i = 0; i < n; i++)
>>>>>>>>>>>>>> + rxep[i] = txep[i].mbuf;
>>>>>>>>>>>>>> + } else {
>>>>>>>>>>>>>> + for (i = 0; i < n; i++) {
>>>>>>>>>>>>>> + rxep[i] = rte_pktmbuf_prefree_seg(txep[i].mbuf);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> + /* If Tx buffers are not the last reference or from
>>>>>>>>>>>>>> + * unexpected mempool, previous copied buffers are
>>>>>>>>>>>>>> + * considered as invalid.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> + if (unlikely((rxep[i] == NULL && refill_requirement) ||
>>>>>>>>>>>>> [Konstantin]
>>>>>>>>>>>>> Could you pls remind me why it is ok to have
>>>>>>>>>>>>> rxep[i]==NULL when refill_requirement is not set?
>>>>>>>>>>>>>
>>>>>>>>>>>>> If refill_requirement is not zero, it means each tx
>>>>>>>>>>>>> freed buffer must be valid and can be put into Rx sw_ring.
>>>>>>>>>>>>> Then the refill head of Rx buffer ring can be
>>>>>>>>>>>>> aligned with mbuf ring size. Briefly speaking the
>>>>>>>>>>>>> number
>>>>>>>>>>>> of Tx valid freed buffer must be equal to Rx
>> refill_requirement.
>>>>>>>>>>>> For example, i40e driver.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If refill_requirement is zero, it means that the
>>>>>>>>>>>>> refill head of Rx buffer ring does not need to be
>>>>>>>>>>>>> aligned with mbuf ring size, thus if Tx have n
>>>>>>>>>>>>> valid freed buffers, we just need to put these n
>>>>>>>>>>>>> buffers into Rx
>>>>>>>>>>>>> sw-
>>>>>>>>>>>> ring, and not to be equal to the Rx setting rearm number.
>>>>>>>>>>>> For example, mlx5 driver.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In conclusion, above difference is due to pmd
>>>>>>>>>>>>> drivers have different
>>>>>>>>>>>> strategies to update their Rx rearm(refill) head.
>>>>>>>>>>>>> For i40e driver, if rearm_head exceeds 1024, it will be
>>>>>>>>>>>>> reset to 0, because the number of mbufs per rearm is a
>>>>>>>>>>>> fixed value by default.
>>>>>>>>>>>>> For mlx5 driver, its rearm_head can exceed 1024, and a
>>>>>>>>>>>>> mask is used to obtain the
>>>>>>>>>>>> real index. Thus its rearm number can be a different value.
>>>>>>>>>>>>
>>>>>>>>>>>> Ok, but if rte_pktmbuf_prefree_seg(txep[i].mbuf) returns NULL,
>>>>>>>>>>>> it means that this mbuf is not free yet and can't be reused.
>>>>>>>>>>>> Shouldn't we then set nb_recycle_mbufs = 0 in that case
>> too?
>>>>>>>>>>>> Or probably would be enough to skip that mbuf?
>>>>>>>>>>>> Might be something like that:
>>>>>>>>>>>>
>>>>>>>>>>>> for (i = 0, j = 0; i < n; i++) {
>>>>>>>>>>>>
>>>>>>>>>>>> rxep[j] = rte_pktmbuf_prefree_seg(txep[i].mbuf);
>>>>>>>>>>>> if (rxep[j] == NULL || recycle_rxq_info->mp != rxep[j]->pool) {
>>>>>>>>>>>> if (refill_requirement) {
>>>>>>>>>>>> nb_recycle_mbufs = 0;
>>>>>>>>>>>> break;
>>>>>>>>>>>> }
>>>>>>>>>>>> } else
>>>>>>>>>>>> j++;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> /* now j contains actual number of recycled mbufs */
>>>>>>>>>>>>
>>>>>>>>>>>> ?
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> + recycle_rxq_info->mp != txep[i].mbuf->pool))
>>>>>>>>>>>>>> + nb_recycle_mbufs = 0;
>>>>>>>>>>>>>> + }
>>>>>>>>>>>>>> + /* If Tx buffers are not the last reference or
>>>>>>>>>>>>>> + * from unexpected mempool, all recycled buffers
>>>>>>>>>>>>>> + * are put into mempool.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> + if (nb_recycle_mbufs == 0)
>>>>>>>>>>>>>> + for (i = 0; i < n; i++) {
>>>>>>>>>>>>>> + if (rxep[i] != NULL)
>>>>>>>>>>>>>> + rte_mempool_put(rxep[i]->pool, rxep[i]);
>>>>>>>>>>>>>> + }
>>>>>>>>>>>>>> + }
>>>>>>>>>>>>>> +
>>>>>>>>>>> [Konstantin] After another thought, it might be easier
>>>>>>>>>>> and cleaner just to:
>>>>>>>>>>> if (rxep[j] == NULL || recycle_rxq_info->mp != rxep[j]->pool)
>>>>>>>>>>> nb_recycle_mbufs = 0;
>>>>>>>>>>>
>>>>>>>>>>> Anyway, from my understanding - if
>>>>>>>>>>> rte_pktmbuf_prefree_seg(mbuf) returns NULL, then we
>>>>>>>>>>> can't recycle
>>>>>> that mbuf.
>>>>>>>>>>>
>>>>>>>>>>> [Feifei] Agree with that
>>>>>>>>>>> 'rte_pktmbuf_prefree_seg(mbuf) returns NULL, then
>>>>>>>>>> we can't recycle that mbuf'.
>>>>>>>>>>>
>>>>>>>>>>> Firstly, we should know that for the i40e driver, the number
>>>>>>>>>>> of freed mbufs is fixed; it
>>>>>>>>>> must equal 'tx_rs_thresh'.
>>>>>>>>>>> This means if we start to free Tx mbufs, it cannot be
>>>>>>>>>>> interrupted until the
>>>>>>>>>> given amount of mbufs are freed.
>>>>>>>>>>> Meanwhile, doing the prefree operation for a Tx mbuf
>>>>>>>>>>> means this mbuf can be regarded as freed from the Tx
>>>>>>>>>>> sw-ring even if the API returns NULL. This is because
>>>>>>>>>> calling the 'prefree_seg' function updates the mbuf refcnt.
>>>>>>>>>>>
>>>>>>>>>>> So let's come back our recycle mbuf case.
>>>>>>>>>>> Here we consider that the case if we have 32 mbufs
>>>>>>>>>>> need to be freed, and
>>>>>>>>>> we firstly do the pre-free.
>>>>>>>>>>> And then the first 8 mbufs are good and the return value is not NULL.
>>>>>>>>>> But the 9th mbuf is bad; its refcnt is more than 1.
>>>>>>>>>>> So we just update its refcnt and cannot put it into Rx sw-ring.
>>>>>>>>>>> For Tx sw-ring,
>>>>>>>>>> this mbuf has been freed.
>>>>>>>>>>> Then we should continue to do pre-free operation for
>>>>>>>>>>> the next Tx mbufs to ensure the given amount of mbufs are
>> freed.
>>>>>>>>>>>
>>>>>>>>>>> To conclude: once we start the pre-free operation, the
>>>>>>>>>>> Tx mbuf refcnt values may already have changed, so we
>>>>>>>>>>> cannot stop or
>>>>>>>>>> break until all the pre-free operations are finished.
>>>>>>>>>>>
>>>>>>>>>>> Finally, in the case that Rx refill_request is not
>>>>>>>>>>> zero, but the valid mbuf amount is less than this
>>>>>>>>>>> value, we must put these Tx mbufs back into the
>>>>>>>>>> mempool.
>>>>>>>>>>>
>>>>>>>>>>> Above is the reason why I do not directly jump out of
>>>>>>>>>>> the loop if some mbuf
>>>>>>>>>> return value is NULL.
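[Editor's note] The invariant explained above — prefree must run for every one of the n buffers even after a NULL return, and a NULL only invalidates the recycled batch — can be illustrated with a small self-contained model. This is plain C with simplified stand-ins (`mbuf_t` and `prefree()` model only the refcnt behaviour of rte_mbuf and rte_pktmbuf_prefree_seg), not the actual driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for rte_mbuf: only the refcnt matters here. */
typedef struct { uint16_t refcnt; } mbuf_t;

/* Model of rte_pktmbuf_prefree_seg(): decrements the refcnt and
 * returns the mbuf only when this was the last reference. */
static mbuf_t *prefree(mbuf_t *m)
{
	return (--m->refcnt == 0) ? m : NULL;
}

/* The i40e-style loop: prefree MUST be called for every one of the n
 * buffers (the freed amount is fixed at tx_rs_thresh), even after a
 * NULL return; a NULL only marks the recycled batch invalid. */
static unsigned recycle(mbuf_t *txep[], mbuf_t *rxep[], unsigned n)
{
	unsigned nb_recycle = n;

	for (unsigned i = 0; i < n; i++) {
		rxep[i] = prefree(txep[i]);
		if (rxep[i] == NULL)
			nb_recycle = 0;	/* batch invalid, but keep freeing */
	}
	return nb_recycle;
}
```

Running the model with one multi-referenced mbuf in the middle shows that the batch is rejected while every refcnt is still decremented, i.e. the loop never breaks early.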
>>>>>>>>>>
>>>>>>>>>> Yep, I already realized that it is a bit more
>>>>>>>>>> complicated and we need to continue with prefree() for
>>>>>>>>>> all packets even when we get NULL in
>>>>>>>> the middle.
>>>>>>>>>> Anyway the code has to be changed, right?
>>>>>>>>>>
>>>>>>>>> Sorry, I think the code does not need to be changed.
>>>>>>>>> For no fast free path, currently:
>>>>>>>>> 1. We check whether each mbuf is Ok and call pre_free function
>>>>>>>>> --------------------------------------------------------------------
>>>>>>>>> 2.1 For the mbuf return value is not NULL, it is put into Rx sw-ring.
>>>>>>>>> 2.2 For the mbuf return value is NULL and refill-request is set,
>>>>>>>>> it will also firstly be put into the Rx sw-ring, and we set
>>>>>>>>> nb_recycle = 0
>>>>>>>>> --------------------------------------------------------------------
>>>>>>>>> 3.1 We check nb_recycle, if it is not 0, we will continue
>>>>>>>>> to rearm Rx descs
>>>>>>>> and update its index by call descs_refill function.
>>>>>>>>> 3.2 if nb_recycle is 0, we will put valid recycle mbufs
>>>>>>>>> back into mempool as in the general path. This is necessary,
>>>>>>>>> because we need to ensure the freed Tx number is fixed.
>>>>>>>>> (Buffers whose return value is NULL can be seen as freed;
>>>>>>>>> the others need to be put back into the mempool.)
>>>>>>>>>
>>>>>>>>> Or maybe I ignore something?
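[Editor's note] The three-step flow described above can be sketched as a self-contained model. `pool_put()` is a stand-in for rte_mempool_put that just counts returns, and the mbuf/prefree types are simplified; this is an illustration of the logic, not the driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint16_t refcnt; } mbuf_t;

static unsigned pool_puts;	/* counts stand-in rte_mempool_put() calls */
static void pool_put(mbuf_t *m) { (void)m; pool_puts++; }

/* Model of rte_pktmbuf_prefree_seg(). */
static mbuf_t *prefree(mbuf_t *m)
{
	return (--m->refcnt == 0) ? m : NULL;
}

/* Steps 1-3.2: prefree every buffer, stage valid ones in rxep[], and
 * if the batch turned invalid, drain the staged ones back to the pool
 * so that exactly n buffers leave the Tx sw-ring either way. */
static unsigned recycle_or_put(mbuf_t *txep[], mbuf_t *rxep[], unsigned n)
{
	unsigned nb_recycle = n;

	for (unsigned i = 0; i < n; i++) {
		rxep[i] = prefree(txep[i]);	/* step 1 */
		if (rxep[i] == NULL)
			nb_recycle = 0;		/* step 2.2 */
	}
	if (nb_recycle == 0)			/* step 3.2 */
		for (unsigned i = 0; i < n; i++)
			if (rxep[i] != NULL)
				pool_put(rxep[i]);
	return nb_recycle;
}
```

With one bad mbuf in a batch of three, the two valid ones end up back in the pool and none are recycled, matching step 3.2.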
>>>>>>>>
>>>>>>>>
>>>>>>>> I am talking about the case when both refill_requirement and
>>>>>>>> mbuf return values are zero:
>>>>>>>> if (unlikely((rxep[i] == NULL && refill_requirement) ||
>>>>>>>> /* ??? rxep[i] == 0 AND refill_requirement == 0 ??? */
>>>>>>>> recycle_rxq_info->mp != txep[i].mbuf->pool))
>>>>>>>> nb_recycle_mbufs = 0;
>>>>>>>>
>>>>>>>> As I can read the code you treat such situation as valid,
>>>>>>>> while I think we should reset nb_recycle_mbufs to zero when
>>>>>>>> rxep[i] == NULL, no matter what value refill_requirement is.
>>>>>>>
>>>>>>> So this means, for e.g. the MLX5 driver, its refill_request = 0.
>>>>>>> And if some mbufs' return value is NULL, the other mbufs cannot
>>>>>>> be recycled into the Rx sw-ring? Because nb_recycle = 0, and
>>>>>>> they need to be put into the mempool.
>>>>>>>
>>>>>>> I think for a driver such as MLX5, we can allow recycling some
>>>>>>> valid mbufs into the Rx ring, because there is no constraint on
>>>>>>> its refill number. Is my suggestion reasonable?
>>>>>>
>>>>>> I suppose yes: if refill_request is zero we potentially can skip 'bad'
>>>>>> mbufs and continue with recycling for remaining ones.
>>>>>> It would probably require more changes in current code, but
>>>>>> sounds ok to me in general.
>>>>> That's Ok. Thanks for your careful reviewing.
>>>>
>>>> I'm very sorry I did not receive your e-mail, and only now do I realize
>>>> we need to do some code changes for the i40e driver. Also thanks to
>>>> Ferruh for kindly reminding me of this.
>>
>> No worries at all, and thanks to you and Ferruh for the fast reaction.
>>
>>>>
>>>> Agree with you we need some operation for the case that
>>>> (refill_requirement == 0 && rxep[i] == 0).
>>>> Thus maybe we can do a change as follows:
>>>>
>>>> for (i = 0; i < n; i++) {
>>>> rxep[0] = rte_pktmbuf_prefree_seg(txep[i].mbuf);
>>>> if (unlikely((rxep[0] == NULL && refill_requirement) ||
>>>> recycle_rxq_info->mp != txep[i].mbuf->pool))
>>>> nb_recycle_mbufs = 0;
>>>> if (likely(rxep[0]))
>>>> rxep++;
>>>> }
>>>>
>>>> Is above change is Ok?
>>>>
>>>
>>> Just do a change for the above code:
>>> ---------------------------------------------------------------------------
>>> int j = 0;
>>> ......
>>>
>>> for (i = 0; i < n; i++) {
>>> rxep[j] = rte_pktmbuf_prefree_seg(txep[i].mbuf);
>>> if (unlikely(rxep[j] == NULL && !refill_requirement))
>>> continue;
>>> if (unlikely((rxep[j] == NULL && refill_requirement) ||
>>> recycle_rxq_info->mp != txep[i].mbuf->pool))
>>> nb_recycle_mbufs = 0;
>>> j++;
>>> }
>>>
>>> if (nb_recycle_mbufs == 0)
>>> for (i = 0; i < j; i++) {
>>> if (rxep[i] != NULL)
>>> rte_mempool_put(rxep[i]->pool, rxep[i]);
>>> }
>>
>> I think it should work...
>> One more thing, shouldn't we reset
>> nb_recycle_mbufs = j;
>>
>> Something like:
>> if (nb_recycle_mbufs == 0) {
>> for (i = 0; i < j; i++) {
>> if (rxep[i] != NULL)
>> rte_mempool_put(rxep[i]->pool, rxep[i]);
>> }
>> } else
>> nb_recycle_mbufs = j;
>>
>> ?
>>
>>
> Yes, we should reset it.
>
> I just did a performance test with the new code and found the performance decreases by 8%
> compared with the old version.
>
> Thus I think we should come back to your previous idea:
> "we should reset nb_recycle_mbufs to zero when rxep[i] == NULL,
> no matter what value refill_requirement is":
>
> for (i = 0; i < n; i++) {
> rxep[i] = rte_pktmbuf_prefree_seg(txep[i].mbuf);
> if (unlikely((rxep[i] == NULL) ||
> recycle_rxq_info->mp != txep[i].mbuf->pool))
> nb_recycle_mbufs = 0;
> }
>
> Thus we drop the case of putting the remaining valid mbufs into the Rx sw-ring
> when there is no refill_request, but maintain good performance.
I am ok with that way too.
That way also looks simpler and probably less error-prone.
Thanks
Konstantin
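[Editor's note] As a side note on the patch discussed above: the `avail` computation in i40e_recycle_tx_mbufs_reuse_vec() relies on the mbuf ring size being a power of two, so a mask replaces a modulo and unsigned 16-bit arithmetic absorbs index wraparound. A self-contained sketch of that ring arithmetic (the values below are illustrative only, not taken from the driver):

```c
#include <assert.h>
#include <stdint.h>

/* Free slots in a power-of-two ring, as computed in the patch:
 * avail = (mbuf_ring_size - (refill_head - receive_tail)) & mask.
 * The uint16_t subtraction stays correct even after the indices
 * wrap around 2^16. */
static uint16_t ring_avail(uint16_t ring_size, uint16_t refill_head,
			   uint16_t receive_tail)
{
	uint16_t mask = ring_size - 1;
	uint16_t in_use = refill_head - receive_tail;	/* wrap-safe */

	return (uint16_t)(ring_size - in_use) & mask;
}
```

When head equals tail the mask also folds `ring_size & mask` down to 0, which is why a full subtraction result of ring_size reads as "no buffers available".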
next prev parent reply other threads:[~2023-09-23 20:40 UTC|newest]
Thread overview: 145+ messages
2022-04-20 8:16 [PATCH v1 0/5] Direct re-arming of buffers on receive side Feifei Wang
2022-04-20 8:16 ` [PATCH v1 1/5] net/i40e: remove redundant Dtype initialization Feifei Wang
2022-04-20 8:16 ` [PATCH v1 2/5] net/i40e: enable direct rearm mode Feifei Wang
2022-05-11 22:28 ` Konstantin Ananyev
2022-04-20 8:16 ` [PATCH v1 3/5] ethdev: add API for " Feifei Wang
2022-04-20 9:59 ` Morten Brørup
2022-04-29 2:42 ` 回复: " Feifei Wang
2022-04-20 10:41 ` Andrew Rybchenko
2022-04-29 6:28 ` 回复: " Feifei Wang
2022-05-10 22:49 ` Honnappa Nagarahalli
2022-06-03 10:19 ` Andrew Rybchenko
2022-04-20 10:50 ` Jerin Jacob
2022-05-02 3:09 ` 回复: " Feifei Wang
2022-04-21 14:57 ` Stephen Hemminger
2022-04-29 6:35 ` 回复: " Feifei Wang
2022-04-20 8:16 ` [PATCH v1 4/5] net/i40e: add direct rearm mode internal API Feifei Wang
2022-05-11 22:31 ` Konstantin Ananyev
2022-04-20 8:16 ` [PATCH v1 5/5] examples/l3fwd: enable direct rearm mode Feifei Wang
2022-04-20 10:10 ` Morten Brørup
2022-04-21 2:35 ` Honnappa Nagarahalli
2022-04-21 6:40 ` Morten Brørup
2022-05-10 22:01 ` Honnappa Nagarahalli
2022-05-11 7:17 ` Morten Brørup
2022-05-11 22:33 ` Konstantin Ananyev
2022-05-27 11:28 ` Konstantin Ananyev
2022-05-31 17:14 ` Honnappa Nagarahalli
2022-06-03 10:32 ` Andrew Rybchenko
2022-06-06 11:27 ` Konstantin Ananyev
2022-06-29 21:25 ` Honnappa Nagarahalli
2022-05-11 23:00 ` [PATCH v1 0/5] Direct re-arming of buffers on receive side Konstantin Ananyev
[not found] ` <20220516061012.618787-1-feifei.wang2@arm.com>
2022-05-24 1:25 ` Konstantin Ananyev
2022-05-24 12:40 ` Morten Brørup
2022-05-24 20:14 ` Honnappa Nagarahalli
2022-05-28 12:22 ` Konstantin Ananyev
2022-06-01 1:00 ` Honnappa Nagarahalli
2022-06-03 23:32 ` Konstantin Ananyev
2022-06-04 8:07 ` Morten Brørup
2022-06-29 21:58 ` Honnappa Nagarahalli
2022-06-30 15:21 ` Morten Brørup
2022-07-01 19:30 ` Honnappa Nagarahalli
2022-07-01 20:28 ` Morten Brørup
2022-06-13 5:55 ` 回复: " Feifei Wang
2023-01-04 7:30 ` [PATCH v3 0/3] " Feifei Wang
2023-01-04 7:30 ` [PATCH v3 1/3] ethdev: enable direct rearm with separate API Feifei Wang
2023-01-04 8:21 ` Morten Brørup
2023-01-04 8:51 ` 回复: " Feifei Wang
2023-01-04 10:11 ` Morten Brørup
2023-02-24 8:55 ` 回复: " Feifei Wang
2023-03-06 12:49 ` Ferruh Yigit
2023-03-06 13:26 ` Morten Brørup
2023-03-06 14:53 ` 回复: " Feifei Wang
2023-03-06 15:02 ` Ferruh Yigit
2023-03-07 6:12 ` Honnappa Nagarahalli
2023-03-07 10:52 ` Konstantin Ananyev
2023-03-07 20:41 ` Ferruh Yigit
2023-03-22 14:43 ` Honnappa Nagarahalli
2023-02-02 14:33 ` Konstantin Ananyev
2023-02-24 9:45 ` 回复: " Feifei Wang
2023-02-27 19:31 ` Konstantin Ananyev
2023-02-28 2:16 ` 回复: " Feifei Wang
2023-02-28 8:09 ` Morten Brørup
2023-03-01 7:34 ` 回复: " Feifei Wang
2023-01-04 7:30 ` [PATCH v3 2/3] net/i40e: " Feifei Wang
2023-02-02 14:37 ` Konstantin Ananyev
2023-02-24 9:50 ` 回复: " Feifei Wang
2023-02-27 19:35 ` Konstantin Ananyev
2023-02-28 2:15 ` 回复: " Feifei Wang
2023-03-07 11:01 ` Konstantin Ananyev
2023-03-14 6:07 ` 回复: " Feifei Wang
2023-03-19 16:11 ` Konstantin Ananyev
2023-03-23 10:49 ` Feifei Wang
2023-01-04 7:30 ` [PATCH v3 3/3] net/ixgbe: " Feifei Wang
2023-01-31 6:13 ` 回复: [PATCH v3 0/3] Direct re-arming of buffers on receive side Feifei Wang
2023-02-01 1:10 ` Konstantin Ananyev
2023-02-01 2:24 ` 回复: " Feifei Wang
2023-03-22 12:56 ` Morten Brørup
2023-03-22 13:41 ` Honnappa Nagarahalli
2023-03-22 14:04 ` Morten Brørup
2023-08-02 7:38 ` [PATCH v8 0/4] Recycle mbufs from Tx queue into Rx queue Feifei Wang
2023-08-02 7:38 ` [PATCH v8 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-08-02 7:38 ` [PATCH v8 2/4] net/i40e: implement " Feifei Wang
2023-08-02 7:38 ` [PATCH v8 3/4] net/ixgbe: " Feifei Wang
2023-08-02 7:38 ` [PATCH v8 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-08-02 8:08 ` [PATCH v9 0/4] Recycle mbufs from Tx queue into Rx queue Feifei Wang
2023-08-02 8:08 ` [PATCH v9 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-08-02 8:08 ` [PATCH v9 2/4] net/i40e: implement " Feifei Wang
2023-08-02 8:08 ` [PATCH v9 3/4] net/ixgbe: " Feifei Wang
2023-08-02 8:08 ` [PATCH v9 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-08-04 9:24 ` [PATCH v10 0/4] Recycle mbufs from Tx queue into Rx queue Feifei Wang
2023-08-04 9:24 ` [PATCH v10 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-08-04 9:24 ` [PATCH v10 2/4] net/i40e: implement " Feifei Wang
2023-08-04 9:24 ` [PATCH v10 3/4] net/ixgbe: " Feifei Wang
2023-08-04 9:24 ` [PATCH v10 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-08-22 7:27 ` [PATCH v11 0/4] Recycle mbufs from Tx queue into Rx queue Feifei Wang
2023-08-22 7:27 ` [PATCH v11 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-08-22 14:02 ` Stephen Hemminger
2023-08-24 3:16 ` Feifei Wang
2023-08-22 23:33 ` Konstantin Ananyev
2023-08-24 3:38 ` Feifei Wang
2023-08-22 7:27 ` [PATCH v11 2/4] net/i40e: implement " Feifei Wang
2023-08-22 23:43 ` Konstantin Ananyev
2023-08-24 6:10 ` Feifei Wang
2023-08-31 17:24 ` Konstantin Ananyev
2023-08-31 23:49 ` Konstantin Ananyev
2023-09-01 12:22 ` Feifei Wang
2023-09-01 14:22 ` Konstantin Ananyev
2023-09-04 6:59 ` Feifei Wang
2023-09-04 7:49 ` Konstantin Ananyev
2023-09-04 9:24 ` Feifei Wang
2023-09-04 10:21 ` Konstantin Ananyev
2023-09-05 3:11 ` Feifei Wang
2023-09-22 14:58 ` Feifei Wang
2023-09-22 15:46 ` Feifei Wang
2023-09-22 16:40 ` Konstantin Ananyev
2023-09-23 5:52 ` Feifei Wang
2023-09-23 20:40 ` Konstantin Ananyev [this message]
2023-09-25 3:26 ` Feifei Wang
2023-08-22 7:27 ` [PATCH v11 3/4] net/ixgbe: " Feifei Wang
2023-08-22 7:27 ` [PATCH v11 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-08-22 7:33 ` [PATCH v11 0/4] Recycle mbufs from Tx queue into Rx queue Feifei Wang
2023-08-22 13:59 ` Stephen Hemminger
2023-08-24 3:11 ` Feifei Wang
2023-08-24 7:36 ` [PATCH v12 " Feifei Wang
2023-08-24 7:36 ` [PATCH v12 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-08-31 9:16 ` Feifei Wang
2023-09-20 13:10 ` Ferruh Yigit
2023-08-24 7:36 ` [PATCH v12 2/4] net/i40e: implement " Feifei Wang
2023-08-24 7:36 ` [PATCH v12 3/4] net/ixgbe: " Feifei Wang
2023-08-24 7:36 ` [PATCH v12 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-09-20 13:11 ` Ferruh Yigit
2023-09-20 13:12 ` [PATCH v12 0/4] Recycle mbufs from Tx queue into Rx queue Ferruh Yigit
2023-09-22 15:30 ` Ferruh Yigit
2023-09-25 3:19 ` [PATCH v13 " Feifei Wang
2023-09-25 3:19 ` [PATCH v13 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-09-25 4:40 ` Ajit Khaparde
2023-09-25 3:19 ` [PATCH v13 2/4] net/i40e: implement " Feifei Wang
2023-09-26 8:26 ` Ferruh Yigit
2023-09-26 8:56 ` Konstantin Ananyev
2023-09-26 13:34 ` Konstantin Ananyev
2023-09-25 3:19 ` [PATCH v13 3/4] net/ixgbe: " Feifei Wang
2023-09-26 13:30 ` Konstantin Ananyev
2023-09-25 3:19 ` [PATCH v13 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-09-26 13:30 ` Konstantin Ananyev
2023-09-26 16:38 ` Ajit Khaparde
2023-09-27 17:24 ` [PATCH v13 0/4] Recycle mbufs from Tx queue into Rx queue Ferruh Yigit