From: "Min Hu (Connor)" <humin29@huawei.com>
To: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
Chengchang Tang <tangchengchang@huawei.com>, <dev@dpdk.org>
Cc: <linuxarm@huawei.com>, <chas3@att.com>, <ferruh.yigit@intel.com>,
<konstantin.ananyev@intel.com>
Subject: Re: [dpdk-dev] [PATCH 1/2] net/bonding: support Tx prepare for bonding
Date: Tue, 24 May 2022 20:11:06 +0800 [thread overview]
Message-ID: <8fa66325-0f70-af0a-5126-7cae0fc8d0a4@huawei.com> (raw)
In-Reply-To: <4ce1752a-b605-8f35-ab06-13a671ef0e6c@oktetlabs.ru>
Hi, Andrew,
在 2021/6/8 17:49, Andrew Rybchenko 写道:
> "for bonding" is redundant in the summary since it is already "net/bonding".
>
> On 4/23/21 12:46 PM, Chengchang Tang wrote:
>> To use the HW offloads capability (e.g. checksum and TSO) in the Tx
>> direction, the upper-layer users need to call rte_eth_dev_prepare to do
>> some adjustment to the packets before sending them (e.g. processing
>> pseudo headers when Tx checksum offoad enabled). But, the tx_prepare
>> callback of the bond driver is not implemented. Therefore, related
>> offloads can not be used unless the upper layer users process the packet
>> properly in their own application. But it is bad for the
>> transplantability.
>>
>> However, it is difficult to design the tx_prepare callback for bonding
>> driver. Because when a bonded device sends packets, the bonded device
>> allocates the packets to different slave devices based on the real-time
>> link status and bonding mode. That is, it is very difficult for the
>> bonding device to determine which slave device's prepare function should
>> be invoked. In addition, if the link status changes after the packets are
>> prepared, the packets may fail to be sent because packets allocation may
>> change.
>>
>> So, in this patch, the tx_prepare callback of bonding driver is not
>> implemented. Instead, the rte_eth_dev_tx_prepare() will be called for
>> all the fast path packet in mode 0, 1, 2, 4, 5, 6. In this way, all
>> tx_offloads can be processed correctly for all NIC devices in these modes.
>> If tx_prepare is not required in some cases, then slave PMDs tx_prepare
>> pointer should be NULL and rte_eth_tx_prepare() will be just a NOOP.
>> In these cases, the impact on performance will be very limited. It is
>> the responsibility of the slave PMDs to decide when the real tx_prepare
>> needs to be used. The information from dev_config/queue_setup is
>> sufficient for them to make these decisions.
>>
>> Note:
>> The rte_eth_tx_prepare is not added to bond mode 3(Broadcast). This is
>> because in broadcast mode, a packet needs to be sent by all slave ports.
>> Different PMDs process the packets differently in tx_prepare. As a result,
>> the sent packet may be incorrect.
>>
>> Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
>> ---
>> drivers/net/bonding/rte_eth_bond.h | 1 -
>> drivers/net/bonding/rte_eth_bond_pmd.c | 28 ++++++++++++++++++++++++----
>> 2 files changed, 24 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/net/bonding/rte_eth_bond.h b/drivers/net/bonding/rte_eth_bond.h
>> index 874aa91..1e6cc6d 100644
>> --- a/drivers/net/bonding/rte_eth_bond.h
>> +++ b/drivers/net/bonding/rte_eth_bond.h
>> @@ -343,7 +343,6 @@ rte_eth_bond_link_up_prop_delay_set(uint16_t bonded_port_id,
>> int
>> rte_eth_bond_link_up_prop_delay_get(uint16_t bonded_port_id);
>>
>> -
>> #ifdef __cplusplus
>> }
>> #endif
>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
>> index 2e9cea5..84af348 100644
>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
>> @@ -606,8 +606,14 @@ bond_ethdev_tx_burst_round_robin(void *queue, struct rte_mbuf **bufs,
>> /* Send packet burst on each slave device */
>> for (i = 0; i < num_of_slaves; i++) {
>> if (slave_nb_pkts[i] > 0) {
>> + int nb_prep_pkts;
>> +
>> + nb_prep_pkts = rte_eth_tx_prepare(slaves[i],
>> + bd_tx_q->queue_id, slave_bufs[i],
>> + slave_nb_pkts[i]);
>> +
>
> Shouldn't it be called iff queue Tx offloads are not zero?
> It will allow to decrease performance degradation if no
> Tx offloads are enabled. Same in all cases below.
>
>> num_tx_slave = rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
>> - slave_bufs[i], slave_nb_pkts[i]);
>> + slave_bufs[i], nb_prep_pkts);
>
> In fact it is a problem here and really big problems.
> Tx prepare may fail and return less packets. Tx prepare
> of some packet may always fail. If application tries to
> send packets in a loop until success, it will be a
> forever loop here. Since application calls Tx burst,
> it is 100% legal behaviour of the function to return 0
> if Tx ring is full. It is not an error indication.
> However, in the case of Tx prepare it is an error
> indication.
Just regardless of this patch, I think the problem already exit in
'rte_eth_tx_burst'.
For example, there exits one 'bad' rte_mbuf in tx_pkts[].
set net driver 'fm10k' as an example, in 'fm10k_xmit_pkts',
when one rte_mbuf is 'bad'(mb->nb_segs == 0), it will break
and return the pkt num( < nb_pkts) which successfully xmited.
The code is like :
"
uint16_t
fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts)
{
....
for (count = 0; count < nb_pkts; ++count) {
/* sanity check to make sure the mbuf is valid */
if ((mb->nb_segs == 0) ||
((mb->nb_segs > 1) && (mb->next == NULL)))
break;
}
return count;
}
"
So, if APP send packets in a loop until success, it will be
also forever loop here.
And one solution to fix it is depending on APP itself, like what testpmd
has done: it adds delay time, like that:
"
nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_rx);
/*
* Retry if necessary
*/
if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
retry = 0;
while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
rte_delay_us(burst_tx_delay_time);
nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
&pkts_burst[nb_tx], nb_rx - nb_tx);
}
}
"
what I mean, this patch does not introduce new 'bugs' to
'rte_eth_tx_burst'. And also, the known bug in retry situation can be
fixed in APP.
>
> Should we change Tx burst description and enforce callers
> to check for rte_errno? It sounds like a major change...
>
> [snip]
> .
>
next prev parent reply other threads:[~2022-05-24 12:11 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-16 11:04 [dpdk-dev] [RFC 0/2] add Tx prepare support for bonding device Chengchang Tang
2021-04-16 11:04 ` [dpdk-dev] [RFC 1/2] net/bonding: add Tx prepare for bonding Chengchang Tang
2021-04-16 11:04 ` [dpdk-dev] [RFC 2/2] app/testpmd: add cmd for bonding Tx prepare Chengchang Tang
2021-04-16 11:12 ` [dpdk-dev] [RFC 0/2] add Tx prepare support for bonding device Min Hu (Connor)
2021-04-20 1:26 ` Ferruh Yigit
2021-04-20 2:44 ` Chengchang Tang
2021-04-20 8:33 ` Ananyev, Konstantin
2021-04-20 12:44 ` Chengchang Tang
2021-04-20 13:18 ` Ananyev, Konstantin
2021-04-20 14:06 ` Chengchang Tang
2021-04-23 9:46 ` [dpdk-dev] [PATCH " Chengchang Tang
2021-04-23 9:46 ` [dpdk-dev] [PATCH 1/2] net/bonding: support Tx prepare for bonding Chengchang Tang
2021-06-08 9:49 ` Andrew Rybchenko
2021-06-09 6:42 ` Chengchang Tang
2021-06-09 9:35 ` Andrew Rybchenko
2021-06-10 7:32 ` Chengchang Tang
2021-06-14 14:16 ` Andrew Rybchenko
2021-06-09 10:25 ` Ananyev, Konstantin
2021-06-10 6:46 ` Chengchang Tang
2021-06-14 11:36 ` Ananyev, Konstantin
2022-05-24 12:11 ` Min Hu (Connor) [this message]
2022-07-25 4:08 ` [PATCH v2 0/3] add Tx prepare support for bonding driver Chengwen Feng
2022-07-25 4:08 ` [PATCH v2 1/3] net/bonding: support Tx prepare Chengwen Feng
2022-09-13 10:22 ` Ferruh Yigit
2022-09-13 15:08 ` Chas Williams
2022-09-14 0:46 ` fengchengwen
2022-09-14 16:59 ` Chas Williams
2022-09-17 2:35 ` fengchengwen
2022-09-17 13:38 ` Chas Williams
2022-09-19 14:07 ` Konstantin Ananyev
2022-09-19 23:02 ` Chas Williams
2022-09-22 2:12 ` fengchengwen
2022-09-25 10:32 ` Chas Williams
2022-09-26 10:18 ` Konstantin Ananyev
2022-09-26 16:36 ` Chas Williams
2022-07-25 4:08 ` [PATCH v2 2/3] net/bonding: support Tx prepare fail stats Chengwen Feng
2022-07-25 4:08 ` [PATCH v2 3/3] net/bonding: add testpmd cmd for Tx prepare Chengwen Feng
2022-07-25 7:04 ` [PATCH v2 0/3] add Tx prepare support for bonding driver humin (Q)
2022-09-13 1:41 ` fengchengwen
2022-09-17 4:15 ` [PATCH v3 " Chengwen Feng
2022-09-17 4:15 ` [PATCH v3 1/3] net/bonding: support Tx prepare Chengwen Feng
2022-09-17 4:15 ` [PATCH v3 2/3] net/bonding: support Tx prepare fail stats Chengwen Feng
2022-09-17 4:15 ` [PATCH v3 3/3] net/bonding: add testpmd cmd for Tx prepare Chengwen Feng
2022-10-09 3:36 ` [PATCH v4] net/bonding: call Tx prepare before Tx burst Chengwen Feng
2022-10-10 19:42 ` Chas Williams
2022-10-11 13:28 ` fengchengwen
2022-10-11 13:20 ` [PATCH v5] " Chengwen Feng
2022-10-15 15:26 ` Chas Williams
2022-10-18 14:25 ` fengchengwen
2022-10-20 7:07 ` Andrew Rybchenko
2021-04-23 9:46 ` [dpdk-dev] [PATCH 2/2] net/bonding: support configuring Tx offloading for bonding Chengchang Tang
2021-06-08 9:49 ` Andrew Rybchenko
2021-06-09 6:57 ` Chengchang Tang
2021-06-09 9:11 ` Ananyev, Konstantin
2021-06-09 9:37 ` Andrew Rybchenko
2021-06-10 6:29 ` Chengchang Tang
2021-06-14 11:05 ` Ananyev, Konstantin
2021-06-14 14:13 ` Andrew Rybchenko
2021-04-30 6:26 ` [dpdk-dev] [PATCH 0/2] add Tx prepare support for bonding device Chengchang Tang
2021-04-30 6:47 ` Min Hu (Connor)
2021-06-03 1:44 ` Chengchang Tang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8fa66325-0f70-af0a-5126-7cae0fc8d0a4@huawei.com \
--to=humin29@huawei.com \
--cc=andrew.rybchenko@oktetlabs.ru \
--cc=chas3@att.com \
--cc=dev@dpdk.org \
--cc=ferruh.yigit@intel.com \
--cc=konstantin.ananyev@intel.com \
--cc=linuxarm@huawei.com \
--cc=tangchengchang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).