From mboxrd@z Thu Jan  1 00:00:00 1970
From: Chengwen Feng
To: ,
CC: , , , ,
Subject: [PATCH v2 1/3] net/bonding: support Tx prepare
Date: Mon, 25 Jul 2022 12:08:40 +0800
Message-ID: <20220725040842.35027-2-fengchengwen@huawei.com>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20220725040842.35027-1-fengchengwen@huawei.com>
References: <1619171202-28486-2-git-send-email-tangchengchang@huawei.com>
 <20220725040842.35027-1-fengchengwen@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain
List-Id: DPDK patches and discussions

Normally, to use the HW offload capabilities (e.g.
checksum and TSO) in the Tx direction, the application needs to call
rte_eth_tx_prepare() to make some adjustments to the packets before
sending them (e.g. processing pseudo headers when Tx checksum offload is
enabled). But the tx_prepare callback of the bonding driver is not
implemented, so the sent packets may carry errors (e.g. checksum errors).

However, it is difficult to design a tx_prepare callback for the bonding
driver: when a bonded device sends packets, it distributes them across the
slave devices based on the real-time link status and the bonding mode, so
it is very hard for the bonded device to determine which slave device's
prepare function should be invoked.

Therefore, this patch does not implement the tx_prepare callback of the
bonding driver. Instead, rte_eth_tx_prepare() is called for all fast-path
packets in modes 0, 1, 2, 4, 5 and 6 (mode 3 is not included, see [1]).
In this way, all Tx offloads can be processed correctly for all NIC
devices in these modes.

As previously discussed (see v1), if tx_prepare fails, the bonding driver
frees the corresponding packets internally, and only the packets for which
tx_prepare succeeded are transmitted.

To minimize the performance impact, this patch adds a new
'tx_prepare_enabled' field, together with the corresponding control and
query APIs: rte_eth_bond_tx_prepare_set() and rte_eth_bond_tx_prepare_get().

[1] In bond mode 3 (broadcast), a packet needs to be sent by every slave
port. Different slave PMDs process the packets differently in tx_prepare.
If tx_prepare were called before each slave port sends, the transmitted
packets could be incorrect.
Signed-off-by: Chengchang Tang
Signed-off-by: Chengwen Feng
---
 .../link_bonding_poll_mode_drv_lib.rst |  5 ++
 doc/guides/rel_notes/release_22_11.rst |  6 ++
 drivers/net/bonding/eth_bond_private.h |  1 +
 drivers/net/bonding/rte_eth_bond.h     | 24 +++++++
 drivers/net/bonding/rte_eth_bond_api.c | 32 +++++++++
 drivers/net/bonding/rte_eth_bond_pmd.c | 65 ++++++++++++++---
 drivers/net/bonding/version.map        |  5 ++
 7 files changed, 130 insertions(+), 8 deletions(-)

diff --git a/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.rst b/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.rst
index 9510368103..a3d91b2091 100644
--- a/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.rst
+++ b/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.rst
@@ -359,6 +359,11 @@ The link status of a bonded device is dictated by that of its slaves, if all
 slave device link status are down or if all slaves are removed from the link
 bonding device then the link status of the bonding device will go down.
 
+Unlike normal PMD drivers, Tx prepare for the bonding driver is controlled
+by ``rte_eth_bond_tx_prepare_set`` (all bond modes except mode 3 (broadcast)
+are supported). ``rte_eth_bond_tx_prepare_get`` is provided to query the
+enabling status.
+
 It is also possible to configure / query the configuration of the control
 parameters of a bonded device using the provided APIs
 ``rte_eth_bond_mode_set/get``, ``rte_eth_bond_primary_set/get``,
diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index 8c021cf050..6e28a6c0af 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Tx prepare for bonding driver.**
+
+  * Added ``rte_eth_bond_tx_prepare_set`` to set whether Tx prepare is
+    enabled for a bonded port.
+    All bond modes except mode 3 (broadcast) are supported.
+  * Added ``rte_eth_bond_tx_prepare_get`` to query whether Tx prepare is
+    enabled for a bonded port.
+
 
 Removed Items
 -------------
diff --git a/drivers/net/bonding/eth_bond_private.h b/drivers/net/bonding/eth_bond_private.h
index 8222e3cd38..9996f6673c 100644
--- a/drivers/net/bonding/eth_bond_private.h
+++ b/drivers/net/bonding/eth_bond_private.h
@@ -117,6 +117,7 @@ struct bond_dev_private {
 	uint16_t user_defined_primary_port;
 	/**< Flag for whether primary port is user defined or not */
 
+	uint8_t tx_prepare_enabled;
 	uint8_t balance_xmit_policy;
 	/**< Transmit policy - l2 / l23 / l34 for operation in balance mode */
 	burst_xmit_hash_t burst_xmit_hash;
diff --git a/drivers/net/bonding/rte_eth_bond.h b/drivers/net/bonding/rte_eth_bond.h
index 874aa91a5f..deae2dd9ad 100644
--- a/drivers/net/bonding/rte_eth_bond.h
+++ b/drivers/net/bonding/rte_eth_bond.h
@@ -343,6 +343,30 @@ rte_eth_bond_link_up_prop_delay_set(uint16_t bonded_port_id,
 int
 rte_eth_bond_link_up_prop_delay_get(uint16_t bonded_port_id);
 
+/**
+ * Set whether to enable Tx prepare (rte_eth_tx_prepare) for a bonded port.
+ *
+ * @param bonded_port_id Bonded device id
+ * @param en Enable flag
+ *
+ * @return
+ *   0 on success, negative value otherwise.
+ */
+__rte_experimental
+int
+rte_eth_bond_tx_prepare_set(uint16_t bonded_port_id, bool en);
+
+/**
+ * Get whether Tx prepare (rte_eth_tx_prepare) is enabled for a bonded port.
+ *
+ * @param bonded_port_id Bonded device id
+ *
+ * @return
+ *   0 - disabled, 1 - enabled, negative value otherwise.
+ */
+__rte_experimental
+int
+rte_eth_bond_tx_prepare_get(uint16_t bonded_port_id);
 
 #ifdef __cplusplus
 }
diff --git a/drivers/net/bonding/rte_eth_bond_api.c b/drivers/net/bonding/rte_eth_bond_api.c
index 4ac191c468..47841289f4 100644
--- a/drivers/net/bonding/rte_eth_bond_api.c
+++ b/drivers/net/bonding/rte_eth_bond_api.c
@@ -1070,3 +1070,35 @@ rte_eth_bond_link_up_prop_delay_get(uint16_t bonded_port_id)
 
 	return internals->link_up_delay_ms;
 }
+
+int
+rte_eth_bond_tx_prepare_set(uint16_t bonded_port_id, bool en)
+{
+	struct bond_dev_private *internals;
+
+	if (valid_bonded_port_id(bonded_port_id) != 0)
+		return -1;
+
+	internals = rte_eth_devices[bonded_port_id].data->dev_private;
+	if (internals->mode == BONDING_MODE_BROADCAST) {
+		RTE_BOND_LOG(ERR, "Broadcast mode does not support configuring Tx prepare");
+		return -ENOTSUP;
+	}
+
+	internals->tx_prepare_enabled = en ? 1 : 0;
+
+	return 0;
+}
+
+int
+rte_eth_bond_tx_prepare_get(uint16_t bonded_port_id)
+{
+	struct bond_dev_private *internals;
+
+	if (valid_bonded_port_id(bonded_port_id) != 0)
+		return -1;
+
+	internals = rte_eth_devices[bonded_port_id].data->dev_private;
+
+	return internals->tx_prepare_enabled;
+}
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 73e6972035..c32c7e6c6c 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -559,6 +559,56 @@ bond_ethdev_rx_burst_alb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 	return nb_recv_pkts;
 }
 
+static inline uint16_t
+bond_ethdev_tx_wrap(struct bond_tx_queue *bd_tx_q, uint16_t slave_port_id,
+		struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+	struct bond_dev_private *internals = bd_tx_q->dev_private;
+	uint16_t queue_id = bd_tx_q->queue_id;
+	struct rte_mbuf *fail_pkts[nb_pkts];
+	uint8_t fail_mark[nb_pkts];
+	uint16_t nb_pre, index;
+	uint16_t fail_cnt = 0;
+	int i;
+
+	if (!internals->tx_prepare_enabled)
+		goto tx_burst;
+
+	nb_pre =
rte_eth_tx_prepare(slave_port_id, queue_id, tx_pkts, nb_pkts);
+	if (nb_pre == nb_pkts)
+		goto tx_burst;
+
+	fail_pkts[fail_cnt++] = tx_pkts[nb_pre];
+	memset(fail_mark, 0, sizeof(fail_mark));
+	fail_mark[nb_pre] = 1;
+	for (i = nb_pre + 1; i < nb_pkts; /* update in inner loop */) {
+		nb_pre = rte_eth_tx_prepare(slave_port_id, queue_id,
+					tx_pkts + i, nb_pkts - i);
+		if (nb_pre == nb_pkts - i)
+			break;
+		fail_pkts[fail_cnt++] = tx_pkts[i + nb_pre];
+		fail_mark[i + nb_pre] = 1;
+		i += nb_pre + 1;
+	}
+
+	/* move tx-prepare OK mbufs to the end */
+	for (i = index = nb_pkts - 1; i >= 0; i--) {
+		if (!fail_mark[i])
+			tx_pkts[index--] = tx_pkts[i];
+	}
+	/* move tx-prepare fail mbufs to the beginning, and free them */
+	for (i = 0; i < fail_cnt; i++) {
+		tx_pkts[i] = fail_pkts[i];
+		rte_pktmbuf_free(fail_pkts[i]);
+	}
+
+	if (fail_cnt == nb_pkts)
+		return nb_pkts;
+tx_burst:
+	return fail_cnt + rte_eth_tx_burst(slave_port_id, queue_id,
+				tx_pkts + fail_cnt, nb_pkts - fail_cnt);
+}
+
 static uint16_t
 bond_ethdev_tx_burst_round_robin(void *queue, struct rte_mbuf **bufs,
 		uint16_t nb_pkts)
@@ -602,7 +652,7 @@ bond_ethdev_tx_burst_round_robin(void *queue, struct rte_mbuf **bufs,
 	/* Send packet burst on each slave device */
 	for (i = 0; i < num_of_slaves; i++) {
 		if (slave_nb_pkts[i] > 0) {
-			num_tx_slave = rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
+			num_tx_slave = bond_ethdev_tx_wrap(bd_tx_q, slaves[i],
 					slave_bufs[i], slave_nb_pkts[i]);
 
 			/* if tx burst fails move packets to end of bufs */
@@ -635,8 +685,8 @@ bond_ethdev_tx_burst_active_backup(void *queue,
 	if (internals->active_slave_count < 1)
 		return 0;
 
-	return rte_eth_tx_burst(internals->current_primary_port, bd_tx_q->queue_id,
-			bufs, nb_pkts);
+	return bond_ethdev_tx_wrap(bd_tx_q, internals->current_primary_port,
+			bufs, nb_pkts);
 }
 
 static inline uint16_t
@@ -951,8 +1001,8 @@ bond_ethdev_tx_burst_tlb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 #endif
 		}
 
-		num_tx_total += rte_eth_tx_burst(slaves[i],
bd_tx_q->queue_id,
-				bufs + num_tx_total, nb_pkts - num_tx_total);
+		num_tx_total += bond_ethdev_tx_wrap(bd_tx_q, slaves[i],
+				bufs + num_tx_total, nb_pkts - num_tx_total);
 
 		if (num_tx_total == nb_pkts)
 			break;
@@ -1158,9 +1208,8 @@ tx_burst_balance(void *queue, struct rte_mbuf **bufs, uint16_t nb_bufs,
 		if (slave_nb_bufs[i] == 0)
 			continue;
 
-		slave_tx_count = rte_eth_tx_burst(slave_port_ids[i],
-				bd_tx_q->queue_id, slave_bufs[i],
-				slave_nb_bufs[i]);
+		slave_tx_count = bond_ethdev_tx_wrap(bd_tx_q,
+				slave_port_ids[i], slave_bufs[i], slave_nb_bufs[i]);
 
 		total_tx_count += slave_tx_count;
diff --git a/drivers/net/bonding/version.map b/drivers/net/bonding/version.map
index 9333923b4e..2c121f2559 100644
--- a/drivers/net/bonding/version.map
+++ b/drivers/net/bonding/version.map
@@ -31,3 +31,8 @@ DPDK_23 {
 
 	local: *;
 };
+
+EXPERIMENTAL {
+	rte_eth_bond_tx_prepare_get;
+	rte_eth_bond_tx_prepare_set;
+};
\ No newline at end of file
-- 
2.33.0