From: Chas Williams <3chas3@gmail.com>
Date: Sat, 17 Sep 2022 09:38:29 -0400
Subject: Re: [PATCH v2 1/3] net/bonding: support Tx prepare
To: fengchengwen, Ferruh Yigit, thomas@monjalon.net,
 andrew.rybchenko@oktetlabs.ru, konstantin.ananyev@intel.com
Cc: dev@dpdk.org, chas3@att.com, humin29@huawei.com
Message-ID: <863016bd-a20b-8a9f-8edc-cfddc0593546@gmail.com>
In-Reply-To: <509a1984-841a-e42c-05c1-707b024ef7a8@huawei.com>
References: <1619171202-28486-2-git-send-email-tangchengchang@huawei.com>
 <20220725040842.35027-1-fengchengwen@huawei.com>
 <20220725040842.35027-2-fengchengwen@huawei.com>
 <495fb2f0-60c2-f1c9-2985-0d08bb463ad0@xilinx.com>
 <4b4af3e8-710a-ae75-8171-331ebfe4e4f7@huawei.com>
 <6c91f993-b11d-987c-6d20-38ee11f9f9db@gmail.com>
 <509a1984-841a-e42c-05c1-707b024ef7a8@huawei.com>
List-Id: DPDK patches and discussions

On 9/16/22 22:35, fengchengwen wrote:
> Hi Chas,
>
> On 2022/9/15 0:59, Chas Williams
wrote:
>> On 9/13/22 20:46, fengchengwen wrote:
>>>
>>> The main problem is that it is hard to design a tx_prepare for a bonding
>>> device:
>>> 1. As Chas Williams said, there may be two hash calculations to get the
>>>    target slave devices.
>>> 2. More importantly, if the slave devices change (e.g. a slave device's
>>>    link goes down or the device is removed) between bond-tx-prepare and
>>>    bond-tx-burst, the output slave will change, and this may lead to
>>>    checksum failures. (Note: a bond device may have slave devices from
>>>    different vendors, and slave devices may have different requirements,
>>>    e.g. slave-A calculates the IPv4 pseudo-header automatically (no driver
>>>    pre-calculation needed), but slave-B needs the driver to pre-calculate
>>>    it.)
>>>
>>> The current design covers the above two scenarios by using in-place
>>> tx-prepare. In addition, bond devices are not transparent to
>>> applications, so I think it's a practical method to provide tx-prepare
>>> support in this way.
>>>
>>
>>
>> I don't think you need to export an enable/disable routine for the use of
>> rte_eth_tx_prepare. It's safe to just call that routine, even if it isn't
>> implemented. You are just trading one branch in DPDK librte_eth_dev for a
>> branch in drivers/net/bonding.
>
> Our first patch was just like yours (just add tx-prepare by default), but
> the community is concerned about impacting performance.
>
> As a trade-off, I think we can add the enable/disable API.

IMHO, that's a bad idea. If the rte_eth_tx_prepare API affects
performance adversely, that is not a bonding problem. All applications
should be calling rte_eth_tx_prepare. There's no defined API to
determine if rte_eth_tx_prepare should be called. Therefore,
applications should always call rte_eth_tx_prepare. Regardless, as I
previously mentioned, you are just trading the location of the branch,
especially in the bonding case.

If rte_eth_tx_prepare is causing a performance drop, then that API
should be improved or rewritten.
There are PMDs that require you to use that API. Locally, we had
maintained a patch to eliminate the use of rte_eth_tx_prepare. However,
that has been getting harder and harder to maintain. The performance
lost by just calling rte_eth_tx_prepare was marginal.

>
>>
>> I think you missed fixing tx_machine in 802.3ad support. We have been using
>> the following patch locally which I never got around to submitting.
>
> You are right, I will send a V3 to fix it.
>
>>
>>
>> From a458654d68ff5144266807ef136ac3dd2adfcd98 Mon Sep 17 00:00:00 2001
>> From: "Charles (Chas) Williams"
>> Date: Tue, 3 May 2022 16:52:37 -0400
>> Subject: [PATCH] net/bonding: call rte_eth_tx_prepare before rte_eth_tx_burst
>>
>> Some PMDs might require a call to rte_eth_tx_prepare before sending the
>> packets for transmission. Typically, the prepare step handles the VLAN
>> headers, but it may need to do other things.
>>
>> Signed-off-by: Chas Williams
>
> ...
>
>>               * ring if transmission fails so the packet isn't lost.
>> @@ -1322,8 +1350,12 @@ bond_ethdev_tx_burst_broadcast(void *queue, struct rte_mbuf **bufs,
>>
>>      /* Transmit burst on each active slave */
>>      for (i = 0; i < num_of_slaves; i++) {
>> -        slave_tx_total[i] = rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
>> +        uint16_t nb_prep;
>> +
>> +        nb_prep = rte_eth_tx_prepare(slaves[i], bd_tx_q->queue_id,
>>                      bufs, nb_pkts);
>> +        slave_tx_total[i] = rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
>> +                    bufs, nb_prep);
>
> The tx-prepare may edit packet data, and the broadcast mode will send a
> packet to all slaves, so the packet data is sent and edited at the same
> time. Is this likely to cause problems?

This routine is already broken. You can't just increment the refcount
and send the packet into a PMD's transmit routine. Nothing guarantees
that a transmit routine will not modify the packet. Many PMDs perform an
rte_vlan_insert.
You should at least perform a clone of the packet so that the mbuf
headers aren't mangled by each PMD. Just to be safe, you should perform a
partial deep copy of the packet headers in case some PMD does an
rte_vlan_insert and the other PMDs in the bonding group do not need an
rte_vlan_insert.

So doing a blind rte_eth_tx_prepare isn't making anything much worse.

>
>>
>>          if (unlikely(slave_tx_total[i] < nb_pkts))
>>              tx_failed_flag = 1;
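
To make the broadcast concern above concrete, here is a minimal
standalone sketch in plain C. It is not DPDK code and not the bonding
driver's implementation: struct toy_pkt, toy_tx_vlan, toy_tx_plain,
broadcast_shared and broadcast_copied are all hypothetical names invented
for illustration. It assumes a "PMD" that inserts a VLAN tag in place
(roughly what a driver calling rte_vlan_insert would do to the headers)
and compares handing every slave the same buffer with handing each slave
its own copy of the headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_HDR_MAX   32  /* room for an Ethernet header plus a few VLAN tags */
#define TOY_ETH_HLEN  14  /* dst MAC + src MAC + EtherType */
#define TOY_VLAN_HLEN 4   /* TPID + TCI */

/* Hypothetical stand-in for an mbuf: only the header bytes are modelled. */
struct toy_pkt {
    uint8_t  hdr[TOY_HDR_MAX];
    uint16_t hdr_len;
};

/* Hypothetical per-slave transmit hook. */
typedef void (*toy_tx_fn)(struct toy_pkt *pkt);

/* A "PMD" that inserts a VLAN tag in place before sending. */
void toy_tx_vlan(struct toy_pkt *pkt)
{
    assert(pkt->hdr_len >= TOY_ETH_HLEN);
    assert(pkt->hdr_len + TOY_VLAN_HLEN <= TOY_HDR_MAX);
    /* Shift everything from the EtherType onward to make room for the tag. */
    memmove(pkt->hdr + 12 + TOY_VLAN_HLEN, pkt->hdr + 12, pkt->hdr_len - 12);
    pkt->hdr[12] = 0x81;  /* TPID 0x8100 */
    pkt->hdr[13] = 0x00;
    pkt->hdr[14] = 0x00;  /* TCI: VLAN ID 100, chosen arbitrarily */
    pkt->hdr[15] = 100;
    pkt->hdr_len += TOY_VLAN_HLEN;
}

/* A "PMD" that sends the packet untouched. */
void toy_tx_plain(struct toy_pkt *pkt)
{
    (void)pkt;
}

/* Broadcast using one shared buffer: each slave sees earlier slaves' edits.
 * seen_len[i] records the header length slave i was handed. */
void broadcast_shared(struct toy_pkt *pkt, const toy_tx_fn *slaves, int n,
                      uint16_t *seen_len)
{
    for (int i = 0; i < n; i++) {
        seen_len[i] = pkt->hdr_len;
        slaves[i](pkt);
    }
}

/* Broadcast with a private header copy per slave: the original is untouched. */
void broadcast_copied(const struct toy_pkt *pkt, const toy_tx_fn *slaves, int n,
                      uint16_t *seen_len)
{
    for (int i = 0; i < n; i++) {
        struct toy_pkt copy = *pkt;  /* stands in for a partial deep copy */
        seen_len[i] = copy.hdr_len;
        slaves[i](&copy);
    }
}
```

With slaves ordered as { toy_tx_vlan, toy_tx_plain, toy_tx_vlan } and a
14-byte header, broadcast_shared hands the second and third slaves an
already-tagged 18-byte header and leaves a doubly tagged 22-byte header
behind, while broadcast_copied hands every slave the original 14 bytes.
In a real driver the copy has a cost, so how much of the packet to copy
is a design choice this sketch does not try to settle.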