To: Ori Kam, Thomas Monjalon, Ferruh Yigit
Cc: dev@dpdk.org, jingjing.wu@intel.com, stephen@networkplumber.org
References: <1569479349-36962-1-git-send-email-orika@mellanox.com> <1569479349-36962-2-git-send-email-orika@mellanox.com> <4c7da322-ea96-9649-b6cc-aad3a21df57a@solarflare.com>
From: Andrew Rybchenko
Message-ID: <87d7bb44-0537-3927-3276-e15743654268@solarflare.com>
Date: Sun, 29 Sep 2019 15:10:53 +0300
Subject: Re: [dpdk-dev] [PATCH 01/13] ethdev: support setup function for hairpin queue
List-Id: DPDK patches and discussions

Hi Ori,

On 9/28/19 6:19 PM, Ori Kam wrote:
> Hi Andrew.
> PSB
>
>> -----Original Message-----
>> From: Andrew Rybchenko
>> Sent: Thursday, September 26, 2019 8:24 PM
>> To: Ori Kam; Thomas Monjalon; Ferruh Yigit
>> Cc: dev@dpdk.org; jingjing.wu@intel.com; stephen@networkplumber.org
>> Subject: Re: [dpdk-dev] [PATCH 01/13] ethdev: support setup function for
>> hairpin queue
>>
>> On 9/26/19 6:58 PM, Ori Kam wrote:
>>> Hi Andrew,
>>> Thanks for your comments, PSB.
>>>
>>>> -----Original Message-----
>>>> From: Andrew Rybchenko
>>>> On 9/26/19 9:28 AM, Ori Kam wrote:
>>>>> This commit introduce the RX/TX hairpin setup function.
>>>> RX/TX should be Rx/Tx here and everywhere below.
>>>>
>>>>> Hairpin is RX/TX queue that is used by the nic in order to offload
>>>>> wire to wire traffic.
>>>>>
>>>>> Each hairpin queue is binded to one or more queues from other type.
>>>>> For example TX hairpin queue should be binded to at least 1 RX hairpin
>>>>> queue and vice versa.
>>>> How should the application find out that hairpin queues are supported?
>>> It should be stated in the release notes of the DPDK, when a
>>> manufacturer adds support for this.
>>> In addition, if the application tries to set up a hairpin queue and it
>>> fails, it can mean, depending on the error, that hairpin is not
>>> supported.
>> I'm talking about dev_info-like information. Documentation is nice, but
>> it is not very useful for implementing an application which works with
>> NICs from different vendors.
>>
> What if we add a get hairpin capabilities function?
> We could have the max number of queues, whether the NIC supports 1:n
> connection, which offloads are supported, and so on. So basically create
> a new set of capabilities for hairpin; this I think will also remove
> other concerns that you have.
> What do you think?

Yes, I think an API to report capabilities would be useful. It should also
be used in the setup functions to check, at the generic level, that the
setup request is consistent with the reported capabilities.

>>>> How many?
>>> There is no limit to the number of hairpin queues; from the application
>>> side, all queues can be hairpin queues.
>>
>> I'm pretty sure that it could be vendor specific.
>>
> Please see my answer above.
>
>>>> How should the application find out which ports/queues could be used
>>>> for pinning?
>>> All ports and queues can be supported; if the application requests an
>>> invalid combination (for example, in the current Mellanox
>>> implementation, binding between two ports), then the setup function
>>> will fail.
>>> If you would like, I can add a capability for this, but there are too
>>> many options.
>>> For example, the number of queues, binding limitations: all of those
>>> will be very hard to declare.
>>>
>>>> Is hair-pinning domain on device level sufficient to expose limitations?
>>>>
>>> I'm sorry, but I don't understand your question.
>> I was just trying to imagine how we could say that we can hairpin
>> one port's Rx queues to another port's Tx queues.
>>
> Like I suggested above, if I add a capability function we could have
> a field that says port_binding supported, or something else along this
> line.

Not sure that I understand, but I'll take a look when submitted.

>>>>> Signed-off-by: Ori Kam
>>>>> ---
>>>>>  lib/librte_ethdev/rte_ethdev.c           | 213 +++++++++++++++++++++++++++++++
>>>>>  lib/librte_ethdev/rte_ethdev.h           | 145 +++++++++++++++++++++
>>>>>  lib/librte_ethdev/rte_ethdev_core.h      |  18 +++
>>>>>  lib/librte_ethdev/rte_ethdev_version.map |   4 +
>>>>>  4 files changed, 380 insertions(+)
>>>>>
>>>>> diff --git a/lib/librte_ethdev/rte_ethdev.c
>>>>> b/lib/librte_ethdev/rte_ethdev.c
>>>>> index 30b0c78..4021f38 100644
>>>>> --- a/lib/librte_ethdev/rte_ethdev.c
>>>>> +++ b/lib/librte_ethdev/rte_ethdev.c
>>>>> @@ -1701,6 +1701,115 @@ struct rte_eth_dev *
>>>>>  }
>>>>>
>>>>>  int
>>>>> +rte_eth_rx_hairpin_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
>>>>> +			       uint16_t nb_rx_desc, unsigned int socket_id,
>>>>> +			       const struct rte_eth_rxconf *rx_conf,
>>>>> +			       const struct rte_eth_hairpin_conf *hairpin_conf)
>>>> Below code duplicates rte_eth_rx_queue_setup() a lot and it is very bad
>>>> from maintenance point of view. Similar problem with Tx hairpin queue
>>>> setup.
>>>>
>>> I'm aware of that. The reasons I chose it are (same goes for Tx):
>>> 1. Use the same function approach, meaning use the current setup
>>>    function. The issues with this are:
>>>    * API break.
>>>    * It will have extra parameters; for example, mempool will not be
>>>      used for hairpin and hairpin configuration will not be used for a
>>>      normal queue.
>>>    * It is possible to use a struct, but again API break, and some
>>>      fields are not used.
>>>    * We are just starting with the hairpin; most likely there will be
>>>      modifications, so it is better to have a different function.
>>>    * From the application's point of view it is understood that this is
>>>      a different kind of queue, which shouldn't be used by the
>>>      application.
>> It does not excuse duplicating so much code below. If we have separate
>> dev_info-like limitations for hairpin, it would make sense, but I hope
>> that it would still be possible to avoid code duplication.
>>
> We can start with the most basic implementation, which will mean that the
> function will almost be empty; when other vendors or Mellanox require
> some additional test or code, they will be able to decide whether to add
> new code to the function, or extract the shared code from the standard
> function into a specific function, and use this function in both setup
> functions.
> What do you think?

Let's try and take a look at the code.

[snip]

>>>>> @@ -1769,6 +1793,60 @@ int rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
>>>>> 		struct rte_mempool *mb_pool);
>>>>>
>>>>> /**
>>>>> + * @warning
>>>>> + * @b EXPERIMENTAL: this API may change, or be removed, without prior
>>>>> + * notice
>>>>> + *
>>>>> + * Allocate and set up a hairpin receive queue for an Ethernet device.
>>>>> + *
>>>>> + * The function set up the selected queue to be used in hairpin.
>>>>> + *
>>>>> + * @param port_id
>>>>> + *   The port identifier of the Ethernet device.
>>>>> + * @param rx_queue_id
>>>>> + *   The index of the receive queue to set up.
>>>>> + *   The value must be in the range [0, nb_rx_queue - 1] previously
>>>>> + *   supplied to rte_eth_dev_configure().
>>>> May any Rx queue be set up as a hairpin queue?
>>>> Can it still be used for regular traffic?
>>>>
>>> No, if a queue is used as hairpin it can't be used for normal traffic.
>>> This is also why I like the idea of two different functions, in order
>>> to create this distinction.
>> If so, do we need at least debug-level checks in Tx/Rx burst functions?
>> Is it required to patch the rte_flow RSS action to ensure that Rx queues
>> of only one kind are specified?
>> What about an attempt to add Rx/Tx callbacks for hairpin queues?
>>
> I think the checks should be done at the PMD level, since from a high
> level they are the same.

Sorry, I don't understand why. If something could be checked on the
generic level, it should be done there to avoid duplication in all
drivers.

> Callbacks for Rx/Tx don't make sense, since the idea is to bypass the
> CPU.

If so, I think rte_eth_add_tx_callback() should be patched to return an
error if the specified queue is hairpin. Same for Rx. Any other cases?

>>>>> + * @param nb_rx_desc
>>>>> + *   The number of receive descriptors to allocate for the receive
>>>>> + *   ring.
>>>> Does it still make sense for a hairpin queue?
>>>>
>>> Yes, since it can affect the memory size used by the device, and can
>>> affect performance.
>>>>> + * @param socket_id
>>>>> + *   The *socket_id* argument is the socket identifier in case of
>>>>> + *   NUMA.
>>>>> + *   The value can be *SOCKET_ID_ANY* if there is no NUMA constraint
>>>>> + *   for the DMA memory allocated for the receive descriptors of the
>>>>> + *   ring.
>>>> Is it still required to be provided for a hairpin Rx queue?
>>>>
>>> Yes, for internal PMD structures to be allocated, but we can remove it
>>> if pressed.
>>>>> + * @param rx_conf
>>>>> + *   The pointer to the configuration data to be used for the
>>>>> + *   receive queue.
>>>>> + *   NULL value is allowed, in which case default RX configuration
>>>>> + *   will be used.
>>>>> + *   The *rx_conf* structure contains an *rx_thresh* structure with
>>>>> + *   the values of the Prefetch, Host, and Write-Back threshold
>>>>> + *   registers of the receive ring.
>>>>> + *   In addition it contains the hardware offloads features to
>>>>> + *   activate using the DEV_RX_OFFLOAD_* flags.
>>>>> + *   If an offloading set in rx_conf->offloads
>>>>> + *   hasn't been set in the input argument eth_conf->rxmode.offloads
>>>>> + *   to rte_eth_dev_configure(), it is a new added offloading, it
>>>>> + *   must be per-queue type and it is enabled for the queue.
>>>>> + *   No need to repeat any bit in rx_conf->offloads which has
>>>>> + *   already been enabled in rte_eth_dev_configure() at port level.
>>>>> + *   An offloading enabled at port level can't be disabled at queue
>>>>> + *   level.
>>>> Which offloads still make sense in the case of a hairpin Rx queue?
>>>> What about thresholds, drop enable?
>>>>
>>> Drop and thresholds make sense; for example, the application can state
>>> that, in case of back pressure, packets should start being dropped in
>>> order not to affect the entire NIC.
>>> Regarding offloads, mainly VLAN strip or VLAN insert, but those can
>>> also be used in rte_flow.
>>> But future offloads like QoS or others may be shared.
>> I'm not a fan of dead parameters which are added just to use
>> the same structure. It raises too many questions on maintenance.
>> Also I don't like the idea of sharing hairpin and regular offloads.
>> Maybe it is OK to share the namespace (still unsure), but the
>> capabilities are definitely different and some regular offloads are
>> simply not applicable to the hairpin case.
>>
> I agree with you. I think that my suggestion above (new caps for hairpin)
> solves this issue. Do you agree?
> I will remove the rte_eth_txconf and only have the hairpin_conf with some
> new fields, same for the Rx. Is that O.K.?

I think it would be better to keep only used parameters. Anyway, it is an
experimental API and we can add missing parameters when required.

[snip]

Thanks,
Andrew.
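P.S. To make the capability idea concrete, here is a rough sketch of the
kind of structure and generic-level check I have in mind. All names and
fields below are placeholders for discussion, not a proposal for the final
API:

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder capability structure; names are illustrative only. */
struct rte_eth_hairpin_cap {
	uint16_t max_nb_queues;  /* max hairpin queues per port */
	uint16_t max_rx_2_tx;    /* max Tx queues bound to one Rx queue */
	uint16_t max_tx_2_rx;    /* max Rx queues bound to one Tx queue */
	uint16_t max_nb_desc;    /* max descriptors per hairpin queue */
};

/*
 * Generic-level validation shared by the Rx and Tx hairpin setup paths:
 * reject a request that exceeds the capabilities reported by the driver
 * before any PMD-specific code runs.
 */
int
hairpin_check_caps(const struct rte_eth_hairpin_cap *cap,
		   uint16_t nb_peers, uint16_t nb_desc)
{
	if (nb_peers == 0 || nb_peers > cap->max_rx_2_tx)
		return -1; /* would be -EINVAL in rte_ethdev.c */
	if (nb_desc == 0 || nb_desc > cap->max_nb_desc)
		return -1;
	return 0;
}
```

With something like this, the setup functions can fail consistently across
vendors instead of each PMD inventing its own checks.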
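On the callback point above: the generic-level guard I would expect in
rte_eth_add_tx_callback() (and the Rx equivalent) could look roughly like
this. The queue-state representation is hypothetical, only to illustrate
where the check belongs:

```c
#include <assert.h>

/* Hypothetical per-queue state kept at the generic ethdev level. */
enum queue_kind { QUEUE_REGULAR, QUEUE_HAIRPIN };

struct eth_queue_state {
	enum queue_kind kind;
};

/*
 * Guard to run at the start of rte_eth_add_tx_callback(): hairpin traffic
 * never reaches the CPU, so registering a callback on a hairpin queue is
 * an error at the generic level, not something each PMD should detect.
 */
int
tx_callback_allowed(const struct eth_queue_state *q)
{
	return q->kind != QUEUE_HAIRPIN;
}
```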