From: Bing Zhao <bingz@nvidia.com>
To: thomas@monjalon.net, orika@nvidia.com, ferruh.yigit@intel.com, arybchenko@solarflare.com
Cc: dev@dpdk.org
Date: Sun, 13 Sep 2020 23:48:56 +0800
Message-Id: <1600012140-70151-1-git-send-email-bingz@nvidia.com>
Subject: [dpdk-dev] [RFC PATCH v2 0/4] introduce support for hairpin between two ports

Hairpin functionality only supports single port mode (e.g. the testpmd
application) in the current implementation. This means that the traffic
will be sent out from the same port it comes in on. Some NICs have no
such restriction, and there is strong demand to support a two ports
hairpin mode in real-life cases.

Two ports hairpin mode does not mean that hairpin will only support two
ports in a single application. Indeed, single port hairpin still needs
to be supported for backward compatibility. 'Two ports' means that the
ingress and egress ports of the traffic could be different. There is
also no restriction that:
1. traffic from the same ingress port must go to the same egress port;
2. traffic from a port that acts as 'egress' for other traffic flows
   must go to their 'ingress' port.
The configuration should be flexible, and the traffic behavior will be
decided by the rte_flow rules.

Usually, during the startup phase, all the hairpin configuration except
the flows should be done, i.e. a hairpin TXQ and its peer RXQ should be
bound together. This is feasible in single port mode and transparent to
the application. In two ports mode, there may be some problems with
queue configuration and binding:
1. Once the TXQ & RXQ belong to different ports, it is hard to
   configure the first port while the initialization of the second port
   is not done. It is also not proper to configure the first port while
   the second one is starting.
2. Ports could be attached and detached dynamically. Hairpin between
   such ports should support dynamic configuration.

In two ports hairpin mode, the TXQ and RXQ belong to different ports.
If some actions need to be done in the TX part, the egress flow could
be inserted explicitly and managed separately from the RX part. What's
more, one egress flow could be shared by different ingress flows from
the same or different ports.

In order to support this, some changes to the current rte_ethdev and
rte_flow APIs are needed, and some new APIs will be introduced.

1. Data structures in 'rte_ethdev.h'

Two new members are added:

    struct rte_eth_hairpin_conf {
        uint16_t peer_count; /**< The number of peers. */
        struct rte_eth_hairpin_peer peers[RTE_ETH_MAX_HAIRPIN_PEERS];
        uint16_t tx_explicit;
        uint16_t manual_bind;
    };

'tx_explicit': if 0, the PMD will insert the egress flow in an implicit
way; if 1, the application will insert it by itself.
'manual_bind': if 0, the PMD will try to bind the hairpin TXQ and RXQ
peers automatically, as in today's single port hairpin mode, for
backward compatibility; if 1, the manual bind API will be called.
The application should ensure that there is no conflict between the
TX & RX hairpin peer configurations, as today, and the PMD could check
them internally. For the new member 'tx_explicit', all queue pairs from
one ingress port to the same egress port are suggested to have the same
value, in order not to create chaos, as in RSS cases. The same
suggestion applies to the new member 'manual_bind'. Support for the new
members depends on the NICs' capabilities and the application's
real-life usage.

2. New macros in 'rte_ethdev.h'

    RTE_ETH_HAIRPIN_BIND_AUTO       (0)
    RTE_ETH_HAIRPIN_BIND_MANUAL     (1)
    RTE_ETH_HAIRPIN_TXRULE_IMPLICIT (0)
    RTE_ETH_HAIRPIN_TXRULE_EXPLICIT (1)

These are used for the new members in 'struct rte_eth_hairpin_conf'.

3. New function APIs in 'rte_ethdev.h'

* int rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port)
* typedef int (*eth_hairpin_bind)(struct rte_eth_dev *dev, uint16_t rx_port);

This function binds one port's egress to the peer port's ingress. If
'rx_port' is equal to RTE_MAX_ETHPORTS, all ports will be traversed to
bind the hairpin egress queues to all of their configured ingress
queues. The application needs to call it repeatedly to bind all egress
ports. This should be called after the hairpin queues are set up and
the devices are started. If 'manual_bind' is not specified, there is no
need to call this API. A function pointer of type 'eth_hairpin_bind'
should be provided by the PMD to execute the hardware setting in the
driver. A return value of 0 means success; a negative value indicates
the actual failure.

* int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port)
* typedef int (*eth_hairpin_unbind)(struct rte_eth_dev *dev, uint16_t rx_port);

This function unbinds one port's egress from the peer port's ingress;
only one direction of the hairpin will be unbound. Unbinding the
opposite direction needs another call of this API.
If 'rx_port' is equal to RTE_MAX_ETHPORTS, all ports will be traversed
to unbind the queues (if any). The application needs to call it
repeatedly to unbind all egress ports. The API could be called without
stopping or closing the eth device, but the application should ensure
that the flows inserted for the hairpin port pairs are handled
properly. The traffic behavior should be predictable after unbinding.
It is suggested to remove, on both ports, all the flows for the
direction of the port pair being unbound. A function pointer of type
'eth_hairpin_unbind' should be provided by the PMD to execute the
hardware setting in the driver. A return value of 0 means success; a
negative value indicates the actual failure. After unbinding, the bind
API could be called again to re-enable the hairpin. Reconfiguring the
peers without closing the devices is not supported for now.

4. New rte_flow item

* RTE_FLOW_ITEM_TYPE_TX_QUEUE

    struct rte_flow_item_tx_queue {
        uint32_t queue;
    };

This provides a new item to match on for an egress packet. In two
ports hairpin mode, since the TX rules could be inserted explicitly on
the egress port, it is hard to distinguish hairpin packets from
software-generated packets. Even with metadata, this may require
complex management. Support for the new rte_flow item is optional,
depending on the NIC's capabilities. With this item, a few wildcard
rules could be inserted for hairpin to support some common actions.

When switching to two ports hairpin mode with explicit TX rules, the
metadata could be used to provide the 'connection' for a packet between
ingress & egress:
1. The packet header might be changed due to NAT or DECAP in the
   ingress, so the inner header or other parts may be different.
2. Different ingress flow rules could share the same egress rule to
   simplify rule management.
The rte_flow examples are like below (port 0 RX X -> port 1 TX Y):

    flow create 0 ingress group M pattern eth / … / end
        actions queue index is X / set_meta data is V / end

X is the ingress hairpin queue index.

    flow create 1 egress group N pattern eth / meta data is V / end
        actions vxlan_encap / end
    flow create 1 egress group 0 pattern eth / tx_queue index is Y / end
        actions jump group N / end

Y is the egress hairpin queue index. This wildcard flow will help to
redirect all the Ethernet packets from hairpin TX queue Y to a specific
group for further handling. Meanwhile, other traffic sent from software
will not be impacted by this wildcard rule.

To verify this in testpmd, some changes are also required:
1. During the startup phase, hairpin binding will use chaining mode.
   E.g. if 3 ports are probed, the hairpin traffic will be:
       port A -> port B, port B -> port C, port C -> port A
   If only a single port is probed:
       port A -> port A
2. The flow command line will add support for parsing the TX queue
   index pattern, in the format: tx_queue index is UNSIGNED / ...

Thanks

Signed-off-by: Bing Zhao <bingz@nvidia.com>

Bing Zhao (4):
  ethdev: add support for flow item transmit queue
  testpmd: add item transmit queue in flow CLI
  ethdev: add hairpin bind APIs
  ethdev: add new attributes to hairpin queues config

 app/test-pmd/cmdline_flow.c           |  18 ++++++
 lib/librte_ethdev/rte_ethdev.c        | 100 ++++++++++++++++++++++++++++++++++
 lib/librte_ethdev/rte_ethdev.h        |  68 +++++++++++++++++++++++
 lib/librte_ethdev/rte_ethdev_driver.h |  52 ++++++++++++++++++
 lib/librte_ethdev/rte_flow.c          |   1 +
 lib/librte_ethdev/rte_flow.h          |  30 ++++++++++
 6 files changed, 269 insertions(+)

-- 
2.5.5