From: Bing Zhao <bingz@nvidia.com>
To: Ori Kam <orika@nvidia.com>,
    NBU-Contact-Thomas Monjalon <thomas@monjalon.net>,
    "ferruh.yigit@intel.com" <ferruh.yigit@intel.com>,
    "arybchenko@solarflare.com" <arybchenko@solarflare.com>,
    "dev@dpdk.org" <dev@dpdk.org>
Subject: [dpdk-dev] [RFC] introduce support for hairpin between two ports
Date: Fri, 11 Sep 2020 04:51:44 +0000
Message-ID: <CY4PR1201MB0072A4383E611EB8B65D89A6D0240@CY4PR1201MB0072.namprd12.prod.outlook.com>

In the current implementation, hairpin functionality supports only
single port mode (e.g. in the testpmd application). This means that the
traffic is sent out from the same port it arrived on. Some NICs have no
such restriction, and there is strong demand to support two ports
hairpin mode in real-life cases.

Two ports hairpin mode does not mean that hairpin will only support two
ports in a single application. It also needs to keep supporting today's
single port hairpin for compatibility. Here, 'two ports' means that the
ingress and egress ports of the traffic could be different. Also, there
is no restriction that
  1. traffic from the same ingress port must go to the same egress port
  2. traffic from a port that acts as 'egress' for other traffic flows
     must go to their 'ingress' port
The configuration should be flexible and the behavior of traffic will
be decided by the rte flows.

Usually, during the startup phase, all the hairpin configurations except
flows should be done. This means that the hairpin TXQ and its peer RXQ
should be bound together. This is feasible in single port mode and
transparent to the application. In two ports mode, there may be some
problems with configuring and binding the queues.
  1. Once TXQ & RXQ belong to different ports, it would be hard to
     configure the first port while the initialization of the second
     port is not done. Also, it is not proper to configure the first
     port while the second one is starting.
  2. The port could be attached and detached dynamically.
     Hairpin between these ports should support dynamic configuration.

In two ports hairpin mode, the TXQ and RXQ belong to different ports.
If some actions need to be done in the TX part, the egress flow could
be inserted explicitly and managed separately from the RX part. What's
more, one egress flow could be shared by different ingress flows from
the same or different ports.

In order to satisfy these requirements, some changes to the current
rte ethdev and flow APIs are needed and some new APIs will be
introduced.

1. Data structures in 'rte_ethdev.h'
   Two new members are added.

    struct rte_eth_hairpin_conf {
        uint16_t peer_count; /**< The number of peers. */
        struct rte_eth_hairpin_peer peers[RTE_ETH_MAX_HAIRPIN_PEERS];
        uint16_t tx_explicit;
        uint16_t manual_bind;
    };

   'tx_explicit': If 0, the PMD will insert the egress flow in an
   implicit way. If 1, the application will insert it by itself.
   'manual_bind': If 0, the PMD will try to bind the hairpin TXQ and
   its RXQ peer automatically, as in today's single port hairpin mode;
   this is for backward compatibility. If 1, the manual bind API will
   be called.
   As today, the application should ensure there is no conflict in the
   hairpin peer configurations between TX & RX, and the PMD could check
   them internally.
   For the new member 'tx_explicit', all queue pairs from one ingress
   port to the same egress port are suggested to have the same value in
   order not to create chaos, like in RSS cases. The same suggestion
   applies to the new member 'manual_bind'.
   The support for the new members will be decided by the NIC's
   capability and real-life usage from the application.

2. New macros in 'rte_ethdev.h'

    RTE_ETH_HAIRPIN_BIND_AUTO        (0)
    RTE_ETH_HAIRPIN_BIND_MANUAL      (1)
    RTE_ETH_HAIRPIN_TXRULE_IMPLICIT  (0)
    RTE_ETH_HAIRPIN_TXRULE_EXPLICIT  (1)

   These are used for the new members in 'struct rte_eth_hairpin_conf'.

3. New function APIs in 'rte_ethdev.h'
   * int rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port)
   * typedef int (*eth_hairpin_bind)(struct rte_eth_dev *dev,
                                     uint16_t rx_port);
   This function will be used to bind the egress of one port to the
   ingress of the peer port. If 'rx_port' is equal to RTE_MAX_ETHPORTS,
   then all the ports will be traversed to bind the hairpin egress
   queues to all of their configured ingress queues. The application
   needs to call it repeatedly to bind all egress ports.
   This should be called after the hairpin queues are set up and the
   devices are started. If 'manual_bind' is not set, there is no need
   to call this API. A function pointer of type 'eth_hairpin_bind'
   should be provided by the PMD to execute the hardware setting in the
   driver.
   A return value of 0 means success, and a negative value will be
   returned to indicate the actual failure.

   * int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port)
   * typedef int (*eth_hairpin_unbind)(struct rte_eth_dev *dev,
                                       uint16_t rx_port);
   This function will unbind the egress of one port from the ingress of
   the peer port; only one hairpin direction will be unbound. Unbinding
   the opposite direction needs another call of this API. If 'rx_port'
   is equal to RTE_MAX_ETHPORTS, all the ports will be traversed to
   unbind the queues (if any). The application needs to call it
   repeatedly to unbind all egress ports.
   The API could be called without stopping or closing the eth device,
   but the application should ensure that the flows inserted for the
   hairpin port pairs are handled properly. The traffic behavior should
   be predictable after unbinding. It is suggested to remove, on both
   ports, all the flows for the direction of the port pair to be
   unbound.
   A function pointer of type 'eth_hairpin_unbind' should be provided
   by the PMD to execute the hardware setting in the driver.
   A return value of 0 means success, and a negative value will be
   returned to indicate the actual failure.

   After unbinding, the bind API could be called again to enable it.
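As a rough illustration of the bind/unbind semantics described in this section, here is a small self-contained model in C. It does not use DPDK at all: `SKETCH_MAX_PORTS`, `peer_cfg`, `bound` and the `sketch_*` functions are hypothetical stand-ins invented for this sketch, and the only properties taken from the text are that each call affects one direction, that passing the maximum port value as 'rx_port' traverses all configured peers, and that failures are reported as negative values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins only: none of this is DPDK code. */
#define SKETCH_MAX_PORTS 4 /* plays the role of RTE_MAX_ETHPORTS */

/* peer_cfg[tx][rx]: hairpin queues from port tx to port rx are set up. */
static bool peer_cfg[SKETCH_MAX_PORTS][SKETCH_MAX_PORTS];
/* bound[tx][rx]: the tx -> rx hairpin direction is currently bound. */
static bool bound[SKETCH_MAX_PORTS][SKETCH_MAX_PORTS];

/*
 * Set the bound state of one hairpin direction.
 * rx_port == SKETCH_MAX_PORTS means "all configured peers of tx_port".
 */
static int sketch_set_bound(uint16_t tx_port, uint16_t rx_port, bool state)
{
    if (tx_port >= SKETCH_MAX_PORTS || rx_port > SKETCH_MAX_PORTS)
        return -1; /* invalid port index */

    if (rx_port == SKETCH_MAX_PORTS) {
        /* Traverse every port and touch only the configured peers. */
        for (uint16_t p = 0; p < SKETCH_MAX_PORTS; p++)
            if (peer_cfg[tx_port][p])
                bound[tx_port][p] = state;
        return 0;
    }

    if (!peer_cfg[tx_port][rx_port])
        return -1; /* no hairpin queues configured for this pair */

    bound[tx_port][rx_port] = state;
    return 0;
}

/* Bind the egress of tx_port to the ingress of rx_port (one direction). */
static int sketch_hairpin_bind(uint16_t tx_port, uint16_t rx_port)
{
    return sketch_set_bound(tx_port, rx_port, true);
}

/* Unbind the same direction; the reverse direction is left untouched. */
static int sketch_hairpin_unbind(uint16_t tx_port, uint16_t rx_port)
{
    return sketch_set_bound(tx_port, rx_port, false);
}
```

In this model, an application with queues configured for 0 -> 1 and 1 -> 0 would call `sketch_hairpin_bind(0, 1)` and `sketch_hairpin_bind(1, 0)` separately, and unbinding 0 -> 1 would leave 1 -> 0 bound. A real PMD would program hardware instead of flipping a flag.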
   No peer reconfiguring is supported now without closing the devices.

4. New rte_flow item
   * RTE_FLOW_ITEM_TYPE_TX_QUEUE

    struct rte_flow_item_tx_queue {
        uint32_t queue;
    };

   This provides a new item to match on for an egress packet. In two
   ports hairpin mode, since the TX rules could be inserted explicitly
   on the egress port, it is hard to distinguish the hairpin packets
   from the software packets. Even with metadata, it may require
   complex management.
   Support for the new rte_flow item is optional, depending on the
   NIC's capability. With this item, a few wildcard rules could be
   inserted for hairpin to support some common actions.

When switching to two ports hairpin mode with explicit TX rules, the
metadata could be used to provide the 'connection' for a packet
between ingress & egress.
  1. The packet header might be changed by NAT or DECAP in the ingress,
     and the inner header or other parts may be different.
  2. Different ingress flow rules could share the same egress rule to
     simplify rules management.

The rte_flow examples are like below (port 0 RX X -> port 1 TX Y):

    flow create 0 ingress group M pattern eth / ... / end
         actions queue index is X / set_meta data is V / end

X is the ingress hairpin queue index.

    flow create 1 egress group N pattern eth / meta data is V / end
         actions vxlan_encap / end
    flow create 1 egress group 0 pattern eth / tx_queue index is Y / end
         actions jump group N / end

Y is the egress hairpin queue index. This wildcard flow will redirect
all the ethernet packets from hairpin TX queue Y to some specific group
for further handling. Meanwhile, other traffic sent from software will
not be impacted by this wildcard rule.

To verify this in testpmd, some changes are also required.
  1. During the startup phase, hairpin binding will use the chaining
     mode. E.g. if 3 ports are probed, hairpin traffic will be like this
       port A -> port B, port B -> port C, port C -> port A
     If only a single port is probed
       port A -> port A
  2. The flow command line will add support to parse the tx queue index
     pattern format:
       tx_queue index is UNSIGNED / ...

Thanks

Signed-off-by: Bing Zhao <bingz@nvidia.com>
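The chaining mode described for the testpmd startup phase (A -> B, B -> C, C -> A, or A -> A for a single port) amounts to picking the next probed port with wrap-around. A minimal sketch in C, assuming ports are identified by their position 0..nb_ports-1 in the list of probed ports; `sketch_chain_peer` is a hypothetical helper written for this note, not a testpmd function:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of testpmd: return the index of the
 * egress peer for the ingress port at position `port`, given that
 * `nb_ports` ports were probed. The last port wraps around to the first.
 */
static uint16_t sketch_chain_peer(uint16_t port, uint16_t nb_ports)
{
    assert(nb_ports > 0 && port < nb_ports);
    return (uint16_t)((port + 1) % nb_ports);
}
```

With three ports this gives 0 -> 1, 1 -> 2 and 2 -> 0, and with a single port it gives 0 -> 0, which matches the single port hairpin behavior.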