From: Ori Kam
To: "Wu, Jingjing", Thomas Monjalon, "Yigit, Ferruh", "arybchenko@solarflare.com", Shahaf Shuler, Slava Ovsiienko, Alex Rosenbaum
CC: "dev@dpdk.org"
Date: Sun, 8 Sep 2019 06:44:47 +0000
References: <1565703468-55617-1-git-send-email-orika@mellanox.com> <9BB6961774997848B5B42BEC655768F81150C0CA@SHSMSX103.ccr.corp.intel.com> <9BB6961774997848B5B42BEC655768F81150CDEF@SHSMSX103.ccr.corp.intel.com>
In-Reply-To: <9BB6961774997848B5B42BEC655768F81150CDEF@SHSMSX103.ccr.corp.intel.com>
Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
List-Id: DPDK patches and discussions

Hi Jingjing,

PSB

> -----Original Message-----
> From: Wu, Jingjing
> Sent: Friday, September 6, 2019 6:08 AM
> To: Ori Kam; Thomas Monjalon; Yigit, Ferruh;
> arybchenko@solarflare.com; Shahaf Shuler; Slava Ovsiienko;
> Alex Rosenbaum
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
>
> Hi, Ori
>
> Thanks for the explanation. I have more questions below.
>
> Thanks
> Jingjing
>
> > -----Original Message-----
> > From: Ori Kam [mailto:orika@mellanox.com]
> > Sent: Thursday, September 5, 2019 1:45 PM
> > To: Wu, Jingjing; Thomas Monjalon; Yigit, Ferruh;
> > arybchenko@solarflare.com; Shahaf Shuler; Slava Ovsiienko;
> > Alex Rosenbaum
> > Cc: dev@dpdk.org
> > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> >
> > Hi Wu,
> > Thanks for your comments PSB,
> >
> > Ori
> >
> > > -----Original Message-----
> > > From: Wu, Jingjing
> > > Sent: Thursday, September 5, 2019 7:01 AM
> > > To: Ori Kam; Thomas Monjalon; Yigit, Ferruh;
> > > arybchenko@solarflare.com; Shahaf Shuler; Slava Ovsiienko;
> > > Alex Rosenbaum
> > > Cc: dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > >
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ori Kam
> > > > Sent: Tuesday, August 13, 2019 9:38 PM
> > > > To: thomas@monjalon.net; Yigit, Ferruh;
> > > > arybchenko@solarflare.com;
> > > > shahafs@mellanox.com; viacheslavo@mellanox.com;
> > > > alexr@mellanox.com
> > > > Cc: dev@dpdk.org; orika@mellanox.com
> > > > Subject: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > > >
> > > > This RFC replaces RFC[1].
> > > >
> > > > The hairpin feature (another name could be forward) acts as a "bump on
> > > > the wire", meaning that a packet that is received from the wire can be
> > > > modified using offloaded actions and then sent back to the wire without
> > > > application intervention, which saves CPU cycles.
> > > >
> > > > The hairpin is the inverse function of loopback, in which the
> > > > application sends a packet that is then received again by the
> > > > application without being sent to the wire.
> > > >
> > > > The hairpin can be used by a number of different NVF, for example load
> > > > balancer, gateway and so on.
> > > >
> > > > As can be seen from the hairpin description, hairpin is basically an RX
> > > > queue connected to a TX queue.
> > > >
> > > > During the design phase I was thinking of two ways to implement this
> > > > feature: the first one is adding a new rte flow action, and the second
> > > > one is creating a special kind of queue.
> > > >
> > > > The advantages of using the queue approach:
> > > > 1. More control for the application: queue depth (the memory size that
> > > > should be used).
> > > > 2. Enable QoS. QoS is normally a parameter of a queue, so in this
> > > > approach it will be easy to integrate with such a system.
> > >
> > >
> > > Which kind of QoS?
> >
> > For example latency, packet rate; those kinds of things make sense at the
> > queue level.
> > I know we don't have any current support but I think we will have it
> > during the next year.
> >
> Where would be the QoS API loading? TM API? Or propose other new?

I think it will be a new API. The TM is more about limiting the bandwidth of a
target flow, while QoS should influence more the priority between the queues.

> > >
> > > > 3.
> > > > Native integration with the rte flow API. Just setting the target
> > > > queue/rss to a hairpin queue will result in the traffic being routed
> > > > to the hairpin queue.
> > > > 4. Enable queue offloading.
> > > >
> > > Looks like the hairpin queue is just a hardware queue, it has no
> > > relationship with host memory. It makes the queue concept a little bit
> > > confusing. And why do we need to set up queues, maybe some info in
> > > eth_conf is enough?
> >
> > Like stated above it makes sense to have queue related parameters.
> > For example I can think of an application where most packets are going
> > through that hairpin queue, but some control packets are from the
> > application. So the application can configure the QoS between those two
> > queues. In addition this will enable the application to use the queue like
> > a normal queue from rte_flow (see comment below) and every other aspect.
> >
> Yes, it is a typical use case. And rte_flow is used to classify to different
> queues?
> If I understand correctly, your hairpin queue is using host memory/or on-card
> memory for buffering, but the CPU cannot touch it, all the packet processing
> is done by the NIC.
> Queue is created, where is the queue ID used? Tx queue ID may be used as an
> action of rte_flow? I still don't understand where the hairpin Rx queue ID
> would be used.
> In my opinion, if there is no rx/tx function, it should not be a true queue
> from the host view.
>

Yes, rte_flow is used to classify the traffic between the queues. To use the
hairpin feature in the basic usage, the application just inserts any ingress
flow whose target queue/RSS is a hairpin queue.
For example, assuming that queue index 4 is a hairpin queue, a hairpin flow
will look something like this:
flow create 0 ingress group 0 pattern eth / ipv4 .... / end actions decap / encap / queue index 4 / end

I understand, but I don't agree with your point that if there is no rx/tx
function it is not a queue.
In a hairpin queue we are offloading the data path. Unrelated to this RFC, we
are working on a VDPA driver. It is not an ethdev driver, but what it does is
offload the vhost, and it offloads the enqueue and dequeue functions[1].

> > >
> > > Not sure how your hardware makes the hairpin work? Use rte_flow for
> > > packet modification offload? Then how does HW distribute packets to
> > > those hardware queues, classification? If so, why not just extend
> > > rte_flow with the hairpin action?
> > >
> >
> > You are correct, the application uses rte_flow and just points the traffic
> > to the requested hairpin queue/rss.
> > We could have added a new rte_flow command. The reasons we didn't:
> > 1. Like stated above, some of the hairpin configuration makes sense at the
> > queue level.
> > 2. In the near future, we will also want to support hairpin between
> > different ports. This makes much more sense using queues.
> >
> > > > Each hairpin Rxq can be connected to a Txq or a number of Txqs, which
> > > > can belong to different ports, assuming the PMD supports it. The same
> > > > goes the other way: each hairpin Txq can be connected to one or more
> > > > Rxqs.
> > > > This is the reason that both the Txq setup and Rxq setup are getting
> > > > the hairpin configuration structure.
> > > >
> > > > From the PMD perspective the number of Rxq/Txq is the total of standard
> > > > queues + hairpin queues.
> > > >
> > > > To configure a hairpin queue the user should call
> > > > rte_eth_rx_hairpin_queue_setup / rte_eth_tx_hairpin_queue_setup
> > > > instead of the normal queue setup functions.
> > >
> > > If the new API is introduced to avoid ABI change, would one API
> > > rte_eth_rx_hairpin_setup be enough?
> >
> > I'm not sure I understand your comment.
> > The rx_hairpin_setup was created for two main reasons:
> > 1. Avoid API change.
> > 2. I think it is more correct to use a different API since the parameters
> > are different.
> >
> I mean not use queue setup concept, set hairpin feature through one hairpin
> configuration API.
>

I'm not sure I understand. Would an API that looks something like this be
better?
int hairpin_bind(uint16_t rx_port, uint16_t rx_queue, struct hairpin_conf *rx_hairpin_conf,
                 uint16_t tx_port, uint16_t tx_queue, struct hairpin_conf *tx_hairpin_conf)
The problem with such an API is that it will cause issues for NICs that
support one-to-many connections, for example a NIC that can connect one rx
queue to 4 tx queues.
Also, we still need to configure the hairpin queue. So, if I understand you
correctly, the hairpin queues will not be set up and this API will set them up.

> > The reason we have both rx and tx setup functions is that we want the user
> > to have control over binding the two queues.
> > It is most important when we will advance to hairpin between ports.
>
> Hairpin between ports? It looks like switch but not hairpin, right?

A switch, from my understanding, is between VMs, meaning traffic sent from one
VM will be routed directly to the target VM. This is not the case with hairpin.
In hairpin, traffic comes from the wire and goes back to the wire. There are no
VMs in the system. Example applications for hairpin are load balancers or
gateways, where, for example, one port is connected to one system and the
second port is connected to a second system.
It is the job of the application to check whether the packet should pass and,
if so, modify it to match the second system, for example moving a VXLAN tunnel
packet to an MPLS tunnel in the other system.

> > >
> > > Thanks
> > > Jingjing
> >
> > Thanks,
> > Ori

Thanks,
Ori
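
P.S. To make the one-to-many point above concrete, here is a rough C sketch of
why a per-queue configuration structure copes with one Rx queue feeding several
Tx queues, which a single bind call taking one rx/tx pair cannot express. It is
only an illustration under assumptions: the layout of struct hairpin_conf, the
hairpin_peer type, HAIRPIN_MAX_PEERS and both helper functions are made up for
this example and are not part of the RFC or of any existing DPDK API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit on how many peers one hairpin queue may be bound to. */
#define HAIRPIN_MAX_PEERS 4

/* Hypothetical peer descriptor: one queue on one port. */
struct hairpin_peer {
	uint16_t port;
	uint16_t queue;
};

/* Hypothetical per-queue hairpin configuration carrying a list of peers. */
struct hairpin_conf {
	uint16_t peer_count;
	struct hairpin_peer peers[HAIRPIN_MAX_PEERS];
};

/* Append one peer to the list; returns 0 on success, -1 if the list is full. */
static int
hairpin_conf_add_peer(struct hairpin_conf *conf, uint16_t port, uint16_t queue)
{
	assert(conf != NULL);
	if (conf->peer_count >= HAIRPIN_MAX_PEERS)
		return -1;
	conf->peers[conf->peer_count].port = port;
	conf->peers[conf->peer_count].queue = queue;
	conf->peer_count++;
	return 0;
}

/*
 * Reset conf and describe one Rx queue fanned out to n consecutive Tx queues
 * on tx_port, starting at first_txq. Returns -1 if n exceeds the peer limit.
 */
static int
hairpin_conf_fan_out(struct hairpin_conf *conf, uint16_t tx_port,
		     uint16_t first_txq, uint16_t n)
{
	uint16_t i;

	assert(conf != NULL);
	conf->peer_count = 0;
	for (i = 0; i < n; i++) {
		if (hairpin_conf_add_peer(conf, tx_port,
					  (uint16_t)(first_txq + i)) != 0)
			return -1;
	}
	return 0;
}
```

With this shape, one Rx queue fanned out to Tx queues 4 to 7 on port 0 is
described by hairpin_conf_fan_out(&conf, 0, 4, 4), and the resulting structure
would then be passed to the Rx hairpin queue setup call. A pair-based bind
function would need four separate calls and some way to say they belong
together.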