From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 7211DA0471
	for <public@inbox.dpdk.org>; Sun,  8 Sep 2019 08:44:52 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 396AC1F018;
	Sun,  8 Sep 2019 08:44:51 +0200 (CEST)
Received: from EUR03-DB5-obe.outbound.protection.outlook.com
 (mail-eopbgr40054.outbound.protection.outlook.com [40.107.4.54])
 by dpdk.org (Postfix) with ESMTP id B9CAC1BEED
 for <dev@dpdk.org>; Sun,  8 Sep 2019 08:44:49 +0200 (CEST)
Received: from AM4PR05MB3425.eurprd05.prod.outlook.com (10.171.190.15) by
 AM4PR05MB3266.eurprd05.prod.outlook.com (10.171.190.152) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.2220.18; Sun, 8 Sep 2019 06:44:47 +0000
Received: from AM4PR05MB3425.eurprd05.prod.outlook.com
 ([fe80::4c32:a34f:5558:a2c6]) by AM4PR05MB3425.eurprd05.prod.outlook.com
 ([fe80::4c32:a34f:5558:a2c6%7]) with mapi id 15.20.2241.018; Sun, 8 Sep 2019
 06:44:47 +0000
From: Ori Kam <orika@mellanox.com>
To: "Wu, Jingjing" <jingjing.wu@intel.com>, Thomas Monjalon
 <thomas@monjalon.net>, "Yigit, Ferruh" <ferruh.yigit@intel.com>,
 "arybchenko@solarflare.com" <arybchenko@solarflare.com>, Shahaf Shuler
 <shahafs@mellanox.com>, Slava Ovsiienko <viacheslavo@mellanox.com>, Alex
 Rosenbaum <alexr@mellanox.com>
CC: "dev@dpdk.org" <dev@dpdk.org>
Thread-Topic: [dpdk-dev] [RFC] ethdev: support hairpin queue
Thread-Index: AQHVUdxZYjq5dQPUn0m0/551v94NJqccmUsAgAAVIBCAAW5yAIADSmkg
Date: Sun, 8 Sep 2019 06:44:47 +0000
Message-ID: <AM4PR05MB3425A111812BCDC7DF79B799DBB40@AM4PR05MB3425.eurprd05.prod.outlook.com>
References: <1565703468-55617-1-git-send-email-orika@mellanox.com>
 <9BB6961774997848B5B42BEC655768F81150C0CA@SHSMSX103.ccr.corp.intel.com>
 <AM4PR05MB3425DB586BEF6B4D3FF521F0DBBB0@AM4PR05MB3425.eurprd05.prod.outlook.com>
 <9BB6961774997848B5B42BEC655768F81150CDEF@SHSMSX103.ccr.corp.intel.com>
In-Reply-To: <9BB6961774997848B5B42BEC655768F81150CDEF@SHSMSX103.ccr.corp.intel.com>
Accept-Language: en-US
Content-Language: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

Hi Jingjing,

PSB

> -----Original Message-----
> From: Wu, Jingjing <jingjing.wu@intel.com>
> Sent: Friday, September 6, 2019 6:08 AM
> To: Ori Kam <orika@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> <alexr@mellanox.com>
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
>=20
> Hi, Ori
>=20
> Thanks for the explanation. I have more question below.
>=20
> Thanks
> Jingjing
>=20
> > -----Original Message-----
> > From: Ori Kam [mailto:orika@mellanox.com]
> > Sent: Thursday, September 5, 2019 1:45 PM
> > To: Wu, Jingjing <jingjing.wu@intel.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>;
> > Slava Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > <alexr@mellanox.com>
> > Cc: dev@dpdk.org
> > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> >
> > Hi Wu,
> > Thanks for your comments PSB,
> >
> > Ori
> >
> > > -----Original Message-----
> > > From: Wu, Jingjing <jingjing.wu@intel.com>
> > > Sent: Thursday, September 5, 2019 7:01 AM
> > > To: Ori Kam <orika@mellanox.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> > > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>;
> > > Slava Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > > <alexr@mellanox.com>
> > > Cc: dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > >
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ori Kam
> > > > Sent: Tuesday, August 13, 2019 9:38 PM
> > > > To: thomas@monjalon.net; Yigit, Ferruh <ferruh.yigit@intel.com>;
> > > > arybchenko@solarflare.com; shahafs@mellanox.com;
> > > > viacheslavo@mellanox.com; alexr@mellanox.com
> > > > Cc: dev@dpdk.org; orika@mellanox.com
> > > > Subject: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > > >
> > > > This RFC replaces RFC[1].
> > > >
> > > > The hairpin feature (different name can be forward) acts as "bump
> > > > on the wire", meaning that a packet that is received from the wire
> > > > can be modified using offloaded action and then sent back to the
> > > > wire without application intervention which save CPU cycles.
> > > >
> > > > The hairpin is the inverse function of loopback in which
> > > > application sends a packet then it is received again by the
> > > > application without being sent to the wire.
> > > >
> > > > The hairpin can be used by a number of different NVF, for example
> > > > load balancer, gateway and so on.
> > > >
> > > > As can be seen from the hairpin description, hairpin is basically
> > > > RX queue connected to TX queue.
> > > >
> > > > During the design phase I was thinking of two ways to implement
> > > > this feature the first one is adding a new rte flow action. and the
> > > > second one is create a special kind of queue.
> > > >
> > > > The advantages of using the queue approach:
> > > > 1. More control for the application. queue depth (the memory size
> > > > that should be used).
> > > > 2. Enable QoS. QoS is normally a parameter of queue, so in this
> > > > approach it will be easy to integrate with such system.
> > >
> > >
> > > Which kind of QoS?
> >
> > For example latency , packet rate those kinds of makes sense in the
> > queue level.
> > I know we don't have any current support but I think we will have
> > during the next year.
> >
> Where would be the QoS API loading? TM API? Or propose other new?

I think it will be a new API. TM is more about limiting the bandwidth of a
target flow, while QoS should mainly influence the priority between the
queues.

> > >
> > > > 3. Native integration with the rte flow API. Just setting the
> > > > target queue/rss to hairpin queue, will result that the traffic
> > > > will be routed to the hairpin queue.
> > > > 4. Enable queue offloading.
> > > >
> > > Looks like the hairpin queue is just hardware queue, it has no
> > > relationship with host memory. It makes the queue concept a little
> > > bit confusing. And why do we need to setup queues, maybe some info
> > > in eth_conf is enough?
> >
> > Like stated above it makes sense to have queue related parameters.
> > For example I can think of application that most packets are going
> > through that hairpin queue, but some control packets are from the
> > application. So the application can configure the QoS between those
> > two queues. In addition this will enable the application to use the
> > queue like normal queue from rte_flow (see comment below) and every
> > other aspect.
> >
> Yes, it is typical use case. And rte_flow is used to classify to
> different queue?
> If I understand correct, your hairpin queue is using host memory/or
> on-card memory for buffering, but CPU cannot touch it, all the packet
> processing is done by NIC.
> Queue is created, where the queue ID is used? Tx queue ID may be used as
> action of rte_flow? I still don't understand where the hairpin Rx queue
> ID be used.
> In my opinion, if no rx/tx function, it should not be a true queue from
> host view.
>=20

Yes, rte_flow is used to classify the traffic between the queues. To use
the hairpin feature in its basic form, the application just inserts any
ingress flow whose target queue/RSS is a hairpin queue.
For example, assuming that queue index 4 is a hairpin queue, a hairpin flow
will look something like this:
flow create 0 ingress group 0 pattern eth / ipv4 .... / end
    actions decap / encap / queue index 4 / end
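
To make the steering in the flow rule above concrete, here is a small
self-contained sketch. It is a toy model only, not DPDK code and not part
of the RFC: every name in it (toy_port, toy_flow_create, toy_port_receive)
is made up for illustration, the match is reduced to the EtherType, and the
decap/encap actions are left out. It only shows the idea that a packet
steered to a hairpin queue goes back to the wire, while a packet on a
standard queue is left for the application.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_NB_QUEUES 5
#define TOY_ETHER_TYPE_IPV4 0x0800

/* Hypothetical toy model, not DPDK code. */
struct toy_queue {
    int is_hairpin;   /* 1: the NIC forwards packets by itself */
    uint64_t to_app;  /* packets left for the application to poll */
    uint64_t to_wire; /* packets sent back to the wire by the NIC */
};

struct toy_port {
    struct toy_queue queues[TOY_NB_QUEUES];
    int has_rule;
    uint16_t rule_ether_type; /* single match field, host byte order */
    uint16_t rule_queue;      /* target of the "queue index N" action */
};

/* Reset the port and mark one queue as the hairpin queue. */
static void toy_port_init(struct toy_port *port, uint16_t hairpin_queue)
{
    assert(hairpin_queue < TOY_NB_QUEUES);
    memset(port, 0, sizeof(*port));
    port->queues[hairpin_queue].is_hairpin = 1;
}

/* Install one ingress rule; returns 0 on success, -1 on a bad queue. */
static int toy_flow_create(struct toy_port *port, uint16_t ether_type,
                           uint16_t queue)
{
    if (queue >= TOY_NB_QUEUES)
        return -1;
    port->has_rule = 1;
    port->rule_ether_type = ether_type;
    port->rule_queue = queue;
    return 0;
}

/* Classify one received packet and return the queue it landed on. */
static uint16_t toy_port_receive(struct toy_port *port, uint16_t ether_type)
{
    uint16_t q = 0; /* default queue when no rule matches */

    if (port->has_rule && port->rule_ether_type == ether_type)
        q = port->rule_queue;
    if (port->queues[q].is_hairpin)
        port->queues[q].to_wire++; /* no application involvement */
    else
        port->queues[q].to_app++;  /* application has to rx/tx it */
    return q;
}
```

With queue 4 set as hairpin and a rule for TOY_ETHER_TYPE_IPV4 targeting
queue 4, an IPv4 packet is counted in to_wire of queue 4, while any other
packet lands on queue 0 and is counted in to_app.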

I understand, but I don't agree with your point that if there is no rx/tx
function it is not a queue.
With the hairpin queue we are offloading the data path. Unrelated to this
RFC, we are working on a VDPA driver. It is not an ethdev driver, but it
offloads vhost, including the enqueue and dequeue functions[1].

> > >
> > > Not sure how your hardware make the hairpin work? Use rte_flow for
> > > packet modification offload? Then how does HW distribute packets to
> > > those hardware queue, classification? If So, why not just extend
> > > rte_flow with the hairpin action?
> > >
> >
> > You are correct, the application uses rte_flow and just points the
> > traffic to the requested hairpin queue/rss.
> > We could have added a new rte_flow command. The reasons we didn't:
> > 1. Like stated above some of the hairpin makes sense in queue level.
> > 2.  In the near future, we will also want to support hairpin between
> > different ports. This makes much more sense using queues.
> >
> > > > Each hairpin Rxq can be connected Txq / number of Txqs which can
> > > > belong to a different ports assuming the PMD supports it. The same
> > > > goes the other way each hairpin Txq can be connected to one or more
> > > > Rxqs.
> > > > This is the reason that both the Txq setup and Rxq setup are
> > > > getting the hairpin configuration structure.
> > > >
> > > > From PMD perspective the number of Rxq/Txq is the total of standard
> > > > queues + hairpin queues.
> > > >
> > > > To configure hairpin queue the user should call
> > > > rte_eth_rx_hairpin_queue_setup / rte_eth_tx_hairpin_queue_setup
> > > > instead of the normal queue setup functions.
> > >
> > > If the new API introduced to avoid ABI change, would one API
> > > rte_eth_rx_hairpin_setup be enough?
> >
> > I'm not sure I understand your comment.
> > The rx_hairpin_setup was created for two main reasons:
> > 1. Avoid API change.
> > 2. I think it is more correct to use different API since the parameters
> > are different.
> >
> I mean not use queue setup concept, set hairpin feature through one
> hairpin configuration API.
>=20

I'm not sure I understand.
Would an API that looks something like this be better?
int hairpin_bind(uint16_t rx_port, uint16_t rx_queue,
                 struct hairpin_conf *rx_hairpin_conf,
                 uint16_t tx_port, uint16_t tx_queue,
                 struct hairpin_conf *tx_hairpin_conf)

The problem with such an API is that it will cause issues for NICs that
support one-to-many connections, for example a NIC that can connect one rx
queue to 4 tx queues.
Also, we still need to configure the hairpin queues. So if I understand you
correctly, the hairpin queues will not be set up separately and this API
will set them up.
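
As a rough illustration of the one-to-many point above, the sketch below
shows a per-queue setup call that takes a list of peers, so one rx queue
can name 4 tx queues in a single configuration. This is not the RFC API and
not DPDK code: toy_hairpin_conf, toy_hairpin_queue_setup, the peer list
layout and the limit of 4 peers are all assumptions made up for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PEERS 4 /* arbitrary limit chosen for the example */

/* Hypothetical structures, made up for illustration only. */
struct toy_hairpin_peer {
    uint16_t port;
    uint16_t queue;
};

struct toy_hairpin_conf {
    uint16_t peer_count;
    struct toy_hairpin_peer peers[TOY_MAX_PEERS];
};

struct toy_hairpin_queue {
    int configured;
    struct toy_hairpin_conf conf;
};

/*
 * Per-queue setup: each hairpin queue receives its own configuration,
 * including every peer it is connected to. Returns 0 on success and -1
 * on invalid arguments.
 */
static int toy_hairpin_queue_setup(struct toy_hairpin_queue *q,
                                   const struct toy_hairpin_conf *conf)
{
    if (q == NULL || conf == NULL)
        return -1;
    if (conf->peer_count == 0 || conf->peer_count > TOY_MAX_PEERS)
        return -1;
    q->conf = *conf;
    q->configured = 1;
    return 0;
}

/* Return 1 if (port, queue) is one of the configured peers, else 0. */
static int toy_hairpin_has_peer(const struct toy_hairpin_queue *q,
                                uint16_t port, uint16_t queue)
{
    uint16_t i;

    if (q == NULL || !q->configured)
        return 0;
    for (i = 0; i < q->conf.peer_count; i++) {
        if (q->conf.peers[i].port == port &&
            q->conf.peers[i].queue == queue)
            return 1;
    }
    return 0;
}
```

A pairwise bind call with exactly one (rx_port, rx_queue) and one
(tx_port, tx_queue) has no natural place for the extra three peers, which
is why the setup is kept per queue in this sketch.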
=20
> > The reason we have both rx and tx setup functions is that we want the
> > user to have control binding the two queues.
> > It is most important when we will advance to hairpin between ports.
>=20
> Hairpin between ports? It looks like switch but not hairpin, right?

From my understanding, a switch operates between VMs, meaning that traffic
sent from one VM is routed directly to the target VM. This is not the case
with hairpin. With hairpin, traffic comes from the wire and goes back to
the wire; there are no VMs in the system. Example applications for hairpin
are load balancers or gateways, where, for example, one port is connected
to one system and the second port is connected to a second system.
It is the job of the application to check whether the packet should pass
and, if so, to modify it to match the second system, for example by moving
a VXLAN tunnel packet to an MPLS tunnel in the other system.
 =20
> >
> > >
> > > Thanks
> > > Jingjing
> >
> > Thanks,
> > Ori

Thanks,
Ori