From: "Van Haaren, Harry"
To: Jerin Jacob
Cc: dev@dpdk.org, "Richardson, Bruce"
Date: Wed, 29 Mar 2017 08:28:12 +0000
Subject: Re: [dpdk-dev] [PATCH v5 06/20] event/sw: add support for event queues

> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Tuesday, March 28, 2017 6:36 PM
> To: Van Haaren, Harry
> Cc: dev@dpdk.org; Richardson, Bruce
> Subject: Re: [PATCH v5 06/20] event/sw: add support for event queues
>
> > > A few questions, for everyone's benefit:
> > >
> > > 1) Does RTE_EVENT_QUEUE_CFG_SINGLE_LINK have any meaning other than an
> > > event queue linked only to a single port? Based on the discussions, it
> > > was added to the header file so that the SW PMD can know up front that
> > > only a single port will be linked to the given event queue. It was added
> > > as an optimization for the SW PMD. Does it have any functional
> > > expectation?
> >
> > In the context of the SW PMD, SINGLE_LINK means that a specific queue and
> > port have a unique relationship in that there is only one connection. This
> > allows bypassing of the Atomic, Ordering and Load-Balancing code. The
> > result is a good performance increase, particularly if the worker port
> > dequeue depth is large, as then large bursts of packets can be dequeued
> > with little overhead.
> >
> > As a result, (ATOMIC | SINGLE_LINK) is not a supported combination for the
> > SW PMD queue types. To be more precise, SINGLE_LINK is its own queue type,
> > and cannot be OR-ed with any other type.
> >
> > > 2) Based on the following topology given in the documentation patch for
> > > queue-based event pipelining,
> > >
> > >   rx_port    w1_port
> > >          \  /       \
> > >           qid0 - w2_port - qid1
> > >               \        /       \
> > >                w3_port          tx_port
> > >
> > > a) I understand, rx_port is feeding events to qid0.
> > > b) But do you see any issue with the following model? IMO, it scales
> > > well linearly based on the number of cores available to work (since it
> > > is ATOMIC to ATOMIC). Nothing wrong with qid1 just connecting to
> > > tx_port; I am just trying to understand the rationale behind it.
> > >
> > >   rx_port    w1_port          w1_port
> > >          \  /       \        /
> > >           qid0 - w2_port - qid1 - w2_port
> > >               \        /        \
> > >                w3_port           w3_port
> >
> > This is also a valid model from the SW eventdev.
>
> OK. If I understand it correctly, on the above topology, even though you
> make qid2 ATOMIC, the SW PMD will not maintain ingress order when events
> come out of qid1 on different workers.

If qid0 is ORDERED, and qid1 is Atomic, then the following happens:
- after qid0, the packets are sprayed across cores,
- they are returned out of order by the worker cores,
- *at the start* of qid1, packets are re-ordered back into ingress order
  (maintaining 100% of the ordering),
- on dequeue from qid1, the atomic flow distribution will keep order per flow.

> A SINGLE_LINK queue with one port attached scheme is required at the end of
> the pipeline, or wherever ordering has to be maintained. Is my understanding
> correct?

Not quite, the SINGLE_LINK is not required at the end - we just see it as
useful for common use cases. If not useful, there is no reason (due to the SW
PMD) for an application to create this SINGLE_LINK to finish the pipeline. If
you have three cores that wish to TX, the above pipeline is 100% valid in the
SW PMD case.

> > The value of using a SINGLE_LINK at the end of a pipeline is:
> > A) it can TX all traffic on a single core (using a single queue)
> > B) re-ordering of traffic from the previous stage is possible
> >
> > To illustrate (B), a very simple pipeline here:
> >
> > RX port -> QID #1 (Ordered) -> workers (eg 4 ports) -> QID #2 (SINGLE_LINK to tx) -> TX port
> >
> > Here, QID #1 is allowed to send the packets out of order to the 4 worker
> > ports - because they are later passed back to the eventdev for re-ordering
> > before they get to the SINGLE_LINK stage, and are then TX-ed in the correct
> > order.
> >
> > > 3)
> > > > Does anybody have a need for a queue to be both Atomic *and*
> > > > Single-link? I understand the current API doesn't prohibit it, but I
> > > > don't see the actual use-case in which that may be useful. Atomic
> > > > implies load-balancing is occurring, single link implies there is only
> > > > one consuming core. Those seem like opposites to me?
> > >
> > > I can think about the following use case:
> > >
> > > topology:
> > >
> > >   rx_port    w1_port
> > >          \  /       \
> > >           qid0 - w2_port - qid1
> > >               \        /       \
> > >                w3_port          tx_port
> > >
> > > Use case:
> > >
> > > Queue-based event pipelining:
> > > ORDERED (Stage 1) to ATOMIC (Stage 2) pipeline:
> > > - For ingress order maintenance
> > > - For executing Stage 1 in parallel for better scaling,
> > >   i.e. a fat flow can spray over N cores while maintaining the ingress
> > >   order when it is sent out on the wire (after consuming from tx_port)
> > >
> > > I am not sure how the SW PMD works in the use case of ingress order
> > > maintenance.
> >
> > I think my illustration of (B) above is the same use-case as you have
> > here. Instead of using an ATOMIC stage 2, the SW PMD benefits from using
> > the SINGLE_LINK port/queue, and the SINGLE_LINK queue ensures ingress
> > order is also egress order to the TX port.
> >
> > > But the HW and the header file expect this form:
> > > Snippet from the header file:
> > > --
> > >  * The source flow ordering from an event queue is maintained when events
> > >  * are enqueued to their destination queue within the same ordered flow
> > >  * context.
> > >  *
> > >  * Events from the source queue appear in their original order when
> > >  * dequeued from a destination queue.
> > > --
> > > Here qid0 is the source queue with ORDERED sched_type and qid1 is the
> > > destination queue with ATOMIC sched_type. qid1 can be linked to only one
> > > port (tx_port).
> > >
> > > Are we on the same page? If not, let me know the differences, and we will
> > > try to accommodate the same in the header file.
> >
> > Yes, I think we are saying the same thing, using slightly different words.
> >
> > To summarize:
> > - The SW PMD sees SINGLE_LINK as its own queue type, and does not support
> >   load-balanced (Atomic, Ordered, Parallel) queue functionality on it.
> > - The SW PMD would use a SINGLE_LINK queue/port for the final stage of a
> >   pipeline
> >   A) to allow re-ordering to happen if required
> >   B) to merge traffic from multiple ports into a single stream for TX
> >
> > A possible solution:
> > 1) The application creates a SINGLE_LINK for the purpose of ensuring
> >    re-ordering is taking place as expected, and links only one port for TX.
>
> The only issue is that in the low-end cores case it won't scale. The TX core
> will become a bottleneck, and we need to have different pipelines based on
> the amount of traffic (40G or 10G) a core can handle.

See above - the SINGLE_LINK isn't required to maintain ordering. Using
multiple TX cores is also valid in the SW PMD.

> > 2) SW PMDs can create a SINGLE_LINK queue type, and benefit from the
> >    optimization
>
> Yes.
>
> > 3) HW PMDs can ignore the "SINGLE_LINK" aspect and use an ATOMIC instead
> >    (as per your example in 3) above)
>
> But the topology will be fixed for both HW and SW. An extra port and an
> extra core need to be wasted for the ordering business in the HW case.
> Right?

Nope, no wasting cores, see above :) The SINGLE_LINK is just an easy way to
"fan in" traffic from lots of cores to one core (in a performant way in SW) to
allow a single core to do TX. A typical use-case might be putting RX and TX on
the same core - TX is just a dequeue from a port with a SINGLE_LINK queue, and
an enqueue to the NIC.
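To make the "fan in" idea concrete, here is a rough sketch of how an
application might set up that pipeline and run the TX core. It is illustrative
only: the device/queue/port IDs are made up, and the config flag and struct
field names reflect my reading of the eventdev header as it currently stands
in this patchset, so treat them as assumptions rather than the final API.

#include <stdint.h>
#include <rte_eventdev.h>
#include <rte_ethdev.h>

/* Hypothetical IDs, for illustration only. */
#define QID_WORKERS   0  /* ordered, load-balanced stage for the workers */
#define QID_TX        1  /* SINGLE_LINK stage feeding the one TX core    */
#define TX_EVENT_PORT 3  /* the single event port linked to QID_TX       */
#define ETH_TX_PORT   0  /* NIC port used by the TX core                 */

/* Create the two queues: an ORDERED stage that sprays packets over the
 * workers, followed by a SINGLE_LINK "fan in" stage for TX. */
static int
setup_pipeline_queues(uint8_t dev_id)
{
	struct rte_event_queue_conf qconf = {
		.nb_atomic_flows = 1024,
		.nb_atomic_order_sequences = 1024,
		.priority = RTE_EVENT_DEV_PRIORITY_NORMAL,
	};
	uint8_t q = QID_TX;
	uint8_t prio = RTE_EVENT_DEV_PRIORITY_NORMAL;
	int ret;

	/* Stage 1: ORDERED. Workers may return events out of order; the
	 * eventdev restores ingress order before the next stage. */
	qconf.event_queue_cfg = RTE_EVENT_QUEUE_CFG_ORDERED_ONLY;
	ret = rte_event_queue_setup(dev_id, QID_WORKERS, &qconf);
	if (ret < 0)
		return ret;

	/* Stage 2: SINGLE_LINK. Its own queue type - not OR-ed with
	 * atomic/ordered/parallel - and exactly one port gets linked. */
	qconf.event_queue_cfg = RTE_EVENT_QUEUE_CFG_SINGLE_LINK;
	ret = rte_event_queue_setup(dev_id, QID_TX, &qconf);
	if (ret < 0)
		return ret;

	/* Link only the TX event port to the single-link queue. */
	ret = rte_event_port_link(dev_id, TX_EVENT_PORT, &q, &prio, 1);
	return ret == 1 ? 0 : -1;
}

/* TX core loop: dequeue the already re-ordered stream from the single-link
 * port and enqueue it straight to the NIC. */
static void
tx_core_loop(uint8_t dev_id)
{
	struct rte_event ev[32];

	for (;;) {
		uint16_t i, n;

		n = rte_event_dequeue_burst(dev_id, TX_EVENT_PORT, ev, 32, 0);
		for (i = 0; i < n; i++) {
			/* Retry until the NIC accepts the mbuf. */
			while (rte_eth_tx_burst(ETH_TX_PORT, 0,
						&ev[i].mbuf, 1) == 0)
				;
		}
	}
}

The only point the sketch is trying to show is the shape of the design: the
SINGLE_LINK queue has exactly one port linked to it, and that port is the only
place the TX core dequeues from before handing the (already re-ordered) stream
to the NIC.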
Summary from the SW PMD point-of-view:
- SINGLE_LINK is its own queue type
- SINGLE_LINK queue can NOT schedule according to (Atomic, Ordered or
  Parallel) rules

Is that acceptable from an API and HW point of view?
If so, I will send a new patch for the API to specify more clearly what
SINGLE_LINK is.
If not, I'm open to using a capability flag to solve the problem, but my
understanding right now is that there is no need.

> I think, we can roll out something based on capability.

Yes, if required that would be a good solution.

> > The application doesn't have to change anything, and just configures its
> > pipeline. The PMD is able to optimize if it makes sense (SW) or just use
> > another queue type to provide the same functionality to the application
> > (HW).
> >
> > Thoughts? -Harry