From: "Van Haaren, Harry"
To: Pavan Nikhilesh, "jerin.jacob@caviumnetworks.com",
 "santosh.shukla@caviumnetworks.com", "Eads, Gage",
 "hemant.agrawal@nxp.com", "nipun.gupta@nxp.com", "Ma, Liang J"
CC: "dev@dpdk.org"
Date: Thu, 11 Jan 2018 15:47:24 +0000
Subject: Re: [dpdk-dev] [PATCH v3 09/12] app/eventdev: add pipeline queue
 worker functions
In-Reply-To: <20180111135201.ufj3hqh6frncvjpx@Pavan-LT>

> From: Pavan Nikhilesh [mailto:pbhagavatula@caviumnetworks.com]
> Sent: Thursday, January 11, 2018 1:52 PM
> To: Van Haaren, Harry; jerin.jacob@caviumnetworks.com;
> santosh.shukla@caviumnetworks.com; Eads, Gage; hemant.agrawal@nxp.com;
> nipun.gupta@nxp.com; Ma, Liang J
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 09/12] app/eventdev: add pipeline queue
> worker functions
>
> On Thu, Jan 11, 2018 at 12:17:38PM +0000, Van Haaren, Harry wrote:
> > > > Thinking a little more about this, also in light of patch 11/12 of
> > > > this series.
> > > >
> > > > The code here has a "safe" and "unsafe" version of TX. This involves
> > > > adding a spinlock inside the code, which is being locked/unlocked
> > > > before doing the actual TX action.
> > > >
> > > > I don't understand why this is necessary? DPDK's general stance on
> > > > locking for data-path is DPDK functions do not provide locks, and
> > > > that application level must implement thread-synchronization if it
> > > > is required.
> > > >
> > > > In this case, the app/eventdev can be considered an App, but I don't
> > > > like the idea of providing a sample application and code that
> > > > duplicates core functionality with safe/unsafe versions..
> > > >
> > >
> > > Some PMD's (net/octeontx) have capability to do multi-thread safe Tx
> > > where no thread-synchronization is required. This is exposed via the
> > > offload flag 'DEV_TX_OFFLOAD_MT_LOCKFREE'.
> >
> > Yes understood.
> >
> > > So, the _safe Tx functions are selected based on the above offload
> > > capability and when the capability is absent _unsafe Tx functions are
> > > selected i.e. synchronized Tx via spin locks based on the Egress port
> > > id.
> >
> > This part changes the current behavior of the sample app.
> >
> > Currently there is a (SINGLE_LINK | ATOMIC) stage at the end of the
> > pipeline, which performs this "many-to-one" action, allowing a single
> > core to dequeue all TX traffic, and perform the TX operation in a
> > lock-free manner.
> >
> > Changing this to a locking mechanism is going to hurt performance on
> > platforms that do not support TX_OFFLOAD_MT_LOCKFREE.
> >
> > In my opinion, the correct fix is to alter the overall pipeline, and
> > always use lockless TX. Examples below;
> >
> > NO TX_OFFLOAD_MT_LOCKFREE:
> >
> > Eth RX adapter -> stage 1 -> stage 2...(N-1) -> stage N -> stage TX
> > (Atomic | SINGLE_LINK) -> eth TX
>
> Agreed, when we detect that tx is not lockfree the workers would just
> forward the events to (Atomic | SINGLE_LINK) event queue which would be
> dequeued by a service(mt_unsafe) and Tx them lockfree.
>
> > WITH TX_OFFLOAD_MT_LOCKFREE:
> >
> > Eth RX adapter -> stage 1 -> stage 2...(N-1) -> stage N -> eth TX MT
> > Capable
>
> The current lockfree pipeline would remain the same.
> >
> > By configuring the pipeline based on MT_OFFLOAD_LOCKFREE capability flag,
> > and adding the SINGLE_LINK at the end if required, we can support both
> > models without resorting to locked TX functions.
> >
> > I think this will lead to a cleaner and more performant solution.
>
> Thoughts?

A quick summary of the issue here, and then an overview of my
understanding of the proposed solution.

=== Issue ===
Ethdev hardware has a capability flag, DEV_TX_OFFLOAD_MT_LOCKFREE
(MT_LOCKFREE below), which when set means that multiple CPU threads can
safely TX on a single ethdev-queue concurrently (i.e. without locking).
Not all hardware supports this, so applications must be able to
gracefully handle hardware where this capability is not provided.

=== Solution ===
In eventdev pipelines with MT_LOCKFREE capability, the CPU running the
last "worker" stage can also perform the ethdev-TX operation.

In eventdev pipelines without MT_LOCKFREE caps, we use a (Single Link |
Atomic) stage to "fan in" the traffic to a single point, and use a TX
service in order to abstract away the difference in CPU core requirements.

The above solution avoids placing locks in the datapath by modifying the
pipeline design, and the difference in CPU requirements is abstracted by
only registering the TX service if required.

Note that the TX service doesn't need the infrastructure like the RX
adapter, as it is much simpler (dequeue from eventdev port, tx on ethdev
port).

@Pavan, I believe this is the same solution as yours - just making sure
we're aligned!

Cheers,
-Harry
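
For illustration only, the capability-based pipeline selection summarised
under "Solution" could be sketched roughly as below. This is not code from
the patch series: struct pipeline_plan, plan_pipeline() and
TX_MT_LOCKFREE_CAP are hypothetical names, the bit value is a stand-in for
DEV_TX_OFFLOAD_MT_LOCKFREE from rte_ethdev.h, and one event queue per
worker stage is assumed. The sketch only models the decision, it does not
call into DPDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for DEV_TX_OFFLOAD_MT_LOCKFREE; a real application
 * would take the flag from rte_ethdev.h, not from this define. */
#define TX_MT_LOCKFREE_CAP (1ULL << 14)

/* Hypothetical description of how the pipeline should be built. */
struct pipeline_plan {
	unsigned int nb_queues;  /* event queues to configure */
	bool needs_tx_stage;     /* append an (Atomic | SINGLE_LINK) queue */
	bool needs_tx_service;   /* register the TX service */
};

/*
 * Decide the pipeline shape from the ethdev TX offload capabilities.
 * tx_offload_capa is assumed to be the capability bitmask reported for
 * the egress port (e.g. the tx_offload_capa field of the device info).
 * Assumes one event queue per worker stage.
 */
static struct pipeline_plan
plan_pipeline(uint64_t tx_offload_capa, unsigned int nb_worker_stages)
{
	struct pipeline_plan plan;
	bool lockfree;

	assert(nb_worker_stages > 0);

	lockfree = (tx_offload_capa & TX_MT_LOCKFREE_CAP) != 0;

	/* Without MT_LOCKFREE, fan in to one extra single-link TX queue that
	 * a single core (the TX service) dequeues; with it, the last worker
	 * stage transmits directly and nothing extra is needed. */
	plan.nb_queues = nb_worker_stages + (lockfree ? 0 : 1);
	plan.needs_tx_stage = !lockfree;
	plan.needs_tx_service = !lockfree;

	return plan;
}
```

With three worker stages, plan_pipeline(0, 3) asks for four queues plus the
TX service, while plan_pipeline(TX_MT_LOCKFREE_CAP, 3) asks for three
queues and no TX service. Either way no lock ends up in the datapath.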