From: Ahmed Mansour
To: "Trahe, Fiona", "Verma, Shally", "dev@dpdk.org"
CC: "Athreya, Narayana Prasad", "Gupta, Ashish", "Sahu, Sunila", "De Lara Guarch, Pablo", "Challa, Mahipal", "Jain, Deepak K", Hemant Agrawal, Roy Pledge, Youri Querry
Date: Wed, 21 Feb 2018 19:35:35 +0000
Thread-Topic: [RFC v2] doc compression API for DPDK
Subject: Re: [dpdk-dev] [RFC v2] doc compression API for DPDK

On 2/21/2018 9:35 AM, Trahe, Fiona wrote:
> Hi Ahmed, Shally,
>
>> -----Original Message-----
>> From: Ahmed Mansour [mailto:ahmed.mansour@nxp.com]
>> Sent: Tuesday, February 20, 2018 7:56 PM
>> To: Verma, Shally ; Trahe, Fiona ; dev@dpdk.org
>> Cc: Athreya, Narayana Prasad ; Gupta, Ashish ;
>> Sahu, Sunila ; De Lara Guarch, Pablo ; Challa, Mahipal ;
>> Jain, Deepak K ; Hemant Agrawal ; Roy Pledge ;
>> Youri Querry
>> Subject: Re: [RFC v2] doc compression API for DPDK
>>
>> /// snip ///
>>>>>>>>>>>>>>>>> D.2.1 Stateful operation state maintenance
>>>>>>>>>>>>>>>>> ---------------------------------------------------------------
>>>>>>>>>>>>>>>>> It is always an ideal expectation from the application that it should
>>>>>>>>>>>>>>>>> parse through all related chunks of source data, making its mbuf-chain,
>>>>>>>>>>>>>>>>> and enqueue it for stateless processing.
>>>>>>>>>>>>>>>>> However, if it needs to break it into several enqueue_burst() calls,
>>>>>>>>>>>>>>>>> then an expected call flow would be something like:
>>>>>>>>>>>>>>>>> enqueue_burst( |op.no_flush |)
>>>>>>>>>>>>>>>> [Ahmed] The work is now in flight to the PMD. The user will call dequeue
>>>>>>>>>>>>>>>> burst in a loop until all ops are received. Is this correct?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> deque_burst(op) // should dequeue before we enqueue next
>>>>>>>>>>>>>>> [Shally] Yes. Ideally every submitted op needs to be dequeued. However
>>>>>>>>>>>>>>> this illustration is specifically in the context of stateful op
>>>>>>>>>>>>>>> processing to reflect that if a stream is broken into chunks, then each
>>>>>>>>>>>>>>> chunk should be submitted as one op at a time with type = STATEFUL and
>>>>>>>>>>>>>>> needs to be dequeued first before the next chunk is enqueued.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> enqueue_burst( |op.no_flush |)
>>>>>>>>>>>>>>>>> deque_burst(op) // should dequeue before we enqueue next
>>>>>>>>>>>>>>>>> enqueue_burst( |op.full_flush |)
>>>>>>>>>>>>>>>> [Ahmed] Why not allow multiple work items in flight? I understand that
>>>>>>>>>>>>>>>> occasionally there will be an OUT_OF_SPACE exception.
>>>>>>>>>>>>>>>> Can we just distinguish the response in exception cases?
>>>>>>>>>>>>>>> [Shally] Multiple ops are allowed in flight, however the condition is
>>>>>>>>>>>>>>> that each op in such a case is independent of the others, i.e. they
>>>>>>>>>>>>>>> belong to different streams altogether.
>>>>>>>>>>>>>>> Earlier (as part of the RFC v1 doc) we did consider the proposal to
>>>>>>>>>>>>>>> process all related chunks of data in a single burst by passing them as
>>>>>>>>>>>>>>> an ops array, but later found that not so useful for PMD handling for
>>>>>>>>>>>>>>> various reasons. You may please refer to the RFC v1 doc review comments
>>>>>>>>>>>>>>> for the same.
>>>>>>>>>>>>>> [Fiona] Agree with Shally. In summary, as only one op can be processed at
>>>>>>>>>>>>>> a time, since each needs the state of the previous, allowing more than 1
>>>>>>>>>>>>>> op to be in flight at a time would force PMDs to implement internal
>>>>>>>>>>>>>> queueing and exception handling for the OUT_OF_SPACE conditions you
>>>>>>>>>>>>>> mention.
>>>>>>>>>>>> [Ahmed] But we are putting the ops on qps which would make them
>>>>>>>>>>>> sequential. Handling OUT_OF_SPACE conditions would be a little bit more
>>>>>>>>>>>> complex but doable.
>>>>>>>>>>> [Fiona] In my opinion this is not doable, and could be very inefficient.
>>>>>>>>>>> There may be many streams.
>>>>>>>>>>> The PMD would have to have an internal queue per stream so
>>>>>>>>>>> it could adjust the next src offset and length in the OUT_OF_SPACE case.
>>>>>>>>>>> And this may ripple back through all subsequent ops in the stream as each
>>>>>>>>>>> source len is increased and its dst buffer is not big enough.
>>>>>>>>>> [Ahmed] Regarding multi-op OUT_OF_SPACE handling:
>>>>>>>>>> the caller would still need to adjust the src length/output buffer as you say.
>>>>>>>>>> The PMD cannot handle OUT_OF_SPACE internally.
>>>>>>>>>> After OUT_OF_SPACE occurs, the PMD should reject all ops in this stream
>>>>>>>>>> until it gets explicit confirmation from the caller to continue working on
>>>>>>>>>> this stream. Any ops received by the PMD should be returned to the caller
>>>>>>>>>> with status STREAM_PAUSED, since the caller did not explicitly acknowledge
>>>>>>>>>> that it has solved the OUT_OF_SPACE issue.
>>>>>>>>>> These semantics can be enabled by adding a new function to the API,
>>>>>>>>>> perhaps stream_resume().
>>>>>>>>>> This allows the caller to indicate that it acknowledges that it has seen
>>>>>>>>>> the issue, and this op should be used to resolve the issue. Implementations
>>>>>>>>>> that do not support this mode of use can push back immediately after one op
>>>>>>>>>> is in flight. Implementations that support this use mode can allow many ops
>>>>>>>>>> from the same session.
>>>>>>>>>>
>>>>>>>>> [Shally] Is it still in the context of having a single burst where all ops
>>>>>>>>> belong to one stream? If yes, I would still say it would add an overhead to
>>>>>>>>> PMDs, especially if it is expected to work closer to HW (which I think is
>>>>>>>>> the case with DPDK PMDs).
>>>>>>>>> Though your approach is doable, why can this all not be in a layer above the PMD? i.e.
>>>>>>>>> a layer above the PMD can either pass one op at a time with burst size = 1,
>>>>>>>>> OR can make a chained mbuf of input and output and pass that as one op.
>>>>>>>>> Is it just to ease the application's chained-mbuf burden, or do you see any
>>>>>>>>> performance/use-case impacting aspect also?
>>>>>>>>>
>>>>>>>>> If it is in the context where each op belongs to a different stream in a
>>>>>>>>> burst, then why do we need stream_pause and resume? It is an expectation
>>>>>>>>> from the app to pass more output buffer with consumed + 1 from the next
>>>>>>>>> call onwards, as it has already seen OUT_OF_SPACE.
>>>>>>> [Ahmed] Yes, this would add extra overhead to the PMD. Our PMD
>>>>>>> implementation rejects all ops that belong to a stream that has entered
>>>>>>> "RECOVERABLE" state for one reason or another. The caller must
>>>>>>> acknowledge explicitly that it has received news of the problem before
>>>>>>> the PMD allows this stream to exit the "RECOVERABLE" state. I agree with you
>>>>>>> that implementing this functionality in the software layer above the PMD
>>>>>>> is a bad idea, since the latency reductions are lost.
>>>>>> [Shally] Just reiterating, I rather meant the other way around, i.e. I see
>>>>>> it easier to put all such complexity in a layer above the PMD.
>>>>>>
>>>>>>> This setup is useful in latency-sensitive applications where the latency
>>>>>>> of buffering multiple ops into one op is significant. We found latency
>>>>>>> makes a significant difference in search applications where the PMD
>>>>>>> competes with software decompression.
>>>>> [Fiona] I see, so when all goes well, you get best-case latency, but when
>>>>> out-of-space occurs latency will probably be worse.
>>>> [Ahmed] This is exactly right. This use mode assumes out-of-space is a
>>>> rare occurrence.
>>>> Recovering from it should take similar time to
>>>> synchronous implementations. The caller gets OUT_OF_SPACE_RECOVERABLE in
>>>> both sync and async use. The caller can fix up the op and send it back
>>>> to the PMD to continue work, just as would be done in sync. Nonetheless,
>>>> the added complexity is not justifiable if out-of-space is very common,
>>>> since the recoverable state will be the limiting factor that forces
>>>> synchronicity.
>>>>>>>> [Fiona] I still have concerns with this and would not want to support it in our PMD.
>>>>>>>> To make sure I understand, you want to send a burst of ops, with several
>>>>>>>> from the same stream.
>>>>>>>> If one causes OUT_OF_SPACE_RECOVERABLE, then the PMD should not process any
>>>>>>>> subsequent ops in that stream.
>>>>>>>> Should it return them in a dequeue_burst() with status still NOT_PROCESSED?
>>>>>>>> Or somehow drop them? How?
>>>>>>>> While still processing ops from other streams.
>>>>>>> [Ahmed] This is exactly correct. It should return them with
>>>>>>> NOT_PROCESSED. Yes, the PMD should continue processing other streams.
>>>>>>>> As we want to offload each op to hardware with as little CPU processing as
>>>>>>>> possible, we would not want to open up each op to see which stream it's
>>>>>>>> attached to and make decisions to do per-stream storage, or drop it, or
>>>>>>>> bypass hw and dequeue without processing.
>>>>>>> [Ahmed] I think I might have missed your point here, but I will try to
>>>>>>> answer. There is no need to "cushion" ops in DPDK. DPDK should send ops
>>>>>>> to the PMD and the PMD should reject until stream_continue() is called.
>>>>>>> The next op to be sent by the user will have a special marker in it to
>>>>>>> inform the PMD to continue working on this stream.
>>>>>>> Alternatively the
>>>>>>> DPDK layer can be made "smarter" to fail during the enqueue by checking
>>>>>>> the stream and its state, but like you say this adds additional CPU
>>>>>>> overhead during the enqueue.
>>>>>>> I am curious. In a simple synchronous use case, how do we prevent users
>>>>>>> from putting multiple ops in flight that belong to a single stream? Do
>>>>>>> we just currently say it is undefined behavior? Otherwise we would have
>>>>>>> to check the stream and incur the CPU overhead.
>>>>> [Fiona] We don't do anything to prevent it. It's undefined. IMO on the data
>>>>> path in the DPDK model we expect good behaviour and don't have to error
>>>>> check for things like this.
>>>> [Ahmed] This makes sense. We also assume good behavior.
>>>>> In our PMD, if we got a burst of 20 ops, we would allocate 20 spaces on the
>>>>> hw q, then build and send those messages. If we found an op from a stream
>>>>> which already had one in flight, we'd have to hold that back, store it in a
>>>>> sw stream-specific holding queue, and only send 19 to hw. We cannot send
>>>>> multiple ops from the same stream to the hw, as it fans them out and does
>>>>> them in parallel.
>>>>> Once the enqueue_burst() returns, there is no processing
>>>>> context which would spot that the first has completed
>>>>> and send the next op to the hw. On a dequeue_burst() we would spot this,
>>>>> and in that context could process the next op in the stream.
>>>>> On out of space, instead of processing the next op we would have to transfer
>>>>> all unprocessed ops from the stream to the dequeue result.
>>>>> Some parts of this are doable, but it seems likely to add a lot more
>>>>> latency; we'd need to add extra threads and timers to move ops from the sw
>>>>> queue to the hw q to get any benefit, and these constructs would add
>>>>> context switching and CPU cycles. So we prefer to push this responsibility
>>>>> to above the API, and it can achieve something similar.
>>>> [Ahmed] I see what you mean. Our workflow is almost exactly the same
>>>> with our hardware, but the fanning out is done by the hardware based on
>>>> the stream, and ops that belong to the same stream are never allowed to
>>>> go out of order. Otherwise the data would be corrupted. Likewise the
>>>> hardware is responsible for checking the state of the stream and
>>>> returning frames as NOT_PROCESSED to the software.
>>>>>>>> Maybe we could add a capability if this behaviour is important for you?
>>>>>>>> e.g. ALLOW_ENQUEUE_MULTIPLE_STATEFUL_OPS ?
>>>>>>>> Our PMD would set this to 0, and expect no more than one op from a
>>>>>>>> stateful stream to be in flight at any time.
>>>>>>> [Ahmed] That makes sense. This way the different DPDK implementations do
>>>>>>> not have to add extra checking for unsupported cases.
>>>>>> [Shally] @ahmed, if I summarise your use-case, is this how you want the PMD
>>>>>> to support it?
>>>>>> - a burst *carries only one stream* and all ops are then assumed to belong
>>>>>> to that stream? (please note, here a burst is not carrying more than one
>>>>>> stream)
>>>> [Ahmed] No. In this use case the caller sets up an op and enqueues a
>>>> single op. Then before the response comes back from the PMD the caller
>>>> enqueues a second op on the same stream.
>>>>>> - PMD will submit one op at a time to HW?
>>>> [Ahmed] I misunderstood what PMD means. I used it throughout to mean the
>>>> HW. I used DPDK to mean the software implementation that talks to the
>>>> hardware.
>>>> The software will submit all ops immediately. The hardware has to figure
>>>> out what to do with the ops depending on what stream they belong to.
>>>>>> - if processed successfully, push it back to the completion queue with
>>>>>> status = SUCCESS.
>>>>>> If it failed or ran into OUT_OF_SPACE, then push it to the completion queue
>>>>>> with status = FAILURE / OUT_OF_SPACE_RECOVERABLE, and the rest with status =
>>>>>> NOT_PROCESSED, and return with enqueue count = total # of ops submitted
>>>>>> originally with the burst?
>>>> [Ahmed] This is exactly what I had in mind. All ops will be submitted to
>>>> the HW. The HW will put all of them on the completion queue with the
>>>> correct status exactly as you say.
>>>>>> - app assumes all have been enqueued, so it goes and dequeues all ops
>>>>>> - on seeing an op with OUT_OF_SPACE_RECOVERABLE, app resubmits a burst of
>>>>>> ops with a call to the stream_continue/resume API starting from the op
>>>>>> which encountered OUT_OF_SPACE, and the others as NOT_PROCESSED with
>>>>>> updated input and output buffers?
>>>> [Ahmed] Correct, this is what we do today in our proprietary API.
>>>>>> - repeat until *all* are dequeued with status = SUCCESS or *any* with
>>>>>> status = FAILURE? If at any time a failure is seen, does the app then start
>>>>>> the whole processing all over again, or just drop this burst?!
>>>> [Ahmed] The app has the choice on how to proceed. If the issue is
>>>> recoverable then the application can continue this stream from where it
>>>> stopped. If the failure is unrecoverable then the application should
>>>> first fix the problem and start from the beginning of the stream.
>>>>>> If all of the above is true, then I think we should add another API such as
>>>>>> rte_comp_enque_single_stream(), which will be functional under Feature Flag
>>>>>> = ALLOW_ENQUEUE_MULTIPLE_STATEFUL_OPS, or a better name is
>>>>>> SUPPORT_ENQUEUE_SINGLE_STREAM?!
>>>> [Ahmed] The main advantage in async use is lost if we force all related
>>>> ops to be in the same burst. If we do that, then we might as well merge
>>>> all the ops into one op.
>>>> That would reduce the overhead.
>>>> The use mode I am proposing is only useful in cases where the data
>>>> becomes available after the first enqueue occurred. I want to allow the
>>>> caller to enqueue the second set of data as soon as it is available,
>>>> regardless of whether or not the HW has already started working on the
>>>> first op in flight.
>>> [Shally] @ahmed, OK... seems I missed a point here. So, confirm the following
>>> for me:
>>>
>>> As per the current description in the doc, the expected stateful usage is:
>>> enqueue(op1) --> dequeue(op1) --> enqueue(op2)
>>>
>>> but you're suggesting to allow an option to change it to
>>>
>>> enqueue(op1) --> enqueue(op2)
>>>
>>> i.e. multiple ops from the same stream can be put in flight via subsequent
>>> enqueue_burst() calls without waiting to dequeue the previous ones, if the
>>> PMD supports it. So, no change to the current definition of a burst. It will
>>> still carry multiple streams, where each op belongs to a different stream?!
>> [Ahmed] Correct. I guess a user could put two ops on the same burst that
>> belong to the same stream. In that case it would be more efficient to
>> merge the ops using scatter-gather. Nonetheless, I would not add checks
>> in my implementation to limit that use. The hardware does not perceive a
>> difference between ops that came in one burst and ops that came in two
>> different bursts. To the hardware they are all ops. What matters is
>> which stream each op belongs to.
>>> If yes, then it seems your HW can be set up for multiple streams, so it is
>>> efficient for your case to support it in the DPDK PMD layer, but our hw
>>> doesn't by default and needs SW to back it.
>>> Given that, I also suggest to enable it under some feature flag.
>>> However it looks like an add-on, and if it doesn't change the current
>>> definition of a burst and the minimum expectation set on stateful processing
>>> described in this document, then IMO you can propose this feature as an
>>> incremental patch on the baseline version, in the absence of which
>>> the application will exercise stateful processing as described here
>>> (enq->deq->enq). Thoughts?
>> [Ahmed] Makes sense. I was worried that there might be fundamental
>> limitations to this mode of use in the API design. That is why I wanted
>> to share this use mode with you guys and see if it can be accommodated
>> using an incremental patch in the future.
>>>>> [Fiona] Am curious about Ahmed's response to this. I didn't get that a
>>>>> burst should carry only one stream, or how this makes a difference, as
>>>>> there can be many enqueue_burst() calls done before a dequeue_burst().
>>>>> Maybe you're thinking the enqueue_burst() would be a blocking call that
>>>>> would not return until all the ops had been processed? This would turn it
>>>>> into a synchronous call, which isn't the intent.
>>>> [Ahmed] Agreed, a blocking or even a buffering software layer that
>>>> babysits the hardware does not fundamentally change the parameters of the
>>>> system as a whole. It just moves workflow management complexity down
>>>> into the DPDK software layer.
>>>> Rather, there are real latency and
>>>> throughput advantages (because of caching) that I want to expose.
>>>>
> [Fiona] OK, so I think we've agreed that this can be an option, as long as it's
> not required of PMDs and is enabled under an explicit capability - named
> something like ALLOW_ENQUEUE_MULTIPLE_STATEFUL_OPS.
> @Ahmed, we'll leave it up to you to define the details.
> What's necessary is API text to describe the expected behaviour on any error
> conditions, the pause/resume API, whether an API is expected to clean up if
> resume doesn't happen and if there's any time limit on this, etc.
> But I wouldn't expect any changes to the existing burst APIs, and all PMDs and
> applications must be able to handle the default behaviour, i.e. with this
> capability disabled.
> Specifically, even if a PMD has this capability, if an application ignores it
> and only sends one op at a time, and the PMD returns OUT_OF_SPACE_RECOVERABLE,
> the stream should not be in a paused state and the PMD should not wait for a
> resume() to handle the next op sent for that stream.
> Does that make sense?
[Ahmed] That makes sense. When this mode is enabled, then additional
functions must be called to resume the work, even if only one op was in
flight. When this mode is not enabled, then the PMD assumes that the
caller will never enqueue a stateful op before receiving a response to
the one that precedes it in a stream.
>
>>>> /// snip ///