From: "Verma, Shally"
To: Ahmed Mansour; "Trahe, Fiona"; dev@dpdk.org
CC: "Athreya, Narayana Prasad"; "Gupta, Ashish"; "Sahu, Sunila"; "De Lara Guarch, Pablo"; "Challa, Mahipal"; "Jain, Deepak K"; Hemant Agrawal; Roy Pledge; Youri Querry
Thread-Topic: [RFC v2] doc compression API for DPDK
Date: Tue, 20 Feb 2018 09:58:15 +0000
Subject: Re: [dpdk-dev] [RFC v2] doc compression API for DPDK
>-----Original Message-----
>From: Ahmed Mansour [mailto:ahmed.mansour@nxp.com]
>Sent: 17 February 2018 02:52
>To: Trahe, Fiona; Verma, Shally; dev@dpdk.org
>Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
>Subject: Re: [RFC v2] doc compression API for DPDK
>
>>> -----Original Message-----
>>> From: Verma, Shally [mailto:Shally.Verma@cavium.com]
>>> Sent: Friday, February 16, 2018 7:17 AM
>>> To: Ahmed Mansour; Trahe, Fiona; dev@dpdk.org
>>> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
>>> Subject: RE: [RFC v2] doc compression API for DPDK
>>>
>>> Hi Fiona, Ahmed
>>>
>>>> -----Original Message-----
>>>> From: Ahmed Mansour [mailto:ahmed.mansour@nxp.com]
>>>> Sent: 16 February 2018 02:40
>>>> To: Trahe, Fiona; Verma, Shally; dev@dpdk.org
>>>> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
>>>> Subject: Re: [RFC v2] doc compression API for DPDK
>>>>
>>>> On 2/15/2018 1:47 PM, Trahe, Fiona wrote:
>>>>> Hi Shally, Ahmed,
>>>>> Sorry for the delay in replying,
>>>>> Comments below
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Verma, Shally [mailto:Shally.Verma@cavium.com]
>>>>>> Sent: Wednesday, February 14, 2018 7:41 AM
>>>>>> To: Ahmed Mansour; Trahe, Fiona; dev@dpdk.org
>>>>>> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal
; Roy Pledge; Youri Querry
>>>>>> Subject: RE: [RFC v2] doc compression API for DPDK
>>>>>>
>>>>>> Hi Ahmed,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Ahmed Mansour [mailto:ahmed.mansour@nxp.com]
>>>>>>> Sent: 02 February 2018 01:53
>>>>>>> To: Trahe, Fiona; Verma, Shally; dev@dpdk.org
>>>>>>> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
>>>>>>> Subject: Re: [RFC v2] doc compression API for DPDK
>>>>>>>
>>>>>>> On 1/31/2018 2:03 PM, Trahe, Fiona wrote:
>>>>>>>> Hi Ahmed, Shally,
>>>>>>>>
>>>>>>>> ///snip///
>>>>>>>>>>>>>> D.1.1 Stateless and OUT_OF_SPACE
>>>>>>>>>>>>>> ------------------------------------------------
>>>>>>>>>>>>>> OUT_OF_SPACE is a condition when output buffer runs out of space
>>>>>>>>>>>>>> and where PMD still has more data to produce. If PMD run into such condition,
>>>>>>>>>>>>>> then it's an error condition in stateless processing.
>>>>>>>>>>>>>> In such case, PMD resets itself and return with status
>>>>>>>>>>>>>> RTE_COMP_OP_STATUS_OUT_OF_SPACE with produced=consumed=0
>>>>>>>>>>>>>> i.e. no input read, no output written.
>>>>>>>>>>>>>> Application can resubmit a full input with larger output buffer size.
>>>>>>>>>>>>> [Ahmed] Can we add an option to allow the user to read the data that
>>>>>>>>>>>>> was produced while still reporting OUT_OF_SPACE? this is mainly useful for
>>>>>>>>>>>>> decompression applications doing search.
>>>>>>>>>>>> [Shally] It is there but applicable for stateful operation type (please refer to
>>>>>>>>>>>> handling out_of_space under "Stateful Section").
>>>>>>>>>>>> By definition, "stateless" here means that application (such as IPCOMP)
>>>>>>>>>>>> knows maximum output size guaranteedly and ensure that uncompressed data size cannot grow more
>>>>>>>>>>>> than provided output buffer.
>>>>>>>>>>>> Such apps can submit an op with type = STATELESS and provide full input,
>>>>>>>>>>>> then PMD assume it has sufficient input and output and thus doesn't need to maintain any contexts
>>>>>>>>>>>> after op is processed.
>>>>>>>>>>>> If application doesn't know about max output size, then it should process it
>>>>>>>>>>>> as stateful op i.e. setup op with type = STATEFUL and attach a stream so that PMD can maintain
>>>>>>>>>>>> relevant context to handle such condition.
>>>>>>>>>>> [Fiona] There may be an alternative that's useful for Ahmed, while still
>>>>>>>>>>> respecting the stateless concept.
>>>>>>>>>>> In Stateless case where a PMD reports OUT_OF_SPACE in decompression case
>>>>>>>>>>> it could also return consumed=0, produced = x, where x>0. X indicates the
>>>>>>>>>>> amount of valid data which has been written to the output buffer. It is not complete,
>>>>>>>>>>> but if an application wants to search it it may be sufficient.
>>>>>>>>>>> If the application still wants the data it must resubmit the whole input with a
>>>>>>>>>>> bigger output buffer, and decompression will be repeated from the start, it
>>>>>>>>>>> cannot expect to continue on as the PMD has not maintained state, history or data.
>>>>>>>>>>> I don't think there would be any need to indicate this in capabilities, PMDs
>>>>>>>>>>> which cannot provide this functionality would always return produced=consumed=0,
>>>>>>>>>>> while PMDs which can could set produced > 0.
>>>>>>>>>>> If this works for you both, we could consider a similar case for compression.
>>>>>>>>>>>
>>>>>>>>>> [Shally] Sounds Fine to me.
Though then in that case, consumed should also be updated to actual
>>>>>>>>>> consumed by PMD.
>>>>>>>>>> Setting consumed = 0 with produced > 0 doesn't correlate.
>>>>>>>>> [Ahmed] I like Fiona's suggestion, but I also do not like the implication
>>>>>>>>> of returning consumed = 0. At the same time returning consumed = y
>>>>>>>>> implies to the user that it can proceed from the middle. I prefer the
>>>>>>>>> consumed = 0 implementation, but I think a different return is needed to
>>>>>>>>> distinguish it from OUT_OF_SPACE that the user can recover from. Perhaps
>>>>>>>>> OUT_OF_SPACE_RECOVERABLE and OUT_OF_SPACE_TERMINATED. This also allows
>>>>>>>>> future PMD implementations to provide recover-ability even in STATELESS
>>>>>>>>> mode if they so wish. In this model STATELESS or STATEFUL would be a
>>>>>>>>> hint for the PMD implementation to make optimizations for each case, but
>>>>>>>>> it does not force the PMD implementation to limit functionality if it
>>>>>>>>> can provide recover-ability.
>>>>>>>> [Fiona] So you're suggesting the following:
>>>>>>>> OUT_OF_SPACE - returned only on stateful operation. Not an error. Op.produced
>>>>>>>>     can be used and next op in stream should continue on from op.consumed+1.
>>>>>>>> OUT_OF_SPACE_TERMINATED - returned only on stateless operation.
>>>>>>>>     Error condition, no recovery possible.
>>>>>>>>     consumed=produced=0. Application must resubmit all input data with
>>>>>>>>     a bigger output buffer.
>>>>>>>> OUT_OF_SPACE_RECOVERABLE - returned only on stateless operation, some recovery possible.
>>>>>>>>     - consumed = 0, produced > 0. Application must resubmit all input data with
>>>>>>>>       a bigger output buffer. However in decompression case, data up to produced
>>>>>>>>       in dst buffer may be inspected/searched. Never happens in compression
>>>>>>>>       case as output data would be meaningless.
>>>>>>>>     - consumed > 0, produced > 0.
PMD has stored relevant state and history and so
>>>>>>>>       can convert to stateful, using op.produced and continuing from consumed+1.
>>>>>>>> I don't expect our PMDs to use this last case, but maybe this works for others?
>>>>>>>> I'm not convinced it's not just adding complexity. It sounds like a version of stateful
>>>>>>>> without a stream, and maybe less efficient?
>>>>>>>> If so should it respect the FLUSH flag? Which would have been FULL or FINAL in the op.
>>>>>>>> Or treat it as FLUSH_NONE or SYNC? I don't know why an application would not
>>>>>>>> simply have submitted a STATEFUL request if this is the behaviour it wants?
>>>>>>> [Ahmed] I was actually suggesting the removal of OUT_OF_SPACE entirely
>>>>>>> and replacing it with
>>>>>>> OUT_OF_SPACE_TERMINATED - returned only on stateless operation.
>>>>>>>     Error condition, no recovery possible.
>>>>>>>     - consumed=0, produced=amount of data produced. Application must
>>>>>>>       resubmit all input data with
>>>>>>>       a bigger output buffer to process all of the op
>>>>>>> OUT_OF_SPACE_RECOVERABLE - Normally returned on stateful operation. Not
>>>>>>>     an error. Op.produced
>>>>>>>     can be used and next op in stream should continue on from op.consumed+1.
>>>>>>>     - consumed > 0, produced > 0. PMD has stored relevant state and history
>>>>>>>       and so can continue using op.produced and continuing from consumed+1.
>>>>>>>
>>>>>>> We would not return OUT_OF_SPACE_RECOVERABLE in stateless mode in our
>>>>>>> implementation either.
>>>>>>>
>>>>>>> Regardless of speculative future PMDs, the more important aspect of this
>>>>>>> for today is that the return status clearly determines
>>>>>>> the meaning of "consumed". If it is RECOVERABLE then consumed is
>>>>>>> meaningful. If it is TERMINATED then consumed is meaningless.
>>>>>>> This way we take away the ambiguity of having OUT_OF_SPACE mean two
>>>>>>> different user work flows.
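[Editor's sketch] The TERMINATED/RECOVERABLE split proposed above boils down to: the status alone tells the application whether `consumed`/`produced` are trustworthy and what to do next. A minimal self-contained C sketch of that application-side decision is below; the `COMP_STATUS_*` and `APP_*` names are hypothetical stand-ins for the codes under discussion, not existing rte_comp symbols.

```c
#include <assert.h>

/* Hypothetical status codes mirroring the proposal in this thread;
 * these names are NOT part of the current rte_comp API. */
enum comp_status {
    COMP_STATUS_SUCCESS,
    COMP_STATUS_OUT_OF_SPACE_TERMINATED,  /* stateless: consumed is meaningless */
    COMP_STATUS_OUT_OF_SPACE_RECOVERABLE, /* stateful: consumed/produced are valid */
};

enum app_action {
    APP_DONE,                    /* op completed, nothing to redo */
    APP_RESUBMIT_ALL,            /* resubmit full input with a bigger dst buffer */
    APP_CONTINUE_FROM_CONSUMED,  /* supply more dst, continue at op.consumed+1 */
};

/* Decide what the application should do with a dequeued op, following the
 * semantics sketched above: TERMINATED forces a full restart; RECOVERABLE
 * lets the stream continue from where the PMD stopped. */
static enum app_action
next_action(enum comp_status status)
{
    switch (status) {
    case COMP_STATUS_OUT_OF_SPACE_TERMINATED:
        return APP_RESUBMIT_ALL;
    case COMP_STATUS_OUT_OF_SPACE_RECOVERABLE:
        return APP_CONTINUE_FROM_CONSUMED;
    default:
        return APP_DONE;
    }
}
```

The point of routing on status rather than on `consumed > 0` is exactly Ahmed's argument: the return code, not the counters, defines the workflow.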
>>>>>>>
>>>>>>> A speculative future PMD may be designed to return RECOVERABLE for
>>>>>>> stateless ops that are attached to streams.
>>>>>>> A future PMD may look to see if an op has a stream attached and write
>>>>>>> out the state there and go into recoverable mode.
>>>>>>> In essence this leaves the choice up to the implementation and allows
>>>>>>> the PMD to take advantage of stateless optimizations
>>>>>>> so long as a "RECOVERABLE" scenario is rarely hit. The PMD will dump
>>>>>>> context as soon as it fully processes an op. It will only
>>>>>>> write context out in cases where the op chokes.
>>>>>>> This futuristic PMD should ignore the FLUSH since this is STATELESS mode
>>>>>>> as indicated by the user, and optimize.
>>>>>> [Shally] IMO, it looks okay to have two separate return codes TERMINATED and RECOVERABLE with
>>>>>> definition as you mentioned and seem doable.
>>>>>> So then it mean all following conditions:
>>>>>> a. stateless with flush = full/final, no stream pointer provided, PMD can return TERMINATED i.e. user
>>>>>> has to start all over again, it's a failure (as in current definition)
>>>>>> b. stateless with flush = full/final, stream pointer provided, here it's up to PMD to return either
>>>>>> TERMINATED or RECOVERABLE depending upon its ability (note if Recoverable, then PMD will maintain
>>>>>> states in stream pointer)
>>>>>> c. stateful with flush = full / NO_SYNC, stream pointer always there, PMD will return
>>>>>> TERMINATED/RECOVERABLE depending on STATEFUL_COMPRESSION/DECOMPRESSION feature flag
>>>>>> enabled or not
>>>>> [Fiona] I don't think the flush flag is relevant - it could be out of space on any flush flag, and if out of
>>>>> space should ignore the flush flag.
>>>>> Is there a need for TERMINATED? - I didn't think it would ever need to be returned in stateful case.
>>>>> Why the ref to feature flag?
If a PMD doesn't support a feature I think it should fail the op - not with
>>>>> out-of-space, but unsupported or similar. Or it would fail on stream creation.
>>>> [Ahmed] Agreed with Fiona. The flush flag only matters on success. By
>>>> definition the PMD should return OUT_OF_SPACE_RECOVERABLE in stateful
>>>> mode when it runs out of space.
>>>> @Shally If the user did not provide a stream, then the PMD should
>>>> probably return TERMINATED every time. I am not sure we should make a
>>>> "really smart" PMD which returns RECOVERABLE even if no stream pointer
>>>> was given. In that case the PMD must give some ID back to the caller
>>>> that the caller can use to "recover" the op. I am not sure how it would
>>>> be implemented in the PMD and when does the PMD decide to retire streams
>>>> belonging to dead ops that the caller decided not to "recover".
>>>>>> and one more exception case is:
>>>>>> d. stateless with flush = full, no stream pointer provided, PMD can return RECOVERABLE i.e. PMD
>>>>>> internally maintained that state somehow and consumed & produced > 0, so user can start consumed+1
>>>>>> but there's restriction on user not to alter or change op until it is fully processed?!
>>>>> [Fiona] Why the need for this case?
>>>>> There's always a restriction on user not to alter or change op until it is fully processed.
>>>>> If a PMD can do this - why doesn't it create a stream when that API is called - and then it's same as b?
>>>> [Ahmed] Agreed. The user should not touch an op once enqueued until they
>>>> receive it in dequeue. We ignore the flush in stateless mode. We assume
>>>> it to be final every time.
>>> [Shally] Agreed and am not in favour of supporting such implementation either. Just listed out different
>>> possibilities up here to better visualise Ahmed requirements/applicability of TERMINATED and
>>> RECOVERABLE.
>>>
>>>>>> API currently takes care of case a and c, and case b can be supported if specification accept another
>>>>>> proposal which mention optional usage of stream with stateless.
>>>>> [Fiona] API has this, but as we agreed, not optional to call the create_stream() with an op_type
>>>>> parameter (stateful/stateless). PMD can return NULL or provide a stream, if the latter then that
>>>>> stream must be attached to ops.
>>>>>
>>>>>> Until then API takes no difference to
>>>>>> case b and c i.e. we can have op such as,
>>>>>> - type = stateful with flush = full/final, stream pointer provided, PMD can return
>>>>>> TERMINATED/RECOVERABLE according to its ability
>>>>>>
>>>>>> Case d, is something exceptional, if there's requirement in PMDs to support it, then believe it will be
>>>>>> doable with concept of different return code.
>>>>>>
>>>>> [Fiona] That's not quite how I understood it. Can it be simpler and only following cases?
>>>>> a. stateless with flush = full/final, no stream pointer provided, PMD can return TERMINATED i.e. user
>>>>>    has to start all over again, it's a failure (as in current definition).
>>>>>    consumed = 0, produced = amount of data produced. This is usually 0, but in decompression
>>>>>    case a PMD may return > 0 and application may find it useful to inspect that data.
>>>>> b. stateless with flush = full/final, stream pointer provided, here it's up to PMD to return either
>>>>>    TERMINATED or RECOVERABLE depending upon its ability (note if Recoverable, then PMD will
>>>>>    maintain states in stream pointer)
>>>>> c. stateful with flush = any, stream pointer always there, PMD will return RECOVERABLE.
>>>>>    op.produced can be used and next op in stream should continue on from op.consumed+1.
>>>>>    Consumed=0, produced=0 is an unusual but allowed case. I'm not sure if it could ever happen, but
>>>>>    no need to change state to TERMINATED in this case.
There may be useful state/history
>>>>> stored in the PMD, even though no output produced yet.
>>>> [Ahmed] Agreed
>>> [Shally] Sounds good.
>>>
>>>>>>>>>>>>>> D.2 Compression API Stateful operation
>>>>>>>>>>>>>> ----------------------------------------------------------
>>>>>>>>>>>>>> A Stateful operation in DPDK compression means application invokes
>>>>>>>>>>>>>> enqueue_burst() multiple times to process related chunk of data either
>>>>>>>>>>>>>> because
>>>>>>>>>>>>>> - Application broke data into several ops, and/or
>>>>>>>>>>>>>> - PMD ran into out_of_space situation during input processing
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In case of either one or all of the above conditions, PMD is required to
>>>>>>>>>>>>>> maintain state of op across enque_burst() calls and
>>>>>>>>>>>>>> ops are setup with op_type RTE_COMP_OP_STATEFUL, and begin with
>>>>>>>>>>>>>> flush value = RTE_COMP_NO/SYNC_FLUSH and end at flush value
>>>>>>>>>>>>>> RTE_COMP_FULL/FINAL_FLUSH.
>>>>>>>>>>>>>> D.2.1 Stateful operation state maintenance
>>>>>>>>>>>>>> ---------------------------------------------------------------
>>>>>>>>>>>>>> It is always an ideal expectation from application that it should parse
>>>>>>>>>>>>>> through all related chunk of source data making its mbuf-chain and enqueue
>>>>>>>>>>>>>> it for stateless processing.
>>>>>>>>>>>>>> However, if it need to break it into several enqueue_burst() calls, then
>>>>>>>>>>>>>> an expected call flow would be something like:
>>>>>>>>>>>>>> enqueue_burst( |op.no_flush |)
>>>>>>>>>>>>> [Ahmed] The work is now in flight to the PMD. The user will call dequeue
>>>>>>>>>>>>> burst in a loop until all ops are received. Is this correct?
>>>>>>>>>>>>>
>>>>>>>>>>>>>> deque_burst(op) // should dequeue before we enqueue next
>>>>>>>>>>>> [Shally] Yes. Ideally every submitted op need to be dequeued.
However
>>>>>>>>>>>> this illustration is specifically in
>>>>>>>>>>>> context of stateful op processing to reflect if a stream is broken into
>>>>>>>>>>>> chunks, then each chunk should be
>>>>>>>>>>>> submitted as one op at-a-time with type = STATEFUL and need to be
>>>>>>>>>>>> dequeued first before next chunk is enqueued.
>>>>>>>>>>>>
>>>>>>>>>>>>>> enqueue_burst( |op.no_flush |)
>>>>>>>>>>>>>> deque_burst(op) // should dequeue before we enqueue next
>>>>>>>>>>>>>> enqueue_burst( |op.full_flush |)
>>>>>>>>>>>>> [Ahmed] Why not allow multiple work items in flight? I understand that
>>>>>>>>>>>>> occasionally there will be OUT_OF_SPACE exception. Can we just distinguish
>>>>>>>>>>>>> the response in exception cases?
>>>>>>>>>>>> [Shally] Multiples ops are allowed in flight, however condition is each op in
>>>>>>>>>>>> such case is independent of each other i.e. belong to different streams altogether.
>>>>>>>>>>>> Earlier (as part of RFC v1 doc) we did consider the proposal to process all
>>>>>>>>>>>> related chunks of data in single
>>>>>>>>>>>> burst by passing them as ops array but later found that as not-so-useful for
>>>>>>>>>>>> PMD handling for various
>>>>>>>>>>>> reasons. You may please refer to RFC v1 doc review comments for same.
>>>>>>>>>>> [Fiona] Agree with Shally. In summary, as only one op can be processed at a
>>>>>>>>>>> time, since each needs the
>>>>>>>>>>> state of the previous, to allow more than 1 op to be in-flight at a time would
>>>>>>>>>>> force PMDs to implement internal queueing and exception handling for
>>>>>>>>>>> OUT_OF_SPACE conditions you mention.
>>>>>>>>> [Ahmed] But we are putting the ops on qps which would make them
>>>>>>>>> sequential. Handling OUT_OF_SPACE conditions would be a little bit more
>>>>>>>>> complex but doable.
>>>>>>>> [Fiona] In my opinion this is not doable, could be very inefficient.
>>>>>>>> There may be many streams.
>>>>>>>> The PMD would have to have an internal queue per stream so
>>>>>>>> it could adjust the next src offset and length in the OUT_OF_SPACE case.
>>>>>>>> And this may ripple back through all subsequent ops in the stream as each
>>>>>>>> source len is increased and its dst buffer is not big enough.
>>>>>>> [Ahmed] Regarding multi op OUT_OF_SPACE handling.
>>>>>>> The caller would still need to adjust
>>>>>>> the src length/output buffer as you say. The PMD cannot handle
>>>>>>> OUT_OF_SPACE internally.
>>>>>>> After OUT_OF_SPACE occurs, the PMD should reject all ops in this stream
>>>>>>> until it gets explicit
>>>>>>> confirmation from the caller to continue working on this stream. Any ops
>>>>>>> received by
>>>>>>> the PMD should be returned to the caller with status STREAM_PAUSED since
>>>>>>> the caller did not
>>>>>>> explicitly acknowledge that it has solved the OUT_OF_SPACE issue.
>>>>>>> These semantics can be enabled by adding a new function to the API,
>>>>>>> perhaps stream_resume().
>>>>>>> This allows the caller to indicate that it acknowledges that it has seen
>>>>>>> the issue and this op
>>>>>>> should be used to resolve the issue. Implementations that do not support
>>>>>>> this mode of use
>>>>>>> can push back immediately after one op is in flight. Implementations
>>>>>>> that support this use
>>>>>>> mode can allow many ops from the same session.
>>>>>>>
>>>>>> [Shally] Is it still in context of having single burst where all op belongs to one stream? If yes, I would
>>>>>> still say it would add an overhead to PMDs especially if it is expected to work closer to HW (which I
>>>>>> think is the case with DPDK PMD).
>>>>>> Though your approach is doable but why this all cannot be in a layer above PMD? i.e. a layer above
>>>>>> PMD can either pass one-op at a time with burst size = 1 OR can make chained mbuf of input and
>>>>>> output and pass that as one op.
>>>>>> Is it just to ease applications of chained mbuf burden or do you see any performance/use-case
>>>>>> impacting aspect also?
>>>>>>
>>>>>> if it is in context where each op belong to different stream in a burst, then why do we need
>>>>>> stream_pause and resume? It is an expectation from app to pass more output buffer with consumed + 1
>>>>>> from next call onwards as it has already seen OUT_OF_SPACE.
>>>> [Ahmed] Yes, this would add extra overhead to the PMD. Our PMD
>>>> implementation rejects all ops that belong to a stream that has entered
>>>> "RECOVERABLE" state for one reason or another. The caller must
>>>> acknowledge explicitly that it has received news of the problem before
>>>> the PMD allows this stream to exit "RECOVERABLE" state. I agree with you
>>>> that implementing this functionality in the software layer above the PMD
>>>> is a bad idea since the latency reductions are lost.
>>> [Shally] Just reiterating, I rather meant other way around i.e. I see it easier to put all such complexity in a
>>> layer above PMD.
>>>
>>>> This setup is useful in latency sensitive applications where the latency
>>>> of buffering multiple ops into one op is significant. We found latency
>>>> makes a significant difference in search applications where the PMD
>>>> competes with software decompression.
>> [Fiona] I see, so when all goes well, you get best-case latency, but when
>> out-of-space occurs latency will probably be worse.
>[Ahmed] This is exactly right. This use mode assumes out-of-space is a
>rare occurrence. Recovering from it should take similar time to
>synchronous implementations. The caller gets OUT_OF_SPACE_RECOVERABLE in
>both sync and async use. The caller can fix up the op and send it back
>to the PMD to continue work just as would be done in sync. Nonetheless,
>the added complexity is not justifiable if out-of-space is very common
>since the recoverable state will be the limiting factor that forces
>synchronicity.
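[Editor's sketch] The recovery path the thread keeps returning to for the stateless TERMINATED case is "resubmit the whole input with a bigger output buffer". A self-contained C sketch of that application-side retry loop follows; `toy_decompress` is a stand-in for a real stateless decompress op (it fails with consumed=produced=0 when dst is too small, per the current definition), not a DPDK function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { ST_SUCCESS, ST_OUT_OF_SPACE_TERMINATED };

/* Stand-in for a stateless decompress: need_len plays the role of the
 * real decompressed size, unknown to the caller in advance. On too-small
 * dst it returns TERMINATED with produced = 0, as the doc defines. */
static int toy_decompress(size_t need_len, char *dst, size_t dst_cap,
                          size_t *produced)
{
    if (dst_cap < need_len) {
        *produced = 0;
        return ST_OUT_OF_SPACE_TERMINATED;
    }
    memset(dst, 'x', need_len);
    *produced = need_len;
    return ST_SUCCESS;
}

/* Application-side recovery: on TERMINATED, resubmit the FULL input with a
 * doubled output buffer; no state is carried over between attempts. */
static size_t decompress_with_retry(size_t need_len, char **dst_out)
{
    size_t cap = 64, produced = 0;
    char *dst = malloc(cap);
    while (toy_decompress(need_len, dst, cap, &produced) ==
           ST_OUT_OF_SPACE_TERMINATED) {
        cap *= 2;                  /* bigger dst; whole input is redone */
        dst = realloc(dst, cap);
    }
    *dst_out = dst;
    return produced;
}
```

This is exactly the cost being weighed above: a TERMINATED retry repeats all the work, which is acceptable only when out-of-space is rare or the output bound is known up front.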
>>>>> [Fiona] I still have concerns with this and would not want to support in our PMD.
>>>>> To make sure I understand, you want to send a burst of ops, with several from same stream.
>>>>> If one causes OUT_OF_SPACE_RECOVERABLE, then the PMD should not process any
>>>>> subsequent ops in that stream.
>>>>> Should it return them in a dequeue_burst() with status still NOT_PROCESSED?
>>>>> Or somehow drop them? How?
>>>>> While still processing ops from other streams.
>>>> [Ahmed] This is exactly correct. It should return them with
>>>> NOT_PROCESSED. Yes, the PMD should continue processing other streams.
>>>>> As we want to offload each op to hardware with as little CPU processing as possible we
>>>>> would not want to open up each op to see which stream it's attached to and
>>>>> make decisions to do per-stream storage, or drop it, or bypass hw and dequeue without processing.
>>>> [Ahmed] I think I might have missed your point here, but I will try to
>>>> answer. There is no need to "cushion" ops in DPDK. DPDK should send ops
>>>> to the PMD and the PMD should reject until stream_continue() is called.
>>>> The next op to be sent by the user will have a special marker in it to
>>>> inform the PMD to continue working on this stream. Alternatively the
>>>> DPDK layer can be made "smarter" to fail during the enqueue by checking
>>>> the stream and its state, but like you say this adds additional CPU
>>>> overhead during the enqueue.
>>>> I am curious. In a simple synchronous use case, how do we prevent users
>>>> from putting multiple ops in flight that belong to a single stream? Do
>>>> we just currently say it is undefined behavior? Otherwise we would have
>>>> to check the stream and incur the CPU overhead.
>> [Fiona] We don't do anything to prevent it. It's undefined. IMO on data path in
>> DPDK model we expect good behaviour and don't have to error check for things like this.
>[Ahmed] This makes sense. We also assume good behavior.
>> In our PMD if we got a burst of 20 ops, we allocate 20 spaces on the hw q, then
>> build and send those messages. If we found an op from a stream which already
>> had one inflight, we'd have to hold that back, store in a sw stream-specific holding queue,
>> only send 19 to hw. We cannot send multiple ops from same stream to
>> the hw as it fans them out and does them in parallel.
>> Once the enqueue_burst() returns, there is no processing
>> context which would spot that the first has completed
>> and send the next op to the hw. On a dequeue_burst() we would spot this,
>> in that context could process the next op in the stream.
>> On out of space, instead of processing the next op we would have to transfer
>> all unprocessed ops from the stream to the dequeue result.
>> Some parts of this are doable, but seems likely to add a lot more latency,
>> we'd need to add extra threads and timers to move ops from the sw
>> queue to the hw q to get any benefit, and these constructs would add
>> context switching and CPU cycles. So we prefer to push this responsibility
>> to above the API and it can achieve similar.
>[Ahmed] I see what you mean. Our workflow is almost exactly the same
>with our hardware, but the fanning out is done by the hardware based on
>the stream, and ops that belong to the same stream are never allowed to
>go out of order. Otherwise the data would be corrupted. Likewise the
>hardware is responsible for checking the state of the stream and
>returning frames as NOT_PROCESSED to the software.
>>>>> Maybe we could add a capability if this behaviour is important for you?
>>>>> e.g. ALLOW_ENQUEUE_MULTIPLE_STATEFUL_OPS?
>>>>> Our PMD would set this to 0. And expect no more than one op from a stateful stream
>>>>> to be in flight at any time.
>>>> [Ahmed] That makes sense. This way the different DPDK implementations do
>>>> not have to add extra checking for unsupported cases.
>>> [Shally] @ahmed, If I summarise your use-case, is this how you want the PMD to support it?
>>> - a burst *carries only one stream* and all ops are then assumed to belong to that stream? (please note,
>>> here a burst is not carrying more than one stream)
>[Ahmed] No. In this use case the caller sets up an op and enqueues a
>single op. Then before the response comes back from the PMD the caller
>enqueues a second op on the same stream.
>>> - PMD will submit one op at a time to HW?
>[Ahmed] I misunderstood what PMD means. I used it throughout to mean the
>HW. I used DPDK to mean the software implementation that talks to the
>hardware.
>The software will submit all ops immediately. The hardware has to figure
>out what to do with the ops depending on what stream they belong to.
>>> - if processed successfully, push it back to the completion queue with status = SUCCESS. If it failed or ran
>>> into OUT_OF_SPACE, then push it to the completion queue with status = FAILURE/
>>> OUT_OF_SPACE_RECOVERABLE and the rest with status = NOT_PROCESSED, and return with enqueue count
>>> = total # of ops submitted originally with the burst?
>[Ahmed] This is exactly what I had in mind. All ops will be submitted to
>the HW. The HW will put all of them on the completion queue with the
>correct status, exactly as you say.
>>> - app assumes all have been enqueued, so it goes and dequeues all ops
>>> - on seeing an op with OUT_OF_SPACE_RECOVERABLE, the app resubmits a burst of ops with a call to
>>> the stream_continue/resume API, starting from the op which encountered OUT_OF_SPACE, with the others marked
>>> NOT_PROCESSED, and with updated input and output buffers?
>[Ahmed] Correct, this is what we do today in our proprietary API.
>>> - repeat until *all* are dequeued with status = SUCCESS, or *any* with status = FAILURE? If at any time a
>>> failure is seen, does the app then start the whole processing all over again, or just drop this burst?!
>[Ahmed] The app has the choice on how to proceed.
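The app-side dequeue handling summarised above can be sketched as a small helper. This is a hypothetical illustration, not the RFC API: the status names are stand-ins, and the caller is assumed to react to the return value by calling stream_continue() and resubmitting, or by restarting the stream on hard failure.

```c
/* Hypothetical app-side helper for the flow described above: scan a
 * dequeued burst and decide where to resume. Returns the index of the
 * op that hit OUT_OF_SPACE_RECOVERABLE (resubmit from here, with the
 * trailing NOT_PROCESSED ops, after enlarging the output buffer and
 * calling the RFC's stream_continue()), -1 if every op completed, or
 * -2 on an unrecoverable FAILURE (restart the stream from scratch). */
#include <stddef.h>

enum op_status { SUCCESS, FAILURE, OUT_OF_SPACE_RECOVERABLE, NOT_PROCESSED };

int resume_index(const enum op_status *st, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (st[i] == FAILURE)
            return -2;              /* unrecoverable: fix and restart */
        if (st[i] == OUT_OF_SPACE_RECOVERABLE)
            return (int)i;          /* recoverable: resubmit from here */
    }
    return -1;                      /* all SUCCESS: nothing to resubmit */
}
```

The "repeat until all SUCCESS or any FAILURE" loop Shally describes is then just calling this after each dequeue_burst() until it returns -1.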
If the issue is
>recoverable then the application can continue this stream from where it
>stopped. If the failure is unrecoverable then the application should
>first fix the problem and start from the beginning of the stream.
>>> If all of the above is true, then I think we should add another API such as rte_comp_enque_single_stream(),
>>> which will be functional under the feature flag ALLOW_ENQUEUE_MULTIPLE_STATEFUL_OPS, or is a better
>>> name SUPPORT_ENQUEUE_SINGLE_STREAM?!
>[Ahmed] The main advantage in async use is lost if we force all related
>ops to be in the same burst. If we do that, then we might as well merge
>all the ops into one op. That would reduce the overhead.
>The use mode I am proposing is only useful in cases where the data
>becomes available after the first enqueue occurred. I want to allow the
>caller to enqueue the second set of data as soon as it is available,
>regardless of whether or not the HW has already started working on the
>first op in flight.
[Shally] @ahmed, Ok.. seems I missed a point here. So, confirm the following:

As per the current description in the doc, the expected stateful usage is:
enqueue(op1) --> dequeue(op1) --> enqueue(op2)
but you're suggesting to allow an option to change it to
enqueue(op1) --> enqueue(op2)
i.e. multiple ops from the same stream can be put in flight via subsequent enqueue_burst() calls
without waiting to dequeue previous ones, where the PMD supports it. So, no change to the
current definition of a burst: it will still carry multiple streams, with each op belonging to a
different stream?!

If yes, then it seems your HW can be set up for multiple streams, so it is efficient for your case
to support it in the DPDK PMD layer, but our hw doesn't by default and needs SW to back it.
Given that, I also suggest enabling it under some feature flag.
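The two usage models under discussion can be sketched with a per-stream in-flight counter on the application side. Everything here is hypothetical (the struct, the flag name's meaning, and the gate are illustrations, not DPDK API): the baseline model permits at most one stateful op in flight per stream (enq -> deq -> enq), while a device advertising the proposed capability would let the gate pass unconditionally (enq -> enq).

```c
/* Sketch of the baseline lockstep rule vs. the proposed pipelined mode,
 * using a hypothetical per-stream state kept by the application. */
#include <stdbool.h>

struct stream_state {
    int inflight;              /* ops enqueued but not yet dequeued */
    bool multiple_ops_allowed; /* e.g. ALLOW_ENQUEUE_MULTIPLE_STATEFUL_OPS */
};

bool may_enqueue(const struct stream_state *s)
{
    /* Baseline: at most one stateful op in flight per stream.
     * Pipelined mode: the device handles ordering, so always allow. */
    return s->multiple_ops_allowed || s->inflight == 0;
}

void on_enqueue(struct stream_state *s) { s->inflight++; }
void on_dequeue(struct stream_state *s) { s->inflight--; }
```

With the flag unset this enforces exactly the enq(op1) -> deq(op1) -> enq(op2) sequence; with it set, enq(op1) -> enq(op2) is permitted as Ahmed proposes.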
However it looks like an add-on, and if it doesn't change the current definition of a burst or the
minimum expectations on stateful processing described in this document, then IMO you can
propose this feature as an incremental patch on the baseline version. In its absence, the
application will exercise stateful processing as described here (enq->deq->enq). Thoughts?
>> [Fiona] Am curious about Ahmed's response to this. I didn't get that a burst should carry only one stream.
>> Or how does this make a difference? As there can be many enqueue_burst() calls done before a dequeue_burst().
>> Maybe you're thinking the enqueue_burst() would be a blocking call that would not return until all the ops
>> had been processed? This would turn it into a synchronous call, which isn't the intent.
>[Ahmed] Agreed, a blocking or even a buffering software layer that
>babysits the hardware does not fundamentally change the parameters of the
>system as a whole. It just moves workflow management complexity down
>into the DPDK software layer. Rather, there are real latency and
>throughput advantages (because of caching) that I want to expose.
>
>/// snip ///