From: "Verma, Shally"
To: Ahmed Mansour, "Trahe, Fiona", "dev@dpdk.org"
CC: "Athreya, Narayana Prasad", "Gupta, Ashish", "Sahu, Sunila", "De Lara Guarch, Pablo", "Challa, Mahipal", "Jain, Deepak K", Hemant Agrawal, Roy Pledge, Youri Querry
Thread-Topic: [RFC v2] doc compression API for DPDK
Date: Mon, 29 Jan 2018 12:47:27 +0000
Subject: Re: [dpdk-dev] [RFC v2] doc compression API for DPDK
List-Id: DPDK patches and discussions

Hi Ahmed

> -----Original Message-----
> From: Ahmed Mansour [mailto:ahmed.mansour@nxp.com]
> Sent: 25 January 2018 23:49
> To: Verma, Shally; Trahe, Fiona; dev@dpdk.org
> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
> Subject: Re: [RFC v2] doc compression API for DPDK
>
> Hi All,
>
> Sorry for the delay. Please see responses inline.
>
> Ahmed
>
> On 1/12/2018 8:50 AM, Verma, Shally wrote:
> > Hi Fiona
> >
> >> -----Original Message-----
> >> From: Trahe, Fiona [mailto:fiona.trahe@intel.com]
> >> Sent: 12 January 2018 00:24
> >> To: Verma, Shally; Ahmed Mansour; dev@dpdk.org
> >> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry; Trahe, Fiona
> >> Subject: RE: [RFC v2] doc compression API for DPDK
> >>
> >> Hi Shally, Ahmed,
> >>
> >>> -----Original Message-----
> >>> From: Verma, Shally [mailto:Shally.Verma@cavium.com]
> >>> Sent: Wednesday, January 10, 2018 12:55 PM
> >>> To: Ahmed Mansour; Trahe, Fiona; dev@dpdk.org
> >>> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
> >>> Subject: RE: [RFC v2] doc compression API for DPDK
> >>>
> >>> Hi Ahmed
> >>>
> >>>> -----Original Message-----
> >>>> From: Ahmed Mansour [mailto:ahmed.mansour@nxp.com]
> >>>> Sent: 10 January 2018 00:38
> >>>> To: Verma, Shally; Trahe, Fiona; dev@dpdk.org
> >>>> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
> >>>> Subject: Re: [RFC v2] doc compression API for DPDK
> >>>>
> >>>> Hi Shally,
> >>>>
> >>>> Thanks for the summary. It is very helpful. Please see comments below
> >>>>
> >>>> On 1/4/2018 6:45 AM, Verma, Shally wrote:
> >>>>> This is an RFC v2 document to capture the understanding of, and requirements on, the
> >>>>> compression API proposal in DPDK. It is based on "[RFC v3] Compression API in DPDK
> >>>>> http://dpdk.org/dev/patchwork/patch/32331/".
> >>>>> The intention of this document is to align on the concepts built into the compression API
> >>>>> and its usage, and to identify further requirements.
> >>>>> Going further, it could be a base for the Compression Module Programmer's Guide.
> >>>>> Current scope is limited to:
> >>>>> - definition of the terminology which makes up the foundation of the compression API
> >>>>> - typical API flow expected to be used by applications
> >>>>> - Stateless and Stateful operation definition and usage after the RFC v1 doc review
> >>>>> http://dev.dpdk.narkive.com/CHS5l01B/dpdk-dev-rfc-v1-doc-compression-api-for-dpdk
> >>>>>
> >>>>> 1. Overview
> >>>>> ~~~~~~~~~~~
> >>>>>
> >>>>> A. Compression Methodologies in compression API
> >>>>> ================================================
> >>>>> DPDK compression supports two types of compression methodologies:
> >>>>> - Stateless - each data object is compressed individually without any reference to previous data.
> >>>>> - Stateful - each data object is compressed with reference to previous data objects, i.e. history of data is needed for compression / decompression.
> >>>>> For more explanation, please refer to RFC https://www.ietf.org/rfc/rfc1951.txt
> >>>>> To support both methodologies, DPDK compression introduces two key concepts: Session and Stream.
> >>>>>
> >>>>> B. Notion of a session in compression API
> >>>>> ==========================================
> >>>>> A Session in DPDK compression is a logical entity which is set up one-time with immutable parameters,
> >>>>> i.e. parameters that don't change across operations and devices.
> >>>>> A session can be shared across multiple devices and multiple operations simultaneously.
> >>>>> Typical session parameters include info such as:
> >>>>> - compress / decompress
> >>>>> - compression algorithm and associated configuration parameters
> >>>>>
> >>>>> An application can create different sessions on a device initialized with same/different xforms.
> >>>>> Once a session is initialized with one xform it cannot be re-initialized.
> >>>>>
> >>>>> C. Notion of stream in compression API
> >>>>> =======================================
> >>>>> Unlike a session, which carries a common set of information across operations, a stream in DPDK
> >>>>> compression is a logical entity which identifies a related set of operations and carries
> >>>>> operation-specific information as needed by the device during its processing.
> >>>>> It is a device-specific data structure which is opaque to the application, set up and maintained by the device.
> >>>>> A stream can be used with *only* one op at a time, i.e. no two operations can share the same stream simultaneously.
> >>>>> A stream is a *must* for stateful op processing and optional for stateless (please see respective sections for more details).
> >>>>> This enables sharing of a session by multiple threads handling different data sets, as each op carries
> >>>>> its own context (internal state, history buffers et al.) in its attached stream.
> >>>>> The application should call rte_comp_stream_create() and attach the stream to the op before beginning
> >>>>> operation processing, and free it via rte_comp_stream_free() after the operation is complete.
> >>>>>
> >>>>> C. Notion of burst operations in compression API
> >>>>> =================================================
> >>>>> A burst in DPDK compression is an array of operations where each op carries an independent set of data, i.e. a burst can look like:
> >>>>>  ---------------------------------------------------------------------------------------------
> >>>>>  enque_burst (| op1.no_flush | op2.no_flush | op3.flush_final | op4.no_flush | op5.no_flush |)
> >>>>>  ---------------------------------------------------------------------------------------------
> >>>>> Where op1 .. op5 are all independent of each other and carry entirely different sets of data.
> >>>>> Each op can be attached to the same/different session but *must* be attached to a different stream.
> >>>>> Each op (struct rte_comp_op) carries compression/decompression operational parameters and is both an input/output parameter.
> >>>>> The PMD gets source, destination and checksum information at input and updates it with bytes consumed, bytes produced and checksum at output.
> >>>>> Since each operation in a burst is independent and thus can complete out-of-order, applications which
> >>>>> need ordering should set up a per-op user data area with reordering information so that they can
> >>>>> determine the enqueue order at dequeue.
> >>>>> Also, if multiple threads call enqueue_burst() on the same queue pair then it is the application's
> >>>>> responsibility to use a proper locking mechanism to ensure exclusive enqueuing of operations.
> >>>>>
> >>>>> D. Stateless Vs Stateful
> >>>>> ========================
> >>>>> The compression API provides the RTE_COMP_FF_STATEFUL feature flag for a PMD to reflect its support for
> >>>>> stateful operation. Each op carries an op type indicating if it is to be processed stateful or stateless.
> >>>>>
> >>>>> D.1 Compression API Stateless operation
> >>>>> ------------------------------------------------------
> >>>>> An op is processed stateless if it has
> >>>>> - flush value set to RTE_FLUSH_FULL or RTE_FLUSH_FINAL (required only on compression side),
> >>>>> - op_type set to RTE_COMP_OP_STATELESS
> >>>>> - all of the required input and a sufficiently large output buffer to store the output, i.e. OUT_OF_SPACE can never occur.
> >>>>> When all of the above conditions are met, the PMD initiates stateless processing and releases acquired
> >>>>> resources after processing of the current operation is complete, i.e. full input consumed and full output written.
> >> [Fiona] I think the 3rd condition conflicts with D.1.1 below and anyway cannot be a precondition, i.e.
> >> the PMD must initiate stateless processing based on RTE_COMP_OP_STATELESS.
> >> It can't always know if the output buffer is big enough before processing; it must process the input data and
> >> only when it has consumed it all can it know whether all the output data fits or doesn't fit in the output buffer.
> >>
> >> I'd suggest rewording as follows:
> >> An op is processed statelessly if op_type is set to RTE_COMP_OP_STATELESS.
> >> In this case
> >> - The flush value must be set to RTE_FLUSH_FULL or RTE_FLUSH_FINAL (required only on compression side),
> >> - All of the input data must be in the src buffer
> >> - The dst buffer should be sufficiently large to hold the expected output
> >> The PMD acquires the necessary resources to process the op. After processing of the current operation is
> >> complete, whether successful or not, it releases acquired resources and no state, history or data is
> >> held in the PMD or carried over to subsequent ops.
> >> In the SUCCESS case full input is consumed and full output written and status is
> >> set to RTE_COMP_OP_STATUS_SUCCESS.
> >> OUT_OF_SPACE as per D.1.1 below.
> >>
> > [Shally] Ok. Agreed.
> >
> >>>>> An application can optionally attach a stream to such ops. In such a case, the application must
> >>>>> attach a different stream to each op.
> >>>>> An application can enqueue a stateless burst by making consecutive enque_burst() calls,
> >>>>> i.e. the following is relevant usage:
> >>>>> enqueued = rte_comp_enque_burst (dev_id, qp_id, ops1, nb_ops);
> >>>>> enqueued = rte_comp_enque_burst (dev_id, qp_id, ops2, nb_ops);
> >>>>>
> >>>>> *Note - every call has a different ops array, i.e. the same rte_comp_op array *cannot be re-enqueued*
> >>>>> to process the next batch of data until the previous ones are completely processed.
> >>>>>
> >>>>> D.1.1 Stateless and OUT_OF_SPACE
> >>>>> ------------------------------------------------
> >>>>> OUT_OF_SPACE is a condition when the output buffer runs out of space and the PMD still has more data
> >>>>> to produce. If a PMD runs into such a condition, then it is an error condition in stateless processing.
> >>>>> In such a case, the PMD resets itself and returns with status RTE_COMP_OP_STATUS_OUT_OF_SPACE with
> >>>>> produced=consumed=0, i.e. no input read, no output written.
> >>>>> The application can resubmit the full input with a larger output buffer.
> >>>> [Ahmed] Can we add an option to allow the user to read the data that was
> >>>> produced while still reporting OUT_OF_SPACE? This is mainly useful for
> >>>> decompression applications doing search.
> >>> [Shally] It is there but applicable for the stateful operation type (please refer to
> >>> handling of out_of_space under "Stateful Section").
> >>> By definition, "stateless" here means that the application (such as IPCOMP) is guaranteed to know the
> >>> maximum output size and ensures that the uncompressed data size cannot grow beyond the provided output buffer.
> >>> Such apps can submit an op with type = STATELESS and provide full input, then the PMD assumes it has
> >>> sufficient input and output and thus doesn't need to maintain any context after the op is processed.
> >>> If the application doesn't know the max output size, then it should process it as a stateful op, i.e. set up
> >>> the op with type = STATEFUL and attach a stream so that the PMD can maintain the relevant context to handle
> >>> such a condition.
> >> [Fiona] There may be an alternative that's useful for Ahmed, while still respecting the stateless concept.
> >> In the stateless case where a PMD reports OUT_OF_SPACE in the decompression case
> >> it could also return consumed=0, produced = x, where x>0. x indicates the amount of valid data which has
> >> been written to the output buffer. It is not complete, but if an application wants to search it, that may be sufficient.
> >> If the application still wants the data it must resubmit the whole input with a bigger output buffer, and
> >> decompression will be repeated from the start; it
> >> cannot expect to continue on, as the PMD has not maintained state, history or data.
> >> I don't think there would be any need to indicate this in capabilities; PMDs which cannot provide this
> >> functionality would always return produced=consumed=0, while PMDs which can could set produced > 0.
> >> If this works for you both, we could consider a similar case for compression.
> >>
> > [Shally] Sounds fine to me. Though then, in that case, consumed should also be updated to the actual
> > amount consumed by the PMD.
> > Setting consumed = 0 with produced > 0 doesn't correlate.
> [Ahmed] I like Fiona's suggestion, but I also do not like the implication
> of returning consumed = 0. At the same time returning consumed = y
> implies to the user that it can proceed from the middle. I prefer the
> consumed = 0 implementation, but I think a different return is needed to
> distinguish it from an OUT_OF_SPACE that the user can recover from. Perhaps
> OUT_OF_SPACE_RECOVERABLE and OUT_OF_SPACE_TERMINATED. This also allows
> future PMD implementations to provide recoverability even in STATELESS
> mode if they so wish. In this model STATELESS or STATEFUL would be a
> hint for the PMD implementation to make optimizations for each case, but
> it does not force the PMD implementation to limit functionality if it
> can provide recoverability.
> >
> >>>>> D.2 Compression API Stateful operation
> >>>>> ----------------------------------------------------------
> >>>>> A stateful operation in DPDK compression means the application invokes enqueue_burst() multiple times
> >>>>> to process related chunks of data, either because
> >>>>> - the application broke the data into several ops, and/or
> >>>>> - the PMD ran into an out_of_space situation during input processing
> >>>>>
> >>>>> In case of either or both of the above conditions, the PMD is required to maintain the state of the op
> >>>>> across enque_burst() calls. Such ops are set up with op_type RTE_COMP_OP_STATEFUL, begin with flush
> >>>>> value = RTE_COMP_NO/SYNC_FLUSH and end with flush value RTE_COMP_FULL/FINAL_FLUSH.
> >>>>>
> >>>>> D.2.1 Stateful operation state maintenance
> >>>>> ---------------------------------------------------------------
> >>>>> The ideal expectation from the application is always that it should parse through all related chunks of
> >>>>> source data, build its mbuf-chain and enqueue it for stateless processing.
> >>>>> However, if it needs to break it into several enqueue_burst() calls, then an expected call flow would be
> >>>>> something like:
> >>>>> enqueue_burst( |op.no_flush |)
> >>>> [Ahmed] The work is now in flight to the PMD. The user will call dequeue
> >>>> burst in a loop until all ops are received. Is this correct?
> >>>>
> >>>>> deque_burst(op) // should dequeue before we enqueue next
> >>> [Shally] Yes. Ideally every submitted op needs to be dequeued. However this illustration is specifically in
> >>> the context of stateful op processing, to reflect that if a stream is broken into chunks, then each chunk
> >>> should be submitted as one op at a time with type = STATEFUL and needs to be dequeued before the next
> >>> chunk is enqueued.
> >>>
> >>>>> enqueue_burst( |op.no_flush |)
> >>>>> deque_burst(op) // should dequeue before we enqueue next
> >>>>> enqueue_burst( |op.full_flush |)
> >>>> [Ahmed] Why not allow multiple work items in flight? I understand that
> >>>> occasionally there will be an OUT_OF_SPACE exception. Can we just distinguish
> >>>> the response in exception cases?
> >>> [Shally] Multiple ops are allowed in flight; however, the condition is that each op in such a case is
> >>> independent of the others, i.e. they belong to different streams altogether.
> >>> Earlier (as part of the RFC v1 doc) we did consider the proposal to process all related chunks of data in
> >>> a single burst by passing them as an ops array, but later found that not so useful for PMD handling for
> >>> various reasons. Please refer to the RFC v1 doc review comments for the same.
> >> [Fiona] Agree with Shally. In summary, as only one op can be processed at a time, since each needs the
> >> state of the previous, to allow more than 1 op to be in-flight at a time would
> >> force PMDs to implement internal queueing and exception handling for the
> >> OUT_OF_SPACE conditions you mention.
> [Ahmed] But we are putting the ops on qps which would make them
> sequential. Handling OUT_OF_SPACE conditions would be a little bit more
> complex but doable. The question is: is this mode of use useful for
> real-life applications, or would we just be adding complexity? The technical
> advantage of this is that processing of stateful ops is interdependent
> and PMDs can take advantage of caching and other optimizations to make
> processing related ops much faster than switching on every op. PMDs have to
> maintain state of more than 32 KB for DEFLATE for every stream.
> >> If the application has all the data, it can put it into chained mbufs in a single
> >> op rather than multiple ops, which avoids pushing all that complexity down to the PMDs.
> [Ahmed] I think that your suggested scheme of putting all related mbufs
> into one op may be the best solution without the extra complexity of
> handling OUT_OF_SPACE cases, while still allowing the enqueuer extra
> time, if we have a way of marking mbufs as ready for consumption. The
> enqueuer may not have all the data at hand but can enqueue the op with a
> couple of empty mbufs marked as not ready for consumption. The enqueuer
> will then update the rest of the mbufs to ready for consumption once the
> data is added. This introduces a race condition. A second flag for each
> mbuf can be updated by the PMD to indicate whether it processed it or not.
> This way, in cases where the PMD beat the application to the op, the
> application will just update the op to point to the first unprocessed
> mbuf and resend it to the PMD.
> >>
> >>>>> Here an op *must* be attached to a stream and every subsequent enqueue_burst() call should carry the
> >>>>> *same* stream. Since the PMD maintains op state in the stream, it is mandatory for the application to
> >>>>> attach a stream to such ops.
> >> [Fiona] I think you're referring only to a single stream above, but as there may be many different streams,
> >> maybe add the following?
> >> The above is simplified to show just a single stream. However there may be many streams, and each
> >> enqueue_burst() may contain ops from different streams, as long as there is only one op in-flight from any
> >> stream at a given time.
> >>
> > [Shally] Ok, got it.
> >
> >>>>> D.2.2 Stateful and Out_of_Space
> >>>>> --------------------------------------------
> >>>>> If a PMD supports stateful operation and runs into an OUT_OF_SPACE situation, then it is not an error
> >>>>> condition for the PMD. In such a case, the PMD returns with status RTE_COMP_OP_STATUS_OUT_OF_SPACE with
> >>>>> consumed = number of input bytes read and produced = length of the complete output buffer.
> >> [Fiona] - produced would be <= output buffer len (typically =, but could be a few bytes less)
> >>
> >>
> >>>>> The application should enqueue the op with source starting at consumed+1 and an output buffer with
> >>>>> available space.
> >>>>
> >>>> [Ahmed] Related to OUT_OF_SPACE. What status does the user receive in a
> >>>> decompression case when the end block is encountered before the end of
> >>>> the input? Does the PMD continue decomp? Does it stop there and return
> >>>> the stop index?
> >>>>
> >>> [Shally] Before I can answer this, please help me understand your use case. When you say "when the
> >>> end block is encountered before the end of the input?" do you mean
> >>> "the decompressor processes a final block (i.e. one with BFINAL=1 in its header) and there's some footer
> >>> data after that?" Or
> >>> do you mean "the decompressor processes one block and has more to process till its final block?"
> >>> What do "end block" and "end of input" refer to here?
> [Ahmed] I meant BFINAL=1 by end block. The end of input is the end of
> the input length.
> e.g.
> | input length--------------------------------------------------------------|
> |--data----data----data------data-------BFINAL-footer-unrelated data|
[Shally] I will respond to this with my understanding, and wait for Fiona to respond first on the rest of the above comments.
So, if the decompressor encounters a final block before the end of the actual input, then it should ideally continue to decompress the final block and consume input till it sees its end-of-block marker.
Normally a decompressor doesn't process the data after it has finished processing the final block, so unprocessed trailing data may be passed as-is back to the application with
'consumed = length of input till end-of-final-block' and 'status = SUCCESS/OUT_OF_SPACE' (out of space here implies the output buffer ran out of space while the decompressed data was being written to it).

Thanks
Shally

> >>>
> >>>>> D.2.3 Sliding Window Size
> >>>>> ------------------------------------
> >>>>> Every PMD will reflect in its algorithm capability structure the maximum length of the sliding window
> >>>>> in bytes, which indicates the maximum history buffer length used by the algorithm.
> >>>>>
> >>>>> 2. Example API illustration
> >>>>> ~~~~~~~~~~~~~~~~~~~~~~~
> >>>>>
> >> [Fiona] I think it would be useful to show an example of both a STATELESS flow and a STATEFUL flow.
> >>
> > [Shally] Ok. I can add a simplified version to illustrate API usage in both cases.
> >
> >>>>> Following is an illustration of API usage (this is just one flow; other variants are also possible):
> >>>>> 1. rte_comp_session *sess = rte_compressdev_session_create (rte_mempool *pool);
> >>>>> 2. rte_compressdev_session_init (int dev_id, rte_comp_session *sess, rte_comp_xform *xform, rte_mempool *sess_pool);
> >>>>> 3. rte_comp_op_pool_create(rte_mempool ..)
> >>>>> 4. rte_comp_op_bulk_alloc (struct rte_mempool *mempool, struct rte_comp_op **ops, uint16_t nb_ops);
> >>>>> 5. for every rte_comp_op in ops[],
> >>>>> 5.1 rte_comp_op_attach_session (rte_comp_op *op, rte_comp_session *sess);
> >>>>> 5.2 op.op_type = RTE_COMP_OP_STATELESS
> >>>>> 5.3 op.flush = RTE_FLUSH_FINAL
> >>>>> 6. [Optional] for every rte_comp_op in ops[],
> >>>>> 6.1 rte_comp_stream_create(int dev_id, rte_comp_session *sess, void **stream);
> >>>>> 6.2 rte_comp_op_attach_stream(rte_comp_op *op, rte_comp_session *stream);
> >>>>
> >>>> [Ahmed] What is the semantic effect of attaching a stream to every op? Will
> >>>> the application benefit from this, given that it is set up with op_type STATELESS?
> >>> [Shally] By role, a stream is a data structure that holds all information that a PMD needs to maintain for
> >>> processing an op, and thus it is marked device-specific. It is required for stateful processing but
> >>> optional for stateless, as the PMD doesn't need to maintain context once the op is processed, unlike stateful.
> >>> Using a stream for stateless may be of advantage to some PMDs. They can be designed to do one-time per-op
> >>> setup (such as mapping session params) during stream_create() in the control path rather than the data path.
> >>>
> >> [Fiona] Yes, we agreed that stream_create() should be called for every session, and if it
> >> returns non-NULL the PMD needs it, so op_attach_stream() must be called.
> >> However I've just realised we don't have a STATEFUL/STATELESS param on the xform, just on the op.
> >> So we could either add a stateful/stateless param to stream_create()?
> >> OR add a stateful/stateless param to the xform so it would be in the session?
> > [Shally] No, it shouldn't be part of the session or xform, as sessions aren't stateless/stateful.
> > So we shouldn't alter the current definition of session or xforms.
> > If we need to mention it, then it could be added as part of stream_create(), as it's device-specific, and
> > depending upon op_type the device can then set up stream resources.
> >
> >> However, Shally, can you reconsider if you really need it for STATELESS or if the data you want to
> >> store there could be stored in the session? Or if it's needed per-op, does it really need
> >> to be visible on the API as a stream or could it be hidden within the PMD?
> > [Shally] I would say it is not mandatory but a desirable feature that I am suggesting.
> > I am only trying to enable an optimization in the data path which may be of help to some PMD designs, as
> > they can use stream_create() to do setup that is one-time per op and regardless of op_type, such as, as I
> > mentioned, setting up user session params to device session params.
> > We can hide it inside the PMD; however, there may be a slight overhead in the datapath depending on the PMD design.
> > But I would say it's not a blocker for us to freeze the current spec. We can revisit this feature later
> > because it will not alter base API functionality.
> >
> > Thanks
> > Shally
> >
> >>>>> 7. for every rte_comp_op in ops[],
> >>>>> 7.1 set up with src/dst buffer
> >>>>> 8. enq = rte_compressdev_enqueue_burst (dev_id, qp_id, &ops, nb_ops);
> >>>>> 9. do while (dqu < enq) // wait till all of the enqueued ops are dequeued
> >>>>> 9.1 dqu = rte_compressdev_dequeue_burst (dev_id, qp_id, &ops, enq);
> >>>> [Ahmed] I am assuming that waiting for all enqueued to be dequeued is not
> >>>> strictly necessary, but is just the chosen example in this case
> >>>>
> >>> [Shally] Yes. By design, for burst_size>1 each op is independent of the others. So the app may proceed as
> >>> soon as it dequeues any.
> >>>
> >>>>> 10. Repeat 7 for the next batch of data
> >>>>> 11. for every op in ops[],
> >>>>> 11.1 rte_comp_stream_free(op->stream);
> >>>>> 12. rte_comp_session_clear (sess);
> >>>>> 13. rte_comp_session_terminate(rte_comp_session *session);
> >>>>>
> >>>>> Thanks
> >>>>> Shally
> >>>>>
> >>>>>
> >
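
As a follow-up to the point above about adding simplified illustrations, here is a rough C sketch of the stateless flow, mirroring the numbered steps of the illustration. The function names, op fields and signatures are taken from this RFC thread and are assumptions at this stage (the API is not final; typedefs for rte_comp_session/rte_comp_xform are assumed as written in the steps). Error handling, capability checks and buffer setup are omitted.

#define NB_OPS 4

/* Rough sketch of the stateless flow from the illustration above.
 * All names and signatures follow the RFC proposal and may change. */
static void
stateless_flow_sketch(uint8_t dev_id, uint16_t qp_id,
                      struct rte_mempool *sess_pool,
                      struct rte_mempool *op_pool, /* from rte_comp_op_pool_create() */
                      rte_comp_xform *xform)
{
	struct rte_comp_op *ops[NB_OPS];
	uint16_t enq, dqu = 0;
	int i;

	/* 1-2. one-time session setup with immutable parameters */
	rte_comp_session *sess = rte_compressdev_session_create(sess_pool);
	rte_compressdev_session_init(dev_id, sess, xform, sess_pool);

	/* 4. allocate a burst of ops from the op pool */
	rte_comp_op_bulk_alloc(op_pool, ops, NB_OPS);

	/* 5-7. each op is independent: same session, STATELESS, final flush */
	for (i = 0; i < NB_OPS; i++) {
		rte_comp_op_attach_session(ops[i], sess);
		ops[i]->op_type = RTE_COMP_OP_STATELESS;
		ops[i]->flush = RTE_FLUSH_FINAL;
		/* set up src/dst buffers of ops[i] here */
	}

	/* 8-9. enqueue the burst, then poll until everything enqueued is
	 * dequeued; ops may complete out of order, so an app needing order
	 * should keep reordering info in the per-op user data area. */
	enq = rte_compressdev_enqueue_burst(dev_id, qp_id, ops, NB_OPS);
	while (dqu < enq)
		dqu += rte_compressdev_dequeue_burst(dev_id, qp_id, ops, enq);

	/* 11-13. free optional per-op streams, then tear down the session */
	for (i = 0; i < NB_OPS; i++)
		if (ops[i]->stream != NULL)
			rte_comp_stream_free(ops[i]->stream);
	rte_comp_session_clear(sess);
	rte_comp_session_terminate(sess);
}

A stateful flow would differ mainly in that a stream becomes mandatory: each related chunk is submitted as one op at a time with op_type = RTE_COMP_OP_STATEFUL, flush = RTE_COMP_NO/SYNC_FLUSH for intermediate chunks and RTE_COMP_FULL/FINAL_FLUSH for the last one, and each op is dequeued before the next chunk of the same stream is enqueued.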