From: Ahmed Mansour
To: "Verma, Shally", "Trahe, Fiona", "dev@dpdk.org"
CC: "Athreya, Narayana Prasad", "Gupta, Ashish", "Sahu, Sunila", "De Lara Guarch, Pablo", "Challa, Mahipal", "Jain, Deepak K", Hemant Agrawal, Roy Pledge, Youri Querry
Date: Thu, 25 Jan 2018 18:19:16 +0000
Subject: Re: [dpdk-dev] [RFC v2] doc compression API for DPDK
References: <348A99DA5F5B7549AA880327E580B435892F589D@IRSMSX101.ger.corp.intel.com>

Hi All,

Sorry for the delay.
Please see responses inline.

Ahmed

On 1/12/2018 8:50 AM, Verma, Shally wrote:
> Hi Fiona
>
>> -----Original Message-----
>> From: Trahe, Fiona [mailto:fiona.trahe@intel.com]
>> Sent: 12 January 2018 00:24
>> To: Verma, Shally; Ahmed Mansour; dev@dpdk.org
>> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry; Trahe, Fiona
>> Subject: RE: [RFC v2] doc compression API for DPDK
>>
>> Hi Shally, Ahmed,
>>
>>
>>> -----Original Message-----
>>> From: Verma, Shally [mailto:Shally.Verma@cavium.com]
>>> Sent: Wednesday, January 10, 2018 12:55 PM
>>> To: Ahmed Mansour; Trahe, Fiona; dev@dpdk.org
>>> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
>>> Subject: RE: [RFC v2] doc compression API for DPDK
>>>
>>> HI Ahmed
>>>
>>>> -----Original Message-----
>>>> From: Ahmed Mansour [mailto:ahmed.mansour@nxp.com]
>>>> Sent: 10 January 2018 00:38
>>>> To: Verma, Shally; Trahe, Fiona; dev@dpdk.org
>>>> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo; Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
>>>> Subject: Re: [RFC v2] doc compression API for DPDK
>>>>
>>>> Hi Shally,
>>>>
>>>> Thanks for the summary. It is very helpful. Please see comments below
>>>>
>>>>
>>>> On 1/4/2018 6:45 AM, Verma, Shally wrote:
>>>>> This is an RFC v2 document to brief understanding and requirements on compression API proposal in DPDK.
>>>>> It is based on "[RFC v3] Compression API in DPDK http://dpdk.org/dev/patchwork/patch/32331/ ".
>>>>> Intention of this document is to align on concepts built into compression API, its usage and identify further requirements.
>>>>> Going further it could be a base to Compression Module Programmer Guide.
>>>>> Current scope is limited to
>>>>> - definition of the terminology which makes up foundation of compression API
>>>>> - typical API flow expected to use by applications
>>>>> - Stateless and Stateful operation definition and usage after RFC v1 doc review http://dev.dpdk.narkive.com/CHS5l01B/dpdk-dev-rfc-v1-doc-compression-api-for-dpdk
>>>>> 1. Overview
>>>>> ~~~~~~~~~~~
>>>>>
>>>>> A. Compression Methodologies in compression API
>>>>> ===========================================
>>>>> DPDK compression supports two types of compression methodologies:
>>>>> - Stateless - each data object is compressed individually without any reference to previous data,
>>>>> - Stateful - each data object is compressed with reference to previous data object i.e. history of data is needed for compression / decompression
>>>>> For more explanation, please refer RFC https://www.ietf.org/rfc/rfc1951.txt
>>>>> To support both methodologies, DPDK compression introduces two key concepts: Session and Stream.
>>>>> B. Notion of a session in compression API
>>>>> ==================================
>>>>> A Session in DPDK compression is a logical entity which is setup one-time with immutable parameters i.e. parameters that don't change across operations and devices.
>>>>> A session can be shared across multiple devices and multiple operations simultaneously.
>>>>> A typical Session parameters includes info such as:
>>>>> - compress / decompress
>>>>> - compression algorithm and associated configuration parameters
>>>>>
>>>>> Application can create different sessions on a device initialized with same/different xforms. Once a session is initialized with one xform it cannot be re-initialized.
>>>>> C. Notion of stream in compression API
>>>>> =======================================
>>>>> Unlike session which carry common set of information across operations, a stream in DPDK compression is a logical entity which identify related set of operations and carry operation specific information as needed by device during its processing.
>>>>> It is device specific data structure which is opaque to application, setup and maintained by device.
>>>>> A stream can be used with *only* one op at a time i.e. no two operations can share same stream simultaneously.
>>>>> A stream is *must* for stateful ops processing and optional for stateless (Please see respective sections for more details).
>>>>> This enables sharing of a session by multiple threads handling different data set as each op carry its own context (internal states, history buffers et el) in its attached stream.
>>>>> Application should call rte_comp_stream_create() and attach to op before beginning of operation processing and free via rte_comp_stream_free() after its complete.
>>>>> C. Notion of burst operations in compression API
>>>>> =======================================
>>>>> A burst in DPDK compression is an array of operations where each op carry independent set of data. i.e. a burst can look like:
>>>>>              ---------------------------------------------------------------------------------
>>>>> enque_burst (|op1.no_flush | op2.no_flush | op3.flush_final | op4.no_flush | op5.no_flush |)
>>>>>              ---------------------------------------------------------------------------------
>>>>> Where, op1 .. op5 are all independent of each other and carry entirely different set of data.
>>>>> Each op can be attached to same/different session but *must* be attached to different stream.
>>>>> Each op (struct rte_comp_op) carry compression/decompression operational parameter and is both an input/output parameter.
>>>>> PMD gets source, destination and checksum information at input and update it with bytes consumed and produced and checksum at output.
>>>>> Since each operation in a burst is independent and thus can complete out-of-order, applications which need ordering, should setup per-op user data area with reordering information so that it can determine enqueue order at deque.
>>>>> Also if multiple threads calls enqueue_burst() on same queue pair then it's application onus to use proper locking mechanism to ensure exclusive enqueuing of operations.
>>>>> D. Stateless Vs Stateful
>>>>> ===================
>>>>> Compression API provide RTE_COMP_FF_STATEFUL feature flag for PMD to reflect its support for Stateful operation.
>>>>> Each op carry an op type indicating if it's to be processed stateful or stateless.
>>>>> D.1 Compression API Stateless operation
>>>>> ------------------------------------------------------
>>>>> An op is processed stateless if it has
>>>>> - flush value is set to RTE_FLUSH_FULL or RTE_FLUSH_FINAL (required only on compression side),
>>>>> - op_type set to RTE_COMP_OP_STATELESS
>>>>> - All-of the required input and sufficient large output buffer to store output i.e. OUT_OF_SPACE can never occur.
>>>>> When all of the above conditions are met, PMD initiates stateless processing and releases acquired resources after processing of current operation is complete i.e. full input consumed and full output written.
>> [Fiona] I think 3rd condition conflicts with D1.1 below and anyway cannot be a precondition. i.e. PMD must initiate stateless processing based on RTE_COMP_OP_STATELESS.
>> It can't always know if the output buffer is big enough before processing, it must process the input data and only when it has consumed it all can it know that all the output data fits or doesn't fit in the output buffer.
>>
>> I'd suggest rewording as follows:
>> An op is processed statelessly if op_type is set to RTE_COMP_OP_STATELESS
>> In this case
>> - The flush value must be set to RTE_FLUSH_FULL or RTE_FLUSH_FINAL (required only on compression side),
>> - All of the input data must be in the src buffer
>> - The dst buffer should be sufficiently large enough to hold the expected output
>> The PMD acquires the necessary resources to process the op.
>> After processing of current operation is complete, whether successful or not, it releases acquired resources and no state, history or data is held in the PMD or carried over to subsequent ops.
>> In SUCCESS case full input is consumed and full output written and status is set to RTE_COMP_OP_STATUS_SUCCESS.
>> OUT-OF-SPACE as D1.1 below.
>>
> [Shally] Ok. Agreed.
>
>>>>> Application can optionally attach a stream to such ops. In such case, application must attach different stream to each op.
>>>>> Application can enqueue stateless burst via making consecutive enque_burst() calls i.e. Following is relevant usage:
>>>>> enqueued = rte_comp_enque_burst (dev_id, qp_id, ops1, nb_ops);
>>>>> enqueued = rte_comp_enque_burst(dev_id, qp_id, ops2, nb_ops);
>>>>>
>>>>> *Note - Every call has different ops array i.e. same rte_comp_op array *cannot be re-enqueued* to process next batch of data until previous ones are completely processed.
>>>>> D.1.1 Stateless and OUT_OF_SPACE
>>>>> ------------------------------------------------
>>>>> OUT_OF_SPACE is a condition when output buffer runs out of space and where PMD still has more data to produce. If PMD run into such condition, then it's an error condition in stateless processing.
>>>>> In such case, PMD resets itself and return with status RTE_COMP_OP_STATUS_OUT_OF_SPACE with produced=consumed=0 i.e. no input read, no output written.
>>>>> Application can resubmit an full input with larger output buffer size.
>>>> [Ahmed] Can we add an option to allow the user to read the data that was produced while still reporting OUT_OF_SPACE?
>>>> this is mainly useful for decompression applications doing search.
>>> [Shally] It is there but applicable for stateful operation type (please refer to handling out_of_space under "Stateful Section").
>>> By definition, "stateless" here means that application (such as IPCOMP) knows maximum output size guaranteedly and ensure that uncompressed data size cannot grow more than provided output buffer.
>>> Such apps can submit an op with type = STATELESS and provide full input, then PMD assume it has sufficient input and output and thus doesn't need to maintain any contexts after op is processed.
>>> If application doesn't know about max output size, then it should process it as stateful op i.e. setup op with type = STATEFUL and attach a stream so that PMD can maintain relevant context to handle such condition.
>> [Fiona] There may be an alternative that's useful for Ahmed, while still respecting the stateless concept.
>> In Stateless case where a PMD reports OUT_OF_SPACE in decompression case it could also return consumed=0, produced = x, where x>0. X indicates the amount of valid data which has been written to the output buffer.
>> It is not complete, but if an application wants to search it it may be sufficient.
>> If the application still wants the data it must resubmit the whole input with a bigger output buffer, and decompression will be repeated from the start, it cannot expect to continue on as the PMD has not maintained state, history or data.
>> I don't think there would be any need to indicate this in capabilities, PMDs which cannot provide this functionality would always return produced=consumed=0, while PMDs which can could set produced > 0.
>> If this works for you both, we could consider a similar case for compression.
>>
> [Shally] Sounds Fine to me. Though then in that case, consume should also be updated to actual consumed by PMD.
> Setting consumed = 0 with produced > 0 doesn't correlate.
[Ahmed] I like Fiona's suggestion, but I also do not like the implication
of returning consumed = 0. At the same time, returning consumed = y
implies to the user that it can proceed from the middle. I prefer the
consumed = 0 implementation, but I think a different return status is
needed to distinguish it from an OUT_OF_SPACE that the user can recover
from. Perhaps OUT_OF_SPACE_RECOVERABLE and OUT_OF_SPACE_TERMINATED. This
also allows future PMD implementations to provide recoverability even in
STATELESS mode if they so wish.
In this model STATELESS or STATEFUL would be a
hint for the PMD implementation to make optimizations for each case, but
it does not force the PMD implementation to limit functionality if it
can provide recoverability.
>
>>>>> D.2 Compression API Stateful operation
>>>>> ----------------------------------------------------------
>>>>> A Stateful operation in DPDK compression means application invokes enqueue burst() multiple times to process related chunk of data either because
>>>>> - Application broke data into several ops, and/or
>>>>> - PMD ran into out_of_space situation during input processing
>>>>>
>>>>> In case of either one or all of the above conditions, PMD is required to maintain state of op across enque_burst() calls and ops are setup with op_type RTE_COMP_OP_STATEFUL, and begin with flush value = RTE_COMP_NO/SYNC_FLUSH and end at flush value RTE_COMP_FULL/FINAL_FLUSH.
>>>>> D.2.1 Stateful operation state maintenance
>>>>> ---------------------------------------------------------------
>>>>> It is always an ideal expectation from application that it should parse through all related chunk of source data making its mbuf-chain and enqueue it for stateless processing.
>>>>> However, if it need to break it into several enqueue_burst() calls, then an expected call flow would be something like:
>>>>> enqueue_burst( |op.no_flush |)
>>>> [Ahmed] The work is now in flight to the PMD. The user will call dequeue burst in a loop until all ops are received. Is this correct?
>>>>
>>>>> deque_burst(op) // should dequeue before we enqueue next
>>> [Shally] Yes. Ideally every submitted op need to be dequeued.
>>> However this illustration is specifically in context of stateful op processing to reflect if a stream is broken into chunks, then each chunk should be submitted as one op at-a-time with type = STATEFUL and need to be dequeued first before next chunk is enqueued.
>>>
>>>>> enqueue_burst( |op.no_flush |)
>>>>> deque_burst(op) // should dequeue before we enqueue next
>>>>> enqueue_burst( |op.full_flush |)
>>>> [Ahmed] Why not allow multiple work items in flight? I understand that occasionally there will be OUT_OF_SPACE exception. Can we just distinguish the response in exception cases?
>>> [Shally] Multiples ops are allowed in flight, however condition is each op in such case is independent of each other i.e. belong to different streams altogether.
>>> Earlier (as part of RFC v1 doc) we did consider the proposal to process all related chunks of data in single burst by passing them as ops array but later found that as not-so-useful for PMD handling for various reasons. You may please refer to RFC v1 doc review comments for same.
>> [Fiona] Agree with Shally. In summary, as only one op can be processed at a time, since each needs the state of the previous, to allow more than 1 op to be in-flight at a time would force PMDs to implement internal queueing and exception handling for OUT_OF_SPACE conditions you mention.
[Ahmed] But we are putting the ops on qps, which would make them
sequential. Handling OUT_OF_SPACE conditions would be a little bit more
complex but doable. The question is: is this mode of use useful for
real-life applications, or would we just be adding complexity?
The technical
advantage of this is that processing of Stateful ops is interdependent,
and PMDs can take advantage of caching and other optimizations to make
processing related ops much faster than switching on every op. PMDs have
to maintain state of more than 32 KB for DEFLATE for every stream.
>> If the application has all the data, it can put it into chained mbufs in a single op rather than multiple ops, which avoids pushing all that complexity down to the PMDs.
[Ahmed] I think that your suggested scheme of putting all related mbufs
into one op may be the best solution without the extra complexity of
handling OUT_OF_SPACE cases, while still allowing the enqueuer extra
time, if we have a way of marking mbufs as ready for consumption. The
enqueuer may not have all the data at hand but can enqueue the op with a
couple of empty mbufs marked as not ready for consumption. The enqueuer
will then update the rest of the mbufs to ready for consumption once the
data is added. This introduces a race condition. A second flag for each
mbuf can be updated by the PMD to indicate whether it has processed it.
This way, in cases where the PMD beat the application to the op, the
application will just update the op to point to the first unprocessed
mbuf and resend it to the PMD.
>>
>>>>> Here an op *must* be attached to a stream and every subsequent enqueue_burst() call should carry *same* stream. Since PMD maintain ops state in stream, thus it is mandatory for application to attach stream to such ops.
>> [Fiona] I think you're referring only to a single stream above, but as there may be many different streams, maybe add the following?
>> Above is simplified to show just a single stream.
>> However there may be many streams, and each enqueue_burst() may contain ops from different streams, as long as there is only one op in-flight from any stream at a given time.
>>
> [Shally] Ok get it.
>
>>>>> D.2.2 Stateful and Out_of_Space
>>>>> --------------------------------------------
>>>>> If PMD support stateful and run into OUT_OF_SPACE situation, then it is not an error condition for PMD. In such case, PMD return with status RTE_COMP_OP_STATUS_OUT_OF_SPACE with consumed = number of input bytes read and produced = length of complete output buffer.
>> [Fiona] - produced would be <= output buffer len (typically =, but could be a few bytes less)
>>
>>
>>>>> Application should enqueue op with source starting at consumed+1 and output buffer with available space.
>>>>
>>>> [Ahmed] Related to OUT_OF_SPACE. What status does the user receive in a decompression case when the end block is encountered before the end of the input? Does the PMD continue decomp? Does it stop there and return the stop index?
>>>>
>>> [Shally] Before I could answer this, please help me understand your use case. When you say "when the end block is encountered before the end of the input?" Do you mean -
>>> "Decompressor process a final block (i.e. has BFINAL=1 in its header) and there's some footer data after that?" Or
>>> you mean "decompressor process one block and has more to process till its final block?"
>>> What is "end block" and "end of input" reference here?
[Ahmed] I meant BFINAL=1 by end block.
The end of input is the end of
the input length.
e.g.
|-------------------------- input length ---------------------------|
|--data----data----data------data-------BFINAL-footer-unrelated data|
>>>
>>>>> D.2.3 Sliding Window Size
>>>>> ------------------------------------
>>>>> Every PMD will reflect in its algorithm capability structure maximum length of Sliding Window in bytes which would indicate maximum history buffer length used by algo.
>>>>> 2. Example API illustration
>>>>> ~~~~~~~~~~~~~~~~~~~~~~~
>>>>>
>> [Fiona] I think it would be useful to show an example of both a STATELESS flow and a STATEFUL flow.
>>
> [Shally] Ok. I can add simplified version to illustrate API usage in both cases.
>
>>>>> Following is an illustration on API usage (This is just one flow, other variants are also possible):
>>>>> 1. rte_comp_session *sess = rte_compressdev_session_create (rte_mempool *pool);
>>>>> 2. rte_compressdev_session_init (int dev_id, rte_comp_session *sess, rte_comp_xform *xform, rte_mempool *sess_pool);
>>>>> 3. rte_comp_op_pool_create(rte_mempool ..)
>>>>> 4. rte_comp_op_bulk_alloc (struct rte_mempool *mempool, struct rte_comp_op **ops, uint16_t nb_ops);
>>>>> 5. for every rte_comp_op in ops[],
>>>>>     5.1 rte_comp_op_attach_session (rte_comp_op *op, rte_comp_session *sess);
>>>>>     5.2 op.op_type = RTE_COMP_OP_STATELESS
>>>>>     5.3 op.flush = RTE_FLUSH_FINAL
>>>>> 6. [Optional] for every rte_comp_op in ops[],
>>>>>     6.1 rte_comp_stream_create(int dev_id, rte_comp_session *sess, void **stream);
>>>>>     6.2 rte_comp_op_attach_stream(rte_comp_op *op, rte_comp_session *stream);
>>>>
>>>> [Ahmed] What is the semantic effect of attaching a stream to every op? will this application benefit for this given that it is setup with op_type STATELESS
>>> [Shally] By role, stream is data structure that hold all information that PMD need to maintain for an op processing and thus it's marked device specific. It is required for stateful processing but optional for stateless as PMD doesn't need to maintain context once op is processed unlike stateful.
>>> It may be of advantage to use stream for stateless to some of the PMD. They can be designed to do one-time per op setup (such as mapping session params) during stream_create() in control path than data path.
>>>
>> [Fiona] yes, we agreed that stream_create() should be called for every session and if it returns non-NULL the PMD needs it so op_attach_stream() must be called.
>> However I've just realised we don't have a STATEFUL/STATELESS param on the xform, just on the op.
>> So we could either add stateful/stateless param to stream_create() ?
>> OR add stateful/stateless param to xform so it would be in session?
> [Shally] No it shouldn't be as part of session or xform as sessions aren't stateless/stateful.
> So, we shouldn't alter the current definition of session or xforms.
> If we need to mention it, then it could be added as part of stream_create() as it's device specific and depending upon op_type() device can then setup stream resources.
>
>> However, Shally, can you reconsider if you really need it for STATELESS or if the data you want to store there could be stored in the session? Or if it's needed per-op does it really need to be visible on the API as a stream or could it be hidden within the PMD?
> [Shally] I would say it is not mandatory but a desirable feature that I am suggesting.
> I am only trying to enable optimization in data path which may be of help to some PMD designs as they can use stream_create() to do setup that are 1-time per op and regardless of op_type, such as I mentioned, setting up user session params to device sess params.
> We can hide it inside PMD however there may be slight overhead in datapath depending on PMD design.
> But I would say, it's not a blocker for us to freeze on current spec. We can revisit this feature later because it will not alter base API functionality.
>
> Thanks
> Shally
>
>>>>> 7.for every rte_comp_op in ops[],
>>>>>     7.1 set up with src/dst buffer
>>>>> 8. enq = rte_compressdev_enqueue_burst (dev_id, qp_id, &ops, nb_ops);
>>>>> 9. do while (dqu < enq) // Wait till all of enqueued are dequeued
>>>>>     9.1 dqu = rte_compressdev_dequeue_burst (dev_id, qp_id, &ops, enq);
>>>> [Ahmed] I am assuming that waiting for all enqueued to be dequeued is not strictly necessary, but is just the chosen example in this case
>>>>
>>> [Shally] Yes. By design, for burst_size>1 each op is independent of each other. So app may proceed as soon as it dequeue any.
>>>
>>>>> 10. Repeat 7 for next batch of data
>>>>> 11. for every ops in ops[]
>>>>>     11.1 rte_comp_stream_free(op->stream);
>>>>> 11. rte_comp_session_clear (sess) ;
>>>>> 12. rte_comp_session_terminate(ret_comp_sess *session)
>>>>>
>>>>> Thanks
>>>>> Shally
>>>>>
>>>>>
>
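To make the OUT_OF_SPACE_RECOVERABLE / OUT_OF_SPACE_TERMINATED idea from my reply above a bit more concrete, here is a minimal sketch of how an application might react to each status. Everything in it is an assumption for illustration only: the sketch_* names, the enum values and the result struct are hypothetical and are not part of the RFC or of any existing DPDK header. The thread so far only defines a single RTE_COMP_OP_STATUS_OUT_OF_SPACE.

```c
#include <stddef.h>

/* Hypothetical status codes. The RFC defines one OUT_OF_SPACE status;
 * the RECOVERABLE/TERMINATED split is only the proposal under discussion. */
enum sketch_status {
    SKETCH_STATUS_SUCCESS,
    SKETCH_STATUS_OUT_OF_SPACE_RECOVERABLE, /* PMD kept state, resume possible */
    SKETCH_STATUS_OUT_OF_SPACE_TERMINATED   /* PMD dropped state, restart needed */
};

/* Hypothetical subset of the per-op result fields discussed in the thread. */
struct sketch_result {
    enum sketch_status status;
    size_t consumed; /* input bytes the PMD reports as consumed */
    size_t produced; /* valid output bytes written to the dst buffer */
};

enum sketch_action {
    SKETCH_DONE,
    SKETCH_RESUME,  /* resubmit the remaining input with a fresh dst buffer */
    SKETCH_RESTART  /* resubmit the whole input with a larger dst buffer */
};

/* Decide what the application does next and from which source offset
 * (relative to the start of this op's input) it should resubmit. */
static enum sketch_action
sketch_next_action(const struct sketch_result *res, size_t *resume_offset)
{
    switch (res->status) {
    case SKETCH_STATUS_SUCCESS:
        *resume_offset = res->consumed;
        return SKETCH_DONE;
    case SKETCH_STATUS_OUT_OF_SPACE_RECOVERABLE:
        /* State is held by the PMD, so continue after the consumed bytes. */
        *resume_offset = res->consumed;
        return SKETCH_RESUME;
    case SKETCH_STATUS_OUT_OF_SPACE_TERMINATED:
    default:
        /* The produced bytes may still be inspected (e.g. searched), but
         * processing cannot continue from the middle of the input. */
        *resume_offset = 0;
        return SKETCH_RESTART;
    }
}
```

With this split, consumed = 0 together with produced > 0 is no longer ambiguous: the status alone tells the application whether the partial output is a resumable checkpoint or only a preview of a run that must be repeated from the start.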