From: Ahmed Mansour
To: "Verma, Shally", "Trahe, Fiona", "dev@dpdk.org"
CC: "Athreya, Narayana Prasad", "Gupta, Ashish", "Sahu, Sunila", "De Lara Guarch, Pablo", "Challa, Mahipal", "Jain, Deepak K", Hemant Agrawal, Roy Pledge, Youri Querry
Date: Tue, 9 Jan 2018 19:07:49 +0000
Subject: Re: [dpdk-dev] [RFC v2] doc compression API for DPDK
List-Id: DPDK patches and discussions

Hi Shally,

Thanks for the summary. It is very helpful. Please see comments below.

On 1/4/2018 6:45 AM, Verma, Shally wrote:
> This is an RFC v2 document summarizing the current understanding of, and requirements for, the compression API proposal in DPDK.
> It is based on "[RFC v3] Compression API in DPDK" (http://dpdk.org/dev/patchwork/patch/32331/).
> The intention of this document is to align on the concepts built into the compression API and its usage, and to identify further requirements.
>
> Going further, it could serve as a base for the Compression Module Programmer Guide.
>
> Current scope is limited to:
> - definition of the terminology which makes up the foundation of the compression API
> - typical API flow expected to be used by applications
> - Stateless and Stateful operation definition and usage, after RFC v1 doc review: http://dev.dpdk.narkive.com/CHS5l01B/dpdk-dev-rfc-v1-doc-compression-api-for-dpdk
>
> 1. Overview
> ~~~~~~~~~~~
>
> A. Compression Methodologies in compression API
> ===============================================
> DPDK compression supports two types of compression methodologies:
> - Stateless - each data object is compressed individually without any reference to previous data,
> - Stateful - each data object is compressed with reference to the previous data object, i.e.
>   history of data is needed for compression / decompression.
> For more explanation, please refer to RFC 1951: https://www.ietf.org/rfc/rfc1951.txt
>
> To support both methodologies, DPDK compression introduces two key concepts: Session and Stream.
>
> B. Notion of a session in compression API
> =========================================
> A Session in DPDK compression is a logical entity which is set up one time with immutable parameters, i.e. parameters that don't change across operations and devices.
> A session can be shared across multiple devices and multiple operations simultaneously.
> Typical Session parameters include info such as:
> - compress / decompress
> - compression algorithm and associated configuration parameters
>
> An application can create different sessions on a device, initialized with the same or different xforms. Once a session is initialized with one xform it cannot be re-initialized.
>
> C. Notion of stream in compression API
> ======================================
> Unlike a session, which carries a common set of information across operations, a stream in DPDK compression is a logical entity which identifies a related set of operations and carries operation-specific information as needed by the device during its processing.
> It is a device-specific data structure which is opaque to the application, set up and maintained by the device.
>
> A stream can be used with *only* one op at a time, i.e.
> no two operations can share the same stream simultaneously.
> A stream is a *must* for stateful op processing and optional for stateless (please see the respective sections for more details).
>
> This enables sharing of a session by multiple threads handling different data sets, as each op carries its own context (internal states, history buffers et al.) in its attached stream.
> The application should call rte_comp_stream_create() and attach the stream to the op before beginning operation processing, and free it via rte_comp_stream_free() after it is complete.
>
> C. Notion of burst operations in compression API
> ================================================
> A burst in DPDK compression is an array of operations where each op carries an independent set of data, i.e. a burst can look like:
>
>              ---------------------------------------------------------------------------------
> enque_burst (| op1.no_flush | op2.no_flush | op3.flush_final | op4.no_flush | op5.no_flush |)
>              ---------------------------------------------------------------------------------
>
> Where op1 .. op5 are all independent of each other and carry entirely different sets of data.
> Each op can be attached to the same or a different session but *must* be attached to a different stream.
>
> Each op (struct rte_comp_op) carries compression/decompression operational parameters and is both an input and an output parameter.
> The PMD gets source, destination and checksum information at input and updates it with bytes consumed and produced and the checksum at output.
>
> Since each operation in a burst is independent and thus can complete out of order, applications which need ordering should set up a per-op user data area with reordering information so that they can determine the enqueue order at dequeue.
>
> Also, if multiple threads call enqueue_burst() on the same queue pair, then it's the application's onus to use a proper locking mechanism to ensure exclusive enqueuing of operations.
>
> D. Stateless Vs Stateful
> ========================
> The compression API provides the RTE_COMP_FF_STATEFUL feature flag for a PMD to reflect its support for Stateful operation. Each op carries an op type indicating whether it is to be processed stateful or stateless.
>
> D.1 Compression API Stateless operation
> ---------------------------------------
> An op is processed stateless if it has
> - flush value set to RTE_FLUSH_FULL or RTE_FLUSH_FINAL (required only on the compression side),
> - op_type set to RTE_COMP_OP_STATELESS,
> - all of the required input and a sufficiently large output buffer to store the output, i.e. OUT_OF_SPACE can never occur.
>
> When all of the above conditions are met, the PMD initiates stateless processing and releases acquired resources after processing of the current operation is complete, i.e. full input consumed and full output written.
> The application can optionally attach a stream to such ops. In that case, the application must attach a different stream to each op.
>
> The application can enqueue stateless bursts by making consecutive enque_burst() calls, i.e.
> Following is relevant usage:
>
> enqueued = rte_comp_enque_burst(dev_id, qp_id, ops1, nb_ops);
> enqueued = rte_comp_enque_burst(dev_id, qp_id, ops2, nb_ops);
>
> *Note - Every call has a different ops array, i.e. the same rte_comp_op array *cannot be re-enqueued* to process the next batch of data until the previous ones are completely processed.
>
> D.1.1 Stateless and OUT_OF_SPACE
> --------------------------------
> OUT_OF_SPACE is a condition where the output buffer runs out of space while the PMD still has more data to produce. If a PMD runs into such a condition, it is an error condition in stateless processing.
> In such a case, the PMD resets itself and returns with status RTE_COMP_OP_STATUS_OUT_OF_SPACE with produced=consumed=0, i.e. no input read, no output written.
> The application can resubmit the full input with a larger output buffer size.

[Ahmed] Can we add an option to allow the user to read the data that was produced while still reporting OUT_OF_SPACE?
This is mainly useful for decompression applications doing search.

> D.2 Compression API Stateful operation
> --------------------------------------
> A Stateful operation in DPDK compression means the application invokes enqueue_burst() multiple times to process related chunks of data, either because
> - the application broke the data into several ops, and/or
> - the PMD ran into an out_of_space situation during input processing
>
> In case of either one or all of the above conditions, the PMD is required to maintain the state of the op across enque_burst() calls, and
> ops are set up with op_type RTE_COMP_OP_STATEFUL, begin with flush value = RTE_COMP_NO/SYNC_FLUSH and end at flush value RTE_COMP_FULL/FINAL_FLUSH.
>
> D.2.1 Stateful operation state maintenance
> ------------------------------------------
> It is always an ideal expectation from the application that it should parse through all related chunks of source data, making its mbuf-chain, and enqueue it for stateless processing.
> However, if it needs to break it into several enqueue_burst() calls, then an expected call flow would be something like:
>
> enqueue_burst( |op.no_flush |)

[Ahmed] The work is now in flight to the PMD. The user will call dequeue burst in a loop until all ops are received. Is this correct?

> deque_burst(op) // should dequeue before we enqueue next
> enqueue_burst( |op.no_flush |)
> deque_burst(op) // should dequeue before we enqueue next
> enqueue_burst( |op.full_flush |)

[Ahmed] Why not allow multiple work items in flight? I understand that occasionally there will be an OUT_OF_SPACE exception. Can we just distinguish the response in exception cases?

>
> Here an op *must* be attached to a stream and every subsequent enqueue_burst() call should carry the *same* stream.
> Since the PMD maintains op state in the stream, it is mandatory for the application to attach a stream to such ops.
>
> D.2.2 Stateful and Out_of_Space
> -------------------------------
> If a PMD supports stateful operation and runs into an OUT_OF_SPACE situation, then it is not an error condition for the PMD. In such a case, the PMD returns with status RTE_COMP_OP_STATUS_OUT_OF_SPACE with consumed = number of input bytes read and produced = length of complete output buffer.
> The application should enqueue the op with source starting at consumed+1 and an output buffer with available space.

[Ahmed] Related to OUT_OF_SPACE. What status does the user receive in a decompression case when the end block is encountered before the end of the input? Does the PMD continue decomp? Does it stop there and return the stop index?

>
> D.2.3 Sliding Window Size
> -------------------------
> Every PMD will reflect in its algorithm capability structure the maximum length of the Sliding Window in bytes, which would indicate the maximum history buffer length used by the algo.
>
> 2. Example API illustration
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Following is an illustration of API usage (this is just one flow; other variants are also possible):
> 1. rte_comp_session *sess = rte_compressdev_session_create (rte_mempool *pool);
> 2. rte_compressdev_session_init (int dev_id, rte_comp_session *sess, rte_comp_xform *xform, rte_mempool *sess_pool);
> 3. rte_comp_op_pool_create(rte_mempool ..)
> 4. rte_comp_op_bulk_alloc (struct rte_mempool *mempool, struct rte_comp_op **ops, uint16_t nb_ops);
> 5. for every rte_comp_op in ops[],
>     5.1 rte_comp_op_attach_session (rte_comp_op *op, rte_comp_session *sess);
>     5.2 op.op_type = RTE_COMP_OP_STATELESS
>     5.3 op.flush = RTE_FLUSH_FINAL
> 6. [Optional] for every rte_comp_op in ops[],
>     6.1 rte_comp_stream_create(int dev_id, rte_comp_session *sess, void **stream);
>     6.2 rte_comp_op_attach_stream(rte_comp_op *op, rte_comp_session *stream);

[Ahmed] What is the semantic effect of attaching a stream to every op? Will this application benefit from this, given that it is set up with op_type STATELESS?

> 7. for every rte_comp_op in ops[],
>     7.1 set up with src/dst buffer
> 8. enq = rte_compressdev_enqueue_burst (dev_id, qp_id, &ops, nb_ops);
> 9. do while (dqu < enq) // Wait till all of enqueued are dequeued
>     9.1 dqu = rte_compressdev_dequeue_burst (dev_id, qp_id, &ops, enq);

[Ahmed] I am assuming that waiting for all enqueued to be dequeued is not strictly necessary, but is just the chosen example in this case.

> 10. Repeat 7 for next batch of data
> 11. for every ops in ops[]
>     11.1 rte_comp_stream_free(op->stream);
> 12. rte_comp_session_clear (sess) ;
> 13. rte_comp_session_terminate(ret_comp_sess *session)
>
> Thanks
> Shally
>
>
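
To make the D.2.2 discussion concrete, here is a minimal self-contained sketch in C of the application-side loop that resubmits a stateful op after each OUT_OF_SPACE. It is only a mock under stated assumptions and not the proposed rte_comp API: struct mock_op, mock_process(), run_stateful() and the MOCK_STATUS_* values are hypothetical names, a plain byte copy stands in for (de)compression, and one call to mock_process() stands in for an enqueue followed by a dequeue. Note that with zero-based offsets the next submission starts at index "consumed"; the RFC's "consumed+1" presumably refers to the ordinal of that byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status values; only the OUT_OF_SPACE idea comes from the RFC. */
enum mock_status { MOCK_STATUS_SUCCESS, MOCK_STATUS_OUT_OF_SPACE };

/* Hypothetical op: both an input and an output parameter, as the RFC
 * describes for struct rte_comp_op. */
struct mock_op {
    const unsigned char *src;   /* in:  start of the unconsumed source */
    size_t src_len;             /* in:  bytes available at src */
    unsigned char *dst;         /* in:  output buffer */
    size_t dst_len;             /* in:  output buffer capacity */
    size_t consumed;            /* out: source bytes read */
    size_t produced;            /* out: bytes written to dst */
    enum mock_status status;    /* out: completion status */
};

/* Stand-in for enqueue + dequeue of one stateful op. A plain copy plays the
 * role of (de)compression so that the bookkeeping stays visible. */
static void mock_process(struct mock_op *op)
{
    size_t n = op->src_len < op->dst_len ? op->src_len : op->dst_len;

    memcpy(op->dst, op->src, n);
    op->consumed = n;
    op->produced = n;
    op->status = n < op->src_len ? MOCK_STATUS_OUT_OF_SPACE
                                 : MOCK_STATUS_SUCCESS;
}

/* Application loop: drain the output buffer and resubmit after every
 * OUT_OF_SPACE until the input is fully consumed. Returns the number of
 * submissions, or 0 if the sink is too small to hold the output. */
static unsigned run_stateful(const unsigned char *in, size_t in_len,
                             unsigned char *sink, size_t sink_cap,
                             size_t out_buf_len, size_t *total_out)
{
    unsigned char out_buf[64];
    size_t off = 0;       /* zero-based offset of the next unread source byte */
    size_t written = 0;   /* bytes copied into sink so far */
    unsigned submissions = 0;
    struct mock_op op;

    assert(out_buf_len > 0 && out_buf_len <= sizeof(out_buf));

    do {
        op.src = in + off;
        op.src_len = in_len - off;
        op.dst = out_buf;
        op.dst_len = out_buf_len;
        mock_process(&op);
        submissions++;

        if (written + op.produced > sink_cap)
            return 0;
        memcpy(sink + written, out_buf, op.produced);
        written += op.produced;
        off += op.consumed;   /* continue where the previous op stopped */
    } while (op.status == MOCK_STATUS_OUT_OF_SPACE);

    *total_out = written;
    return submissions;
}
```

For example, calling run_stateful() on a 10-byte input with a 4-byte output buffer takes three submissions: two end in OUT_OF_SPACE and the third completes. The sketch keeps only one op in flight per stream, which matches the flow quoted in D.2.1; it does not attempt to model multiple work items in flight.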