From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ahmed Mansour
To: "Trahe, Fiona", "Verma, Shally", "dev@dpdk.org"
CC: "De Lara Guarch, Pablo", "Athreya, Narayana Prasad", "Gupta, Ashish", "Sahu, Sunila", "Challa, Mahipal", "Jain, Deepak K", Hemant Agrawal, Roy Pledge, Youri Querry
Subject: Re: [dpdk-dev] [PATCH] compressdev: implement API
Date: Sat, 3 Mar 2018 00:52:26 +0000
References: <1517595924-25963-1-git-send-email-fiona.trahe@intel.com> <12544144.czVLKRyaz4@xps> <348A99DA5F5B7549AA880327E580B43589325187@IRSMSX101.ger.corp.intel.com> <348A99DA5F5B7549AA880327E580B4358932983C@IRSMSX101.ger.corp.intel.com>
List-Id: DPDK patches and discussions

On 2/28/2018 1:39 PM, Trahe, Fiona wrote:
> Hi Ahmed, Shally,
>
> So just to capture what we concluded in the call today:
>
> - There's no requirement for a device-agnostic session to facilitate load-balancing.
> - For stateful data a stream is compulsory. Xform is passed to stream on creation.
>   So no need for a session in stateful op.
>
> Re session data for stateless ops:
> - All PMDs could cope with just passing in a xform with a stateless op. But it might
>   not be performant.
> - Some PMDs need to allocate some resources which can only be used by one op
>   at a time. For stateful ops these resources can be attached to a stream. For stateless
>   they could allocate the resources on each op enqueue, but it would be better if
>   the resources were setup once based on the xform and could be re-used on ops,
>   though only by one op at a time.
> - Some PMDs don't need to allocate such resources, but could benefit by
>   setting up some pmd data based on the xform. This data would not be
>   constrained, could be used in parallel by any op or qp of the device.
> - The name pmd_stateless_data was not popular, maybe something like
>   xform_private_data can be used. On creation of this data, the PMD can return
>   an indication of whether it should be used by one op at a time or shared.
>
> So I'll
> - remove the session completely from the API.
> - add an initialiser API for the data to be attached to stateless ops
> - add a union to the op:
>
> union {
>     void *pmd_private_xform;
>     /**< Stateless private PMD data derived from an rte_comp_xform
>      * rte_comp_xform_init() must be called on a device
>      * before sending any STATELESS operations. The PMD returns a handle
>      * which must be attached to subsequent STATELESS operations.
>      * The PMD also returns a flag, if this is COMP_PRIVATE_XFORM_SHAREABLE
>      * then the xform can be attached to multiple ops at the same time,
>      * if it's COMP_PRIVATE_XFORM_SINGLE_OP then it can only be
>      * be used on one op at a time, other private xforms must be initialised
>      * to send other ops in parallel.
>      */
>     void *stream;
>     /* Private PMD data derived initially from an rte_comp_xform, which holds state
>      * and history data and evolves as operations are processed.
>      * rte_comp_stream_create() must be called on a device for all STATEFUL
>      * data streams and the resulting stream attached
>      * to the one or more operations associated with the data stream.
>      * All operations in a stream must be sent to the same device.
>      */
> }
>
> Previous startup flow before sending a stateful op:
> rte_comp_get_private_size(devid)
> rte_comp_mempool_create() - returns sess_pool
> rte_comp_session_create(sess_pool)
> rte_comp_session_init(devid, sess, sess_pool, xform)
> rte_comp_stream_create(devid, sess, **stream, op_type)
>
> simplified to:
> rte_comp_xform_init(devid, xform, **priv_xform, *flag) - returns handle and flag
> (pool is within the PMD)
>
> Note, I don't think we bottomed out on removing the xform from the union, but I don't
> think we need it with above solution.
>
> Other discussion:
> - we should document on API that qp is not thread-safe, so enqueue
>   and dequeue should be performed by same thread.
[Ahmed] - I understand a qp should represent a single software user.
This is good because we will not have to add locking as you mentioned,
but are you sure that dequeues cannot be performed by another thread
without adding significant overhead? it would enable producer consumer
applications, and the name queue pair implies some independence.
- Another question. I want to be sure.
A qp can be used to send both
compress and decompress ops.
- Nitpick: The name queue pair implies order preservation to the user.
Maybe we should change it to something that does not imply that.
>
> device and qp flow:
> - dev_info_get() - application reads device capabilities, including the max qps the device can support.
> - dev_config() - application specifies how many qps it intends to use - typically one per thread, must be < device max
> - qp_setup() - called per qp. Creates the qp based on the size indicated by max_inflights
> - dev_start() - once started device can't be reconfigured, must call dev_stop to reconfigure.
>
>
> Regards,
> Fiona
>
>> -----Original Message-----
>> From: Verma, Shally [mailto:Shally.Verma@cavium.com]
>> Sent: Tuesday, February 27, 2018 5:54 AM
>> To: Ahmed Mansour; Trahe, Fiona; dev@dpdk.org
>> Cc: De Lara Guarch, Pablo; Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila;
>> Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
>> Subject: RE: [dpdk-dev] [PATCH] compressdev: implement API
>>
>>
>>
>>> -----Original Message-----
>>> From: Ahmed Mansour [mailto:ahmed.mansour@nxp.com]
>>> Sent: 27 February 2018 03:05
>>> To: Verma, Shally; Trahe, Fiona; dev@dpdk.org
>>> Cc: De Lara Guarch, Pablo; Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila;
>>> Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
>>> Subject: Re: [dpdk-dev] [PATCH] compressdev: implement API
>>>
>>>> Hi Fiona, Ahmed
>>>>> Hi Fiona,
>>>>>
>>>>> Thanks for starting this discussion. In the current API the user must
>>>>> make 12 API calls just to get information to compress. Maybe there is a
>>>>> way to simplify. At least for some use cases (stateless).
>>>>> I think a call
>>>>> sometime next week would be good to help clarify coalesce some of the
>>>>> complexity.
>>>>>
>>>>> I added specific comments inline.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Ahmed
>>>>>
>>>>> On 2/21/2018 2:12 PM, Trahe, Fiona wrote:
>>>>>> We've been struggling with the idea of session in compressdev.
>>>>>>
>>>>>> Is it really a session?
>>>>>> - It's not in the same sense as cryptodev where it's used to hold a key, and maps to a Security
>>>>>>   Association.
>>>>>> - It's a set of immutable data that is needed with the op and stream to perform the operation.
>>>>>> - It inherited from cryptodev the ability to be set up for multiple driver types and used across any
>>>>>>   devices of those types. For stateful ops this facility can't be used.
>>>>>>   For stateless we don't think it's important, and think it's unlikely to be used.
>>>>>> - Drivers use it to prepare private data, set up resources, do pre-work, so there's
>>>>>>   less work to be done on the data path. Initially we didn't have a stream, we do now,
>>>>>>   this may be a better alternative place for that work.
>>>>>> So we've been toying with the idea of getting rid of the session.
>>>>> [Ahmed] In our proprietary API the stream and session are one. A session
>>>>> holds many properties like the op-type, instead of having this
>>>>> information in the op itself. This way we lower the per op setup cost.
>>>>> This also allows rapid reuse of stateful infrastructure, once a stream
>>>>> is closed on a stateful session, the next op (stream) on this session
>>>>> reuses the stateful storage.
>>>>> Obviously if a stream is in "pause mode" on
>>>>> a session, all following ops that may be unrelated to this
>>>>> stream/session must also wait until this current stream is closed or
>>>>> aborted before the infrastructure can be reused.
>>>>>> We also struggle with the idea of setting up a stream for stateless ops.
>>>>>> - Well, really I just think the name is misleading, i.e. there's no problem with setting
>>>>>>   up some private PMD data to use with stateless operations, just calling it a
>>>>>>   stream doesn't seem right.
>>>>> [Ahmed] I agree. The op has all the necessary information to process it
>>>>> in the current API? Both the stream and the op are one time use. We
>>>>> can't attach multiple similar ops to a single stream/session and rely on
>>>>> their properties to simplify op setup, so why the hassle.
>>>> [Shally] As per my knowledge, session came with idea in DPDK, if system has multiple devices setup
>>>> to do similar jobs then application can fan out ops to any of them for load-balancing. Though it is
>>>> not possible for stateful ops but it still can work for stateless.
>>>> If there's an application which only have stateless ops to process then I see this is still useful
>>>> feature to support.
>>> [Ahmed] Is there an advantage to exposing load balancing to the user? I
>>> do not see load balancing as a feature within itself. Can the PMD take
>>> care of this?
>>> I guess a system that has
>> [Shally] I assume idea was to leverage multiple PMDs that are available in system (say QAT+SW ZLIB)
>> and I believe matter of load-balancing came out of one of the earlier discussion with Fiona on RFC v1.
>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdev.dpdk.narkive.com%2FCHS5l01B%2Fdpdk-dev-rfc-v1-doc-compression-api-for-dpdk%23post3&data=02%7C01%7Cahmed.mansour%40nxp.com%7C4299d16d58144e417f5208d57eda9ba9%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C636554399768115871&sdata=MHWrD0qD%2FMKL%2FjX4j1kOdDeElh0cG1gQj9d1862K4V0%3D&reserved=0
>> So, I wait for her comments on this. But in any case, with changed notion too it looks achievable to me,
>> if so is desired.
>>
>>>> In current proposal, stream logically represent data and hold its specific information and session is
>>>> generic information that can be applied on multiple data. If we want to combine stream and session.
>>>> Then one way to look at this is:
>>>> "let application only allocate and initialize session with rte_comp_xform (and possibly op type)
>>>> information so that PMD can do one-time setup and allocate enough resources. Once attached to op,
>>>> cannot be reused until that op is fully processed. So, if app has 16
>>>> data elements to process in a burst, it will setup 16 sessions."
>>> [Ahmed] Why not allow multiple inflight stateless ops with the same
>>> session? Stateless by definition guarantees that the resources used to
>>> work on one up will be free after the op is processed. That means that
>>> even if an op fails to process correctly on a session, it will have no
>>> effect on the next op since there is not interdependence. This assumes
>>> that the resources are shareable between hardware instances for
>>> stateless.
>>> That is not a bad assumption since hardware should not need
>>> more than the data of the op itself to work on a statelss op.
>> [Shally] multiple ops in-flight can connect to same session but I assume you agree then they cannot
>> execute in parallel i.e. only one op at-a-time can use session here? And as far as I understand your PMD
>> works this way. Your HW execute one op at-a-time from queue?!
>>
>>>> This is same as what Ahmed suggested. For a particular load-balancing case suggested above, If
>>>> application want, can initialize different sessions on multiple devices with same xform so that each
>>>> is prepared to process ops. Application can then fanout stateless ops to multiple devices for
>>>> load-balancing but then it would need to keep map of device & a session map.
>>>> If this sound feasible, then I too believe we can rather get rid of either and keep one (possibly session
>>>> but am open with stream as well).
>>>> However, regardless of case whether we live with name stream or session, I don't see much deviation
>>>> from current API spec except description and few modifications/additions as identified.
>>>> So, then I see it as:
>>>>
>>>> - A stream(or session whichever name is chosen) can be used with only one-op at-a-time
>>>> - It can be re-used when previously attached op is processed
>>>> - if it is stream then currently it is allocated from PMD managed pool whereas Sessions are allocated
>>>>   from application created mempool.
>>>> In either of case, I would expect to review pool management API
>>>>
>>>> With this in mind, below are few of my comments
>>>>
>>>>>> So putting above thoughts together I want to propose:
>>>>>> - Removal of the session and all associated APIs.
>>>>>> - Passing in one of three data types in the rte_comp_op
>>>>>>
>>>>>> union {
>>>>>>     struct rte_comp_xform *xform;
>>>>>>     /**< Immutable compress/decompress params */
>>>>>>     void *pmd_stateless_data;
>>>>>>     /**< Stateless private PMD data derived from an rte_comp_xform
>>>>>>      * rte_comp_stateless_data_init() must be called on a device
>>>>>>      * before sending any STATELESS operations. If the PMD returns a non-NULL
>>>>>>      * value the handle must be attached to subsequent STATELESS operations.
>>>>>>      * If a PMD returns NULL, then the xform should be passed directly to each op
>>>>>>      */
>>>> [Shally] It sounds like stateless_data_init() nothing more than a replacement of session_init().
>>>> So, this is needed neither if we retain session concept nor if we retain stream concept
>>>> (rte_comp_stream_create() with op_type: stateless can serve same purpose).
>>>> It should be sufficient to provide either stream (or session) pointer.
>>>>
>>>>>>     void *stream;
>>>>>>     /* Private PMD data derived initially from an rte_comp_xform, which holds state
>>>>>>      * and history data and evolves as operations are processed.
>>>>>>      * rte_comp_stream_create() must be called on a device for all STATEFUL
>>>>>>      * data streams and the resulting stream attached
>>>>>>      * to the one or more operations associated with the data stream.
>>>>>>      * All operations in a stream must be sent to the same device.
>>>>>>      */
>>>>>> }
>>>>> [Ahmed] I like this setup, but I am not sure in what cases the xform
>>>>> immutable would be used. I understand the other two.
>>>> [Shally] my understanding is xform will be mapped by PMD to its internally managed stream(or
>>>> session data structure). And then we can remove STATEFUL reference here and just say stream(or session)
>>>> it belongs to. However, This condition still apply:
>>>> *All operations that belong to same stream must be sent to the same device.*
>>>>
>>>>>> Notes:
>>>>>> 1. Internally if a PMD wants to use the exact same data structure for both it can do,
>>>>>>    just on the API I think it's better if they're named differently with
>>>>>>    different comments.
>>>>>> 2. I'm not clear of the constraints if any, which attach to the pmd_stateless_data
>>>>>>    For our PMD it would only hold immutable data as the session did, and so
>>>>>>    could be attached to many ops in parallel.
>>>>>>    Is this true for all PMDs or are there constraints which should be called out?
>>>>>>    Is it limited to a specific device, qp, or to be used on one op at a time?
>>>>>> 3. Am open to other naming suggestions, just trying to capture the essence
>>>>>>    of these data structs better than our current API does.
>>>>>>
>>>>>> We would put some more helper fns and structure around the above code if people
>>>>>> are in agreement, just want to see if the concept flies before going further?
>>>>>>
>>>>>> Fiona
>>>>>>
>>>>>>
>>>>>>
>
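For reference, the private-xform handle and the SHAREABLE / SINGLE_OP flag discussed in the summary at the top of this mail could be sketched roughly as below. This is only a mock under stated assumptions: every mock_* name, the fixed-size pool and the needs_exclusive argument are invented for illustration, and nothing here is taken from a DPDK header (rte_comp_xform_init() in this thread is a proposal, not a released API).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the flag values named in the thread. */
enum mock_xform_flag {
    MOCK_PRIVATE_XFORM_SHAREABLE, /* may be attached to many ops at once */
    MOCK_PRIVATE_XFORM_SINGLE_OP  /* only one in-flight op at a time */
};

/* Hypothetical immutable compress/decompress parameters. */
struct mock_xform {
    int level;
};

/* Hypothetical PMD-private data derived from an xform. */
struct mock_private_xform {
    struct mock_xform params;  /* copy of the immutable parameters */
    enum mock_xform_flag flag; /* sharing constraint reported to the app */
    int inflight;              /* ops currently attached */
    bool in_use;               /* pool slot taken */
};

/* "Pool is within the PMD": a tiny static pool stands in for it here. */
#define MOCK_POOL_SIZE 4
static struct mock_private_xform mock_pool[MOCK_POOL_SIZE];

/* Mock of the proposed init call: hands back a handle and a flag.
 * needs_exclusive models a PMD whose resources serve one op at a time. */
int mock_xform_init(const struct mock_xform *xform, bool needs_exclusive,
                    struct mock_private_xform **priv,
                    enum mock_xform_flag *flag)
{
    if (xform == NULL || priv == NULL || flag == NULL)
        return -1;
    for (size_t i = 0; i < MOCK_POOL_SIZE; i++) {
        if (!mock_pool[i].in_use) {
            mock_pool[i].in_use = true;
            mock_pool[i].params = *xform;
            mock_pool[i].inflight = 0;
            mock_pool[i].flag = needs_exclusive
                ? MOCK_PRIVATE_XFORM_SINGLE_OP
                : MOCK_PRIVATE_XFORM_SHAREABLE;
            *priv = &mock_pool[i];
            *flag = mock_pool[i].flag;
            return 0;
        }
    }
    return -1; /* pool exhausted */
}

/* Attach one stateless op; refused if a SINGLE_OP handle is already busy. */
int mock_op_attach(struct mock_private_xform *priv)
{
    if (priv == NULL || !priv->in_use)
        return -1;
    if (priv->flag == MOCK_PRIVATE_XFORM_SINGLE_OP && priv->inflight > 0)
        return -1;
    priv->inflight++;
    return 0;
}

/* Mark one attached op as processed so the handle can be reused. */
void mock_op_complete(struct mock_private_xform *priv)
{
    assert(priv != NULL && priv->inflight > 0);
    priv->inflight--;
}
```

With a SINGLE_OP handle, a second mock_op_attach() fails until mock_op_complete() is called, so an application wanting N such ops in parallel would initialise N handles; a SHAREABLE handle accepts any number of concurrent attaches.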