From: "Verma, Shally" <Shally.Verma@cavium.com>
To: "Trahe, Fiona", "dev@dpdk.org"
CC: "Daly, Lee", "Jozwiak, TomaszX", Akhil Goyal, "Sahu, Sunila", "Gupta, Ashish"
Date: Sat, 15 Sep 2018 11:31:47 +0000
Subject: Re: [dpdk-dev] compressdev: append dest data in PMDs instead of in application
Hi Fiona

>-----Original Message-----
>From: Trahe, Fiona
>Sent: 11 September 2018 22:47
>To: dev@dpdk.org
>Cc: Trahe, Fiona; Verma, Shally; Daly, Lee; Jozwiak, TomaszX; Akhil Goyal
>Subject: RFC: compressdev: append dest data in PMDs instead of in application
>
>External Email
>
>Proposal:
>In compressdev, move the responsibility for appending data in the destination
>mbuf from the application to the PMD.
>
>Why:
>Compression operations are all out-of-place and the output size is unknown.
>Source and destination mbufs are passed to the PMDs.
>There's no problem with the src, but there is with the destination.
>The application allocates the dest mbuf from the pool, guesses how much of
>the buffer the PMD will write with output and calls rte_pktmbuf_append() for this amount.
>No data is actually copied into the dest mbuf by the application.
>
>The PMD writes output data to the destination mbuf, and returns the amount written in the
>op.produced parameter.
>
>Throughout this the mbuf is not consistent with expectations, i.e. a call to
>rte_pktmbuf_data_len() will NOT return the actual amount of data in the buffer. A call to
>rte_pktmbuf_append() will NOT add space at the end of the existing data. rte_pktmbuf_tailroom()
>will not return how much space is available at the end of the data.
>Also, some PMDs, e.g. ISA-L, need scratch space at the end of the output buffer for checksum
>calculation. So though the appl has appended a specific amount in the expectation that it
>can be used for compressed data, the PMD needs to reduce this by 4 bytes to reserve
>space for the checksum - even though there may be space to append another 4 bytes.
>

I agree this is an issue. However, before we start on this, I have a scenario and a question based on it (you've probably already thought about this, but I couldn't work out the limitations here).

Given all the limitations of the mbuf library (especially around SGLs) with respect to compression op behaviour, and the complexity this is going to add to PMDs, why can't we just take the length of the dst buffer as input from the app and leave all mbuf management to the app, *if it's using an mbuf data buffer*? I.e. consider this scenario:

The app allocates an rte_mbuf and calls rte_pktmbuf_tailroom() to set dst.len to the amount of free space. The PMD sets up the HW according to the offset, data pointer and length given. When the op is done, the PMD updates op.produced and the app calls rte_pktmbuf_append() on the mbuf to update m->data_len/pkt_len.

Now there are the following possibilities for op.produced:
- if op.produced < tailroom size, the app calls rte_pktmbuf_tailroom() again to find the leftover free space, and passes offset += op.produced and the new length in the next call
- if op.produced == tailroom size, the app allocates a new mbuf, chains it to the previous mbuf, calls rte_pktmbuf_tailroom() on the new mbuf, and sets offset += op.produced and length = free space in the new mbuf. The PMD will now get a chained mbuf at input, and the offset will end up in the newly chained buffer.

This, of course, means the PMD can see a chained mbuf, but with only the last buffer in the chain available for use. This is a limitation, but only for as long as we're working with the mbuf library, which was originally designed for other purposes and is not well suited to compression-like operations.

Why I believe this is simpler to manage:
- It keeps the PMD implementation simple and lives within the limitations/expectations of the mbuf library.
- Having the length at input will also be a requirement if we add any other buffer structure types.
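In rough, untested C, the app-side flow I'm describing would look something like the below. Note this is only a sketch: dst.len does not exist in rte_comp_op today (it's the field this suggestion would add), and do_compress_op() is a hypothetical stand-in for the real enqueue/dequeue pair.

#include <rte_mbuf.h>
#include <rte_comp.h>

/* Hypothetical helper: enqueue the op and wait for its dequeue. */
static void do_compress_op(struct rte_comp_op *op);

static void
app_managed_dst(struct rte_comp_op *op, struct rte_mempool *pool)
{
	struct rte_mbuf *m = rte_pktmbuf_alloc(pool);
	uint16_t room = rte_pktmbuf_tailroom(m);

	op->m_dst = m;
	op->dst.offset = 0;
	op->dst.len = room;	/* proposed field: free space the PMD may use */

	do_compress_op(op);

	/* Reflect what the PMD wrote in the mbuf's own length fields. */
	rte_pktmbuf_append(m, op->produced);

	if (op->produced == room) {
		/* dst was possibly too small: chain a fresh mbuf and
		 * resubmit, with the offset landing in the new segment. */
		struct rte_mbuf *m2 = rte_pktmbuf_alloc(pool);

		rte_pktmbuf_chain(m, m2);
		op->dst.offset += op->produced;
		op->dst.len = rte_pktmbuf_tailroom(m2);
		/* ... continue the op from here ... */
	}
}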
Although I am not sure if it can solve the checksum issue you mentioned, as I don't fully understand how ISA-L uses the dst buffer for checksum output. Does it write the checksum into the dst buffer for every op, or just when the last chunk of data is processed? We already have a separate checksum variable at input/output, so doesn't it just copy from the dst buffer into the op at output, with op.produced updated with the data length only?

I remember you initially started a discussion about adding a new, more generic buffer structure type to use in place of rte_mbufs for the compression library, such as an iovec that simply takes pointers to raw buffers with lengths. Aren't we thinking in that direction anymore? Rather than reworking a library/PMDs based around mbufs (unless we have a requirement to support mbufs), I believe we should define a buffer variant better suited to compression and its users' needs. It would also allow the lib to take multiple dst buffers at input, either as an array or as an SGL.

Thanks
Shally

>It seems more logical that the PMD should take responsibility for appending.
>I.e. the application should pass in an empty mbuf, the PMD uses rte_pktmbuf_tailroom()
>to know what space is available, uses this space as it needs and appends the output data to
>match the actual amount of data it writes.
>This method caters for an mbuf already containing some data; in this case the PMD will
>append the output AFTER the data already in the mbuf.
>It also needs to cater for the SGL aka chained-mbuf case, though I'd propose in this case only
>the first mbuf in the chain is allowed to already contain data, the rest must be empty.
>An application can adjust a chain to satisfy this condition.
>
>Code impacts:
> * rte_comp_op.dst.offset should be deprecated.
> * comments on the m_dst would be changed to describe this usage
> * applications should alloc destination mbuf(s) from a pool, but not append.
> * PMDs should use append() to correctly reflect the amount of data returned.
>
>This turns out to be more complex than expected for SGLs, as rte_mbuf chains DON'T support
>empty mbuf chains, i.e. several macros assume there's only tailroom available in the last
>segment of a chain. So essentially, if an application chains a bunch of empty mbufs
>together, the only tailroom available is the buf_len of the last mbuf and the space in
>all the other mbufs is "lost".
>We've investigated several ways around this. I think either option 1 or 2 would be ok, but
>I'm not keen on option 3. I'm looking for feedback please.
>
>Option 1: create an rte_comp_mbuf, identical to rte_mbuf - probably just cast to it - with its own set of
> macros. Most of these wrap existing mbuf macros, apart from those that need to cater for the empty chain case.
>Option 2: pass in a pool on the API instead and the PMD allocs dest mbufs as needed and chains them.
> Storage customers expect to use external buffers attached to mbufs, so this may not suit their use-case.
> Although it may be an option if all mbufs in the pool are pre-attached to external buffers.
>Option 3: the appl does as today. The PMD trims the space not used. This would need changes or additions to mbuf macros
> so it could trim from more than just the last segment. The PMD would need to free unused segments.
> I'm not keen on this as it does the work in 2 places, and there's potential
> to leak mbufs here as they're allocated in the API and freed in the PMD.
>
>Regards,
>Fiona
>
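For concreteness, the PMD-side responsibility in the proposal would be roughly as below. This is an illustrative sketch only: pmd_write_output() is a hypothetical name, and it assumes the output is staged in a plain buffer 'out', where a real PMD would usually DMA straight into the destination mbuf.

#include <errno.h>
#include <string.h>
#include <rte_mbuf.h>
#include <rte_comp.h>

static int
pmd_write_output(struct rte_comp_op *op, const uint8_t *out, uint16_t n)
{
	struct rte_mbuf *m = op->m_dst;
	char *dst;

	/* Tailroom, not data_len, bounds what the PMD may write; a PMD
	 * needing scratch space (e.g. 4 bytes for a checksum) would
	 * subtract it from the available room here. */
	if (n > rte_pktmbuf_tailroom(m))
		return -ENOSPC;

	/* append() advances data_len/pkt_len and returns a pointer to
	 * the newly reserved region, so the mbuf stays consistent. */
	dst = rte_pktmbuf_append(m, n);
	if (dst == NULL)
		return -ENOSPC;

	memcpy(dst, out, n);
	op->produced = n;
	return 0;
}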