From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id B6F1EA056A; Thu, 5 Mar 2020 19:26:41 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id CFA712BE3; Thu, 5 Mar 2020 19:26:40 +0100 (CET) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id CC13D3B5 for ; Thu, 5 Mar 2020 19:26:38 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Mar 2020 10:26:37 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,518,1574150400"; d="scan'208";a="413613098" Received: from fmsmsx107.amr.corp.intel.com ([10.18.124.205]) by orsmga005.jf.intel.com with ESMTP; 05 Mar 2020 10:26:36 -0800 Received: from fmsmsx603.amr.corp.intel.com (10.18.126.83) by fmsmsx107.amr.corp.intel.com (10.18.124.205) with Microsoft SMTP Server (TLS) id 14.3.439.0; Thu, 5 Mar 2020 10:26:36 -0800 Received: from fmsmsx608.amr.corp.intel.com (10.18.126.88) by fmsmsx603.amr.corp.intel.com (10.18.126.83) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5; Thu, 5 Mar 2020 10:26:35 -0800 Received: from FMSEDG001.ED.cps.intel.com (10.1.192.133) by fmsmsx608.amr.corp.intel.com (10.18.126.88) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.1713.5 via Frontend Transport; Thu, 5 Mar 2020 10:26:35 -0800 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (104.47.66.46) by edgegateway.intel.com (192.55.55.68) with Microsoft SMTP Server (TLS) id 14.3.439.0; Thu, 5 Mar 2020 10:26:31 -0800 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=f9kMmSrPAUvEJec764cC9J07HDPY7an1k7J5D5g5pbvhqjP+fkhG45m3KlRFTZeOMYr9YKjrQ6UgZWHtnYLpxs4D8e4KgDDiAoTqGW9UNBYfe4uJAto2gnn+Iz2udIy8uZalT5Be0WukUgWFzoOKtZGQcQ5tS5/ZFKOJTg0UgKHmT4u0M0ixNDZXTWy93Hy3VVRf1S8FsmmR/l0+X9E8NrP3+NPQcFHLC3d/cEGrcPGhbCWAdiy4GDnM2fVGtiXRXe3UNufSBlYJfhwWfzxFkWFmVMcSaGV271m1kYmWJOHpSqPoNw98Ryu86GgQj4O5heF2zgHP6OtL1vQ8vMMPIQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=24wBIQVAoFfGeMe/RXxzz5Q+QWncqp9chICCp6uuSfs=; b=dQGpHPUsBFgQkPTs/j//wJp6nozz5rXLNVM4Dl58XGwPByE4cL1ZIIUbp4srCD6D5vFIo7dVdbp9hmun6ABr7tLu3RsColaSrJFfToWayzEG2LHB95g2qotNhwf7B2EVASzoJB4asGo3eJCnR84qJKsjVUgvWK0AlzqkYUhbV0LTD7ps/t0+0CDBlup9ILCYc+6G5pL6xXZdzYfFER4dY2fVg0t+T33uzXCWHkbDsY4ifsi2XXEWSPJuFPdQ3CfqUgzQuewRqdPVmRBxXa8/NHnllDjLalNgrQFDDySy4jeTtvpuPFNNwYJtf5F5BrTJpCuNanFq/+iVr5ss+iQ8eg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel.onmicrosoft.com; s=selector2-intel-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=24wBIQVAoFfGeMe/RXxzz5Q+QWncqp9chICCp6uuSfs=; b=dUQqvrrJ1jn9qTq1gY33jlPEzFaN9owX8W9JTkFD1ysXye7rj2jz15T81/Qxlj66E5UuUAsifGbZ9/H6r8yKqXX+a/TnECkg/PTrUzF8b9xpNsvUPgblU3/BAaXXgW/l/3Q2GGBev1ULlmmEjtxo2wFbhUctZ3gjBoANigzTGwk= Received: from SN6PR11MB2558.namprd11.prod.outlook.com (2603:10b6:805:5d::19) by SN6PR11MB3373.namprd11.prod.outlook.com (2603:10b6:805:c6::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2772.14; Thu, 5 Mar 2020 18:26:30 +0000 Received: from SN6PR11MB2558.namprd11.prod.outlook.com ([fe80::395e:eb75:6ab7:2ba5]) by SN6PR11MB2558.namprd11.prod.outlook.com ([fe80::395e:eb75:6ab7:2ba5%3]) with mapi id 15.20.2772.019; Thu, 5 Mar 2020 18:26:30 +0000 From: "Ananyev, Konstantin" To: Honnappa Nagarahalli , "olivier.matz@6wind.com" CC: Gavin Hu , "dev@dpdk.org" , nd , nd Thread-Topic: [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs Thread-Index: AQHV61KoLY8qxNN4FEGi1nt9XMNavqgt1y+wgAHraACABJNf0IAEyrmAgAEXl3A= Date: Thu, 5 Mar 2020 18:26:30 +0000 Message-ID: References: <20200224203931.21256-1-honnappa.nagarahalli@arm.com> <20200224203931.21256-2-honnappa.nagarahalli@arm.com> In-Reply-To: Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiYzg4ZDkyNjgtMzlmNS00ZDQ4LWI2YTQtYjhjMmIxNWM4Y2UxIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE3LjEwLjE4MDQuNDkiLCJUcnVzdGVkTGFiZWxIYXNoIjoiMXp6S2VYbjZjWlNxOER2c1ZPeHVsZ3NZN29YSFYxMVZreGhSc1JFcVBLeDRyYlBoSHZZODRDNlN4Sm1DU2VYayJ9 dlp-product: dlpe-windows dlp-reaction: no-action dlp-version: 11.2.0.6 x-ctpclassification: CTP_NT authentication-results: spf=none (sender IP is ) smtp.mailfrom=konstantin.ananyev@intel.com; x-originating-ip: [192.198.151.168] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: d4b9868e-5f54-40e7-841f-08d7c132b9c2 x-ms-traffictypediagnostic: SN6PR11MB3373: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:10000; x-forefront-prvs: 03333C607F x-forefront-antispam-report: SFV:NSPM; SFS:(10019020)(376002)(39860400002)(346002)(136003)(396003)(366004)(189003)(199004)(316002)(478600001)(6506007)(7696005)(966005)(110136005)(54906003)(2906002)(4326008)(86362001)(52536014)(71200400001)(76116006)(5660300002)(26005)(186003)(81156014)(55016002)(66556008)(66946007)(8676002)(66446008)(66476007)(64756008)(8936002)(81166006)(33656002)(9686003); DIR:OUT; SFP:1102; SCL:1; SRVR:SN6PR11MB3373; H:SN6PR11MB2558.namprd11.prod.outlook.com; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; MX:1; A:1; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: 88rJi6erty2BikgN2ABzY4Nm7x3qI6eUKFNcsess110OrN02W76oiK8vA/jWEjJtSRiugXQy88lKCYks4O8IE8f6w9SS0L9p/IavTTyGfWdi0Bb3FijNMXLnlp9i+Jyc3QZ4XebvMRVVijMo1DAbDYn0h5o3WqKggqOpC4Ja2JgRwmaHUt7EEh0KkycGlJol82t2mXaELnjc444YzIY4x1qXIIK1oakiE6cFnzm6f0AmtoxCh/mx7881Rqug3KvcnwOQ9oNbqjGnEK+uKSjyMdMFqbJi3/oYCi26BMgqBM/5smZ/SRG9hM+vuioNPixB4CCW3Z1xHMxu3Fq56wpRqvAuIfHXK1hD6tOWxCIceIf7IjVDCn3DKVjx9aQAkHOakm29AqrSwYHganETYEATumlVFBcHbzrnKe90kpJVGkqBN9fKwHjxNXl+n39t6jVN6eBrkcpohAJa5CJZaDt0tERL1fo0M09iCYUchL7Q7TzE2OVVq0Zo/SwX3Q9Z7zj3pHBQoQyA++S5jZdxKR1g5A== x-ms-exchange-antispam-messagedata: 2xob9sYf15Itfj+eIfsCU8BzRotlM8kjkQa8doeybyH14+4KjQ9/xBZhGO6YhH0QeeaUezOLXZfS+y1KXX2hV1o97V2gDmsDkwdi9mHECTg7pYG78vedN12fVp26enn8vUTiHqq2Zt+dkxAzFj629g== x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-CrossTenant-Network-Message-Id: d4b9868e-5f54-40e7-841f-08d7c132b9c2 X-MS-Exchange-CrossTenant-originalarrivaltime: 05 Mar 2020 18:26:30.4123 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: aq7JzQejxtGNR3sci3SrCEdDko4NzXxopSjEVEohZg6hdKyAp8sHCYPy/elQz+rRa9xoX9rba4FradTM20AxHJc0PFBuahreW8AQlZVcTeg= X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN6PR11MB3373 X-OriginatorOrg: intel.com Subject: Re: [dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > > > > > > > +/** > > > > > + * @internal Reserve ring elements to enqueue several objects on > > > > > +the ring > > > > > + * > > > > > + * @param r > > > > > + * A pointer to the ring structure. > > > > > + * @param esize > > > > > + * The size of ring element, in bytes. It must be a multiple o= f 4. > > > > > + * This must be the same value used while creating the ring. > > Otherwise > > > > > + * the results are undefined. > > > > > + * @param n > > > > > + * The number of elements to reserve in the ring. > > > > > + * @param behavior > > > > > + * RTE_RING_QUEUE_FIXED: Reserve a fixed number of elements > > from a > > > > ring > > > > > + * RTE_RING_QUEUE_VARIABLE: Reserve as many elements as > > possible > > > > from ring > > > > > + * @param is_sp > > > > > + * Indicates whether to use single producer or multi-producer = reserve > > > > > + * @param old_head > > > > > + * Producer's head index before reservation. > > > > > + * @param new_head > > > > > + * Producer's head index after reservation. > > > > > + * @param free_space > > > > > + * returns the amount of space after the reserve operation has > > finished. > > > > > + * It is not updated if the number of reserved elements is zer= o. > > > > > + * @param dst1 > > > > > + * Pointer to location in the ring to copy the data. > > > > > + * @param n1 > > > > > + * Number of elements to copy at dst1 > > > > > + * @param dst2 > > > > > + * In case of ring wrap around, this pointer provides the loca= tion to > > > > > + * copy the remaining elements. The number of elements to copy= at > > this > > > > > + * location is equal to (number of elements reserved - n1) > > > > > + * @return > > > > > + * Actual number of elements reserved. > > > > > + * If behavior =3D=3D RTE_RING_QUEUE_FIXED, this will be 0 or = n only. > > > > > + */ > > > > > +static __rte_always_inline unsigned int > > > > > +__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r, unsigned > > > > > +int esize, > > > > > > > > > > > > I do understand the purpose of reserve, then either commit/abort fo= r > > > > serial sync mode, but what is the purpose of non-serial version of > > reserve/commit? > > > In RCU, I have the need for scatter-gather feature. i.e. the data in > > > the ring element is coming from multiple sources ('token' is generate= d > > > by the RCU library and the application provides additional data). If = I do not > > provide the reserve/commit, I need to introduce an intermediate memcpy = to > > get these two data contiguously to copy to the ring element. The sequen= ce is > > 'reserve(1), memcpy1, mempcy2, commit(1)'. > > > Hence, you do not see the abort API for the enqueue. > > > > > > > In serial MP/MC case, after _reserve_(n) you always have to do > > > > _commit_(n) - you can't reduce number of elements, or do _abort_. > > > Agree, the intention here is to provide the scatter/gather feature. > > > > > > > Again you cannot avoid memcpy(n) here anyhow. > > > > So what is the point of these functions for non-serial case? > > > It avoids an intermediate memcpy when the data is coming from multipl= e > > sources. > > > > Ok, I think I understand what was my confusion: > Yes, the following understanding is correct. >=20 > > Your intention: > > 1) reserve/commit for both serial and non-serial mode - > > to allow user get/set contents of the ring manually and avoid > > intermediate load/stores. > > 2) abort only for serial mode. > > > > My intention: > > 1) commit/reserve/abort only for serial case > > (as that's the only mode where we can commit less > > then was reserved or do abort). > I do not know if there is a requirement on committing less than reserved. >From my perspective, that's a necessary part of peek functionality. revert/abort function you introduced below is just one special case of it. Having just abort is enough when you processing elements in the ring one by= one, but not sufficient if someone would try to operate in bulks. Let say you read (reserved) N objects from the ring, inspected them and found that first M ( I think, if the size of commit is not known during reservation, > may be the reservation can be delayed till it is known. In some cases, you do know how much you'd like to commit, but you can't guarantee that you can commit that much, till you inspect contents of reserved elems. =20 > If there is no requirement to commit less than reserved, then I do not se= e a need for serial APIs for enqueue operation. >=20 > > 2) get/set of ring contents are done as part of either > > reserve(for dequeue) or commit(for enqueue) API calls > > (no scatter-gather ability). > > > > I still think that this new API you suggest creates too big exposure of= ring > > internals, and makes it less 'safe-to-use': > > - it provides direct access to contents of the ring. > > - user has to specify head/tail values directly. > It is some what complex. But, with the support of user defined element si= ze, I think it becomes necessary to support scatter gather > feature (since it is not a single pointer that will be stored). I suppose to see the real benefit from scatter-gather, we need a scenario where there are relatively big elems in the ring (32B+ or so), plus enqueue/dequeue done in bulks. If you really envision such use case - I am ok to consider scatter-gather = API too, but I think it shouldn't be the only available API for serial mode. Might be we can have 'normal' enqueue/dequeue API for serial mode (actual copy done internally in ring functions, head/tail values are not ex= posed directly), plus SG API as addon for some special cases. =20 > > > > So in case of some programmatic error in related user code, there are l= ess > > chances it could be catch-up by API, and we can easily end-up with sile= nt > > memory corruption and other nasty things that would be hard to > > catch/reproduce. > > > > That makes me wonder how critical is this scatter-gather ability in ter= ms of > > overall RCU performance? > > Is the gain provided really that significant, especially if you'll upda= te the ring > > by one element at a time? > For RCU, it is 64b token and the size of the user data. Not sure how much= difference it will make. > I can drop the scatter gather requirement for now. >=20 > > > > > > > > > > > > > BTW, I think it would be good to have serial version of _enqueue_ t= oo. > > > If there is a good use case, they should be provided. I did not come = across a > > good use case. > > > > > > > > > > > > + unsigned int n, enum rte_ring_queue_behavior behavior, > > > > > + unsigned int is_sp, unsigned int *old_head, > > > > > + unsigned int *new_head, unsigned int *free_space, > > > > > + void **dst1, unsigned int *n1, void **dst2) > > > > > > > > I do understand the intention to avoid memcpy(), but proposed API > > > > seems overcomplicated, error prone, and not very convenient for the= user. > > > The issue is the need to handle the wrap around in ring storage array= . > > > i.e. when the space is reserved for more than 1 ring element, the wra= p > > around might happen. > > > > > > > I don't think that avoiding memcpy() will save us that many cycles > > > > here, so > > > This depends on the amount of data being copied. > > > > > > > probably better to keep API model a bit more regular: > > > > > > > > n =3D rte_ring_mp_serial_enqueue_bulk_reserve(ring, num, &free_spac= e); ... > > > > /* performs actual memcpy(), m<=3Dn */ > > > > rte_ring_mp_serial_enqueue_bulk_commit(ring, obj, m); > > > These do not take care of the wrap-around case or I am not able to > > understand your comment. > > > > I meant that serial_enqueue_commit() will do both: > > actual copy of elements to the ring and tail update (no Scatter-Gather)= , see > > above. > RCU does not require the serial enqueue APIs, do you have any use case? I agree that serial dequeue seems to have more usages then enqueue. Though I still can name at least two cases for enqueue, from top of my head= : 1. serial mode (both enqueue/dequeue) helps to mitigate ring slowdown=20 overcommitted scenarios, see RFC I submitted: http://patches.dpdk.org/cover/66001/ 2. any intermediate node when you have pop/push from/to some external queue= , and enqueue/dequeue to/from the ring, would like to avoid any elem drops in between, and by some reason don't want your own intermediate buffe= rization. Let say: dequeue_from_ring -> tx_burst/cryptodev_enqueue rx_burst/cryptodev_dequeue -> enqueue_to_ring Plus as enqueue/dequeue are sort of mirror, I think it is good to have both= identical. =20