From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 10BEFA04B7; Tue, 13 Oct 2020 13:38:34 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 5B4FA1DAC9; Tue, 13 Oct 2020 13:38:32 +0200 (CEST) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by dpdk.org (Postfix) with ESMTP id 28F981DAC8 for ; Tue, 13 Oct 2020 13:38:28 +0200 (CEST) IronPort-SDR: j+sI9Z/zlXfJ3bjupGNsn4elGDZJXU8lhkuTA9R6xOYN0axGfN9v6G376oZ01DxBka1UzM6Csu f8NoAf2Ydj1Q== X-IronPort-AV: E=McAfee;i="6000,8403,9772"; a="145211284" X-IronPort-AV: E=Sophos;i="5.77,370,1596524400"; d="scan'208";a="145211284" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Oct 2020 04:38:26 -0700 IronPort-SDR: bxE0a/U19Zjbk6mRcwPUno50znQdcAnMOmCidr7sXI8RxzVPByC1aESK2D1a4InKb8n5/ts+7U WSxmMg57RlYw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,370,1596524400"; d="scan'208";a="356128859" Received: from fmsmsx602.amr.corp.intel.com ([10.18.126.82]) by FMSMGA003.fm.intel.com with ESMTP; 13 Oct 2020 04:38:26 -0700 Received: from fmsmsx607.amr.corp.intel.com (10.18.126.87) by fmsmsx602.amr.corp.intel.com (10.18.126.82) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5; Tue, 13 Oct 2020 04:38:26 -0700 Received: from fmsmsx602.amr.corp.intel.com (10.18.126.82) by fmsmsx607.amr.corp.intel.com (10.18.126.87) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5; Tue, 13 Oct 2020 04:38:25 -0700 Received: from fmsedg602.ED.cps.intel.com (10.1.192.136) by fmsmsx602.amr.corp.intel.com (10.18.126.82) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5 via Frontend Transport; Tue, 13 Oct 2020 04:38:25 -0700 Received: from NAM10-BN7-obe.outbound.protection.outlook.com (104.47.70.108) by edgegateway.intel.com (192.55.55.71) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.1713.5; Tue, 13 Oct 2020 04:38:21 -0700 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Dut8XToFIznp2E7/jyHlRWNxKC3PwknhfNjrU1lo2JLAHD/USBENxA0LvDk+v6ENdAh6VspPD4AxvxKhZLo2DixswV71HnnGWmtfUY6R75tc150qydWq3S9Cd5mPq2Adqo2hrDgU2n2n7MVnq6siY4a2GwucegnZIf5/qdWKqab4xedN0HJWc51oZJbHNynbFqVRZzkAY022mzZOa0IjcFKiWUOEQX3DwSGmUgulTSj8e07ejI6T+EJb1LiW9TilEnOQm0sgP2jY+uGL/22iCMZYazdRbHtVGIWR/dJRBe4ZfS8YGxiA7F8xwi2vqBLIveI4SxqHVLz1RAX3uI45qg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=lgfXyiZKH8FWami57HR7W+AdyBxfg0Vh6S8FkprPZSs=; b=dpVlE8+l9t9iO4N9B/IHUgqS8F/GAFqsqky9bF4rFDzZjqKwrKgC0L7subDcZNjWVAcUxllrzgONA853G51iUjv0RPOPcN7g/dsAqve/PzeaK4B4SsTt2c78lMMGV/1k39y/TlVnyXdPzggN7HIESERLnSFCDsR+BgHCwAmQzxARVkMJgmyt3LFFDUBZ08MCR5fX//XxHvQ0p6q1+4NjlBB4hiSea7Y77eCk4hxh/qFlAJhDGJsoR95KO+SAVfVG3GaWradnIey6d5P2qhLu+IhMzcYIv54Qx9EIjsGj9U+ixwpVX0Ayi5d91pHjJWTsxjn2/db9vtW5h4+OgV71Jw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel.onmicrosoft.com; s=selector2-intel-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=lgfXyiZKH8FWami57HR7W+AdyBxfg0Vh6S8FkprPZSs=; b=dfOGC/TsNN0qcStEWopkqH2C/+qBtgnhMqxaMLO5nLXKRFsviL94Gai/5RCLHiSYXErWmWzrKHNDziFJD1hiz1KQZJnhzEHSUMR8qB8w1CO+YkmoCT/tJoSuOzBN47dFIkt5eSERCcWYxXvM9MXHEcNiq0okJwCdd0Ar8dpf4wc= Received: from BYAPR11MB3301.namprd11.prod.outlook.com (2603:10b6:a03:7f::26) by BY5PR11MB4225.namprd11.prod.outlook.com (2603:10b6:a03:1ba::25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3455.21; Tue, 13 Oct 2020 11:38:20 +0000 Received: from BYAPR11MB3301.namprd11.prod.outlook.com ([fe80::f5a4:3f6b:ade3:296b]) by BYAPR11MB3301.namprd11.prod.outlook.com ([fe80::f5a4:3f6b:ade3:296b%3]) with mapi id 15.20.3455.030; Tue, 13 Oct 2020 11:38:19 +0000 From: "Ananyev, Konstantin" To: Honnappa Nagarahalli , "dev@dpdk.org" CC: "olivier.matz@6wind.com" , "david.marchand@redhat.com" , nd , nd Thread-Topic: [RFC v2 1/1] lib/ring: add scatter gather APIs Thread-Index: AQHWm+TAMRnLEGUYBEqkMDYffUIcWqmT/ylAgACX44CAANa9UA== Date: Tue, 13 Oct 2020 11:38:19 +0000 Message-ID: References: <20200224203931.21256-1-honnappa.nagarahalli@arm.com> <20201006132905.46205-1-honnappa.nagarahalli@arm.com> <20201006132905.46205-2-honnappa.nagarahalli@arm.com> In-Reply-To: Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-reaction: no-action dlp-version: 11.5.1.3 authentication-results: arm.com; dkim=none (message not signed) header.d=none;arm.com; dmarc=none action=none header.from=intel.com; x-originating-ip: [46.7.39.127] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: bc436d45-2697-464b-41c9-08d86f6c7bf5 x-ms-traffictypediagnostic: BY5PR11MB4225: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:10000; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: ixtvXxCyI0a/+CkhlD0EOjNDo2Vb37XCZNfucIxYhEM3U0eBfs9k4TTb9YGQsKrxXIm5rZzK7j6pcbigA3M1pS6FWFcRc7cO18IpqdQMzr2VtbYJDJenKW7mmayYqgNhFcOD23TsqdVHG9Mw4Z5irG6xj0BqE3pyGKLejhkyTjQqH2ZRo4SP2JNYnUIjqh/9AJwECSjaqGlXym3clxfwaYoPHRV660UmAJPnqrgQBa++tgayOXLOewOkaPDgWyjnsd3+d0ELYPuJWB3DN4MuPV72yUTr1PsTD4lspRgYiVlZJuQsR9fbBi/CKYqtjN5MhRumR7TmWJPXUCUJPL7npQ== x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:BYAPR11MB3301.namprd11.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(366004)(396003)(136003)(39860400002)(346002)(376002)(5660300002)(8936002)(86362001)(9686003)(4326008)(110136005)(8676002)(6506007)(66946007)(478600001)(66446008)(2906002)(66556008)(64756008)(316002)(76116006)(55016002)(54906003)(66476007)(33656002)(83380400001)(186003)(71200400001)(7696005)(26005)(52536014); DIR:OUT; SFP:1102; x-ms-exchange-antispam-messagedata: Cvrh1Kz3Jc/4meUKTUbLl0XwT95MWU0shECzpRrgXR3VCHN5II1O7LyjuS4UxeQo3RG5nZfSuEFIl9w0QM0HufE0dfaslKkIEOM97/K/ai+ubiybzhaS9dWJZlHDX5+rhNebBMBgYdUxSTxz/SDHDGvdD+YSIAWVl9viafNZAuhl5i4iUdhw6gCjOIds9s+0fhdAcKMEUVABJ+aKvOiFrOK4ybyAuu+40hq2vQ2R1fiGv8ez8zTshb6rJEJl21CJBtMp4Yd0YsDNSpZUztDCS3m3JMVQZgLm3dL/TcgkoORjP7MRSXMq6iUZ+jySHJSWf2eVCuLSbWAsp5WDFrTzkCceMjb7rKXQXd6gNmca3Zmj5/3445XCmEnTd20VNn/3uKTxD7mUrbjkThm8jqmlcqfjsF4UFucMWkVXaeJBmWkVOHV3BCqVxYap4NA4X2eeO5S9l4sT3Wd/qAgdZTjrXxaj4YLdKQ5cZorT539aJKzcxqxDftrqzxaFlRw4VzhkwJnFqoV19nrdjhNW1o7moZeCKGFfXgvXt30PWvKGSh3R1ZH/+iYqPjBtZ62GlJqrel1kAwH1IutBqxZcrDLTs9O1gWHhc6Zz5d71/dYBI3gLNqMK+Ktx2nQV6FLbyeUb+2/N8mYOMlg6HXhhqy6X+w== x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: BYAPR11MB3301.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: bc436d45-2697-464b-41c9-08d86f6c7bf5 X-MS-Exchange-CrossTenant-originalarrivaltime: 13 Oct 2020 11:38:19.5969 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: BMw5AmBUEWIBEcgAICBoY9pwo2byl/mQz2RVO0fP+oxzi8UzwLMJkP92nuofsksPX4HwRxvu71VHgTUf9+QLUcNI46JriPf0FF3vnwRYYLM= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR11MB4225 X-OriginatorOrg: intel.com Subject: Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Honnappa, > Hi Konstantin, > Appreciate your feedback. >=20 > >=20 > > > > > > > Add scatter gather APIs to avoid intermediate memcpy. Use cases that > > > involve copying large amount of data to/from the ring can benefit fro= m > > > these APIs. > > > > > > Signed-off-by: Honnappa Nagarahalli > > > --- > > > lib/librte_ring/meson.build | 3 +- > > > lib/librte_ring/rte_ring_elem.h | 1 + > > > lib/librte_ring/rte_ring_peek_sg.h | 552 > > > +++++++++++++++++++++++++++++ > > > 3 files changed, 555 insertions(+), 1 deletion(-) create mode 10064= 4 > > > lib/librte_ring/rte_ring_peek_sg.h > > > > As a generic one - need to update ring UT both func and perf to > > test/measure this new API. > Yes, will add. >=20 > > > > > > > > diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.buil= d > > > index 31c0b4649..377694713 100644 > > > --- a/lib/librte_ring/meson.build > > > +++ b/lib/librte_ring/meson.build > > > @@ -12,4 +12,5 @@ headers =3D files('rte_ring.h', > > > 'rte_ring_peek.h', > > > 'rte_ring_peek_c11_mem.h', > > > 'rte_ring_rts.h', > > > - 'rte_ring_rts_c11_mem.h') > > > + 'rte_ring_rts_c11_mem.h', > > > + 'rte_ring_peek_sg.h') > > > diff --git a/lib/librte_ring/rte_ring_elem.h > > > b/lib/librte_ring/rte_ring_elem.h index 938b398fc..7d3933f15 100644 > > > --- a/lib/librte_ring/rte_ring_elem.h > > > +++ b/lib/librte_ring/rte_ring_elem.h > > > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, > > > void *obj_table, > > > > > > #ifdef ALLOW_EXPERIMENTAL_API > > > #include > > > +#include > > > #endif > > > > > > #include > > > diff --git a/lib/librte_ring/rte_ring_peek_sg.h > > > b/lib/librte_ring/rte_ring_peek_sg.h > > > new file mode 100644 > > > index 000000000..97d5764a6 > > > --- /dev/null > > > +++ b/lib/librte_ring/rte_ring_peek_sg.h > > > @@ -0,0 +1,552 @@ > > > +/* SPDX-License-Identifier: BSD-3-Clause > > > + * > > > + * Copyright (c) 2020 Arm > > > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org > > > + * All rights reserved. > > > + * Derived from FreeBSD's bufring.h > > > + * Used as BSD-3 Licensed with permission from Kip Macy. > > > + */ > > > + > > > +#ifndef _RTE_RING_PEEK_SG_H_ > > > +#define _RTE_RING_PEEK_SG_H_ > > > + > > > +/** > > > + * @file > > > + * @b EXPERIMENTAL: this API may change without prior notice > > > + * It is not recommended to include this file directly. > > > + * Please include instead. > > > + * > > > + * Ring Peek Scatter Gather APIs > > > + * Introduction of rte_ring with scatter gather serialized > > > +producer/consumer > > > + * (HTS sync mode) makes it possible to split public enqueue/dequeue > > > +API > > > + * into 3 phases: > > > + * - enqueue/dequeue start > > > + * - copy data to/from the ring > > > + * - enqueue/dequeue finish > > > + * Along with the advantages of the peek APIs, these APIs provide th= e > > > +ability > > > + * to avoid copying of the data to temporary area. > > > + * > > > + * Note that right now this new API is available only for two sync m= odes: > > > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST) > > > + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS)= . > > > + * It is a user responsibility to create/init ring with appropriate > > > +sync > > > + * modes selected. > > > + * > > > + * Example usage: > > > + * // read 1 elem from the ring: > > > + * n =3D rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL); > > > + * if (n !=3D 0) { > > > + * //Copy objects in the ring > > > + * memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t)); > > > + * if (n !=3D sgd->n1) > > > + * //Second memcpy because of wrapround > > > + * n2 =3D n - sgd->n1; > > > + * memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t)); > > > + * rte_ring_dequeue_sg_finish(ring, n); > > > > It is not clear from the example above why do you need SG(ZC) API. > > Existing peek API would be able to handle such situation (just copy wil= l be > > done internally). Probably better to use examples you provided in your = last > > reply to Olivier. > Agree, not a good example, will change it. >=20 > > > > > + * } > > > + * > > > + * Note that between _start_ and _finish_ none other thread can > > > + proceed > > > + * with enqueue(/dequeue) operation till _finish_ completes. > > > + */ > > > + > > > +#ifdef __cplusplus > > > +extern "C" { > > > +#endif > > > + > > > +#include > > > + > > > +/* Rock that needs to be passed between reserve and commit APIs */ > > > +struct rte_ring_sg_data { > > > + /* Pointer to the first space in the ring */ > > > + void **ptr1; > > > + /* Pointer to the second space in the ring if there is wrap-around = */ > > > + void **ptr2; > > > + /* Number of elements in the first pointer. If this is equal to > > > + * the number of elements requested, then ptr2 is NULL. > > > + * Otherwise, subtracting n1 from number of elements requested > > > + * will give the number of elements available at ptr2. > > > + */ > > > + unsigned int n1; > > > +}; > > > > I wonder what is the primary goal of that API? > > The reason I am asking: from what I understand with this patch ZC API w= ill > > work only for ST and HTS modes (same as peek API). > > Though, I think it is possible to make it work for any sync model, by c= hanging > Agree, the functionality can be extended to other modes as well. I added = these 2 modes as I found the use cases for these. >=20 > > API a bit: instead of returning sg_data to the user, force him to provi= de > > function to read/write elems from/to the ring. > > Just a schematic one, to illustrate the idea: > > > > typedef void (*write_ring_func_t)(void *elem, /*pointer to first elem t= o > > update inside the ring*/ > > uint32_t num, /* number of elems to update > > */ > > uint32_t esize, > > void *udata /* caller provide data */); > > > > rte_ring_enqueue_zc_bulk_elem(struct rte_ring *r, unsigned int esize, > > unsigned int n, unsigned int *free_space, write_ring_func_t wf, void > > *udata) { > > struct rte_ring_sg_data sgd; > > ..... > > n =3D move_head_tail(r, ...); > > > > /* get sgd data based on n */ > > get_elem_addr(r, ..., &sgd); > > > > /* call user defined function to fill reserved elems */ > > wf(sgd.p1, sgd.n1, esize, udata); > > if (n !=3D n1) > > wf(sgd.p2, sgd.n2, esize, udata); > > > > .... > > return n; > > } > > > I think the call back function makes it difficult to use the API. The cal= l back function would be a wrapper around another function or API > which will have its own arguments. Now all those parameters have to passe= d using the 'udata'. For ex: in the 2nd example that I provided > earlier, the user has to create a wrapper around 'rte_eth_rx_burst' API a= nd then provide the parameters to 'rte_eth_rx_burst' through > 'udata'. 'udata' would need a structure definition as well. Yes, it would, though I don't see much problems with that. Let say for eth_rx_burst(), user will need something like struct {uint16_t = p, q;} udata =3D {.p =3D port_id, .q=3Dqueue_id,}; >=20 > > If we want ZC peek API also - some extra work need to be done with > > introducing return value for write_ring_func() and checking it properly= , but I > > don't see any big problems here too. > > That way ZC API can support all sync models, plus we don't need to expo= se > > sg_data to the user directly. > Other modes can be supported with the method used in this patch as well.= =20 You mean via exposing to the user tail value (in sg_data or so)? I am still a bit nervous about doing that.=20 > If you see a need, I can add them. Not, really, I just thought callbacks will be a good idea here... > IMO, only issue with exposing sg_data is ABI compatibility in the future.= I think, we can align the 'struct rte_ring_sg_data' to cache line > boundary and it should provide ability to extend it in the future without= affecting the ABI compatibility. As I understand sg_data is experimental struct (as the rest of API in that = file). So breaking it shouldn't be a problem for a while. I suppose to summarize things - as I understand you think callback approach is not a good choice. >From other hand, I am not really happy with idea to expose tail values upda= tes to the user. Then I suggest we can just go ahead with that patch as it is: sg_data approach, _ZC_ peek API only. >=20 > > Also, in future, we probably can de-duplicate the code by making our no= n-ZC > > API to use that one internally (pass ring_enqueue_elems()/ob_table as a > > parameters). > > > > > +