From mboxrd@z Thu Jan 1 00:00:00 1970
From: Honnappa Nagarahalli
To: "Ananyev, Konstantin", "olivier.matz@6wind.com",
 "sthemmin@microsoft.com", "jerinj@marvell.com", "Richardson, Bruce",
 "david.marchand@redhat.com", "pbhagavatula@marvell.com"
CC: "dev@dpdk.org", Dharmik Thakkar,
 "Ruifeng Wang (Arm Technology China)",
 "Gavin Hu (Arm Technology China)", "stephen@networkplumber.org",
 Honnappa Nagarahalli, nd, nd
Date: Mon, 14 Oct 2019 23:56:04 +0000
Subject: Re: [dpdk-dev] [PATCH v4 1/2] lib/ring: apis to support configurable element size
References: <20190906190510.11146-1-honnappa.nagarahalli@arm.com>
 <20191009024709.38144-1-honnappa.nagarahalli@arm.com>
 <20191009024709.38144-2-honnappa.nagarahalli@arm.com>
 <2601191342CEEE43887BDE71AB97725801A8C68545@IRSMSX104.ger.corp.intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB97725801A8C68545@IRSMSX104.ger.corp.intel.com>
List-Id: DPDK patches and discussions

Hi Konstantin,
	Thank you for the feedback.

>
> > >
> > > Current APIs assume ring elements to be pointers. However, in many
> > > use cases, the size can be different. Add new APIs to support
> > > configurable ring element sizes.
> > >
> > > Signed-off-by: Honnappa Nagarahalli
> > > Reviewed-by: Dharmik Thakkar
> > > Reviewed-by: Gavin Hu
> > > Reviewed-by: Ruifeng Wang
> > > ---
> > >  lib/librte_ring/Makefile             |   3 +-
> > >  lib/librte_ring/meson.build          |   3 +
> > >  lib/librte_ring/rte_ring.c           |  45 +-
> > >  lib/librte_ring/rte_ring.h           |   1 +
> > >  lib/librte_ring/rte_ring_elem.h      | 946 +++++++++++++++++++++++++++
> > >  lib/librte_ring/rte_ring_version.map |   2 +
> > >  6 files changed, 991 insertions(+), 9 deletions(-)
> > >  create mode 100644 lib/librte_ring/rte_ring_elem.h
> > >
> > > diff --git a/lib/librte_ring/Makefile b/lib/librte_ring/Makefile
> > > index 21a36770d..515a967bb 100644
> > > --- a/lib/librte_ring/Makefile
> > > +++ b/lib/librte_ring/Makefile
> > > @@ -6,7 +6,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > >  # library name
> > >  LIB = librte_ring.a
> > >
> > > -CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
> > > +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -DALLOW_EXPERIMENTAL_API
> > >  LDLIBS += -lrte_eal
> > >
> > >  EXPORT_MAP := rte_ring_version.map
> > > @@ -18,6 +18,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_RING) := rte_ring.c
> > >
> > >  # install includes
> > >  SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h \
> > > +					rte_ring_elem.h \
> > >  					rte_ring_generic.h \
> > >  					rte_ring_c11_mem.h
> > >
> > > diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> > > index ab8b0b469..74219840a 100644
> > > --- a/lib/librte_ring/meson.build
> > > +++ b/lib/librte_ring/meson.build
> > > @@ -6,3 +6,6 @@ sources = files('rte_ring.c')
> > >  headers = files('rte_ring.h',
> > >  		'rte_ring_c11_mem.h',
> > >  		'rte_ring_generic.h')
> > > +
> > > +# rte_ring_create_elem and rte_ring_get_memsize_elem are experimental
> > > +allow_experimental_apis = true
> > > diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> > > index d9b308036..6fed3648b 100644
> > > --- a/lib/librte_ring/rte_ring.c
> > > +++ b/lib/librte_ring/rte_ring.c
> > > @@ -33,6 +33,7 @@
> > >  #include
> > >
> > >  #include "rte_ring.h"
> > > +#include "rte_ring_elem.h"
> > >
> > >  TAILQ_HEAD(rte_ring_list, rte_tailq_entry);
> > >
> > > @@ -46,23 +47,42 @@ EAL_REGISTER_TAILQ(rte_ring_tailq)
> > >
> > >  /* return the size of memory occupied by a ring */
> > >  ssize_t
> > > -rte_ring_get_memsize(unsigned count)
> > > +rte_ring_get_memsize_elem(unsigned count, unsigned esize)
> > >  {
> > >  	ssize_t sz;
> > >
> > > +	/* Supported esize values are 4/8/16.
> > > +	 * Others can be added on need basis.
> > > +	 */
> > > +	if ((esize != 4) && (esize != 8) && (esize != 16)) {
> > > +		RTE_LOG(ERR, RING,
> > > +			"Unsupported esize value. Supported values are 4, 8 and 16\n");
> > > +
> > > +		return -EINVAL;
> > > +	}
> > > +
> > >  	/* count must be a power of 2 */
> > >  	if ((!POWEROF2(count)) || (count > RTE_RING_SZ_MASK )) {
> > >  		RTE_LOG(ERR, RING,
> > > -			"Requested size is invalid, must be power of 2, and "
> > > -			"do not exceed the size limit %u\n", RTE_RING_SZ_MASK);
> > > +			"Requested number of elements is invalid, must be "
> > > +			"power of 2, and do not exceed the limit %u\n",
> > > +			RTE_RING_SZ_MASK);
> > > +
> > >  		return -EINVAL;
> > >  	}
> > >
> > > -	sz = sizeof(struct rte_ring) + count * sizeof(void *);
> > > +	sz = sizeof(struct rte_ring) + count * esize;
> > >  	sz = RTE_ALIGN(sz, RTE_CACHE_LINE_SIZE);
> > >  	return sz;
> > >  }
> > >
> > > +/* return the size of memory occupied by a ring */
> > > +ssize_t
> > > +rte_ring_get_memsize(unsigned count)
> > > +{
> > > +	return rte_ring_get_memsize_elem(count, sizeof(void *));
> > > +}
> > > +
> > >  void
> > >  rte_ring_reset(struct rte_ring *r)
> > >  {
> > > @@ -114,10 +134,10 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
> > >  	return 0;
> > >  }
> > >
> > > -/* create the ring */
> > > +/* create the ring for a given element size */
> > >  struct rte_ring *
> > > -rte_ring_create(const char *name, unsigned count, int socket_id,
> > > -		unsigned flags)
> > > +rte_ring_create_elem(const char *name, unsigned count, unsigned esize,
> > > +		int socket_id, unsigned flags)
> > >  {
> > >  	char mz_name[RTE_MEMZONE_NAMESIZE];
> > >  	struct rte_ring *r;
> > > @@ -135,7 +155,7 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
> > >  	if (flags & RING_F_EXACT_SZ)
> > >  		count = rte_align32pow2(count + 1);
> > >
> > > -	ring_size = rte_ring_get_memsize(count);
> > > +	ring_size = rte_ring_get_memsize_elem(count, esize);
> > >  	if (ring_size < 0) {
> > >  		rte_errno = ring_size;
> > >  		return NULL;
> > > @@ -182,6 +202,15 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
> > >  	return r;
> > >  }
> > >
> > > +/* create the ring */
> > > +struct rte_ring *
> > > +rte_ring_create(const char *name, unsigned count, int socket_id,
> > > +		unsigned flags)
> > > +{
> > > +	return rte_ring_create_elem(name, count, sizeof(void *), socket_id,
> > > +		flags);
> > > +}
> > > +
> > >  /* free the ring */
> > >  void
> > >  rte_ring_free(struct rte_ring *r)
> > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > > index 2a9f768a1..18fc5d845 100644
> > > --- a/lib/librte_ring/rte_ring.h
> > > +++ b/lib/librte_ring/rte_ring.h
> > > @@ -216,6 +216,7 @@ int rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
> > >   */
> > >  struct rte_ring *rte_ring_create(const char *name, unsigned count,
> > >  				 int socket_id, unsigned flags);
> > > +
> > >  /**
> > >   * De-allocate all memory used by the ring.
> > >   *
> > > diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
> > > new file mode 100644
> > > index 000000000..860f059ad
> > > --- /dev/null
> > > +++ b/lib/librte_ring/rte_ring_elem.h
> > > @@ -0,0 +1,946 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + *
> > > + * Copyright (c) 2019 Arm Limited
> > > + * Copyright (c) 2010-2017 Intel Corporation
> > > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > > + * All rights reserved.
> > > + * Derived from FreeBSD's bufring.h
> > > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > > + */
> > > +
> > > +#ifndef _RTE_RING_ELEM_H_
> > > +#define _RTE_RING_ELEM_H_
> > > +
> > > +/**
> > > + * @file
> > > + * RTE Ring with flexible element size
> > > + */
> > > +
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +#include
> > > +#include
> > > +#include
> > > +#include
> > > +#include
> > > +#include
> > > +#include
> > > +#include
> > > +#include
> > > +#include
> > > +#include
> > > +#include
> > > +
> > > +#include "rte_ring.h"
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > + *
> > > + * Calculate the memory size needed for a ring with given element size
> > > + *
> > > + * This function returns the number of bytes needed for a ring, given
> > > + * the number of elements in it and the size of the element. This value
> > > + * is the sum of the size of the structure rte_ring and the size of the
> > > + * memory needed for storing the elements. The value is aligned to a cache
> > > + * line size.
> > > + *
> > > + * @param count
> > > + *   The number of elements in the ring (must be a power of 2).
> > > + * @param esize
> > > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > > + *   Currently, sizes 4, 8 and 16 are supported.
> > > + * @return
> > > + *   - The memory size needed for the ring on success.
> > > + *   - -EINVAL if count is not a power of 2.
> > > + */
> > > +__rte_experimental
> > > +ssize_t rte_ring_get_memsize_elem(unsigned count, unsigned esize);
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > + *
> > > + * Create a new ring named *name* that stores elements with given size.
> > > + *
> > > + * This function uses ``memzone_reserve()`` to allocate memory. Then it
> > > + * calls rte_ring_init() to initialize an empty ring.
> > > + *
> > > + * The new ring size is set to *count*, which must be a power of
> > > + * two. Water marking is disabled by default. The real usable ring size
> > > + * is *count-1* instead of *count* to differentiate a free ring from an
> > > + * empty ring.
> > > + *
> > > + * The ring is added in RTE_TAILQ_RING list.
> > > + *
> > > + * @param name
> > > + *   The name of the ring.
> > > + * @param count
> > > + *   The number of elements in the ring (must be a power of 2).
> > > + * @param esize
> > > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > > + *   Currently, sizes 4, 8 and 16 are supported.
> > > + * @param socket_id
> > > + *   The *socket_id* argument is the socket identifier in case of
> > > + *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
> > > + *   constraint for the reserved zone.
> > > + * @param flags
> > > + *   An OR of the following:
> > > + *    - RING_F_SP_ENQ: If this flag is set, the default behavior when
> > > + *      using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
> > > + *      is "single-producer". Otherwise, it is "multi-producers".
> > > + *    - RING_F_SC_DEQ: If this flag is set, the default behavior when
> > > + *      using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
> > > + *      is "single-consumer". Otherwise, it is "multi-consumers".
> > > + * @return
> > > + *   On success, the pointer to the new allocated ring. NULL on error with
> > > + *    rte_errno set appropriately. Possible errno values include:
> > > + *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
> > > + *    - E_RTE_SECONDARY - function was called from a secondary process instance
> > > + *    - EINVAL - count provided is not a power of 2
> > > + *    - ENOSPC - the maximum number of memzones has already been allocated
> > > + *    - EEXIST - a memzone with the same name already exists
> > > + *    - ENOMEM - no appropriate memory area found in which to create memzone
> > > + */
> > > +__rte_experimental
> > > +struct rte_ring *rte_ring_create_elem(const char *name, unsigned count,
> > > +				unsigned esize, int socket_id, unsigned flags);
> > > +
> > > +/* the actual enqueue of pointers on the ring.
> > > + * Placed here since identical code needed in both
> > > + * single and multi producer enqueue functions.
> > > + */
> > > +#define ENQUEUE_PTRS_ELEM(r, ring_start, prod_head, obj_table, esize, n) do { \
> > > +	if (esize == 4) \
> > > +		ENQUEUE_PTRS_32(r, ring_start, prod_head, obj_table, n); \
> > > +	else if (esize == 8) \
> > > +		ENQUEUE_PTRS_64(r, ring_start, prod_head, obj_table, n); \
> > > +	else if (esize == 16) \
> > > +		ENQUEUE_PTRS_128(r, ring_start, prod_head, obj_table, n); \
> > > +} while (0)
> > > +
> > > +#define ENQUEUE_PTRS_32(r, ring_start, prod_head, obj_table, n) do { \
> > > +	unsigned int i; \
> > > +	const uint32_t size = (r)->size; \
> > > +	uint32_t idx = prod_head & (r)->mask; \
> > > +	uint32_t *ring = (uint32_t *)ring_start; \
> > > +	uint32_t *obj = (uint32_t *)obj_table; \
> > > +	if (likely(idx + n < size)) { \
> > > +		for (i = 0; i < (n & ((~(unsigned)0x7))); i += 8, idx += 8) { \
> > > +			ring[idx] = obj[i]; \
> > > +			ring[idx + 1] = obj[i + 1]; \
> > > +			ring[idx + 2] = obj[i + 2]; \
> > > +			ring[idx + 3] = obj[i + 3]; \
> > > +			ring[idx + 4] = obj[i + 4]; \
> > > +			ring[idx + 5] = obj[i + 5]; \
> > > +			ring[idx + 6] = obj[i + 6]; \
> > > +			ring[idx + 7] = obj[i + 7]; \
> > > +		} \
> > > +		switch (n & 0x7) { \
> > > +		case 7: \
> > > +			ring[idx++] = obj[i++]; /* fallthrough */ \
> > > +		case 6: \
> > > +			ring[idx++] = obj[i++]; /* fallthrough */ \
> > > +		case 5: \
> > > +			ring[idx++] = obj[i++]; /* fallthrough */ \
> > > +		case 4: \
> > > +			ring[idx++] = obj[i++]; /* fallthrough */ \
> > > +		case 3: \
> > > +			ring[idx++] = obj[i++]; /* fallthrough */ \
> > > +		case 2: \
> > > +			ring[idx++] = obj[i++]; /* fallthrough */ \
> > > +		case 1: \
> > > +			ring[idx++] = obj[i++]; /* fallthrough */ \
> > > +		} \
> > > +	} else { \
> > > +		for (i = 0; idx < size; i++, idx++)\
> > > +			ring[idx] = obj[i]; \
> > > +		for (idx = 0; i < n; i++, idx++) \
> > > +			ring[idx] = obj[i]; \
> > > +	} \
> > > +} while (0)
> > > +
> > > +#define ENQUEUE_PTRS_64(r, ring_start, prod_head, obj_table, n) do { \
> > > +	unsigned int i; \
> > > +	const uint32_t size = (r)->size; \
> > > +	uint32_t idx = prod_head & (r)->mask; \
> > > +	uint64_t *ring = (uint64_t *)ring_start; \
> > > +	uint64_t *obj = (uint64_t *)obj_table; \
> > > +	if (likely(idx + n < size)) { \
> > > +		for (i = 0; i < (n & ((~(unsigned)0x3))); i += 4, idx += 4) { \
> > > +			ring[idx] = obj[i]; \
> > > +			ring[idx + 1] = obj[i + 1]; \
> > > +			ring[idx + 2] = obj[i + 2]; \
> > > +			ring[idx + 3] = obj[i + 3]; \
> > > +		} \
> > > +		switch (n & 0x3) { \
> > > +		case 3: \
> > > +			ring[idx++] = obj[i++]; /* fallthrough */ \
> > > +		case 2: \
> > > +			ring[idx++] = obj[i++]; /* fallthrough */ \
> > > +		case 1: \
> > > +			ring[idx++] = obj[i++]; \
> > > +		} \
> > > +	} else { \
> > > +		for (i = 0; idx < size; i++, idx++)\
> > > +			ring[idx] = obj[i]; \
> > > +		for (idx = 0; i < n; i++, idx++) \
> > > +			ring[idx] = obj[i]; \
> > > +	} \
> > > +} while (0)
> > > +
> > > +#define ENQUEUE_PTRS_128(r, ring_start, prod_head, obj_table, n) do { \
> > > +	unsigned int i; \
> > > +	const uint32_t size = (r)->size; \
> > > +	uint32_t idx = prod_head & (r)->mask; \
> > > +	__uint128_t *ring = (__uint128_t *)ring_start; \
> > > +	__uint128_t *obj = (__uint128_t *)obj_table; \
> > > +	if (likely(idx + n < size)) { \
> > > +		for (i = 0; i < (n >> 1); i += 2, idx += 2) { \
> > > +			ring[idx] = obj[i]; \
> > > +			ring[idx + 1] = obj[i + 1]; \
> > > +		} \
> > > +		switch (n & 0x1) { \
> > > +		case 1: \
> > > +			ring[idx++] = obj[i++]; \
> > > +		} \
> > > +	} else { \
> > > +		for (i = 0; idx < size; i++, idx++)\
> > > +			ring[idx] = obj[i]; \
> > > +		for (idx = 0; i < n; i++, idx++) \
> > > +			ring[idx] = obj[i]; \
> > > +	} \
> > > +} while (0)
> > > +
> > > +/* the actual copy of pointers on the ring to obj_table.
> > > + * Placed here since identical code needed in both
> > > + * single and multi consumer dequeue functions.
> > > + */
> > > +#define DEQUEUE_PTRS_ELEM(r, ring_start, cons_head, obj_table, esize, n) do { \
> > > +	if (esize == 4) \
> > > +		DEQUEUE_PTRS_32(r, ring_start, cons_head, obj_table, n); \
> > > +	else if (esize == 8) \
> > > +		DEQUEUE_PTRS_64(r, ring_start, cons_head, obj_table, n); \
> > > +	else if (esize == 16) \
> > > +		DEQUEUE_PTRS_128(r, ring_start, cons_head, obj_table, n); \
> > > +} while (0)
> > > +
> > > +#define DEQUEUE_PTRS_32(r, ring_start, cons_head, obj_table, n) do { \
> > > +	unsigned int i; \
> > > +	uint32_t idx = cons_head & (r)->mask; \
> > > +	const uint32_t size = (r)->size; \
> > > +	uint32_t *ring = (uint32_t *)ring_start; \
> > > +	uint32_t *obj = (uint32_t *)obj_table; \
> > > +	if (likely(idx + n < size)) { \
> > > +		for (i = 0; i < (n & (~(unsigned)0x7)); i += 8, idx += 8) {\
> > > +			obj[i] = ring[idx]; \
> > > +			obj[i + 1] = ring[idx + 1]; \
> > > +			obj[i + 2] = ring[idx + 2]; \
> > > +			obj[i + 3] = ring[idx + 3]; \
> > > +			obj[i + 4] = ring[idx + 4]; \
> > > +			obj[i + 5] = ring[idx + 5]; \
> > > +			obj[i + 6] = ring[idx + 6]; \
> > > +			obj[i + 7] = ring[idx + 7]; \
> > > +		} \
> > > +		switch (n & 0x7) { \
> > > +		case 7: \
> > > +			obj[i++] = ring[idx++]; /* fallthrough */ \
> > > +		case 6: \
> > > +			obj[i++] = ring[idx++]; /* fallthrough */ \
> > > +		case 5: \
> > > +			obj[i++] = ring[idx++]; /* fallthrough */ \
> > > +		case 4: \
> > > +			obj[i++] = ring[idx++]; /* fallthrough */ \
> > > +		case 3: \
> > > +			obj[i++] = ring[idx++]; /* fallthrough */ \
> > > +		case 2: \
> > > +			obj[i++] = ring[idx++]; /* fallthrough */ \
> > > +		case 1: \
> > > +			obj[i++] = ring[idx++]; /* fallthrough */ \
> > > +		} \
> > > +	} else { \
> > > +		for (i = 0; idx < size; i++, idx++) \
> > > +			obj[i] = ring[idx]; \
> > > +		for (idx = 0; i < n; i++, idx++) \
> > > +			obj[i] = ring[idx]; \
> > > +	} \
> > > +} while (0)
> > > +
> > > +#define DEQUEUE_PTRS_64(r, ring_start, cons_head, obj_table, n) do { \
> > > +	unsigned int i; \
> > > +	uint32_t idx = cons_head & (r)->mask; \
> > > +	const uint32_t size = (r)->size; \
> > > +	uint64_t *ring = (uint64_t *)ring_start; \
> > > +	uint64_t *obj = (uint64_t *)obj_table; \
> > > +	if (likely(idx + n < size)) { \
> > > +		for (i = 0; i < (n & (~(unsigned)0x3)); i += 4, idx += 4) {\
> > > +			obj[i] = ring[idx]; \
> > > +			obj[i + 1] = ring[idx + 1]; \
> > > +			obj[i + 2] = ring[idx + 2]; \
> > > +			obj[i + 3] = ring[idx + 3]; \
> > > +		} \
> > > +		switch (n & 0x3) { \
> > > +		case 3: \
> > > +			obj[i++] = ring[idx++]; /* fallthrough */ \
> > > +		case 2: \
> > > +			obj[i++] = ring[idx++]; /* fallthrough */ \
> > > +		case 1: \
> > > +			obj[i++] = ring[idx++]; \
> > > +		} \
> > > +	} else { \
> > > +		for (i = 0; idx < size; i++, idx++) \
> > > +			obj[i] = ring[idx]; \
> > > +		for (idx = 0; i < n; i++, idx++) \
> > > +			obj[i] = ring[idx]; \
> > > +	} \
> > > +} while (0)
> > > +
> > > +#define DEQUEUE_PTRS_128(r, ring_start, cons_head, obj_table, n) do { \
> > > +	unsigned int i; \
> > > +	uint32_t idx = cons_head & (r)->mask; \
> > > +	const uint32_t size = (r)->size; \
> > > +	__uint128_t *ring = (__uint128_t *)ring_start; \
> > > +	__uint128_t *obj = (__uint128_t *)obj_table; \
> > > +	if (likely(idx + n < size)) { \
> > > +		for (i = 0; i < (n >> 1); i += 2, idx += 2) { \
> > > +			obj[i] = ring[idx]; \
> > > +			obj[i + 1] = ring[idx + 1]; \
> > > +		} \
> > > +		switch (n & 0x1) { \
> > > +		case 1: \
> > > +			obj[i++] = ring[idx++]; /* fallthrough */ \
> > > +		} \
> > > +	} else { \
> > > +		for (i = 0; idx < size; i++, idx++) \
> > > +			obj[i] = ring[idx]; \
> > > +		for (idx = 0; i < n; i++, idx++) \
> > > +			obj[i] = ring[idx]; \
> > > +	} \
> > > +} while (0)
> > > +
> > > +/* Between load and load. there might be cpu reorder in weak model
> > > + * (powerpc/arm).
> > > + * There are 2 choices for the users
> > > + * 1.use rmb() memory barrier
> > > + * 2.use one-direction load_acquire/store_release barrier,defined by
> > > + * CONFIG_RTE_USE_C11_MEM_MODEL=y
> > > + * It depends on performance test results.
> > > + * By default, move common functions to rte_ring_generic.h
> > > + */
> > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > +#include "rte_ring_c11_mem.h"
> > > +#else
> > > +#include "rte_ring_generic.h"
> > > +#endif
> > > +
> > > +/**
> > > + * @internal Enqueue several objects on the ring
> > > + *
> > > + * @param r
> > > + *   A pointer to the ring structure.
> > > + * @param obj_table
> > > + *   A pointer to a table of void * pointers (objects).
> > > + * @param esize
> > > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > > + *   Currently, sizes 4, 8 and 16 are supported. This should be the same
> > > + *   as passed while creating the ring, otherwise the results are undefined.
> > > + * @param n
> > > + *   The number of objects to add in the ring from the obj_table.
> > > + * @param behavior
> > > + *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
> > > + *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring
> > > + * @param is_sp
> > > + *   Indicates whether to use single producer or multi-producer head update
> > > + * @param free_space
> > > + *   returns the amount of space after the enqueue operation has finished
> > > + * @return
> > > + *   Actual number of objects enqueued.
> > > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > > + */
> > > +static __rte_always_inline unsigned int
> > > +__rte_ring_do_enqueue_elem(struct rte_ring *r, void * const obj_table,
> > > +		unsigned int esize, unsigned int n,
> > > +		enum rte_ring_queue_behavior behavior, unsigned int is_sp,
> > > +		unsigned int *free_space)
>
>
> I like the idea to add esize as an argument to the public API, so the compiler
> can do its job optimizing calls with constant esize.
> Though I am not very happy with the rest of implementation:
> 1. It doesn't really provide configurable elem size - only 4/8/16B elems are
> supported.
Agree. I was thinking other sizes can be added on a need basis.
However, I am wondering if we should just provide the 4B element size and let the users use bulk operations to construct whatever they need? It would mean extra work for the users.

> 2. A lot of code duplication with these 3 copies of ENQUEUE/DEQUEUE
> macros.
>
> Looking at ENQUEUE/DEQUEUE macros, I can see that main loop always does
> 32B copy per iteration.
Yes, I tried to keep it the same as the existing one (originally, I guess the intention was to allow for 256b vector instructions to be generated).

> So wonder can we make a generic function that would do 32B copy per
> iteration in a main loop, and copy tail by 4B chunks?
> That would avoid copy duplication and will allow user to have any elem size
> (multiple of 4B) he wants.
> Something like that (note didn't test it, just a rough idea):
>
> static inline void
> copy_elems(uint32_t du32[], const uint32_t su32[], uint32_t num, uint32_t esize)
> {
> 	uint32_t i, sz;
>
> 	sz = (num * esize) / sizeof(uint32_t);
If 'num' is a compile-time constant, 'sz' will be a compile-time constant. Otherwise, this will result in a multiplication operation. I have tried to avoid the multiplication and to use shift and mask operations instead (just like the rest of the ring code does).

>
> 	for (i = 0; i < (sz & ~7); i += 8)
> 		memcpy(du32 + i, su32 + i, 8 * sizeof(uint32_t));
I had used memcpy to start with (for the entire copy operation), but the performance for 64b elements did not match the existing ring APIs (better in some cases, worse in others).

IMO, we have to keep the performance of the 64b and 128b element sizes the same as what we get with the existing ring and event-ring APIs. That would allow us to replace them with these new APIs. I suggest that we keep the macros in this patch for 64b and 128b.
For the rest of the sizes, we could put a for loop around the 32b macro (this would allow for all sizes as well).
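
To make the last suggestion concrete, here is a minimal standalone sketch of one way to read "a for loop around the 32b copy": treat each element as esize / 4 32-bit words and copy n * (esize / 4) words, eight words (32B) per main-loop iteration as the existing macros do, then the tail one word at a time. The names copy_words32() and copy_elems32() are hypothetical and not part of the patch. The sketch assumes esize is a non-zero multiple of 4 and that both buffers are 4B aligned. It has not been benchmarked and says nothing about whether it would match the 64b/128b macros.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not in the patch): copy nwords 32-bit words,
 * eight words (32B) per main-loop iteration, then the remaining tail
 * one word at a time.
 */
static inline void
copy_words32(uint32_t *dst, const uint32_t *src, uint32_t nwords)
{
	uint32_t i;

	for (i = 0; i < (nwords & ~(uint32_t)0x7); i += 8) {
		dst[i] = src[i];
		dst[i + 1] = src[i + 1];
		dst[i + 2] = src[i + 2];
		dst[i + 3] = src[i + 3];
		dst[i + 4] = src[i + 4];
		dst[i + 5] = src[i + 5];
		dst[i + 6] = src[i + 6];
		dst[i + 7] = src[i + 7];
	}
	for (; i < nwords; i++)
		dst[i] = src[i];
}

/* Hypothetical wrapper: copy n elements of esize bytes each.
 * Assumption: esize is a non-zero multiple of 4 and both buffers
 * are at least 4B aligned.
 */
static inline void
copy_elems32(void *dst, const void *src, uint32_t n, uint32_t esize)
{
	assert(esize != 0 && (esize & 0x3) == 0);
	copy_words32((uint32_t *)dst, (const uint32_t *)src,
		n * (esize >> 2));
}
```

The n * (esize >> 2) term is the multiplication discussed above. When esize is a compile-time constant at an inlined call site, I would expect the compiler to reduce it to a shift or a short add sequence, but that would need to be confirmed in the generated code.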
>
> 	switch (sz & 7) {
> 	case 7: du32[sz - 7] = su32[sz - 7]; /* fallthrough */
> 	case 6: du32[sz - 6] = su32[sz - 6]; /* fallthrough */
> 	case 5: du32[sz - 5] = su32[sz - 5]; /* fallthrough */
> 	case 4: du32[sz - 4] = su32[sz - 4]; /* fallthrough */
> 	case 3: du32[sz - 3] = su32[sz - 3]; /* fallthrough */
> 	case 2: du32[sz - 2] = su32[sz - 2]; /* fallthrough */
> 	case 1: du32[sz - 1] = su32[sz - 1]; /* fallthrough */
> 	}
> }
>
> static inline void
> enqueue_elems(struct rte_ring *r, void *ring_start, uint32_t prod_head,
> 		void *obj_table, uint32_t num, uint32_t esize)
> {
> 	uint32_t idx, n;
> 	uint32_t *du32;
>
> 	const uint32_t size = r->size;
>
> 	idx = prod_head & (r)->mask;
>
> 	du32 = ring_start + idx * sizeof(uint32_t);
>
> 	if (idx + num < size)
> 		copy_elems(du32, obj_table, num, esize);
> 	else {
> 		n = size - idx;
> 		copy_elems(du32, obj_table, n, esize);
> 		copy_elems(ring_start, obj_table + n * sizeof(uint32_t),
> 			num - n, esize);
> 	}
> }
>
> And then, in that function, instead of ENQUEUE_PTRS_ELEM(), just:
>
> enqueue_elems(r, &r[1], prod_head, obj_table, n, esize);
>
>
> > > +{
> > > +	uint32_t prod_head, prod_next;
> > > +	uint32_t free_entries;
> > > +
> > > +	n = __rte_ring_move_prod_head(r, is_sp, n, behavior,
> > > +			&prod_head, &prod_next, &free_entries);
> > > +	if (n == 0)
> > > +		goto end;
> > > +
> > > +	ENQUEUE_PTRS_ELEM(r, &r[1], prod_head, obj_table, esize, n);
> > > +
> > > +	update_tail(&r->prod, prod_head, prod_next, is_sp, 1);
> > > +end:
> > > +	if (free_space != NULL)
> > > +		*free_space = free_entries - n;
> > > +	return n;
> > > +}
> > > +
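
As a sanity check on the wrap-around handling in the enqueue_elems() sketch quoted above, here is a minimal standalone model of the split copy. It is not DPDK code: toy_enqueue_elems() and its parameters are hypothetical, the ring is just a flat buffer of size slots (size assumed to be a power of 2), and memcpy is used only for brevity, not as a statement about performance. As far as I can tell, the quoted sketch scales its offsets by sizeof(uint32_t), which is only right for esize == 4, so this model assumes the offsets into both the ring and obj_table should scale with esize.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model only (hypothetical, not the patch): copy num elements of
 * esize bytes from obj_table into a ring of 'size' slots, starting at
 * slot (prod_head & (size - 1)) and wrapping to slot 0 if needed.
 * Assumption: size is a power of 2 and num <= size.
 */
static inline void
toy_enqueue_elems(void *ring_start, uint32_t size, uint32_t prod_head,
		const void *obj_table, uint32_t num, uint32_t esize)
{
	uint8_t *ring = (uint8_t *)ring_start;
	const uint8_t *obj = (const uint8_t *)obj_table;
	const uint32_t idx = prod_head & (size - 1);

	assert(size != 0 && (size & (size - 1)) == 0);
	assert(num <= size);

	if (idx + num <= size) {
		/* No wrap: one contiguous copy. */
		memcpy(ring + (size_t)idx * esize, obj, (size_t)num * esize);
	} else {
		/* Wrap: fill up to the end of the ring, then restart at 0. */
		const uint32_t n = size - idx;

		memcpy(ring + (size_t)idx * esize, obj, (size_t)n * esize);
		memcpy(ring, obj + (size_t)n * esize,
			(size_t)(num - n) * esize);
	}
}
```

For example, with 8 slots of 12 bytes, prod_head = 6 and num = 4, the first two elements land in slots 6 and 7 and the remaining two in slots 0 and 1.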