From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 6E586A0577; Tue, 14 Apr 2020 18:12:21 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B206A1C2AB; Tue, 14 Apr 2020 18:12:20 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id C9EC91C296 for ; Tue, 14 Apr 2020 18:12:18 +0200 (CEST) IronPort-SDR: GxfSPU8S7AeDVBxGfwYKOhzWpRbYTs+7enU2jwJKJlnM7u5g12t7mSn78G2FgA6Y/SZ7crLmkL +392m+4puBqg== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Apr 2020 09:12:14 -0700 IronPort-SDR: XnnZl0/vZG2CkVhxYNwgNQFkgel0pwX/FynH7BN3kaUfavWXje1o5KYTEoS7/q5GbJEjmv6/l3 vGvJOSCRAWsw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.72,382,1580803200"; d="scan'208";a="243853421" Received: from orsmsx110.amr.corp.intel.com ([10.22.240.8]) by fmsmga007.fm.intel.com with ESMTP; 14 Apr 2020 09:12:14 -0700 Received: from ORSEDG001.ED.cps.intel.com (10.7.248.4) by ORSMSX110.amr.corp.intel.com (10.22.240.8) with Microsoft SMTP Server (TLS) id 14.3.439.0; Tue, 14 Apr 2020 09:12:14 -0700 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (104.47.66.46) by edgegateway.intel.com (134.134.137.100) with Microsoft SMTP Server (TLS) id 14.3.439.0; Tue, 14 Apr 2020 09:12:13 -0700 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Kq/hytamLQ1B+HYg7BMEhqIwv4IgdO2xc6ExP0WGi0FQbl7ESyF2xFPE65zexI9U6CSKdAtw47iOv7zDMFu0LyLW7JiD9LIdQPzriCLezYKm9ZcOHP8yEZmdNO8vg5ek/3TVjz3I+ylNKoM4Vj4oPrt4aoyoZVLW4x5EEE4RHXp19oL/GxtnurPYgEThHGRz9cb0ZrMiA+BKd7oLAqSBETdb4nK4t8UJqjexGl2zFjnAQO6VrFyq34/U8Ieif65FN9RKa42c8sVCKMy27hYNA+rsD1ZFCOm3gQQzQE5WMjTSciIgrsXO1cxUehMFKBSrPsfseBjBqwvPqrYcxRZgiw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=IfbCE5qHhJOA7eOAtvfjYktMumZZNf5Q10MLNaxuS84=; b=Djs/38WuDzMNWIgg+LLDgvjr4X9bJh38dSD97SPCYafn+k/weY1Kcn/9tm+1Kies685sGJzivMLZaMaArDWfn5wvqPlkiQr0bD3ZfHrLFNLgXUk+t5TYp/XGrRc6xPudgSs0Xh4+wDABTeoDVv53NU/ztaOptjH3337gJxlFuHQyhiBaUjAR0XoS9jdGQqf/JcJQtm58ogqpqBw4YP7mOYRvLrUX3SGjpWIdpEiIXYmgyecu8eYgnxlyzsM0CwHF3kjAEyXJEEq4pJvmI11j9gZNTdraEDlJGJOHvbNbsiZGXt71tp5yvvIfK0mtnmXu0A3rFojmu8bb8VhqeWhA/w== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel.onmicrosoft.com; s=selector2-intel-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=IfbCE5qHhJOA7eOAtvfjYktMumZZNf5Q10MLNaxuS84=; b=bb1kd7iQAf6TyP15gmID5sVnliuGTgOAl087qGBbiLYJP+dUzdgqYuY9mx29xaf5XxSRgqlMP/T1GKZDBk54rp896pcJiuA+exNV/SY372uCjgziHyLCbfGgrejP1yXdjU8XnTRqsvTJeg5fH+5sCfkjuZu6yRO9ouwJ0QZKMRQ= Received: from BYAPR11MB3301.namprd11.prod.outlook.com (2603:10b6:a03:7f::26) by BYAPR11MB3317.namprd11.prod.outlook.com (2603:10b6:a03:7b::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2900.28; Tue, 14 Apr 2020 16:12:11 +0000 Received: from BYAPR11MB3301.namprd11.prod.outlook.com ([fe80::f8cb:58cd:e958:fff4]) by BYAPR11MB3301.namprd11.prod.outlook.com ([fe80::f8cb:58cd:e958:fff4%6]) with mapi id 15.20.2900.028; Tue, 14 Apr 2020 16:12:11 +0000 From: "Ananyev, Konstantin" To: Honnappa Nagarahalli , "dev@dpdk.org" CC: "david.marchand@redhat.com" , "jielong.zjl@antfin.com" , nd , nd Thread-Topic: [PATCH v3 5/9] ring: introduce HTS ring mode Thread-Index: AQHWCd9zHNOpeydtq02y3aBTRTrn0qh3wpMAgAEQFWA= Date: Tue, 14 Apr 2020 16:12:11 +0000 Message-ID: References: <20200402220959.29885-1-konstantin.ananyev@intel.com> <20200403174235.23308-1-konstantin.ananyev@intel.com> <20200403174235.23308-6-konstantin.ananyev@intel.com> In-Reply-To: Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-reaction: no-action dlp-version: 11.2.0.6 authentication-results: spf=none (sender IP is ) smtp.mailfrom=konstantin.ananyev@intel.com; x-originating-ip: [192.198.151.172] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 7bae616b-cc6e-4552-6d08-08d7e08e96e3 x-ms-traffictypediagnostic: BYAPR11MB3317: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:10000; x-forefront-prvs: 0373D94D15 x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:BYAPR11MB3301.namprd11.prod.outlook.com; PTR:; CAT:NONE; SFTY:; SFS:(10019020)(366004)(136003)(376002)(39860400002)(346002)(396003)(2906002)(66556008)(81156014)(66446008)(8936002)(9686003)(33656002)(86362001)(66946007)(66476007)(7696005)(5660300002)(8676002)(52536014)(76116006)(64756008)(55016002)(54906003)(4326008)(186003)(6506007)(316002)(478600001)(26005)(110136005)(71200400001); DIR:OUT; SFP:1102; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: C/mCTmTE70itrfb4eaZSvavB4Y3A8bMYP+ylB9cAp4TyhNDEeU5AnlmQgtG6h7199LqVlUi9tGu4vqupmldIh87kGM/p7bYoJ7S+A9H0o1W5sLkLQIvILRkkD9UYMliNklvZF2PkYeEQ6gtYxYC0GlCPYSw4Jy9YPh2RtlFy/uiUi0VP6GWNPacWnWXliVeX6ezH9b+0acxvKFYqIABj3TfMdX3yYqicYeXhQSAzavHh7H9Z4Z/Ku/UabKJy0NNbWA+Dq5LJQRVawUfZUtERbsJ8UAhqqka5VTOQJp4XG91rc3kATh+JQhsh15UfY/S2Qd1gyAFE8T5QVmZT5x3MnTcWFcmnQcWqiSQsbNESwu6I+4kVy8crUmyBixdQWqtsHb/KrWdYMQfaFaZqewcYwi0OKoP7T5eZ2AQgFicp/t9VFVzBLoy3shpwt2TYeh9N x-ms-exchange-antispam-messagedata: EOME9Nz1i/0YEYedrovNNP+YnYHjMflAOKj1TAyG8Nd+dgbZJw///5VLSLQQRmqjKUVmC67BVFFkazGZ/La2lFjjd0i3HvUN89WIJCz63YjJj+Vg6ae8pmmnInAk7ElBIC+ebwl1bHWUza5dCPG8lA== x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-CrossTenant-Network-Message-Id: 7bae616b-cc6e-4552-6d08-08d7e08e96e3 X-MS-Exchange-CrossTenant-originalarrivaltime: 14 Apr 2020 16:12:11.5150 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: 8LGIm4B+4uRXlUqeAagLCq3RDdvIAHvzThvXdCUaAfdweb2NiMqj7xuwIL+ha3o6/r6N5m0peDAv2bk60/EC/Q72ve+en0Noy6gZyv1ICEQ= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BYAPR11MB3317 X-OriginatorOrg: intel.com Subject: Re: [dpdk-dev] [PATCH v3 5/9] ring: introduce HTS ring mode X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Honnappa, >=20 > Hi Konstantin, > Few nits/comments inline. >=20 > >=20 > > diff --git a/lib/librte_ring/rte_ring_hts.h b/lib/librte_ring/rte_ring_= hts.h new > > file mode 100644 index 000000000..062d7be6c > > --- /dev/null > > +++ b/lib/librte_ring/rte_ring_hts.h > > @@ -0,0 +1,210 @@ > > +/* SPDX-License-Identifier: BSD-3-Clause > > + * > > + * Copyright (c) 2010-2017 Intel Corporation > > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org > > + * All rights reserved. > > + * Derived from FreeBSD's bufring.h > > + * Used as BSD-3 Licensed with permission from Kip Macy. > > + */ > > + > > +#ifndef _RTE_RING_HTS_H_ > > +#define _RTE_RING_HTS_H_ > > + > > +/** > > + * @file rte_ring_hts.h > > + * @b EXPERIMENTAL: this API may change without prior notice > > + * It is not recommended to include this file directly. > > + * Please include instead. > > + * > > + * Contains functions for serialized, aka Head-Tail Sync (HTS) ring mo= de. > > + * In that mode enqueue/dequeue operation is fully serialized: > > + * at any given moement only one enqueue/dequeue operation can proceed= . > ^^^^^^^^ moment > > + * This is achieved by thread is allowed to proceed with changing > ^^^^^^^^^^^^^^ allowing a thr= ead > > +head.value > > + * only when head.value =3D=3D tail.value. > > + * Both head and tail values are updated atomically (as one 64-bit val= ue). > > + * To achieve that 64-bit CAS is used by head update routine. > > + */ > > + > > +#ifdef __cplusplus > > +extern "C" { > > +#endif > > + > > +#include > > + >=20 > >=20 > > diff --git a/lib/librte_ring/rte_ring_hts_generic.h > > b/lib/librte_ring/rte_ring_hts_generic.h > > new file mode 100644 > > index 000000000..da08f1d94 > > --- /dev/null > > +++ b/lib/librte_ring/rte_ring_hts_generic.h > > @@ -0,0 +1,198 @@ > > +/* SPDX-License-Identifier: BSD-3-Clause > > + * > > + * Copyright (c) 2010-2020 Intel Corporation > > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org > > + * All rights reserved. > > + * Derived from FreeBSD's bufring.h > > + * Used as BSD-3 Licensed with permission from Kip Macy. > > + */ > > + > > +#ifndef _RTE_RING_HTS_GENERIC_H_ > > +#define _RTE_RING_HTS_GENERIC_H_ > > + > > +/** > > + * @file rte_ring_hts_generic.h > > + * It is not recommended to include this file directly, > > + * include instead. > > + * Contains internal helper functions for head/tail sync (HTS) ring mo= de. > > + * For more information please refer to . > > + */ > > + > > +static __rte_always_inline void > > +__rte_ring_hts_update_tail(struct rte_ring_hts_headtail *ht, uint32_t = num, > > + uint32_t enqueue) > > +{ > > + union rte_ring_ht_pos p; > > + > > + if (enqueue) > > + rte_smp_wmb(); > > + else > > + rte_smp_rmb(); > > + > > + p.raw =3D rte_atomic64_read((rte_atomic64_t *)(uintptr_t)&ht- > > >ht.raw); > This read can be avoided if the new head can be returned from '__rte_ring= _hts_head_wait'. Yes, or even cur_head and num should be enough to avoid read here. >=20 > > + > > + p.pos.head =3D p.pos.tail + num; > > + p.pos.tail =3D p.pos.head; > > + > > + rte_atomic64_set((rte_atomic64_t *)(uintptr_t)&ht->ht.raw, p.raw); } > Why not use 32b atomic operation here and update just the tail? Agree, this code-path can do just 32-bit store (as head is not going to cha= nge). =20 >=20 > > + > > +/** > > + * @internal waits till tail will become equal to head. > > + * Means no writer/reader is active for that ring. > > + * Suppose to work as serialization point. > > + */ > > +static __rte_always_inline void > > +__rte_ring_hts_head_wait(const struct rte_ring_hts_headtail *ht, > > + union rte_ring_ht_pos *p) > > +{ > > + p->raw =3D rte_atomic64_read((rte_atomic64_t *) > > + (uintptr_t)&ht->ht.raw); > > + > > + while (p->pos.head !=3D p->pos.tail) { > > + rte_pause(); > > + p->raw =3D rte_atomic64_read((rte_atomic64_t *) > > + (uintptr_t)&ht->ht.raw); > > + } > > +} > > + > > +/** > > + * @internal This function updates the producer head for enqueue > > + * > > + * @param r > > + * A pointer to the ring structure > > + * @param is_sp > > + * Indicates whether multi-producer path is needed or not > > + * @param n > > + * The number of elements we will want to enqueue, i.e. how far shou= ld the > > + * head be moved > > + * @param behavior > > + * RTE_RING_QUEUE_FIXED: Enqueue a fixed number of items from a r= ing > > + * RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from > > ring > > + * @param old_head > > + * Returns head value as it was before the move, i.e. where enqueue = starts > > + * @param new_head > > + * Returns the current/new head value i.e. where enqueue finishes Ups, copy/paste thing - will remove. > Would be good to return the new_head from this function and use it in '__= rte_ring_hts_update_tail'. I think old_head + num should be enough here (see above). >=20 > > + * @param free_entries > > + * Returns the amount of free space in the ring BEFORE head was move= d > > + * @return > > + * Actual number of objects enqueued. > > + * If behavior =3D=3D RTE_RING_QUEUE_FIXED, this will be 0 or n only= . > > + */ > Minor, suggest removing the elaborate comments, it is not required and di= fficult to maintain. > I think we should do the same thing for other files too. Sorry, didn't get you here: what exactly do you suggest to remove? >=20 > > +static __rte_always_inline unsigned int > > +__rte_ring_hts_move_prod_head(struct rte_ring *r, unsigned int num, > > + enum rte_ring_queue_behavior behavior, uint32_t *old_head, > > + uint32_t *free_entries) > > +{ > > + uint32_t n; > > + union rte_ring_ht_pos np, op; > > + > > + const uint32_t capacity =3D r->capacity; > > + > > + do { > > + /* Reset n to the initial burst count */ > > + n =3D num; > > + > > + /* wait for tail to be equal to head */ > > + __rte_ring_hts_head_wait(&r->hts_prod, &op); > > + > > + /* add rmb barrier to avoid load/load reorder in weak > > + * memory model. It is noop on x86 > > + */ > > + rte_smp_rmb(); > > + > > + /* > > + * The subtraction is done between two unsigned 32bits > > value > > + * (the result is always modulo 32 bits even if we have > > + * *old_head > cons_tail). So 'free_entries' is always between > > 0 > > + * and capacity (which is < size). > > + */ > > + *free_entries =3D capacity + r->cons.tail - op.pos.head; > > + > > + /* check that we have enough room in ring */ > > + if (unlikely(n > *free_entries)) > > + n =3D (behavior =3D=3D RTE_RING_QUEUE_FIXED) ? > > + 0 : *free_entries; > > + > > + if (n =3D=3D 0) > > + break; > > + > > + np.pos.tail =3D op.pos.tail; > > + np.pos.head =3D op.pos.head + n; > > + > > + } while (rte_atomic64_cmpset(&r->hts_prod.ht.raw, > > + op.raw, np.raw) =3D=3D 0); > I think we can use 32b atomic operation here and just update the head. I think we have to do proper 64 bit CAS here, otherwise ABA race could aris= e: Thread reads head/tail values, then get suspended just before CAS instructi= on for a while. Thread resumes when ring head value is equal to thread's local head value, but tail differs (some other thread enqueuing into the ring). If we'll do CAS just for head - it would succeed, though it shouldn't. =20 I understand that with 32-bit head/tail values probability of such situatio= n is really low, but still. >=20 > > + > > + *old_head =3D op.pos.head; > > + return n; > > +} > > + > > +/** > > + * @internal This function updates the consumer head for dequeue > > + * > > + * @param r > > + * A pointer to the ring structure > > + * @param is_sc > > + * Indicates whether multi-consumer path is needed or not > > + * @param n > > + * The number of elements we will want to enqueue, i.e. how far shou= ld the > > + * head be moved > > + * @param behavior > > + * RTE_RING_QUEUE_FIXED: Dequeue a fixed number of items from a > > ring > > + * RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from > > ring > > + * @param old_head > > + * Returns head value as it was before the move, i.e. where dequeue = starts > > + * @param new_head > > + * Returns the current/new head value i.e. where dequeue finishes > > + * @param entries > > + * Returns the number of entries in the ring BEFORE head was moved > > + * @return > > + * - Actual number of objects dequeued. > > + * If behavior =3D=3D RTE_RING_QUEUE_FIXED, this will be 0 or n on= ly. > > + */ > > +static __rte_always_inline unsigned int > > +__rte_ring_hts_move_cons_head(struct rte_ring *r, unsigned int num, > > + enum rte_ring_queue_behavior behavior, uint32_t *old_head, > > + uint32_t *entries) > > +{ > > + uint32_t n; > > + union rte_ring_ht_pos np, op; > > + > > + /* move cons.head atomically */ > > + do { > > + /* Restore n as it may change every loop */ > > + n =3D num; > > + > > + /* wait for tail to be equal to head */ > > + __rte_ring_hts_head_wait(&r->hts_cons, &op); > > + > > + /* add rmb barrier to avoid load/load reorder in weak > > + * memory model. It is noop on x86 > > + */ > > + rte_smp_rmb(); > > + > > + /* The subtraction is done between two unsigned 32bits value > > + * (the result is always modulo 32 bits even if we have > > + * cons_head > prod_tail). So 'entries' is always between 0 > > + * and size(ring)-1. > > + */ > > + *entries =3D r->prod.tail - op.pos.head; > > + > > + /* Set the actual entries for dequeue */ > > + if (n > *entries) > > + n =3D (behavior =3D=3D RTE_RING_QUEUE_FIXED) ? 0 : > > *entries; > > + > > + if (unlikely(n =3D=3D 0)) > > + break; > > + > > + np.pos.tail =3D op.pos.tail; > > + np.pos.head =3D op.pos.head + n; > > + > > + } while (rte_atomic64_cmpset(&r->hts_cons.ht.raw, > > + op.raw, np.raw) =3D=3D 0); > > + > > + *old_head =3D op.pos.head; > > + return n; > > +} > > + > > +#endif /* _RTE_RING_HTS_GENERIC_H_ */ > > -- > > 2.17.1