From: "Ananyev, Konstantin"
To: "Ananyev, Konstantin", Honnappa Nagarahalli, Phil Yang, "thomas@monjalon.net", "Van Haaren, Harry", "stephen@networkplumber.org", "maxime.coquelin@redhat.com", "dev@dpdk.org", "Richardson, Bruce"
Cc: "david.marchand@redhat.com", "jerinj@marvell.com", "hemant.agrawal@nxp.com", Gavin Hu, Ruifeng Wang, Joyce Kong, nd, nd
Subject: Re: [dpdk-dev] [PATCH v3 06/12] ipsec: optimize with c11 atomic for sa outbound sqn update
Date: Tue, 24 Mar 2020 13:21:34 +0000
References: <1583999071-22872-1-git-send-email-phil.yang@arm.com> <1584407863-774-1-git-send-email-phil.yang@arm.com> <1584407863-774-7-git-send-email-phil.yang@arm.com>

> -----Original Message-----
> From: dev On Behalf Of Ananyev, Konstantin
> Sent: Tuesday, March 24, 2020 1:10 PM
> To: Honnappa Nagarahalli; Phil Yang; thomas@monjalon.net; Van Haaren, Harry; stephen@networkplumber.org; maxime.coquelin@redhat.com; dev@dpdk.org; Richardson, Bruce
> Cc: david.marchand@redhat.com; jerinj@marvell.com; hemant.agrawal@nxp.com; Gavin Hu; Ruifeng Wang; Joyce Kong; nd; nd
> Subject: Re: [dpdk-dev] [PATCH v3 06/12] ipsec: optimize with c11 atomic for sa outbound sqn update
>
>
> > > > > > For SA outbound packets, rte_atomic64_add_return is used to
> > > > > > generate SQN atomically. This introduced an unnecessary full
> > > > > > barrier by calling the '__sync' builtin implemented rte_atomic_XX
> > > > > > API on aarch64. This patch optimized it with c11 atomic and
> > > > > > eliminated the expensive barrier for aarch64.
> > > > > >
> > > > > > Signed-off-by: Phil Yang
> > > > > > Reviewed-by: Ruifeng Wang
> > > > > > Reviewed-by: Gavin Hu
> > > > > > ---
> > > > > >  lib/librte_ipsec/ipsec_sqn.h | 3 ++-
> > > > > >  lib/librte_ipsec/sa.h        | 2 +-
> > > > > >  2 files changed, 3 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
> > > > > > index 0c2f76a..e884af7 100644
> > > > > > --- a/lib/librte_ipsec/ipsec_sqn.h
> > > > > > +++ b/lib/librte_ipsec/ipsec_sqn.h
> > > > > > @@ -128,7 +128,8 @@ esn_outb_update_sqn(struct rte_ipsec_sa *sa, uint32_t *num)
> > > > > >
> > > > > >          n = *num;
> > > > > >          if (SQN_ATOMIC(sa))
> > > > > > -                sqn = (uint64_t)rte_atomic64_add_return(&sa->sqn.outb.atom, n);
> > > > > > +                sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
> > > > > > +                        __ATOMIC_RELAXED);
> > > > >
> > > > > One generic thing to note:
> > > > > clang for i686 in some cases will generate a proper function call
> > > > > for 64-bit __atomic builtins (gcc seems to always generate cmpxchg8b
> > > > > for such cases).
> > > > > Does anyone consider it as a potential problem?
> > > > > It is probably not a big deal, but I would like to know the broader opinion.
> > > > I had looked at this some time back for GCC. The function call is
> > > > generated only if the underlying platform does not support the atomic
> > > > instructions for the operand size. Otherwise, gcc generates the
> > > > instructions directly.
> > > > I would think the behavior would be the same for clang.
> > > >
> > > From what I see not really.
> > > As an example:
> > >
> > > $ cat tatm11.c
> > > #include <stdint.h>
> > >
> > > struct x {
> > >         uint64_t v __attribute__((aligned(8)));
> > > };
> > >
> > > uint64_t
> > > ffxadd1(struct x *x, uint32_t n, uint32_t m)
> > > {
> > >         return __atomic_add_fetch(&x->v, n, __ATOMIC_RELAXED);
> > > }
> > >
> > > uint64_t
> > > ffxadd11(uint64_t *v, uint32_t n, uint32_t m)
> > > {
> > >         return __atomic_add_fetch(v, n, __ATOMIC_RELAXED);
> > > }
> > >
> > > gcc for i686 will generate code with cmpxchg8b for both cases.
> > > clang will generate cmpxchg8b for ffxadd1() - when data is explicitly 8B
> > > aligned, but will emit a function call for ffxadd11().
> > Does it require libatomic to be linked in this case?
>
> Yes, it does.
> In fact, it is the same story even with current dpdk.org master.
> To build i686-native-linuxapp-clang successfully, I have to
> explicitly add EXTRA_LDFLAGS="-latomic".
> To be more specific:
> $ for i in i686-native-linuxapp-clang/lib/*.a; do x=`nm $i | grep __atomic_`; if [[ -n "${x}" ]]; then echo $i; echo $x; fi; done
> i686-native-linuxapp-clang/lib/librte_distributor.a
> U __atomic_load_8 U __atomic_store_8
> i686-native-linuxapp-clang/lib/librte_pmd_opdl_event.a
> U __atomic_load_8 U __atomic_store_8
> i686-native-linuxapp-clang/lib/librte_rcu.a
> U __atomic_compare_exchange_8 U __atomic_load_8
>
> As there have been no complaints so far, it makes me think that
> probably no one is using clang for IA-32 builds.
>
> > Clang documentation calls out the unaligned case where it would generate the function call
> > [1].
>
> Seems so, and it treats uint64_t as 4B aligned for IA.

correction: for IA-32

>
> > On aarch64, the atomic instructions need the address to be aligned.
>
> For that particular case (cmpxchg8b) there is no such restriction for IA-32.
> Again, as I said before, gcc manages to emit code without function calls
> for exactly the same source.
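To make the alignment point concrete, here is a minimal sketch (not part of the patch) that exposes the alignment the compiler assigns to a bare uint64_t next to that of the explicitly aligned struct from tatm11.c. The helper names u64_default_align(), x_forced_align() and xadd_checked() are made up for illustration, and a gcc- or clang-compatible compiler is assumed. I would expect x_forced_align() to be 8 everywhere, while u64_default_align() may come out as 4 on an IA-32 build, which would match the behaviour described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as in tatm11.c: v is explicitly 8B aligned. */
struct x {
	uint64_t v __attribute__((aligned(8)));
};

/* Alignment the ABI gives a bare uint64_t (hypothetical helper). */
static inline size_t
u64_default_align(void)
{
	return _Alignof(uint64_t);
}

/* Alignment of the explicitly aligned struct (hypothetical helper). */
static inline size_t
x_forced_align(void)
{
	return _Alignof(struct x);
}

/*
 * Relaxed 64-bit add that checks the alignment assumption at run time.
 * Taking struct x * rather than uint64_t * lets the compiler see the
 * 8B alignment at the call site (hypothetical helper).
 */
static inline uint64_t
xadd_checked(struct x *x, uint32_t n)
{
	assert(((uintptr_t)&x->v % 8) == 0);
	return __atomic_add_fetch(&x->v, n, __ATOMIC_RELAXED);
}
```

Printing the two alignment values from an i686 clang build would be one quick way to confirm whether the libatomic calls come from the 4B default alignment.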
>
> > > > [1] https://clang.llvm.org/docs/Toolchain.html#atomics-library
> > > >
> > > > >
> > > > > >
> > > > > >          else {
> > > > > >                  sqn = sa->sqn.outb.raw + n;
> > > > > >                  sa->sqn.outb.raw = sqn;
> > > > > > diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
> > > > > > index d22451b..cab9a2e 100644
> > > > > > --- a/lib/librte_ipsec/sa.h
> > > > > > +++ b/lib/librte_ipsec/sa.h
> > > > > > @@ -120,7 +120,7 @@ struct rte_ipsec_sa {
> > > > > >           */
> > > > > >          union {
> > > > > >                  union {
> > > > > > -                        rte_atomic64_t atom;
> > > > > > +                        uint64_t atom;
> > > > > >                          uint64_t raw;
> > > > > >                  } outb;
> > > > >
> > > > > If we don't need rte_atomic64 here anymore, then I think we can
> > > > > collapse the union to just:
> > > > > uint64_t outb;
> > > > >
> > > > > >                  struct {
> > > > > > --
> > > > > > 2.7.4
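
For illustration only, the collapsed layout suggested above could look roughly like the sketch below. This is not the DPDK code: struct sa_sketch, its use_atomic flag (standing in for SQN_ATOMIC(sa)) and outb_sqn_add() are hypothetical names, and the overflow handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for struct rte_ipsec_sa. */
struct sa_sketch {
	uint64_t outb;   /* single field instead of union { atom; raw; } */
	int use_atomic;  /* stands in for SQN_ATOMIC(sa) */
};

/* Advance the outbound SQN by n and return the new value. */
static inline uint64_t
outb_sqn_add(struct sa_sketch *sa, uint32_t n)
{
	uint64_t sqn;

	assert(sa != NULL);
	if (sa->use_atomic) {
		/* relaxed: only the counter update itself has to be atomic */
		sqn = __atomic_add_fetch(&sa->outb, n, __ATOMIC_RELAXED);
	} else {
		sqn = sa->outb + n;
		sa->outb = sqn;
	}
	return sqn;
}
```

Both paths then operate on the same plain uint64_t. Given the clang behaviour discussed in this thread, such a field might still need explicit 8B alignment on IA-32 to avoid the libatomic call.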