From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id AB646A0547; Wed, 27 Oct 2021 13:03:24 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2D10D410FA; Wed, 27 Oct 2021 13:03:24 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by mails.dpdk.org (Postfix) with ESMTP id DC41D4068C for ; Wed, 27 Oct 2021 13:03:22 +0200 (CEST) X-IronPort-AV: E=McAfee;i="6200,9189,10149"; a="230408210" X-IronPort-AV: E=Sophos;i="5.87,186,1631602800"; d="scan'208";a="230408210" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Oct 2021 04:03:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.87,186,1631602800"; d="scan'208";a="486624866" Received: from fmsmsx602.amr.corp.intel.com ([10.18.126.82]) by orsmga007.jf.intel.com with ESMTP; 27 Oct 2021 04:03:21 -0700 Received: from fmsmsx611.amr.corp.intel.com (10.18.126.91) by fmsmsx602.amr.corp.intel.com (10.18.126.82) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2242.12; Wed, 27 Oct 2021 04:03:20 -0700 Received: from fmsedg602.ED.cps.intel.com (10.1.192.136) by fmsmsx611.amr.corp.intel.com (10.18.126.91) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2242.12 via Frontend Transport; Wed, 27 Oct 2021 04:03:20 -0700 Received: from NAM04-BN8-obe.outbound.protection.outlook.com (104.47.74.43) by edgegateway.intel.com (192.55.55.71) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2242.12; Wed, 27 Oct 2021 04:03:20 -0700 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=A56X853JOkHl+Ifr0lSrieda7eTAc0wZUKnpSnEH7SdP4+PFsO92JXSAbqkYNAURwO2WoE42F6DFEVbhCC/Uo9UcIeqKfFrDtuRY7iFziSzLptHTRsRRV6oUK10JWMVsPEGi0BWVR2SkDarFI9ao+0jhRbaLjRi8gTUhKpozaBvZwCrrDpNmIyz4ay5s+IIEn4Fl9plCwCPgQPrW/elxKw5G/+nBRyFu2JYtOvW264IyBKO3CDqWJd+3yztLrE5aeobUO6wq339IEpJo2A1dEBPWcBV6jR7p150aSNiJlH8KxFL2VU2N4Z5NxFcVXcNx24aT0OUvojuGnX48jDc+UA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=eOJIBni3KMqroVOq57AZVsdjjFAbNk6HYWZelCyVsjE=; b=Sz952e3s2fg2xpp+7VZ/ilu80TgppVmpjjFgMGkd6CuLc92jBpHOeuqcl/dS/zMrzP2l6YShqXbcPg+BjIsW3VunesNW+cZxRy0EcSQs+XzknyRY2tCacTpQkPgRI2YkPYbFBnHXebPyzpJxAXfVbzsKSJFapfN5Ct/YAJrFVWl4xZzibt2f8huSiXZNT4SpkONMEjl+WVcvKNZadEqWxAsiCRsnot05ZeM1RQmAU6AOofioEwlamE3MwTAhNr0AsLLiJrIdaVNOv8/bMzda08Svo29le+GSmcDcMHWg6X5t3c7krtxmawwZeerm2yGMoZsGW1kb39I88GE32UmGqA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel.onmicrosoft.com; s=selector2-intel-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=eOJIBni3KMqroVOq57AZVsdjjFAbNk6HYWZelCyVsjE=; b=S1Gr99U2ojFDSmvtQAkuAZOokt6TlKPwRFWwqB2Ietzy3WyqbDq4Nk8rgIZU+OXw64sOlpL2/cL/ps8GHeeI6SlrrXZnLMKMkTLsleDYty1IqXK8Z3n1NoMkr6XjhJqsp8IgIlS/LyV3s/hxYCfbgE6cjTfHm0RXFDXTPz+3JDY= Received: from BN0PR11MB5712.namprd11.prod.outlook.com (2603:10b6:408:160::17) by BN6PR1101MB2081.namprd11.prod.outlook.com (2603:10b6:405:52::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.18; Wed, 27 Oct 2021 11:03:09 +0000 Received: from BN0PR11MB5712.namprd11.prod.outlook.com ([fe80::528:c09e:b5cd:2e21]) by BN0PR11MB5712.namprd11.prod.outlook.com ([fe80::528:c09e:b5cd:2e21%3]) with mapi id 15.20.4649.015; Wed, 27 Oct 2021 11:03:09 +0000 From: "Van Haaren, Harry" To: Thomas Monjalon , Aman Kumar CC: "dev@dpdk.org" , "viacheslavo@nvidia.com" , "Burakov, Anatoly" , "keesang.song@amd.com" , "aman.kumar@vvdntech.in" , "jerinjacobk@gmail.com" , "Ananyev, Konstantin" , "Richardson, Bruce" , "honnappa.nagarahalli@arm.com" , Ruifeng Wang , "David Christensen" , "david.marchand@redhat.com" , "stephen@networkplumber.org" Thread-Topic: [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy support for AMD platform Thread-Index: AQHXywRV93xG4AptJ0KbROxhBgfdp6vmfvMAgAAqDWA= Date: Wed, 27 Oct 2021 11:03:09 +0000 Message-ID: References: <20211026155645.246783-1-aman.kumar@vvdntech.in> <20211027072810.257795-1-aman.kumar@vvdntech.in> <20211027072810.257795-2-aman.kumar@vvdntech.in> <1932804.9rrtejxFVQ@thomas> In-Reply-To: <1932804.9rrtejxFVQ@thomas> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-reaction: no-action dlp-version: 11.6.200.16 authentication-results: monjalon.net; dkim=none (message not signed) header.d=none;monjalon.net; dmarc=none action=none header.from=intel.com; x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 351ea3c6-6eb0-4e62-e23d-08d999395cb7 x-ms-traffictypediagnostic: BN6PR1101MB2081: x-ld-processed: 46c98d88-e344-4ed4-8496-4ed7712e255d,ExtAddr x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:8882; x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: a0VhGlyktJpBu5XGlU8xfVzkqqg5hD70Q9A/H6ITvuFIm+Z/oyTTIT4S2yFWqFKOMMPimhDjQ5/5f0wmthMmBFpdoYfkbcuKydS8LMfMEPhxucC9vGHnT7dhlqGlZ/v3hTAV6Me2OL9snmrlKzv2ZBMsnTtwMKsyzj3HecUwkY7fPqPLM/dBXxqRwf5sbwRvjy/87mLPlZI63S4r36DoZ8yzZNGSxFG5qAKxpzi1OIx4QvFVaKviNlk21oiYEyDJhtjIuRAcpRhIveNTiZ5RtjOHz7oz7OxKaD9MRsfsiqKLNgZ1HCIxXTc5mfRuTAFPEQ+8iGzHnOIhPm21C59So4UG6Hs0ezUs85O8uoG4RM05EtkiJTxK9uHC0W8yXGYbZlaDUc4yUtNlI0vD6JqYfS1vVvanYr5RDfhbX+JCDf4q/lR0auRID3t2jugO348G4+lW7sIDAssoAfLJZM42kadrN9Dp+SuYnkzJvjWLYFGGFfryBAYyr4blmZSxOpp84pSnwJJXNFB36n5kP5KW1+esCYjr7AkoUKr9EtDjc3znpFh9O1jhIRYnH5XOzGNZ2xOogj/VGfIk8nED5ud9psfZBDVYBJFehG8yJNwgoCzowXhf5Vz3XMZQj0ltJ16sxSPjlyYm3+o0zfup0wvp+i6Zv32iaUyKdUTVXKrDanaIBQf6m0CpHTFHmVioVaqaXxqtxAksZsAx2VNedBpRjTWUx41WZJfrg61EyG1jrNLniatOYUN15oex/qvFGDFtxnb44aYiR82EybVwhDFFxLwqTkurp+yYvcxOgL65gIA= x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:BN0PR11MB5712.namprd11.prod.outlook.com; PTR:; CAT:NONE; SFS:(366004)(64756008)(66446008)(508600001)(38070700005)(7696005)(66476007)(53546011)(66946007)(966005)(4326008)(186003)(38100700002)(66556008)(9686003)(66574015)(52536014)(76116006)(2906002)(55016002)(54906003)(5660300002)(110136005)(122000001)(6506007)(83380400001)(316002)(71200400001)(33656002)(86362001)(8676002)(26005)(82960400001)(8936002)(7416002); DIR:OUT; SFP:1102; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: =?us-ascii?Q?uyKkngDtW0mbavgg3uv9Q52m3u9oILueUvf9IJHO6K28KdWZ/JjEK8DTotkW?= =?us-ascii?Q?6v5Jk+oh4KgN8cdos55pMyl8rzU8Q5ZGkyX59L/S93NM/w43Q3jwaREiALkQ?= =?us-ascii?Q?uvy78Vs9JIZ4E5tn6MgSio4ogi4dO9JhZTrByzKsCBTiqBscvlmjxJ7KU+bO?= =?us-ascii?Q?imwHOyqWXtD2vHIj+ilT5VPkhM6v+dAfRrB2f7pxbC7DxCdNZPP54ZE/tnzE?= =?us-ascii?Q?zAWTdhBYy8TZaYsKpDQpJ76DluLwW8kSuy3TRmTj9ugwfkFG1dfvnmqGXIZq?= =?us-ascii?Q?nF9SPbRuDp4rdPLtjjLmhfWqUHGaGRx4QojKxiuSBKFolIQPdZY3Q6ry+c2q?= =?us-ascii?Q?A5uiSX0DOt1XHmMkoYi82318KjiWbT2w7xzWDXZI2mR29W219rvSIbayDx/n?= =?us-ascii?Q?qB3yzOnmALPKInrsNSKQOpE3D59grjZNMQibVlltMskk7zTbrqlzlGzTKx3b?= =?us-ascii?Q?ckPt5gkGXMM8T6JXSviUgF32fbi54E+PWad1n4ntVOj1MgxvHmT3t6XRiiJc?= =?us-ascii?Q?KcKU4rX3Bo7XbBrEJW+oXWyv0V56VdYK9MY/xPGbv+SWi5vkwXkCaDir6FGO?= =?us-ascii?Q?M/WHRIEWV8UfXdXqic5bHQ0T3dXJ44AHdz1oPAMp8ZXJPgMDRsSYfd3LnV/T?= =?us-ascii?Q?lmABvdk2HhBaAYkWWsg1u4nbCbB6mLwWbL9FtB16UalesDucCUqZzGQRnRx/?= =?us-ascii?Q?REXL87mA1cgTR1RHpF2U4PGWehtA7uFCL+T/x48RC6J5lxByZXmEoS7bqvAD?= =?us-ascii?Q?RXDLRk4qwGlG4H2hKRomiIqWP5LVi2Fdr9T1GFBWAmJTD+dY0omhBvp5CQAe?= =?us-ascii?Q?RtYhlzc0rDndiSQ78GtaFDKwrSzcOpwBuhWL57WlmqaylhnMY+JvOT7AlT8Z?= =?us-ascii?Q?PtCNL8UEcNZPOdGDQZ7hMgI0aORbT57xT5Jz13rTeWAv0EP9oxvjw8aMGJyi?= =?us-ascii?Q?959WWgFSb8yeLwoHP86G/cC3CWuPmgKCG88zjWoGJEhRtlQ923UM1/0IyiB2?= =?us-ascii?Q?i1Q/c1gQuS0U+zh0Ivky+sZIqAGmitdLPHZYGY7kXQouNs292uM9SVz4vvMp?= =?us-ascii?Q?XuhhtzAqn51Qb70vVEC4sFACjjrOeO/oWSyJI0kJrfd0M6H757Deuufax2dJ?= =?us-ascii?Q?6Gaxa2BeT5vG2+crxwsGziOrthaeR2IpmVsgD2FEi+5WA1jRJHymE96cAw3k?= =?us-ascii?Q?bE8PQNPQPOFNwu85Vx7mys1oHyXnHyAmN6CKy6O/9mP3SW3AKemKNcv5cHLA?= =?us-ascii?Q?e/FRhgDg4XUV65F8I0UgFOj3SiHqf7z8MQcYYmfNwVBVNDI8yAXH+NGfdF7v?= =?us-ascii?Q?eHDyLkhIffAiNgj/bg/LW8D/c8LfQ4IAkWQtyZ5P71q1g8DEG1O/w+yRaWb+?= =?us-ascii?Q?acP0rK9uSor2nOrNjW9Xjfdy6vDx9ITm8MFlUagnw2p0eb3QgZxT46sF3Hms?= =?us-ascii?Q?FXFxE7de+GLRHAPFWytOmLgy3xTF7ma55z+0DY3OY9Rz3I3rEpKLiBKUVXK1?= =?us-ascii?Q?a4FFfefid3P5NwJQCcTQj/g97mcTxWBuQVE46f+ATpCKdB6EYoTgb9R7nFbw?= =?us-ascii?Q?nFBGM5FKoS+dHCVP9+BydExb9zP79Dix0nplHCmWn22ZjTO4PahdXLnTzIPQ?= =?us-ascii?Q?7WoIODauzK0weum+MVQlCThwHd54QKb2Ia0r3/wPz6ri6PmMTVWTf1mnHMrD?= =?us-ascii?Q?hqCL2Q=3D=3D?= Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: BN0PR11MB5712.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 351ea3c6-6eb0-4e62-e23d-08d999395cb7 X-MS-Exchange-CrossTenant-originalarrivaltime: 27 Oct 2021 11:03:09.6719 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: 1wYWubgVLiV/VrsREUh44OT+43Gy10YHNioBYNOO8ekwwpDMq8Q/BG+ho1nHTtMQO5IljdRt9Ip9xzKqzsEHRscg4M1PDIPZkCbsDl6vf74= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BN6PR1101MB2081 X-OriginatorOrg: intel.com Subject: Re: [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy support for AMD platform X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > -----Original Message----- > From: dev On Behalf Of Thomas Monjalon > Sent: Wednesday, October 27, 2021 9:13 AM > To: Aman Kumar > Cc: dev@dpdk.org; viacheslavo@nvidia.com; Burakov, Anatoly > ; keesang.song@amd.com; > aman.kumar@vvdntech.in; jerinjacobk@gmail.com; Ananyev, Konstantin > ; Richardson, Bruce > ; honnappa.nagarahalli@arm.com; Ruifeng Wang > ; David Christensen ; > david.marchand@redhat.com; stephen@networkplumber.org > Subject: Re: [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy > support for AMD platform >=20 > 27/10/2021 09:28, Aman Kumar: > > This patch provides a rte_memcpy* call with temporal stores. > > Use -Dcpu_instruction_set=3DznverX with build to enable this API. > > > > Signed-off-by: Aman Kumar >=20 > For the series, Acked-by: Thomas Monjalon > With the hope that such optimization will go in libc in a near future. >=20 > If there is no objection, I will merge this AMD-specific series in 21.11-= rc2. > It should not affect other platforms. Hi Folks, This patchset was brought to my attention, and I have a few concerns. I'll add short snippets of context from the patch here so I can refer to it= below; +/** + * Copy 16 bytes from one location to another, + * with temporal stores + */ +static __rte_always_inline void +rte_copy16_ts(uint8_t *dst, uint8_t *src) +{ + __m128i var128; + + var128 =3D _mm_stream_load_si128((__m128i *)src); + _mm_storeu_si128((__m128i *)dst, var128); +} 1) What is fundamentally specific to the znverX CPU? Is there any reason th= is can not just be enabled for x86-64 generic with SSE4.1 ISA requirements? _mm_stream_load_si128() is part of SSE4.1 _mm_storeu_si128() is SSE2.=20 Using the intrinsics guide for lookup of intrinsics to ISA level: https://w= ww.intel.com/content/www/us/en/docs/intrinsics-guide/index.html?wapkw=3Dint= rinsics%20guide#text=3D_mm_stream_load&ig_expand=3D6884 2) Are -D options allowed to change/break API/ABI? By allowing -Dcpu_instruction_set=3D to change available functions, any app= lication using it is no longer source-code (API) compatible with "DPDK" pro= per. This patch essentially splits a "DPDK" app to depend on "DPDK + CPU version= -D flag", in an incompatible way (no fallback?). 3) The stream load instruction used here *requires* 16-byte alignment for i= ts operand. This is not documented, and worse, a uint8_t* is accepted, which is cast to= (__m128i *). This cast hides the compiler warning for expanding type-alignments. And the code itself is broken - passing a "src" parameter that is not 16-by= te aligned will segfault. 4) Temporal and Non-temporal are not logically presented here. Temporal loads/stores are normal loads/stores. They use the L1/L2 caches. Non-temporal loads/stores indicate that the data will *not* be used again i= n a short space of time. Non-temporal means "having no relation to time" according to my internet se= arch. 5) The *store* here uses a normal store (temporal, targets cache). The *loa= d* however is a streaming (non-temporal, no cache) load. It is not clearly documented that A) stream load will be used. The inverse is documented "copy with ts" aka, copy with temporal store. Is documenting the store as temporal meant to imply that the load is non-te= mporal? 6) What is the use-case for this? When would a user *want* to use this inst= ead of rte_memcpy()? If the data being loaded is relevant to datapath/packets, presumably other = packets might require the loaded data, so temporal (normal) loads should be used to cache the source = data? 7) Why is streaming (non-temporal) loads & stores not used? I guess maybe t= his is regarding the use-case, but its not clear to me right now why loads are NT, and stores are T. All in all, I do not think merging this patch is a good idea. I would like = to understand the motivation for adding this type of function, and then see it being done in a way that is clearly = documented regarding temporal loads/stores, and not changing/adding APIs for specific CPUs. So apologies for late feedback, but this is not of high enough quality to b= e merged to DPDK right now, NACK.