From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id B342DA0C57; Mon, 1 Nov 2021 11:28:01 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 7433F40E28; Mon, 1 Nov 2021 11:28:01 +0100 (CET) Received: from NAM02-DM3-obe.outbound.protection.outlook.com (mail-dm3nam07on2053.outbound.protection.outlook.com [40.107.95.53]) by mails.dpdk.org (Postfix) with ESMTP id D97614111F for ; Fri, 29 Oct 2021 18:01:30 +0200 (CEST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=MANoNnde3dM0pkL6gPH2f8IJbGB6imy1oiZTOYDDNy8OL3LIG0d/40M7bIUwY4KUSDaAvB9gjpK4qdL4HQDanLtZTUOEmxgle3e5I5fuqCC7ggorzmUiOjyN8jAJ4Ll+awm1PcnTrvkuKE0AeohaVs7UUaxpMDa9MqoCWFmqIj37FfH8Fm0+T24/ROXpqkRk09HH45/0n0NLb+nTDJjZ06VBX+4OOQSd81zZHbfV8kTiwIq1xR+emI0u46BY62lMGm2yekuEqiC0eVA5dEn8MbMRtsSuKkYiy7fnCtQCQSRLWLQX41c18aliz7xTu+auF7fuO7OLr2ZC5LKDckLcPQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=zexDnKzn8JljLqirNunLoC8j63gwtjB91ru6kW00Kto=; b=SGxU125JW4TOwfQlNumYBSdN0gOSZmvou3ue/rt8ax0KhE5UCkTraGdv1DCkX8R47LGzZThT16FmtGmrTEiXSWuW9/zgMm7B51dtBAxPZv8sr5e2JuOo27/+vkABTazxOD6XmmqBLXDIPh6JAvvi20x/W0p3WDjkpxb9KxbjwnuazYRbz56qgTa+UsV8FxXLH4Rd5W7X/cYPWAKOhtuik65AaCN4Fbls2iegEn+STVzVAtmL8j9urbOfVMbNrI7If4hGC7U/8rANIPdjaQ7JmxwrUOjMZu+dSwIF7aG5vfgRlJsj+24voGV61WJZQ+AA7r2ok5mDauieqrwINEnBgA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=zexDnKzn8JljLqirNunLoC8j63gwtjB91ru6kW00Kto=; b=ui+rBI38z96+H8sppTz4V4TEKxpyVyVcZ1r12wdM+dEm0/k9jDGXR46J50Z6Ab63gMFaN9ZuVmgjTMSdDIlkfxDgxSFZnKpNqMZv2EK0Qm1uS0CPgSJO1RsDZUV3en2aGJOMOl3Gz61thKU2Che+G2e6V862rcD4/7l8kM+6YY0= Received: from BY5PR12MB3681.namprd12.prod.outlook.com (2603:10b6:a03:194::16) by BYAPR12MB2631.namprd12.prod.outlook.com (2603:10b6:a03:6b::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4649.15; Fri, 29 Oct 2021 16:01:25 +0000 Received: from BY5PR12MB3681.namprd12.prod.outlook.com ([fe80::1419:77f5:8c88:c53c]) by BY5PR12MB3681.namprd12.prod.outlook.com ([fe80::1419:77f5:8c88:c53c%7]) with mapi id 15.20.4649.017; Fri, 29 Oct 2021 16:01:25 +0000 From: "Song, Keesang" To: Thomas Monjalon , Aman Kumar , "Ananyev, Konstantin" , "Van Haaren, Harry" CC: "mattias. ronnblom" , "dev@dpdk.org" , "viacheslavo@nvidia.com" , "Burakov, Anatoly" , "jerinjacobk@gmail.com" , "Richardson, Bruce" , "honnappa.nagarahalli@arm.com" , Ruifeng Wang , David Christensen , "david.marchand@redhat.com" , "stephen@networkplumber.org" Thread-Topic: [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy support for AMD platform Thread-Index: AQHXywQ+Ku/nqGM1bkiUD1T4/wYNfKvmfvQAgAAvd4CAAArGgIAACXgAgAAB/gCAABQtgIAACeeAgAAF4ICAAz0fsA== Date: Fri, 29 Oct 2021 16:01:25 +0000 Message-ID: References: <20211026155645.246783-1-aman.kumar@vvdntech.in> <2811952.0RjXIbInLB@thomas> In-Reply-To: <2811952.0RjXIbInLB@thomas> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: msip_labels: MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_ActionId=90a2fef1-9d0c-46fa-af38-ea6774b9268d; MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_ContentBits=0; MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_Enabled=true; MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_Method=Standard; MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_Name=AMD Official Use Only-AIP 2.0; MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_SetDate=2021-10-29T15:58:57Z; MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_SiteId=3dd8961f-e488-4e60-8e11-a82d994e183d; authentication-results: monjalon.net; dkim=none (message not signed) header.d=none;monjalon.net; dmarc=none action=none header.from=amd.com; x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 10c5645c-80b8-420e-393f-08d99af55c55 x-ms-traffictypediagnostic: BYAPR12MB2631: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:10000; x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: fCso435vy47P2n0spH/ljooFeXPc7gzs6k9XZWVj9bL/hbqfokXNVAeszWsZeAL+k4gpFaNV21eFYRwS+Vp3tkMOmxCaGxdwBzoEm0U2MxnlMcKeH2yrt7wXvwyOVsM7xnPymBzcJUstg/yezAlSqhp3Ao/f6lAEZ0jLqV7o5jQlvjaeqTDjTkgisvM/9vYfa5ka0ARfjvYgWEmSblUzwEhAVgq4iElw565zXpLHgcWckM8Zdu8xyUcSyDjF2f9aqrcuFnRxa1nIuWnaG0tsDZpz8vq/yTg+MD48ARmr0J/pFy8hWYblDAR1WicmBsoelyg5jgYXxbF7rRYcWpD3G2DKLP5LtR4EJkPPlTMyDrDblWzFVO3/spD649JU0Y2QXS04ScpFK7v/h8bUSTkdMfJfqC5Os1j9CX3TJboxl7eTfc2UyWpv2TxMXyUqgRC2Pk4ieTo2eJ/p1zu+XmotESReRbtRSsOqPaJJDPvE6MeQ69BfO5PimX9ldKTmK4Ku7TuEl8lw0yiaKFONMt1EfXhaMZtd63OziUDd61m24q17wi+3bSqiFj0NKOGwMwEekZFrWLND7ApkrHWMIVpryfDExXM7KMhTr2V/RI62+3kUmSVfl0lY8vDdiWeAKVhun9/EQIEBLR05PcTQ3O8Qlc9Nwds7pezh7aqPP/lDPpvLyI9Cu74N5Q3JnnrdYbEDzWrwYWglzSximnQLbVfLrCB8QH1bBiXoRAPmaQnWo6AUzXtgUNGWtOA0YJ/wt5rHrmA0m7bvEDqEldHYN0fuWD0nlZO+qjw1U+cNManpxCo= x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:BY5PR12MB3681.namprd12.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(366004)(53546011)(8936002)(316002)(508600001)(71200400001)(38070700005)(6506007)(110136005)(2906002)(54906003)(33656002)(7416002)(5660300002)(4326008)(86362001)(66946007)(8676002)(9686003)(55016002)(966005)(52536014)(7696005)(66476007)(66556008)(122000001)(83380400001)(45080400002)(26005)(38100700002)(64756008)(76116006)(186003)(66446008); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: =?us-ascii?Q?aXWrMwv9hvbqjcFnqm+tI0a/a7v8yzL44B5RrH3Qay/PDGf4wkUZ/LrbRo1y?= =?us-ascii?Q?W274/biCKa8ZGtiTHucLFKGwhBzQChRJ3f7cvP2ag8ougRCa4thes0MYqqiG?= =?us-ascii?Q?wsfzxsY60exNR0/4cq0zw5v4SQ7SsXlR7YmJx7LZArYnuIYNlSRq2sshvmHK?= =?us-ascii?Q?ob+jMwGe1TZLEPF0B8BIYsaruV+Pk1hyAaVeZiFv8HGBVgPi/8J5cfkQ+4Kj?= =?us-ascii?Q?S8FyDWl+ARJ+IOeefR9OBiQAbWHm14PlCxIjzB8SOgZCdPpEQXQLzOv3/iws?= =?us-ascii?Q?MURrY7cXqIHd1MV1MYYKoeYS/MtdUINhCvbQxK+Z19WRs46v0r0Htla/st+E?= =?us-ascii?Q?wcB5FXoA1APPNVBQiKHSLnuhCg/b5BW8flp+DLh6l7mP0sv0uy2YuIzg5xZZ?= =?us-ascii?Q?erqcsm3fwIG0nwQM2MGGYz5DcCOyjMW+bCZgAeop5Cssd2ORVzG9RstB5fiF?= =?us-ascii?Q?27fhzEDAYCgV/K94Q6+P3L31Z+FI2w9fFK7AdIhR6BYqChTkzCoADZSiDYq5?= =?us-ascii?Q?u3/FU/vliKk02S1oZwKw2F6L0INLHj0vSCWSKxDl2N8WK7UUxQNP2VWDJuiF?= =?us-ascii?Q?VW/h6dEwCu/cOOrCKzH+72qCAhKDvNLwouE2Bs25o7YoHkVE2YmhpqUHDQMG?= =?us-ascii?Q?pPKt/JoqWbWHsM3eVyGM8b2Qyu3JVQWu5DimE8ZWy7i3SqvhtC3zNblWFt67?= =?us-ascii?Q?sVeg3Q7b1WiXQhNXMot1ai2WUS8O1K4HL10R8hFFyxVA02YgwnmOL3HPaTBB?= =?us-ascii?Q?mZhv2cov/QCWRHQCrpIEH+IXSQf7irWXCHfewmXMVLe0GW9M1589/EVb/zw+?= =?us-ascii?Q?gg0INd+O/DNL76/ynSnw4GnjqgvVxJZ94kEQYLhuXxZgHyMZTzYhf4OAMcI8?= =?us-ascii?Q?K3TUMbeS/F4qcnBAGk/xtKbsH9WlB6FB9CaaJvJaXftu2UNCV0BF8ZJWyWIs?= =?us-ascii?Q?CfI4GuHVaz5WuoKbRHFrHgTsVjXhnB/rQea2egBVIAEneksb9Y78KAFg46db?= =?us-ascii?Q?BfZ0YwpVjk+f50aCeaygjGFz6+ItpNEb82acCF8PtGuIakUGev/sraxOFyWd?= =?us-ascii?Q?qtc2O7ZwCBF/5igVV69C2fP9loXtquYUP+83gzn9YoxSxDFtfXbsjKjwSsB1?= =?us-ascii?Q?gYYrp0CawxV+mc+yGhmwpCt/Xw16xe6VnXhwT17H9x/Gs5MiXVcVDwdHgXA/?= =?us-ascii?Q?XaRqdgS5ixN4FU7pntWMsS5BQGaXLI6B3hLJ0Nw4H79Il+i/p9GXC5s1AYIC?= =?us-ascii?Q?CSrx62jz+nLF4U0Wr1Fbrc2C0WjiClHPvs0+43dXCryJlZwd8W6UfGTYIq6C?= =?us-ascii?Q?24Zi6uC2H+O9/6OkYXkPodtyA0QFJDEKkozJ4bAL8ts8aXQiSoW72qdtEhWx?= =?us-ascii?Q?s+pA9A0MJLHxzDXkI3UE/xkQ9kmJR24KLEEQ9Yho1y3JUz9CTXpcSwP9VAHe?= =?us-ascii?Q?2raNyjL3+2/BRNR6dRShbQBmlg4N5bLRv/5YwAI1Q4RgKBtZxxcvjMRNWJqn?= =?us-ascii?Q?LvMEni9QrBninQYyRp98ac32wkJaOEJakvMmOxbL3o92MYPS4qeF0SUg8Qkq?= =?us-ascii?Q?tE8AadmwpBUt6YXX+n7Xc4EQ3dpgBvV5ian3SDFZhMMRy2GgwCnAzTZHVx1t?= =?us-ascii?Q?ttYN8hkz+d0doGvDJkfcREg=3D?= Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: BY5PR12MB3681.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 10c5645c-80b8-420e-393f-08d99af55c55 X-MS-Exchange-CrossTenant-originalarrivaltime: 29 Oct 2021 16:01:25.3911 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: Jqto2h2TG6dZ315MihnvsOmCYSNgE+m8o+cb5oCYzwFdm9+wP0onQRqm0V1Ovy6JNX8FFcebFuo5KtTFiOY74A== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BYAPR12MB2631 X-Mailman-Approved-At: Mon, 01 Nov 2021 11:28:00 +0100 Subject: Re: [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy support for AMD platform X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" [AMD Official Use Only] Hi Thomas, There are some gaps among us, so I think we really need another quick meeti= ng call to discuss. I will set up a call like the last time on Monday. Please join in the call if possible. Thanks, Keesang -----Original Message----- From: Thomas Monjalon Sent: Wednesday, October 27, 2021 7:31 AM To: Aman Kumar ; Ananyev, Konstantin ; Van Haaren, Harry Cc: mattias. ronnblom ; dev@dpdk.org; viache= slavo@nvidia.com; Burakov, Anatoly ; Song, Keesa= ng ; jerinjacobk@gmail.com; Richardson, Bruce ; honnappa.nagarahalli@arm.com; Ruifeng Wang ; David Christensen ; david.marchand@r= edhat.com; stephen@networkplumber.org Subject: Re: [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy s= upport for AMD platform [CAUTION: External Email] 27/10/2021 16:10, Van Haaren, Harry: > From: Aman Kumar On Wed, Oct 27, 2021 at 5:53 > PM Ananyev, Konstantin wrote > > > > Hi Mattias, > > > > > > 6) What is the use-case for this? When would a user *want* to > > > > use this instead > > > of rte_memcpy()? > > > > If the data being loaded is relevant to datapath/packets, > > > > presumably other > > > packets might require the > > > > loaded data, so temporal (normal) loads should be used to cache > > > > the source > > > data? > > > > > > > > > I'm not sure if your first question is rhetorical or not, but a > > > memcpy() in a NT variant is certainly useful. One use case for a > > > memcpy() with temporal loads and non-temporal stores is if you > > > need to archive packet payload for (distant, potential) future > > > use, and want to avoid causing unnecessary LLC evictions while doing = so. > > > > Yes I agree that there are certainly benefits in using cache-locality h= ints. > > There is an open question around if the src or dst or both are non-temp= oral. > > > > In the implementation of this patch, the NT/T type of store is reversed= from your use-case: > > 1) Loads are NT (so loaded data is not cached for future packets) > > 2) Stores are T (so copied/dst data is now resident in L1/L2) > > > > In theory there might even be valid uses for this type of memcpy > > where loaded data is not needed again soon and stored data is > > referenced again soon, although I cannot think of any here while typing= this mail.. > > > > I think some use-case examples, and clear documentation on when/how > > to choose between rte_memcpy() or any (potential future) > > rte_memcpy_nt() variants is required to progress this patch. > > > > Assuming a strong use-case exists, and it can be clearly indicators > > to users of DPDK APIs which > > rte_memcpy() to use, we can look at technical details around enabling t= he implementation. > > > > [Konstantin wrote]: > +1 here. > Function behaviour and restrictions (src parameter needs to be 16/32 B > aligned, etc.), along with expected usage scenarios have to be documented= properly. > Again, as Harry pointed out, I don't see any AMD specific instructions > in this function, so presumably such function can go into __AVX2__ > code block and no new defines will be required. > > > [Aman wrote]: > Agreed that APIs are generic but we've kept under an AMD flag for a simpl= e reason that it is NOT tested on any other platform. > A use-case on how to use this was planned earlier for mlx5 pmd but droppe= d in this version of patch as the data path of mlx5 is going to be refactor= ed soon and may not be useful for future versions of mlx5 (>22.02). > Ref link: > https://nam11.safelinks.protection.outlook.com/?url=3Dhttps%3A%2F%2Fpatch= work.dpdk.org%2Fproject%2Fdpdk%2Fpatch%2F20211019104724.19416-2-aman.kumar%= 40vvdntech.in%2F&data=3D04%7C01%7CKeesang.Song%40amd.com%7C1988237087f7= 4375caf808d9995678f0%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637709418= 976849481%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBT= iI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=3DFErr0cuni6WLxpq5z2KKjAx2StGTlG= uN4QaXoXFE%2BKI%3D&reserved=3D0(we've plan to adapt this into future ve= rsion) The patch in the link basically enhances mlx5 mprq implementation fo= r our specific use-case and with 128B packet size, we achieve ~60% better p= erf. We understand the use of this copy function should be documented which= we shall plan along with few other platform specific optimizations in futu= re versions of DPDK. As this does not conflict with other platforms, can we= still keep under AMD flag for now as suggested by Thomas? I said I could merge if there is no objection. I've overlooked that it's adding completely new functions in the API. And the comments go in the direction of what I asked in previous version: what is specific to AMD here? Now seeing the valid objections, I agree it should be reworked. We must provide API to applications which is generic, stable and well docum= ented. > [HvH wrote]: > As an open-source community, any contributions should aim to improve the = whole. > In the past, numerous improvements have been merged to DPDK that improve = performance. > Sometimes these are architecture specific (x86/arm/ppc) sometimes the are= ISA specific (SSE, AVX512, NEON). > > I am not familiar with any cases in DPDK, where there is a #ifdef based o= n a *specific platform*. > A quick "grep" through the "dpdk/lib" directory does not show any > place where PMD or generic code has been explicitly optimized for a *spec= ific platform*. > > Obviously, in cases where ISA either exists or does not exist, yes there = is an optimization to enable it. > But this is not exposed as a top-level compile-time option, it uses runti= me CPU ISA detection. > > Please take a step back from the code, and look at what this patch asks o= f DPDK: > "Please accept & maintain these changes upstream, which benefit only plat= form X, even though these ISA features are also available on other platform= s". > > Other patches that enhance performance of DPDK ask this: > "Please accept & maintain these changes upstream, which benefit all platf= orms which have ISA capability X". > > > =3D=3D=3D Question "As this does not conflict with other platforms, can w= e still keep under AMD flag for now"? > I feel the contribution is too specific to a platform. Make it generic by= enabling it at an ISA capability level. > > Please yes, contribute to the DPDK community by improving performance of = a PMD by enabling/leveraging ISA. > But do so in a way that does not benefit only a specific platform - do > so in a way that enhances all of DPDK, as other patches have done for the= DPDK that this patch is built on. > > If you have concerns that the PMD maintainers will not accept the > changes due to potential regressions on other platforms, then discuss tho= se, make a plan on how to performance validate, and work to a solution. > > > =3D=3D=3D Regarding specifically the request for "can we still keep under= AMD flag for now"? > I do not believe we should introduce APIs for specific platforms. DPDK's = EAL is an abstraction layer. > The value of EAL is to provide a common abstraction. This > platform-specific flag breaks the abstraction, and results in packaging i= ssues, as well as API/ABI instability based on -Dcpu_instruction_set choice= . > So, no, we should not introduce APIs based on any compile-time flag. I agree