From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4baabd37-ead4-d98b-0e61-01c6ecd23191@lysator.liu.se>
Date: Wed, 10 Aug 2022 13:55:37 +0200
Subject: Re: [RFC v2] non-temporal memcpy
To: Stephen Hemminger, Morten Brørup
Cc: dev@dpdk.org, Bruce Richardson, Konstantin Ananyev, Jan Viktorin,
 Ruifeng Wang, David Christensen, Stanislaw Kardach
References: <98CBD80474FA8B44BF855DF32C47DC35D871D4@smartserver.smartshare.dk>
 <9ac934d2-ad05-6ec9-3bb6-63986d68d5d3@lysator.liu.se>
 <98CBD80474FA8B44BF855DF32C47DC35D87247@smartserver.smartshare.dk>
 <20220809082602.1fe5bd89@hermes.local>
From: Mattias Rönnblom
In-Reply-To: <20220809082602.1fe5bd89@hermes.local>
List-Id: DPDK patches and discussions

On 2022-08-09 17:26, Stephen Hemminger wrote:
> On Tue, 9 Aug 2022 11:46:19 +0200
> Morten Brørup wrote:
>
>>>
>>> I don't think memcpy() functions should have alignment requirements.
>>> That's not very practical, and violates the principle of least
>>> surprise.
>>
>> I didn't make the CPUs with these alignment requirements.
>>
>> However, I will offer optimized performance in a generic NT memcpy() function in the cases where the individual alignment requirements of various CPUs happen to be met.
>
> Rather than making a generic equivalent memcpy function, why not have
> something which only takes aligned data. And to avoid user confusion
> change the name to be something not suggestive of memcpy.
>

Alignment seems like a non-issue to me. An NT-store memcpy() can be made
free of alignment requirements, incurring only a very slight cost for
the always-aligned case (who has their data always 16-byte aligned
anyways?).

The memory barrier required on x86 seems like a bigger issue.

> Maybe rte_non_cache_copy()?
>

rte_memcpy_nt_weakly_ordered(), or rte_memcpy_nt_weak(). And a
rte_memcpy_nt() with the sfence in place, which the user hopefully will
find first? I don't know. I would prefer not having the weak variant at all.

Accepting weak memory ordering (i.e., no sfence) could also be one of
the flags, assuming rte_memcpy_nt() would have a flags parameter.
Default is safe (=memcpy() semantics), but potentially slower.

> Want to avoid the naive user just doing s/memcpy/rte_memcpy_nt/ and expect
> everything to work.
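
To make the two points above concrete, here is a minimal sketch, not a proposed implementation. It assumes x86 with SSE2 and falls back to plain memcpy() elsewhere. The names nt_copy_sketch() and NT_COPY_F_WEAK_ORDER are hypothetical and not part of any DPDK API. Only the stores are non-temporal; the loads are ordinary unaligned loads. The destination is brought to a 16-byte boundary with a short regular copy, so the caller has no alignment requirements to meet, and the sfence is issued unless the caller explicitly opts out via the flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical flag: the caller accepts weak ordering, so no sfence is issued. */
#define NT_COPY_F_WEAK_ORDER (1u << 0)

/* Hypothetical helper: memcpy() semantics by default, NT stores where possible. */
static void *
nt_copy_sketch(void *dst, const void *src, size_t len, unsigned int flags)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	assert(len == 0 || (dst != NULL && src != NULL));

#if defined(__SSE2__)
	/* Head: regular copy until the destination is 16-byte aligned. */
	size_t head = (size_t)(-(uintptr_t)d & 15);

	if (head > len)
		head = len;
	memcpy(d, s, head);
	d += head;
	s += head;
	len -= head;

	/* Body: unaligned load from the source, aligned NT store to the destination. */
	for (; len >= 16; d += 16, s += 16, len -= 16) {
		__m128i v = _mm_loadu_si128((const __m128i *)s);

		_mm_stream_si128((__m128i *)d, v);
	}

	/* Tail: fewer than 16 bytes left, use a regular copy. */
	memcpy(d, s, len);

	/* Default is safe: order the NT stores before any later stores. */
	if (!(flags & NT_COPY_F_WEAK_ORDER))
		_mm_sfence();
#else
	/* No SSE2: plain memcpy() already has the default semantics. */
	(void)flags;
	memcpy(d, s, len);
#endif
	return dst;
}
```

With flags set to 0 a caller gets memcpy() semantics, so s/memcpy/nt_copy_sketch/ plus an extra 0 argument would not break ordering assumptions. Skipping the fence has to be requested explicitly.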