Message-ID: <0e0d0cd9-c39c-5303-5921-601f43d14391@lysator.liu.se>
Date: Thu, 11 Aug 2022 13:53:02 +0200
Subject: Re: [RFC v2] non-temporal memcpy
To: Honnappa Nagarahalli, Morten Brørup, Stephen Hemminger
Cc: "dev@dpdk.org", Bruce Richardson, Konstantin Ananyev, Jan Viktorin, Ruifeng Wang, David Christensen, Stanislaw Kardach, nd
References: <98CBD80474FA8B44BF855DF32C47DC35D871D4@smartserver.smartshare.dk> <9ac934d2-ad05-6ec9-3bb6-63986d68d5d3@lysator.liu.se>
 <98CBD80474FA8B44BF855DF32C47DC35D87247@smartserver.smartshare.dk> <20220809082602.1fe5bd89@hermes.local> <4baabd37-ead4-d98b-0e61-01c6ecd23191@lysator.liu.se> <98CBD80474FA8B44BF855DF32C47DC35D87253@smartserver.smartshare.dk>
From: Mattias Rönnblom
List-Id: DPDK patches and discussions

On 2022-08-10 23:20, Honnappa Nagarahalli wrote:
>
>
>>
>>> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
>>> Sent: Wednesday, 10 August 2022 13.56
>>>
>>> On 2022-08-09 17:26, Stephen Hemminger wrote:
>>
>> [...]
>>
>>>
>>> Alignment seems like a non-issue to me. A NT-store memcpy() can be
>>> made free of alignment requirements, incurring only a very slight cost
>>> for the always-aligned case (who has their data always 16-byte aligned
>>> anyways?).
>>>
>>> The memory barrier required on x86 seems like a bigger issue.
>>>
>>>> Maybe rte_non_cache_copy()?
>>>>
>>>
>>> rte_memcpy_nt_weakly_ordered(), or rte_memcpy_nt_weak(). And a
>>> rte_memcpy_nt() with the sfence in place, which the user hopefully
>>> will find first? I don't know. I would prefer not having the weak
>>> variant at all.
> I think providing weakly ordered version is required to offset the cost of the barriers. One might be able to copy multiple packets and then issue a barrier.
>

On what architecture? I assumed that only x86 had the peculiar property
of having different memory models for regular and NT load/stores.

>>>
>>> Accepting weak memory ordering (i.e., no sfence) could also be one of
>>> the flags, assuming rte_memcpy_nt() would have a flags parameter.
>>> Default is safe (=memcpy() semantics), but potentially slower.
>>
>> Excellent idea!
>>
>>>
>>>> Want to avoid the naive user just doing s/memcpy/rte_memcpy_nt/ and expect
>>>> everything to work.
>
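
To make the flags idea quoted above a bit more concrete, here is a rough sketch of what a safe-by-default NT-store copy with an opt-out flag might look like. It is only an illustration, not code from the RFC: the names memcpy_nt_sketch(), memcpy_nt_fence() and MEMCPY_NT_F_WEAK_ORDER are made up for this example, only the stores are non-temporal, and on targets without SSE2 it simply falls back to memcpy(). The unaligned head and tail are copied with regular stores, so the caller has no alignment requirement to meet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#endif

/* Hypothetical flag: the caller accepts weak ordering (no trailing sfence). */
#define MEMCPY_NT_F_WEAK_ORDER (1u << 0)

/* Hypothetical helper: orders earlier NT stores before later stores. */
static inline void memcpy_nt_fence(void)
{
#if defined(__SSE2__)
	_mm_sfence();
#endif
}

/* Sketch only: NT stores for the 16-byte-aligned middle of dst,
 * regular stores for the unaligned head and the tail. */
static void memcpy_nt_sketch(void *dst, const void *src, size_t len,
			     unsigned int flags)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	assert((flags & ~MEMCPY_NT_F_WEAK_ORDER) == 0); /* unknown flags */

#if defined(__SSE2__)
	/* Head: regular copy until dst is 16-byte aligned. */
	size_t head = (16 - ((uintptr_t)d & 15)) & 15;

	if (head > len)
		head = len;
	memcpy(d, s, head);
	d += head;
	s += head;
	len -= head;

	/* Middle: unaligned load from src, non-temporal store to aligned dst. */
	while (len >= 16) {
		__m128i v = _mm_loadu_si128((const __m128i *)s);

		_mm_stream_si128((__m128i *)d, v);
		d += 16;
		s += 16;
		len -= 16;
	}
#endif
	/* Tail (or the whole copy when SSE2 is unavailable). */
	memcpy(d, s, len);

	/* Default is memcpy()-like ordering; the flag opts out of the fence. */
	if (!(flags & MEMCPY_NT_F_WEAK_ORDER))
		memcpy_nt_fence();
}
```

With something along these lines, the batching case Honnappa mentions would be a loop of memcpy_nt_sketch(..., MEMCPY_NT_F_WEAK_ORDER) calls followed by a single memcpy_nt_fence(), while a user who just swaps memcpy() for the NT variant with flags set to 0 still gets the fence.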