From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <41973921-cab0-4390-89ea-51dad4ad34e3@lysator.liu.se>
Date: Thu, 10 Oct 2024 12:35:20 +0200
Subject: Re: [PATCH v6 7/7] vhost: optimize memcpy routines when cc memcpy is used
To: Stephen Hemminger, Mattias Rönnblom
Cc: dev@dpdk.org, Morten Brørup, David Marchand, Pavan Nikhilesh, Bruce Richardson
References: <20240724075357.546248-2-mattias.ronnblom@ericsson.com>
 <20240920102716.738940-1-mattias.ronnblom@ericsson.com>
 <20240920102716.738940-8-mattias.ronnblom@ericsson.com>
 <20241009145710.6050d1a8@hermes.local>
From: Mattias Rönnblom
In-Reply-To: <20241009145710.6050d1a8@hermes.local>

On 2024-10-09 23:57, Stephen Hemminger wrote:
> On Fri, 20 Sep 2024 12:27:16 +0200
> Mattias Rönnblom wrote:
>
>> +#if defined(RTE_USE_CC_MEMCPY) && defined(RTE_ARCH_X86_64)
>> +static __rte_always_inline void
>> +pktcpy(void *restrict in_dst, const void *restrict in_src, size_t len)
>> +{
>> +	void *dst = __builtin_assume_aligned(in_dst, 16);
>> +	const void *src = __builtin_assume_aligned(in_src, 16);
>
> Not sure if buffer is really aligned that way but x86 doesn't care.
>

I think it might care, actually. That's why this makes a difference.
With 16-byte alignment assumed, the compiler may use MOVDQA; otherwise,
it can't and must use MOVDQU. Generally these things don't matter from
a performance point of view in my experience, but in this case it did
(in my benchmark, on my CPU, with my compiler, etc.).

> Since src and dst can be pointers into mbuf at an offset.
> The offset will be a multiple of the buffer len.
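
To make the MOVDQA/MOVDQU point above concrete, here is a minimal sketch of the alignment hint in isolation. It is not the pktcpy() from the patch: copy_assume_aligned16(), is_aligned16() and copy_checked() are hypothetical names used only for illustration, and the sketch assumes GCC or Clang, since __builtin_assume_aligned() is a compiler builtin. Whether the compiler actually picks aligned loads and stores depends on the compiler, the flags and the target, so the generated assembly should be inspected rather than taken for granted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not the patch's pktcpy()): copy 'len' bytes after
 * promising the compiler that both pointers are 16-byte aligned.
 * If the promise is false, the behavior is undefined; on x86 an aligned
 * SSE move (MOVDQA) on a misaligned address raises a fault.
 */
static inline void
copy_assume_aligned16(void *restrict in_dst, const void *restrict in_src,
		      size_t len)
{
	void *dst = __builtin_assume_aligned(in_dst, 16);
	const void *src = __builtin_assume_aligned(in_src, 16);

	memcpy(dst, src, len);
}

/* Hypothetical helper: true if 'p' sits on a 16-byte boundary. */
static inline int
is_aligned16(const void *p)
{
	return ((uintptr_t)p & 15) == 0;
}

/*
 * Hypothetical wrapper: take the "assumed aligned" path only when the
 * assumption really holds, and fall back to a plain memcpy() otherwise.
 */
static inline void
copy_checked(void *restrict dst, const void *restrict src, size_t len)
{
	if (is_aligned16(dst) && is_aligned16(src))
		copy_assume_aligned16(dst, src, len);
	else
		memcpy(dst, src, len);
}
```

The runtime check in copy_checked() is only there to keep the sketch safe for arbitrary pointers; it is an assumption of this example, not something the quoted patch is shown to do. Compiling copy_assume_aligned16() and a plain memcpy() wrapper with optimization enabled and comparing the assembly output is one way to see whether the hint changes the instructions chosen.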