From: Mattias Rönnblom
CC: Mattias Rönnblom, Morten Brørup, "Stephen Hemminger", David Marchand, Pavan Nikhilesh, Bruce Richardson, Mattias Rönnblom
Subject: [PATCH v6 7/7] vhost: optimize memcpy routines when cc memcpy is used
Date: Fri, 20 Sep 2024 12:27:16 +0200
Message-ID: <20240920102716.738940-8-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240920102716.738940-1-mattias.ronnblom@ericsson.com>
References: <20240724075357.546248-2-mattias.ronnblom@ericsson.com> <20240920102716.738940-1-mattias.ronnblom@ericsson.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
List-Id: DPDK patches and discussions

In builds where use_cc_memcpy is set to true, the vhost user PMD
suffers a large performance drop on Intel P-cores for small packets,
at least when built by GCC and (to a much lesser extent) clang.

This patch addresses that issue by using a custom virtio
memcpy()-based packet copying routine.

Performance results from a Raptor Lake @ 3.2 GHz:

GCC 12.3.0
64-byte packets
Core  Mode              Mpps
E     RTE memcpy         9.5
E     cc memcpy          9.7
E     cc memcpy+pktcpy   9.0
P     RTE memcpy        16.4
P     cc memcpy         13.5
P     cc memcpy+pktcpy  16.2

GCC 12.3.0
1500-byte packets
Core  Mode              Mpps
P     RTE memcpy         5.8
P     cc memcpy          5.9
P     cc memcpy+pktcpy   5.9

clang 15.0.7
64-byte packets
Core  Mode              Mpps
P     RTE memcpy        13.3
P     cc memcpy         12.9
P     cc memcpy+pktcpy  13.9

"RTE memcpy" is use_cc_memcpy=false, "cc memcpy" is use_cc_memcpy=true
and "pktcpy" is when this patch is applied.

Signed-off-by: Mattias Rönnblom
---
 lib/vhost/virtio_net.c | 37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index 370402d849..63571587a8 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -231,6 +231,39 @@ vhost_async_dma_check_completed(struct virtio_net *dev, int16_t dma_id, uint16_t
 	return nr_copies;
 }
 
+/* The code generated by GCC (and to a lesser extent, clang) with just
+ * a straight memcpy() to copy packets is less than optimal on Intel
+ * P-cores, for small packets. Thus the need of this specialized
+ * memcpy() in builds where use_cc_memcpy is set to true.
+ */
+#if defined(RTE_USE_CC_MEMCPY) && defined(RTE_ARCH_X86_64)
+static __rte_always_inline void
+pktcpy(void *restrict in_dst, const void *restrict in_src, size_t len)
+{
+	void *dst = __builtin_assume_aligned(in_dst, 16);
+	const void *src = __builtin_assume_aligned(in_src, 16);
+
+	if (len <= 256) {
+		size_t left;
+
+		for (left = len; left >= 32; left -= 32) {
+			memcpy(dst, src, 32);
+			dst = RTE_PTR_ADD(dst, 32);
+			src = RTE_PTR_ADD(src, 32);
+		}
+
+		memcpy(dst, src, left);
+	} else
+		memcpy(dst, src, len);
+}
+#else
+static __rte_always_inline void
+pktcpy(void *dst, const void *src, size_t len)
+{
+	rte_memcpy(dst, src, len);
+}
+#endif
+
 static inline void
 do_data_copy_enqueue(struct virtio_net *dev, struct vhost_virtqueue *vq)
 	__rte_shared_locks_required(&vq->iotlb_lock)
@@ -240,7 +273,7 @@ do_data_copy_enqueue(struct virtio_net *dev, struct vhost_virtqueue *vq)
 	int i;
 
 	for (i = 0; i < count; i++) {
-		rte_memcpy(elem[i].dst, elem[i].src, elem[i].len);
+		pktcpy(elem[i].dst, elem[i].src, elem[i].len);
 		vhost_log_cache_write_iova(dev, vq, elem[i].log_addr,
 					   elem[i].len);
 		PRINT_PACKET(dev, (uintptr_t)elem[i].dst, elem[i].len, 0);
@@ -257,7 +290,7 @@ do_data_copy_dequeue(struct vhost_virtqueue *vq)
 	int i;
 
 	for (i = 0; i < count; i++)
-		rte_memcpy(elem[i].dst, elem[i].src, elem[i].len);
+		pktcpy(elem[i].dst, elem[i].src, elem[i].len);
 
 	vq->batch_copy_nb_elems = 0;
 }
-- 
2.43.0
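
For readers who want to try the copy strategy outside the DPDK tree, here is a minimal standalone sketch of it. It is not the code from the patch: pktcpy_sketch(), pktcpy_sketch_check() and the CHUNK, SMALL_MAX and BUF_LEN macros are hypothetical names used only for illustration, the 16-byte alignment hint (__builtin_assume_aligned) is left out so that arbitrary buffers are valid input, and plain pointer arithmetic stands in for RTE_PTR_ADD. The idea is the same as in the patch: copies of up to 256 bytes are done as a run of fixed 32-byte memcpy() calls, which the compiler can typically expand inline, followed by one variable-size memcpy() for the tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK 32       /* chunk size, same value as in the patch */
#define SMALL_MAX 256  /* small-copy threshold, same value as in the patch */
#define BUF_LEN 512    /* hypothetical buffer size for the self-check */

/* Portable variant of the chunked copy: constant-size memcpy() calls for
 * the bulk of a small copy, one variable-size memcpy() for the remainder.
 * Unlike the patch, no alignment is assumed for dst or src.
 */
static inline void
pktcpy_sketch(void *restrict dst, const void *restrict src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	if (len <= SMALL_MAX) {
		size_t left;

		for (left = len; left >= CHUNK; left -= CHUNK) {
			memcpy(d, s, CHUNK);
			d += CHUNK;
			s += CHUNK;
		}

		memcpy(d, s, left);
	} else
		memcpy(d, s, len);
}

/* Returns 1 if copying len bytes with pktcpy_sketch() yields the same
 * destination buffer as a plain memcpy(), including the untouched bytes
 * after the copied region, and 0 otherwise.
 */
static int
pktcpy_sketch_check(size_t len)
{
	unsigned char src[BUF_LEN], dst[BUF_LEN + 1], ref[BUF_LEN + 1];
	size_t i;

	assert(len <= BUF_LEN);

	for (i = 0; i < BUF_LEN; i++)
		src[i] = (unsigned char)(i * 7 + 1);

	memset(dst, 0xAA, sizeof(dst));
	memset(ref, 0xAA, sizeof(ref));

	pktcpy_sketch(dst, src, len);
	memcpy(ref, src, len);

	return memcmp(dst, ref, sizeof(dst)) == 0;
}
```

Calling pktcpy_sketch_check() with lengths around the chunk and threshold boundaries (for example 31, 32, 255, 256 and 257) exercises both branches. The sketch only demonstrates functional equivalence with memcpy(); it says nothing about performance, which depends on the compiler, flags and core type as described in the commit message.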