From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mattias Rönnblom
To:
CC: Mattias Rönnblom, Morten Brørup, "Stephen Hemminger", David Marchand, Pavan Nikhilesh, Bruce Richardson, Mattias Rönnblom
Subject: [PATCH v5 6/6] vhost: optimize memcpy routines when cc memcpy is used
Date: Wed, 24 Jul 2024 09:53:57 +0200
Message-ID: <20240724075357.546248-7-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240724075357.546248-1-mattias.ronnblom@ericsson.com>
References: <20240620175731.420639-2-mattias.ronnblom@ericsson.com> <20240724075357.546248-1-mattias.ronnblom@ericsson.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
X-BeenThere: dev@dpdk.org
List-Id: DPDK patches and discussions

In builds where use_cc_memcpy is set to true, the vhost user PMD
suffers a large performance drop on Intel P-cores for small packets,
at least when built by GCC and (to a much lesser extent) clang.

This patch addresses that issue by using a custom virtio
memcpy()-based packet copying routine.

Performance results from a Raptor Lake @ 3.2 GHz:

GCC 12.3.0
64-byte packets
Core  Mode              Mpps
E     RTE memcpy         9.5
E     cc memcpy          9.7
E     cc memcpy+pktcpy   9.0
P     RTE memcpy        16.4
P     cc memcpy         13.5
P     cc memcpy+pktcpy  16.2

GCC 12.3.0
1500-byte packets
Core  Mode              Mpps
P     RTE memcpy         5.8
P     cc memcpy          5.9
P     cc memcpy+pktcpy   5.9

clang 15.0.7
64-byte packets
Core  Mode              Mpps
P     RTE memcpy        13.3
P     cc memcpy         12.9
P     cc memcpy+pktcpy  13.9

"RTE memcpy" is use_cc_memcpy=false, "cc memcpy" is use_cc_memcpy=true
and "pktcpy" is when this patch is applied.

Signed-off-by: Mattias Rönnblom
---
 lib/vhost/virtio_net.c | 37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index 370402d849..63571587a8 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -231,6 +231,39 @@ vhost_async_dma_check_completed(struct virtio_net *dev, int16_t dma_id, uint16_t
 	return nr_copies;
 }
 
+/* The code generated by GCC (and to a lesser extent, clang) with just
+ * a straight memcpy() to copy packets is less than optimal on Intel
+ * P-cores, for small packets. Thus the need of this specialized
+ * memcpy() in builds where use_cc_memcpy is set to true.
+ */
+#if defined(RTE_USE_CC_MEMCPY) && defined(RTE_ARCH_X86_64)
+static __rte_always_inline void
+pktcpy(void *restrict in_dst, const void *restrict in_src, size_t len)
+{
+	void *dst = __builtin_assume_aligned(in_dst, 16);
+	const void *src = __builtin_assume_aligned(in_src, 16);
+
+	if (len <= 256) {
+		size_t left;
+
+		for (left = len; left >= 32; left -= 32) {
+			memcpy(dst, src, 32);
+			dst = RTE_PTR_ADD(dst, 32);
+			src = RTE_PTR_ADD(src, 32);
+		}
+
+		memcpy(dst, src, left);
+	} else
+		memcpy(dst, src, len);
+}
+#else
+static __rte_always_inline void
+pktcpy(void *dst, const void *src, size_t len)
+{
+	rte_memcpy(dst, src, len);
+}
+#endif
+
 static inline void
 do_data_copy_enqueue(struct virtio_net *dev, struct vhost_virtqueue *vq)
 	__rte_shared_locks_required(&vq->iotlb_lock)
@@ -240,7 +273,7 @@ do_data_copy_enqueue(struct virtio_net *dev, struct vhost_virtqueue *vq)
 	int i;
 
 	for (i = 0; i < count; i++) {
-		rte_memcpy(elem[i].dst, elem[i].src, elem[i].len);
+		pktcpy(elem[i].dst, elem[i].src, elem[i].len);
 		vhost_log_cache_write_iova(dev, vq, elem[i].log_addr,
 					   elem[i].len);
 		PRINT_PACKET(dev, (uintptr_t)elem[i].dst, elem[i].len, 0);
@@ -257,7 +290,7 @@ do_data_copy_dequeue(struct vhost_virtqueue *vq)
 	int i;
 
 	for (i = 0; i < count; i++)
-		rte_memcpy(elem[i].dst, elem[i].src, elem[i].len);
+		pktcpy(elem[i].dst, elem[i].src, elem[i].len);
 
 	vq->batch_copy_nb_elems = 0;
 }
-- 
2.34.1
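
For anyone who wants to experiment with the small-packet path outside
DPDK, here is a minimal standalone sketch of the chunked copy that
pktcpy() performs in the patch. The name chunked_copy is hypothetical.
The alignment hint (__builtin_assume_aligned) and the DPDK macros are
left out for portability, so this illustrates the control flow only and
is not the code from the patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the patch's pktcpy(): copies of up to 256
 * bytes are split into fixed 32-byte memcpy() calls, which the compiler
 * can expand inline, followed by one variable-length tail copy.
 * Larger copies fall through to a single plain memcpy().
 * As with memcpy(), the source and destination must not overlap.
 */
static inline void
chunked_copy(void *restrict dst, const void *restrict src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	if (len <= 256) {
		size_t left;

		/* Fixed-size chunks: the constant length is known at compile time. */
		for (left = len; left >= 32; left -= 32) {
			memcpy(d, s, 32);
			d += 32;
			s += 32;
		}

		/* Tail of 0..31 bytes. */
		memcpy(d, s, left);
	} else
		memcpy(d, s, len);
}
```

Whether this beats a plain memcpy() depends on the compiler and the
core type. The figures in the commit message apply only to the
configurations listed there.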