From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dariusz Sosnowski
To: Dmitry Kozlyuk, "users@dpdk.org"
Subject: RE: [net/mlx5] Performance drop with HWS compared to SWS
Date: Thu, 13 Jun 2024 15:06:53 +0000
References: <20240613120145.057d4963@sovereign>
In-Reply-To: <20240613120145.057d4963@sovereign>
List-Id: DPDK usage discussions

Hi,

> -----Original Message-----
> From: Dmitry Kozlyuk
> Sent: Thursday, June 13, 2024 11:02
> To: users@dpdk.org
> Subject: [net/mlx5] Performance drop with HWS compared to SWS
>
> Hello,
>
> We're observing an abrupt performance drop from 148 to 107 Mpps @ 64B
> packets apparently caused by any rule that jumps out of ingress group 0 when
> using HWS (async API) instead of SWS (sync API).
> Is it some known issue or temporary limitation?

This is not expected behavior; performance should be the same.
Thank you for reporting this and for the Neohost dumps.

I have a few questions:
- Could you share mlnx_perf stats for the SWS case as well?
- If group 1 had a flow rule with an empty match and an RSS action, is the performance difference the same?
  (This would help to understand if the problem is with miss behavior or with jump between group 0 and group 1).
- Would you be able to do the test with miss in empty group 1, with Ethernet Flow Control disabled?

> NIC: ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0/3.0 x16;
> FW: 22.40.1000
> OFED: MLNX_OFED_LINUX-24.01-0.3.3.1
> DPDK: v24.03-23-g76cef1af8b
> TG is custom, traffic is Ethernet / VLAN / IPv4 / TCP SYN @ 148 Mpps.
>
> Examples below do only the jump and miss all packets in group 1, but the same is
> observed when dropping all the packets in group 1.
>
> Software steering:
>
> /root/build/app/dpdk-testpmd -a 21:00.0,dv_flow_en=1 -- -i --rxq=1 --txq=1
>
> flow create 0 ingress group 0 pattern end actions jump group 1 / end
>
> Neohost (from OFED 5.7):
>
> ||==========================================================================
> ||| Packet Rate                                                           ||
> ||--------------------------------------------------------------------------
> ||| RX Packet Rate                      || 148,813,590  [Packets/Seconds] ||
> ||| TX Packet Rate                      || 0            [Packets/Seconds] ||
> ||==========================================================================
> ||| eSwitch                                                               ||
> ||--------------------------------------------------------------------------
> ||| RX Hops Per Packet                  || 3.075        [Hops/Packet]     ||
> ||| RX Optimal Hops Per Packet Per Pipe || 1.5375       [Hops/Packet]     ||
> ||| RX Optimal Packet Rate Bottleneck   || 279.6695     [MPPS]            ||
> ||| RX Packet Rate Bottleneck           || 262.2723     [MPPS]            ||
>
> (Full Neohost output is attached.)
>
> Hardware steering:
>
> /root/build/app/dpdk-testpmd -a 21:00.0,dv_flow_en=2 -- -i --rxq=1 --txq=1
>
> port stop 0
> flow configure 0 queues_number 1 queues_size 128 counters_number 16
> port start 0
> flow pattern_template 0 create pattern_template_id 1 ingress template end
> flow actions_template 0 create ingress actions_template_id 1 template jump group 1 / end mask jump group 0xFFFFFFFF / end
> flow template_table 0 create ingress group 0 table_id 1 pattern_template 1 actions_template 1 rules_number 1
> flow queue 0 create 0 template_table 1 pattern_template 0 actions_template 0 postpone false pattern end actions jump group 1 / end
> flow pull 0 queue 0
>
> Neohost:
>
> ||==========================================================================
> ||| Packet Rate                                                           ||
> ||--------------------------------------------------------------------------
> ||| RX Packet Rate                      || 107,498,115  [Packets/Seconds] ||
> ||| TX Packet Rate                      || 0            [Packets/Seconds] ||
> ||==========================================================================
> ||| eSwitch                                                               ||
> ||--------------------------------------------------------------------------
> ||| RX Hops Per Packet                  || 4.5503       [Hops/Packet]     ||
> ||| RX Optimal Hops Per Packet Per Pipe || 2.2751       [Hops/Packet]     ||
> ||| RX Optimal Packet Rate Bottleneck   || 188.9994     [MPPS]            ||
> ||| RX Packet Rate Bottleneck           || 182.5796     [MPPS]            ||
>
> AFAIU, performance is not constrained by the complexity of the rules.
>
> mlnx_perf -i enp33s0f0np0 -t 1:
>
>     rx_steer_missed_packets: 108,743,272
>     rx_vport_unicast_packets: 108,743,424
>     rx_vport_unicast_bytes: 6,959,579,136 Bps = 55,676.63 Mbps
>     tx_packets_phy: 7,537
>     rx_packets_phy: 150,538,251
>     tx_bytes_phy: 482,368 Bps = 3.85 Mbps
>     rx_bytes_phy: 9,634,448,128 Bps = 77,075.58 Mbps
>     tx_mac_control_phy: 7,536
>     tx_pause_ctrl_phy: 7,536
>     rx_discards_phy: 41,794,740
>     rx_64_bytes_phy: 150,538,352 Bps = 1,204.30 Mbps
>     rx_buffer_passed_thres_phy: 202
>     rx_prio0_bytes: 9,634,520,256 Bps = 77,076.16 Mbps
>     rx_prio0_packets: 108,744,322
>     rx_prio0_discards: 41,795,050
>     tx_global_pause: 7,537
>     tx_global_pause_duration: 1,011,592
>
> "rx_discards_phy" is described as follows [1]:
>
>     The number of received packets dropped due to lack of buffers on a
>     physical port. If this counter is increasing, it implies that the adapter
>     is congested and cannot absorb the traffic coming from the network.
>
> However, the adapter certainly *is* able to process 148 Mpps, since it does so
> with SWS and it can deliver this much to SW (with MPRQ).
>
> [1]:
> https://www.kernel.org/doc/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst

Best regards,
Dariusz Sosnowski
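
The figures quoted in this thread can be cross-checked with a little arithmetic. The following is only a sketch in Python: all inputs are copied from the mlnx_perf and Neohost output above, and the variable and function names are made up for illustration. It assumes the standard 20 bytes of per-frame Ethernet overhead (preamble, start-of-frame delimiter, inter-frame gap) when computing the 64B line rate of a 100GbE port. It does not explain the drop; it only shows that the counters are consistent with each other.

```python
# Sanity check of the numbers quoted in the thread.
# Nothing here is measured independently; all names are illustrative.

ETH_OVERHEAD_BYTES = 20  # preamble (7) + SFD (1) + inter-frame gap (12)


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Theoretical maximum packet rate for a given frame size."""
    return link_bps / ((frame_bytes + ETH_OVERHEAD_BYTES) * 8)


# mlnx_perf counters from the HWS run (one 1-second interval).
rx_packets_phy = 150_538_251
rx_discards_phy = 41_794_740
rx_steer_missed_packets = 108_743_272

# Neohost "RX Hops Per Packet" for each steering mode.
hops_sws = 3.075
hops_hws = 4.5503

# Packets that were not dropped at the physical port.
accepted = rx_packets_phy - rx_discards_phy
# Difference from the steering-miss counter (counters are not sampled atomically).
skew = accepted - rx_steer_missed_packets
# How many more hops per packet the HWS run reports.
hop_ratio = hops_hws / hops_sws

if __name__ == "__main__":
    print(f"64B line rate at 100GbE: {line_rate_pps(100e9, 64) / 1e6:.2f} Mpps")
    print(f"rx_packets_phy - rx_discards_phy: {accepted:,}")
    print(f"difference from rx_steer_missed_packets: {skew:,}")
    print(f"HWS/SWS hops per packet: {hop_ratio:.3f}")
```

In other words, the packets that survive rx_discards_phy match rx_steer_missed_packets to within a few hundred packets, so essentially all of the loss in the HWS run is accounted for by discards at the physical port. Whether the higher hop count reported by Neohost is the cause is not established by this check.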