From: Dariusz Sosnowski
To: Dmitry Kozlyuk
Cc: users@dpdk.org
Subject: RE: [net/mlx5] Performance drop with HWS compared to SWS
Date: Wed, 19 Jun 2024 19:15:30 +0000

Hi,

Thank you for running all the tests and for all the data. Really appreciated.

> -----Original Message-----
> From: Dmitry Kozlyuk
> Sent: Thursday, June 13, 2024 22:15
> To: Dariusz Sosnowski
> Cc: users@dpdk.org
> Subject: Re: [net/mlx5] Performance drop with HWS compared to SWS
>
> Hi Dariusz,
>
> Thank you for looking into the issue, please find full details below.
>
> Summary:
>
> Case       SWS (Mpps)  HWS (Mpps)
> ---------  ----------  ----------
> baseline   148         -
> jump_rss   37          148
> jump_miss  148         107
> jump_drop  148         107
>
> From "baseline" vs "jump_rss", the problem is not in the jump.
> From "jump_miss" vs "jump_drop", the problem is not only in the miss.
> This is a lab, so I can try anything else you need for diagnostics.
>
> Disabling flow control only fixes the number of packets received by the PHY,
> but not the number of packets processed by steering.
>
> > - Could you share mlnx_perf stats for the SWS case as well?
>
> rx_vport_unicast_packets: 151,716,299
> rx_vport_unicast_bytes: 9,709,843,136 Bps = 77,678.74 Mbps
> rx_packets_phy: 151,716,517
> rx_bytes_phy: 9,709,856,896 Bps = 77,678.85 Mbps
> rx_64_bytes_phy: 151,716,867 Bps = 1,213.73 Mbps
> rx_prio0_bytes: 9,710,051,648 Bps = 77,680.41 Mbps
> rx_prio0_packets: 151,719,564
>
> > - If group 1 had a flow rule with empty match and RSS action, is the
> >   performance difference the same?
> >   (This would help to understand if the problem is with the miss behavior
> >   or with the jump between group 0 and group 1.)
>
> Case "baseline"
> ===============
> No flow rules, just to make sure the host can poll the NIC fast enough.
> Result: 148 Mpps
>
> /root/build/app/dpdk-testpmd -l 0-31,64-95 \
>     -a 21:00.0,dv_flow_en=1,mprq_en=1,rx_vec_en=1 --in-memory -- \
>     -i --rxq=32 --txq=32 --forward-mode=rxonly --nb-cores=32
>
> mlnx_perf -i enp33s0f0np0 -t 1
>
> rx_vport_unicast_packets: 151,622,123
> rx_vport_unicast_bytes: 9,703,815,872 Bps = 77,630.52 Mbps
> rx_packets_phy: 151,621,983
> rx_bytes_phy: 9,703,807,872 Bps = 77,630.46 Mbps
> rx_64_bytes_phy: 151,621,026 Bps = 1,212.96 Mbps
> rx_prio0_bytes: 9,703,716,480 Bps = 77,629.73 Mbps
> rx_prio0_packets: 151,620,576
>
> Attached: "neohost-cx6dx-baseline-sws.txt".
>
> Case "jump_rss", SWS
> ====================
> Jump to group 1, then RSS.
> Result: 37 Mpps (?!)
> This "37 Mpps" seems to be caused by a PCIe bottleneck, which MPRQ is
> supposed to overcome.
> Is MPRQ limited only to the default RSS in SWS mode?
>
> /root/build/app/dpdk-testpmd -l 0-31,64-95 \
>     -a 21:00.0,dv_flow_en=1,mprq_en=1,rx_vec_en=1 --in-memory -- \
>     -i --rxq=32 --txq=32 --forward-mode=rxonly --nb-cores=32
>
> flow create 0 ingress group 0 pattern end actions jump group 1 / end
> flow create 0 ingress group 1 pattern end actions rss queues 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 end / end
> #
> start
>
> mlnx_perf -i enp33s0f0np0 -t 1:
>
> rx_vport_unicast_packets: 38,155,359
> rx_vport_unicast_bytes: 2,441,942,976 Bps = 19,535.54 Mbps
> tx_packets_phy: 7,586
> rx_packets_phy: 151,531,694
> tx_bytes_phy: 485,568 Bps = 3.88 Mbps
> rx_bytes_phy: 9,698,029,248 Bps = 77,584.23 Mbps
> tx_mac_control_phy: 7,587
> tx_pause_ctrl_phy: 7,587
> rx_discards_phy: 113,376,265
> rx_64_bytes_phy: 151,531,748 Bps = 1,212.25 Mbps
> rx_buffer_passed_thres_phy: 203
> rx_prio0_bytes: 9,698,066,560 Bps = 77,584.53 Mbps
> rx_prio0_packets: 38,155,328
> rx_prio0_discards: 113,376,963
> tx_global_pause: 7,587
> tx_global_pause_duration: 1,018,266
>
> Attached: "neohost-cx6dx-jump_rss-sws.txt".

How are you generating the traffic? Are both IP addresses and TCP ports changing?

The "jump_rss" case degradation seems to be caused by the RSS configuration.
It appears that packets are not distributed across all queues.
With these flow commands in SWS, all packets should go to queue 0 only.
Could you please check if that's the case on your side?

It can be alleviated by specifying RSS hash types on the RSS action:

flow create 0 ingress group 0 pattern end actions jump group 1 / end
flow create 0 ingress group 1 pattern end actions rss queues end types ip tcp end / end

Could you please try that on your side?

With the HWS flow engine, if the RSS action does not have hash types specified,
the implementation defaults to hashing on IP addresses.
If the IP addresses are variable in your test traffic, that would explain the
difference.
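
If it helps, here is roughly how the same group-1 rule with explicit hash
types would look through the rte_flow C API. This is only a sketch: the
helper name is made up, and I am assuming a recent DPDK (22.11 or newer)
for the RTE_ETH_RSS_* macro names.

    #include <rte_ethdev.h>
    #include <rte_flow.h>

    /* Hypothetical helper: catch-all rule in group 1, RSS over nb_queues
     * queues, hashing on IP addresses and TCP ports. */
    static struct rte_flow *
    create_rss_rule(uint16_t port_id, uint16_t nb_queues)
    {
            static uint16_t queues[RTE_MAX_QUEUES_PER_PORT];
            const struct rte_flow_attr attr = { .group = 1, .ingress = 1 };
            const struct rte_flow_item pattern[] = {
                    { .type = RTE_FLOW_ITEM_TYPE_END }, /* empty match */
            };
            const struct rte_flow_action_rss rss = {
                    .func = RTE_ETH_HASH_FUNCTION_DEFAULT,
                    /* Equivalent of "types ip tcp" in the command above. */
                    .types = RTE_ETH_RSS_IP | RTE_ETH_RSS_TCP,
                    .queue_num = nb_queues,
                    .queue = queues,
            };
            const struct rte_flow_action actions[] = {
                    { .type = RTE_FLOW_ACTION_TYPE_RSS, .conf = &rss },
                    { .type = RTE_FLOW_ACTION_TYPE_END },
            };
            struct rte_flow_error error;

            for (uint16_t i = 0; i < nb_queues; i++)
                    queues[i] = i;
            return rte_flow_create(port_id, &attr, pattern, actions, &error);
    }

The important part is the explicit .types assignment; everything else
mirrors the testpmd command.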
> Case "jump_rss", HWS > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > Result: 148 Mpps >=20 > /root/build/app/dpdk-testpmd -l 0-31,64-95 -a > 21:00.0,dv_flow_en=3D2,mprq_en=3D1,rx_vec_en=3D1 --in-memory -- \ > -i --rxq=3D32 --txq=3D32 --forward-mode=3Drxonly --nb-cores=3D32 >=20 > port stop 0 > flow configure 0 queues_number 1 queues_size 128 counters_number 16 port > start 0 # flow pattern_template 0 create pattern_template_id 1 ingress te= mplate > end flow actions_template 0 create ingress actions_template_id 1 template= jump > group 1 / end mask jump group 0xFFFFFFFF / end flow template_table 0 crea= te > ingress group 0 table_id 1 pattern_template 1 actions_template 1 rules_nu= mber > 1 flow queue 0 create 0 template_table 1 pattern_template 0 actions_templ= ate 0 > postpone false pattern end actions jump group 1 / end flow pull 0 queue 0= # flow > actions_template 0 create ingress actions_template_id 2 template rss / en= d mask > rss / end flow template_table 0 create ingress group 1 table_id 2 > pattern_template 1 actions_template 2 rules_number 1 flow queue 0 create = 0 > template_table 2 pattern_template 0 actions_template 0 postpone false pat= tern > end actions rss queues 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 = 20 21 > 22 23 24 25 26 27 28 29 30 31 end / end flow pull 0 queue 0 # start >=20 > mlnx_perf -i enp33s0f0np0 -t 1: >=20 > rx_vport_unicast_packets: 151,514,131 > rx_vport_unicast_bytes: 9,696,904,384 Bps =3D 77,575.23 Mbps > rx_packets_phy: 151,514,275 > rx_bytes_phy: 9,696,913,600 Bps =3D 77,575.30 Mbps > rx_64_bytes_phy: 151,514,122 Bps =3D 1,212.11 Mbps > rx_prio0_bytes: 9,696,814,528 Bps =3D 77,574.51 Mbps > rx_prio0_packets: 151,512,717 >=20 > Attached: "neohost-cx6dx-jump_rss-hws.txt". >=20 > > - Would you be able to do the test with miss in empty group 1, with Eth= ernet > Flow Control disabled? >=20 > $ ethtool -A enp33s0f0np0 rx off tx off >=20 > $ ethtool -a enp33s0f0np0 > Pause parameters for enp33s0f0np0: > Autonegotiate: off > RX: off > TX: off >=20 > testpmd> show port 0 flow_ctrl >=20 > ********************* Flow control infos for port 0 ********************= * FC > mode: > Rx pause: off > Tx pause: off > Autoneg: off > Pause time: 0x0 > High waterline: 0x0 > Low waterline: 0x0 > Send XON: off > Forward MAC control frames: off >=20 >=20 > Case "jump_miss", SWS > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > Result: 148 Mpps >=20 > /root/build/app/dpdk-testpmd -l 0-31,64-95 -a > 21:00.0,dv_flow_en=3D1,mprq_en=3D1,rx_vec_en=3D1 --in-memory -- \ > -i --rxq=3D32 --txq=3D32 --forward-mode=3Drxonly --nb-cores=3D32 >=20 > flow create 0 ingress group 0 pattern end actions jump group 1 / end star= t >=20 > mlnx_perf -i enp33s0f0np0 >=20 > rx_vport_unicast_packets: 151,526,489 > rx_vport_unicast_bytes: 9,697,695,296 Bps =3D 77,581.56 Mbps > rx_packets_phy: 151,526,193 > rx_bytes_phy: 9,697,676,672 Bps =3D 77,581.41 Mbps > rx_64_bytes_phy: 151,525,423 Bps =3D 1,212.20 Mbps > rx_prio0_bytes: 9,697,488,256 Bps =3D 77,579.90 Mbps > rx_prio0_packets: 151,523,240 >=20 > Attached: "neohost-cx6dx-jump_miss-sws.txt". >=20 >=20 > Case "jump_miss", HWS > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > Result: 107 Mpps > Neohost shows RX Packet Rate =3D 148 Mpps, but RX Steering Packets =3D 10= 7 Mpps. 
>
> > - Would you be able to do the test with miss in empty group 1, with
> >   Ethernet Flow Control disabled?
>
> $ ethtool -A enp33s0f0np0 rx off tx off
>
> $ ethtool -a enp33s0f0np0
> Pause parameters for enp33s0f0np0:
> Autonegotiate: off
> RX:            off
> TX:            off
>
> testpmd> show port 0 flow_ctrl
>
> ********************* Flow control infos for port 0 *********************
> FC mode:
>    Rx pause: off
>    Tx pause: off
> Autoneg: off
> Pause time: 0x0
> High waterline: 0x0
> Low waterline: 0x0
> Send XON: off
> Forward MAC control frames: off
>
> Case "jump_miss", SWS
> =====================
> Result: 148 Mpps
>
> /root/build/app/dpdk-testpmd -l 0-31,64-95 \
>     -a 21:00.0,dv_flow_en=1,mprq_en=1,rx_vec_en=1 --in-memory -- \
>     -i --rxq=32 --txq=32 --forward-mode=rxonly --nb-cores=32
>
> flow create 0 ingress group 0 pattern end actions jump group 1 / end
> start
>
> mlnx_perf -i enp33s0f0np0
>
> rx_vport_unicast_packets: 151,526,489
> rx_vport_unicast_bytes: 9,697,695,296 Bps = 77,581.56 Mbps
> rx_packets_phy: 151,526,193
> rx_bytes_phy: 9,697,676,672 Bps = 77,581.41 Mbps
> rx_64_bytes_phy: 151,525,423 Bps = 1,212.20 Mbps
> rx_prio0_bytes: 9,697,488,256 Bps = 77,579.90 Mbps
> rx_prio0_packets: 151,523,240
>
> Attached: "neohost-cx6dx-jump_miss-sws.txt".
>
> Case "jump_miss", HWS
> =====================
> Result: 107 Mpps
> Neohost shows RX Packet Rate = 148 Mpps, but RX Steering Packets = 107 Mpps.
>
> /root/build/app/dpdk-testpmd -l 0-31,64-95 \
>     -a 21:00.0,dv_flow_en=2,mprq_en=1,rx_vec_en=1 --in-memory -- \
>     -i --rxq=32 --txq=32 --forward-mode=rxonly --nb-cores=32
>
> port stop 0
> flow configure 0 queues_number 1 queues_size 128 counters_number 16
> port start 0
> flow pattern_template 0 create pattern_template_id 1 ingress template end
> flow actions_template 0 create ingress actions_template_id 1 template jump group 1 / end mask jump group 0xFFFFFFFF / end
> flow template_table 0 create ingress group 0 table_id 1 pattern_template 1 actions_template 1 rules_number 1
> flow queue 0 create 0 template_table 1 pattern_template 0 actions_template 0 postpone false pattern end actions jump group 1 / end
> flow pull 0 queue 0
>
> mlnx_perf -i enp33s0f0np0
>
> rx_steer_missed_packets: 109,463,466
> rx_vport_unicast_packets: 109,463,450
> rx_vport_unicast_bytes: 7,005,660,800 Bps = 56,045.28 Mbps
> rx_packets_phy: 151,518,062
> rx_bytes_phy: 9,697,155,840 Bps = 77,577.24 Mbps
> rx_64_bytes_phy: 151,516,201 Bps = 1,212.12 Mbps
> rx_prio0_bytes: 9,697,137,280 Bps = 77,577.9 Mbps
> rx_prio0_packets: 151,517,782
> rx_prio0_buf_discard: 42,055,156
>
> Attached: "neohost-cx6dx-jump_miss-hws.txt".

As you can see, HWS provides the "rx_steer_missed_packets" counter, which is
not available with SWS. It counts the packets which did not hit any rule and,
in the end, had to be dropped. Enabling it requires additional HW flows to
handle such packets, and this has a side effect: at very high packet rates,
these HW flows create enough backpressure to overflow the Rx buffer on the
CX6 Dx.

After some internal discussions, I learned that this is more or less expected:
such a high number of missed packets is already an indication of a problem,
since NIC resources are wasted on packets for which no destination is
specified.
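
If it is easier to monitor this from inside the application, something like
the sketch below should work, assuming the PMD exposes the counter through
ethdev xstats under the same name that mlnx_perf prints (I have not verified
the exact xstats name here).

    #include <stdint.h>
    #include <rte_ethdev.h>

    /* Sketch: read one extended statistic by name, e.g.
     *     uint64_t missed;
     *     read_xstat(0, "rx_steer_missed_packets", &missed);
     * Returns 0 on success, -1 if the PMD does not expose that name. */
    static int
    read_xstat(uint16_t port_id, const char *name, uint64_t *value)
    {
            uint64_t id;

            if (rte_eth_xstats_get_id_by_name(port_id, name, &id) != 0)
                    return -1;
            return rte_eth_xstats_get_by_id(port_id, &id, value, 1) == 1 ? 0 : -1;
    }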
> Case "jump_drop", SWS
> =====================
> Result: 148 Mpps
> Match all in group 0, jump to group 1; match all in group 1, drop.
>
> /root/build/app/dpdk-testpmd -l 0-31,64-95 \
>     -a 21:00.0,dv_flow_en=1,mprq_en=1,rx_vec_en=1 --in-memory -- \
>     -i --rxq=32 --txq=32 --forward-mode=rxonly --nb-cores=32
>
> flow create 0 ingress group 0 pattern end actions jump group 1 / end
> flow create 0 ingress group 1 pattern end actions drop / end
>
> mlnx_perf -i enp33s0f0np0
>
> rx_vport_unicast_packets: 151,705,269
> rx_vport_unicast_bytes: 9,709,137,216 Bps = 77,673.9 Mbps
> rx_packets_phy: 151,701,498
> rx_bytes_phy: 9,708,896,128 Bps = 77,671.16 Mbps
> rx_64_bytes_phy: 151,693,532 Bps = 1,213.54 Mbps
> rx_prio0_bytes: 9,707,005,888 Bps = 77,656.4 Mbps
> rx_prio0_packets: 151,671,959
>
> Attached: "neohost-cx6dx-jump_drop-sws.txt".
>
> Case "jump_drop", HWS
> =====================
> Result: 107 Mpps
> Match all in group 0, jump to group 1; match all in group 1, drop.
> I've also run this test with a counter attached to the dropping table, and
> it showed that indeed only 107 Mpps hit the rule.
>
> /root/build/app/dpdk-testpmd -l 0-31,64-95 \
>     -a 21:00.0,dv_flow_en=2,mprq_en=1,rx_vec_en=1 --in-memory -- \
>     -i --rxq=32 --txq=32 --forward-mode=rxonly --nb-cores=32
>
> port stop 0
> flow configure 0 queues_number 1 queues_size 128 counters_number 16
> port start 0
> flow pattern_template 0 create pattern_template_id 1 ingress template end
> flow actions_template 0 create ingress actions_template_id 1 template jump group 1 / end mask jump group 0xFFFFFFFF / end
> flow template_table 0 create ingress group 0 table_id 1 pattern_template 1 actions_template 1 rules_number 1
> flow queue 0 create 0 template_table 1 pattern_template 0 actions_template 0 postpone false pattern end actions jump group 1 / end
> flow pull 0 queue 0
> #
> flow actions_template 0 create ingress actions_template_id 2 template drop / end mask drop / end
> flow template_table 0 create ingress group 1 table_id 2 pattern_template 1 actions_template 2 rules_number 1
> flow queue 0 create 0 template_table 2 pattern_template 0 actions_template 0 postpone false pattern end actions drop / end
> flow pull 0 queue 0
>
> mlnx_perf -i enp33s0f0np0
>
> rx_vport_unicast_packets: 109,500,637
> rx_vport_unicast_bytes: 7,008,040,768 Bps = 56,064.32 Mbps
> rx_packets_phy: 151,568,915
> rx_bytes_phy: 9,700,410,560 Bps = 77,603.28 Mbps
> rx_64_bytes_phy: 151,569,146 Bps = 1,212.55 Mbps
> rx_prio0_bytes: 9,699,889,216 Bps = 77,599.11 Mbps
> rx_prio0_packets: 151,560,756
> rx_prio0_buf_discard: 42,065,705
>
> Attached: "neohost-cx6dx-jump_drop-hws.txt".

We are still looking into the "jump_drop" case.

By the way, may I ask what your target use case with HWS is?

Best regards,
Dariusz Sosnowski