From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yongseok Koh <yskoh@mellanox.com>
To: adrien.mazarguil@6wind.com, nelio.laranjeiro@6wind.com
Cc: dev@dpdk.org, Yongseok Koh <yskoh@mellanox.com>
Date: Wed, 9 May 2018 04:13:50 -0700
Message-Id: <20180509111350.20240-4-yskoh@mellanox.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20180509111350.20240-1-yskoh@mellanox.com>
References: <20180502232054.7852-1-yskoh@mellanox.com> <20180509111350.20240-1-yskoh@mellanox.com>
MIME-Version: 1.0
Content-Type: text/plain
Subject: [dpdk-dev] [PATCH v2 3/3] net/mlx5: add Multi-Packet Rx support
List-Id: DPDK patches and discussions

Multi-Packet Rx Queue (MPRQ a.k.a. Striding RQ) can further save PCIe bandwidth
by posting a single large buffer for multiple packets. Instead of posting a
buffer per packet, one large buffer is posted in order to receive multiple
packets on the buffer. An MPRQ buffer consists of multiple fixed-size strides
and each stride receives one packet. An Rx packet is mem-copied to a
user-provided mbuf if the size of the packet is comparatively small; otherwise,
the PMD attaches the Rx packet to the mbuf by external buffer attachment -
rte_pktmbuf_attach_extbuf(). A mempool for the external buffers is allocated
and managed by the PMD.
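Not from the patch itself: a minimal sketch of the application-side contract
described above and in the documentation below, assuming a port probed with
mprq_en=1. Only the DPDK APIs named in this series are used; everything else
is illustrative. Mbufs returned by the MPRQ Rx burst may carry
EXT_ATTACHED_MBUF, and every Rx mbuf must be freed before the device is closed
because the external-buffer mempool belongs to the PMD.

#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Illustrative Rx loop; behaves the same whether or not MPRQ is enabled. */
static void
rx_loop(uint16_t port_id, uint16_t queue_id, volatile int *running)
{
	struct rte_mbuf *pkts[32];

	while (*running) {
		uint16_t n = rte_eth_rx_burst(port_id, queue_id, pkts, 32);
		uint16_t i;

		for (i = 0; i < n; ++i) {
			/*
			 * With MPRQ, a large packet may reference a stride of
			 * a PMD-managed buffer; EXT_ATTACHED_MBUF must stay
			 * set in ol_flags, and RTE_MBUF_HAS_EXTBUF() tells
			 * whether the mbuf uses such an external buffer.
			 */
			if (RTE_MBUF_HAS_EXTBUF(pkts[i])) {
				/* Process in place; data room is not private. */
			}
			/* Freeing also drops the reference on the stride buffer. */
			rte_pktmbuf_free(pkts[i]);
		}
	}
	/*
	 * All Rx mbufs must be freed before closing the device; otherwise the
	 * PMD-managed mempool is freed while strides are still referenced.
	 */
}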
Signed-off-by: Yongseok Koh <yskoh@mellanox.com> --- doc/guides/nics/mlx5.rst | 58 +++++ drivers/net/mlx5/mlx5.c | 83 +++++++ drivers/net/mlx5/mlx5.h | 11 + drivers/net/mlx5/mlx5_defs.h | 18 ++ drivers/net/mlx5/mlx5_ethdev.c | 3 + drivers/net/mlx5/mlx5_prm.h | 15 ++ drivers/net/mlx5/mlx5_rxq.c | 494 ++++++++++++++++++++++++++++++++++++--- drivers/net/mlx5/mlx5_rxtx.c | 223 +++++++++++++++++- drivers/net/mlx5/mlx5_rxtx.h | 37 ++- drivers/net/mlx5/mlx5_rxtx_vec.c | 4 + drivers/net/mlx5/mlx5_rxtx_vec.h | 3 +- drivers/net/mlx5/mlx5_trigger.c | 6 +- 12 files changed, 912 insertions(+), 43 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index 13cc3f831..a7d5c90bc 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -105,10 +105,14 @@ Limitations - A multi segment packet must have less than 6 segments in case the Tx burst function is set to multi-packet send or Enhanced multi-packet send. Otherwise it must have less than 50 segments. + - Count action for RTE flow is **only supported in Mellanox OFED**. + - Flows with a VXLAN Network Identifier equal (or ends to be equal) to 0 are not supported. + - VXLAN TSO and checksum offloads are not supported on VM. + - VF: flow rules created on VF devices can only match traffic targeted at the configured MAC addresses (see ``rte_eth_dev_mac_addr_add()``). @@ -119,6 +123,13 @@ Limitations the device. In case of ungraceful program termination, some entries may remain present and should be removed manually by other means. +- When Multi-Packet Rx queue is configured (``mprq_en``), an Rx packet can be + externally attached to a user-provided mbuf with EXT_ATTACHED_MBUF set in + ol_flags. As the mempool for the external buffers is managed by the PMD, all the + Rx mbufs must be freed before the device is closed. Otherwise, the mempool of + the external buffers will be freed by the PMD and the application which still + holds the external buffers may be corrupted. + Statistics ---------- @@ -229,6 +240,53 @@ Run-time configuration - x86_64 with ConnectX-4, ConnectX-4 LX and ConnectX-5. - POWER8 and ARMv8 with ConnectX-4 LX and ConnectX-5. +- ``mprq_en`` parameter [int] + + A nonzero value enables configuring Multi-Packet Rx queues. An Rx queue is + configured as Multi-Packet RQ if the total number of Rx queues is + ``rxqs_min_mprq`` or more and Rx scatter isn't configured. Disabled by + default. + + Multi-Packet Rx Queue (MPRQ a.k.a. Striding RQ) can further save PCIe bandwidth + by posting a single large buffer for multiple packets. Instead of posting a + buffer per packet, one large buffer is posted in order to receive multiple + packets on the buffer. An MPRQ buffer consists of multiple fixed-size strides + and each stride receives one packet. MPRQ can improve throughput for + small-packet traffic. + + When MPRQ is enabled, max_rx_pkt_len can be larger than the size of the + user-provided mbuf even if DEV_RX_OFFLOAD_SCATTER isn't enabled. The PMD will + configure a stride size large enough to accommodate max_rx_pkt_len as long as + the device allows. Note that this can waste system memory compared to enabling Rx + scatter and multi-segment packets. + +- ``mprq_log_stride_num`` parameter [int] + + Log 2 of the number of strides for Multi-Packet Rx queue. Configuring more + strides can reduce PCIe traffic further. If the configured value is not in the + range of device capability, the default value will be set with a warning + message. The default value is 4, which is 16 strides per buffer, valid only + if ``mprq_en`` is set. 
+ + The size of Rx queue should be bigger than the number of strides. + +- ``mprq_max_memcpy_len`` parameter [int] + + The maximum length of packet to memcpy in case of Multi-Packet Rx queue. Rx + packet is mem-copied to a user-provided mbuf if the size of Rx packet is less + than or equal to this parameter. Otherwise, PMD will attach the Rx packet to + the mbuf by external buffer attachment - ``rte_pktmbuf_attach_extbuf()``. + A mempool for external buffers will be allocated and managed by PMD. If Rx + packet is externally attached, ol_flags field of the mbuf will have + EXT_ATTACHED_MBUF and this flag must be preserved. ``RTE_MBUF_HAS_EXTBUF()`` + checks the flag. The default value is 128, valid only if ``mprq_en`` is set. + +- ``rxqs_min_mprq`` parameter [int] + + Configure Rx queues as Multi-Packet RQ if the total number of Rx queues is + greater or equal to this value. The default value is 12, valid only if + ``mprq_en`` is set. + - ``txq_inline`` parameter [int] Amount of data to be inlined during TX operations. Improves latency. diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index 84fc9d533..8cd2bc04f 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -46,6 +46,18 @@ /* Device parameter to enable RX completion queue compression. */ #define MLX5_RXQ_CQE_COMP_EN "rxq_cqe_comp_en" +/* Device parameter to enable Multi-Packet Rx queue. */ +#define MLX5_RX_MPRQ_EN "mprq_en" + +/* Device parameter to configure log 2 of the number of strides for MPRQ. */ +#define MLX5_RX_MPRQ_LOG_STRIDE_NUM "mprq_log_stride_num" + +/* Device parameter to limit the size of memcpy'd packet for MPRQ. */ +#define MLX5_RX_MPRQ_MAX_MEMCPY_LEN "mprq_max_memcpy_len" + +/* Device parameter to set the minimum number of Rx queues to enable MPRQ. */ +#define MLX5_RXQS_MIN_MPRQ "rxqs_min_mprq" + /* Device parameter to configure inline send. 
*/ #define MLX5_TXQ_INLINE "txq_inline" @@ -241,6 +253,7 @@ mlx5_dev_close(struct rte_eth_dev *dev) priv->txqs = NULL; } mlx5_flow_delete_drop_queue(dev); + mlx5_mprq_free_mp(dev); mlx5_mr_release(dev); if (priv->pd != NULL) { assert(priv->ctx != NULL); @@ -444,6 +457,14 @@ mlx5_args_check(const char *key, const char *val, void *opaque) } if (strcmp(MLX5_RXQ_CQE_COMP_EN, key) == 0) { config->cqe_comp = !!tmp; + } else if (strcmp(MLX5_RX_MPRQ_EN, key) == 0) { + config->mprq.enabled = !!tmp; + } else if (strcmp(MLX5_RX_MPRQ_LOG_STRIDE_NUM, key) == 0) { + config->mprq.stride_num_n = tmp; + } else if (strcmp(MLX5_RX_MPRQ_MAX_MEMCPY_LEN, key) == 0) { + config->mprq.max_memcpy_len = tmp; + } else if (strcmp(MLX5_RXQS_MIN_MPRQ, key) == 0) { + config->mprq.min_rxqs_num = tmp; } else if (strcmp(MLX5_TXQ_INLINE, key) == 0) { config->txq_inline = tmp; } else if (strcmp(MLX5_TXQS_MIN_INLINE, key) == 0) { @@ -486,6 +507,10 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs) { const char **params = (const char *[]){ MLX5_RXQ_CQE_COMP_EN, + MLX5_RX_MPRQ_EN, + MLX5_RX_MPRQ_LOG_STRIDE_NUM, + MLX5_RX_MPRQ_MAX_MEMCPY_LEN, + MLX5_RXQS_MIN_MPRQ, MLX5_TXQ_INLINE, MLX5_TXQS_MIN_INLINE, MLX5_TXQ_MPW_EN, @@ -667,6 +692,11 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, unsigned int tunnel_en = 0; unsigned int swp = 0; unsigned int verb_priorities = 0; + unsigned int mprq = 0; + unsigned int mprq_min_stride_size_n = 0; + unsigned int mprq_max_stride_size_n = 0; + unsigned int mprq_min_stride_num_n = 0; + unsigned int mprq_max_stride_num_n = 0; int idx; int i; struct mlx5dv_context attrs_out = {0}; @@ -754,6 +784,9 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, #ifdef HAVE_IBV_DEVICE_TUNNEL_SUPPORT attrs_out.comp_mask |= MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS; #endif +#ifdef HAVE_IBV_DEVICE_STRIDING_RQ_SUPPORT + attrs_out.comp_mask |= MLX5DV_CONTEXT_MASK_STRIDING_RQ; +#endif mlx5_glue->dv_query_device(attr_ctx, &attrs_out); if (attrs_out.flags & MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED) { if (attrs_out.flags & MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW) { @@ -772,6 +805,33 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, swp = attrs_out.sw_parsing_caps.sw_parsing_offloads; DRV_LOG(DEBUG, "SWP support: %u", swp); #endif +#ifdef HAVE_IBV_DEVICE_STRIDING_RQ_SUPPORT + if (attrs_out.comp_mask & MLX5DV_CONTEXT_MASK_STRIDING_RQ) { + struct mlx5dv_striding_rq_caps mprq_caps = + attrs_out.striding_rq_caps; + + DRV_LOG(DEBUG, "\tmin_single_stride_log_num_of_bytes: %d", + mprq_caps.min_single_stride_log_num_of_bytes); + DRV_LOG(DEBUG, "\tmax_single_stride_log_num_of_bytes: %d", + mprq_caps.max_single_stride_log_num_of_bytes); + DRV_LOG(DEBUG, "\tmin_single_wqe_log_num_of_strides: %d", + mprq_caps.min_single_wqe_log_num_of_strides); + DRV_LOG(DEBUG, "\tmax_single_wqe_log_num_of_strides: %d", + mprq_caps.max_single_wqe_log_num_of_strides); + DRV_LOG(DEBUG, "\tsupported_qpts: %d", + mprq_caps.supported_qpts); + DRV_LOG(DEBUG, "device supports Multi-Packet RQ"); + mprq = 1; + mprq_min_stride_size_n = + mprq_caps.min_single_stride_log_num_of_bytes; + mprq_max_stride_size_n = + mprq_caps.max_single_stride_log_num_of_bytes; + mprq_min_stride_num_n = + mprq_caps.min_single_wqe_log_num_of_strides; + mprq_max_stride_num_n = + mprq_caps.max_single_wqe_log_num_of_strides; + } +#endif if (RTE_CACHE_LINE_SIZE == 128 && !(attrs_out.flags & MLX5DV_CONTEXT_FLAGS_CQE_128B_COMP)) cqe_comp = 0; @@ -821,6 +881,13 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, .inline_max_packet_sz = 
MLX5_ARG_UNSET, .vf_nl_en = 1, .swp = !!swp, + .mprq = { + .enabled = 0, /* Disabled by default. */ + .stride_num_n = RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, + mprq_min_stride_num_n), + .max_memcpy_len = MLX5_MPRQ_MEMCPY_DEFAULT_LEN, + .min_rxqs_num = MLX5_MPRQ_MIN_RXQS, + }, }; len = snprintf(name, sizeof(name), PCI_PRI_FMT, @@ -986,6 +1053,22 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, DRV_LOG(WARNING, "Rx CQE compression isn't supported"); config.cqe_comp = 0; } + config.mprq.enabled = config.mprq.enabled && mprq; + if (config.mprq.enabled) { + if (config.mprq.stride_num_n > mprq_max_stride_num_n || + config.mprq.stride_num_n < mprq_min_stride_num_n) { + config.mprq.stride_num_n = + RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, + mprq_min_stride_num_n); + DRV_LOG(WARNING, + "the number of strides" + " for Multi-Packet RQ is out of range," + " setting default value (%u)", + 1 << config.mprq.stride_num_n); + } + config.mprq.min_stride_size_n = mprq_min_stride_size_n; + config.mprq.max_stride_size_n = mprq_max_stride_size_n; + } eth_dev = rte_eth_dev_allocate(name); if (eth_dev == NULL) { DRV_LOG(ERR, "can not allocate rte ethdev"); diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index b2ed99087..c4c962b92 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -102,6 +102,16 @@ struct mlx5_dev_config { unsigned int l3_vxlan_en:1; /* Enable L3 VXLAN flow creation. */ unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */ unsigned int swp:1; /* Tx generic tunnel checksum and TSO offload. */ + struct { + unsigned int enabled:1; /* Whether MPRQ is enabled. */ + unsigned int stride_num_n; /* Number of strides. */ + unsigned int min_stride_size_n; /* Min size of a stride. */ + unsigned int max_stride_size_n; /* Max size of a stride. */ + unsigned int max_memcpy_len; + /* Maximum packet size to memcpy Rx packets. */ + unsigned int min_rxqs_num; + /* Rx queue count threshold to enable MPRQ. */ + } mprq; /* Configurations for Multi-Packet RQ. */ unsigned int max_verbs_prio; /* Number of Verb flow priorities. */ unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */ unsigned int ind_table_max_size; /* Maximum indirection table size. */ @@ -154,6 +164,7 @@ struct priv { unsigned int txqs_n; /* TX queues array size. */ struct mlx5_rxq_data *(*rxqs)[]; /* RX queues. */ struct mlx5_txq_data *(*txqs)[]; /* TX queues. */ + struct rte_mempool *mprq_mp; /* Mempool for Multi-Packet RQ. */ struct rte_eth_rss_conf rss_conf; /* RSS configuration. */ struct rte_intr_handle intr_handle; /* Interrupt handler. */ unsigned int (*reta_idx)[]; /* RETA index table. */ diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h index 72e80af26..51124cdcc 100644 --- a/drivers/net/mlx5/mlx5_defs.h +++ b/drivers/net/mlx5/mlx5_defs.h @@ -95,4 +95,22 @@ */ #define MLX5_UAR_OFFSET (1ULL << 32) +/* Log 2 of the default number of strides per WQE for Multi-Packet RQ. */ +#define MLX5_MPRQ_STRIDE_NUM_N 4U + +/* Two-byte shift is disabled for Multi-Packet RQ. */ +#define MLX5_MPRQ_TWO_BYTE_SHIFT 0 + +/* + * Minimum size of packet to be memcpy'd instead of being attached as an + * external buffer. + */ +#define MLX5_MPRQ_MEMCPY_DEFAULT_LEN 128 + +/* Minimum number Rx queues to enable Multi-Packet RQ. */ +#define MLX5_MPRQ_MIN_RXQS 12 + +/* Cache size of mempool for Multi-Packet RQ. 
*/ +#define MLX5_MPRQ_MP_CACHE_SZ 32 + #endif /* RTE_PMD_MLX5_DEFS_H_ */ diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c index 610b5a006..25dceb2b2 100644 --- a/drivers/net/mlx5/mlx5_ethdev.c +++ b/drivers/net/mlx5/mlx5_ethdev.c @@ -547,6 +547,7 @@ mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev) }; if (dev->rx_pkt_burst == mlx5_rx_burst || + dev->rx_pkt_burst == mlx5_rx_burst_mprq || dev->rx_pkt_burst == mlx5_rx_burst_vec) return ptypes; return NULL; @@ -1204,6 +1205,8 @@ mlx5_select_rx_function(struct rte_eth_dev *dev) rx_pkt_burst = mlx5_rx_burst_vec; DRV_LOG(DEBUG, "port %u selected Rx vectorized function", dev->data->port_id); + } else if (mlx5_mprq_enabled(dev)) { + rx_pkt_burst = mlx5_rx_burst_mprq; } return rx_pkt_burst; } diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h index 05a682898..0cf370cd7 100644 --- a/drivers/net/mlx5/mlx5_prm.h +++ b/drivers/net/mlx5/mlx5_prm.h @@ -219,6 +219,21 @@ struct mlx5_mpw { } data; }; +/* WQE for Multi-Packet RQ. */ +struct mlx5_wqe_mprq { + struct mlx5_wqe_srq_next_seg next_seg; + struct mlx5_wqe_data_seg dseg; +}; + +#define MLX5_MPRQ_LEN_MASK 0x000ffff +#define MLX5_MPRQ_LEN_SHIFT 0 +#define MLX5_MPRQ_STRIDE_NUM_MASK 0x3fff0000 +#define MLX5_MPRQ_STRIDE_NUM_SHIFT 16 +#define MLX5_MPRQ_FILLER_MASK 0x80000000 +#define MLX5_MPRQ_FILLER_SHIFT 31 + +#define MLX5_MPRQ_STRIDE_SHIFT_BYTE 2 + /* CQ element structure - should be equal to the cache line size */ struct mlx5_cqe { #if (RTE_CACHE_LINE_SIZE == 128) diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index dbfc837d6..ea6ebe676 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -55,7 +55,74 @@ uint8_t rss_hash_default_key[] = { const size_t rss_hash_default_key_len = sizeof(rss_hash_default_key); /** - * Allocate RX queue elements. + * Check whether Multi-Packet RQ can be enabled for the device. + * + * @param dev + * Pointer to Ethernet device. + * + * @return + * 1 if supported, negative errno value if not. + */ +inline int +mlx5_check_mprq_support(struct rte_eth_dev *dev) +{ + struct priv *priv = dev->data->dev_private; + + if (priv->config.mprq.enabled && + priv->rxqs_n >= priv->config.mprq.min_rxqs_num) + return 1; + return -ENOTSUP; +} + +/** + * Check whether Multi-Packet RQ is enabled for the Rx queue. + * + * @param rxq + * Pointer to receive queue structure. + * + * @return + * 0 if disabled, otherwise enabled. + */ +inline int +mlx5_rxq_mprq_enabled(struct mlx5_rxq_data *rxq) +{ + return rxq->strd_num_n > 0; +} + +/** + * Check whether Multi-Packet RQ is enabled for the device. + * + * @param dev + * Pointer to Ethernet device. + * + * @return + * 0 if disabled, otherwise enabled. + */ +inline int +mlx5_mprq_enabled(struct rte_eth_dev *dev) +{ + struct priv *priv = dev->data->dev_private; + uint16_t i; + uint16_t n = 0; + + if (mlx5_check_mprq_support(dev) < 0) + return 0; + /* All the configured queues should be enabled. */ + for (i = 0; i < priv->rxqs_n; ++i) { + struct mlx5_rxq_data *rxq = (*priv->rxqs)[i]; + + if (!rxq) + continue; + if (mlx5_rxq_mprq_enabled(rxq)) + ++n; + } + /* Multi-Packet RQ can't be partially configured. */ + assert(n == 0 || n == priv->rxqs_n); + return n == priv->rxqs_n; +} + +/** + * Allocate RX queue elements for Multi-Packet RQ. * * @param rxq_ctrl * Pointer to RX queue structure. @@ -63,8 +130,58 @@ const size_t rss_hash_default_key_len = sizeof(rss_hash_default_key); * @return * 0 on success, a negative errno value otherwise and rte_errno is set. 
*/ -int -rxq_alloc_elts(struct mlx5_rxq_ctrl *rxq_ctrl) +static int +rxq_alloc_elts_mprq(struct mlx5_rxq_ctrl *rxq_ctrl) +{ + struct mlx5_rxq_data *rxq = &rxq_ctrl->rxq; + unsigned int wqe_n = 1 << rxq->elts_n; + unsigned int i; + int err; + + /* Iterate on segments. */ + for (i = 0; i <= wqe_n; ++i) { + struct mlx5_mprq_buf *buf; + + if (rte_mempool_get(rxq->mprq_mp, (void **)&buf) < 0) { + DRV_LOG(ERR, "port %u empty mbuf pool", rxq->port_id); + rte_errno = ENOMEM; + goto error; + } + if (i < wqe_n) + (*rxq->mprq_bufs)[i] = buf; + else + rxq->mprq_repl = buf; + } + DRV_LOG(DEBUG, + "port %u Rx queue %u allocated and configured %u segments", + rxq->port_id, rxq_ctrl->idx, wqe_n); + return 0; +error: + err = rte_errno; /* Save rte_errno before cleanup. */ + wqe_n = i; + for (i = 0; (i != wqe_n); ++i) { + if ((*rxq->elts)[i] != NULL) + rte_mempool_put(rxq->mprq_mp, + (*rxq->mprq_bufs)[i]); + (*rxq->mprq_bufs)[i] = NULL; + } + DRV_LOG(DEBUG, "port %u Rx queue %u failed, freed everything", + rxq->port_id, rxq_ctrl->idx); + rte_errno = err; /* Restore rte_errno. */ + return -rte_errno; +} + +/** + * Allocate RX queue elements for Single-Packet RQ. + * + * @param rxq_ctrl + * Pointer to RX queue structure. + * + * @return + * 0 on success, errno value on failure. + */ +static int +rxq_alloc_elts_sprq(struct mlx5_rxq_ctrl *rxq_ctrl) { const unsigned int sges_n = 1 << rxq_ctrl->rxq.sges_n; unsigned int elts_n = 1 << rxq_ctrl->rxq.elts_n; @@ -140,13 +257,57 @@ rxq_alloc_elts(struct mlx5_rxq_ctrl *rxq_ctrl) } /** - * Free RX queue elements. + * Allocate RX queue elements. + * + * @param rxq_ctrl + * Pointer to RX queue structure. + * + * @return + * 0 on success, errno value on failure. + */ +int +rxq_alloc_elts(struct mlx5_rxq_ctrl *rxq_ctrl) +{ + return mlx5_rxq_mprq_enabled(&rxq_ctrl->rxq) ? + rxq_alloc_elts_mprq(rxq_ctrl) : rxq_alloc_elts_sprq(rxq_ctrl); +} + +/** + * Free RX queue elements for Multi-Packet RQ. * * @param rxq_ctrl * Pointer to RX queue structure. */ static void -rxq_free_elts(struct mlx5_rxq_ctrl *rxq_ctrl) +rxq_free_elts_mprq(struct mlx5_rxq_ctrl *rxq_ctrl) +{ + struct mlx5_rxq_data *rxq = &rxq_ctrl->rxq; + uint16_t i; + + DRV_LOG(DEBUG, "port %u Multi-Packet Rx queue %u freeing WRs", + rxq->port_id, rxq_ctrl->idx); + if (rxq->mprq_bufs == NULL) + return; + assert(mlx5_rxq_check_vec_support(rxq) < 0); + for (i = 0; (i != (1u << rxq->elts_n)); ++i) { + if ((*rxq->mprq_bufs)[i] != NULL) + mlx5_mprq_buf_free((*rxq->mprq_bufs)[i]); + (*rxq->mprq_bufs)[i] = NULL; + } + if (rxq->mprq_repl != NULL) { + mlx5_mprq_buf_free(rxq->mprq_repl); + rxq->mprq_repl = NULL; + } +} + +/** + * Free RX queue elements for Single-Packet RQ. + * + * @param rxq_ctrl + * Pointer to RX queue structure. + */ +static void +rxq_free_elts_sprq(struct mlx5_rxq_ctrl *rxq_ctrl) { struct mlx5_rxq_data *rxq = &rxq_ctrl->rxq; const uint16_t q_n = (1 << rxq->elts_n); @@ -175,6 +336,21 @@ rxq_free_elts(struct mlx5_rxq_ctrl *rxq_ctrl) } /** + * Free RX queue elements. + * + * @param rxq_ctrl + * Pointer to RX queue structure. + */ +static void +rxq_free_elts(struct mlx5_rxq_ctrl *rxq_ctrl) +{ + if (mlx5_rxq_mprq_enabled(&rxq_ctrl->rxq)) + rxq_free_elts_mprq(rxq_ctrl); + else + rxq_free_elts_sprq(rxq_ctrl); +} + +/** * Clean up a RX queue. * * Destroy objects, free allocated memory and reset the structure for reuse. 
@@ -623,10 +799,16 @@ mlx5_rxq_ibv_new(struct rte_eth_dev *dev, uint16_t idx) struct ibv_cq_init_attr_ex ibv; struct mlx5dv_cq_init_attr mlx5; } cq; - struct ibv_wq_init_attr wq; + struct { + struct ibv_wq_init_attr ibv; +#ifdef HAVE_IBV_DEVICE_STRIDING_RQ_SUPPORT + struct mlx5dv_wq_init_attr mlx5; +#endif + } wq; struct ibv_cq_ex cq_attr; } attr; - unsigned int cqe_n = (1 << rxq_data->elts_n) - 1; + unsigned int cqe_n; + unsigned int wqe_n = 1 << rxq_data->elts_n; struct mlx5_rxq_ibv *tmpl; struct mlx5dv_cq cq_info; struct mlx5dv_rwq rwq; @@ -634,6 +816,7 @@ mlx5_rxq_ibv_new(struct rte_eth_dev *dev, uint16_t idx) int ret = 0; struct mlx5dv_obj obj; struct mlx5_dev_config *config = &priv->config; + const int mprq_en = mlx5_rxq_mprq_enabled(rxq_data); assert(rxq_data); assert(!rxq_ctrl->ibv); @@ -658,6 +841,10 @@ mlx5_rxq_ibv_new(struct rte_eth_dev *dev, uint16_t idx) goto error; } } + if (mprq_en) + cqe_n = wqe_n * (1 << rxq_data->strd_num_n) - 1; + else + cqe_n = wqe_n - 1; attr.cq.ibv = (struct ibv_cq_init_attr_ex){ .cqe = cqe_n, .channel = tmpl->channel, @@ -695,11 +882,11 @@ mlx5_rxq_ibv_new(struct rte_eth_dev *dev, uint16_t idx) dev->data->port_id, priv->device_attr.orig_attr.max_qp_wr); DRV_LOG(DEBUG, "port %u priv->device_attr.max_sge is %d", dev->data->port_id, priv->device_attr.orig_attr.max_sge); - attr.wq = (struct ibv_wq_init_attr){ + attr.wq.ibv = (struct ibv_wq_init_attr){ .wq_context = NULL, /* Could be useful in the future. */ .wq_type = IBV_WQT_RQ, /* Max number of outstanding WRs. */ - .max_wr = (1 << rxq_data->elts_n) >> rxq_data->sges_n, + .max_wr = wqe_n >> rxq_data->sges_n, /* Max number of scatter/gather elements in a WR. */ .max_sge = 1 << rxq_data->sges_n, .pd = priv->pd, @@ -713,16 +900,35 @@ mlx5_rxq_ibv_new(struct rte_eth_dev *dev, uint16_t idx) }; /* By default, FCS (CRC) is stripped by hardware. */ if (rxq_data->crc_present) { - attr.wq.create_flags |= IBV_WQ_FLAGS_SCATTER_FCS; - attr.wq.comp_mask |= IBV_WQ_INIT_ATTR_FLAGS; + attr.wq.ibv.create_flags |= IBV_WQ_FLAGS_SCATTER_FCS; + attr.wq.ibv.comp_mask |= IBV_WQ_INIT_ATTR_FLAGS; } #ifdef HAVE_IBV_WQ_FLAG_RX_END_PADDING if (config->hw_padding) { - attr.wq.create_flags |= IBV_WQ_FLAG_RX_END_PADDING; - attr.wq.comp_mask |= IBV_WQ_INIT_ATTR_FLAGS; + attr.wq.ibv.create_flags |= IBV_WQ_FLAG_RX_END_PADDING; + attr.wq.ibv.comp_mask |= IBV_WQ_INIT_ATTR_FLAGS; } #endif - tmpl->wq = mlx5_glue->create_wq(priv->ctx, &attr.wq); +#ifdef HAVE_IBV_DEVICE_STRIDING_RQ_SUPPORT + attr.wq.mlx5 = (struct mlx5dv_wq_init_attr){ + .comp_mask = 0, + }; + if (mprq_en) { + struct mlx5dv_striding_rq_init_attr *mprq_attr = + &attr.wq.mlx5.striding_rq_attrs; + + attr.wq.mlx5.comp_mask |= MLX5DV_WQ_INIT_ATTR_MASK_STRIDING_RQ; + *mprq_attr = (struct mlx5dv_striding_rq_init_attr){ + .single_stride_log_num_of_bytes = rxq_data->strd_sz_n, + .single_wqe_log_num_of_strides = rxq_data->strd_num_n, + .two_byte_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT, + }; + } + tmpl->wq = mlx5_glue->dv_create_wq(priv->ctx, &attr.wq.ibv, + &attr.wq.mlx5); +#else + tmpl->wq = mlx5_glue->create_wq(priv->ctx, &attr.wq.ibv); +#endif if (tmpl->wq == NULL) { DRV_LOG(ERR, "port %u Rx queue %u WQ creation failure", dev->data->port_id, idx); @@ -733,16 +939,14 @@ mlx5_rxq_ibv_new(struct rte_eth_dev *dev, uint16_t idx) * Make sure number of WRs*SGEs match expectations since a queue * cannot allocate more than "desc" buffers. 
*/ - if (((int)attr.wq.max_wr != - ((1 << rxq_data->elts_n) >> rxq_data->sges_n)) || - ((int)attr.wq.max_sge != (1 << rxq_data->sges_n))) { + if (attr.wq.ibv.max_wr != (wqe_n >> rxq_data->sges_n) || + attr.wq.ibv.max_sge != (1u << rxq_data->sges_n)) { DRV_LOG(ERR, "port %u Rx queue %u requested %u*%u but got %u*%u" " WRs*SGEs", dev->data->port_id, idx, - ((1 << rxq_data->elts_n) >> rxq_data->sges_n), - (1 << rxq_data->sges_n), - attr.wq.max_wr, attr.wq.max_sge); + wqe_n >> rxq_data->sges_n, (1 << rxq_data->sges_n), + attr.wq.ibv.max_wr, attr.wq.ibv.max_sge); rte_errno = EINVAL; goto error; } @@ -777,25 +981,40 @@ mlx5_rxq_ibv_new(struct rte_eth_dev *dev, uint16_t idx) goto error; } /* Fill the rings. */ - rxq_data->wqes = (volatile struct mlx5_wqe_data_seg (*)[]) - (uintptr_t)rwq.buf; - for (i = 0; (i != (unsigned int)(1 << rxq_data->elts_n)); ++i) { - struct rte_mbuf *buf = (*rxq_data->elts)[i]; - volatile struct mlx5_wqe_data_seg *scat = &(*rxq_data->wqes)[i]; - + rxq_data->wqes = rwq.buf; + for (i = 0; (i != wqe_n); ++i) { + volatile struct mlx5_wqe_data_seg *scat; + uintptr_t addr; + uint32_t byte_count; + + if (mprq_en) { + struct mlx5_mprq_buf *buf = (*rxq_data->mprq_bufs)[i]; + + scat = &((volatile struct mlx5_wqe_mprq *) + rxq_data->wqes)[i].dseg; + addr = (uintptr_t)mlx5_mprq_buf_addr(buf); + byte_count = (1 << rxq_data->strd_sz_n) * + (1 << rxq_data->strd_num_n); + } else { + struct rte_mbuf *buf = (*rxq_data->elts)[i]; + + scat = &((volatile struct mlx5_wqe_data_seg *) + rxq_data->wqes)[i]; + addr = rte_pktmbuf_mtod(buf, uintptr_t); + byte_count = DATA_LEN(buf); + } /* scat->addr must be able to store a pointer. */ assert(sizeof(scat->addr) >= sizeof(uintptr_t)); *scat = (struct mlx5_wqe_data_seg){ - .addr = rte_cpu_to_be_64(rte_pktmbuf_mtod(buf, - uintptr_t)), - .byte_count = rte_cpu_to_be_32(DATA_LEN(buf)), - .lkey = mlx5_rx_mb2mr(rxq_data, buf), + .addr = rte_cpu_to_be_64(addr), + .byte_count = rte_cpu_to_be_32(byte_count), + .lkey = mlx5_rx_addr2mr(rxq_data, addr), }; } rxq_data->rq_db = rwq.dbrec; rxq_data->cqe_n = log2above(cq_info.cqe_cnt); rxq_data->cq_ci = 0; - rxq_data->rq_ci = 0; + rxq_data->strd_ci = 0; rxq_data->rq_pi = 0; rxq_data->zip = (struct rxq_zip){ .ai = 0, @@ -806,7 +1025,7 @@ mlx5_rxq_ibv_new(struct rte_eth_dev *dev, uint16_t idx) rxq_data->cqn = cq_info.cqn; rxq_data->cq_arm_sn = 0; /* Update doorbell counter. */ - rxq_data->rq_ci = (1 << rxq_data->elts_n) >> rxq_data->sges_n; + rxq_data->rq_ci = wqe_n >> rxq_data->sges_n; rte_wmb(); *rxq_data->rq_db = rte_cpu_to_be_32(rxq_data->rq_ci); DRV_LOG(DEBUG, "port %u rxq %u updated with %p", dev->data->port_id, @@ -932,12 +1151,180 @@ mlx5_rxq_ibv_releasable(struct mlx5_rxq_ibv *rxq_ibv) } /** + * Callback function to initialize mbufs for Multi-Packet RQ. + */ +static inline void +mlx5_mprq_buf_init(struct rte_mempool *mp, void *opaque_arg __rte_unused, + void *_m, unsigned int i __rte_unused) +{ + struct mlx5_mprq_buf *buf = _m; + + memset(_m, 0, sizeof(*buf)); + buf->mp = mp; + rte_atomic16_set(&buf->refcnt, 1); +} + +/** + * Free mempool of Multi-Packet RQ. + * + * @param dev + * Pointer to Ethernet device. + * + * @return + * 0 on success, negative errno value on failure. 
+ */ +int +mlx5_mprq_free_mp(struct rte_eth_dev *dev) +{ + struct priv *priv = dev->data->dev_private; + struct rte_mempool *mp = priv->mprq_mp; + unsigned int i; + + if (mp == NULL) + return 0; + DRV_LOG(DEBUG, "port %u freeing mempool (%s) for Multi-Packet RQ", + dev->data->port_id, mp->name); + /* + * If a buffer in the pool has been externally attached to a mbuf and it + * is still in use by application, destroying the Rx qeueue can spoil + * the packet. It is unlikely to happen but if application dynamically + * creates and destroys with holding Rx packets, this can happen. + * + * TODO: It is unavoidable for now because the mempool for Multi-Packet + * RQ isn't provided by application but managed by PMD. + */ + if (!rte_mempool_full(mp)) { + DRV_LOG(ERR, + "port %u mempool for Multi-Packet RQ is still in use", + dev->data->port_id); + rte_errno = EBUSY; + return -rte_errno; + } + rte_mempool_free(mp); + /* Unset mempool for each Rx queue. */ + for (i = 0; i != priv->rxqs_n; ++i) { + struct mlx5_rxq_data *rxq = (*priv->rxqs)[i]; + + if (rxq == NULL) + continue; + rxq->mprq_mp = NULL; + } + return 0; +} + +/** + * Allocate a mempool for Multi-Packet RQ. All configured Rx queues share the + * mempool. If already allocated, reuse it if there're enough elements. + * Otherwise, resize it. + * + * @param dev + * Pointer to Ethernet device. + * + * @return + * 0 on success, negative errno value on failure. + */ +int +mlx5_mprq_alloc_mp(struct rte_eth_dev *dev) +{ + struct priv *priv = dev->data->dev_private; + struct rte_mempool *mp = priv->mprq_mp; + char name[RTE_MEMPOOL_NAMESIZE]; + unsigned int desc = 0; + unsigned int buf_len; + unsigned int obj_num; + unsigned int obj_size; + unsigned int strd_num_n = 0; + unsigned int strd_sz_n = 0; + unsigned int i; + + if (!mlx5_mprq_enabled(dev)) + return 0; + /* Count the total number of descriptors configured. */ + for (i = 0; i != priv->rxqs_n; ++i) { + struct mlx5_rxq_data *rxq = (*priv->rxqs)[i]; + + if (rxq == NULL) + continue; + desc += 1 << rxq->elts_n; + /* Get the max number of strides. */ + if (strd_num_n < rxq->strd_num_n) + strd_num_n = rxq->strd_num_n; + /* Get the max size of a stride. */ + if (strd_sz_n < rxq->strd_sz_n) + strd_sz_n = rxq->strd_sz_n; + } + assert(strd_num_n && strd_sz_n); + buf_len = (1 << strd_num_n) * (1 << strd_sz_n); + obj_size = buf_len + sizeof(struct mlx5_mprq_buf); + /* + * Received packets can be either memcpy'd or externally referenced. In + * case that the packet is attached to an mbuf as an external buffer, as + * it isn't possible to predict how the buffers will be queued by + * application, there's no option to exactly pre-allocate needed buffers + * in advance but to speculatively prepares enough buffers. + * + * In the data path, if this Mempool is depleted, PMD will try to memcpy + * received packets to buffers provided by application (rxq->mp) until + * this Mempool gets available again. + */ + desc *= 4; + obj_num = desc + MLX5_MPRQ_MP_CACHE_SZ * priv->rxqs_n; + /* Check a mempool is already allocated and if it can be resued. */ + if (mp != NULL && mp->elt_size >= obj_size && mp->size >= obj_num) { + DRV_LOG(DEBUG, "port %u mempool %s is being reused", + dev->data->port_id, mp->name); + /* Reuse. */ + goto exit; + } else if (mp != NULL) { + DRV_LOG(DEBUG, "port %u mempool %s should be resized, freeing it", + dev->data->port_id, mp->name); + /* + * If failed to free, which means it may be still in use, no way + * but to keep using the existing one. 
On buffer underrun, + * packets will be memcpy'd instead of external buffer + * attachment. + */ + if (mlx5_mprq_free_mp(dev)) { + if (mp->elt_size >= obj_size) + goto exit; + else + return -rte_errno; + } + } + snprintf(name, sizeof(name), "%s-mprq", dev->device->name); + mp = rte_mempool_create(name, obj_num, obj_size, MLX5_MPRQ_MP_CACHE_SZ, + 0, NULL, NULL, mlx5_mprq_buf_init, NULL, + dev->device->numa_node, 0); + if (mp == NULL) { + DRV_LOG(ERR, + "port %u failed to allocate a mempool for" + " Multi-Packet RQ, count=%u, size=%u", + dev->data->port_id, obj_num, obj_size); + rte_errno = ENOMEM; + return -rte_errno; + } + priv->mprq_mp = mp; +exit: + /* Set mempool for each Rx queue. */ + for (i = 0; i != priv->rxqs_n; ++i) { + struct mlx5_rxq_data *rxq = (*priv->rxqs)[i]; + + if (rxq == NULL) + continue; + rxq->mprq_mp = mp; + } + DRV_LOG(INFO, "port %u Multi-Packet RQ is configured", + dev->data->port_id); + return 0; +} + +/** * Create a DPDK Rx queue. * * @param dev * Pointer to Ethernet device. * @param idx - * TX queue index. + * RX queue index. * @param desc * Number of descriptors to configure in queue. * @param socket @@ -954,13 +1341,15 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc, struct priv *priv = dev->data->dev_private; struct mlx5_rxq_ctrl *tmpl; unsigned int mb_len = rte_pktmbuf_data_room_size(mp); + unsigned int mprq_stride_size; struct mlx5_dev_config *config = &priv->config; /* * Always allocate extra slots, even if eventually * the vector Rx will not be used. */ - const uint16_t desc_n = + uint16_t desc_n = desc + config->rx_vec_en * MLX5_VPMD_DESCS_PER_LOOP; + const int mprq_en = mlx5_check_mprq_support(dev) > 0; tmpl = rte_calloc_socket("RXQ", 1, sizeof(*tmpl) + @@ -978,10 +1367,41 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc, tmpl->socket = socket; if (dev->data->dev_conf.intr_conf.rxq) tmpl->irq = 1; - /* Enable scattered packets support for this queue if necessary. */ + /* + * This Rx queue can be configured as a Multi-Packet RQ if all of the + * following conditions are met: + * - MPRQ is enabled. + * - The number of descs is more than the number of strides. + * - max_rx_pkt_len plus overhead is less than the max size of a + * stride. + * Otherwise, enable Rx scatter if necessary. + */ assert(mb_len >= RTE_PKTMBUF_HEADROOM); - if (dev->data->dev_conf.rxmode.max_rx_pkt_len <= - (mb_len - RTE_PKTMBUF_HEADROOM)) { + mprq_stride_size = + dev->data->dev_conf.rxmode.max_rx_pkt_len + + sizeof(struct rte_mbuf_ext_shared_info) + + RTE_PKTMBUF_HEADROOM; + if (mprq_en && + desc >= (1U << config->mprq.stride_num_n) && + mprq_stride_size <= (1U << config->mprq.max_stride_size_n)) { + /* TODO: Rx scatter isn't supported yet. */ + tmpl->rxq.sges_n = 0; + /* Trim the number of descs needed. 
*/ + desc >>= config->mprq.stride_num_n; + tmpl->rxq.strd_num_n = config->mprq.stride_num_n; + tmpl->rxq.strd_sz_n = RTE_MAX(log2above(mprq_stride_size), + config->mprq.min_stride_size_n); + tmpl->rxq.strd_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT; + tmpl->rxq.mprq_max_memcpy_len = + RTE_MIN(mb_len - RTE_PKTMBUF_HEADROOM, + config->mprq.max_memcpy_len); + DRV_LOG(DEBUG, + "port %u Rx queue %u: Multi-Packet RQ is enabled" + " strd_num_n = %u, strd_sz_n = %u", + dev->data->port_id, idx, + tmpl->rxq.strd_num_n, tmpl->rxq.strd_sz_n); + } else if (dev->data->dev_conf.rxmode.max_rx_pkt_len <= + (mb_len - RTE_PKTMBUF_HEADROOM)) { tmpl->rxq.sges_n = 0; } else if (conf->offloads & DEV_RX_OFFLOAD_SCATTER) { unsigned int size = diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index 6cb177cca..d960a73d5 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -47,6 +47,9 @@ static __rte_always_inline void rxq_cq_to_mbuf(struct mlx5_rxq_data *rxq, struct rte_mbuf *pkt, volatile struct mlx5_cqe *cqe, uint32_t rss_hash_res); +static __rte_always_inline void +mprq_buf_replace(struct mlx5_rxq_data *rxq, uint16_t rq_idx); + uint32_t mlx5_ptype_table[] __rte_cache_aligned = { [0xff] = RTE_PTYPE_ALL_MASK, /* Last entry for errored packet. */ }; @@ -1920,7 +1923,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n) while (pkts_n) { unsigned int idx = rq_ci & wqe_cnt; - volatile struct mlx5_wqe_data_seg *wqe = &(*rxq->wqes)[idx]; + volatile struct mlx5_wqe_data_seg *wqe = + &((volatile struct mlx5_wqe_data_seg *)rxq->wqes)[idx]; struct rte_mbuf *rep = (*rxq->elts)[idx]; uint32_t rss_hash_res = 0; @@ -2023,6 +2027,223 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n) return i; } +void +mlx5_mprq_buf_free_cb(void *addr __rte_unused, void *opaque) +{ + struct mlx5_mprq_buf *buf = opaque; + + if (rte_atomic16_read(&buf->refcnt) == 1) { + rte_mempool_put(buf->mp, buf); + } else if (rte_atomic16_add_return(&buf->refcnt, -1) == 0) { + rte_atomic16_set(&buf->refcnt, 1); + rte_mempool_put(buf->mp, buf); + } +} + +void +mlx5_mprq_buf_free(struct mlx5_mprq_buf *buf) +{ + mlx5_mprq_buf_free_cb(NULL, buf); +} + +static inline void +mprq_buf_replace(struct mlx5_rxq_data *rxq, uint16_t rq_idx) +{ + struct mlx5_mprq_buf *rep = rxq->mprq_repl; + volatile struct mlx5_wqe_data_seg *wqe = + &((volatile struct mlx5_wqe_mprq *)rxq->wqes)[rq_idx].dseg; + void *addr; + + assert(rep != NULL); + /* Replace MPRQ buf. */ + (*rxq->mprq_bufs)[rq_idx] = rep; + /* Replace WQE. */ + addr = mlx5_mprq_buf_addr(rep); + wqe->addr = rte_cpu_to_be_64((uintptr_t)addr); + /* If there's only one MR, no need to replace LKey in WQE. */ + if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1)) + wqe->lkey = mlx5_rx_addr2mr(rxq, (uintptr_t)addr); + /* Stash a mbuf for next replacement. */ + if (likely(!rte_mempool_get(rxq->mprq_mp, (void **)&rep))) + rxq->mprq_repl = rep; + else + rxq->mprq_repl = NULL; +} + +/** + * DPDK callback for RX with Multi-Packet RQ support. + * + * @param dpdk_rxq + * Generic pointer to RX queue structure. + * @param[out] pkts + * Array to store received packets. + * @param pkts_n + * Maximum number of packets in array. + * + * @return + * Number of packets successfully received (<= pkts_n). 
+ */ +uint16_t +mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n) +{ + struct mlx5_rxq_data *rxq = dpdk_rxq; + const unsigned int strd_n = 1 << rxq->strd_num_n; + const unsigned int strd_sz = 1 << rxq->strd_sz_n; + const unsigned int strd_shift = + MLX5_MPRQ_STRIDE_SHIFT_BYTE * rxq->strd_shift_en; + const unsigned int cq_mask = (1 << rxq->cqe_n) - 1; + const unsigned int wq_mask = (1 << rxq->elts_n) - 1; + volatile struct mlx5_cqe *cqe = &(*rxq->cqes)[rxq->cq_ci & cq_mask]; + unsigned int i = 0; + uint16_t rq_ci = rxq->rq_ci; + uint16_t strd_idx = rxq->strd_ci; + struct mlx5_mprq_buf *buf = (*rxq->mprq_bufs)[rq_ci & wq_mask]; + + while (i < pkts_n) { + struct rte_mbuf *pkt; + void *addr; + int ret; + unsigned int len; + uint16_t consumed_strd; + uint32_t offset; + uint32_t byte_cnt; + uint32_t rss_hash_res = 0; + + if (strd_idx == strd_n) { + /* Replace WQE only if the buffer is still in use. */ + if (rte_atomic16_read(&buf->refcnt) > 1) { + mprq_buf_replace(rxq, rq_ci & wq_mask); + /* Release the old buffer. */ + mlx5_mprq_buf_free(buf); + } else if (unlikely(rxq->mprq_repl == NULL)) { + struct mlx5_mprq_buf *rep; + + /* + * Currently, the MPRQ mempool is out of buffer + * and doing memcpy regardless of the size of Rx + * packet. Retry allocation to get back to + * normal. + */ + if (!rte_mempool_get(rxq->mprq_mp, + (void **)&rep)) + rxq->mprq_repl = rep; + } + /* Advance to the next WQE. */ + strd_idx = 0; + ++rq_ci; + buf = (*rxq->mprq_bufs)[rq_ci & wq_mask]; + } + cqe = &(*rxq->cqes)[rxq->cq_ci & cq_mask]; + ret = mlx5_rx_poll_len(rxq, cqe, cq_mask, &rss_hash_res); + if (!ret) + break; + if (unlikely(ret == -1)) { + /* RX error, packet is likely too large. */ + ++rxq->stats.idropped; + continue; + } + byte_cnt = ret; + consumed_strd = (byte_cnt & MLX5_MPRQ_STRIDE_NUM_MASK) >> + MLX5_MPRQ_STRIDE_NUM_SHIFT; + assert(consumed_strd); + strd_idx += consumed_strd; + if (byte_cnt & MLX5_MPRQ_FILLER_MASK) + continue; + /* + * Currently configured to receive a packet per a stride. But if + * MTU is adjusted through kernel interface, device could + * consume multiple strides without raising an error. In this + * case, the packet should be dropped because it is bigger than + * the max_rx_pkt_len. + */ + if (unlikely(consumed_strd > 1)) { + ++rxq->stats.idropped; + continue; + } + pkt = rte_pktmbuf_alloc(rxq->mp); + if (unlikely(pkt == NULL)) { + ++rxq->stats.rx_nombuf; + break; + } + len = (byte_cnt & MLX5_MPRQ_LEN_MASK) >> MLX5_MPRQ_LEN_SHIFT; + assert((int)len >= (rxq->crc_present << 2)); + if (rxq->crc_present) + len -= ETHER_CRC_LEN; + offset = strd_idx * strd_sz + strd_shift; + addr = RTE_PTR_ADD(mlx5_mprq_buf_addr(buf), offset); + /* Initialize the offload flag. */ + pkt->ol_flags = 0; + /* + * Memcpy packets to the target mbuf if: + * - The size of packet is smaller than mprq_max_memcpy_len. + * - Out of buffer in the Mempool for Multi-Packet RQ. + */ + if (len <= rxq->mprq_max_memcpy_len || rxq->mprq_repl == NULL) { + rte_memcpy(rte_pktmbuf_mtod(pkt, void *), addr, len); + } else { + rte_iova_t buf_iova; + struct rte_mbuf_ext_shared_info *shinfo; + uint16_t buf_len = consumed_strd * strd_sz; + + /* Increment the refcnt of the whole chunk. */ + rte_atomic16_add_return(&buf->refcnt, 1); + assert((uint16_t)rte_atomic16_read(&buf->refcnt) <= + strd_n + 1); + addr = RTE_PTR_SUB(addr, RTE_PKTMBUF_HEADROOM); + /* + * MLX5 device doesn't use iova but it is necessary in a + * case where the Rx packet is transmitted via a + * different PMD. 
+ */ + buf_iova = rte_mempool_virt2iova(buf) + + RTE_PTR_DIFF(addr, buf); + shinfo = rte_pktmbuf_ext_shinfo_init_helper(addr, + &buf_len, mlx5_mprq_buf_free_cb, buf); + /* + * EXT_ATTACHED_MBUF will be set to pkt->ol_flags when + * attaching the stride to mbuf and more offload flags + * will be added below by calling rxq_cq_to_mbuf(). + * Other fields will be overwritten. + */ + rte_pktmbuf_attach_extbuf(pkt, addr, buf_iova, buf_len, + shinfo); + rte_pktmbuf_reset_headroom(pkt); + assert(pkt->ol_flags == EXT_ATTACHED_MBUF); + /* + * Prevent potential overflow due to MTU change through + * kernel interface. + */ + len = RTE_MIN(len, (uint16_t)(pkt->buf_len - + pkt->data_off)); + } + rxq_cq_to_mbuf(rxq, pkt, cqe, rss_hash_res); + PKT_LEN(pkt) = len; + DATA_LEN(pkt) = len; + PORT(pkt) = rxq->port_id; +#ifdef MLX5_PMD_SOFT_COUNTERS + /* Increment bytes counter. */ + rxq->stats.ibytes += PKT_LEN(pkt); +#endif + /* Return packet. */ + *(pkts++) = pkt; + ++i; + } + /* Update the consumer indexes. */ + rxq->strd_ci = strd_idx; + rte_io_wmb(); + *rxq->cq_db = rte_cpu_to_be_32(rxq->cq_ci); + if (rq_ci != rxq->rq_ci) { + rxq->rq_ci = rq_ci; + rte_io_wmb(); + *rxq->rq_db = rte_cpu_to_be_32(rxq->rq_ci); + } +#ifdef MLX5_PMD_SOFT_COUNTERS + /* Increment packets counter. */ + rxq->stats.ipackets += i; +#endif + return i; +} + /** * Dummy DPDK callback for TX. * diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h index 74581cf9b..a6017f0dd 100644 --- a/drivers/net/mlx5/mlx5_rxtx.h +++ b/drivers/net/mlx5/mlx5_rxtx.h @@ -64,6 +64,16 @@ struct rxq_zip { uint32_t cqe_cnt; /* Number of CQEs. */ }; +/* Multi-Packet RQ buffer header. */ +struct mlx5_mprq_buf { + struct rte_mempool *mp; + rte_atomic16_t refcnt; /* Atomically accessed refcnt. */ + uint8_t pad[RTE_PKTMBUF_HEADROOM]; /* Headroom for the first packet. */ +} __rte_cache_aligned; + +/* Get pointer to the first stride. */ +#define mlx5_mprq_buf_addr(ptr) ((ptr) + 1) + /* RX queue descriptor. */ struct mlx5_rxq_data { unsigned int csum:1; /* Enable checksum offloading. */ @@ -75,19 +85,30 @@ struct mlx5_rxq_data { unsigned int elts_n:4; /* Log 2 of Mbufs. */ unsigned int rss_hash:1; /* RSS hash result is enabled. */ unsigned int mark:1; /* Marked flow available on the queue. */ - unsigned int :15; /* Remaining bits. */ + unsigned int strd_num_n:5; /* Log 2 of the number of stride. */ + unsigned int strd_sz_n:4; /* Log 2 of stride size. */ + unsigned int strd_shift_en:1; /* Enable 2bytes shift on a stride. */ + unsigned int :6; /* Remaining bits. */ volatile uint32_t *rq_db; volatile uint32_t *cq_db; uint16_t port_id; uint16_t rq_ci; + uint16_t strd_ci; /* Stride index in a WQE for Multi-Packet RQ. */ uint16_t rq_pi; uint16_t cq_ci; struct mlx5_mr_ctrl mr_ctrl; /* MR control descriptor. */ - volatile struct mlx5_wqe_data_seg(*wqes)[]; + uint16_t mprq_max_memcpy_len; /* Maximum size of packet to memcpy. */ + volatile void *wqes; volatile struct mlx5_cqe(*cqes)[]; struct rxq_zip zip; /* Compressed context. */ - struct rte_mbuf *(*elts)[]; + RTE_STD_C11 + union { + struct rte_mbuf *(*elts)[]; + struct mlx5_mprq_buf *(*mprq_bufs)[]; + }; struct rte_mempool *mp; + struct rte_mempool *mprq_mp; /* Mempool for Multi-Packet RQ. */ + struct mlx5_mprq_buf *mprq_repl; /* Stashed mbuf for replenish. */ struct mlx5_rxq_stats stats; uint64_t mbuf_initializer; /* Default rearm_data for vectorized Rx. */ struct rte_mbuf fake_mbuf; /* elts padding for vectorized Rx. 
*/ @@ -206,6 +227,11 @@ struct mlx5_txq_ctrl { extern uint8_t rss_hash_default_key[]; extern const size_t rss_hash_default_key_len; +int mlx5_check_mprq_support(struct rte_eth_dev *dev); +int mlx5_rxq_mprq_enabled(struct mlx5_rxq_data *rxq); +int mlx5_mprq_enabled(struct rte_eth_dev *dev); +int mlx5_mprq_free_mp(struct rte_eth_dev *dev); +int mlx5_mprq_alloc_mp(struct rte_eth_dev *dev); void mlx5_rxq_cleanup(struct mlx5_rxq_ctrl *rxq_ctrl); int mlx5_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc, unsigned int socket, const struct rte_eth_rxconf *conf, @@ -229,6 +255,7 @@ int mlx5_rxq_release(struct rte_eth_dev *dev, uint16_t idx); int mlx5_rxq_releasable(struct rte_eth_dev *dev, uint16_t idx); int mlx5_rxq_verify(struct rte_eth_dev *dev); int rxq_alloc_elts(struct mlx5_rxq_ctrl *rxq_ctrl); +int rxq_alloc_mprq_buf(struct mlx5_rxq_ctrl *rxq_ctrl); struct mlx5_ind_table_ibv *mlx5_ind_table_ibv_new(struct rte_eth_dev *dev, const uint16_t *queues, uint32_t queues_n); @@ -292,6 +319,10 @@ uint16_t mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t mlx5_tx_burst_empw(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n); uint16_t mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n); +void mlx5_mprq_buf_free_cb(void *addr, void *opaque); +void mlx5_mprq_buf_free(struct mlx5_mprq_buf *buf); +uint16_t mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, + uint16_t pkts_n); uint16_t removed_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n); uint16_t removed_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c index 215a9e592..0a4aed8ff 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec.c +++ b/drivers/net/mlx5/mlx5_rxtx_vec.c @@ -275,6 +275,8 @@ mlx5_rxq_check_vec_support(struct mlx5_rxq_data *rxq) struct mlx5_rxq_ctrl *ctrl = container_of(rxq, struct mlx5_rxq_ctrl, rxq); + if (mlx5_mprq_enabled(ETH_DEV(ctrl->priv))) + return -ENOTSUP; if (!ctrl->priv->config.rx_vec_en || rxq->sges_n != 0) return -ENOTSUP; return 1; @@ -297,6 +299,8 @@ mlx5_check_vec_rx_support(struct rte_eth_dev *dev) if (!priv->config.rx_vec_en) return -ENOTSUP; + if (mlx5_mprq_enabled(dev)) + return -ENOTSUP; /* All the configured queues should support. */ for (i = 0; i < priv->rxqs_n; ++i) { struct mlx5_rxq_data *rxq = (*priv->rxqs)[i]; diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h index 64442836b..598dc7519 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec.h +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h @@ -87,7 +87,8 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n) const uint16_t q_mask = q_n - 1; uint16_t elts_idx = rxq->rq_ci & q_mask; struct rte_mbuf **elts = &(*rxq->elts)[elts_idx]; - volatile struct mlx5_wqe_data_seg *wq = &(*rxq->wqes)[elts_idx]; + volatile struct mlx5_wqe_data_seg *wq = + &((volatile struct mlx5_wqe_data_seg *)rxq->wqes)[elts_idx]; unsigned int i; assert(n >= MLX5_VPMD_RXQ_RPLNSH_THRESH); diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c index 36b7c9e2f..3e7c0a90f 100644 --- a/drivers/net/mlx5/mlx5_trigger.c +++ b/drivers/net/mlx5/mlx5_trigger.c @@ -102,6 +102,9 @@ mlx5_rxq_start(struct rte_eth_dev *dev) unsigned int i; int ret = 0; + /* Allocate/reuse/resize mempool for Multi-Packet RQ. 
*/ + if (mlx5_mprq_alloc_mp(dev)) + goto error; for (i = 0; i != priv->rxqs_n; ++i) { struct mlx5_rxq_ctrl *rxq_ctrl = mlx5_rxq_get(dev, i); struct rte_mempool *mp; @@ -109,7 +112,8 @@ mlx5_rxq_start(struct rte_eth_dev *dev) if (!rxq_ctrl) continue; /* Pre-register Rx mempool. */ - mp = rxq_ctrl->rxq.mp; + mp = mlx5_rxq_mprq_enabled(&rxq_ctrl->rxq) ? + rxq_ctrl->rxq.mprq_mp : rxq_ctrl->rxq.mp; DRV_LOG(DEBUG, "port %u Rx queue %u registering" " mp %s having %u chunks", -- 2.11.0
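
Usage note (illustrative, not part of the patch): the run-time parameters
documented above are device arguments, so an application enables MPRQ by
probing the mlx5 port with them. A minimal sketch follows, assuming a
hypothetical PCI address and the documented default values; the same devargs
string can equally be given to testpmd through its EAL ``-w`` option.

#include <rte_eal.h>

int
main(int argc, char **argv)
{
	/*
	 * Illustrative EAL setup: whitelist one mlx5 device and pass the
	 * Multi-Packet RQ devargs added by this patch. The PCI address is a
	 * placeholder; the values shown mirror the documented defaults. MPRQ
	 * takes effect only when at least rxqs_min_mprq Rx queues are
	 * configured and Rx scatter is not used.
	 */
	char *eal_args[] = {
		argv[0], "-w",
		"0000:03:00.0,mprq_en=1,mprq_log_stride_num=4,"
		"mprq_max_memcpy_len=128,rxqs_min_mprq=12",
	};

	(void)argc;
	if (rte_eal_init(3, eal_args) < 0)
		return -1;
	/* ... rte_eth_dev_configure() with >= 12 Rx queues, then start ... */
	return 0;
}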