From: Ciara Loftus <ciara.loftus@intel.com>
To: dev@dpdk.org
Date: Thu, 18 Feb 2021 09:23:04 +0000
Message-Id: <20210218092307.29575-1-ciara.loftus@intel.com>
Subject: [dpdk-dev] [PATCH RFC 0/3] AF_XDP Preferred Busy Polling

Single-core performance of AF_XDP at high loads can be poor, because a
heavily loaded NAPI context will never enter or allow for busy-polling.

1C testpmd rxonly (both IRQs and PMD on core 0):
./dpdk-testpmd -l 0-1 --vdev=net_af_xdp0,iface=eth0 -- --forward-mode=rxonly
0.088Mpps

In order to achieve decent performance at high loads, it is currently
recommended to ensure that the netdev's IRQs and the PMD run on different
cores.

2C testpmd rxonly (IRQs on core 0, PMD on core 1):
./dpdk-testpmd -l 0-1 --vdev=net_af_xdp0,iface=eth0 --main-lcore=0 -- \
  --forward-mode=rxonly
19.26Mpps

However, using an extra core is of course not ideal.

The SO_PREFER_BUSY_POLL socket option was introduced in kernel v5.11 to help
improve 1C performance. See [1]. This series sets this socket option on xsks
created with DPDK (i.e. instances of the AF_XDP PMD) unless it is explicitly
disabled or not supported by the kernel. It is enabled by default in order to
bring the AF_XDP PMD in line with most other PMDs, which execute on a single
core.

The following system and netdev settings are recommended in conjunction with
busy polling:

echo 2 | sudo tee /sys/class/net/eth0/napi_defer_hard_irqs
echo 200000 | sudo tee /sys/class/net/eth0/gro_flush_timeout

With the RFC these must be configured manually, but for the v1 they may be
set from the PMD, since performance is tightly coupled with these settings.

Re-running the 1C test with busy polling support and the above settings:
./dpdk-testpmd -l 0-1 --vdev=net_af_xdp0,iface=eth0 -- --forward-mode=rxonly
10.45Mpps
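Under the hood both knobs are plain SOL_SOCKET options applied to the xsk
file descriptor. Below is a minimal, illustrative sketch of how they could be
set with setsockopt(); it is not the code from this series, the helper name
is made up, and the #ifndef fallback values are an assumption mirroring the
asm-generic/socket.h definitions added in v5.11 (only needed when building
against older headers):

/* Illustrative only: apply preferred busy polling to an AF_XDP socket fd.
 * SO_PREFER_BUSY_POLL and SO_BUSY_POLL_BUDGET were added in kernel v5.11;
 * the fallback values below mirror asm-generic/socket.h and may differ on
 * architectures that define their own socket option numbers.
 */
#include <sys/socket.h>
#include <stdio.h>

#ifndef SO_PREFER_BUSY_POLL
#define SO_PREFER_BUSY_POLL 69
#endif
#ifndef SO_BUSY_POLL_BUDGET
#define SO_BUSY_POLL_BUDGET 70
#endif

static int
configure_preferred_busy_poll(int xsk_fd, int busy_budget)
{
	int one = 1;

	if (setsockopt(xsk_fd, SOL_SOCKET, SO_PREFER_BUSY_POLL,
		       &one, sizeof(one)) < 0) {
		perror("SO_PREFER_BUSY_POLL"); /* e.g. kernel < v5.11 */
		return -1;
	}

	if (setsockopt(xsk_fd, SOL_SOCKET, SO_BUSY_POLL_BUDGET,
		       &busy_budget, sizeof(busy_budget)) < 0) {
		perror("SO_BUSY_POLL_BUDGET");
		return -1;
	}

	return 0;
}

As described above, the series itself only applies the options when the
kernel supports them and busy polling has not been explicitly disabled.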
A new vdev arg called 'busy_budget' is introduced, with a default value of
64. busy_budget is the value supplied to the kernel with the
SO_BUSY_POLL_BUDGET socket option and represents the busy-polling NAPI
budget, i.e. the number of packets the kernel will attempt to process in the
netdev's NAPI context. If set to 0, preferred busy polling is disabled.

To set the busy budget to 256:
./dpdk-testpmd --vdev=net_af_xdp0,iface=eth0,busy_budget=256
14.06Mpps

To set the busy budget to 512:
./dpdk-testpmd --vdev=net_af_xdp0,iface=eth0,busy_budget=512
14.32Mpps

To disable preferred busy polling:
./dpdk-testpmd --vdev=net_af_xdp0,iface=eth0,busy_budget=0

(A programmatic sketch of the same devargs via rte_vdev_init() follows the
diffstat at the end of this mail.)

[1] https://lwn.net/Articles/837010/

Ciara Loftus (3):
  net/af_xdp: Increase max batch size to 512
  net/af_xdp: Use recvfrom() instead of poll()
  net/af_xdp: preferred busy polling

 doc/guides/nics/af_xdp.rst          | 38 +++++++++++-
 drivers/net/af_xdp/compat.h         | 13 ++++
 drivers/net/af_xdp/rte_eth_af_xdp.c | 95 ++++++++++++++++++++++-------
 3 files changed, 124 insertions(+), 22 deletions(-)

-- 
2.17.1
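For completeness, the same devargs can also be supplied from application code
rather than on the testpmd command line. The sketch below is illustrative
only: it assumes this series is applied (so the busy_budget devarg exists),
and the vdev name and "eth0" interface are placeholders; rte_vdev_init() and
rte_eth_dev_get_port_by_name() are existing DPDK APIs.

#include <rte_bus_vdev.h>	/* rte_vdev_init() */
#include <rte_ethdev.h>		/* rte_eth_dev_get_port_by_name() */

/* Illustrative: create an AF_XDP port with a busy-poll budget of 256.
 * Returns the port id on success, or a negative value on failure.
 */
static int
create_af_xdp_port_with_busy_poll(void)
{
	const char *name = "net_af_xdp0";	/* placeholder vdev name */
	uint16_t port_id;
	int ret;

	ret = rte_vdev_init(name, "iface=eth0,busy_budget=256");
	if (ret != 0)
		return ret;

	ret = rte_eth_dev_get_port_by_name(name, &port_id);
	if (ret != 0)
		return ret;

	return port_id;
}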