From: Ferruh Yigit
To: Ciara Loftus, dev@dpdk.org
Date: Wed, 10 Mar 2021 17:50:55 +0000
Subject: Re: [dpdk-dev] [PATCH v3 0/3] AF_XDP Preferred Busy Polling
In-Reply-To: <20210310074816.3029-1-ciara.loftus@intel.com>
References: <20210309101958.27355-1-ciara.loftus@intel.com>
 <20210310074816.3029-1-ciara.loftus@intel.com>

On 3/10/2021 7:48 AM, Ciara Loftus wrote:
> Single-core performance of AF_XDP at high loads can be poor because
> a heavily loaded NAPI context will never enter or allow for busy-polling.
>
> 1C testpmd rxonly (both IRQs and PMD on core 0):
> ./dpdk-testpmd -l 0-1 --vdev=net_af_xdp0,iface=eth0 --main-lcore=1 -- \
> --forward-mode=rxonly
> 0.088Mpps
>
> In order to achieve decent performance at high loads, it is currently
> recommended to ensure that the IRQs for the netdev queue and the core
> running the PMD are different.
>
> 2C testpmd rxonly (IRQs on core 0, PMD on core 1):
> ./dpdk-testpmd -l 0-1 --vdev=net_af_xdp0,iface=eth0 --main-lcore=0 -- \
> --forward-mode=rxonly
> 19.26Mpps
>
> However, using an extra core is of course not ideal. The SO_PREFER_BUSY_POLL
> socket option was introduced in kernel v5.11 to help improve 1C performance.
> See [1].
>
> This series sets this socket option on xsks created with DPDK (i.e. instances
> of the AF_XDP PMD) unless explicitly disabled or not supported by the kernel.
> It is enabled by default in order to bring the AF_XDP PMD in line with most
> other PMDs, which execute on a single core.
>
> The following system and netdev settings are recommended in conjunction with
> busy polling:
> echo 2 | sudo tee /sys/class/net/eth0/napi_defer_hard_irqs
> echo 200000 | sudo tee /sys/class/net/eth0/gro_flush_timeout
>
> Re-running the 1C test with busy polling support and the above settings:
> ./dpdk-testpmd -l 0-1 --vdev=net_af_xdp0,iface=eth0 --main-lcore=1 -- \
> --forward-mode=rxonly
> 10.45Mpps
>
> A new vdev arg called 'busy_budget' is introduced, with a default value of
> 64. busy_budget is the value supplied to the kernel with the
> SO_BUSY_POLL_BUDGET socket option and represents the busy-polling NAPI
> budget, i.e. the number of packets the kernel will attempt to process in
> the netdev's NAPI context.
>
> To set the busy budget to 256:
> ./dpdk-testpmd --vdev=net_af_xdp0,iface=eth0,busy_budget=256
> 14.06Mpps
>
> If you still wish to run using 2 cores (one for the PMD, one for IRQs) it is
> recommended to disable busy polling to achieve optimal 2C performance:
> ./dpdk-testpmd --vdev=net_af_xdp0,iface=eth0,busy_budget=0
> 19.09Mpps
>
> v2->v3:
> * Moved release notes update to correct location
> * Changed busy_budget from uint32_t to int since this is the type expected
>   by setsockopt
> * Validate that the busy_budget arg is <= UINT16_MAX during parse
>
> v1->v2:
> * Set batch size to default size of ring (2048)
> * Split batches > 2048 into multiples of 2048 or less and process all
>   packets in the same manner as is done for other drivers, e.g. ixgbe:
>   http://code.dpdk.org/dpdk/v21.02/source/drivers/net/ixgbe/ixgbe_rxtx.c#L318
> * Update commit log with reasoning behind batching changes
> * Update release notes with note on busy polling support
> * Fix return type for syscall_needed function when the wakeup flag is not
>   present
> * Appropriate log leveling
> * Set default_*xportconf burst sizes to the default busy budget size (64)
> * Detect support for busy polling via setsockopt instead of using the
>   presence of the flag
>
> RFC->v1:
> * Fixed behaviour of busy_budget=0
> * Ensure we bail out if any of the new setsockopts fail
>
> [1] https://lwn.net/Articles/837010/
>
>
> Ciara Loftus (3):
>   net/af_xdp: allow bigger batch sizes
>   net/af_xdp: Use recvfrom() instead of poll()
>   net/af_xdp: preferred busy polling

For series,
Reviewed-by: Ferruh Yigit

Series applied to dpdk-next-net/main, thanks.
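
For readers who want to see what the new socket options look like in
practice, below is a minimal sketch of the setsockopt() sequence the cover
letter describes. It assumes a v5.11+ kernel; the fallback #defines mirror
the kernel uapi values, while the 20us timeout and the function name are
illustrative choices, not necessarily the PMD's exact code.

#include <errno.h>
#include <sys/socket.h>

/* Fallbacks for libc headers that predate these options; the numeric
 * values are taken from the kernel's asm-generic/socket.h. */
#ifndef SO_BUSY_POLL
#define SO_BUSY_POLL 46
#endif
#ifndef SO_PREFER_BUSY_POLL
#define SO_PREFER_BUSY_POLL 69
#endif
#ifndef SO_BUSY_POLL_BUDGET
#define SO_BUSY_POLL_BUDGET 70
#endif

/* Sketch: enable preferred busy polling on an AF_XDP socket fd.
 * 'budget' corresponds to the busy_budget devarg; the 20us busy-poll
 * timeout is an illustrative value. */
static int
configure_preferred_busy_poll(int xsk_fd, int budget)
{
	int prefer = 1;
	int usecs = 20;

	if (setsockopt(xsk_fd, SOL_SOCKET, SO_PREFER_BUSY_POLL,
		       &prefer, sizeof(prefer)) < 0)
		return -errno; /* e.g. ENOPROTOOPT on pre-5.11 kernels */
	if (setsockopt(xsk_fd, SOL_SOCKET, SO_BUSY_POLL,
		       &usecs, sizeof(usecs)) < 0)
		return -errno;
	if (setsockopt(xsk_fd, SOL_SOCKET, SO_BUSY_POLL_BUDGET,
		       &budget, sizeof(budget)) < 0)
		return -errno;
	return 0;
}

This matches the "bail out if any of the new setsockopts fail" behaviour
noted in the changelog: a non-zero return lets the caller fall back or
report that the kernel lacks support.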
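Likewise, the reason patch 2 switches from poll() to recvfrom() is that the
rx-path wakeup syscall is what enters the kernel's busy-poll loop for the
socket. A sketch of such a wakeup, assuming libbpf's xsk.h helpers;
rx_syscall_if_needed() is an illustrative name, not the PMD's:

#include <sys/socket.h>
#include <bpf/xsk.h> /* libbpf AF_XDP helpers */

/* Sketch: kick the kernel from the rx path. With preferred busy polling
 * enabled, recvfrom() on the xsk fd drives the NAPI context directly;
 * the payload arguments are unused, only the side effect matters. */
static void
rx_syscall_if_needed(struct xsk_ring_prod *fq, int xsk_fd)
{
	if (xsk_ring_prod__needs_wakeup(fq))
		(void)recvfrom(xsk_fd, NULL, 0, MSG_DONTWAIT, NULL, NULL);
}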
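Finally, the batch handling from patch 1 follows the same chunking pattern
as the ixgbe code linked in the changelog: bursts larger than the ring size
are served in ring-sized pieces. A minimal sketch, where rx_chunk() is a
hypothetical stand-in for the PMD's fixed-size receive routine:

#include <stdint.h>
#include <rte_common.h>
#include <rte_mbuf.h>

#define RING_SIZE 2048 /* default ring size, per the notes above */

/* Hypothetical fixed-size receive routine handling at most RING_SIZE
 * packets per call. */
static uint16_t rx_chunk(void *queue, struct rte_mbuf **bufs, uint16_t n);

/* Sketch: serve an arbitrarily large burst in ring-sized chunks. */
static uint16_t
rx_burst_split(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
{
	uint16_t nb_rx = 0;

	while (nb_pkts) {
		uint16_t n = RTE_MIN(nb_pkts, RING_SIZE);
		uint16_t ret = rx_chunk(queue, &bufs[nb_rx], n);

		nb_rx += ret;
		if (ret < n)
			break; /* ring drained; stop early */
		nb_pkts -= n;
	}
	return nb_rx;
}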