Subject: Re: [PATCH] net: increase the maximum of RX/TX descriptors
From: Lukáš Šišmiš
Date: Tue, 5 Nov 2024 22:20:38 +0100
To: Morten Brørup, Stephen Hemminger
Cc: anatoly.burakov@intel.com, ian.stokes@intel.com, dev@dpdk.org, bruce.richardson@intel.com

On 05. 11. 24 17:50, Morten Brørup wrote:
>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>> Sent: Tuesday, 5 November 2024 16.55
>>
>> On Tue, 5 Nov 2024 09:49:39 +0100
>> Morten Brørup wrote:
>>
>>>> I suspect AF_PACKET provides an intermediate step which can buffer
>>>> more or spread out the work.
>>>
>>> Agree. It's a Linux scheduling issue.
>>>
>>> With DPDK polling, there is no interrupt in the kernel scheduler.
>>> If the CPU core running the DPDK polling thread is running some
>>> other thread when the packets arrive on the hardware, the DPDK
>>> polling thread is NOT scheduled immediately, but has to wait for
>>> the kernel scheduler to switch to this thread instead of the other
>>> thread.
>>> Quite a lot of time can pass before this happens - the kernel
>>> scheduler does not know that the DPDK polling thread has urgent
>>> work pending.
>>> And the number of RX descriptors needs to be big enough to absorb
>>> all packets arriving during the scheduling delay.
>>> It is not well described how to *guarantee* that nothing but the
>>> DPDK polling thread runs on a dedicated CPU core.
>>
>> That's why any non-trivial DPDK application needs to run on
>> isolated CPUs.
> Exactly.
> And it is non-trivial and not well described how to do this.
>
> Especially in virtual environments.
> E.g. I ran some scheduling latency tests earlier today, and
> frequently observed 500-1000 us scheduling latency under VMware
> vSphere ESXi. This requires a large number of RX descriptors to
> absorb without packet loss. (Disclaimer: The virtual machine
> configuration had not been optimized. Tweaking the knobs offered by
> the hypervisor might improve this.)
>
> The exact same firmware (same kernel, rootfs, libraries,
> applications, etc.) running directly on our purpose-built hardware
> has scheduling latency very close to the kernel's default
> "timerslack" (50 us).

Thanks for the feedback. I am currently not 100% sure whether I ran my
earlier experiments with isolcpus, or whether it had a massive impact.
But here is a decent guide on latency tuning I found the other day,
though virtual environments are not covered in much detail:

https://rigtorp.se/low-latency-guide/
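To put the 500-1000 us figure above in perspective, here is a
back-of-envelope sketch. The assumptions are mine, not a measurement:
10 GbE at line rate with minimum-size 64 B frames, i.e. 84 B on the
wire once the preamble and inter-frame gap are counted:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        const double link_bps = 10e9;               /* 10 GbE */
        const double wire_bytes = 64 + 20;          /* frame + preamble/IFG */
        const double pps = link_bps / (wire_bytes * 8.0); /* ~14.88 Mpps */
        double stall_us;

        /* packets that must sit in the RX ring while the polling
         * thread is not running */
        for (stall_us = 250; stall_us <= 1000; stall_us *= 2) {
            uint32_t pkts = (uint32_t)(pps * stall_us / 1e6);
            printf("%4.0f us stall -> %" PRIu32 " packets to buffer\n",
                   stall_us, pkts);
        }
        return 0;
    }

A 1000 us stall at line rate means roughly 14880 packets in flight,
and even the 250 us case (~3720 packets) is close to the 4096
descriptors per queue that several PMDs cap out at today - which is
exactly what this patch is trying to lift.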
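On the application side the ring size is bounded by whatever the PMD
reports, so asking for more does not help by itself. A minimal sketch
of discovering the limit and requesting the largest ring the driver
allows (port_id and mbuf_pool are assumed to be initialized elsewhere,
and the port already configured with rte_eth_dev_configure()):

    #include <stdint.h>
    #include <stdio.h>
    #include <rte_ethdev.h>

    static int
    setup_big_rx_ring(uint16_t port_id, struct rte_mempool *mbuf_pool)
    {
        struct rte_eth_dev_info dev_info;
        uint16_t nb_rxd = UINT16_MAX;   /* wish for as much as possible */
        uint16_t nb_txd = 4096;
        int ret;

        ret = rte_eth_dev_info_get(port_id, &dev_info);
        if (ret != 0)
            return ret;
        printf("driver caps the RX ring at %u descriptors\n",
               dev_info.rx_desc_lim.nb_max);

        /* Clamps nb_rxd/nb_txd to the driver's [nb_min, nb_max] range
         * and alignment; afterwards nb_rxd holds what we really get. */
        ret = rte_eth_dev_adjust_nb_rx_tx_desc(port_id, &nb_rxd, &nb_txd);
        if (ret != 0)
            return ret;

        return rte_eth_rx_queue_setup(port_id, 0, nb_rxd,
                                      rte_eth_dev_socket_id(port_id),
                                      NULL, mbuf_pool);
    }

Note that rte_eth_dev_adjust_nb_rx_tx_desc() silently clamps the
request to the PMD's advertised maximum, so the limit really has to be
raised inside the drivers, as the patch does.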
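As for the isolation question: I cannot promise this *guarantees*
anything (Morten's point stands that a hard guarantee is the tricky
part), but the usual recipe combines kernel boot parameters with EAL
core pinning. The core numbers below are only an example:

    # kernel command line (e.g. via GRUB): keep the scheduler, the
    # timer tick and RCU callbacks away from cores 2-7
    isolcpus=2-7 nohz_full=2-7 rcu_nocbs=2-7

    # pin the DPDK lcores to the isolated cores
    dpdk-testpmd -l 2-7 --main-lcore 2

Even with all three parameters, bound kernel threads (e.g. per-CPU
kworkers) can still briefly run on the isolated cores, so in my
understanding this shortens the scheduling-latency tail rather than
eliminating it.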