Date: Fri, 17 Sep 2021 08:51:58 -0700
From: Stephen Hemminger
To: Jared Brown
Cc: users@dpdk.org
Message-ID: <20210917085158.407f7bd4@hermes.local>
Subject: Re: [dpdk-users] DPDK CPU selection criteria?

On Fri, 17 Sep 2021 17:23:49 +0200
Jared Brown wrote:

> Hello everybody!
>
> Is there some canonical resource, or at least a recommendation, on how to evaluate different CPUs for suitability for use with DPDK?
>
> My use case is software routers (for example DANOS, 6WIND, TNSR), so I am mainly interested in forwarding performance in terms of Mpps.
>
> What I am looking for is to develop some kind of heuristic to evaluate CPUs in terms of $/Mpps without having to purchase hundreds of SKUs and run tests on them.
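For a first-order $/Mpps heuristic, the per-packet cycle budget is easy to work out per SKU before buying anything. A back-of-the-envelope sketch (assuming 20 bytes of Ethernet framing overhead per frame for preamble, SFD and inter-frame gap, a single core, and illustrative link/clock numbers rather than measurements):

/* Back-of-the-envelope cycle budget per packet at line rate.
 * The constants below are illustrative assumptions, not measurements.
 */
#include <stdio.h>

int main(void)
{
    const double link_bps   = 10e9;   /* 10 Gbit/s link */
    const double frame_size = 64;     /* smallest Ethernet frame, bytes */
    const double overhead   = 20;     /* preamble + SFD + inter-frame gap */
    const double cpu_hz     = 3.3e9;  /* core clock, e.g. 3.3 GHz */

    double pps    = link_bps / ((frame_size + overhead) * 8.0);
    double cycles = cpu_hz / pps;

    printf("line rate: %.2f Mpps\n", pps / 1e6);        /* ~14.88 Mpps */
    printf("budget:    %.0f cycles/packet\n", cycles);  /* ~222 cycles */
    return 0;
}

At 10G/64B that gives ~14.88 Mpps and ~222 cycles per packet at 3.3 GHz, which is consistent with the line-rate result you cite from [1]; at 100G the budget shrinks to roughly 22 cycles per packet per core at the same clock, which is where cache latency, DDIO and vector instructions start to dominate.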
> The official DPDK documentation[0] states thus:
>
> "7.1. Hardware and Memory Requirements
>
> For best performance use an Intel Xeon class server system such as Ivy Bridge, Haswell or newer."
>
> This is somewhat... vague.

Intentionally; in fact, any reference to any one CPU vendor should be removed from the website and documentation.

> I suppose one could take [1] as a baseline, which states on page 2 that an Ivy Bridge Xeon E3-1230 V2 is able to forward unidirectional flows at line rate using 10G NICs at all frequencies above 1.6 GHz, and bidirectional flows at line rate using 10G NICs at 3.3 GHz.
>
> This however pales compared with [2], which on page 23 shows that a 3rd Generation Scalable Xeon 8380 manages to very nearly saturate a 100G NIC at all packet sizes.
>
> As there is almost an order of magnitude difference in forwarding performance per core, you can perhaps understand that I am somewhat at a loss when trying to gauge the performance of a particular CPU model.
>
> Reading [3] one learns that several aspects of the CPU affect forwarding performance, but very little light is shed on how much each feature contributes on its own. On page 172 one learns that CPU frequency has a linear impact on performance. This is borne out by [1], but it does not take into account the inter-generational gaps witnessed by [2].
>
> This raises the question: what are those inter-generational differences made of?
>
> - L3 cache latency (p. 54) as an upper limit on Mpps. Do newer generations have decidedly lower cache latencies, and is this the defining performance factor?
>
> - Direct Data I/O (p. 69)? Is DDIO combined with lower L3 cache latency a major force multiplier? Or is prefetch sufficient to keep caches hot? This is somewhat confusing, as [3] states on page 62 that DPDK can get one core to handle up to 33 Mpps, on average. On one hand this is the performance [1] demonstrated the better part of a decade earlier, but on the other hand [2] demonstrates an order of magnitude larger performance per core.
>
> - New instructions? On page 171 [3] notes that an AVX-512 instruction can move 64 bytes per cycle, which [2] indicates (on page 22) has an almost 30% effect on Mpps. How important is Transactional Synchronization Extensions (TSX) support (see page 119 of [3]) for forwarding performance?
>
> - Other factors are mentioned, such as memory frequency, memory size, memory channels and cache sizes, but nothing is said about how each of these affects forwarding performance in terms of Mpps. The official documentation [0] only states that: "Ensure that each memory channel has at least one memory DIMM inserted, and that the memory size for each is at least 4GB. Note: this has one of the most direct effects on performance."
>
> - Turbo Boost and hyper-threading? Are these supposed to be enabled or disabled? I am getting conflicting information. Results listed in [2] show increased Mpps from enabling them, but [1] notes that they were disabled because they introduce measurement artifacts. I recall some documentation recommending disabling them, since enabling increases latency and variance.
>
> - Xeon D, W, E3, E5, E7 and Scalable. Are these different processor siblings observably different from each other from the perspective of DPDK? Atoms certainly are, as [3] notes on page 57, because they only perform at 50% compared to an equivalent Xeon core. A reason isn't given, but perhaps it is due to the missing L3 cache?
>
> - Something entirely else?
> Am I missing something completely obvious that explains the inter-generational differences between CPUs in terms of forwarding performance?

Also, any performance figure is application dependent. If you have an application that is very well tuned, then some of this matters; for other applications which have other issues (syscalls, locks, a huge cache footprint), none of this matters.

> So, given all this, how can I perform the mundane task of comparing, for example, the Xeon W-1250P with the Xeon W-1350P?
>
> The 1250 is older, but has a larger L2 cache and a higher frequency.
>
> The 1350 is newer, uses faster memory, has higher maximum memory bandwidth, PCIe 4.0, more PCIe lanes and AVX-512.
>
> Or any other CPU model comparison, for that matter?
>
> - Jared
>
> [0] https://doc.dpdk.org/guides-16.04/linux_gsg/nic_perf_intel_platform.html
> [1] https://www.net.in.tum.de/fileadmin/bibtex/publications/papers/ICN2015.pdf
> [2] http://fast.dpdk.org/doc/perf/DPDK_21_05_Intel_NIC_performance_report.pdf
> [3] https://www.routledge.com/Data-Plane-Development-Kit-DPDK-A-Software-Optimization-Guide-to-the/Zhu/p/book/9780367373955
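On the DDIO vs. prefetch question above: DDIO only lands incoming DMA data in the L3 slice; whether that is "enough" depends on the application touching the packet soon after. The usual way DPDK applications keep caches hot is to prefetch a few packets ahead inside the RX burst loop, as the l3fwd example does. A rough sketch of that pattern (forward_burst is just an illustration, not code from any of the products you mention; port/queue setup is omitted and the offset constant is only a typical value):

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_prefetch.h>

#define BURST_SIZE      32
#define PREFETCH_OFFSET 3   /* how far ahead to prefetch; tune per platform */

static void
forward_burst(uint16_t rx_port, uint16_t tx_port)
{
    struct rte_mbuf *pkts[BURST_SIZE];
    uint16_t nb_rx = rte_eth_rx_burst(rx_port, 0, pkts, BURST_SIZE);
    uint16_t i;

    /* Warm the cache for the first few packets. */
    for (i = 0; i < PREFETCH_OFFSET && i < nb_rx; i++)
        rte_prefetch0(rte_pktmbuf_mtod(pkts[i], void *));

    for (i = 0; i < nb_rx; i++) {
        /* Prefetch the packet we will process a few iterations from now. */
        if (i + PREFETCH_OFFSET < nb_rx)
            rte_prefetch0(rte_pktmbuf_mtod(pkts[i + PREFETCH_OFFSET],
                                           void *));
        /* ...parse headers, look up next hop, rewrite MACs... */
    }

    uint16_t nb_tx = rte_eth_tx_burst(tx_port, 0, pkts, nb_rx);
    for (i = nb_tx; i < nb_rx; i++)   /* free what the TX ring did not take */
        rte_pktmbuf_free(pkts[i]);
}

Whether that hides L3 (or DRAM) latency within the per-packet cycle budget is exactly what differs between CPU generations, so in the end it still comes down to running testpmd or l3fwd on the hardware you actually care about.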