Subject: Re: Mellanox performance degradation with more than 12 lcores
From: Дмитрий Степанов
Date: Fri, 18 Feb 2022 19:14:08 +0300
To: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Cc: users@dpdk.org
In-Reply-To: <20220218133952.3084134-1-dkozlyuk@nvidia.com>
List-Id: DPDK usage discussions

Thanks for the clarification!
I was able to get 148 Mpps with 12 lcores after some BIOS tuning.
Looks like, due to these HW limitations, I have to use a ring buffer as you suggested to support more than 32 lcores!

On Fri, 18 Feb 2022 at 16:40, Dmitry Kozlyuk <dkozlyuk@nvidia.com> wrote:
Hi,

> With more than 12 lcores, overall receive performance drops.
> With 16-32 lcores I get 100-110 Mpps,

It is more about the number of queues than the number of cores:
12 queues is the threshold at which Multi-Packet Receive Queue (MPRQ)
is automatically enabled in the mlx5 PMD.
Try increasing --rxd and check out the mprq_en device argument.
Please see the mlx5 PMD user guide for details about MPRQ.
You should be able to get the full 148 Mpps with your HW.
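
As a minimal sketch, requesting deeper Rx rings from the application side
(the API-level counterpart of testpmd's --rxd) could look like this; the
helper name, the 4096-descriptor depth, port_id, nb_rxq and mbuf_pool are
all placeholders:

#include <rte_ethdev.h>
#include <rte_mempool.h>

/* Sketch only: ask for deeper Rx rings. Call after rte_eth_dev_configure()
 * and before rte_eth_dev_start(). */
static int
setup_deep_rx_rings(uint16_t port_id, uint16_t nb_rxq,
                    struct rte_mempool *mbuf_pool)
{
        uint16_t nb_rxd = 4096, nb_txd = 4096;
        int ret;

        /* Clamp the requested ring sizes to what the device supports. */
        ret = rte_eth_dev_adjust_nb_rx_tx_desc(port_id, &nb_rxd, &nb_txd);
        if (ret != 0)
                return ret;

        for (uint16_t q = 0; q < nb_rxq; q++) {
                ret = rte_eth_rx_queue_setup(port_id, q, nb_rxd,
                                rte_eth_dev_socket_id(port_id),
                                NULL, mbuf_pool);
                if (ret != 0)
                        return ret;
        }
        return 0;
}

MPRQ itself is enabled through the mlx5 device arguments on the EAL command
line, e.g. by appending ,mprq_en=1 to the -a <PCI address> entry for the port
(the PCI address is a placeholder).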

> and I get a significant performance drop with 33 lcores - 84 Mpps.
> With 63 cores I get as low as 35 Mpps overall receive performance.
>
> Are there any limitations on the total number of receive queues (total
> lcores) that can handle a single port on a given NIC?

This is a hardware limitation.
The limit on the number of queues you can create is very high (16M),
but performance can scale perfectly only up to 32 queues
at high packet rates (as opposed to bit rates).
Using more queues can even degrade it, just as you observe.
One way to overcome this (not specific to mlx5)
is to use a ring buffer for incoming packets,
from which any number of processing cores can take packets.
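
A rough sketch of that pattern (the ring name, size, burst size and the
drop-on-overflow policy are all placeholder choices): one Rx lcore drains
the NIC queues and feeds a multi-consumer rte_ring, and any number of
worker lcores take packets from it.

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_ring.h>

#define BURST 32

/* Single-producer (the Rx lcore), multi-consumer (the workers) ring.
 * Created elsewhere, e.g.:
 *   pkt_ring = rte_ring_create("rx_ring", 16384, rte_socket_id(),
 *                              RING_F_SP_ENQ);
 */
static struct rte_ring *pkt_ring;

static int
rx_lcore(void *arg)
{
        uint16_t port_id = *(uint16_t *)arg;
        struct rte_mbuf *pkts[BURST];

        for (;;) {
                /* One queue shown; a real loop would drain each Rx queue. */
                uint16_t nb = rte_eth_rx_burst(port_id, 0, pkts, BURST);
                if (nb == 0)
                        continue;
                unsigned int enq = rte_ring_enqueue_burst(pkt_ring,
                                (void **)pkts, nb, NULL);
                /* Drop what does not fit rather than stalling the Rx path. */
                for (unsigned int i = enq; i < nb; i++)
                        rte_pktmbuf_free(pkts[i]);
        }
        return 0;
}

static int
worker_lcore(void *arg)
{
        struct rte_mbuf *pkts[BURST];
        (void)arg;

        for (;;) {
                unsigned int nb = rte_ring_dequeue_burst(pkt_ring,
                                (void **)pkts, BURST, NULL);
                for (unsigned int i = 0; i < nb; i++) {
                        /* ... actual packet processing goes here ... */
                        rte_pktmbuf_free(pkts[i]);
                }
        }
        return 0;
}

This decouples the number of processing lcores from the number of Rx queues,
so the port can stay at a queue count that still scales well while the
remaining cores (launched with rte_eal_remote_launch()) take packets from
the ring.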