From: Thomas Monjalon
To: Steffen Weise
Cc: users@dpdk.org, Filip Janiszewski
Subject: Re: [dpdk-users] MLX ConnectX-4 Discarding packets
Date: Wed, 29 Sep 2021 12:43:51 +0200

Great, thanks for the update!

12/09/2021 11:32, Filip Janiszewski:
> Alright, nailed it down to a wrong preferred PCIe device in the BIOS
> configuration; it had not been changed after the NIC was moved to
> another PCIe slot.
>
> Now the EPYC is going really great, getting 100Gbps rate easily.
>
> Thanks
>
> On 9/11/21 4:34 PM, Filip Janiszewski wrote:
> > I just wanted to add that, while running the same exact testpmd on the other
> > machine, I don't get a single miss with the same pattern of traffic:
> >
> > .
> > testpmd> stop
> > Telling cores to stop...
> > Waiting for lcores to finish...
> >
> >   ------- Forward Stats for RX Port= 0/Queue= 0 -> TX Port= 0/Queue= 0 -------
> >   RX-packets: 61711939       TX-packets: 0              TX-dropped: 0
> >
> >   ------- Forward Stats for RX Port= 0/Queue= 1 -> TX Port= 0/Queue= 1 -------
> >   RX-packets: 62889424       TX-packets: 0              TX-dropped: 0
> >
> >   ------- Forward Stats for RX Port= 0/Queue= 2 -> TX Port= 0/Queue= 2 -------
> >   RX-packets: 61914199       TX-packets: 0              TX-dropped: 0
> >
> >   ------- Forward Stats for RX Port= 0/Queue= 3 -> TX Port= 0/Queue= 3 -------
> >   RX-packets: 63484438       TX-packets: 0              TX-dropped: 0
> >
> >   ---------------------- Forward statistics for port 0 ----------------------
> >   RX-packets: 250000000      RX-dropped: 0             RX-total: 250000000
> >   TX-packets: 0              TX-dropped: 0             TX-total: 0
> >   ----------------------------------------------------------------------------
> >
> >   +++++++++++++++ Accumulated forward statistics for all ports +++++++++++++++
> >   RX-packets: 250000000      RX-dropped: 0             RX-total: 250000000
> >   TX-packets: 0              TX-dropped: 0             TX-total: 0
> >   ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > .
> >
> > In the lab I have the EPYC connected directly to the Xeon using a 100GbE
> > link, both on the same RHEL 8.4 and the same DPDK 21.02, running:
> >
> > .
> > ./dpdk-testpmd -l 21-31 -n 8 -w 81:00.1 -- -i --rxq=4 --txq=4
> > --burst=64 --forward-mode=rxonly --rss-ip --total-num-mbufs=4194304
> > --nb-cores=4
> > .
> >
> > and sending from the other end with pktgen, the EPYC loses tons of
> > packets (see my previous email) while the Xeon doesn't lose anything.
> >
> > *Confusion!*
> >
> > On 9/11/21 4:19 PM, Filip Janiszewski wrote:
> >> Thanks,
> >>
> >> I knew that document and we've implemented many of those settings/rules,
> >> but perhaps there's one crucial one I've forgotten? I wonder which one.
> >>
> >> Anyway, increasing the number of queues hurts the performance: while
> >> sending 250M packets over a 100GbE link to an Intel 810-cqda2 NIC
> >> mounted on the EPYC Milan server, I see:
> >>
> >> .
> >> 1 queue,  30Gbps, ~45Mpps, 64B frame = imiss: 54,590,111
> >> 2 queues, 30Gbps, ~45Mpps, 64B frame = imiss: 79,394,138
> >> 4 queues, 30Gbps, ~45Mpps, 64B frame = imiss: 87,414,030
> >> .
> >>
> >> This is with DPDK 21.02 on RHEL 8.4. I can't observe this situation while
> >> capturing on my Intel server, where increasing the queues leads to
> >> better performance (while with the test input set I drop with one queue,
> >> I no longer drop with 2 on the Intel server.)
> >>
> >> A customer with a brand new EPYC Milan server in his lab observed this
> >> scenario as well, which is a bit of a worry, but again it might be some
> >> config/compilation issue we need to deal with?
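
[Editor's note: since the root cause reported at the top of this thread was a stale "preferred PCIe device" BIOS setting after the NIC was moved to another slot, a quick sanity check of the slot the port actually negotiated, and of its NUMA placement, can catch this class of problem early. A minimal sketch, assuming the PCI address 81:00.1 from the testpmd command above, PCI domain 0000, and root privileges for the full lspci capability dump:

.
# Negotiated PCIe link speed/width (LnkSta) vs. what the slot supports (LnkCap)
lspci -s 81:00.1 -vv | grep -iE 'lnkcap|lnksta'

# NUMA node the NIC hangs off; the capture lcores should sit on this node
cat /sys/bus/pci/devices/0000:81:00.1/numa_node

# CPU-to-NUMA-node layout, to pick the -l core list accordingly
lscpu | grep -i 'numa node'
.

If LnkSta reports a lower speed or narrower width than LnkCap (e.g. 8GT/s x8 instead of 16GT/s x16), the slot or BIOS configuration, not DPDK, is the bottleneck.]
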
> >>
> >> BTW, the same issue can be reproduced with testpmd, using 4 queues and
> >> the same input data set (250M of 64-byte frames at 30Gbps):
> >>
> >> .
> >> testpmd> stop
> >> Telling cores to stop...
> >> Waiting for lcores to finish...
> >>
> >>   ------- Forward Stats for RX Port= 0/Queue= 0 -> TX Port= 0/Queue= 0 -------
> >>   RX-packets: 41762999       TX-packets: 0              TX-dropped: 0
> >>
> >>   ------- Forward Stats for RX Port= 0/Queue= 1 -> TX Port= 0/Queue= 1 -------
> >>   RX-packets: 40152306       TX-packets: 0              TX-dropped: 0
> >>
> >>   ------- Forward Stats for RX Port= 0/Queue= 2 -> TX Port= 0/Queue= 2 -------
> >>   RX-packets: 41153402       TX-packets: 0              TX-dropped: 0
> >>
> >>   ------- Forward Stats for RX Port= 0/Queue= 3 -> TX Port= 0/Queue= 3 -------
> >>   RX-packets: 38341370       TX-packets: 0              TX-dropped: 0
> >>
> >>   ---------------------- Forward statistics for port 0 ----------------------
> >>   RX-packets: 161410077      RX-dropped: 88589923      RX-total: 250000000
> >>   TX-packets: 0              TX-dropped: 0             TX-total: 0
> >>   ----------------------------------------------------------------------------
> >> .
> >>
> >> .
> >> testpmd> show port xstats 0
> >> ###### NIC extended statistics for port 0
> >> rx_good_packets: 161410081
> >> tx_good_packets: 0
> >> rx_good_bytes: 9684605284
> >> tx_good_bytes: 0
> >> rx_missed_errors: 88589923
> >> .
> >>
> >> I can't figure out what's wrong here..
> >>
> >>
> >> On 9/11/21 12:20 PM, Steffen Weise wrote:
> >>> Hi Filip,
> >>>
> >>> I have not seen the same issues.
> >>> Are you aware of this tuning guide? I applied it and had no issues with
> >>> an Intel 100G NIC.
> >>>
> >>> HPC Tuning Guide for AMD EPYC Processors
> >>> http://developer.amd.com/wp-content/resources/56420.pdf
> >>>
> >>> Hope it helps.
> >>>
> >>> Cheers,
> >>> Steffen Weise
> >>>
> >>>
> >>>> On 11.09.2021 at 10:56, Filip Janiszewski wrote:
> >>>>
> >>>> I ran more tests.
> >>>>
> >>>> This AMD server is a bit confusing: I can tune it to capture 28Mpps
> >>>> (64-byte frames) from one single core, so I would assume that using one
> >>>> more core would at least increase the capture capability a bit, but it
> >>>> doesn't: 1% more speed, and it drops regardless of how many queues are
> >>>> configured. I've not observed this situation on the Intel server, where
> >>>> adding more queues/cores scales to higher throughput.
> >>>>
> >>>> This issue has now been verified with both Mellanox and Intel (810
> >>>> series, 100GbE) NICs.
> >>>>
> >>>> Has anybody encountered anything similar?
> >>>>
> >>>> Thanks
> >>>>
> >>>> On 9/10/21 3:34 PM, Filip Janiszewski wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I've switched a 100GbE MLX ConnectX-4 card from an Intel Xeon server to
> >>>>> an AMD EPYC server (running a 75F3 CPU, 256GiB of RAM and PCIe4 lanes),
> >>>>> and using the same capture software we can't get any faster than 10Gbps:
> >>>>> when exceeding that speed, regardless of the number of queues configured,
> >>>>> the rx_discards_phy counter starts to rise and packets are lost in huge
> >>>>> amounts.
> >>>>>
> >>>>> On the Xeon machine, I was able to get easily to 50Gbps with 4 queues.
> >>>>>
> >>>>> Is there any specific DPDK configuration that we might want to set up for
> >>>>> those AMD servers? The software is DPDK based, so I wonder if some build
> >>>>> option is missing somewhere.
> >>>>>
> >>>>> What else might I want to look at to investigate this issue?
> >>>>>
> >>>>> Thanks
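
[Editor's note: for the rx_discards_phy symptom in the original report, it can help to watch the port-level counters on the kernel netdev side while the DPDK capture runs; the mlx5 PMD works alongside the kernel driver, so ethtool statistics remain available for the ConnectX-4. A rough sketch, where the interface name enp129s0f1 is a placeholder for the ConnectX-4 port under test and the counter names are the usual mlx5 ones (availability may vary by driver/firmware):

.
# rx_discards_phy: packets dropped by the port itself, typically back-pressure
# from the PCIe path or internal buffers.
# rx_out_of_buffer: packets dropped because receive buffers were not posted
# fast enough by the host.
watch -n 1 "ethtool -S enp129s0f1 | grep -E 'rx_discards_phy|rx_out_of_buffer'"
.

If rx_discards_phy climbs while rx_out_of_buffer stays flat, the bottleneck is in front of the host (PCIe link, slot placement, NUMA-remote DMA) rather than in the application's polling, which is consistent with the BIOS preferred-PCIe-device fix that resolved this thread.]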