From: Juraj Linkeš
Date: Mon, 6 Feb 2023 09:52:56 +0100
Subject: Re: Testpmd/l3fwd port shutdown failure on Arm Altra systems
To: aman.deep.singh@intel.com, yuying.zhang@intel.com, "Xing, Beilei"
Cc: dev@dpdk.org, Ruifeng Wang, Lijian Zhang, Honnappa Nagarahalli

Hello i40e and testpmd maintainers,

A gentle reminder - would you please advise how to debug the issue described below?

Thanks,
Juraj

On Fri, Jan 20, 2023 at 1:07 PM Juraj Linkeš wrote:
>
> Adding the logfile.
>
> One thing that's in the logs but that I didn't explicitly mention is the DPDK version we've tried this with:
>
> EAL: RTE Version: 'DPDK 22.07.0'
>
> We also tried earlier versions going back to 21.08, with no luck. I also did a quick check on 22.11, also with no luck.
>
> Juraj
>
> From: Juraj Linkeš
> Sent: Friday, January 20, 2023 12:56 PM
> To: 'aman.deep.singh@intel.com'; 'yuying.zhang@intel.com'; Xing, Beilei
> Cc: dev@dpdk.org; Ruifeng Wang; 'Lijian Zhang'; 'Honnappa Nagarahalli'
> Subject: Testpmd/l3fwd port shutdown failure on Arm Altra systems
>
> Hello i40e and testpmd maintainers,
>
> We're hitting an issue with DPDK testpmd on Ampere Altra servers in the FD.io lab.
>
> A bit of background: along with VPP performance tests (VPP uses DPDK), we're running a small number of basic DPDK testpmd and l3fwd tests in FD.io as well. This is to catch any performance differences due to VPP updating its DPDK version.
>
> We're running both l3fwd tests and testpmd tests. The Altra servers are two-socket and the topology is TG -> DUT1 -> DUT2 -> TG. Traffic flows in both directions, but nothing gets forwarded (with a slight caveat - put a pin in this). There's nothing special in the tests, just forwarding traffic. The NIC we're testing is xl710-QDA2.
>
> The same tests are passing on all other testbeds - we have various two-node (1 DUT, 1 TG) and three-node (2 DUT, 1 TG) Intel and Arm testbeds with various NICs (Intel 700 and 800 series, and the Intel testbeds use some Mellanox NICs as well). We don't have quite the same combination of another three-node topology with the same NIC though, so it looks like something with testpmd/l3fwd and xl710-QDA2 on Altra servers.
>
> VPP performance tests are passing, but l3fwd and testpmd fail. This leads us to believe it's a software issue, but there could be something wrong with the hardware. I'll talk about testpmd from now on, but as far as we can tell, the behavior is the same for testpmd and l3fwd.
>
> Getting back to the caveat mentioned earlier, there seems to be something wrong with port shutdown.
> When running testpmd on a testbed that hasn't been used for a while, it seems that all ports are up right away (we don't see any "Port 0|1: link state change event") and the setup works fine (forwarding works). After restarting testpmd (restarting on one server is sufficient), the ports between DUT1 and DUT2 (but not between DUTs and TG) go down and are not usable in DPDK, VPP or in Linux (with i40e kernel driver) for a while (measured in minutes, sometimes dozens of minutes; the duration is seemingly random). The ports eventually recover and can be used again, but there's nothing in syslog suggesting what happened.
>
> What seems to be happening is that testpmd puts the ports into some faulty state. This only happens on the DUT1 -> DUT2 link though (the ports between the two testpmds), not on the TG -> DUT1 link (the TG port is left alone).
>
> Some more info:
>
> We've come across the issue with this configuration:
> OS: Ubuntu 20.04 with kernel 5.4.0-65-generic.
> Old NIC firmware, never upgraded: 6.01 0x800035da 1.1747.0.
> Driver versions: i40e 2.17.15 and iavf 4.3.19.
>
> As well as with this configuration:
> OS: Ubuntu 22.04 with kernel 5.15.0-46-generic.
> Updated firmware: 8.30 0x8000a4ae 1.2926.0.
> Drivers: i40e 2.19.3 and iavf 4.5.3.
>
> Unsafe noiommu mode is disabled:
> cat /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
> N
>
> We used DPDK 22.07 in manual testing and built it on the DUTs, using the generic build:
> meson -Dexamples=l3fwd -Dc_args=-DRTE_LIBRTE_I40E_16BYTE_RX_DESC=y -Dplatform=generic build
>
> We're running testpmd with this command:
> sudo build/app/dpdk-testpmd -v -l 1,2 -a 0004:04:00.1 -a 0004:04:00.0 --in-memory -- -i --forward-mode=io --burst=64 --txq=1 --rxq=1 --tx-offloads=0x0 --numa --auto-start --total-num-mbufs=32768 --nb-ports=2 --portmask=0x3 --max-pkt-len=1518 --mbuf-size=16384 --nb-cores=1
>
> And l3fwd (with different MACs on the other server):
> sudo /tmp/openvpp-testing/dpdk/build/examples/dpdk-l3fwd -v -l 1,2 -a 0004:04:00.0 -a 0004:04:00.1 --in-memory -- --parse-ptype --eth-dest="0,40:a6:b7:85:e7:79" --eth-dest="1,3c:fd:fe:c3:e7:a1" --config="(0, 0, 2),(1, 0, 2)" -P -L -p 0x3
>
> We tried adding logs with --log-level=pmd,debug and --no-lsc-interrupt, but that didn't reveal anything helpful, as far as we can tell - please have a look at the attached log. The faulty port is port0 (it starts out as down, then we waited for around 25 minutes for it to go up and then we shut down testpmd).
>
> We'd like to ask for pointers on what could be the cause or how to debug this issue further.
>
> Thanks,
> Juraj
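
One way to put numbers on the "minutes, sometimes dozens of minutes" recovery time described above is to log link-state transitions with timestamps while the port is bound to the i40e kernel driver. The following is only a minimal sketch under assumptions that are not part of the original report: the helper names link_state and watch_link are made up for illustration, the port is assumed to be visible as a netdev under /sys/class/net, and it is assumed to be administratively up so that operstate reflects the link.

```shell
#!/usr/bin/env bash
# Sketch: log link-state transitions of a kernel netdev with timestamps.
# Assumption: the port is bound to the i40e kernel driver, so it shows up
# under /sys/class/net. All function names here are hypothetical helpers.

# Print the current operstate ("up", "down", ...) of interface $1.
# $2 optionally overrides the sysfs root (useful for dry runs).
link_state() {
    local iface=$1 root=${2:-/sys/class/net}
    cat "$root/$iface/operstate" 2>/dev/null || echo "missing"
}

# Poll interface $1 every $2 seconds (default 1) and print one line each
# time the state changes. $3 limits the number of polls (0 = forever).
# $4 optionally overrides the sysfs root.
watch_link() {
    local iface=$1 interval=${2:-1} max_polls=${3:-0} root=${4:-/sys/class/net}
    local prev="" cur polls=0
    while :; do
        cur=$(link_state "$iface" "$root")
        if [ "$cur" != "$prev" ]; then
            # Timestamp, interface, previous state -> new state.
            printf '%s %s %s -> %s\n' \
                "$(date '+%Y-%m-%dT%H:%M:%S%z')" "$iface" "${prev:-start}" "$cur"
            prev=$cur
        fi
        polls=$((polls + 1))
        if [ "$max_polls" -gt 0 ] && [ "$polls" -ge "$max_polls" ]; then
            break
        fi
        sleep "$interval"
    done
}
```

After sourcing the file, something like `watch_link <iface> 1 | tee link.log` (with `<iface>` replaced by the actual netdev name of the DUT1 -> DUT2 port) could be left running across a testpmd restart; the gap between the "-> down" and "-> up" lines then gives the recovery time, which can be compared against syslog timestamps.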