From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <970db217-9864-43c4-80b8-c5ce2203b4c6@oktetlabs.ru>
Date: Fri, 1 Dec 2023 17:39:58 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: Setting up DPDK PMD Test Suite
From: Andrew Rybchenko
To: Adam Hassick
Cc: Konstantin Ushakov, Patrick Robb, ci@dpdk.org
References: <1f53aade-73a7-baaf-aecb-2b9a33ab6682@oktetlabs.ru> <4979713b-8e5f-417b-b1af-21f54130eab7@oktetlabs.ru> <8e26b8e6-8d8b-4925-9b30-3fbc5e103e18@oktetlabs.ru> <0cd6a9fb-bc39-4b69-be5d-3470e2374016@oktetlabs.ru> <276e8fb3-b185-4434-aca5-4629c5ff8ad1@oktetlabs.ru>
In-Reply-To: <276e8fb3-b185-4434-aca5-4629c5ff8ad1@oktetlabs.ru>
Content-Language: en-US
List-Id: DPDK CI discussions
Content-Type: multipart/alternative; boundary="------------4CLmft8vw2B0Erm8KwIYUoH6"

This is a multi-part message in MIME format.
--------------4CLmft8vw2B0Erm8KwIYUoH6
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Hi Adam,

I have no good ideas about the problem with LLDP packets. I've tried various things in an attempt to reproduce the problem, without any luck. Since these packets come from the Peer/Tester (based on the source MAC address), I think it would make sense to check the ethtool private flags while the tests are running to see the FW LLDP status. Maybe it is enabled and the testing does not notice it (in theory the configuration is synced and checked to match after each test, so it should not be a problem). I think it would be useful to double-check this on all interfaces of the NIC.
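
A minimal sketch of such a check, under stated assumptions: the flag name "disable-fw-lldp" is the one mentioned later in this thread, the "name : on/off" layout of the `ethtool --show-priv-flags` output is assumed rather than taken from the thread, and the function names and interface names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of TE or dpdk-ethdev-ts.

# Read `ethtool --show-priv-flags <iface>` output from stdin and print the
# value of the disable-fw-lldp flag ("on"/"off"), or "unknown" if the flag
# is not listed. Assumes lines of the form "flag-name   : value".
fw_lldp_flag() {
    awk -F: '
        $1 ~ /^[ \t]*disable-fw-lldp[ \t]*$/ {
            gsub(/[ \t\r]/, "", $2)
            print $2
            found = 1
            exit
        }
        END { if (!found) print "unknown" }
    '
}

# Poll the given interfaces once per second and log the flag value with a
# timestamp, so a change during a test run becomes visible. Runs until
# interrupted. Only works for interfaces still bound to a kernel driver.
watch_fw_lldp() {
    local iface
    while sleep 1; do
        for iface in "$@"; do
            printf '%s %s disable-fw-lldp=%s\n' "$(date +%T)" "$iface" \
                "$(ethtool --show-priv-flags "$iface" | fw_lldp_flag)"
        done
    done
}
```

For example, `watch_fw_lldp eth0 eth1` (placeholder names) could be left running on the Tester in a second terminal for the duration of a test session; a flag value of "off" at any point would mean FW LLDP was re-enabled.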

Have you made any progress with runs on ARM DUTs? Do they work?

Please let me know if you need any help or if something is blocking this from moving forward and being used on a regular basis.

Andrew.

On 11/20/23 20:18, Andrew Rybchenko wrote:
> Hi Adam,
>
> On 11/16/23 23:03, Adam Hassick wrote:
>> Hi Andrew,
>>
>>> If you use a copy of dpdk-ethdev-ts that has 398e272495143884274f5a53c6fe0cc16df41052, you no longer need to pass --trc-tag=pci-8086-1572, since the corresponding changeset updates the expectations to be the same for pci-8086-1583.
>>
>> I'll try this for the next run.
>>
>>> Sorry, but I've failed to find what's wrong there.
>>
>> That if statement works if it uses the traditional single-bracket conditional; otherwise it needs to be rewritten as "[[ -z "${test_log}" ]] || [[ ! -r "${test_log}" ]]". The latter is the change I made, but both work.
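
The two working forms described above can be sketched as follows. The original line 42 of the publish script is not quoted in this thread, so this is an illustration, not the actual script: the function names are hypothetical, and the use of `-o` in the single-bracket form is an assumption. One plausible cause of such an error (again an assumption) is `-o` used inside `[[ ]]`, which bash rejects because `[[ ]]` only accepts `||` and `&&`.

```bash
#!/usr/bin/env bash
# Sketch only: function names are hypothetical.

# Form 1: two [[ ]] tests joined with ||, as quoted above.
# Succeeds (exit 0) when the path is empty or the file is not readable.
log_unusable() {
    local test_log=$1
    [[ -z "${test_log}" ]] || [[ ! -r "${test_log}" ]]
}

# Form 2: traditional single-bracket test; here -o is a valid "or" operator.
# (Assumed shape; the thread only says the single-bracket form works.)
log_unusable_single() {
    local test_log=$1
    [ -z "${test_log}" -o ! -r "${test_log}" ]
}

# Not valid bash, shown as a comment only:
#   [[ -z "${test_log}" -o ! -r "${test_log}" ]]
```

Both functions return success for an empty path or an unreadable file and failure for a readable file, so either can guard an early exit such as `if log_unusable "${test_log}"; then exit 1; fi`.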

> Thanks a lot. Hopefully fixed.
>
>>
>>> As far as I can see, LLDP packets spoil the test results:
>>> https://ts-factory.io/bublik/v2/log/362398?focusId=362760&mode=treeAndinfoAndlog&experimental=true&lineNumber=1_63
>>>
>>> As far as I can see, the main prologue disables FW LLDP on the Tester
>>> https://ts-factory.io/bublik/v2/log/362398?focusId=362400&mode=treeAndinfoAndlog&experimental=true&lineNumber=1_80
>>> but I guess it could still be enabled on the DUT side, and as far as I know DPDK does not provide a means to disable it. I vaguely remember that Intel provides FW configuration tools which can do it.
>>> It is interesting that DPDK gets unexpected LLDP packets, but maybe packets sent by the FW go via loopback and are visible to the PF as well.
>>> Another possible source of LLDP packets is a switch, if the NICs are connected via a switch. If so, LLDP should be disabled on the corresponding switch ports.
>>>
>>> As far as I can see, fixing the problem should bring the results much closer. However, I already see some differences in behaviour which should simply be fixed in TRC. For example, X710 gets 9 packets fewer than the configured number of Rx descriptors, but XL710 gets 10 packets fewer.
>>
>> I have the "disable-fw-lldp" private flag set on both of the XL710 ports on the DUT machine. It is very strange that LLDP packets are still appearing there.
>
> Me too. The corresponding packet has a source MAC from the Peer/Tester machine NIC.
> It is really strange, since the prologue disabled LLDP there as well. I'll try to play with it more locally, but in fact I have no good ideas.
>
>> These systems are not connected to any switch, so maybe a service on the DUT itself is sending them. I'm not sure how that could be happening though, because I don't have the LLDP daemon installed on either system.
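
One way to narrow down where the LLDP frames originate is to capture them on a kernel-managed interface and tally their source MACs. The sketch below is an illustration under assumptions: LLDP uses EtherType 0x88cc, the line layout of `tcpdump -e -n` output (timestamp first, then the source MAC) is assumed, and the function name and interface name are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: the function name is hypothetical.

# Read `tcpdump -e -n` text output from stdin and print every distinct
# source MAC of LLDP frames (EtherType 0x88cc) with the number of frames
# seen. Assumes the source MAC is the second whitespace-separated field.
lldp_sources() {
    awk '
        /0x88cc/ { count[$2]++ }
        END { for (mac in count) print mac, count[mac] }
    ' | sort
}

# Capture sketch (needs root; "eth0" is a placeholder, and the port must be
# bound to a kernel driver rather than to DPDK):
#   tcpdump -i eth0 -e -n -c 20 ether proto 0x88cc | lldp_sources
```

Comparing the reported MACs with the Tester and DUT port addresses would show whether the frames really come from the Tester NIC, as the source MAC in the test log suggests.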

>>> I also see that the performance tests are not run because of a failed prologue:
>>> https://ts-factory.io/bublik/v2/log/362398?focusId=369564&mode=treeAndinfoAndlog&experimental=true
>>> I'll investigate it, but I guess the source of the difference is that we always run tests on a single interface. Just add -p0 (--cfg=iol-dts-xl710-p0) to your configuration name. You don't need to change ts-rigs for it, since the suffix is handled by generic code. It simply comments out the second instance and forces only the first interface to be taken into account. Initially it was introduced to run independent tests on different ports so that the configuration could be shared, but I guess right now it has limitations for some packages, like representors, which require the entire NIC.
>>
>> I can try that and will see if it works.
>
> This problem is fixed in the fresh TE and dpdk-ethdev-ts published on GitHub.
>
> Regards,
> Andrew.
>
>>
>> Thanks,
>> Adam

>> On Wed, Nov 8, 2023 at 2:20 AM Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
>>> Hi Adam,
>>>
>>> On 11/7/23 23:30, Adam Hassick wrote:
>>>> Hi Andrew,
>>>>
>>>> The runner machine was missing a dependency for one of the scripts, "pixz". After installing that, it appears to have worked. I can see the results listed on the ts-factory Bublik instance.
>>>
>>> If you use a copy of dpdk-ethdev-ts that has 398e272495143884274f5a53c6fe0cc16df41052, you no longer need to pass --trc-tag=pci-8086-1572, since the corresponding changeset updates the expectations to be the same for pci-8086-1583.
>>>
>>>> In the latest revision of ts-rigs, there appears to be a syntax error at line 42 within the script located at "ts-rigs/scripts/publish_logs/prj/ts-factory/publish", within the if condition. I fixed it locally to get it to run.
>>>
>>> Sorry, but I've failed to find what's wrong there.
>>>
>>>> Taking a quick look at a comparison against your most recent X710 run, it looks like we're NOK on around 400 more test cases. By percentage of tests we're 1% off; however, it looks like whole subsets of the test suite that contain low numbers of tests are failing. I wonder if this is due to differences between the Intel X710 and XL710 or issues in our dev testbed.
>>>
>>> As far as I can see, LLDP packets spoil the test results:
>>> https://ts-factory.io/bublik/v2/log/362398?focusId=362760&mode=treeAndinfoAndlog&experimental=true&lineNumber=1_63
>>>
>>> As far as I can see, the main prologue disables FW LLDP on the Tester
>>> https://ts-factory.io/bublik/v2/log/362398?focusId=362400&mode=treeAndinfoAndlog&experimental=true&lineNumber=1_80
>>> but I guess it could still be enabled on the DUT side, and as far as I know DPDK does not provide a means to disable it. I vaguely remember that Intel provides FW configuration tools which can do it.
>>> It is interesting that DPDK gets unexpected LLDP packets, but maybe packets sent by the FW go via loopback and are visible to the PF as well.
>>> Another possible source of LLDP packets is a switch, if the NICs are connected via a switch. If so, LLDP should be disabled on the corresponding switch ports.
>>>
>>> As far as I can see, fixing the problem should bring the results much closer. However, I already see some differences in behaviour which should simply be fixed in TRC. For example, X710 gets 9 packets fewer than the configured number of Rx descriptors, but XL710 gets 10 packets fewer.
>>>
>>> I also see that the performance tests are not run because of a failed prologue:
>>> https://ts-factory.io/bublik/v2/log/362398?focusId=369564&mode=treeAndinfoAndlog&experimental=true
>>> I'll investigate it, but I guess the source of the difference is that we always run tests on a single interface. Just add -p0 (--cfg=iol-dts-xl710-p0) to your configuration name. You don't need to change ts-rigs for it, since the suffix is handled by generic code. It simply comments out the second instance and forces only the first interface to be taken into account. Initially it was introduced to run independent tests on different ports so that the configuration could be shared, but I guess right now it has limitations for some packages, like representors, which require the entire NIC.
>>>
>>> Regards,
>>> Andrew.
>>>
>>>> Thanks,
>>>> Adam
>>>
>>> (dropped history, to keep mail size small)


--------------4CLmft8vw2B0Erm8KwIYUoH6--