From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
To: Adam Hassick <ahassick@iol.unh.edu>
Cc: Konstantin Ushakov <Konstantin.Ushakov@oktetlabs.ru>,
Patrick Robb <probb@iol.unh.edu>,
ci@dpdk.org
Subject: Re: Setting up DPDK PMD Test Suite
Date: Mon, 18 Sep 2023 18:04:18 +0300
Message-ID: <1f53aade-73a7-baaf-aecb-2b9a33ab6682@oktetlabs.ru>
In-Reply-To: <CAC-YWqgjU5ja3h4V0ewU5Fh8-mX=DSCjjzgwK_H7-n+8kvpdOw@mail.gmail.com>
On 9/18/23 17:44, Adam Hassick wrote:
> Hi Andrew and Konstantin,
>
> Thank you for adding the tester-dial feature, this opens up the
> possibility for us to do CI integrated testing in the future.
>
> Our Mellanox pass rate is similar to yours (about 2400 passing, 4400
> failing), however our Intel pass rates are far worse.
> I will try running tests on the XL710 with the trc-tags argument set
> and see if it improves the pass rate.
> Another thing I noticed in the results you uploaded is that the
> results are tagged with vfio-pci and not i40e.
> Though in the environment dump, the driver on the test machine and the
> DUT are set to use the i40e driver. Is this important at all?
I think there is a misunderstanding here. There are two kinds of driver in
the configuration: the net driver and the so-called DPDK driver.
The net driver is the Linux kernel network device driver used on the Tester side.
The DPDK driver is the Linux kernel driver the device is bound to in order to
use it with DPDK. So, it is NOT a driver inside DPDK (drivers/net/*).
In the case of a bifurcated driver (like mlx5_core), it is the same in both
cases.
In the non-bifurcated case, the DPDK driver is some UIO driver (vfio-pci,
uio-pci-generic or igb_uio).
Some expectations depend on the UIO driver used. For example, uio-pci-generic
does not support many interrupts (used by the usecases/rx_intr test cases).
That's why we care about the corresponding TRC tag.
TE_ENV_*_DPDK_DRIVER variables should be vfio-pci in the Intel 710 case,
or uio-pci-generic if the IOMMU is turned off on the corresponding machines and
the Linux distro does not support VFIO no-IOMMU mode.
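
The selection rule described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not code from the test
environment: the function name, its parameters and the set of bifurcated
drivers are hypothetical, and the set contains only mlx5_core because that is
the one example named above.

```python
# Illustrative sketch only. BIFURCATED_NET_DRIVERS and choose_dpdk_driver()
# are hypothetical names; the set is an assumption and is not a complete
# list of bifurcated drivers (mlx5_core is the example named in the text).
BIFURCATED_NET_DRIVERS = {"mlx5_core"}


def choose_dpdk_driver(net_driver, iommu_enabled, vfio_noiommu_supported):
    """Return the kernel driver a NIC should be bound to for use with DPDK.

    net_driver             -- Linux kernel network driver of the NIC
    iommu_enabled          -- True if the IOMMU is turned on on the machine
    vfio_noiommu_supported -- True if the distro supports VFIO no-IOMMU mode
    """
    # Bifurcated case: the net driver and the DPDK driver are the same.
    if net_driver in BIFURCATED_NET_DRIVERS:
        return net_driver
    # Non-bifurcated case: vfio-pci is usable with an IOMMU, or without
    # one if the distro supports VFIO no-IOMMU mode.
    if iommu_enabled or vfio_noiommu_supported:
        return "vfio-pci"
    # Otherwise fall back to uio-pci-generic.
    return "uio-pci-generic"


if __name__ == "__main__":
    print(choose_dpdk_driver("i40e", True, False))
    print(choose_dpdk_driver("i40e", False, False))
    print(choose_dpdk_driver("mlx5_core", True, False))
```

The returned value is what would go into the matching TE_ENV_*_DPDK_DRIVER
variable; igb_uio is left out of the sketch because the text above does not
say when it should be preferred.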
Andrew.
> There isn't anything preventing us from pushing our results up to the
> existing Bublik instance running at ts-factory.io
> that I can think of at the moment.
> We will have to work out how to submit our results to your Bublik
> instance in a controlled and secure manner in that case.
> As far as I know we won't need access controls for the results
> themselves. I'll discuss this with Patrick and will let you know once
> we confirm that it's fine.
>
> Thanks,
> Adam
>
> On Mon, Sep 18, 2023 at 2:26 AM Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru> wrote:
>
> On 9/18/23 09:23, Konstantin Ushakov wrote:
>>
>> Hi Andrew,
>>
>> should we always auto-assign the tags or you don’t do it since it
>> slows down (by some seconds) the TE startup?
>>
>
> Tags are auto-assigned, but I guess they differ in Adam's case
> since the NIC is a bit different. The test below will help to understand
> whether that is the root cause of the very different expectations. If the
> pass rate is close to mine, I'll simply update the TRC database to
> share expectations between my NIC and the NIC used by Adam.
>
>> Hi Adam,
>>
>> I think I second the question from Andrew - happy to help you
>> with the triage so that we get to the same baseline. Do you have
>> a good way for us to share the logs? I.e. say upload to
>> ts-factory if we add strict permissions system so it’s not
>> publishing or any other way.
>>
>> Thanks,
>> Konstantin
>>
>>
>> On 18 Sep 2023, at 9:15, Andrew Rybchenko wrote:
>>
>> Hi Adam,
>>
>> I've uploaded fresh testing results to ts-factory.io
>> [1] to be on the same page.
>>
>> I think I know why your results and mine on Intel 710 series
>> NICs differ so much. The testing results expectations database
>> (dpdk-ethdev-ts/trc/*) is filled in in terms of TRC tags.
>> I.e. expectations depend on TRC tags discovered by helper
>> scripts when testing is started. These tags identify various
>> aspects of what is tested. Ideally, expectations should be
>> written in terms of the root cause of the expected behaviour. If
>> it is a driver expectation, the driver tag should be used. If it
>> is a HW limitation, tags with PCI IDs should be used. However,
>> it is not always easy to classify it correctly if you're not
>> involved in driver development. So, in our case,
>> expectations for Intel 710 NICs are filled in in terms of PCI
>> IDs. I guess the PCI ID differs in your case, and that's why
>> expectations filled in for my NIC do not apply to your runs.
>>
>> Just try to add the following option when you run on your
>> 710's Intel in order to mimic mine and see if it helps to
>> achieve better pass rate.
>> --trc-tag=pci-8086-1572
>>
>> BTW, fresh TE tag v1.21.0 has improved algorithm to choose
>> tests for --tester-dial option. It should have better
>> coverage now.
>>
>> Andrew.
>>
>> [1]
>> https://ts-factory.io/bublik/v2/runs?startDate=2023-09-16&finishDate=2023-09-16&runData=&runDataExpr=&page=1
>>
>> On 9/13/23 18:45, Andrew Rybchenko wrote:
>>> Hi Adam,
>>>
>>> I've pushed a new TE tag v1.20.0 which supports a new
>>> command-line option --tester-dial=NUM where NUM is from 0 to
>>> 100. It allows choosing the percentage of tests to run. If you
>>> want a stable set, you should pass --tester-random-seed=0 (or
>>> another integer). It is the first sketch and we have plans to
>>> improve it, but feedback would be welcome.
>>>
>>> > Is it needed on the tester?
>>>
>>> It is hard to say if it is strictly required for simple
>>> tests. However, it is better to update Tester as well, since
>>> performance tests run DPDK on Tester as well.
>>>
>>> > Are there any other manual setup steps for these devices
>>> that I might be missing?
>>>
>>> I don't remember anything else.
>>>
>>> I think it is better to get down to details and take a look
>>> at logs. I'm ready to help with it and explain what's
>>> happening there. Maybe it will help to understand if it is
>>> a problem with setup/configuration.
>>>
>>> Text logs are not very convenient. Ideally logs should be
>>> imported to bublik, however, manual runs do not provide all
>>> required artifacts right now (Jenkins jobs generate all
>>> required artifacts).
>>> Other option is 'tmp_raw_log' file (should be packed to make
>>> it smaller) which could be converted to various log formats.
>>> Would it be OK for you if I import your logs to bublik at
>>> ts-factory.io? Or is it a problem
>>> that it is publicly available?
>>> Would it help if we add authentication and access control there?
>>>
>>> Andrew.
>>>
>>> On 9/8/23 17:57, Adam Hassick wrote:
>>>> Hi Andrew,
>>>>
>>>> I have a couple questions about needed setup of the NICs
>>>> for the ethdev test suite.
>>>>
>>>> Our MCX5s and XL710s are failing the checkup tests. The
>>>> pass rate appears to be much worse on the XL710s (40 of 73
>>>> tests failed, 3 passed unexpectedly).
>>>>
>>>> For the XL710s, I've updated the driver and NVM versions to
>>>> match the minimum supported versions in the compatibility
>>>> matrix found on the DPDK documentation. This did not change
>>>> the failure rate much.
>>>> For the MCX5s, I've installed the latest LTS version of the
>>>> OFED bifurcated driver on the DUT. Is it needed on the tester?
>>>>
>>>> Are there any other manual setup steps for these devices
>>>> that I might be missing?
>>>>
>>>> Thanks,
>>>> Adam
>>>>
>>>> On Wed, Sep 6, 2023 at 11:00 AM Adam Hassick
>>>> <ahassick@iol.unh.edu> wrote:
>>>>
>>>> Hi Andrew,
>>>>
>>>> Yes, I copied the X710 configs to set up XL710 configs.
>>>> I changed the environment variable names from the X710
>>>> suffix to XL710 suffix in the script, and forgot to
>>>> change them in the corresponding environment file.
>>>> That fixed the issue.
>>>>
>>>> I got the checkup tests working on the XL710 now. Most
>>>> of them are failing, which leads me to believe this is
>>>> an issue with our testbed. Based on the DPDK
>>>> documentation for i40e, the firmware and driver
>>>> versions are much older than what DPDK 22.11 LTS and
>>>> main prefer, so I'll try updating those.
>>>>
>>>> For now I'm working on getting the XL710 checkup tests
>>>> passing, and will pick up getting the E810 configured
>>>> properly next. I'll let you know if I run into any more
>>>> issues in relation to the test engine.
>>>>
>>>> Thanks,
>>>> Adam
>>>>
>>>> On Wed, Sep 6, 2023 at 7:36 AM Andrew Rybchenko
>>>> <andrew.rybchenko@oktetlabs.ru> wrote:
>>>>
>>>> Hi Adam,
>>>>
>>>> On 9/5/23 18:01, Adam Hassick wrote:
>>>>> Hi Andrew,
>>>>>
>>>>> The compilation warning issue is now resolved.
>>>>> Again, thank you guys for fixing this for us. I
>>>>> can run the tests on the Mellanox CX5s again,
>>>>> however I'm running into a couple new issues with
>>>>> running the prologues on the Intel cards.
>>>>>
>>>>> When running testing on the Intel XL710s, I see
>>>>> this error appear in the log:
>>>>>
>>>>> ERROR prologue Environment LIB 14:16:13.650
>>>>> Too few networks in available configuration
>>>>> (0) in comparison with required (1)
>>>>>
>>>>>
>>>>> This seems like a trivial configuration error,
>>>>> perhaps this is something I need to set up in
>>>>> ts-rigs. I briefly searched through the examples
>>>>> there and didn't see any mention of how to set up
>>>>> a network.
>>>>> I will attach this log just in case you need more
>>>>> information.
>>>>
>>>> Unfortunately, the logs are insufficient to understand
>>>> it. I've pushed a new TE tag, v1.19.0, which adds a log
>>>> message with TE_* environment variables.
>>>> Most likely something is wrong with variables which
>>>> are used as conditions when available networks are
>>>> defined in ts-conf/cs/inc.net_cfg_pci_fns.yml:
>>>> TE_PCI_INSTANCE_IUT_TST1
>>>> TE_PCI_INSTANCE_IUT_TST1a
>>>> TE_PCI_INSTANCE_TST1a_IUT
>>>> TE_PCI_INSTANCE_TST1_IUT
>>>> My guess is that you changed the naming a bit, but a
>>>> script like ts-rigs-sample/scripts/iut.h1-x710 is
>>>> not included or not updated.
>>>>
>>>>> There is a different error when running on the
>>>>> Intel E810s. It appears to me like it starts DPDK,
>>>>> does some configuration inside DPDK and on the
>>>>> device, and then fails to bring the device back
>>>>> up. Since this error seems very non-trivial, I
>>>>> will also attach this log.
>>>>
>>>> This one is a bit simpler. A few lines after the
>>>> first ERROR in the log I see the following:
>>>> WARN RCF DPDK 13:06:00.144
>>>> ice_program_hw_rx_queue(): currently package
>>>> doesn't support RXDID (22)
>>>> ice_rx_queue_start(): fail to program RX queue 0
>>>> ice_dev_start(): fail to start Rx queue 0
>>>> Device with port_id=0 already stopped
>>>>
>>>> It is stdout/stderr from test agent which runs
>>>> DPDK. Same logs in plain format are available in
>>>> ta.DPDK file.
>>>> I'm not an expert here, but I vaguely remember that
>>>> E810 requires correct firmware and DDP to be loaded.
>>>> There is some information in
>>>> dpdk/doc/guides/nics/ice.rst.
>>>>
>>>> You can try to add --dev-args=safe-mode-support=1
>>>> command-line option described there.
>>>>
>>>> Hope it helps,
>>>> Andrew.
>>>>
>>>>>
>>>>> Thanks,
>>>>> Adam
>>>>>
>>>>> On Fri, Sep 1, 2023 at 3:59 AM Andrew Rybchenko
>>>>> <andrew.rybchenko@oktetlabs.ru> wrote:
>>>>>
>>>>> Hi Adam,
>>>>>
>>>>> On 8/31/23 22:38, Adam Hassick wrote:
>>>>>> Hi Andrew,
>>>>>>
>>>>>> I have one additional question as well: Does
>>>>>> the test engine support running tests on two
>>>>>> ARMv8 test agents?
>>>>>>
>>>>>> 1. We'll sort out warnings this week.
>>>>>> Thanks for heads up.
>>>>>>
>>>>>>
>>>>>> Great. Let me know when that's fixed.
>>>>>
>>>>> Done. We also fixed a number of warnings in TE.
>>>>> Also we fixed root test package name to be
>>>>> consistent with the repository name.
>>>>>
>>>>>> Support for old LTS branches was dropped
>>>>>> some time ago, but in the future it is
>>>>>> definitely possible to keep it for new
>>>>>> LTS branches. I think 22.11 is supported,
>>>>>> but I'm not sure about older LTS releases.
>>>>>>
>>>>>>
>>>>>> Good to know.
>>>>>>
>>>>>> 2. You can add command-line option
>>>>>> --sanity to run tests marked with
>>>>>> TEST_HARNESS_SANITY requirement (see
>>>>>> dpdk-ethdev-ts/scripts/run.sh and grep
>>>>>> TEST_HARNESS_SANITY dpdk-ethdev-ts to see
>>>>>> which tests are marked). Yes, there is a
>>>>>> space for terminology improvement here.
>>>>>> We'll do it.
>>>>>>
>>>>>
>>>>> Done. Now it is called --checkup.
>>>>>
>>>>>>
>>>>>> Also it takes a lot of time because of
>>>>>> failures and tests which wait for some
>>>>>> timeout.
>>>>>>
>>>>>>
>>>>>> That makes sense to me. We'll use the time to
>>>>>> complete tests on virtio or the Intel devices
>>>>>> as a reference for how long the tests really
>>>>>> take to complete.
>>>>>> We will explore the possibility of
>>>>>> periodically running the sanity tests for
>>>>>> patches.
>>>>>
>>>>> I'll double-check and let you know how long
>>>>> entire TS runs on Intel X710, E810, Mellanox
>>>>> CX5 and virtio net. Just to ensure that time
>>>>> observed in your case looks the same.
>>>>>
>>>>>>
>>>>>> The test harness can provide coverage
>>>>>> reports based on gcov, but I'm not sure
>>>>>> what you mean by a "dial" to control test
>>>>>> coverage. Provided reports are rather for
>>>>>> human to analyze.
>>>>>>
>>>>>>
>>>>>> The general idea is to have some kind of
>>>>>> parameter on the test suite, which could be
>>>>>> an integer ranging from zero to ten, that
>>>>>> controls how many tests are run based on how
>>>>>> important the test is.
>>>>>>
>>>>>> Similar to how some command line interfaces
>>>>>> provide a verbosity level parameter (some
>>>>>> number of "-v" arguments) to control the
>>>>>> importance of the information in the log.
>>>>>> The verbosity level zero only prints very
>>>>>> important log messages, while ten prints
>>>>>> everything.
>>>>>>
>>>>>> In much the same manner as above, this "dial"
>>>>>> parameter controls what tests are run and
>>>>>> with what parameters based on how important
>>>>>> those tests and test parameter combinations are.
>>>>>> Coverage Level zero tells the suite to run a
>>>>>> very basic set of important tests, with
>>>>>> minimal parameterization. This mode would
>>>>>> take only ~5-10 minutes to run.
>>>>>> In contrast, Coverage Level ten includes all
>>>>>> the edge cases, every combination of test
>>>>>> parameters, everything the test suite can do,
>>>>>> which takes the normal several hours to run.
>>>>>> The values 1 - 9 are between those two
>>>>>> extremes, allowing the user to get a gradient
>>>>>> of test coverage in the results and to limit
>>>>>> the running time.
>>>>>>
>>>>>> Then we could, for example, run the "run.sh"
>>>>>> with a level of 2 or 3 for incoming patches
>>>>>> that need quick results, and with a level of
>>>>>> 10 for the less often run periodic tests
>>>>>> performed on main or LTS branches.
>>>>>
>>>>> Understood now. Thanks a lot for the idea.
>>>>> We'll discuss it and come back.
>>>>>
>>>>>> 3. Yes, really many tests on Mellanox CX5
>>>>>> NICs report unexpected testing results.
>>>>>> Unfortunately it is time consuming to
>>>>>> fill in expectations database since it is
>>>>>> necessary to analyze testing results and
>>>>>> classify if it is a bug or just
>>>>>> acceptable behaviour aspect.
>>>>>>
>>>>>> Bublik allows to compare results of two
>>>>>> runs. It is useful for human, but still
>>>>>> not good for automation.
>>>>>>
>>>>>> I have local patch for mlx5 driver which
>>>>>> reports Tx ring size maximum. It makes
>>>>>> pass rate higher. It is a problem for
>>>>>> test harness that mlx5 does not report
>>>>>> limits right now.
>>>>>>
>>>>>> Pass rate on Intel X710 is about 92% on
>>>>>> my test rig. Pass rate on virtio net is
>>>>>> 99% right now and could be done 100%
>>>>>> easily (just one thing to fix in
>>>>>> expectations).
>>>>>>
>>>>>> I think logs storage setup is essential
>>>>>> for logs analysis. Of course, you can
>>>>>> request HTML logs when you run tests
>>>>>> (--log-html=html) or generate after run
>>>>>> using dpdk-ethdev-ts/scripts/html-log.sh
>>>>>> and open index.html in a browser, but
>>>>>> logs storage makes it more convenient.
>>>>>>
>>>>>>
>>>>>> We are interested in setting up Bublik,
>>>>>> potentially as an externally-facing
>>>>>> component, once we have our process of
>>>>>> running the test suite stabilized.
>>>>>> Once we are able to run the test suite again,
>>>>>> I'll see what the pass rate is on our other
>>>>>> hardware.
>>>>>> Good to know that it isn't an issue with our
>>>>>> dev testbed causing the high fail rate.
>>>>>>
>>>>>> For Intel hardware, we have an XL710 and an
>>>>>> Intel E810-C in our development testbed.
>>>>>> Although they are slightly different devices,
>>>>>> ideally the pass rate will be identical or
>>>>>> similar. I have yet to set up a VM pair for
>>>>>> virtio, but we will soon.
>>>>>>
>>>>>> Latest version of test-environment has
>>>>>> examples of our CGI scripts which we use
>>>>>> for log storage (see
>>>>>> tools/log_server/README.md).
>>>>>>
>>>>>> Also all bits for Jenkins setup are
>>>>>> available. See
>>>>>> dpdk-ethdev-ts/jenkins/README.md and
>>>>>> examples of jenkins files in ts-rigs-sample.
>>>>>>
>>>>>>
>>>>>> Jenkins integration, setting up production
>>>>>> rig configurations, and permanent log storage
>>>>>> will be our next steps once I am able to run
>>>>>> the tests again.
>>>>>> Unless there is an easy way to have meson not
>>>>>> pass "-Werror" into GCC. Then I would be able
>>>>>> to run the test suite.
>>>>>
>>>>> Hopefully it is resolved now.
>>>>>
>>>>> I thought a bit more about your usecase for
>>>>> Jenkins. I'm not 100% sure that existing
>>>>> pipelines are convenient for your usecase.
>>>>> Feel free to ask questions when you are on it.
>>>>>
>>>>> Thanks,
>>>>> Andrew.
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Adam
>>>>>>
>>>>>>
>>>>>> On 8/29/23 17:02, Adam Hassick wrote:
>>>>>>> Hi Andrew,
>>>>>>>
>>>>>>> That fix seems to have resolved the
>>>>>>> issue, thanks for the quick turnaround
>>>>>>> time on that patch.
>>>>>>> Now that we have the RCF timeout issue
>>>>>>> resolved, there are a few other
>>>>>>> questions and issues that we have about
>>>>>>> the tests themselves.
>>>>>>>
>>>>>>> 1. The test suite fails to build with a
>>>>>>> couple warnings.
>>>>>>>
>>>>>>> Below is the stderr log from compilation:
>>>>>>>
>>>>>>> FAILED: lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o
>>>>>>> cc -Ilib/76b5a35@@ts_dpdk_pmd@sta -Ilib -I../../lib -I/opt/tsf/dpdk-ethdev-ts/ts/inst/default/include -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Werror -g -D_GNU_SOURCE -O0 -ggdb -Wall -W -fPIC -MD -MQ 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o' -MF 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o.d' -o 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o' -c ../../lib/dpdk_pmd_ts.c
>>>>>>> ../../lib/dpdk_pmd_ts.c: In function ‘test_create_traffic_generator_params’:
>>>>>>> ../../lib/dpdk_pmd_ts.c:5577:5: error: format not a string literal and no format arguments [-Werror=format-security]
>>>>>>>  5577 |     rc = te_kvpair_add(result, buf, mode);
>>>>>>>       |     ^~
>>>>>>> cc1: all warnings being treated as errors
>>>>>>> ninja: build stopped: subcommand failed.
>>>>>>> ninja: Entering directory `.'
>>>>>>> FAILED: lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o
>>>>>>> cc -Ilib/76b5a35@@ts_dpdk_pmd@sta -Ilib -I../../lib -I/opt/tsf/dpdk-ethdev-ts/ts/inst/default/include -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Werror -g -D_GNU_SOURCE -O0 -ggdb -Wall -W -fPIC -MD -MQ 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o' -MF 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o.d' -o 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o' -c ../../lib/dpdk_pmd_ts.c
>>>>>>> ../../lib/dpdk_pmd_ts.c: In function ‘test_create_traffic_generator_params’:
>>>>>>> ../../lib/dpdk_pmd_ts.c:5577:5: error: format not a string literal and no format arguments [-Werror=format-security]
>>>>>>>  5577 |     rc = te_kvpair_add(result, buf, mode);
>>>>>>>       |     ^~
>>>>>>> cc1: all warnings being treated as errors
>>>>>>>
>>>>>>>
>>>>>>> This error wasn't occurring last week,
>>>>>>> which was the last time I ran the tests.
>>>>>>> The TE host and the DUT have GCC v9.4.0
>>>>>>> installed, and the tester has GCC
>>>>>>> v11.4.0 installed, if this information
>>>>>>> is helpful.
>>>>>>>
>>>>>>> 2. On the Mellanox CX5s, there are over
>>>>>>> 6,000 tests run, which collectively take
>>>>>>> around 9 hours. Is it possible, and
>>>>>>> would it make sense, to lower the test
>>>>>>> coverage and have the test suite run faster?
>>>>>>>
>>>>>>> For some context, we run immediate
>>>>>>> testing on incoming patches for DPDK
>>>>>>> main and development branches, as well
>>>>>>> as periodic test runs on the main,
>>>>>>> stable, and LTS branches.
>>>>>>> For us to consider including this test
>>>>>>> suite as part of our immediate testing
>>>>>>> on patches, we would have to reduce the
>>>>>>> test coverage to the most important tests.
>>>>>>> This is primarily to reduce the testing
>>>>>>> time to, for example, less than 30
>>>>>>> minutes. Testing on patches can't take
>>>>>>> too long because the lab can receive
>>>>>>> numerous patches each day, which each
>>>>>>> require individual testing runs.
>>>>>>>
>>>>>>> At what frequency we run these tests,
>>>>>>> and on what, still needs to be discussed
>>>>>>> with the DPDK community, but it would be
>>>>>>> nice to know if the test suite had a
>>>>>>> "dial" to control the testing coverage.
>>>>>>>
>>>>>>> 3. We see a lot of test failures on our
>>>>>>> Mellanox CX5 NICs. Around 2,300 of
>>>>>>> ~6,600 tests passed. Is there anything
>>>>>>> we can do to diagnose these test failures?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Adam
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Aug 29, 2023 at 8:07 AM Andrew
>>>>>>> Rybchenko
>>>>>>> <andrew.rybchenko@oktetlabs.ru> wrote:
>>>>>>>
>>>>>>> Hi Adam,
>>>>>>>
>>>>>>> I've pushed the fix in main branch
>>>>>>> and a new tag v1.18.1. It should
>>>>>>> solve the problem with IPv6 address
>>>>>>> from DNS.
>>>>>>>
>>>>>>> Andrew.
>>>>>>>
>>>>>>> On 8/29/23 00:05, Andrew Rybchenko
>>>>>>> wrote:
>>>>>>>> Hi Adam,
>>>>>>>>
>>>>>>>> > Does the test engine prefer to
>>>>>>>> use IPv6 over IPv4 for initiating
>>>>>>>> the RCF connection to the test bed
>>>>>>>> hosts? And if so, is there a way to
>>>>>>>> force it to use IPv4?
>>>>>>>>
>>>>>>>> Brilliant idea. If DNS returns both
>>>>>>>> IPv4 and IPv6 addresses in your
>>>>>>>> case, I guess it is the root cause
>>>>>>>> of the problem.
>>>>>>>> Of course, it is TE problem since I
>>>>>>>> see really weird code in
>>>>>>>> lib/comm_net_engine/comm_net_engine.c
>>>>>>>> line 135.
>>>>>>>>
>>>>>>>> I've pushed fix to the branch
>>>>>>>> user/arybchik/fix_ipv4_only in
>>>>>>>> ts-factory/test-environment
>>>>>>>> repository. Please, try.
>>>>>>>>
>>>>>>>> It is late night fix with minimal
>>>>>>>> testing and no review. I'll pass it
>>>>>>>> through review process tomorrow and
>>>>>>>> hopefully it will be released in
>>>>>>>> one-two days.
>>>>>>>>
>>>>>>>> Andrew.
>>>>>>>>
>>>>>>>> On 8/28/23 18:02, Adam Hassick wrote:
>>>>>>>>> Hi Andrew,
>>>>>>>>>
>>>>>>>>> We have yet to notice a distinct
>>>>>>>>> pattern with the failures.
>>>>>>>>> Sometimes, the RCF will start and
>>>>>>>>> connect without issue a few times
>>>>>>>>> in a row before failing to connect
>>>>>>>>> again. Once the issue begins to
>>>>>>>>> occur, neither rebooting all of
>>>>>>>>> the hosts (test engine VM, tester,
>>>>>>>>> IUT) or deleting all of the build
>>>>>>>>> directories (suites, agents, inst)
>>>>>>>>> and rebooting the hosts afterward
>>>>>>>>> resolves the issue. When it begins
>>>>>>>>> working again seems very arbitrary
>>>>>>>>> to us.
>>>>>>>>>
>>>>>>>>> I do usually try to terminate the
>>>>>>>>> test engine with Ctrl+C, but when
>>>>>>>>> it hangs while trying to start
>>>>>>>>> RCF, that does not work.
>>>>>>>>>
>>>>>>>>> Does the test engine prefer to use
>>>>>>>>> IPv6 over IPv4 for initiating the
>>>>>>>>> RCF connection to the test bed
>>>>>>>>> hosts? And if so, is there a way
>>>>>>>>> to force it to use IPv4?
>>>>>>>>>
>>>>>>>>> - Adam
>>>>>>>>>
>>>>>>>>> On Fri, Aug 25, 2023 at 1:35 PM
>>>>>>>>> Andrew Rybchenko
>>>>>>>>> <andrew.rybchenko@oktetlabs.ru> wrote:
>>>>>>>>>
>>>>>>>>> > I'll double-check test
>>>>>>>>> engine on Ubuntu 20.04 and
>>>>>>>>> Ubuntu 22.04.
>>>>>>>>>
>>>>>>>>> Done. It works fine for me
>>>>>>>>> without any issues.
>>>>>>>>>
>>>>>>>>> Have you noticed any pattern
>>>>>>>>> when it works or does not work?
>>>>>>>>> May be it is a problem of not
>>>>>>>>> clean state after termination?
>>>>>>>>> Does it work fine the first
>>>>>>>>> time after DUTs reboot?
>>>>>>>>> How do you terminate testing?
>>>>>>>>> It should be done using Ctrl+C
>>>>>>>>> in terminal where you execute
>>>>>>>>> run.sh command.
>>>>>>>>> In this case it should
>>>>>>>>> shutdown gracefully and close
>>>>>>>>> all test agents and engine
>>>>>>>>> applications.
>>>>>>>>>
>>>>>>>>> (I'm trying to understand why
>>>>>>>>> you've seen many test agent
>>>>>>>>> processes. It should not happen.)
>>>>>>>>>
>>>>>>>>> Andrew.
>>>>>>>>>
>>>>>>>>> On 8/25/23 17:41, Andrew
>>>>>>>>> Rybchenko wrote:
>>>>>>>>>> On 8/25/23 17:06, Adam
>>>>>>>>>> Hassick wrote:
>>>>>>>>>>> Hi Andrew,
>>>>>>>>>>>
>>>>>>>>>>> Two of our systems (the Test
>>>>>>>>>>> Engine runner and the DUT
>>>>>>>>>>> host) are running Ubuntu
>>>>>>>>>>> 20.04 LTS, however this
>>>>>>>>>>> morning I noticed that the
>>>>>>>>>>> tester system (the one
>>>>>>>>>>> having issues) is running
>>>>>>>>>>> Ubuntu 22.04 LTS.
>>>>>>>>>>> This could be the source of
>>>>>>>>>>> the problem. I encountered a
>>>>>>>>>>> dependency issue trying to
>>>>>>>>>>> run the Test Engine on 22.04
>>>>>>>>>>> LTS, so I downgraded the
>>>>>>>>>>> system. Since the tester is
>>>>>>>>>>> also the host having
>>>>>>>>>>> connection issues, I will
>>>>>>>>>>> try downgrading that system
>>>>>>>>>>> to 20.04, and see if that
>>>>>>>>>>> changes anything.
>>>>>>>>>>
>>>>>>>>>> Unlikely, but who knows. We
>>>>>>>>>> run tests (DUTs) on Ubuntu
>>>>>>>>>> 20.04, Ubuntu 22.04, Ubuntu
>>>>>>>>>> 22.10, Ubuntu 23.04, Debian
>>>>>>>>>> 11 and Fedora 38 every night.
>>>>>>>>>> Right now Debian 11 is used
>>>>>>>>>> for test engine in nightly
>>>>>>>>>> regressions.
>>>>>>>>>>
>>>>>>>>>> I'll double-check test engine
>>>>>>>>>> on Ubuntu 20.04 and Ubuntu 22.04.
>>>>>>>>>>
>>>>>>>>>>> I did try passing in the
>>>>>>>>>>> "--vg-rcf" argument to the
>>>>>>>>>>> run.sh script of the test
>>>>>>>>>>> suite after installing
>>>>>>>>>>> valgrind, but there was no
>>>>>>>>>>> additional output that I saw.
>>>>>>>>>>
>>>>>>>>>> Sorry, I should have mentioned that
>>>>>>>>>> valgrind output should be in
>>>>>>>>>> valgrind.te_rcf (in the directory
>>>>>>>>>> where you run the test engine).
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I will try pulling in the
>>>>>>>>>>> changes you've pushed up,
>>>>>>>>>>> and will see if that fixes
>>>>>>>>>>> anything.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Adam
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Aug 25, 2023 at
>>>>>>>>>>> 9:57 AM Andrew Rybchenko
>>>>>>>>>>> <andrew.rybchenko@oktetlabs.ru>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello Adam,
>>>>>>>>>>>
>>>>>>>>>>> On 8/24/23 23:54, Andrew
>>>>>>>>>>> Rybchenko wrote:
>>>>>>>>>>>> I'd like to try to
>>>>>>>>>>>> repeat the problem
>>>>>>>>>>>> locally. Which Linux
>>>>>>>>>>>> distro is running on
>>>>>>>>>>>> test engine and agents?
>>>>>>>>>>>>
>>>>>>>>>>>> In fact, I know of one
>>>>>>>>>>>> problem with Debian 12
>>>>>>>>>>>> and Fedora 38 and we have
>>>>>>>>>>>> a patch in review to fix
>>>>>>>>>>>> it. However, the
>>>>>>>>>>>> behaviour is different in
>>>>>>>>>>>> this case, so it is
>>>>>>>>>>>> unlikely the same problem.
>>>>>>>>>>>
>>>>>>>>>>> I've just published a
>>>>>>>>>>> new tag which fixes
>>>>>>>>>>> known test engine side
>>>>>>>>>>> problems on Debian 12
>>>>>>>>>>> and Fedora 38.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> One more idea is to
>>>>>>>>>>>> install valgrind on the
>>>>>>>>>>>> test engine host and
>>>>>>>>>>>> run with option
>>>>>>>>>>>> --vg-rcf to check if
>>>>>>>>>>>> something weird is
>>>>>>>>>>>> happening.
>>>>>>>>>>>>
>>>>>>>>>>>> What I don't understand
>>>>>>>>>>>> right now is why I see
>>>>>>>>>>>> just one failed attempt
>>>>>>>>>>>> to connect in your
>>>>>>>>>>>> log.txt and then Logger
>>>>>>>>>>>> shutdown after 9 minutes.
>>>>>>>>>>>>
>>>>>>>>>>>> Andrew.
>>>>>>>>>>>>
>>>>>>>>>>>> On 8/24/23 23:29, Adam
>>>>>>>>>>>> Hassick wrote:
>>>>>>>>>>>>> > Is there any
>>>>>>>>>>>>> firewall in the
>>>>>>>>>>>>> network or on test
>>>>>>>>>>>>> hosts which could
>>>>>>>>>>>>> block incoming TCP
>>>>>>>>>>>>> connection to the port
>>>>>>>>>>>>> 23571
>>>>>>>>>>>>> from the host where
>>>>>>>>>>>>> you run test engine?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Our test engine host
>>>>>>>>>>>>> and the testbed are on
>>>>>>>>>>>>> the same subnet. The
>>>>>>>>>>>>> connection does work
>>>>>>>>>>>>> sometimes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> > If behaviour the
>>>>>>>>>>>>> same on the next try
>>>>>>>>>>>>> and you see that test
>>>>>>>>>>>>> agent is kept running,
>>>>>>>>>>>>> could you check using
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > # netstat -tnlp
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > that Test Agent is
>>>>>>>>>>>>> listening on the port
>>>>>>>>>>>>> and try to establish
>>>>>>>>>>>>> TCP connection from
>>>>>>>>>>>>> test agent using
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > $ telnet iol-dts-tester.dpdklab.iol.unh.edu 23571
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > and check if TCP
>>>>>>>>>>>>> connection could be
>>>>>>>>>>>>> established.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I was able to
>>>>>>>>>>>>> replicate the same
>>>>>>>>>>>>> behavior again, where
>>>>>>>>>>>>> it hangs while RCF is
>>>>>>>>>>>>> trying to start.
>>>>>>>>>>>>> Running this command,
>>>>>>>>>>>>> I see this in the output:
>>>>>>>>>>>>>
>>>>>>>>>>>>> tcp 0 0 0.0.0.0:23571 0.0.0.0:* LISTEN 18599/ta
>>>>>>>>>>>>>
>>>>>>>>>>>>> So it seems like it is
>>>>>>>>>>>>> listening on the
>>>>>>>>>>>>> correct port.
>>>>>>>>>>>>> Additionally, I was
>>>>>>>>>>>>> able to connect to the
>>>>>>>>>>>>> Tester machine from
>>>>>>>>>>>>> our Test Engine host
>>>>>>>>>>>>> using telnet. It
>>>>>>>>>>>>> printed the PID of the
>>>>>>>>>>>>> process once the
>>>>>>>>>>>>> connection was opened.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tried running the
>>>>>>>>>>>>> "ta" application
>>>>>>>>>>>>> manually on the
>>>>>>>>>>>>> command line, and it
>>>>>>>>>>>>> didn't print anything
>>>>>>>>>>>>> at all.
>>>>>>>>>>>>> Maybe the issue is
>>>>>>>>>>>>> something on the Test
>>>>>>>>>>>>> Engine side.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Aug 24, 2023
>>>>>>>>>>>>> at 2:35 PM Andrew
>>>>>>>>>>>>> Rybchenko
>>>>>>>>>>>>> <andrew.rybchenko@oktetlabs.ru>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Adam,
>>>>>>>>>>>>>
>>>>>>>>>>>>> > On the tester host (which appears to be the Peer agent), there
>>>>>>>>>>>>> > are four processes that I see running, which look like the test
>>>>>>>>>>>>> > agent processes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Before the next try I'd recommend killing these processes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is there any firewall in the network or on the test hosts which
>>>>>>>>>>>>> could block incoming TCP connections to port 23571 from the host
>>>>>>>>>>>>> where you run the test engine?
>>>>>>>>>>>>>
>>>>>>>>>>>>> If the behaviour is the same on the next try and you see that the
>>>>>>>>>>>>> test agent keeps running, could you check using
>>>>>>>>>>>>>
>>>>>>>>>>>>> # netstat -tnlp
>>>>>>>>>>>>>
>>>>>>>>>>>>> that Test Agent is
>>>>>>>>>>>>> listening on the port
>>>>>>>>>>>>> and try to establish TCP
>>>>>>>>>>>>> connection from
>>>>>>>>>>>>> test agent using
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ telnet iol-dts-tester.dpdklab.iol.unh.edu 23571
>>>>>>>>>>>>>
>>>>>>>>>>>>> and check if TCP
>>>>>>>>>>>>> connection could be
>>>>>>>>>>>>> established.
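
Where telnet is not installed, the same reachability check can be sketched in Python. This is only an illustration and not part of the Test Engine; the helper name `can_connect` and the 5 second default timeout are assumptions made for the example.

```python
import socket


def can_connect(host, port, timeout_s=5.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection() resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling `can_connect("iol-dts-tester.dpdklab.iol.unh.edu", 23571)` from the Test Engine host would then play the role of the telnet check quoted above. A False result does not distinguish a firewall drop from a stopped agent, so `netstat -tnlp` on the Tester is still needed.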
>>>>>>>>>>>>>
>>>>>>>>>>>>> Another idea is to log in to the Tester as root, as the testing
>>>>>>>>>>>>> does, take the TA start command from the log and run it by hand
>>>>>>>>>>>>> without -n and with the extra escaping removed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> # sudo \
>>>>>>>>>>>>>   PATH=${PATH}:/tmp/linux_x86_root_76872_1692885663_1 \
>>>>>>>>>>>>>   LD_LIBRARY_PATH=${LD_LIBRARY_PATH}${LD_LIBRARY_PATH:+:}/tmp/linux_x86_root_76872_1692885663_1 \
>>>>>>>>>>>>>   /tmp/linux_x86_root_76872_1692885663_1/ta \
>>>>>>>>>>>>>   Peer 23571 \
>>>>>>>>>>>>>   host=iol-dts-tester.dpdklab.iol.unh.edu:port=23571:user=root:key=/opt/tsf/keys/id_ed25519:ssh_port=22:copy_timeout=15:kill_timeout=15:sudo=:shell=
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hopefully in this case the test agent directory remains in /tmp
>>>>>>>>>>>>> and you don't need to copy it as the testing does. Maybe the
>>>>>>>>>>>>> output could shed some light on what's going on.
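
The last argument of the quoted agent command is a colon-separated list of key=value pairs. To see which values actually reached the agent (copy_timeout, for instance), it can be split apart with a short Python sketch. This is not how the Test Engine parses the string; `parse_ta_conf` is a hypothetical helper, and it assumes that no value contains a colon.

```python
def parse_ta_conf(conf):
    """Split a colon-separated 'key=value' string into a dict of strings."""
    params = {}
    for field in conf.split(":"):
        if not field:
            continue  # tolerate a stray leading or trailing colon
        key, _, value = field.partition("=")
        params[key] = value  # empty values such as 'sudo=' become ''
    return params


# The string below is copied from the command quoted in this thread.
conf = ("host=iol-dts-tester.dpdklab.iol.unh.edu:port=23571:user=root:"
        "key=/opt/tsf/keys/id_ed25519:ssh_port=22:copy_timeout=15:"
        "kill_timeout=15:sudo=:shell=")
params = parse_ta_conf(conf)
```

Here `params["copy_timeout"]` holds the string "15", the value the agent was started with in that run.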
>>>>>>>>>>>>>
>>>>>>>>>>>>> Andrew.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 8/24/23 17:30,
>>>>>>>>>>>>> Adam Hassick wrote:
>>>>>>>>>>>>>> Hi Andrew,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is the
>>>>>>>>>>>>>> output that I see in
>>>>>>>>>>>>>> the terminal when
>>>>>>>>>>>>>> this failure
>>>>>>>>>>>>>> occurs, after the
>>>>>>>>>>>>>> test agent binaries
>>>>>>>>>>>>>> build and the test engine
>>>>>>>>>>>>>> starts:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Platform default build - pass
>>>>>>>>>>>>>> Simple RCF consistency check succeeded
>>>>>>>>>>>>>> --->>> Starting Logger...done
>>>>>>>>>>>>>> --->>> Starting RCF...rcf_net_engine_connect(): Connection timed out iol-dts-tester.dpdklab.iol.unh.edu:23571
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Then, it hangs
>>>>>>>>>>>>>> here until I kill the
>>>>>>>>>>>>>> "te_rcf" and "te_tee"
>>>>>>>>>>>>>> processes. I let
>>>>>>>>>>>>>> it hang for around 9
>>>>>>>>>>>>>> minutes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On the tester
>>>>>>>>>>>>>> host (which appears
>>>>>>>>>>>>>> to be the Peer
>>>>>>>>>>>>>> agent), there are
>>>>>>>>>>>>>> four processes
>>>>>>>>>>>>>> that I see running,
>>>>>>>>>>>>>> which look like the
>>>>>>>>>>>>>> test agent
>>>>>>>>>>>>>> processes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ta.Peer is an
>>>>>>>>>>>>>> empty file. I've
>>>>>>>>>>>>>> attached the log.txt
>>>>>>>>>>>>>> from this run.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Adam
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Aug 24, 2023 at 4:22 AM Andrew Rybchenko
>>>>>>>>>>>>>> <andrew.rybchenko@oktetlabs.ru> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Adam,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, TE_RCFUNIX_TIMEOUT is in seconds. I've double-checked that
>>>>>>>>>>>>>> it goes to 'copy_timeout' in ts-conf/rcf.conf. The description in
>>>>>>>>>>>>>> doc/sphinx/pages/group_te_engine_rcf.rst says that copy_timeout
>>>>>>>>>>>>>> is in seconds, and the implementation in lib/rcfunix/rcfunix.c
>>>>>>>>>>>>>> passes the value to select() tv_sec. Theoretically select() could
>>>>>>>>>>>>>> be interrupted by a signal, but I think it is unlikely here.
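
The point about units can be illustrated with a small sketch. It is written in Python rather than C and is not the rcfunix code; `wait_readable` and `demo` are hypothetical names. Python's select.select() also takes its timeout in seconds, and since Python 3.5 it retries automatically when a signal interrupts it, which a bare C select() call does not do.

```python
import select
import socket
import time


def wait_readable(sock, timeout_s):
    """Block until sock is readable or timeout_s seconds have passed."""
    readable, _, _ = select.select([sock], [], [], timeout_s)
    return bool(readable)


def demo(timeout_s=0.2):
    """Return (idle_result, elapsed_seconds, ready_result) for a socket pair."""
    left, right = socket.socketpair()
    try:
        start = time.monotonic()
        idle = wait_readable(left, timeout_s)   # nothing sent yet: times out
        elapsed = time.monotonic() - start
        right.sendall(b"x")
        ready = wait_readable(left, timeout_s)  # data pending: returns at once
        return idle, elapsed, ready
    finally:
        left.close()
        right.close()
```

With nothing to read, the first wait lasts roughly `timeout_s` seconds. A wait that runs far longer than the configured value therefore suggests that the time is being spent somewhere other than in this particular timeout.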
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm not sure that I understand what you mean by RCF connection
>>>>>>>>>>>>>> timeout. Does it happen on TE startup when RCF starts test
>>>>>>>>>>>>>> agents? If so, TE_RCFUNIX_TIMEOUT could help. Or does it happen
>>>>>>>>>>>>>> when tests are in progress, e.g. in the middle of a test? If so,
>>>>>>>>>>>>>> TE_RCFUNIX_TIMEOUT is unrelated and most likely either the host
>>>>>>>>>>>>>> with the test agent dies or the test agent itself crashes. It
>>>>>>>>>>>>>> would be easier for me to classify it if you share the text log
>>>>>>>>>>>>>> (log.txt, full or just the corresponding fragment with some
>>>>>>>>>>>>>> context). Also the content of the ta.DPDK or ta.Peer file,
>>>>>>>>>>>>>> depending on which agent has problems, could shed some light.
>>>>>>>>>>>>>> These files contain stdout/stderr of the test agents.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Andrew.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 8/23/23
>>>>>>>>>>>>>> 17:45, Adam Hassick
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> Hi Andrew,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've set up
>>>>>>>>>>>>>>> a test rig
>>>>>>>>>>>>>>> repository here, and
>>>>>>>>>>>>>>> have created
>>>>>>>>>>>>>>> configurations for
>>>>>>>>>>>>>>> our development
>>>>>>>>>>>>>>> testbed based off of the
>>>>>>>>>>>>>>> examples.
>>>>>>>>>>>>>>> We've been
>>>>>>>>>>>>>>> able to get the test
>>>>>>>>>>>>>>> suite to run manually on
>>>>>>>>>>>>>>> Mellanox CX5 devices
>>>>>>>>>>>>>>> once.
>>>>>>>>>>>>>>> However, we are
>>>>>>>>>>>>>>> running into an
>>>>>>>>>>>>>>> issue where, when
>>>>>>>>>>>>>>> RCF starts,
>>>>>>>>>>>>>>> the RCF
>>>>>>>>>>>>>>> connection times out
>>>>>>>>>>>>>>> very frequently. We
>>>>>>>>>>>>>>> aren't sure
>>>>>>>>>>>>>>> why this is
>>>>>>>>>>>>>>> the case.
>>>>>>>>>>>>>>> It works
>>>>>>>>>>>>>>> sometimes, but most
>>>>>>>>>>>>>>> of the time when we
>>>>>>>>>>>>>>> try to run
>>>>>>>>>>>>>>> the test
>>>>>>>>>>>>>>> engine, it
>>>>>>>>>>>>>>> encounters this issue.
>>>>>>>>>>>>>>> I've tried
>>>>>>>>>>>>>>> changing the RCF
>>>>>>>>>>>>>>> port by setting
>>>>>>>>>>>>>>> "TE_RCF_PORT=<some
>>>>>>>>>>>>>>> port number>" and
>>>>>>>>>>>>>>> rebooting the testbed
>>>>>>>>>>>>>>> machines. Neither
>>>>>>>>>>>>>>> seems to fix the issue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It also
>>>>>>>>>>>>>>> seems like the
>>>>>>>>>>>>>>> timeout takes far
>>>>>>>>>>>>>>> longer than 60
>>>>>>>>>>>>>>> seconds, even when
>>>>>>>>>>>>>>> running "export
>>>>>>>>>>>>>>> TE_RCFUNIX_TIMEOUT=60"
>>>>>>>>>>>>>>> before I try
>>>>>>>>>>>>>>> to run the test suite.
>>>>>>>>>>>>>>> I assume the
>>>>>>>>>>>>>>> unit for this
>>>>>>>>>>>>>>> variable is seconds?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Adam
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Aug 21, 2023 at 10:19 AM Adam Hassick
>>>>>>>>>>>>>>> <ahassick@iol.unh.edu> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Andrew,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks, I've cloned
>>>>>>>>>>>>>>> the example
>>>>>>>>>>>>>>> repository and will
>>>>>>>>>>>>>>> start
>>>>>>>>>>>>>>> setting up a
>>>>>>>>>>>>>>> configuration for
>>>>>>>>>>>>>>> our development testbed
>>>>>>>>>>>>>>> today. I'll let you
>>>>>>>>>>>>>>> know if I run into
>>>>>>>>>>>>>>> any difficulties
>>>>>>>>>>>>>>> or have
>>>>>>>>>>>>>>> any questions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Adam
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Aug 20, 2023 at 4:40 AM Andrew Rybchenko
>>>>>>>>>>>>>>> <andrew.rybchenko@oktetlabs.ru> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Adam,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've published https://github.com/ts-factory/ts-rigs-sample.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hopefully it will help to define your test rigs and successfully
>>>>>>>>>>>>>>> run some tests manually. Feel free to ask any questions and I'll
>>>>>>>>>>>>>>> answer here and try to update documentation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Meanwhile I'll prepare missing bits for steps (2) and (3).
>>>>>>>>>>>>>>> Hopefully everything is in place for step (4), but we need to
>>>>>>>>>>>>>>> make steps (2) and (3) first.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Andrew.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 8/18/23 21:40, Andrew Rybchenko wrote:
>>>>>>>>>>>>>>>> Hi Adam,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> > I've conferred with the rest of the team, and we
>>>>>>>>>>>>>>>> > think it would be best to move forward with mainly
>>>>>>>>>>>>>>>> > option B.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> OK, I'll provide the sample on Monday for you. It is
>>>>>>>>>>>>>>>> almost ready right now, but I need to double-check it
>>>>>>>>>>>>>>>> before publishing.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>> Andrew.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 8/17/23 20:03, Adam Hassick wrote:
>>>>>>>>>>>>>>>>> Hi Andrew,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm adding the CI mailing list to this conversation. Others in
>>>>>>>>>>>>>>>>> the community might find this conversation valuable.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We do want to run testing on a regular basis. The Jenkins
>>>>>>>>>>>>>>>>> integration will be very useful for us, as most of our CI is
>>>>>>>>>>>>>>>>> orchestrated by Jenkins.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I've conferred with the rest of the team, and we think it would
>>>>>>>>>>>>>>>>> be best to move forward with mainly option B. If you would like
>>>>>>>>>>>>>>>>> to know anything about our testbeds that would help you with
>>>>>>>>>>>>>>>>> creating an example ts-rigs repo, I'd be happy to answer any
>>>>>>>>>>>>>>>>> questions you have.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We have multiple test rigs (we call these "DUT-tester pairs")
>>>>>>>>>>>>>>>>> that we run our existing hardware testing on, with differing
>>>>>>>>>>>>>>>>> network hardware and CPU architecture. I figured this might be
>>>>>>>>>>>>>>>>> an important detail.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Adam
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Aug 17, 2023 at 11:44 AM Andrew Rybchenko
>>>>>>>>>>>>>>>>> <andrew.rybchenko@oktetlabs.ru> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Greetings Adam,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm happy to hear that you're trying to bring it up.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As I understand it, the final goal is to run it on a regular
>>>>>>>>>>>>>>>>> basis. So, we need to do it properly from the very beginning.
>>>>>>>>>>>>>>>>> Bringing up all features consists of 4 steps:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. Create a site-specific repository (we call it ts-rigs) which
>>>>>>>>>>>>>>>>>    contains information about test rigs and other site-specific
>>>>>>>>>>>>>>>>>    information like where to send mails, where to store logs etc.
>>>>>>>>>>>>>>>>>    It is required for manual execution as well, since the test
>>>>>>>>>>>>>>>>>    rigs description is essential. I'll return to the topic below.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2. Set up logs storage for automated runs. Basically it is disk
>>>>>>>>>>>>>>>>>    space plus an apache2 web server with a few CGI scripts which
>>>>>>>>>>>>>>>>>    help a lot to save disk space.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 3. Set up the Bublik web application which provides a web
>>>>>>>>>>>>>>>>>    interface to view testing results. Same as
>>>>>>>>>>>>>>>>>    https://ts-factory.io/bublik
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 4. Set up Jenkins to run tests regularly, save logs in the log
>>>>>>>>>>>>>>>>>    storage (2) and import them to Bublik (3).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We spent the last few months on our homework to make it simpler
>>>>>>>>>>>>>>>>> to bring up automated execution using Jenkins -
>>>>>>>>>>>>>>>>> https://github.com/ts-factory/te-jenkins
>>>>>>>>>>>>>>>>> Corresponding bits in dpdk-ethdev-ts will be available tomorrow.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Let's return to step (1).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Unfortunately there is no publicly available example of the
>>>>>>>>>>>>>>>>> ts-rigs repository since sensitive site-specific information is
>>>>>>>>>>>>>>>>> located there. But I'm ready to help you to create it for UNH.
>>>>>>>>>>>>>>>>> I see two options here:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (A) I'll ask questions and, based on your answers, will create
>>>>>>>>>>>>>>>>>     the first draft with my comments.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (B) I'll make a template/example ts-rigs repo, publish it and
>>>>>>>>>>>>>>>>>     you'll create UNH ts-rigs based on it.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Of course, I'll help to debug and finally bring it up in any
>>>>>>>>>>>>>>>>> case.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (A) is a bit simpler for me and you, but (B) is a bit more
>>>>>>>>>>>>>>>>> generic and will help other potential users to bring it up.
>>>>>>>>>>>>>>>>> We can combine (A)+(B), i.e. start from (A).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Andrew.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 8/17/23 15:18, Konstantin Ushakov wrote:
>>>>>>>>>>>>>>>>>> Greetings Adam,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for contacting us. I'm copying Andrew, who would be
>>>>>>>>>>>>>>>>>> happy to help.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Konstantin
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 16 Aug 2023, at 21:50, Adam Hassick
>>>>>>>>>>>>>>>>>>> <ahassick@iol.unh.edu> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Greetings Konstantin,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am in the process of setting up the DPDK Poll Mode Driver test
>>>>>>>>>>>>>>>>>>> suite as an addition to our testing coverage for DPDK at the UNH
>>>>>>>>>>>>>>>>>>> lab.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have some questions about how to set the test suite arguments.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have been able to configure the Test Engine to connect to the
>>>>>>>>>>>>>>>>>>> hosts in the testbed. The RCF, Configurator, and Tester all begin
>>>>>>>>>>>>>>>>>>> to run, however the prelude of the test suite fails to run.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> https://ts-factory.io/doc/dpdk-ethdev-ts/index.html#test-parameters
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The documentation mentions that there are several test parameters
>>>>>>>>>>>>>>>>>>> for the test suite, like for the IUT test link MAC, etc. These
>>>>>>>>>>>>>>>>>>> seem like they would need to be set somewhere to run many of the
>>>>>>>>>>>>>>>>>>> tests.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I see in the Test Engine documentation, there are instructions on
>>>>>>>>>>>>>>>>>>> how to create new parameters for test suites in the Tester
>>>>>>>>>>>>>>>>>>> configuration, but there is nothing in the user guide or in the
>>>>>>>>>>>>>>>>>>> Tester guide for how to set the arguments for the parameters when
>>>>>>>>>>>>>>>>>>> running the test suite that I can find. I'm not sure if I need to
>>>>>>>>>>>>>>>>>>> write my own Tester config, or if I should be setting these in
>>>>>>>>>>>>>>>>>>> some other way.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> How should these values be set?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm also not sure what environment variables/arguments are
>>>>>>>>>>>>>>>>>>> strictly necessary or which are optional.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>> Adam
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> *Adam Hassick*
>>>>>>>>>>>>>>>>>>> Senior Developer
>>>>>>>>>>>>>>>>>>> UNH InterOperability Lab
>>>>>>>>>>>>>>>>>>> ahassick@iol.unh.edu
>>>>>>>>>>>>>>>>>>> iol.unh.edu <https://www.iol.unh.edu/>
>>>>>>>>>>>>>>>>>>> +1 (603) 475-8248
[-- Attachment #2: Type: text/html, Size: 200902 bytes --]
Thread overview: 51+ messages
[not found] <CAC-YWqiQfH4Rx-Et1jGHhGK9i47d0AArKy-B2P77iYbbM+Lpig@mail.gmail.com>
[not found] ` <C3B08390-DA6D-4BDC-BBD7-98561F92FE33@oktetlabs.ru>
[not found] ` <35340484-1d7e-7e5f-cad4-c965ba541397@oktetlabs.ru>
2023-08-17 17:03 ` Adam Hassick
2023-08-18 18:40 ` Andrew Rybchenko
2023-08-20 8:40 ` Andrew Rybchenko
2023-08-21 14:19 ` Adam Hassick
2023-08-23 14:45 ` Adam Hassick
2023-08-24 8:22 ` Andrew Rybchenko
2023-08-24 14:30 ` Adam Hassick
2023-08-24 18:34 ` Andrew Rybchenko
2023-08-24 20:29 ` Adam Hassick
2023-08-24 20:54 ` Andrew Rybchenko
2023-08-25 13:57 ` Andrew Rybchenko
2023-08-25 14:06 ` Adam Hassick
2023-08-25 14:41 ` Andrew Rybchenko
2023-08-25 17:35 ` Andrew Rybchenko
2023-08-28 15:02 ` Adam Hassick
2023-08-28 21:05 ` Andrew Rybchenko
2023-08-29 12:07 ` Andrew Rybchenko
2023-08-29 14:02 ` Adam Hassick
2023-08-29 20:43 ` Andrew Rybchenko
2023-08-31 19:38 ` Adam Hassick
2023-09-01 7:59 ` Andrew Rybchenko
2023-09-05 15:01 ` Adam Hassick
2023-09-06 11:36 ` Andrew Rybchenko
2023-09-06 15:00 ` Adam Hassick
2023-09-08 14:57 ` Adam Hassick
2023-09-13 15:45 ` Andrew Rybchenko
2023-09-18 6:15 ` Andrew Rybchenko
2023-09-18 6:23 ` Konstantin Ushakov
2023-09-18 6:26 ` Andrew Rybchenko
2023-09-18 14:44 ` Adam Hassick
2023-09-18 15:04 ` Andrew Rybchenko [this message]
2023-10-04 13:48 ` Adam Hassick
2023-10-05 10:25 ` Andrew Rybchenko
2023-10-10 14:09 ` Adam Hassick
2023-10-11 11:46 ` Andrew Rybchenko
2023-10-23 11:11 ` Andrew Rybchenko
2023-10-25 20:27 ` Adam Hassick
2023-10-26 12:19 ` Andrew Rybchenko
2023-10-26 17:44 ` Adam Hassick
2023-10-27 8:01 ` Andrew Rybchenko
2023-10-27 19:13 ` Andrew Rybchenko
2023-11-06 23:16 ` Adam Hassick
2023-11-07 16:57 ` Andrew Rybchenko
2023-11-07 20:30 ` Adam Hassick
2023-11-08 7:20 ` Andrew Rybchenko
2023-11-16 20:03 ` Adam Hassick
2023-11-16 20:38 ` DPDK Coverity test run Mcnamara, John
2023-11-16 20:43 ` Patrick Robb
2023-11-16 20:56 ` Mcnamara, John
2023-11-20 17:18 ` Setting up DPDK PMD Test Suite Andrew Rybchenko
2023-12-01 14:39 ` Andrew Rybchenko