* Community CI Meeting Minutes - August 31, 2023
From: Patrick Robb @ 2023-08-31 16:06 UTC
  To: ci; +Cc: dts

August 31, 2023

#####################################################################
Attendees
1. Patrick Robb
2. Adam Hassick
3. Aaron Conole
4. Bruce Richardson
5. Juraj Linkeš
6. Paul Szczepanek
7. Ali Alnubani

#####################################################################
Agenda
1. General Announcements
2. CI Status
3. DTS Improvements & Test Development
4. Any other business

#####################################################################
Minutes

=====================================================================
General Announcements
* DPDK Summit is Sept 12-13
   * The next CI meeting will be rescheduled from September 14th to
September 21st, to avoid a clash with DPDK Summit travel. The following
meeting on September 28th will return us to our normal cycle.
   * There are governing board and tech board sessions on the 11th, with
some CI discussion in both meetings
* Unit test suites: Bruce’s patch for dynamically building the unit test
suites has hit mainline
   * David’s patch fixing the memory leak from PCI device probing (arm64)
is still pending
   * V3 of the patch for skipping specific tests based on an environment
variable has been submitted
      * Opting for an environment variable to skip the tests, due to
concerns about command line parsing
      * Patrick should ack this
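
A minimal sketch of the environment-variable approach mentioned above, for
illustration only. The variable name EXAMPLE_TEST_SKIP and both function
names are hypothetical and are not taken from the patch. The idea shown is
that the runner reads one comma-separated variable and filters its test
list, so no extra command line options need to be parsed:

```python
import os

# Hypothetical variable name, used only for this sketch.
SKIP_ENV_VAR = "EXAMPLE_TEST_SKIP"


def tests_to_skip(environ=os.environ):
    """Return the set of test names listed in the skip variable."""
    raw = environ.get(SKIP_ENV_VAR, "")
    # Comma-separated list; ignore empty entries and surrounding whitespace.
    return {name.strip() for name in raw.split(",") if name.strip()}


def filter_tests(all_tests, environ=os.environ):
    """Split tests into (to_run, skipped), preserving the original order."""
    skip = tests_to_skip(environ)
    to_run = [t for t in all_tests if t not in skip]
    skipped = [t for t in all_tests if t in skip]
    return to_run, skipped
```

Passing the environment as a parameter keeps the sketch easy to exercise
without touching the real process environment.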

=====================================================================
CI Status

---------------------------------------------------------------------
UNH-IOL Community Lab
* Mellanox perf testing:
   * cx5 is back online
   * Hardware Refresh
      * The CX6 NIC is running with no reporting. It reaches line rate in
performance test runs with frame sizes from 256B to 1518B, but falls below
line rate for 64B and 128B frames.
         * Is this due to packet overhead?
         * Currently this is running on a PCIe Gen3 x8 slot
         * Ali is going to remote into the testbed soon to take a look
      * CX7: backordered
   * Running DTS within a VM as a security measure?
      * Will require PCI passthrough; can’t be done with virtio
      * Do the benefits justify this added level of complexity?
      * Connections to the host and the rest of the network should be
blocked, and the VM disposed of after each run
* Intel 8970 QAT Accelerator card:
   * The custom patch doesn’t apply cleanly to the kernel after checking
out 5.4.0-155 (currently running), so I’d like to just rebuild the kernel
from 5.15 or 6.0. But I don’t want to do it without the ARM people
authorizing it, so I’ll proceed once I have the go-ahead from them.
      * Juraj: Ubuntu versions should be uniform across the lab (so 22.04
for all systems)
* Retesting framework roadmap - UNH:
   * This is online, and an email explaining the process has been sent to
the dev mailing list:
https://inbox.dpdk.org/dev/CAC-YWqiXqBYyzPsc4UD7LbUHKha_Vb3=Aot+dQomuRLojy2hvA@mail.gmail.com/
   * We will add some basic instructions to the DPDK website:
https://inbox.dpdk.org/web/20230831031834.9271-2-probb@iol.unh.edu/T/#u
   * We’ll also put something on the Community Lab dashboard’s About page
* TS-Factory: Using our dev testbed, Adam has attempted to run the ethdev
testsuite on a few NICs (MLNX cx5, Intel x710, Intel E810). We have only
gotten the test suite to run on our CX5s so far. Oktet Labs is working
with us to resolve some issues we’ve run into and to provide guidance on
how this can be used in CI.
   * Adam discovered a bug: the RCF (remote control) implementation in
ts-factory required DNS to return IPv4 addresses to the test engine when
initiating connections to the tester and DUT systems. Andrew at Oktet Labs
hotfixed this on the testing branch we are using, and the bugfix will hit
ts-factory mainline soon
   * New changes (as of a few days ago) to ts-factory are causing -Werror
builds of the testsuite to fail. This has been reported to Oktet Labs and
they are working on it this week
   * How to use this in a CI context?
      * The DPDK testsuite has approx 6000 test cases, which presently
take 9 hours to run, and none of the NICs supported by ts-factory will
pass all of them. So, there is not an expectation that all tests will pass.
The “Bublik” tool used by the framework allows for comparison of pass/fails
from the previous run to the current run, but Oktet says this output is
aimed at human readability and is not designed for automation.
      * There is a flag for cutting down the testsuite to “sanity check”
testcases, which we could more reasonably expect to pass 100%. That would
let us report CI results against an expectation of 100% passes, with a much
shorter runtime. But my guess is this is throwing the baby out with the
bathwater, as those sanity check results won’t be very valuable.
      * Do we need to run this periodically, as opposed to on every patch,
due to the test duration?
      * Should we find a way to report some kind of result based on what
tests pass? I don’t yet know if this is feasible. Or, we can simply run it
and store the human readable artifact on the dashboard in an easy-to-find
place.
      * Does not compile on ARM, but we can reach out to Oktet/ARM to
resolve these issues
         * Test engine only has to compile on a non-worker node, and that
node could be x86
         * Need to figure out what exactly has to be compiled on the (ARM)
worker node and communicate with Oktet Labs if there are issues
* Patrick and Aaron talked about UNH possibly doing more redundancy
testing for the Intel lab, i.e. running some of the testsuites they run
which we don’t.
   * It looks like Intel is reporting results again (woohoo!)
   * Patrick will determine the coverage gap between UNH and the Intel lab
before Dublin so that he can discuss it with any interested parties in
person
* Last meeting Aaron asked about maintainers for next-* branches getting
immediate CI runs on an “on-push” basis, like with the LTS-staging
branches. Nothing prevents this except that A. there needs to be a GitHub
mirror so we can use the GitHub API (like we do with LTS-staging), and
B. to Aaron’s point from last time, we need to agree on testrun frequency.
   * 2024 SOW item?
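
On the ts-factory question above of reporting a result when a full pass is
not expected: one option is to compare each run against the previous one
and flag only regressions. The following is a rough sketch under stated
assumptions, not Bublik's output format or API. It assumes results can be
exported as a mapping of test name to "pass" or "fail", and the function
names are hypothetical:

```python
def compare_runs(previous, current):
    """Compare two runs given as {test_name: "pass" | "fail"} dicts."""
    # Tests that passed before and fail now.
    regressions = sorted(
        name for name, result in current.items()
        if result == "fail" and previous.get(name) == "pass"
    )
    # Tests that failed before and pass now.
    fixes = sorted(
        name for name, result in current.items()
        if result == "pass" and previous.get(name) == "fail"
    )
    # Tests present in only one of the two runs.
    new_tests = sorted(set(current) - set(previous))
    missing = sorted(set(previous) - set(current))
    return {
        "regressions": regressions,
        "fixes": fixes,
        "new": new_tests,
        "missing": missing,
    }


def ci_verdict(previous, current):
    """Return "fail" if any test regressed or disappeared, else "pass"."""
    report = compare_runs(previous, current)
    return "fail" if report["regressions"] or report["missing"] else "pass"
```

Treating a missing test as a failure is a design choice of this sketch. A
test that is new and failing is listed under "new" but does not fail the
run, since there is no earlier result to regress from.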

---------------------------------------------------------------------
Intel Lab
* They are reporting results again

---------------------------------------------------------------------
Loongarch Lab
* none

---------------------------------------------------------------------
GitHub Actions
* Retest framework: currently testing this internally, and will soon submit
it for review.
* Physical server move: downtime will occur at the end of the year or
beginning of next year
   * Figuring out if there’s a way to migrate the VM to another system so
that downtime is reduced. Otherwise, it will be about a week of downtime

=====================================================================
DTS Improvements & Test Development
* DTS presentation (work in progress):
https://docs.google.com/presentation/d/1fm8EtbzEQHrFyHoHiy0PNQz3MYY2NsQRatR_eC3SvHw/edit#slide=id.g260b440c69d_0_331
   * Honnappa, Juraj, and Patrick
* Paul Szczepanek will be working on DTS
   * He should be included in any DTS improvement group conversations
* Group met last week to discuss DTS roadmap for 23.11, and Honnappa is
sending that out today
* Jeremy is porting over the scatter testsuite and packet module for packet
comparison/other packet related functions
* Juraj - Documentation
   * Which tool should generate the API docs? Sphinx (developed for Python,
so a natural choice) or Doxygen.
   * We need to agree on the format of the documentation
      * Juraj likes the Google docstring format (very readable)
   * Jeremy will review this patch
* DTS roadmap
   * 1) Documentation
   * 2) TG related code (Packet manipulation and verification module,
Support for TREX which require Non-packet-capture method enhancements)
   * 3) Scatter test suite
   * 4) Merge pending patches

=====================================================================
Any other business
* Next meeting is September 21, 2023
