Hello All,

Here are the minutes from today's call.

January 28, 2021

######################################
Attendees

1. Lincoln Lavoie
2. Brandon Lo
3. Trishan de Lanerolle
4. Aaron Conole
5. Ali Alnubani
6. David Liu
7. Juraj Linkeš
8. Lijuan Tu
9. Michael Santana
10. Owen Hiyard
11. Ruifeng Wang
12. Thomas Monjalon
13. David Marchand

######################################
Agenda

1. CI Status
2. Bugzilla Status
3. Test Development
4. Any other business

######################################
Minutes

======================================
CI Status

--------------------------------------
Intel Lab

* Updated CI systems to avoid running further testing after a build failure.
* If any build test fails (on any system, i.e. 1 out of N OSes), the other test cases will not run.  Community Lab is similar, but only the apply-patch step blocks the whole tree; otherwise, the tree structure allows the other tests to run in parallel, where possible.  For example, a failing Windows compile test won’t stop the performance testing on Mellanox NICs, but if the compile on the Mellanox bare metal fails, no performance / functional tests are run.
* How is the git branch “guessed” for each patch?  This is done by the dpdk-ci script, which tries to guess the branch based on the MAINTAINERS file.
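
As a rough illustration of the Community Lab gating described above (a failed stage only blocks the stages that depend on it), a minimal sketch follows. The stage names and the dependency table are hypothetical, chosen only to mirror the example in the minutes; this is not the lab's actual pipeline definition.

```python
# Hypothetical stage graph: each stage lists the stages it depends on.
DEPENDS_ON = {
    "apply_patch": [],
    "windows_compile": ["apply_patch"],
    "mellanox_compile": ["apply_patch"],
    "mellanox_performance": ["mellanox_compile"],
    "mellanox_functional": ["mellanox_compile"],
}


def plan_stages(failed):
    """Map each stage to 'ok', 'failed', or 'skipped'.

    A stage is skipped when any dependency did not finish 'ok';
    otherwise it is 'failed' if listed in `failed`, else 'ok'.
    """
    status = {}

    def resolve(stage):
        if stage not in status:
            if any(resolve(dep) != "ok" for dep in DEPENDS_ON[stage]):
                status[stage] = "skipped"
            elif stage in failed:
                status[stage] = "failed"
            else:
                status[stage] = "ok"
        return status[stage]

    for stage in DEPENDS_ON:
        resolve(stage)
    return status
```

With this table, a Windows compile failure leaves the Mellanox stages untouched, while a failure in the hypothetical `apply_patch` stage skips everything below it.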

--------------------------------------
GitHub Actions / Travis CI / OBS

* GitHub Actions - scripts are ready to monitor jobs and status; trying to get this hooked up to report to Patchwork by early next week.
* Arm is working to get more Travis CI credits for the ovsrobot, to be used only for the Arm testing.
* OBS is still pending; work will resume once GitHub Actions is finished.

--------------------------------------
UNH-IOL Community Lab

* ARM hardware functional and performance testing; Intel and Mellanox NICs are ready, but not in production
* MLX5 PMD compilation for Windows clang has been enabled (bug 620)
* Convert performance difference to percent instead of raw value (bug 626)
   * Will be investigating how much needs to be changed
   * Failure criteria: should these also be changed to a % difference?
   * These criteria should be specific to each machine / NIC.
   * Metric is selected from a set of 10 to 20 runs
   * Next steps: complete the migration work to % difference reporting, then allow systems to run for the next couple of weeks to build a set of data.  Metrics can be proposed from this data and will be discussed during the next meeting.
* Containerizing DPDK lab pipelines (compile, Coverity, ABI, etc.) and publishing them
* Coverity Testing Coverage
   * The only missing dependencies are libmusdk and libAArch64crypto, which require cross-compiling to aarch64.  We have not started work on that, pending confirmation from the group that it's necessary.
   * Group confirmed the desired coverage has been achieved, and we'll update the bug to completed.
* Working on expanding CI Status page to include more information for jobs
   * Descriptions added to Jenkins Job Status
   * Added view to show if workers are offline and the cause / reason.
   * Will solve bug 623 and bug 604
* Bug 604 status
   * Much of the requested information has been added to the status page.
   * Waiting on internal security review processes to publish job pipeline definitions
   * If a node is taken offline, the reason will now be displayed on the status page
   * Upcoming enhancements to the Grafana dashboard, showing the number of tests run, outcome (pass/fail/etc.), etc.
   * All of the following information has been added to the status page
      * What is job status? all jobs in progress?
      * What are CI nodes? machines? containers? VMs?
      * What is CI Build Queue? Is it related to jobs?
   * In progress
      * Can we make the job title clickable to display more details? - This will be implemented after publishing job pipeline definitions
* Priority list of DTS test suites to get running on the bare-metal hardware in the lab (see table as attached image / below)
   * Can the group provide a priority list for the next set (5 to 10 tests) to get configured and reported from the lab?
   * The Priorities document from the DPDK Techboard is under review with Red Hat.
   * Ask to focus on closing out the existing Bugzilla tickets.
* ARM-based testing / hardware bottleneck.  Lab currently has 1 Arm-based host.
   * Compile/Unit/ABI Time: ~2 hours
   * Performance / Functional (running 2 of 3 NICs): ~1.5 hours total (currently disabled, so the compile/unit can keep up)
   * Could compile testing be handled under emulation, or a cross compile (if we have to go that route anyway for ABI)?  - Group prefers to avoid emulation approaches.
   * Arm team may be able to help with setting up the systems to support running in parallel.
* OS Coverage (bug 479) for compile / unit testing
   * Latest FreeBSD - Currently running 11.2
   * Windows - Currently running 10.0.17763.973
   * Ubuntu latest two LTS releases - Currently have 18.04 and 20.04 (container)
   * RHEL 8 - Working to set this up; there is some trickiness in getting the container set up with the full set of required packages, due to licenses
   * OpenSUSE - Currently running 15.1
   * CentOS - Currently running 8
   * Note, bare-metal hardware / NIC testing runs the OS of the system vendor’s choice.
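
For the percent-difference reporting item above (bug 626), the conversion could look like the following minimal sketch. The function names, the use of the median of past runs as the expected value, and the tolerance parameter are all assumptions for illustration; the minutes only state that the metric is selected from a set of 10 to 20 runs and that the per-machine / NIC failure criteria are still to be proposed.

```python
from statistics import median


def baseline_from_runs(runs):
    # Assumption: the median of the historical runs is the expected value.
    return median(runs)


def percent_difference(measured, baseline):
    """Signed difference of measured vs. baseline, in percent of baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (measured - baseline) * 100.0 / baseline


def passes(measured, baseline, tolerance_pct):
    # Assumption: a result fails only when it drops below the baseline
    # by more than tolerance_pct percent.
    return percent_difference(measured, baseline) >= -tolerance_pct
```

A per-machine / NIC tolerance would then be passed in as `tolerance_pct` once the group has agreed on values.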
 
======================================
Bugzilla Status

* https://bugs.dpdk.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=CONFIRMED&bug_status=IN_PROGRESS&columnlist=product%2Ccomponent%2Cpriority%2Cbug_status%2Cassigned_to%2Cshort_desc%2Cchangeddate&component=job%20scripts&component=UNH%20infra&component=Intel%20Lab&component=Travis%20CI&list_id=3419&order=priority%2Cchangeddate%20DESC&product=lab&query_format=advanced&resolution=---

======================================
Test Development

* Out of time.

======================================
Any other business

* Will cancel the February 11 meeting, slipping 1 week, so the new schedule is February 18, March 4, etc.

--
Lincoln Lavoie
Principal Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
+1-603-674-2755 (m)