* Community CI Meeting Minutes - July 20, 2023
@ 2023-07-20 17:18 Patrick Robb
July 20, 2023
#####################################################################
Attendees
1. Patrick Robb
2. Alex Agerholm
3. Christian Muf
4. Juraj Linkeš
5. Danylo Vodopianov
6. Jeremy Spewock
7. M Kostenok
8. Lincoln Lavoie
9. Manit Mahajan
10. Alex Kholodnyi
11. Aaron Conole
12. Thomas Monjalon
13. Ali Alnubani
#####################################################################
Agenda
1. General Announcements
2. CI Status
3. DTS Improvements & Test Development
4. Any other business
#####################################################################
Minutes
=====================================================================
General Announcements
* Napatech is beginning the process of contributing their driver for DPDK,
and intends to stand up a process for CI testing
* 1. What is the preferable way for the community to work with custom HW?
According to the documentation, sending hardware to the DPDK Lab at UNH-IOL
is one possible way. Could we set up our own lab? Where should the CI be
located, and what access should be provided?
* Making use of the Community Lab is one way to accomplish CI testing
without having to set up your own CI testing infrastructure. But setting up
your own internal lab and reporting your results is also good.
* Governing board approval might be required for bringing a new
vendor into the community lab
* Is a specific membership tier required for hardware inclusion in
the lab?
* Napatech is wondering about any regular fee (per month, per year)
* [Aaron] For running an internal lab, scripts for polling patchwork
and DPDK CI testing scripts can be found at
https://github.com/ovsrobot/pw-ci and at https://git.dpdk.org/tools/dpdk-ci/
* For how long should an internal lab be available?
* As long as possible - the DPDK project will continuously be
changing, so if a driver is contributed, long term testing will be very
helpful in verifying for the community that no functionality breaks months
or years down the road. If testing results can be provided long term, the
community can help keep the driver in a good state long term.
* 2. Which DTS tests should be executed by the NT team to prove that the NT
DPDK driver is okay to be upstreamed?
* There are no specific DTS testsuite requirements for upstreaming,
so it’s best to just extend use of DTS as much as possible according to the
driver’s feature set
* 3. We have implemented custom DTS tests which cover the functionality of
the NIC. Can these tests be considered sufficient for merging the driver?
If so, should they be upstreamed as well?
* If you are going to report (to patchwork) results from these custom
testsuites, they should be upstreamed to DTS. Lijuan is the maintainer.
https://git.dpdk.org/tools/dts/
* Ping Lijuan about schedule for development
* Small-Medium changes are fine halfway through a release cycle, but
the RFCs exist for a reason and major changes cannot come late in the
process
* 4. Is there documentation for setting up a CI Testing lab with DPDK
Patchwork?
* There is not much in terms of documentation, but for running an
internal lab, scripts for polling patchwork and DPDK CI testing scripts can
be found at https://github.com/ovsrobot/pw-ci and at
https://git.dpdk.org/tools/dpdk-ci/
* 5. In which format should this test report be?
* [Aaron] Test reports can be as simple as:
https://mails.dpdk.org/archives/test-report/2023-July/420253.html - Note
that we need the subject line, Test-Label, Test-Status, patch URL, and
_test text_ to all be present in the form shown in that email.
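As a sketch of the minimal report format described above (the field names come from the minutes; the label value, patch URL, and test text below are illustrative assumptions), a report body could be assembled like this:

```python
# Sketch of a minimal DPDK test report body, following the fields listed
# above. Only the field names come from the minutes; the label, URL, and
# test text here are illustrative assumptions.

def build_test_report(patch_url, label, status, test_text):
    """Assemble a plain-text test report with the required fields."""
    lines = [
        f"Test-Label: {label}",
        f"Test-Status: {status}",
        patch_url,
        "",
        test_text,
    ]
    return "\n".join(lines)

report = build_test_report(
    patch_url="https://patchwork.dpdk.org/patch/12345/",  # hypothetical
    label="my-lab-compile",  # lab-specific Test-Label (assumption)
    status="SUCCESS",        # SUCCESS or FAILURE
    test_text="Compilation OK on all tested distros",
)
```

The resulting string would then go into the body of the report email, with the patch subject line reused as the mail subject.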
* 6. What access level to the setup should we provide for the
community?
* [Aaron] If you run your own lab, the community generally doesn't
need access - but you will need to support in the case your lab has issues.
* 7. On what trigger, and how often, do we have to execute tests and send
reports?
* [Aaron] We run the tests on every patch that comes via patchwork.
* Ideally, the report will include the necessary output showing the
failure and help the developer start digging on how to fix their patch
* Currently, Napatech has run testing periodically, but not per
patch or per series
* 8. Do test reports need to be saved locally?
* Not a requirement, if enough information can be included in the
test report
* UNH does save some artifacts associated with the testrun, which can
be downloaded from the dashboard. In some cases these logs are much more
extensive than what is provided on the emailed test report.
* The ideal is to hold these for at least a release cycle
* Napatech has developed a DPDK PMD for their SmartNIC, which they
aim to upstream to DPDK
* Has run their custom DTS tests internally
* Looking for clarity on what is required for upstreaming
* Is CI testing a requirement for submitting a driver? A: No, code
review is sufficient, but CI which reports upstream will help both Napatech
and the community protect against any performance or functional
regressions or breakage
* Provide an RFC by Aug 12th, further review and development can come
afterwards
* What meetings should Napatech attend for submitting their driver?
* It would be typical for a presentation to be made to the tech board
meeting. But first, the driver should be submitted via the mailing list.
After that, someone will be invited to the tech board.
* Ideally someone will check in with the CI meeting periodically
* If Napatech didn’t want to report per patch, could we key off of RC-*
releases and have them simply report for those releases?
=====================================================================
CI Status
---------------------------------------------------------------------
UNH-IOL Community Lab
* Eal_misc_flags_autotest: As you may have noticed, the Community Lab CI is
failing a fast-test unit test on all new patchseries. The cause is a bnxt
driver commit (
https://git.dpdk.org/dpdk/commit/?id=97435d7906d7706e39e5c3dfefa5e09d7de7f733)
which introduces statics which are stored in the same address space as is
requested by the misc_flags_autotest.
* Ajit suggests simply changing the address space requested by the unit
test - which is fine with me
* Aaron has posted a patch making this change
* This is simply changing from one random address to another
random address, and hoping this issue never happens again. Is there a
better solution?
* How to handle in the short term? Modify meson fast-tests suite every
time?
* The most interesting part of this in terms of CI testing is: how did it
get through CI testing and into DPDK if it fails in our CI? The answer is
that when the patchseries hit patchwork, it was “missing” one of the
patches, which was added to patchwork hours later/the next day. Our process
does not currently account for this possibility (which is admittedly quite
rare): we simply tried to apply the series, failed because we were missing
one of the patches, and didn’t run the patchseries through our CI.
* From Patchwork’s API, each “series” object has a “received_all”
field. We will modify our process to account for this field: if it is
False, skip the series and try again later until it is True.
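A minimal sketch of that check (the "received_all" field name comes from the minutes; the series objects below are trimmed, illustrative shapes of what a real poll of the Patchwork REST API would return):

```python
# Sketch: skip Patchwork series that have not yet received all patches.
# A real poller would fetch series objects as JSON from the Patchwork REST
# API; here we only show the filtering step on already-fetched objects.

def ready_series(series_list):
    """Return only series whose patches have all arrived.

    Series objects carry a boolean "received_all" field; incomplete
    series are skipped and retried on a later polling pass.
    """
    return [s for s in series_list if s.get("received_all")]

# Illustrative series objects (trimmed to the relevant field).
polled = [
    {"id": 28973, "received_all": True},
    {"id": 28974, "received_all": False},  # one patch still missing
]
runnable = ready_series(polled)
```

The incomplete series is left in the queue rather than being treated as an apply failure, which is exactly the gap the bnxt series exposed.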
* Bnxt is going to avoid submitting huge patches in the future
* They created some static globals which “randomly” are assigned to a
specific memory area that overlaps with the one used by the unit test
* Apply failures and sanity check build failures are both reporting to
patchwork:
* Apply failure:
https://patchwork.dpdk.org/project/dpdk/patch/20230712173050.599482-1-sivaprasad.tummala@amd.com/
* Sanity check build:
https://patchwork.dpdk.org/project/dpdk/patch/20230713102659.230606-2-maxime.coquelin@redhat.com/
* Compression-Perf sample app
* We have dry-run this using the Zlib vdev for compression/decompression
operations, which provides results on the validity of the decompressed
output (compared to the input file) and also provides performance metrics
* For the zlib vdev, our thinking is for the testcase we should
pass/fail based on the validity of the output. We will report the
performance results, but these should not determine pass/fail for the vdev
* We may have the opportunity to run the compression-perf sample app
using hardware accelerators in the future. In that context, it would be
appropriate to condition pass/fail on a performance variance threshold.
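That pass/fail split can be sketched as follows (the threshold value and function shape are illustrative assumptions; the real run would come from the compression-perf sample app, whose output is only being classified here):

```python
# Sketch of the verdict policy above: for a software vdev (e.g. zlib),
# pass/fail is decided by output validity alone and performance is merely
# reported; for a hardware accelerator, a performance-variance threshold
# (the 10% value here is an illustrative assumption) could also gate.

def verdict(input_data, decompressed, gbps, baseline_gbps=None,
            max_drop=0.10):
    """Classify one compression-perf run as SUCCESS or FAILURE."""
    if decompressed != input_data:
        return "FAILURE"  # decompressed output must match the input file
    if baseline_gbps is not None and gbps < baseline_gbps * (1 - max_drop):
        return "FAILURE"  # hardware case: perf regressed past threshold
    return "SUCCESS"      # vdev case: validity alone decides

data = b"example payload" * 100
result = verdict(data, data, gbps=1.2)  # no baseline: vdev-style run
```

With a `baseline_gbps` supplied, the same classifier covers the future hardware-accelerator case without changing the validity check.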
* Linux kmods dpdk compile test has been broken for a few weeks, due to a
linux kernel API modification in v6.5. Ferruh submitted a patch fixing
build behavior for v6.5:
https://git.dpdk.org/dpdk/commit/?id=dd33d53b9a032d7376aa04a28a1235338e1fd78f
* Seen in daily periodic testing of main:
https://dpdkdashboard.iol.unh.edu/results/dashboard/tarballs/25540/
* NIC hardware refresh:
* Marvell board has arrived - still need details from them about
chassis, other components to pair with the board, and guidelines for how
they want to test DPDK on the hardware. They did share a l2fwd sample app
setup specific to the marvell board, which we could setup and “test,” but
presumably they will want to accomplish more on the board than this sample
app run.
* E810 (traffic generator) to pair with this board has arrived
* Need to acquire DAC cabling for this, this needs to be a specific
breakout cable (from discussions with Marvell), so we need to ensure the
correct breakout is ordered.
* Request an exact part number and UNH can probably order it
* 2 Intel E810s are in, DAC cabling has arrived; this NIC will be
installed in the to-be-donated Intel server
* Legal review continues for Intel’s server donation
* 1 of the 2 mlnx cx6 cards is in; the other is backordered
* DAC cables are here
* 2 mlnx cx7 cards are still on backorder
* DAC cables are here
* Intel QAT card for the arm server is ordered, but backordered right now
* No news from Ericsson about the embedded snow board
* Everything else is ordered
---------------------------------------------------------------------
Intel Lab
* Intel reps keep saying there is a risk of less testing in the near future
* The lab may be decreasing its operations
* What was done in the Intel lab that we can replicate?
* UNH should check the gaps between the testing they are doing and what
Intel has been doing, and see if/where coverage can be bridged
---------------------------------------------------------------------
Loongarch Lab
* None
---------------------------------------------------------------------
Github Actions
* No news
* Will return to retesting framework stuff going forward
=====================================================================
DTS Improvements & Test Development
* Smoke tests:
http://patches.dpdk.org/project/dpdk/patch/20230719201807.12708-3-jspewock@iol.unh.edu/
* traffic generator abstractions:
http://patches.dpdk.org/project/dpdk/list/?series=28973
* In future releases, changes will not be accepted this late in the release
cycle
* There is a slot for DTS in the DPDK Summit (at the end of the first day -
20 minutes)
* Juraj will be presenting remotely, Patrick will be there in person to
answer any questions
* Docstrings patch should be split into multiple parts. Agreement is still
needed on the format to be used for documentation.
* Extending smoke tests is one possibility
* Dynamically skipping and selecting testsuites according to system info is
one possible QOL feature
* It probably makes sense to record possible development in a spreadsheet
and vote on priorities
* Are we ready to start porting over some of the functional tests? This
should be technically possible, so we should explore this.
* Dockerfile patch for DTS will possibly be a part of 23.11 DTS roadmap -
Thomas will try to review this in August
* Can more DTS patches be merged “mid release” to make the DTS development
process more dynamic?
* This should be possible - make sure to utilize slack and CC Thomas
* Thomas will be the one to merge patches for DTS in the near future
* Can this responsibility be deferred to someone else? Thomas is
going to discuss this with the tech board at Dublin. In general Thomas
would like to recruit more subtree maintainers.
=====================================================================
Any other business
* Patchwork will be upgraded by Ali Alnubani on the first Sunday after the
23.07 release, which should be this Sunday
* Next meeting is August 3, 2023