DPDK community structure changes
* [dpdk-moving] proposal for DPDK CI improvement
From: Liu, Yong @ 2016-11-05  4:47 UTC
  To: moving

I'm Marvin, writing on behalf of the Intel STV team. As we are moving to the
Linux Foundation, this is a good opportunity to discuss how to enhance the
DPDK CI infrastructure.

Currently, DPDK CI is done in a distributed way: several companies run their
own CI tests internally. Some of them (Intel and IBM) provide their test
reports to the mailing list, but others keep their test results for internal
use only.

There are two possible approaches that we can consider for improving DPDK CI:

1. Create a centralized DPDK CI lab and build up the required infrastructure.
2. Continue with a distributed model but improve reporting and visibility.

We think the main advantages of a centralized approach are:
Transparency: Everybody can see and access the test infrastructure, see
              exactly how the servers are configured, what tests have been
              run, and their results. The community can review and agree
              collectively when new tests are required.
Flexibility:  Testing can be performed on demand. Instead of submitting a
              patch, having it tested by a distributed CI infrastructure and
              then getting the test results, a developer can access the CI
              infrastructure and trigger the tests manually before submitting
              the patch, thus speeding up the development process and
              shortening the test cycle.
Independence: Instead of each vendor providing their own performance results,
              having these generated in a centralized lab run by an 
              independent body will increase confidence that DPDK users have
              in the test results.

There is one example of how this was done for another project: fd.io CSIT
(https://wiki.fd.io/view/CSIT).

Their wiki page gives an idea of how the servers are configured and how test
cases are run on them. Test reports for all releases can be found on that
page, and you can browse the detailed report for each release by following
the links. Their Jenkins link shows the trend of the project status over
time.
              
The main disadvantage of a centralized approach is that relocating equipment
from separate vendor labs will require a project budget. The available budget
will determine which infrastructure can be deployed in the public test lab.

With a distributed model, we essentially continue as we are at present.
Vendors can independently choose the CI tests that they run, and the reports
that they choose to make public. We can add enhancements to Patchwork to
display test results and make tracking easier (a sketch of how a CI script
could attach results to a patch follows below), and we can also look at other
ways to make test reports more visible.
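
As an illustration of that Patchwork enhancement, here is a minimal sketch of
how a CI script might attach a pass/fail result to a patch. It assumes a
Patchwork version with the REST "checks" API enabled on patches.dpdk.org; the
token and the check context name are placeholders, not an existing DPDK CI
setup.

  import requests

  # Assumed values: endpoint, token and check "context" are placeholders.
  PATCHWORK_API = "https://patches.dpdk.org/api"
  API_TOKEN = "<ci-bot-token>"

  def report_check(patch_id, passed, log_url):
      """Attach a pass/fail 'check' to a patch so that the result is
      visible next to the patch in the Patchwork UI."""
      payload = {
          "state": "success" if passed else "fail",
          "context": "intel-compilation",        # hypothetical check name
          "description": "automated compilation test",
          "target_url": log_url,
      }
      resp = requests.post(
          "{}/patches/{}/checks/".format(PATCHWORK_API, patch_id),
          json=payload,
          headers={"Authorization": "Token " + API_TOKEN},
      )
      resp.raise_for_status()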

The main advantage of a distributed approach is that there is no requirement
for a project budget.

The disadvantages of a distributed approach are:
We lose the benefits of transparency, independence, and the ability to run
tests on demand that are described under the centralized approach above. CI
testing and the publication of the results remain under the control of
vendors (or others who choose to run CI tests).

Based on the above, we would like to propose the creation of a centralized CI lab.
Details of the required budget and funding for this will obviously need to be
determined, but for now our proposal will focus purely on the technical scope
and benefits.

------------------------------------------------------------------------------

At the moment, the hardware requirements below are identified only for Intel
platforms. We'd like to encourage input from other vendors on additional
hardware requirements to support their platforms.

The items below are prioritized so that we can determine the cut-off point
based on the final budget that is available for this activity.

  Daily Performance Test
    Scope: An l3fwd performance test will be run once per day on the master
    branch to identify any performance degradation. The test will use a
    software packet generator (could be pktgen or TRex).
  
    Test execution time:
      About 60 minutes for an RFC 2544 throughput search (see the sketch at
      the end of this section).
    Priority: 1
  
    Hardware Requirements: 
      For x86 this will require two 2U servers, one to run the packet
      generator and one to act as the device under test (DUT). 

      In order to make sure the test results are consistent from one run to
      the next, we recommend allocating dedicated servers. However, if the
      budget doesn't allow that many servers, we can optimize by sharing the
      performance test beds with other testing, such as the regression unit
      and build tests, to maximize the utilization of the infrastructure.
    
      Hardware requirements for other CPU architectures need to be determined
      by the relevant vendors.
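
    To illustrate why a single run takes about an hour, below is a minimal
    sketch of the zero-loss rate search that an RFC 2544 throughput test
    performs. The send_at_rate() helper standing in for the pktgen/TRex
    driver is hypothetical, and a real test would also iterate over several
    frame sizes.

      def send_at_rate(rate_mpps, duration_s=60):
          """Hypothetical driver for the software packet generator (pktgen
          or TRex): transmit at the given rate through the DUT for
          duration_s seconds and return the number of packets lost."""
          raise NotImplementedError("replace with pktgen/TRex control code")

      def rfc2544_throughput(line_rate_mpps, precision_mpps=0.01):
          """Binary-search the highest rate with zero loss (RFC 2544 style)."""
          lo, hi, best = 0.0, line_rate_mpps, 0.0
          while hi - lo > precision_mpps:
              rate = (lo + hi) / 2.0
              if send_at_rate(rate) == 0:   # no loss: try a higher rate
                  best, lo = rate, rate
              else:                         # loss observed: back off
                  hi = rate
          return best

      # With ~60 s per probe and around ten probes per frame size, a full
      # run over several frame sizes accounts for the quoted hour.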

  Automated Patch Compilation Test (Pre check-in Patch Testing):
    Scope: When developers submit patches to the dev@dpdk.org mailing list a
    back-end automation script will be triggered automatically to run a
    compilation test. 
    The results will be sent to the test-report@dpdk.org mailing list and to
    the patch author. To deliver a timely report for each patch, the automated
    patch compilation test only verifies the patch on a few OSVs, so it cannot
    achieve the same coverage as the daily compilation test on the master
    branch (a sketch of the trigger flow follows at the end of this section).

    Testing should ideally be performed on all supported CPU platforms (x86,
    ARM, Power 8, TILE-Gx etc.), but this will depend on which vendors are
    willing to contribute hardware platforms to the DPDK CI lab.
    Testing will be performed on multiple operating systems (Fedora, Ubuntu,
    RHEL etc.), kernel versions (the latest stable Linux kernel plus the
    default kernel in each OSV) and compiler versions (GCC/ICC/Clang).

    Test execution time: 
      5 mins per patch, average 30 min per patch set 
    Priority: 2

    Hardware Requirements: 
      For x86 this will require one dedicated 2U server. Because the tests
      will be run frequently (every time a new patch or patch set is
      submitted), it's not realistic to run this testing on shared servers. 
      Combinations of operating systems, kernels and compiler versions will be
      configured as separate VMs on the server.

    Hardware requirements for other CPU architectures need to be determined
    by the relevant vendors.
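
    As a rough illustration of the trigger flow described above, the sketch
    below fetches a patch as an mbox and builds it in one VM per OSV. The
    OSV list and the build_in_vm() helper are hypothetical placeholders;
    only the report destination (test-report@dpdk.org and the patch author)
    comes from the proposal.

      import tempfile
      import urllib.request

      # The "few OSVs" used per patch; the exact list is an assumption.
      TEST_OSVS = ["ubuntu-16.04", "fedora-24", "rhel-7.3"]

      def build_in_vm(osv, mbox_path):
          """Hypothetical helper: apply the patch from mbox_path inside the
          VM for the named OSV, build it, and return (passed, log_text)."""
          raise NotImplementedError

      def test_patch(patch_mbox_url):
          """Fetch a patch as an mbox, build it on each OSV VM and return a
          summary to mail to test-report@dpdk.org and the patch author."""
          with tempfile.NamedTemporaryFile(suffix=".mbox") as mbox:
              urllib.request.urlretrieve(patch_mbox_url, mbox.name)
              results = {osv: build_in_vm(osv, mbox.name) for osv in TEST_OSVS}
          return "\n".join(
              "{}: {}".format(osv, "PASS" if ok else "FAIL")
              for osv, (ok, _log) in results.items())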
 
  Automated Daily Compilation Test:
    Scope: This is similar to the previous item but run once per day on the
    master branch and on the latest stable/LTS release branches.
    Since code merges can introduce issues that break compilation of the
    master branch, the Automated Daily Compilation Test is used to monitor
    for and catch this kind of issue. In general, it will verify the latest
    branch with 3 compilers (ICC/GCC/Clang) and 4 build options on each
    mainstream OSV (a sketch of this build matrix follows at the end of this
    section). Currently, the Intel daily build test is performed on almost
    14 OSVs with different Linux kernels, covering all the mainstream
    operating systems, including Ubuntu, Red Hat, Fedora, SUSE, CentOS,
    Wind River, FreeBSD and MontaVista Linux.
    
    Test execution time: 
      30 minutes per platform. Build testing on different OSVs can be
      performed in parallel.
    Priority:  3
    
    Hardware Requirements:
      For x86 this will require one 2U server. Because the tests will be run
      at the same time every day, they could be scheduled to run on a shared
      server (approximate current test duration is ~2 hours on 2.6GHz CPU with
      64GB RAM). Combinations of operating systems, kernels and compiler
      versions will be configured as separate VMs on the server.
    
    Hardware requirements for other CPU architectures need to be determined by
    the relevant vendors.
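
    A minimal sketch of the daily build matrix described above: the three
    compilers come from the proposal, while the four build options and the
    per-combination build script are only illustrative assumptions, since
    the exact option set would be agreed by the community.

      import itertools
      import subprocess

      COMPILERS = ["gcc", "clang", "icc"]
      # Four illustrative build variants; placeholders, not a fixed list.
      BUILD_OPTIONS = ["default", "shared-lib", "debug", "minimal"]

      def daily_build(workdir):
          """Build every compiler/option combination and collect failures."""
          failures = []
          for cc, opt in itertools.product(COMPILERS, BUILD_OPTIONS):
              cmd = ["./ci/build-one.sh", cc, opt]   # hypothetical wrapper
              if subprocess.run(cmd, cwd=workdir).returncode != 0:
                  failures.append((cc, opt))
          return failures

      # Run once per day in each OSV VM; different OSVs can build in
      # parallel, keeping wall-clock time near the ~30 minutes per platform
      # quoted above.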
 
  Regression Unit Test:
    Scope: Unit tests will be run once per day on the master branch and on the
    latest stable/LTS release branches with one mainstream operating system.
    
    Test execution time:
      2 hours to complete all automated unit tests (see the sketch at the
      end of this section).
    Priority: 4

    Hardware Requirements: For x86 this will require one 2U server.
      Because the tests will be run at the same time every day, they could be
      scheduled to run on a shared server (approximate current test duration
      is ~1 hour on 2.6GHz CPU with 64GB RAM).  
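
    A minimal sketch of how the daily run could drive DPDK's unit tests. The
    path to the test application, the EAL arguments and the way test names
    are fed on stdin are assumptions about the app/test binary rather than a
    documented interface; only a small sample of autotests is shown.

      import subprocess

      TEST_BINARY = "./build/app/test"   # assumed path to the test app
      # Sample autotest names; the daily run would cover the full set.
      UNIT_TESTS = ["ring_autotest", "mempool_autotest", "hash_autotest"]

      def run_unit_tests():
          results = {}
          for name in UNIT_TESTS:
              # Assumption: the test app reads test names on stdin and
              # prints "Test OK" on success.
              proc = subprocess.run(
                  [TEST_BINARY, "-c", "0x3", "-n", "4"],
                  input=name + "\nquit\n",
                  capture_output=True, text=True, timeout=600)
              results[name] = "Test OK" in proc.stdout
          return results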

  Regression Function Test:
    Since this functional test depends on NIC features, it is difficult to
    define a standard test plan and cases that apply to all platforms. This
    test can be executed in the distributed labs. After platform owners
    complete their testing, they can provide reports and test plans to the
    maintainer for review.
