Re: [dpdk-moving] proposal for DPDK CI improvement

From: "Xu, Qian Q" <qian.q.xu@intel.com>
To: Thomas Monjalon <thomas.monjalon@6wind.com>,
	"moving@dpdk.org" <moving@dpdk.org>,
	"Liu, Yong" <yong.liu@intel.com>
Subject: Re: [dpdk-moving] proposal for DPDK CI improvement
Date: Mon, 7 Nov 2016 07:55:00 +0000
Message-ID: <82F45D86ADE5454A95A89742C8D1410E3923B784@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <3804736.OkjAMiHs6v@xps13>

I think the discussion about CI is a good start. I agree with the general ideas:
1. It's good to have more contributors to CI; it is a community effort.
2. Building a distributed CI system is good and necessary.
3. "When and Where" are the basic and most important questions.

Adding my 2 cents here.
1. Distributed tests vs. centralized lab
We can run the build and functional tests in the distributed labs. Performance, as we all know, is key to DPDK,
so I suggest a centralized lab for performance testing. Some comments and questions:
a). Do we want to publish performance reports for different platforms with different HW/NICs? Is anyone against publishing performance numbers?
b). If the answer to the first question is "Yes", how do we ensure that others trust the numbers, and how can they reproduce them without the same platforms/HW?
As Marvin said, transparency and independence are the advantages of an open centralized lab. Besides, the lab lets us demonstrate DPDK performance to any audience.
Of course, we need to control the system and not let others access it at will; access control is a separate topic. I even think the lab could be used as
a training or demo lab when we hold community training or performance demo days (I just made up the event names).

2. Besides "When and Where", there are also "What" and "How"
When:
	- regularly on a git tree: which tests should run here? I propose a daily build, daily functional regression and daily performance regression.
	- after each patch submission -> report available via patchwork: which tests should run here? A build test first; functional or performance tests could be added in future.

How to collect and display the results?
Thanks, Thomas, for the hard work on the patchwork upgrade. It's good to see the checkpatch results displayed there.
IMHO, building the complete distributed system needs a very big effort. Thomas, do you have an effort estimate and a schedule for it?
a). Currently there are only "S/W/F for Success/Warning/Fail counters" for tests. Do they refer to build tests, functional tests or performance tests?
If they refer only to build tests, the title may need to change to "Build S/W/F". Then how many architectures or platforms are covered by the builds? For example, we support Intel IA builds,
ARM builds and IBM Power builds, so we may need to collect build results from Intel, IBM, ARM, etc. to show the total S/W/F. For example, if the build passes on IA but fails on IBM, then we
need to record it as 1S/0W/1F. I don't know whether we need to collect the warning information here.
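
As a rough illustration of the counting rule above, the sketch below folds per-platform build statuses into one S/W/F string. The status labels, platform names and the `summarize` helper are assumptions for illustration; this is not how patchwork computes its counters.

```python
from collections import Counter

# Hypothetical status labels; the thread only names the S/W/F counters.
SUCCESS, WARNING, FAIL = "S", "W", "F"


def summarize(results):
    """Fold a {platform: status} mapping into a single 'xS/yW/zF' string."""
    counts = Counter(results.values())
    # Counter returns 0 for statuses that never occurred.
    return f"{counts[SUCCESS]}S/{counts[WARNING]}W/{counts[FAIL]}F"


# The example from the text: the build passes on IA but fails on IBM Power.
build_results = {"IA": SUCCESS, "IBM Power": FAIL}
print(summarize(build_results))  # 1S/0W/1F
```

Keying the results by platform keeps the per-architecture detail available even when only the totals are displayed.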

b). How about displaying performance results on the website? Whether the lab is distributed or centralized, we need a place to show the performance numbers or the performance trend to
ensure there is no performance regression. Do you have any plan to implement it?
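
One possible way to turn a performance trend into a regression check is sketched below: compare the current run against a baseline and flag any test whose throughput dropped by more than a tolerance. The test names, the numbers and the 5% threshold are placeholders chosen for illustration, not measured results or project policy.

```python
def find_regressions(baseline, current, tolerance=0.05):
    """Return {test: relative_drop} for tests that regressed beyond `tolerance`.

    `baseline` and `current` map a test name to a throughput value where
    higher is better. Tests missing from `current` are skipped.
    """
    regressions = {}
    for name, base in baseline.items():
        now = current.get(name)
        if now is None or base <= 0:
            continue  # nothing comparable to check
        drop = (base - now) / base
        if drop > tolerance:
            regressions[name] = drop
    return regressions


# Placeholder numbers, for illustration only.
baseline = {"fwd_64B": 10.0, "fwd_1518B": 20.0}
current = {"fwd_64B": 9.0, "fwd_1518B": 19.8}
print(find_regressions(baseline, current))
```

A real lab would probably also need to account for run-to-run noise, for example by averaging several runs before comparing.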

3. I propose a CI mailing list, and regular meetings dedicated to CI for the people working on it. Maybe we can meet more frequently at first to get aligned, then
reduce the frequency once the solution has settled down. The current call covers many other topics. What do you think?

Thanks. Any comments are welcome.

-----Original Message-----
From: moving [mailto:moving-bounces@dpdk.org] On Behalf Of Thomas Monjalon
Sent: Sunday, November 6, 2016 3:15 AM
To: moving@dpdk.org; Liu, Yong <yong.liu@intel.com>
Subject: Re: [dpdk-moving] proposal for DPDK CI improvement

2016-11-05 04:47, Liu, Yong:
> Currently, DPDK CI is done in a distributed way. Several companies 
> run their own CI tests internally. Some of them (Intel and IBM) provide their 
> test reports to the mailing list, but others keep their test results for internal use only.

I'm confident we'll have more contributors to the distributed CI once it is well advertised (see below).

> There are two possible approaches that we can consider for improving DPDK CI:
> 
> 1. Create a centralized DPDK CI lab and buildup required infrastructure.
> 2. Continue with a distributed model but improve reporting and visibility.

I think these two approaches are good:
1. The centralized open lab can help as a reference
2. The distributed CI instances will bring more diversity

> We think the main advantages of a centralized approach are:
> Transparency: Everybody can see and access the test infrastructure, see
>               exactly how the servers are configured and what tests have been
>               run and their result. The community can review and agree
>               collectively when new tests are required.
> Flexibility:  Testing can be performed on demand. Instead of submitting
>               a patch, having it tested by a distributed CI infrastructure
>               and then getting test results, the developer can
>               access the CI infrastructure and trigger the tests manually
>               before submitting the patch, thus speeding up the development
>               process and shortening the test cycle.

It is possible to offer such flexibility in private CI labs by offering an email address where we can send some patches to be tested.
However, there can be an issue of hardware bandwidth/availability to solve.
This issue is the same for open/centralized labs and private labs.
A test lab accepting any private request can be abused. That's why I think requiring patches to be sent publicly to the mailing list is a good policy.

> Independence: Instead of each vendor providing their own performance results,
>               having these generated in a centralized lab run by an 
>               independent body will increase confidence that DPDK users have
>               in the test results.
> 
> There is one example of how this was done for another project.
> (https://wiki.fd.io/view/CSIT).
> 
> In their wiki page, you can get an idea of how to configure the 
> servers and run test cases on these servers. The test reports for all 
> releases can be found on the page. You can also browse the detailed test 
> report for each release if you click the link. If you click their Jenkins 
> link, you can see the trend of project status.

I do not see an explanation of how to use the CSIT lab on demand (what you described as "Flexibility"). How does it work?

> The main disadvantage of a centralized approach is that relocating 
> equipment from separate vendor labs will require a project budget. We 
> can depend on budget to decide which infrastructure should be deployed 
> in the public test lab.
> 
> For the distributed model, we essentially continue as we are at present.
> Vendors can independently choose the CI tests that they run, and the 
> reports that they choose to make public. We can add enhancements to 
> Patchwork to display test results to make tracking easier, and can 
> also look at other ways to make test reports more visible.

Yes, I'm working on it.
One year ago, we discussed a CI integration in patchwork:
	https://lists.ozlabs.org/pipermail/patchwork/2015-July/001363.html
It is now implemented in patchwork and available on dpdk.org:
	http://dpdk.org/ml/archives/dev/2016-September/046282.html
A first basic test (checkpatch) is integrated. See this example:
	http://dpdk.org/patch/16953
The detailed report in the test-report mailing list archives is referenced with a hyperlink (in the "Description" column).
The next step (work in progress) is to publish some scripts in a new git repository, dpdk-ci, to help integrate more test labs in patchwork.
Basically, a lab only needs to be able to receive and send some emails.
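
To give an idea of what "send some emails from the lab" could look like, here is a minimal sketch that builds a test report as a reply to the patch email, so that it can be matched to the patch by its Message-ID. The addresses, the subject layout and the `build_report` helper are assumptions for illustration; they are not the format expected by patchwork or by the dpdk-ci scripts.

```python
from email.message import EmailMessage


def build_report(patch_msgid, test_name, status, details,
                 sender="lab@example.org", to="test-report@example.org"):
    """Build a report email that threads under the tested patch.

    All addresses and the subject layout are placeholders.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Subject"] = f"[{test_name}] {status}"
    # Reference the patch email so the report can be linked to it.
    msg["In-Reply-To"] = patch_msgid
    msg["References"] = patch_msgid
    msg.set_content(details)
    return msg


report = build_report("<patch-id@example.org>", "build", "FAIL",
                      "Build failed on one platform; see log excerpt.")
print(report["Subject"])
```

Sending the message (for example with `smtplib`) is left out, since it depends on each lab's mail setup.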

Note that any open or centralized lab can also be integrated in patchwork.

Note also that this patchwork integration covers only the tests run when a patch is submitted. The tests run regularly on a git tree won't appear in this interface.

> The main advantages of a distributed approach are:
> There's no requirement for a project budget.

The other major advantage is better test coverage.
Many companies need to have some internal DPDK tests for their needs.
If they use them to provide some public reports, they can avoid having some regressions with their specific use cases or hardware.
That's why the distributed CI approach is a win-win.

> The disadvantages of a distributed approach are:
> We lose the benefits of transparency, independence and the ability to 
> run tests on demand that are described under the centralized approach 
> above. CI testing and the publication of the results remains under the 
> control of vendors (or others who choose to run CI tests).
> 
> Based on the above, we would like to propose a centralized CI lab.
> Details of the required budget and funding for this will obviously 
> need to be determined, but for now our proposal will focus purely on 
> the technical scope and benefits.

Thanks for the detailed description of the tests that you expect.
I think the CI discussion must be framed around two major questions:
	When? and Where?
When:
	- regularly on a git tree
	- after each patch submission -> report available via patchwork
Where:
	- in a private lab
	- in a foundation lab
Both private and foundation labs can be more or less open.

My conclusion: every kind of test has some benefits and is welcome!
