From: Thomas Monjalon
To: ci@dpdk.org
Date: Mon, 07 Nov 2016 10:50:31 +0100
Subject: [dpdk-ci] Fwd: [dpdk-moving] proposal for DPDK CI improvement

From: Liu, Yong
To: moving@dpdk.org

I'm Marvin, writing on behalf of the Intel STV team. As we are moving to the Linux Foundation, this is a good opportunity to discuss how to enhance the DPDK CI infrastructure.

Currently, DPDK CI is done in a distributed way: several companies run their own CI tests internally. Some of them (Intel and IBM) provide their test reports to the mailing list, but others keep their test results for internal use only.

There are two possible approaches that we can consider for improving DPDK CI:

1. Create a centralized DPDK CI lab and build up the required infrastructure.
2. Continue with a distributed model but improve reporting and visibility.

We think the main advantages of a centralized approach are:

Transparency: Everybody can see and access the test infrastructure: exactly how the servers are configured, which tests have been run and what their results were.
The community can review and agree collectively when new tests are required.

Flexibility: Testing can be performed on demand. Instead of submitting a patch, having it tested by a distributed CI infrastructure and then waiting for the results, a developer can access the CI infrastructure and trigger the tests manually before submitting the patch, which speeds up the development process and shortens the test cycle.

Independence: Instead of each vendor providing their own performance results, having these generated in a centralized lab run by an independent body will increase the confidence that DPDK users have in the test results.

There is an example of how this was done for another project (https://wiki.fd.io/view/CSIT). Their wiki page shows how the servers are configured and how test cases are run on them. Test reports for all releases can be found on that page, and the detailed report for each release can be browsed via the links. Their Jenkins link shows the trend of the project's status.

The main disadvantage of a centralized approach is that relocating equipment from separate vendor labs will require a project budget. The available budget will determine which infrastructure can be deployed in the public test lab.

For the distributed model, we essentially continue as we are at present. Vendors can independently choose the CI tests that they run and the reports that they choose to make public. We can add enhancements to Patchwork to display test results and make tracking easier, and can also look at other ways to make test reports more visible.

The main advantage of a distributed approach is that there is no requirement for a project budget.

The disadvantages of a distributed approach are that we lose the benefits of transparency, independence and on-demand testing described under the centralized approach above, and that CI testing and the publication of its results remain under the control of vendors (or others who choose to run CI tests).

Based on the above, we would like to propose a centralized CI lab. Details of the required budget and funding will obviously need to be determined, but for now our proposal focuses purely on the technical scope and benefits.

------------------------------------------------------------------------------

At the moment, the hardware requirements below are identified only for Intel platforms. We'd like to encourage input from other vendors on additional hardware requirements to support their platforms. The items below are prioritized so that we can determine the cut-off point based on the final budget that is available for this activity.

Daily Performance Test:

Scope: An l3fwd performance test will be run once per day on the master branch to identify any performance degradation. The test will use a software packet generator (could be pktgen or TRex); a rough sketch of the RFC 2544 rate search is included after this section.

Test execution time: about 60 mins for RFC 2544.

Priority: 1

Hardware Requirements: For x86 this will require two 2U servers, one to run the packet generator and one to act as the device under test (DUT). To keep the results consistent from one run to the next, we recommend allocating dedicated servers. If the budget doesn't allow that many servers, the performance test beds can be shared with other testing such as the regression unit and build tests to maximize the utilization of the infrastructure.

Hardware requirements for other CPU architectures need to be determined by the relevant vendors.
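To make the daily performance test concrete, below is a rough, hypothetical sketch of the RFC 2544 zero-loss throughput search as it could be scripted in the lab. The simulated generator hook, rates and constants are placeholders for illustration only; a real run would drive pktgen or TRex through their control interfaces and read the DUT's port counters.

#!/usr/bin/env python3
"""Sketch of an RFC 2544 zero-loss throughput search for the daily l3fwd run.

The generator hook below is a stand-in; a real run would transmit through
pktgen or TRex and return the actual TX/RX counters.
"""

TRIAL_SECONDS = 60      # duration of each trial at a candidate rate
RESOLUTION = 0.5        # stop when the search window is narrower than this (% of line rate)


def simulated_generator(rate_percent, duration):
    """Stand-in generator: pretend the DUT starts dropping packets above 74%."""
    tx = int(rate_percent * 1_000_000)
    rx = tx if rate_percent <= 74.0 else int(tx * 0.999)
    return tx, rx


def rfc2544_throughput(send_at_rate, lo=0.0, hi=100.0):
    """Binary search for the highest rate (% of line rate) with zero packet loss."""
    best = 0.0
    while hi - lo > RESOLUTION:
        rate = (lo + hi) / 2.0
        tx, rx = send_at_rate(rate, TRIAL_SECONDS)
        if rx == tx:        # zero loss at this rate: search higher
            best = rate
            lo = rate
        else:               # loss observed: search lower
            hi = rate
        print(f"rate={rate:5.1f}%  tx={tx}  rx={rx}  loss={tx - rx}")
    return best


if __name__ == "__main__":
    result = rfc2544_throughput(simulated_generator)
    print(f"RFC 2544 zero-loss throughput: {result:.1f}% of line rate")

The same loop would be repeated per frame size (64B, 128B, ... 1518B), which is why a full RFC 2544 sweep takes on the order of an hour.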
Automated Patch Compilation Test (Pre Check-in Patch Testing):

Scope: When developers submit patches to the dev@dpdk.org mailing list, a back-end automation script will be triggered automatically to run a compilation test. The results will be sent to the test-report@dpdk.org mailing list and to the patch author. To deliver a timely report for each patch, the automated patch compilation test only verifies the patch on a few OSVs, so it cannot achieve the same coverage as the daily compilation test on the master branch. Testing should ideally be performed on all supported CPU platforms (x86, ARM, POWER8, TILE-Gx etc.), but this will depend on which vendors are willing to contribute hardware platforms to the DPDK CI lab. Testing will be performed on multiple operating systems (Fedora, Ubuntu, RHEL etc.), kernel versions (latest stable Linux kernel plus the default kernel of each OSV) and compiler versions (GCC/ICC/Clang).

Test execution time: 5 mins per patch, on average 30 mins per patch set.

Priority: 2

Hardware Requirements: For x86 this will require one dedicated 2U server. Because the tests will be run frequently (every time a new patch or patch set is submitted), it's not realistic to run this testing on shared servers. Combinations of operating systems, kernels and compiler versions will be configured as separate VMs on the server. Hardware requirements for other CPU architectures need to be determined by the relevant vendors.

Automated Daily Compilation Test:

Scope: This is similar to the previous item but run once per day on the master branch and on the latest stable/LTS release branches. Code merging can break compilation on the master branch; the automated daily compilation test is used to monitor for and catch this kind of issue. In general, it will verify the latest branch with 3 compilers (ICC/GCC/Clang) and 4 build options on each mainstream OSV (a sketch of such a build matrix follows this section). Currently, the Intel daily build test is performed on about 14 OSVs with different Linux kernels, covering the mainstream operating systems: Ubuntu, Red Hat, Fedora, SUSE, CentOS, Wind River, FreeBSD and MontaVista Linux.

Test execution time: 30 mins per platform; build testing on different OSVs can be performed in parallel.

Priority: 3

Hardware Requirements: For x86 this will require one 2U server. Because the tests will be run at the same time every day, they could be scheduled to run on a shared server (approximate current test duration is ~2 hours on a 2.6GHz CPU with 64GB RAM). Combinations of operating systems, kernels and compiler versions will be configured as separate VMs on the server. Hardware requirements for other CPU architectures need to be determined by the relevant vendors.
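As an illustration of the compile matrix (and, with a reduced matrix, of the per-patch compilation test as well), here is a minimal hypothetical sketch of a build loop over compilers and build options. The checkout path, config targets and option list are assumptions, not an existing DPDK CI script; the real job would use the OSV/compiler/build-option combinations agreed by the community and mail a summary to test-report@dpdk.org.

#!/usr/bin/env python3
"""Sketch of a daily compile-matrix run: compilers x build options.

Paths, config targets and options below are illustrative assumptions only.
"""
import subprocess

DPDK_DIR = "/opt/ci/dpdk"      # assumed local checkout of dpdk.git (master or stable)

# One legacy-style "make config" target per compiler (illustrative).
COMPILER_TARGETS = [
    "x86_64-native-linuxapp-gcc",
    "x86_64-native-linuxapp-clang",
    "x86_64-native-linuxapp-icc",
]

# Example build flavours; how each option is applied depends on the DPDK
# build system version deployed in the lab.
BUILD_OPTIONS = ["", "EXTRA_CFLAGS=-O1", "EXTRA_CFLAGS=-g", "EXTRA_CFLAGS=-Werror"]


def build_one(target, option):
    """Configure and build one compiler/option combination; True on success."""
    cfg = subprocess.run(["make", "config", f"T={target}"], cwd=DPDK_DIR,
                         stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    if cfg.returncode != 0:
        return False
    cmd = ["make", "-j4"] + ([option] if option else [])
    bld = subprocess.run(cmd, cwd=DPDK_DIR,
                         stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return bld.returncode == 0


if __name__ == "__main__":
    failures = []
    for target in COMPILER_TARGETS:
        for option in BUILD_OPTIONS:
            ok = build_one(target, option)
            print(f"{'PASS' if ok else 'FAIL'}  {target}  {option or '(default)'}")
            if not ok:
                failures.append((target, option))
    # A real job would format this summary as a report mail per patch or per day.
    print(f"{len(failures)} failing combination(s) out of "
          f"{len(COMPILER_TARGETS) * len(BUILD_OPTIONS)}")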
Regression Unit Test:

Scope: Unit tests will be run once per day on the master branch and on the latest stable/LTS release branches with one mainstream operating system (a minimal runner sketch appears at the end of this note).

Test execution time: 2 hours to complete all automated unit tests.

Priority: 4

Hardware Requirements: For x86 this will require one 2U server. Because the tests will be run at the same time every day, they could be scheduled to run on a shared server (approximate current test duration is ~1 hour on a 2.6GHz CPU with 64GB RAM).

Regression Function Test:

Scope: Since this functional test depends on NIC features, it is difficult to define a standard test plan and test cases across different platforms, so this test can be executed in the distributed labs. After platform owners complete testing, they can provide their reports and test plans to the maintainer for review.
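Finally, purely as an illustration of how the daily unit-test job could be scheduled, here is a rough sketch that checks out each branch, builds it and runs the unit-test application under a timeout. The branch names, build target and test binary path are assumptions for illustration; a real job would follow the build and autotest procedure documented for the DPDK version under test.

#!/usr/bin/env python3
"""Sketch of a daily unit-test run over master and stable/LTS branches.

Branch names, build commands and the test binary location are assumptions.
"""
import subprocess

REPO = "/opt/ci/dpdk"               # assumed local clone of dpdk.git
BRANCHES = ["master", "stable"]     # illustrative branch list
TEST_TIMEOUT = 2 * 60 * 60          # the full unit-test run is expected to take ~2 hours


def run(cmd, timeout=None):
    """Run a command inside the repository; return True on success."""
    try:
        result = subprocess.run(cmd, cwd=REPO, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def test_branch(branch):
    """Check out, build and unit-test one branch."""
    steps = [
        ["git", "checkout", branch],
        ["git", "pull", "--ff-only"],
        ["make", "config", "T=x86_64-native-linuxapp-gcc"],   # assumed build target
        ["make", "-j4"],
    ]
    for step in steps:
        if not run(step):
            return False
    # Assumed path of the built test application; in practice the run would be
    # driven through DPDK's autotest tooling rather than invoked directly.
    return run(["./build/app/test"], timeout=TEST_TIMEOUT)


if __name__ == "__main__":
    for branch in BRANCHES:
        print(f"{branch}: {'PASS' if test_branch(branch) else 'FAIL'}")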