* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-02 18:39 ` Brandon Lo
@ 2020-04-03 1:14 ` Ma, LihongX
2020-04-15 9:39 ` Ma, LihongX
2020-04-28 3:39 ` Ma, LihongX
2 siblings, 0 replies; 42+ messages in thread
From: Ma, LihongX @ 2020-04-03 1:14 UTC (permalink / raw)
To: Brandon Lo
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim
[-- Attachment #1: Type: text/plain, Size: 14644 bytes --]
Got it. Thanks Brandon.
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Friday, April 3, 2020 2:39 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I have changed the baselines to reflect the new expected values.
The performance tests should work as expected and pass.
We will email again in the future if we come across any problems.
Feel free to email us as well if you would like to make any other changes.
Thank you for all your help
On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
Thanks for your recommendations; I have made the changes.
The throughput value of nic_single_core is proportional to the CPU frequency,
so I recommend changing the baseline according to our report system.
On our 2.50GHz system, the baseline values are as below:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562
64        2048     41.439
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608
64        2048     47.73
The testbed at UNH is a 2.1GHz CPU server, so the expected numbers should be:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562 / 2.5 * 2.1 = 44.152
64        2048     41.439 / 2.5 * 2.1 = 34.809
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608 / 2.5 * 2.1 = 50.071
64        2048     47.73 / 2.5 * 2.1 = 40.093
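The frequency scaling above can be written as a small helper. This is only a sketch under the assumption stated in the message (throughput scales linearly with CPU frequency); the function and dictionary names are hypothetical and are not part of DTS or the lab CI.

```python
# Reference values from the 2.50GHz system, keyed by (NIC, pkt_size, trd/rxd).
REFERENCE_BASELINES = {
    ("NNT", 64, 512): 52.562,
    ("NNT", 64, 2048): 41.439,
    ("FVL", 64, 512): 59.608,
    ("FVL", 64, 2048): 47.73,
}


def scale_baseline(reference_value, reference_ghz, target_ghz):
    """Scale a throughput baseline linearly with CPU frequency (assumed model)."""
    return round(reference_value / reference_ghz * target_ghz, 3)


def scaled_baselines(reference=REFERENCE_BASELINES, reference_ghz=2.5, target_ghz=2.1):
    """Return the expected values for a server running at target_ghz."""
    return {
        key: scale_baseline(value, reference_ghz, target_ghz)
        for key, value in reference.items()
    }


if __name__ == "__main__":
    for (nic, pkt_size, ring_size), value in scaled_baselines().items():
        print(f"{nic} pkt_size={pkt_size} trd/rxd={ring_size} expected={value}")
```

Keeping the reference values and the two frequencies as explicit inputs makes it easy to recompute the table if the testbed CPU changes.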
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Tuesday, March 31, 2020 9:42 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
To make changes to either Intel machine, please reboot using the command "reboot_to_rw" as root to reboot the machine into read/write mode.
This command will also disable any testing on the machine.
To re-enable the machine, please run "reboot_to_ro" as root, and it will save all of the changes that you've made and re-enable testing on the machine.
I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro" instead of the normal "reboot" while you're making changes.
After you're done, please let me know. I'll have to manually run a test and update the baseline using our internal CI.
Thank you
On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Please let me know how to make changes to this machine that gets reset (IP/access...) and how to disable it.
After that, please help to change the baseline.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Thursday, March 26, 2020 11:39 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, we have a system in place that resets any changes made while testing is enabled for a machine.
If you would like, I can disable testing and allow you to make permanent changes.
I can also reset the baseline of Intel 10G test performance once you make these changes.
Please let me know if you would like to make permanent changes on the Intel 10G so I can disable it for you.
Thanks
On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Thanks. Brandon.
That’s good. We have made changes on the 10G testbed.
I monitored several execution results and found that the 10G results always have a -0.9% to -1.x% gap against the expected number, so it could lead to occasional failures (+-1%). I suggest adjusting the expected number. I don’t know where the expected number comes from. As far as I know, it is a dynamic number that depends on the baseline? Please help to clarify, thanks.
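To illustrate why a steady gap of about -0.9% to -1.x% produces intermittent failures, here is a minimal sketch of a relative tolerance check. It assumes the pass criterion is a +-1% band around the expected number, as the message suggests; the function names and the sample values are hypothetical and are not taken from the lab CI.

```python
def relative_gap_percent(measured, expected):
    """Signed gap of the measured value against the expected value, in percent."""
    return (measured - expected) / expected * 100.0


def within_tolerance(measured, expected, tolerance_percent=1.0):
    """True if the measured value lies inside the +-tolerance band (assumed rule)."""
    return abs(relative_gap_percent(measured, expected)) <= tolerance_percent


if __name__ == "__main__":
    expected = 100.0  # illustrative value only
    for measured in (99.1, 98.8):
        gap = relative_gap_percent(measured, expected)
        print(f"measured={measured} gap={gap:.2f}% pass={within_tolerance(measured, expected)}")
```

A run sitting at -0.9% passes while a run at -1.2% fails, so results that hover around the edge of the band flip between the two.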
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 24, 2020 9:31 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
I have enabled the 10G Intel machine for testing.
If you would like to make any more changes, please let me know so I can perform the necessary steps to prepare the machine for changes.
Please feel free to let me know if you need anything.
Thank you
On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
For 10G, please enable it. Our code is at the original path /opt/test-harness/dts.
For 40G, please keep it running and see if there is any issue. In any case, we have modified the DTS code at /opt/test-harness/dts-new-suite. If we meet the same problem, then use this new DTS instead.
Thanks a lot
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Saturday, March 21, 2020 1:49 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, the 40G machine is considered stable enough to be on the production dashboard, so it is running tests, which may be what caused Trex to be killed.
Should I disable the 40G Intel machine for you to make changes?
Also, just for confirmation: on the 10G machine, is the folder that you are using for the testing located in /opt/test-harness/dts-2020-3-4, or are you still using the one in the standard /opt/test-harness/dts folder?
If everything is ok, I will enable the 10G machine for production testing.
Thank you very much
On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Brandon,
We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G). Could you please help to recover them?
But for the FVL (40G) testbed we met some problems; could you please help to check before recovering it:
* Sometimes the 1G hugepages are changed to 2M hugepages automatically, and we have to restart the system.
* While we were debugging on the testbed, we found that Trex was killed by someone (some app).
Please help to check whether any other program is running on the testbed.
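One way to notice the hugepage size change described above is to read the `Hugepagesize` field of /proc/meminfo, which reports the default hugepage size in kB. The sketch below assumes the testbed relies on the default hugepage size; the function names are hypothetical.

```python
ONE_GIB_IN_KB = 1048576  # 1G hugepage expressed in kB


def default_hugepage_kb(meminfo_text):
    """Return the default hugepage size in kB from /proc/meminfo content, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("Hugepagesize:"):
            return int(line.split()[1])
    return None


def read_default_hugepage_kb(path="/proc/meminfo"):
    """Read the default hugepage size from a live Linux system."""
    with open(path) as meminfo:
        return default_hugepage_kb(meminfo.read())


def is_1g_hugepage(meminfo_text):
    """True if the default hugepage size is 1G."""
    return default_hugepage_kb(meminfo_text) == ONE_GIB_IN_KB
```

Running such a check before each test would show whether the machine has fallen back to 2M pages without waiting for a test failure.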
Thanks a lot.
Regards,
Zhaoyan Chen
From: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Sent: Wednesday, March 18, 2020 9:04 PM
To: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Brandon, we have almost finished a workaround.
Maybe tomorrow you could recover Intel’s testbed. I will let you know soon.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Wednesday, March 18, 2020 3:34 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Have you finished making changes on the Intel machine?
I will turn on the machine on March 3rd for testing if you do not have any issues with it.
Please let me know if you need anything else.
Thanks
On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Yes, it’s a weird issue. It is also mixed up with our DTS upgrade and Trex upgrade.
So we are reviewing our DTS script, the different Trex versions, and the CI calling procedure.
Anyway, we are currently focusing on this task and will let you know of any update.
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 10, 2020 10:46 PM
To: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
What is the current status of the Intel 82599ES?
Were there any configuration changes made to fix performance issues?
Thanks
On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
Hi David,
This was just a weird issue with the packet generator not cleaning itself after a test fast enough before another test.
I'll rerun the tests that were affected and keep an eye out to see if it's stable enough to be put back online.
Thanks
On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>> wrote:
On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
>
> Hi David and Zhaoyan,
>
>
> Yes, those results are related to the Intel machine; I have disabled testing for the Intel testbed.
>
> The 82599ES machine is now available for ssh and modifications.
Any news about this?
I received a failure on a patch of mine (changing macros in an ARM header).
https://lab.dpdk.org/results/dashboard/patchsets/9900/
But this time, it is with the 40G Intel nic test.
--
David Marchand
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu<mailto:blo@iol.unh.edu>
www.iol.unh.edu<http://www.iol.unh.edu/>
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-02 18:39 ` Brandon Lo
2020-04-03 1:14 ` Ma, LihongX
@ 2020-04-15 9:39 ` Ma, LihongX
2020-04-15 19:30 ` Brandon Lo
2020-04-28 3:39 ` Ma, LihongX
2 siblings, 1 reply; 42+ messages in thread
From: Ma, LihongX @ 2020-04-15 9:39 UTC (permalink / raw)
To: Brandon Lo
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim
[-- Attachment #1.1: Type: text/plain, Size: 15136 bytes --]
Hi, Brandon
I find that the grub configuration for CPU isolation misses the logical cores that are sibling threads of the isolated cores.
For example:
If the server CPU layout is as below and we want to isolate CPUs 1-20,
the isolation config should be ‘isolcpus=1-20,49-68’.
[image: server CPU layout (image003.jpg)]
So, I want to change the grub configuration of the isolated CPUs
from ‘isolcpus=1-48’ to ‘isolcpus=1-21,49-68,24-37,72-85’.
Can you help me check what time is suitable for making this change?
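The sibling-aware isolation list from the example can be sketched as follows. The sketch assumes that logical CPU n shares a physical core with CPU n + 48, which is inferred only from the example above (1-20 paired with 49-68); on a real machine the pairing can be read from /sys/devices/system/cpu/cpuN/topology/thread_siblings_list. The function names are hypothetical.

```python
def format_range(first, last):
    """Format one CPU range in kernel cpulist syntax."""
    return str(first) if first == last else f"{first}-{last}"


def isolcpus_with_siblings(core_ranges, sibling_offset):
    """Build an isolcpus= value covering each range and its hyper-thread siblings.

    core_ranges: list of (first, last) logical CPU ids to isolate.
    sibling_offset: distance between a CPU and its sibling thread (assumed constant).
    """
    parts = []
    for first, last in core_ranges:
        parts.append(format_range(first, last))
        parts.append(format_range(first + sibling_offset, last + sibling_offset))
    return "isolcpus=" + ",".join(parts)


if __name__ == "__main__":
    # Reproduces the example from the message: isolate CPUs 1-20 and their siblings.
    print(isolcpus_with_siblings([(1, 20)], sibling_offset=48))
```

Deriving the sibling ranges from the primary ranges keeps both halves of each physical core in the list, which is the point of the change requested here.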
Regards,
Ma,lihong
[-- Attachment #2: image003.jpg --]
[-- Type: image/jpeg, Size: 58140 bytes --]
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-15 9:39 ` Ma, LihongX
@ 2020-04-15 19:30 ` Brandon Lo
2020-04-16 3:40 ` Ma, LihongX
0 siblings, 1 reply; 42+ messages in thread
From: Brandon Lo @ 2020-04-15 19:30 UTC (permalink / raw)
To: Ma, LihongX
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim
[-- Attachment #1.1: Type: text/plain, Size: 14752 bytes --]
Hi Lihong,
I think any time is fine to make these changes.
Please use the 'reboot_to_rw' command once you have logged on; this will
make sure your changes are saved.
It will also disable testing momentarily while you make your changes.
Once you're finished, you can reboot it with 'reboot_to_ro'.
Please let me know when you're done so I can make sure everything is
working as intended.
We could also do this change for you if needed.
Thanks,
Brandon
On Wed, Apr 15, 2020 at 5:39 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
> Hi, Brandon
>
>
>
> I find that the grub configuration for CPU isolation misses the logical
> cores that are sibling threads of the isolated cores.
>
> For example:
>
> If the server CPU layout is as below and we want to isolate CPUs 1-20,
> the isolation config should be ‘isolcpus=1-20,49-68’.
>
> So, I want to change the grub configuration of the isolated CPUs
> from ‘isolcpus=1-48’ to ‘isolcpus=1-21,49-68,24-37,72-85’.
>
> Can you help me check what time is suitable for making this change?
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Friday, April 3, 2020 2:39 AM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> I have changed the baselines to reflect the new expected values.
>
> The performance tests should work as expected and pass.
>
>
>
> We will email again in the future if we come across any problems.
>
> Feel free to email us as well if you would like to make any other changes.
>
>
>
> Thank you for all your help
>
>
>
> On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
> Thanks for your recommendations; I have made the changes.
>
> Since the throughput value of nic_single_core is proportional to the CPU
> frequency, I recommend changing the baseline according to our report
> system.
>
>
>
> On our 2.50GHz system, the baseline values are as below:
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562
> 64        2048     41.439
>
>
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608
> 64        2048     47.73
>
>
>
> The testbed at UNH is a 2.1GHz CPU server, so the expected numbers
> should be:
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562 / 2.5 * 2.1 = 44.152
> 64        2048     41.439 / 2.5 * 2.1 = 34.809
>
>
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608 / 2.5 * 2.1 = 50.071
> 64        2048     47.73 / 2.5 * 2.1 = 40.093
>
>
>
>
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Tuesday, March 31, 2020 9:42 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> To make changes to either Intel machine, please reboot using the command
> "reboot_to_rw" as root to reboot the machine into read/write mode.
>
> This command will also disable any testing on the machine.
>
>
>
> To re-enable the machine, please run "reboot_to_ro" as root, and it will
> save all of the changes that you've made and re-enable testing on the
> machine.
>
> I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro"
> instead of the normal "reboot" while you're making changes.
>
>
>
> After you're done, please let me know. I'll have to manually run a test
> and update the baseline using our internal CI.
>
>
>
> Thank you
>
>
>
> On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Please let me know how to make change to this reset machine.
> (ip/access...) and disable it.
>
>
>
> After that please help to change the baseline.
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Thursday, March 26, 2020 11:39 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, we have a system in place that resets any changes made while
> testing is enabled for a machine.
>
> If you would like, I can disable testing and allow you to make permanent
> changes.
>
>
>
> I can also reset the baseline of Intel 10G test performance once you make
> these changes.
>
> Please let me know if you would like to make permanent changes on the
> Intel 10G so I can disable it for you.
>
>
>
> Thanks
>
>
>
> On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Thanks. Brandon.
>
>
>
> That’s good. We have made changes on the 10G testbed.
>
>
>
> I monitored several execution results and found that the 10G results
> always show a gap of -0.9% to -1.x% against the expected number, so at
> +-1% this could lead to occasional failures. I suggest adjusting the
> expected number. I don’t know where the expected number comes from. As far
> as I know, it is a dynamic number that depends on the baseline? Please
> help to clarify, thanks.
>
>
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 24, 2020 9:31 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> I have enabled the 10G Intel machine for testing.
>
> If you would like to make any more changes, please let me know so I can
> perform the necessary steps to prepare the machine for changes.
>
> Please feel free to let me know if you need anything.
>
>
>
> Thank you
>
>
>
> On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> For 10G, please enable it. our code is at original path
> */opt/test-harness/dts.*
>
>
>
> For 40G, please keep running. and see if any issue. But, anyway, we have
> modified the DTS code at /opt/test-harness/dts-new-suite. If we met same
> problem, then use this new DTS instead.
>
>
>
> Thanks a lot
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Saturday, March 21, 2020 1:49 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, the 40G machine is stable enough to be put on production
> dashboard to run tests which may cause Trex to be killed.
>
> Should I disable the 40G Intel machine for you to make changes?
>
>
>
> Also, just for confirmation: on the 10G machine, is the folder that you
> are using for the testing located in */opt/test-harness/dts-2020-3-4*, or
> are you still using the one in the standard */opt/test-harness/dts*
> folder?
>
>
>
> If everything is ok, I will enable the 10G machine for production testing.
>
>
>
> Thank you very much
>
>
>
> On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Brandon,
>
>
>
> We worked out a workaround on the Intel testbeds, NNT(10G) and FVL(40G).
> Could you please help to recover them?
>
>
>
> But for the FVL(40G) testbed we met some problems; could you please help
> to check before recovering it:
>
> - Sometimes the 1G hugepages are changed to 2M hugepages
> automatically, and we have to restart the system.
> - While debugging on the testbed, we found that Trex was killed by
> someone (or some app).
>
> Please help to check whether any other program is running on the testbed.
>
>
>
> Thanks a lot.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Sent:* Wednesday, March 18, 2020 9:04 PM
> *To:* Brandon Lo <blo@iol.unh.edu>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Subject:* RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Brandon, we almost made a workaround.
>
>
>
> Maybe tomorrow, you could recover Intel’s testbed. I will let you know
> soon.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Wednesday, March 18, 2020 3:34 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Have you finished making changes on the Intel machine?
>
> I will turn on the machine on March 3rd for testing if you do not have any
> issues with it.
>
> Please let me know if you need anything else.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Yes, it’s a weird issue, and it is also mixed up with our DTS upgrade and
> Trex upgrade.
>
> So we are reviewing our DTS script, the different Trex versions, and the
> CI calling procedure.
>
>
>
> Anyway, we are focusing on this task at the moment and will let you know
> of any update.
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 10, 2020 10:46 PM
> *To:* David Marchand <david.marchand@redhat.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> How is the current status of the Intel 82599ES?
>
> Were there any configuration changes made to fix performance issues?
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu> wrote:
>
> Hi David,
>
>
>
> This was just a weird issue with the packet generator not cleaning itself
> after a test fast enough before another test.
>
> I'll rerun the tests that were affected and keep an eye out to see if it's
> stable enough to be put back online.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com>
> wrote:
>
> On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu> wrote:
> >
> > Hi David and Zhaoyan,
> >
> >
> > Yes, those results are related to the Intel machine; I have disabled
> testing for the Intel testbed.
> >
> > The 82599ES machine is now available for ssh and modifications.
>
> Any news about this?
>
> I received a failure on a patch of mine (changing macros in a ARM header).
> https://lab.dpdk.org/results/dashboard/patchsets/9900/
>
> But this time, it is with the 40G Intel nic test.
>
> --
> David Marchand
>
>
>
>
> --
>
> Brandon Lo
>
> UNH InterOperability Laboratory
>
> 21 Madbury Rd, Suite 100, Durham, NH 03824
>
> blo@iol.unh.edu
>
> www.iol.unh.edu
>
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu
[-- Attachment #1.2: Type: text/html, Size: 67441 bytes --]
[-- Attachment #2: image003.jpg --]
[-- Type: image/jpeg, Size: 58140 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-15 19:30 ` Brandon Lo
@ 2020-04-16 3:40 ` Ma, LihongX
2020-04-16 15:04 ` Brandon Lo
0 siblings, 1 reply; 42+ messages in thread
From: Ma, LihongX @ 2020-04-16 3:40 UTC (permalink / raw)
To: Brandon Lo
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim
[-- Attachment #1.1: Type: text/plain, Size: 17724 bytes --]
Hi, Brandon
I used the ‘reboot_to_rw’ command to restart the server, then edited the file ‘/etc/default/grub’ to change the isolated CPUs.
I changed the fields ‘isolcpus=1-48 nohz_full=1-48 rcu_nocbs=1-48’ to
‘isolcpus=1-21,49-69,24-37,72-85 nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85’
The full line is now:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash hugepagesz=1G hugepages=40 default_hugepagesz=1G isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on iommu=pt nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85 intel_pstate=disable nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable mce=off tsc=reliable numa_balancing=disable"
Then I ran update-grub to apply the configuration.
Then I rebooted the server with ‘reboot_to_ro’, but the change does not show up in the kernel command line (cat /proc/cmdline).
Can you help me check it?
Thanks
Lihong
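For reference, the three parameters edited above are expected to carry the same CPU list. A minimal Python sketch of that consistency check follows; the function names are illustrative only and are not part of DTS or the lab tooling, and the sample string is a shortened copy of the grub line quoted above.

```python
def kernel_params(cmdline):
    """Split a kernel command line into a dict; flags without '=' map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def isolation_mismatches(cmdline, keys=("isolcpus", "nohz_full", "rcu_nocbs")):
    """Return the keys that are missing or differ from the first key's value."""
    params = kernel_params(cmdline)
    reference = params.get(keys[0])
    if reference is None:
        # Nothing to compare against: report every key as a problem.
        return list(keys)
    return [key for key in keys if params.get(key) != reference]


# Shortened copy of the GRUB_CMDLINE_LINUX_DEFAULT value from this mail.
sample = (
    "quiet splash isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on iommu=pt "
    "nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85"
)
print(isolation_mismatches(sample))  # []
```

An empty list means the three lists agree; it says nothing about whether the running kernel actually booted with them.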
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Thursday, April 16, 2020 3:31 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I think any time is fine to make these changes.
Please use the 'reboot_to_rw' command once you have logged on, this will make sure your changes are saved.
It will also disable testing momentarily while you make your changes.
Once you're finished, you can reboot it with 'reboot_to_ro'.
Please let me know when you're done so I can make sure everything is working as intended.
We could also do this change for you if needed.
Thanks,
Brandon
On Wed, Apr 15, 2020 at 5:39 AM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
I found that the grub configuration for the isolated CPUs misses the logical cores that are sibling threads of the same physical cores.
For example:
If the server's CPU layout is as shown below and we want to isolate CPUs 1-20,
the isolation config should be ‘isolcpus=1-20,49-68’.
[cid:1717f4fa8006917eb1]
So, I want to change the grub configuration of the isolated CPUs
from ‘isolcpus=1-48’ to ‘isolcpus=1-21,49-68,24-37,72-85’.
Can you help me check what time is suitable for this change?
Regards,
Ma,lihong
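The sibling rule in the example above can be sketched in Python. This is only an illustration under one assumption taken from that example: logical CPU n and logical CPU n+48 are threads of the same physical core. The function names and the offset parameter are hypothetical; on a real machine the pairs can be read from /sys/devices/system/cpu/cpuN/topology/thread_siblings_list instead of being computed.

```python
def parse_cpu_list(text):
    """Expand a kernel-style CPU list such as '1-3,8' into a sorted list of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def format_cpu_list(cpus):
    """Collapse CPU numbers back into ranges, e.g. [1, 2, 3, 8] -> '1-3,8'."""
    ranges = []
    for cpu in sorted(set(cpus)):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current range
        else:
            ranges.append([cpu, cpu])    # start a new range
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def with_siblings(cpu_list, sibling_offset=48):
    """Add the sibling thread (cpu + offset) of every CPU in the list.

    The default offset of 48 is an assumption based on the example in this mail.
    """
    cpus = parse_cpu_list(cpu_list)
    return format_cpu_list(cpus + [cpu + sibling_offset for cpu in cpus])


print(with_siblings("1-20"))        # 1-20,49-68
print(with_siblings("1-21,24-37"))  # 1-21,24-37,49-69,72-85
```

Under the same assumption, isolating 1-21 and 24-37 pulls in 49-69 and 72-85 as siblings.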
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Friday, April 3, 2020 2:39 AM
To: Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I have changed the baselines to reflect the new expected values.
The performance tests should work as expected and pass.
We will email again in the future if we come across any problems.
Feel free to email us as well if you would like to make any other changes.
Thank you for all your help
On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
Thanks for your recommendations; I have made the changes.
Since the throughput value of nic_single_core is proportional to the CPU frequency, I recommend changing the baseline according to our report system.
On our 2.50GHz system, the baseline values are as below:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562
64        2048     41.439
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608
64        2048     47.73
The testbed at UNH is a 2.1GHz CPU server, so the expected numbers should be:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562 / 2.5 * 2.1 = 44.152
64        2048     41.439 / 2.5 * 2.1 = 34.809
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608 / 2.5 * 2.1 = 50.071
64        2048     47.73 / 2.5 * 2.1 = 40.093
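The frequency scaling used in these tables can be written as a short Python sketch. It assumes, as stated in this mail, that single-core throughput is proportional to CPU frequency; the function name and the dictionary layout are illustrative and not part of the lab's CI.

```python
def scale_baseline(reference_value, reference_ghz=2.5, target_ghz=2.1):
    """Scale a baseline measured at reference_ghz to a CPU running at target_ghz."""
    return round(reference_value / reference_ghz * target_ghz, 3)


# Baselines from the 2.50GHz system, keyed by (nic, pkt_size, trd/rxd).
reference = {
    ("NNT", 64, 512): 52.562,
    ("NNT", 64, 2048): 41.439,
    ("FVL", 64, 512): 59.608,
    ("FVL", 64, 2048): 47.73,
}

for key, value in reference.items():
    print(key, scale_baseline(value))
```

The printed values should match the right-hand sides of the tables above (44.152, 34.809, 50.071, 40.093).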
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Tuesday, March 31, 2020 9:42 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
To make changes to either Intel machine, please reboot using the command "reboot_to_rw" as root to reboot the machine into read/write mode.
This command will also disable any testing on the machine.
To re-enable the machine, please run "reboot_to_ro" as root, and it will save all of the changes that you've made and re-enable testing on the machine.
I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro" instead of the normal "reboot" while you're making changes.
After you're done, please let me know. I'll have to manually run a test and update the baseline using our internal CI.
Thank you
On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Please let me know how to make changes to this machine that resets itself (IP/access...), and how to disable it.
After that please help to change the baseline.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Thursday, March 26, 2020 11:39 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, we have a system in place that resets any changes made while testing is enabled for a machine.
If you would like, I can disable testing and allow you to make permanent changes.
I can also reset the baseline of Intel 10G test performance once you make these changes.
Please let me know if you would like to make permanent changes on the Intel 10G so I can disable it for you.
Thanks
On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Thanks. Brandon.
That’s good. We have made changes on the 10G testbed.
I monitored several execution results and found that the 10G results always show a gap of -0.9% to -1.x% against the expected number, so at +-1% this could lead to occasional failures. I suggest adjusting the expected number. I don’t know where the expected number comes from. As far as I know, it is a dynamic number that depends on the baseline? Please help to clarify, thanks.
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 24, 2020 9:31 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
I have enabled the 10G Intel machine for testing.
If you would like to make any more changes, please let me know so I can perform the necessary steps to prepare the machine for changes.
Please feel free to let me know if you need anything.
Thank you
On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
For 10G, please enable it. Our code is at the original path /opt/test-harness/dts.
For 40G, please keep it running and see if there is any issue. In any case, we have modified the DTS code at /opt/test-harness/dts-new-suite. If we meet the same problem, then use this new DTS instead.
Thanks a lot
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Saturday, March 21, 2020 1:49 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, the 40G machine is stable enough to be put on production dashboard to run tests which may cause Trex to be killed.
Should I disable the 40G Intel machine for you to make changes?
Also, just for confirmation: on the 10G machine, is the folder that you are using for the testing located in /opt/test-harness/dts-2020-3-4, or are you still using the one in the standard /opt/test-harness/dts folder?
If everything is ok, I will enable the 10G machine for production testing.
Thank you very much
On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Brandon,
We worked out a workaround on the Intel testbeds, NNT(10G) and FVL(40G). Could you please help to recover them?
But for the FVL(40G) testbed we met some problems; could you please help to check before recovering it:
* Sometimes the 1G hugepages are changed to 2M hugepages automatically, and we have to restart the system.
* While debugging on the testbed, we found that Trex was killed by someone (or some app).
Please help to check whether any other program is running on the testbed.
Thanks a lot.
Regards,
Zhaoyan Chen
From: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Sent: Wednesday, March 18, 2020 9:04 PM
To: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Brandon, we almost made a workaround.
Maybe tomorrow, you could recover Intel’s testbed. I will let you know soon.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Wednesday, March 18, 2020 3:34 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Have you finished making changes on the Intel machine?
I will turn on the machine on March 3rd for testing if you do not have any issues with it.
Please let me know if you need anything else.
Thanks
On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Yes, it’s a weird issue, and it is also mixed up with our DTS upgrade and Trex upgrade.
So we are reviewing our DTS script, the different Trex versions, and the CI calling procedure.
Anyway, we are focusing on this task at the moment and will let you know of any update.
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 10, 2020 10:46 PM
To: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
How is the current status of the Intel 82599ES?
Were there any configuration changes made to fix performance issues?
Thanks
On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
Hi David,
This was just a weird issue with the packet generator not cleaning itself after a test fast enough before another test.
I'll rerun the tests that were affected and keep an eye out to see if it's stable enough to be put back online.
Thanks
On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>> wrote:
On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
>
> Hi David and Zhaoyan,
>
>
> Yes, those results are related to the Intel machine; I have disabled testing for the Intel testbed.
>
> The 82599ES machine is now available for ssh and modifications.
Any news about this?
I received a failure on a patch of mine (changing macros in a ARM header).
https://lab.dpdk.org/results/dashboard/patchsets/9900/
But this time, it is with the 40G Intel nic test.
--
David Marchand
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu<mailto:blo@iol.unh.edu>
www.iol.unh.edu<http://www.iol.unh.edu/>
[-- Attachment #1.2: Type: text/html, Size: 95500 bytes --]
[-- Attachment #2: image001.jpg --]
[-- Type: image/jpeg, Size: 41815 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-16 3:40 ` Ma, LihongX
@ 2020-04-16 15:04 ` Brandon Lo
2020-04-17 1:54 ` Ma, LihongX
2020-04-17 7:00 ` Ma, LihongX
0 siblings, 2 replies; 42+ messages in thread
From: Brandon Lo @ 2020-04-16 15:04 UTC (permalink / raw)
To: Ma, LihongX
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim
[-- Attachment #1.1: Type: text/plain, Size: 17587 bytes --]
Hi Lihong,
I saw your changes and applied them manually.
This involves me taking the machine down for maintenance on our backend,
and then doing a normal `reboot`.
Now when I do `cat /proc/cmdline` it shows:
`BOOT_IMAGE=/boot/vmlinuz-4.15.0-55-generic
root=UUID=3801030b-237d-428e-9e67-d81e12f16308 ro quiet splash
hugepagesz=1G hugepages=40 default_hugepagesz=1G
isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on iommu=pt
nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85
intel_pstate=disable nmi_watchdog=0 audit=0 nosoftlockup
processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable mce=off
tsc=reliable numa_balancing=disable vt.handoff=1`
Which is what I think we wanted to achieve.
I have now done the `reboot_to_ro` command and it should be working fine
for testing.
Thanks,
Brandon
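The check described above (whether the arguments configured in /etc/default/grub made it into the running kernel command line) can be sketched in Python. This is a hedged illustration, not the lab's procedure: the function names are hypothetical, and the "stale" command line below is a made-up example of a boot that still carries the old isolcpus=1-48 value.

```python
import re


def grub_default_args(grub_text):
    """Return the tokens of GRUB_CMDLINE_LINUX_DEFAULT from the text of /etc/default/grub."""
    match = re.search(r'^GRUB_CMDLINE_LINUX_DEFAULT="([^"]*)"', grub_text, re.MULTILINE)
    return match.group(1).split() if match else []


def missing_from_running(grub_text, running_cmdline):
    """List configured tokens that are absent from the running kernel command line."""
    running = set(running_cmdline.split())
    return [arg for arg in grub_default_args(grub_text) if arg not in running]


grub_text = 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=1-21,49-69,24-37,72-85"\n'

# Hypothetical stale boot: the old isolcpus value is still in effect.
stale = "BOOT_IMAGE=/boot/vmlinuz-4.15.0-55-generic ro quiet splash isolcpus=1-48"
print(missing_from_running(grub_text, stale))  # ['isolcpus=1-21,49-69,24-37,72-85']

# Boot that picked up the new configuration.
fresh = "BOOT_IMAGE=/boot/vmlinuz-4.15.0-55-generic ro quiet splash isolcpus=1-21,49-69,24-37,72-85"
print(missing_from_running(grub_text, fresh))  # []
```

On the testbed itself, the two inputs would be the contents of /etc/default/grub and /proc/cmdline; an empty result means every configured argument is active.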
On Wed, Apr 15, 2020 at 11:40 PM Ma, LihongX <lihongx.ma@intel.com> wrote:
> Hi, Brandon
>
>
>
> I used the ‘reboot_to_rw’ command to restart the server, then edited the
> file ‘/etc/default/grub’ to change the isolated CPUs.
>
>
>
> I changed the fields ‘isolcpus=1-48 nohz_full=1-48 rcu_nocbs=1-48’ to
>
> ‘isolcpus=1-21,49-69,24-37,72-85 nohz_full=1-21,49-69,24-37,72-85
> rcu_nocbs=1-21,49-69,24-37,72-85’
>
>
>
> The full line is now:
>
> GRUB_CMDLINE_LINUX_DEFAULT="quiet splash hugepagesz=1G hugepages=40
> default_hugepagesz=1G isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on
> iommu=pt nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85
> intel_pstate=disable nmi_watchdog=0 audit=0 nosoftlockup
> processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable mce=off
> tsc=reliable numa_balancing=disable"
>
>
>
> Then I ran ‘update-grub’ to apply the configuration.
>
> Then I rebooted the server with ‘reboot_to_ro’, but the change does not
> show up in the kernel command line (cat /proc/cmdline).
>
>
>
> Can you help me to check it?
>
>
>
> Thanks
>
> Lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Thursday, April 16, 2020 3:31 AM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> I think any time is fine to make these changes.
>
>
>
> Please use the 'reboot_to_rw' command once you have logged on, this will
> make sure your changes are saved.
>
> It will also disable testing momentarily while you make your changes.
>
>
>
> Once you're finished, you can reboot it with 'reboot_to_ro'.
>
> Please let me know when you're done so I can make sure everything is
> working as intended.
>
> We could also do this change for you if needed.
>
>
>
> Thanks,
>
> Brandon
>
>
>
> On Wed, Apr 15, 2020 at 5:39 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
>
>
> I found that the GRUB isolcpus setting misses the logical cores that
> share a physical core with the isolated ones (their sibling threads).
>
>
>
> For example:
>
> If the server's CPU layout is as shown below and we want to isolate
> cores 1-20,
>
> the isolation setting should be ‘isolcpus=1-20,49-68’
>
>
>
> [image: cid:1717f4fa8006917eb1]
>
>
>
> So, I want to change the GRUB configuration of the isolated CPUs
>
> from ‘isolcpus=1-48’ to ‘isolcpus=1-21,49-69,24-37,72-85’.
>
>
>
> Can you let me know what time would be suitable for this change?
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Friday, April 3, 2020 2:39 AM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> I have changed the baselines to reflect the new expected values.
>
> The performance tests should work as expected and pass.
>
>
>
> We will email again in the future if we come across any problems.
>
> Feel free to email us as well if you would like to make any other changes.
>
>
>
> Thank you for all your help
>
>
>
> On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
> Thanks for your recommendations; I have made the changes.
>
> Since the throughput value of nic_single_core is proportional to the CPU
> frequency, I recommend changing the baseline according to our reporting
> system.
>
>
>
> On our 2.50 GHz system, the baseline values are as below:
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562
> 64        2048     41.439
>
>
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608
> 64        2048     47.73
>
>
>
> For the testbed in UNH, it’s a 2.1 GHz CPU server, so the expected number
> should be
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562 / 2.5 * 2.1 = 44.152
> 64        2048     41.439 / 2.5 * 2.1 = 34.809
>
>
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608 / 2.5 * 2.1 = 50.071
> 64        2048     47.73 / 2.5 * 2.1 = 40.093
>
>
>
>
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Tuesday, March 31, 2020 9:42 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> To make changes to either Intel machine, please reboot using the command
> "reboot_to_rw" as root to reboot the machine into read/write mode.
>
> This command will also disable any testing on the machine.
>
>
>
> To re-enable the machine, please run "reboot_to_ro" as root, and it will
> save all of the changes that you've made and re-enable testing on the
> machine.
>
> I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro"
> instead of the normal "reboot" while you're making changes.
>
>
>
> After you're done, please let me know. I'll have to manually run a test
> and update the baseline using our internal CI.
>
>
>
> Thank you
>
>
>
> On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Please let me know how to make changes to this machine with the reset
> system (IP/access...) and how to disable it.
>
>
>
> After that please help to change the baseline.
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Thursday, March 26, 2020 11:39 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, we have a system in place that resets any changes made while
> testing is enabled for a machine.
>
> If you would like, I can disable testing and allow you to make permanent
> changes.
>
>
>
> I can also reset the baseline of Intel 10G test performance once you make
> these changes.
>
> Please let me know if you would like to make permanent changes on the
> Intel 10G so I can disable it for you.
>
>
>
> Thanks
>
>
>
> On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Thanks. Brandon.
>
>
>
> That’s good. We have made changes on the 10G testbed.
>
>
>
> I monitored several execution results and found that the 10G results
> always show a gap of -0.9% to -1.x% against the expected number, so with
> the +-1% tolerance we could sometimes see failures. I suggest adjusting the
> expected number. I don’t know where the expected number comes from; as far
> as I know it is a dynamic number that depends on the baseline. Please help
> to clarify, thanks.
>
>
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 24, 2020 9:31 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> I have enabled the 10G Intel machine for testing.
>
> If you would like to make any more changes, please let me know so I can
> perform the necessary steps to prepare the machine for changes.
>
> Please feel free to let me know if you need anything.
>
>
>
> Thank you
>
>
>
> On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> For 10G, please enable it. Our code is at the original path
> */opt/test-harness/dts*.
>
>
>
> For 40G, please keep it running and see if any issue appears. In any case,
> we have modified the DTS code at /opt/test-harness/dts-new-suite. If we hit
> the same problem, then use this new DTS instead.
>
>
>
> Thanks a lot
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Saturday, March 21, 2020 1:49 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, the 40G machine is stable enough to be put on production
> dashboard to run tests which may cause Trex to be killed.
>
> Should I disable the 40G Intel machine for you to make changes?
>
>
>
> Also, just for confirmation: on the 10G machine, is the folder that you
> are using for the testing located in */opt/test-harness/dts-2020-3-4*, or
> are you still using the one in the standard */opt/test-harness/dts*
> folder?
>
>
>
> If everything is ok, I will enable the 10G machine for production testing.
>
>
>
> Thank you very much
>
>
>
> On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Brandon,
>
>
>
> We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G).
> Could you please help to recover them?
>
>
>
> But on the FVL (40G) testbed we met some problems; could you please help to
> check them before recovering it:
>
> - Sometimes the 1G hugepages are changed to 2M hugepages
> automatically, and we have to restart the system.
> - While debugging on the testbed, we found that Trex was killed by
> something (another app).
>
> Please help to check whether any other program is running on the testbed.
>
>
>
> Thanks a lot.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Sent:* Wednesday, March 18, 2020 9:04 PM
> *To:* Brandon Lo <blo@iol.unh.edu>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Subject:* RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Brandon, we almost made a workaround.
>
>
>
> Maybe tomorrow, you could recover Intel’s testbed. I will let you know
> soon.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Wednesday, March 18, 2020 3:34 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Have you finished making changes on the Intel machine?
>
> I will turn on the machine on March 3rd for testing if you do not have any
> issues with it.
>
> Please let me know if you need anything else.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Yes, it’s a weird issue, and it is also mixed up with our DTS upgrade and
> Trex upgrade.
>
> So we are reviewing our DTS script, the different Trex versions, and the CI
> calling procedure.
>
>
>
> Anyway, we are focusing on this task now and will let you know of any
> update.
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 10, 2020 10:46 PM
> *To:* David Marchand <david.marchand@redhat.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> How is the current status of the Intel 82599ES?
>
> Were there any configuration changes made to fix performance issues?
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu> wrote:
>
> Hi David,
>
>
>
> This was just a weird issue with the packet generator not cleaning itself
> after a test fast enough before another test.
>
> I'll rerun the tests that were affected and keep an eye out to see if it's
> stable enough to be put back online.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com>
> wrote:
>
> On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu> wrote:
> >
> > Hi David and Zhaoyan,
> >
> >
> > Yes, those results are related to the Intel machine; I have disabled
> testing for the Intel testbed.
> >
> > The 82599ES machine is now available for ssh and modifications.
>
> Any news about this?
>
> I received a failure on a patch of mine (changing macros in a ARM header).
> https://lab.dpdk.org/results/dashboard/patchsets/9900/
>
> But this time, it is with the 40G Intel nic test.
>
> --
> David Marchand
>
>
>
>
> --
>
> Brandon Lo
>
> UNH InterOperability Laboratory
>
> 21 Madbury Rd, Suite 100, Durham, NH 03824
>
> blo@iol.unh.edu
>
> www.iol.unh.edu
>
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu
[-- Attachment #1.2: Type: text/html, Size: 77004 bytes --]
[-- Attachment #2: image001.jpg --]
[-- Type: image/jpeg, Size: 41815 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-16 15:04 ` Brandon Lo
@ 2020-04-17 1:54 ` Ma, LihongX
2020-04-17 7:00 ` Ma, LihongX
1 sibling, 0 replies; 42+ messages in thread
From: Ma, LihongX @ 2020-04-17 1:54 UTC (permalink / raw)
To: Brandon Lo
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim
[-- Attachment #1.1: Type: text/plain, Size: 19927 bytes --]
Hi, Brandon
Thank you very much for your help. The NNT server is now configured as we wanted. Could you also help me change the cmdline on the FVL server?
Thanks
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Thursday, April 16, 2020 11:04 PM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I saw your changes and applied them manually.
This involves me taking the machine down for maintenance on our backend, and then doing a normal `reboot`.
Now when I do `cat /proc/cmdline` it shows:
`BOOT_IMAGE=/boot/vmlinuz-4.15.0-55-generic root=UUID=3801030b-237d-428e-9e67-d81e12f16308 ro quiet splash hugepagesz=1G hugepages=40 default_hugepagesz=1G isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on iommu=pt nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85 intel_pstate=disable nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable mce=off tsc=reliable numa_balancing=disable vt.handoff=1`
Which is what I think we wanted to achieve.
I have now done the `reboot_to_ro` command and it should be working fine for testing.
Thanks,
Brandon
On Wed, Apr 15, 2020 at 11:40 PM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
I used the ‘reboot_to_rw’ command to restart the server and edited ‘/etc/default/grub’ to reset the isolated CPUs.
I changed the field ‘isolcpus=1-48 nohz_full=1-48 rcu_nocbs=1-48’ to
‘isolcpus=1-21,49-69,24-37,72-85 nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85’
The full line is now:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash hugepagesz=1G hugepages=40 default_hugepagesz=1G isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on iommu=pt nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85 intel_pstate=disable nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable mce=off tsc=reliable numa_balancing=disable"
Then I ran ‘update-grub’ to apply the configuration.
Then I rebooted the server with ‘reboot_to_ro’, but the change does not show up in the kernel command line (cat /proc/cmdline).
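A minimal sketch, assuming Python 3, of how one could list the parameters that are set in ‘/etc/default/grub’ but not yet present in the running kernel command line. The function names and the shortened sample strings are hypothetical; the old running values are taken from the ‘isolcpus=1-48 nohz_full=1-48 rcu_nocbs=1-48’ field mentioned above.

```python
import re


def grub_default_tokens(grub_text):
    """Return the parameters inside GRUB_CMDLINE_LINUX_DEFAULT="..." as a list."""
    match = re.search(
        r'^GRUB_CMDLINE_LINUX_DEFAULT="([^"]*)"', grub_text, re.MULTILINE
    )
    return match.group(1).split() if match else []


def not_yet_applied(grub_text, running_cmdline):
    """Parameters configured in GRUB that the running kernel does not show."""
    running = set(running_cmdline.split())
    return [tok for tok in grub_default_tokens(grub_text) if tok not in running]


# Shortened, hypothetical samples; real input would be the contents of
# /etc/default/grub and /proc/cmdline.
GRUB_TEXT = (
    'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash '
    "isolcpus=1-21,49-69,24-37,72-85 nohz_full=1-21,49-69,24-37,72-85 "
    'rcu_nocbs=1-21,49-69,24-37,72-85"\n'
)
OLD_RUNNING = "ro quiet splash isolcpus=1-48 nohz_full=1-48 rcu_nocbs=1-48"

for token in not_yet_applied(GRUB_TEXT, OLD_RUNNING):
    print("not applied:", token)
```

An empty result would suggest the edited GRUB line is the one the kernel actually booted with.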
Can you help me to check it?
Thanks
Lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Thursday, April 16, 2020 3:31 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I think any time is fine to make these changes.
Please use the 'reboot_to_rw' command once you have logged on, this will make sure your changes are saved.
It will also disable testing momentarily while you make your changes.
Once you're finished, you can reboot it with 'reboot_to_ro'.
Please let me know when you're done so I can make sure everything is working as intended.
We could also do this change for you if needed.
Thanks,
Brandon
On Wed, Apr 15, 2020 at 5:39 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
I found that the GRUB isolcpus setting misses the logical cores that share a physical core with the isolated ones (their sibling threads).
For example:
If the server's CPU layout is as shown below and we want to isolate cores 1-20,
the isolation setting should be ‘isolcpus=1-20,49-68’
[cid:1717f4fa8006917eb1]
So, I want to change the GRUB configuration of the isolated CPUs
from ‘isolcpus=1-48’ to ‘isolcpus=1-21,49-69,24-37,72-85’.
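A minimal sketch, assuming Python 3 and assuming that each core's sibling thread is numbered 48 higher than the core itself (inferred from the 1-20 / 49-68 example above, not confirmed for every server). The function name `with_siblings` is hypothetical. It builds an isolcpus list that includes both threads of each isolated core.

```python
def with_siblings(core_ranges, sibling_offset):
    """Build a CPU list that pairs each core range with its sibling-thread range.

    core_ranges is a list of (first, last) tuples of logical CPU numbers;
    sibling_offset is the assumed distance between a CPU and its sibling.
    """
    parts = []
    for first, last in core_ranges:
        parts.append(f"{first}-{last}")
        parts.append(f"{first + sibling_offset}-{last + sibling_offset}")
    return ",".join(parts)


# The example from this message: isolate cores 1-20 plus their siblings.
print("isolcpus=" + with_siblings([(1, 20)], 48))

# The two ranges requested for the testbed.
print("isolcpus=" + with_siblings([(1, 21), (24, 37)], 48))
```

For ranges 1-21 and 24-37 this yields 1-21,49-69,24-37,72-85, the list that was eventually put into ‘/etc/default/grub’. On a real system the sibling numbering can be read from `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` rather than assumed.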
Can you let me know what time would be suitable for this change?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Friday, April 3, 2020 2:39 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I have changed the baselines to reflect the new expected values.
The performance tests should work as expected and pass.
We will email again in the future if we come across any problems.
Feel free to email us as well if you would like to make any other changes.
Thank you for all your help
On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
Thanks for your recommendations; I have made the changes.
Since the throughput value of nic_single_core is proportional to the CPU frequency, I recommend changing the baseline according to our reporting system.
On our 2.50 GHz system, the baseline values are as below:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562
64        2048     41.439
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608
64        2048     47.73
For the testbed in UNH, it’s a 2.1 GHz CPU server, so the expected number should be
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562 / 2.5 * 2.1 = 44.152
64        2048     41.439 / 2.5 * 2.1 = 34.809
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608 / 2.5 * 2.1 = 50.071
64        2048     47.73 / 2.5 * 2.1 = 40.093
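The scaling above can be reproduced with a short Python 3 sketch. It assumes, as the message does, that throughput scales linearly with CPU frequency; the function name `scale_baseline` and the dictionary layout are hypothetical.

```python
def scale_baseline(value, ref_ghz=2.5, target_ghz=2.1):
    """Scale a baseline measured at ref_ghz to a CPU running at target_ghz."""
    return round(value / ref_ghz * target_ghz, 3)


# (NIC, pkt_size, trd/rxd) -> expected_value on the 2.50 GHz reference system.
REFERENCE = {
    ("NNT", 64, 512): 52.562,
    ("NNT", 64, 2048): 41.439,
    ("FVL", 64, 512): 59.608,
    ("FVL", 64, 2048): 47.73,
}

for (nic, pkt_size, descriptors), value in REFERENCE.items():
    print(nic, pkt_size, descriptors, scale_baseline(value))
```

The four results match the values listed above: 44.152, 34.809, 50.071 and 40.093.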
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Tuesday, March 31, 2020 9:42 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
To make changes to either Intel machine, please reboot using the command "reboot_to_rw" as root to reboot the machine into read/write mode.
This command will also disable any testing on the machine.
To re-enable the machine, please run "reboot_to_ro" as root, and it will save all of the changes that you've made and re-enable testing on the machine.
I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro" instead of the normal "reboot" while you're making changes.
After you're done, please let me know. I'll have to manually run a test and update the baseline using our internal CI.
Thank you
On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
Please let me know how to make changes to this machine with the reset system (IP/access...) and how to disable it.
After that please help to change the baseline.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Thursday, March 26, 2020 11:39 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, we have a system in place that resets any changes made while testing is enabled for a machine.
If you would like, I can disable testing and allow you to make permanent changes.
I can also reset the baseline of Intel 10G test performance once you make these changes.
Please let me know if you would like to make permanent changes on the Intel 10G so I can disable it for you.
Thanks
On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Thanks. Brandon.
That’s good. We have made changes on the 10G testbed.
I monitored several execution results and found that the 10G results always show a gap of -0.9% to -1.x% against the expected number, so with the +-1% tolerance we could sometimes see failures. I suggest adjusting the expected number. I don’t know where the expected number comes from; as far as I know it is a dynamic number that depends on the baseline. Please help to clarify, thanks.
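A minimal sketch, assuming Python 3 and assuming the pass window is the +-1% mentioned above, of why a steady gap of about -0.9% to -1.x% gives intermittent failures. The function names and the measured numbers are hypothetical and are not real test results.

```python
def percent_delta(measured, expected):
    """Relative difference of measured against expected, in percent."""
    return (measured - expected) / expected * 100.0


def passes(measured, expected, tolerance_pct=1.0):
    """True when the measured value lies within +-tolerance_pct of expected."""
    return abs(percent_delta(measured, expected)) <= tolerance_pct


EXPECTED = 100.0  # hypothetical expected number
for measured in (99.1, 98.8):  # hypothetical runs, about -0.9% and -1.2%
    print(measured, round(percent_delta(measured, EXPECTED), 2),
          passes(measured, EXPECTED))
```

A run at -0.9% stays inside the window while a run at -1.2% falls outside it, so results hovering around -1% flip between pass and fail until the expected number is adjusted.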
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Tuesday, March 24, 2020 9:31 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
I have enabled the 10G Intel machine for testing.
If you would like to make any more changes, please let me know so I can perform the necessary steps to prepare the machine for changes.
Please feel free to let me know if you need anything.
Thank you
On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
For 10G, please enable it. Our code is at the original path /opt/test-harness/dts.
For 40G, please keep it running and see if any issue appears. In any case, we have modified the DTS code at /opt/test-harness/dts-new-suite. If we hit the same problem, then use this new DTS instead.
Thanks a lot
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Saturday, March 21, 2020 1:49 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, the 40G machine is stable enough to be put on production dashboard to run tests which may cause Trex to be killed.
Should I disable the 40G Intel machine for you to make changes?
Also, just for confirmation: on the 10G machine, is the folder that you are using for the testing located in /opt/test-harness/dts-2020-3-4, or are you still using the one in the standard /opt/test-harness/dts folder?
If everything is ok, I will enable the 10G machine for production testing.
Thank you very much
On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Brandon,
We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G). Could you please help to recover them?
But on the FVL (40G) testbed we met some problems; could you please help to check them before recovering it:
* Sometimes the 1G hugepages are changed to 2M hugepages automatically, and we have to restart the system.
* While debugging on the testbed, we found that Trex was killed by something (another app).
Please help to check whether any other program is running on the testbed.
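A minimal sketch, assuming Python 3, of how the default hugepage size could be checked from the contents of `/proc/meminfo` so that a switch from 1G to 2M pages is noticed early. The function names and the sample text are hypothetical; only the `Hugepagesize`, `HugePages_Total` and `HugePages_Free` field names come from the kernel's meminfo output.

```python
ONE_GIB_IN_KB = 1048576


def hugepage_info(meminfo_text):
    """Extract the hugepage fields from /proc/meminfo-style text as integers."""
    wanted = ("Hugepagesize", "HugePages_Total", "HugePages_Free")
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            # The first token after the colon is the number; a unit may follow.
            fields[name] = int(rest.split()[0])
    return fields


def default_is_1g(fields):
    """True when the default hugepage size reported is 1 GiB."""
    return fields.get("Hugepagesize") == ONE_GIB_IN_KB


# Hypothetical sample; on the testbed this would be open("/proc/meminfo").read().
SAMPLE = (
    "MemTotal:       196608000 kB\n"
    "HugePages_Total:      40\n"
    "HugePages_Free:       40\n"
    "Hugepagesize:    1048576 kB\n"
)

info = hugepage_info(SAMPLE)
print(info, default_is_1g(info))
```

If the size field reports 2048 kB instead, the check returns False, which would correspond to the 2M situation described above.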
Thanks a lot.
Regards,
Zhaoyan Chen
From: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Sent: Wednesday, March 18, 2020 9:04 PM
To: Brandon Lo <blo@iol.unh.edu>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Brandon, we almost made a workaround.
Maybe tomorrow, you could recover Intel’s testbed. I will let you know soon.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Wednesday, March 18, 2020 3:34 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Have you finished making changes on the Intel machine?
I will turn on the machine on March 3rd for testing if you do not have any issues with it.
Please let me know if you need anything else.
Thanks
On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
Yes, it’s a weird issue, and it is also mixed up with our DTS upgrade and Trex upgrade.
So we are reviewing our DTS script, the different Trex versions, and the CI calling procedure.
Anyway, we are focusing on this task now and will let you know of any update.
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Tuesday, March 10, 2020 10:46 PM
To: David Marchand <david.marchand@redhat.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
How is the current status of the Intel 82599ES?
Were there any configuration changes made to fix performance issues?
Thanks
On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
Hi David,
This was just a weird issue with the packet generator not cleaning itself after a test fast enough before another test.
I'll rerun the tests that were affected and keep an eye out to see if it's stable enough to be put back online.
Thanks
On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>> wrote:
On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
>
> Hi David and Zhaoyan,
>
>
> Yes, those results are related to the Intel machine; I have disabled testing for the Intel testbed.
>
> The 82599ES machine is now available for ssh and modifications.
Any news about this?
I received a failure on a patch of mine (changing macros in a ARM header).
https://lab.dpdk.org/results/dashboard/patchsets/9900/
But this time, it is with the 40G Intel nic test.
--
David Marchand
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu<mailto:blo@iol.unh.edu>
www.iol.unh.edu<http://www.iol.unh.edu/>
[-- Attachment #1.2: Type: text/html, Size: 102094 bytes --]
[-- Attachment #2: image001.jpg --]
[-- Type: image/jpeg, Size: 41815 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-16 15:04 ` Brandon Lo
2020-04-17 1:54 ` Ma, LihongX
@ 2020-04-17 7:00 ` Ma, LihongX
2020-04-17 15:52 ` Brandon Lo
1 sibling, 1 reply; 42+ messages in thread
From: Ma, LihongX @ 2020-04-17 7:00 UTC (permalink / raw)
To: Brandon Lo
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim
[-- Attachment #1.1: Type: text/plain, Size: 19864 bytes --]
Hi, Brandon
I found that the NNT server is not being called by Jenkins; can you help to check?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Thursday, April 16, 2020 11:04 PM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I saw your changes and applied them manually.
This involves me taking the machine down for maintenance on our backend, and then doing a normal `reboot`.
Now when I do `cat /proc/cmdline` it shows:
`BOOT_IMAGE=/boot/vmlinuz-4.15.0-55-generic root=UUID=3801030b-237d-428e-9e67-d81e12f16308 ro quiet splash hugepagesz=1G hugepages=40 default_hugepagesz=1G isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on iommu=pt nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85 intel_pstate=disable nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable mce=off tsc=reliable numa_balancing=disable vt.handoff=1`
Which is what I think we wanted to achieve.
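A minimal Python sketch of the check described above, i.e. confirming that the three isolation parameters on the booted kernel command line carry the intended CPU list. The function names and the shortened SAMPLE string are illustrative assumptions, not part of the lab tooling:

```python
def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {key: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        # Split on the first '=' only, so 'root=UUID=...' keeps its full value.
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def check_isolation(cmdline: str, expected: str) -> bool:
    """Return True when isolcpus, nohz_full and rcu_nocbs all equal `expected`."""
    params = parse_cmdline(cmdline)
    return all(params.get(key) == expected
               for key in ("isolcpus", "nohz_full", "rcu_nocbs"))


# Shortened copy of the command line quoted in this message (illustrative).
SAMPLE = ("BOOT_IMAGE=/boot/vmlinuz-4.15.0-55-generic ro quiet splash "
          "isolcpus=1-21,49-69,24-37,72-85 nohz_full=1-21,49-69,24-37,72-85 "
          "rcu_nocbs=1-21,49-69,24-37,72-85 intel_pstate=disable")

print(check_isolation(SAMPLE, "1-21,49-69,24-37,72-85"))
```

On the machine itself the input would come from reading /proc/cmdline instead of the SAMPLE string.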
I have now done the `reboot_to_ro` command and it should be working fine for testing.
Thanks,
Brandon
On Wed, Apr 15, 2020 at 11:40 PM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
I used the ‘reboot_to_rw’ command to restart the server, then edited the file ‘/etc/default/grub’ to reset the isolated CPUs,
changing the fields ‘isolcpus=1-48 nohz_full=1-48 rcu_nocbs=1-48’ to
‘isolcpus=1-21,49-69,24-37,72-85 nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85’.
Details below:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash hugepagesz=1G hugepages=40 default_hugepagesz=1G isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on iommu=pt nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85 intel_pstate=disable nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable mce=off tsc=reliable numa_balancing=disable"
Then I ran update-grub to make the configuration take effect.
Then I rebooted the server with ‘reboot_to_ro’, but I can’t find the change in the kernel command line (cat /proc/cmdline).
Can you help me check it?
Thanks
Lihong
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Thursday, April 16, 2020 3:31 AM
To: Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I think any time is fine to make these changes.
Please use the 'reboot_to_rw' command once you have logged on, this will make sure your changes are saved.
It will also disable testing momentarily while you make your changes.
Once you're finished, you can reboot it with 'reboot_to_ro'.
Please let me know when you're done so I can make sure everything is working as intended.
We could also do this change for you if needed.
Thanks,
Brandon
On Wed, Apr 15, 2020 at 5:39 AM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
I found that the grub configuration for CPU isolation misses the logical cores that are sibling threads of the isolated cores.
For example:
If the server’s CPU layout is as below and we want to isolate CPUs 1-20,
the isolation config should be ‘isolcpus=1-20,49-68’.
[cid:1717f4fa8006917eb1]
So, I want to change the grub configuration of the isolation cpus.
From ‘isolcpus=1-48’ change to ‘isolcpus=1-21,49-68,24-37,72-85’
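A minimal Python sketch of the sibling-thread rule in the example above. It assumes, as the 1-20 / 49-68 pairing suggests, that logical CPU n and logical CPU n+48 share a physical core on this server; the function names and the default offset of 48 are illustrative assumptions:

```python
def _fmt(lo: int, hi: int) -> str:
    """Format an inclusive CPU range the way isolcpus expects it."""
    return f"{lo}-{hi}" if hi > lo else str(lo)


def with_siblings(cpu_list: str, offset: int = 48) -> str:
    """Append the sibling-thread range after each range in `cpu_list`.

    `offset` is the assumed distance between a logical CPU and its
    hyper-thread sibling (48 here, inferred from the example only).
    """
    parts = []
    for chunk in cpu_list.split(","):
        lo_text, _, hi_text = chunk.strip().partition("-")
        lo = int(lo_text)
        hi = int(hi_text) if hi_text else lo
        parts.append(_fmt(lo, hi))
        parts.append(_fmt(lo + offset, hi + offset))
    return ",".join(parts)


print(with_siblings("1-20"))        # 1-20,49-68
print(with_siblings("1-21,24-37"))  # 1-21,49-69,24-37,72-85
```

On a live Linux system the real pairing can be read from /sys/devices/system/cpu/cpuN/topology/thread_siblings_list instead of being assumed.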
Can you help me check what time would be suitable for making this change?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Friday, April 3, 2020 2:39 AM
To: Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I have changed the baselines to reflect the new expected values.
The performance tests should work as expected and pass.
We will email again in the future if we come across any problems.
Feel free to email us as well if you would like to make any other changes.
Thank you for all your help
On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
Thanks for your recommendations; I have made the changes.
The throughput value of nic_single_core is proportional to the CPU frequency,
so I recommend changing the baseline according to our report system.
On our 2.50GHz system, the baseline values are as below:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562
64        2048     41.439
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608
64        2048     47.73
The testbed at UNH is a 2.1GHz CPU server, so the expected numbers should be:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562 / 2.5 * 2.1 = 44.152
64        2048     41.439 / 2.5 * 2.1 = 34.809
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608 / 2.5 * 2.1 = 50.071
64        2048     47.73 / 2.5 * 2.1 = 40.093
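The frequency scaling used for the UNH numbers above can be sketched in Python as follows. The function name and the dictionary layout are illustrative assumptions; the input values and the 2.5 / 2.1 GHz frequencies are the ones given in this message:

```python
def scale_baseline(value: float, ref_ghz: float = 2.5, target_ghz: float = 2.1) -> float:
    """Scale a throughput baseline linearly with CPU frequency, rounded to 3 places."""
    return round(value / ref_ghz * target_ghz, 3)


# Baselines measured on the 2.50GHz system, keyed by (nic, pkt_size, trd/rxd).
REFERENCE = {
    ("NNT", 64, 512): 52.562,
    ("NNT", 64, 2048): 41.439,
    ("FVL", 64, 512): 59.608,
    ("FVL", 64, 2048): 47.73,
}

for key, value in REFERENCE.items():
    print(key, scale_baseline(value))
```

This relies on the stated assumption that nic_single_core throughput is proportional to CPU frequency; it is not a general rule for other test cases.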
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Tuesday, March 31, 2020 9:42 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
To make changes to either Intel machine, please reboot using the command "reboot_to_rw" as root to reboot the machine into read/write mode.
This command will also disable any testing on the machine.
To re-enable the machine, please run "reboot_to_ro" as root, and it will save all of the changes that you've made and re-enable testing on the machine.
I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro" instead of the normal "reboot" while you're making changes.
After you're done, please let me know. I'll have to manually run a test and update the baseline using our internal CI.
Thank you
On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Please let me know how to make changes to this machine that gets reset (ip/access...) and how to disable it.
After that please help to change the baseline.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Thursday, March 26, 2020 11:39 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, we have a system in place that resets any changes made while testing is enabled for a machine.
If you would like, I can disable testing and allow you to make permanent changes.
I can also reset the baseline of Intel 10G test performance once you make these changes.
Please let me know if you would like to make permanent changes on the Intel 10G so I can disable it for you.
Thanks
On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Thanks. Brandon.
That’s good. We have made changes on the 10G testbed.
I monitored several execution results and found that the 10G results always show a gap of -0.9% to -1.x% against the expected number, so with the +-1% tolerance it can lead to occasional failures. I suggest adjusting the expected number. I don’t know where the expected number comes from; as far as I know it is a dynamic number that depends on the baseline? Please help to clarify, thanks.
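A minimal Python sketch of the comparison being discussed, reading the "+-1%" in this message as the pass/fail tolerance (that reading is an interpretation, and the lab's real check may differ). The function names and the sample numbers in the usage lines are hypothetical:

```python
def gap_percent(measured: float, expected: float) -> float:
    """Relative gap of a measured result against the expected number, in percent."""
    return (measured - expected) / expected * 100.0


def within_tolerance(measured: float, expected: float, tolerance_pct: float = 1.0) -> bool:
    """True when the absolute gap does not exceed the tolerance (assumed +-1%)."""
    return abs(gap_percent(measured, expected)) <= tolerance_pct


# Hypothetical numbers: a -0.9% gap passes, a -1.2% gap fails.
print(within_tolerance(99.1, 100.0))
print(within_tolerance(98.8, 100.0))
```

A result that sits consistently near -1% will flip between pass and fail with ordinary run-to-run noise, which is why moving the expected number is suggested rather than widening the tolerance.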
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 24, 2020 9:31 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
I have enabled the 10G Intel machine for testing.
If you would like to make any more changes, please let me know so I can perform the necessary steps to prepare the machine for changes.
Please feel free to let me know if you need anything.
Thank you
On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
For 10G, please enable it. Our code is at the original path /opt/test-harness/dts.
For 40G, please keep it running and see if any issue appears. In any case, we have modified the DTS code at /opt/test-harness/dts-new-suite. If we meet the same problem, then use this new DTS instead.
Thanks a lot
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Saturday, March 21, 2020 1:49 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, the 40G machine is stable enough to be put on production dashboard to run tests which may cause Trex to be killed.
Should I disable the 40G Intel machine for you to make changes?
Also, just for confirmation: on the 10G machine, is the folder that you are using for the testing located in /opt/test-harness/dts-2020-3-4, or are you still using the one in the standard /opt/test-harness/dts folder?
If everything is ok, I will enable the 10G machine for production testing.
Thank you very much
On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Brandon,
We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G). Could you please help to recover them?
But for the FVL (40G) testbed we met some problems; could you please help to check before recovering it:
* Sometimes the 1G hugepages are changed to 2M hugepages automatically, and we have to restart the system.
* While debugging on the testbed, we found that Trex was killed by someone (or some app).
Please help to check whether any other program is running on the testbed.
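A small Python sketch for spotting the first problem above before a test run: it reads the default hugepage size from text in the /proc/meminfo format. The function name and the SAMPLE_MEMINFO excerpt are illustrative assumptions:

```python
from typing import Optional


def hugepage_size_kb(meminfo_text: str) -> Optional[int]:
    """Return the 'Hugepagesize' value in kB from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("Hugepagesize:"):
            # Line looks like: 'Hugepagesize:    1048576 kB'
            return int(line.split()[1])
    return None


# Illustrative excerpt, not captured from the testbed.
SAMPLE_MEMINFO = """HugePages_Total:      40
HugePages_Free:       40
Hugepagesize:    1048576 kB
"""

size_kb = hugepage_size_kb(SAMPLE_MEMINFO)
print("1G hugepages" if size_kb == 1024 * 1024 else f"unexpected size: {size_kb} kB")
```

On the testbed the input would be the contents of /proc/meminfo; a value of 2048 kB would indicate the fallback to 2M pages described above.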
Thanks a lot.
Regards,
Zhaoyan Chen
From: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Sent: Wednesday, March 18, 2020 9:04 PM
To: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Brandon, we have almost finished a workaround.
Maybe tomorrow you could recover Intel’s testbed. I will let you know soon.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Wednesday, March 18, 2020 3:34 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Have you finished making changes on the Intel machine?
I will turn on the machine on March 3rd for testing if you do not have any issues with it.
Please let me know if you need anything else.
Thanks
On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Yes, it’s a weird issue. It is also mixed up with our DTS upgrade and Trex upgrade.
So we are reviewing our DTS script, the different Trex versions, and the CI calling procedure.
Anyway, we are focusing on this task at the moment and will let you know of any update.
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 10, 2020 10:46 PM
To: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
How is the current status of the Intel 82599ES?
Were there any configuration changes made to fix performance issues?
Thanks
On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
Hi David,
This was just a weird issue with the packet generator not cleaning itself after a test fast enough before another test.
I'll rerun the tests that were affected and keep an eye out to see if it's stable enough to be put back online.
Thanks
On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>> wrote:
On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
>
> Hi David and Zhaoyan,
>
>
> Yes, those results are related to the Intel machine; I have disabled testing for the Intel testbed.
>
> The 82599ES machine is now available for ssh and modifications.
Any news about this?
I received a failure on a patch of mine (changing macros in a ARM header).
https://lab.dpdk.org/results/dashboard/patchsets/9900/
But this time, it is with the 40G Intel nic test.
--
David Marchand
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu<mailto:blo@iol.unh.edu>
www.iol.unh.edu<http://www.iol.unh.edu/>
[-- Attachment #1.2: Type: text/html, Size: 101763 bytes --]
[-- Attachment #2: image001.jpg --]
[-- Type: image/jpeg, Size: 41815 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-17 7:00 ` Ma, LihongX
@ 2020-04-17 15:52 ` Brandon Lo
2020-04-18 1:13 ` Ma, LihongX
0 siblings, 1 reply; 42+ messages in thread
From: Brandon Lo @ 2020-04-17 15:52 UTC (permalink / raw)
To: Ma, LihongX
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim
[-- Attachment #1.1: Type: text/plain, Size: 19017 bytes --]
Hi Lihong,
I have fixed the NNT server not being called by Jenkins. It should work
fine now.
We did run into an issue where it could not detect huge pages information,
but rebooting it again fixed the issue.
I have also done the same thing to the FVL server. Both of them should be
called by Jenkins for testing.
Thanks,
Brandon
On Fri, Apr 17, 2020 at 3:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
> Hi, Brandon
>
>
>
> I found that the NNT server is not being called by Jenkins; can you help
> to check?
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Thursday, April 16, 2020 11:04 PM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> I saw your changes and applied them manually.
>
> This involves me taking the machine down for maintenance on our backend,
> and then doing a normal `reboot`.
>
>
>
> Now when I do `cat /proc/cmdline` it shows:
>
> `BOOT_IMAGE=/boot/vmlinuz-4.15.0-55-generic
> root=UUID=3801030b-237d-428e-9e67-d81e12f16308 ro quiet splash
> hugepagesz=1G hugepages=40 default_hugepagesz=1G
> isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on iommu=pt
> nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85
> intel_pstate=disable nmi_watchdog=0 audit=0 nosoftlockup
> processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable mce=off
> tsc=reliable numa_balancing=disable vt.handoff=1`
>
>
>
> Which is what I think we wanted to achieve.
>
> I have now done the `reboot_to_ro` command and it should be working fine
> for testing.
>
>
>
> Thanks,
>
> Brandon
>
>
>
>
>
> On Wed, Apr 15, 2020 at 11:40 PM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
>
>
> I used the ‘reboot_to_rw’ command to restart the server, then edited the
> file ‘/etc/default/grub’ to reset the isolated CPUs,
>
>
>
> changing the fields ‘isolcpus=1-48 nohz_full=1-48 rcu_nocbs=1-48’ to
>
> ‘isolcpus=1-21,49-69,24-37,72-85 nohz_full=1-21,49-69,24-37,72-85
> rcu_nocbs=1-21,49-69,24-37,72-85’.
>
>
>
> Detail as below:
>
> GRUB_CMDLINE_LINUX_DEFAULT="quiet splash hugepagesz=1G hugepages=40
> default_hugepagesz=1G isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on
> iommu=pt nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85
> intel_pstate=disable nmi_watchdog=0 audit=0 nosoftlockup
> processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable mce=off
> tsc=reliable numa_balancing=disable"
>
>
>
> Then I ran update-grub to make the configuration take effect.
>
> Then I rebooted the server with ‘reboot_to_ro’, but I can’t find the
> change in the kernel command line (cat /proc/cmdline).
>
>
>
> Can you help me check it?
>
>
>
> Thanks
>
> Lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Thursday, April 16, 2020 3:31 AM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> I think any time is fine to make these changes.
>
>
>
> Please use the 'reboot_to_rw' command once you have logged on, this will
> make sure your changes are saved.
>
> It will also disable testing momentarily while you make your changes.
>
>
>
> Once you're finished, you can reboot it with 'reboot_to_ro'.
>
> Please let me know when you're done so I can make sure everything is
> working as intended.
>
> We could also do this change for you if needed.
>
>
>
> Thanks,
>
> Brandon
>
>
>
> On Wed, Apr 15, 2020 at 5:39 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
>
>
> I found that the grub configuration for CPU isolation misses the logical
> cores that are sibling threads of the isolated cores.
>
>
>
> For example:
>
> If the server’s CPU layout is as below and we want to isolate CPUs 1-20,
>
> the isolation config should be ‘isolcpus=1-20,49-68’.
>
>
>
> [image: cid:1717f4fa8006917eb1]
>
>
>
> So, I want to change the grub configuration of the isolation cpus.
>
> From ‘isolcpus=1-48’ change to ‘isolcpus=1-21,49-68,24-37,72-85’
>
>
>
> Can you help me check what time would be suitable for making this change?
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Friday, April 3, 2020 2:39 AM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> I have changed the baselines to reflect the new expected values.
>
> The performance tests should work as expected and pass.
>
>
>
> We will email again in the future if we come across any problems.
>
> Feel free to email us as well if you would like to make any other changes.
>
>
>
> Thank you for all your help
>
>
>
> On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
> Thanks for your recommendations; I have made the changes.
>
> The throughput value of nic_single_core is proportional to the CPU
> frequency, so I recommend changing the baseline according to our report
> system.
>
>
>
> On our 2.50GHz system, the baseline values are as below:
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562
> 64        2048     41.439
>
>
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608
> 64        2048     47.73
>
>
>
> The testbed at UNH is a 2.1GHz CPU server, so the expected numbers
> should be:
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562 / 2.5 * 2.1 = 44.152
> 64        2048     41.439 / 2.5 * 2.1 = 34.809
>
>
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608 / 2.5 * 2.1 = 50.071
> 64        2048     47.73 / 2.5 * 2.1 = 40.093
>
>
>
>
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Tuesday, March 31, 2020 9:42 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> To make changes to either Intel machine, please reboot using the command
> "reboot_to_rw" as root to reboot the machine into read/write mode.
>
> This command will also disable any testing on the machine.
>
>
>
> To re-enable the machine, please run "reboot_to_ro" as root, and it will
> save all of the changes that you've made and re-enable testing on the
> machine.
>
> I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro"
> instead of the normal "reboot" while you're making changes.
>
>
>
> After you're done, please let me know. I'll have to manually run a test
> and update the baseline using our internal CI.
>
>
>
> Thank you
>
>
>
> On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Please let me know how to make changes to this machine that gets reset
> (ip/access...) and how to disable it.
>
>
>
> After that please help to change the baseline.
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Thursday, March 26, 2020 11:39 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, we have a system in place that resets any changes made while
> testing is enabled for a machine.
>
> If you would like, I can disable testing and allow you to make permanent
> changes.
>
>
>
> I can also reset the baseline of Intel 10G test performance once you make
> these changes.
>
> Please let me know if you would like to make permanent changes on the
> Intel 10G so I can disable it for you.
>
>
>
> Thanks
>
>
>
> On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Thanks. Brandon.
>
>
>
> That’s good. We have made changes on the 10G testbed.
>
>
>
> I monitored several execution results and found that the 10G results
> always show a gap of -0.9% to -1.x% against the expected number, so the
> test can sometimes fail the +/-1% check. I suggest adjusting the expected
> number. I don’t know where the expected number comes from; as far as I
> know it is a dynamic number that depends on the baseline. Please help to
> clarify, thanks.
>
>
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 24, 2020 9:31 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> I have enabled the 10G Intel machine for testing.
>
> If you would like to make any more changes, please let me know so I can
> perform the necessary steps to prepare the machine for changes.
>
> Please feel free to let me know if you need anything.
>
>
>
> Thank you
>
>
>
> On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> For 10G, please enable it. Our code is at the original path
> */opt/test-harness/dts*.
>
>
>
> For 40G, please keep it running and see if any issue appears. In any case,
> we have modified the DTS code at /opt/test-harness/dts-new-suite. If we meet
> the same problem, then use this new DTS instead.
>
>
>
> Thanks a lot
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Saturday, March 21, 2020 1:49 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, the 40G machine is stable enough to be put on production
> dashboard to run tests which may cause Trex to be killed.
>
> Should I disable the 40G Intel machine for you to make changes?
>
>
>
> Also, just for confirmation: on the 10G machine, is the folder that you
> are using for the testing located in */opt/test-harness/dts-2020-3-4*, or
> are you still using the one in the standard */opt/test-harness/dts*
> folder?
>
>
>
> If everything is ok, I will enable the 10G machine for production testing.
>
>
>
> Thank you very much
>
>
>
> On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Brandon,
>
>
>
> We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G).
> Could you please help to recover them?
>
>
>
> But for the FVL (40G) testbed we met some problems. Could you please help
> to check them before recovering it?
>
> - Sometimes the 1G hugepages are changed to 2M hugepages
> automatically, and we have to restart the system.
> - While debugging on the testbed, we found that Trex was killed by
> something (some application).
>
> Please help to check whether any other program is running on the testbed.
>
>
>
> Thanks a lot.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Sent:* Wednesday, March 18, 2020 9:04 PM
> *To:* Brandon Lo <blo@iol.unh.edu>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Subject:* RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Brandon, we have almost finished a workaround.
>
>
>
> Maybe tomorrow, you could recover Intel’s testbed. I will let you know
> soon.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Wednesday, March 18, 2020 3:34 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Have you finished making changes on the Intel machine?
>
> I will turn on the machine on March 3rd for testing if you do not have any
> issues with it.
>
> Please let me know if you need anything else.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Yes, it’s a weird issue. It is also mixed up with our DTS upgrade and Trex
> upgrade.
>
> So we are reviewing our DTS script, the different Trex versions, and the CI
> calling procedure.
>
>
>
> Anyway, we are focusing on this task now and will let you know of any
> update.
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 10, 2020 10:46 PM
> *To:* David Marchand <david.marchand@redhat.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> What is the current status of the Intel 82599ES?
>
> Were there any configuration changes made to fix performance issues?
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu> wrote:
>
> Hi David,
>
>
>
> This was just a weird issue with the packet generator not cleaning itself
> after a test fast enough before another test.
>
> I'll rerun the tests that were affected and keep an eye out to see if it's
> stable enough to be put back online.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com>
> wrote:
>
> On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu> wrote:
> >
> > Hi David and Zhaoyan,
> >
> >
> > Yes, those results are related to the Intel machine; I have disabled
> testing for the Intel testbed.
> >
> > The 82599ES machine is now available for ssh and modifications.
>
> Any news about this?
>
> I received a failure on a patch of mine (changing macros in an ARM header).
> https://lab.dpdk.org/results/dashboard/patchsets/9900/
>
> But this time, it is with the 40G Intel nic test.
>
> --
> David Marchand
>
>
>
>
> --
>
> Brandon Lo
>
> UNH InterOperability Laboratory
>
> 21 Madbury Rd, Suite 100, Durham, NH 03824
>
> blo@iol.unh.edu
>
> www.iol.unh.edu
>
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu
[-- Attachment #1.2: Type: text/html, Size: 82351 bytes --]
[-- Attachment #2: image001.jpg --]
[-- Type: image/jpeg, Size: 41815 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-17 15:52 ` Brandon Lo
@ 2020-04-18 1:13 ` Ma, LihongX
0 siblings, 0 replies; 42+ messages in thread
From: Ma, LihongX @ 2020-04-18 1:13 UTC (permalink / raw)
To: Brandon Lo
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim
[-- Attachment #1.1: Type: text/plain, Size: 21479 bytes --]
Yes, they work fine now. Many thanks for Brandon’s help.
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Friday, April 17, 2020 11:53 PM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I have fixed the NNT server not being called by Jenkins. It should work fine now.
We did run into an issue where it could not detect huge pages information, but rebooting it again fixed the issue.
I have also done the same thing to the FVL server. Both of them should be called by Jenkins for testing.
Thanks,
Brandon
On Fri, Apr 17, 2020 at 3:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
I found that the NNT server is not being called by Jenkins. Can you help to check?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Thursday, April 16, 2020 11:04 PM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I saw your changes and applied them manually.
This involves me taking the machine down for maintenance on our backend, and then doing a normal `reboot`.
Now when I do `cat /proc/cmdline` it shows:
`BOOT_IMAGE=/boot/vmlinuz-4.15.0-55-generic root=UUID=3801030b-237d-428e-9e67-d81e12f16308 ro quiet splash hugepagesz=1G hugepages=40 default_hugepagesz=1G isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on iommu=pt nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85 intel_pstate=disable nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable mce=off tsc=reliable numa_balancing=disable vt.handoff=1`
Which is what I think we wanted to achieve.
I have now done the `reboot_to_ro` command and it should be working fine for testing.
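As a rough illustration of the check described above (not the lab's actual tooling), the following Python sketch parses a kernel command line and confirms that `isolcpus`, `nohz_full`, and `rcu_nocbs` all carry the same CPU list. The helper names `parse_cmdline` and `isolation_matches` are hypothetical, and the sample string is a shortened copy of the `/proc/cmdline` output quoted above.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {key: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        # Split on the first '=' only, so values such as root=UUID=... stay whole.
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def isolation_matches(cmdline: str, expected_cpus: str) -> bool:
    """Return True if all three isolation parameters equal expected_cpus."""
    params = parse_cmdline(cmdline)
    return all(
        params.get(name) == expected_cpus
        for name in ("isolcpus", "nohz_full", "rcu_nocbs")
    )


# Shortened sample taken from the output quoted in this message.
SAMPLE_CMDLINE = (
    "ro quiet splash hugepagesz=1G hugepages=40 default_hugepagesz=1G "
    "isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on iommu=pt "
    "nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85"
)

print(isolation_matches(SAMPLE_CMDLINE, "1-21,49-69,24-37,72-85"))
```

On the machine itself, the same check could read the live value with `open("/proc/cmdline").read()` instead of the sample string.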
Thanks,
Brandon
On Wed, Apr 15, 2020 at 11:40 PM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
I used the command ‘reboot_to_rw’ to restart the server, and edited the file ‘/etc/default/grub’ to reset the isolated CPUs,
changing the field ‘isolcpus=1-48 nohz_full=1-48 rcu_nocbs=1-48’ to
‘isolcpus=1-21,49-69,24-37,72-85 nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85’
Details as below:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash hugepagesz=1G hugepages=40 default_hugepagesz=1G isolcpus=1-21,49-69,24-37,72-85 intel_iommu=on iommu=pt nohz_full=1-21,49-69,24-37,72-85 rcu_nocbs=1-21,49-69,24-37,72-85 intel_pstate=disable nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable mce=off tsc=reliable numa_balancing=disable"
Then I ran update-grub to make the configuration take effect.
Then I rebooted the server using ‘reboot_to_ro’, but I can’t find the change in the kernel command line (cat /proc/cmdline).
Can you help me to check it?
Thanks
Lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Thursday, April 16, 2020 3:31 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I think any time is fine to make these changes.
Please use the 'reboot_to_rw' command once you have logged on, this will make sure your changes are saved.
It will also disable testing momentarily while you make your changes.
Once you're finished, you can reboot it with 'reboot_to_ro'.
Please let me know when you're done so I can make sure everything is working as intended.
We could also do this change for you if needed.
Thanks,
Brandon
On Wed, Apr 15, 2020 at 5:39 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
I found that the grub configuration of the isolated CPUs misses the logical cores that are sibling threads of the isolated cores.
For example:
If the server CPU layout is as below and we want to isolate CPUs 1-20,
the isolation config should be ‘isolcpus=1-20,49-68’
[cid:1717f4fa8006917eb1]
So, I want to change the grub configuration of the isolated CPUs
from ‘isolcpus=1-48’ to ‘isolcpus=1-21,49-69,24-37,72-85’
Can you help me check what time is suitable for this change?
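A minimal sketch of how such an `isolcpus` list can be built, assuming (as the 1-20 / 49-68 example above suggests) that logical CPU n and logical CPU n + 48 are hyper-thread siblings on this server. The function name and the fixed offset are illustrative assumptions only; the real layout should be confirmed on the machine.

```python
def isolcpus_with_siblings(core_ranges, sibling_offset):
    """Build an isolcpus value listing each CPU range followed by its sibling range.

    core_ranges is a list of (first, last) logical CPU numbers; sibling_offset is
    the assumed distance between a logical CPU and its hyper-thread sibling.
    """
    parts = []
    for first, last in core_ranges:
        parts.append(f"{first}-{last}")
        parts.append(f"{first + sibling_offset}-{last + sibling_offset}")
    return ",".join(parts)


# Assumption: siblings are 48 apart, inferred from the example in this message.
print(isolcpus_with_siblings([(1, 20)], 48))
print(isolcpus_with_siblings([(1, 21), (24, 37)], 48))
```

On Linux, the sibling mapping does not have to be assumed: it can be read per CPU from `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`.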
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Friday, April 3, 2020 2:39 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I have changed the baselines to reflect the new expected values.
The performance tests should work as expected and pass.
We will email again in the future if we come across any problems.
Feel free to email us as well if you would like to make any other changes.
Thank you for all your help
On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
Thanks for your recommendations; I have made the changes.
The throughput value of nic_single_core is proportional to the CPU frequency,
so I recommend changing the baseline according to our report system.
On our 2.50GHz system, the baseline values are as below:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562
64        2048     41.439
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608
64        2048     47.73
The testbed at UNH is a 2.1GHz CPU server, so the expected numbers should be:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562 / 2.5 * 2.1 = 44.152
64        2048     41.439 / 2.5 * 2.1 = 34.809
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608 / 2.5 * 2.1 = 50.071
64        2048     47.73 / 2.5 * 2.1 = 40.093
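The scaling above can be reproduced with a short Python sketch. It assumes, as stated in this message, that single-core throughput scales linearly with CPU frequency; the dictionary layout and the function name `scale_baseline` are only illustrative.

```python
REFERENCE_GHZ = 2.5  # frequency of the system that produced the reference values
TARGET_GHZ = 2.1     # frequency of the UNH testbed

# (NIC, pkt_size, trd/rxd) -> expected_value measured on the 2.5GHz system
REFERENCE_BASELINES = {
    ("NNT", 64, 512): 52.562,
    ("NNT", 64, 2048): 41.439,
    ("FVL", 64, 512): 59.608,
    ("FVL", 64, 2048): 47.73,
}


def scale_baseline(value, reference_ghz=REFERENCE_GHZ, target_ghz=TARGET_GHZ):
    """Scale a throughput baseline linearly with CPU frequency, to 3 decimals."""
    return round(value / reference_ghz * target_ghz, 3)


for (nic, pkt_size, descriptors), value in REFERENCE_BASELINES.items():
    print(nic, pkt_size, descriptors, scale_baseline(value))
```

The linear model is only an approximation taken from this thread; a baseline measured directly on the target machine would still be the safer reference.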
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Tuesday, March 31, 2020 9:42 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
To make changes to either Intel machine, please reboot using the command "reboot_to_rw" as root to reboot the machine into read/write mode.
This command will also disable any testing on the machine.
To re-enable the machine, please run "reboot_to_ro" as root, and it will save all of the changes that you've made and re-enable testing on the machine.
I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro" instead of the normal "reboot" while you're making changes.
After you're done, please let me know. I'll have to manually run a test and update the baseline using our internal CI.
Thank you
On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
Please let me know how to make changes to this auto-reset machine (IP, access, ...) and how to disable it.
After that please help to change the baseline.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Thursday, March 26, 2020 11:39 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, we have a system in place that resets any changes made while testing is enabled for a machine.
If you would like, I can disable testing and allow you to make permanent changes.
I can also reset the baseline of Intel 10G test performance once you make these changes.
Please let me know if you would like to make permanent changes on the Intel 10G so I can disable it for you.
Thanks
On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Thanks, Brandon.
That’s good. We have made changes on the 10G testbed.
I monitored several execution results and found that the 10G results always show a gap of -0.9% to -1.x% against the expected number, so the test can sometimes fail the +/-1% check. I suggest adjusting the expected number. I don’t know where the expected number comes from; as far as I know it is a dynamic number that depends on the baseline. Please help to clarify, thanks.
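To illustrate why a steady gap of about -1% produces intermittent failures, here is a small Python sketch of a tolerance check. It assumes the pass criterion is a symmetric +/-1% band around the expected number, which is how the check is described in this message; the actual lab logic is not shown in this thread, and the sample numbers are made up.

```python
def gap_percent(measured: float, expected: float) -> float:
    """Relative difference between measured and expected throughput, in percent."""
    return (measured - expected) / expected * 100.0


def passes(measured: float, expected: float, tolerance_percent: float = 1.0) -> bool:
    """Pass when the measured value lies within +/- tolerance_percent of expected."""
    return abs(gap_percent(measured, expected)) <= tolerance_percent


# Hypothetical values: two runs that differ only by normal run-to-run noise.
EXPECTED = 100.0
for measured in (99.1, 98.8):
    gap = gap_percent(measured, EXPECTED)
    print(f"measured={measured} gap={gap:+.2f}% pass={passes(measured, EXPECTED)}")
```

With the expected number sitting about 1% above what the machine typically delivers, small noise is enough to push a run across the threshold, which is why lowering the expected number is suggested here.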
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Tuesday, March 24, 2020 9:31 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
I have enabled the 10G Intel machine for testing.
If you would like to make any more changes, please let me know so I can perform the necessary steps to prepare the machine for changes.
Please feel free to let me know if you need anything.
Thank you
On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
For 10G, please enable it. Our code is at the original path /opt/test-harness/dts.
For 40G, please keep it running and see if any issue appears. In any case, we have modified the DTS code at /opt/test-harness/dts-new-suite. If we meet the same problem, then use this new DTS instead.
Thanks a lot
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Saturday, March 21, 2020 1:49 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, the 40G machine is stable enough to be put on production dashboard to run tests which may cause Trex to be killed.
Should I disable the 40G Intel machine for you to make changes?
Also, just for confirmation: on the 10G machine, is the folder that you are using for the testing located in /opt/test-harness/dts-2020-3-4, or are you still using the one in the standard /opt/test-harness/dts folder?
If everything is ok, I will enable the 10G machine for production testing.
Thank you very much
On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Brandon,
We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G). Could you please help to recover them?
But for the FVL (40G) testbed we met some problems. Could you please help to check them before recovering it?
* Sometimes the 1G hugepages are changed to 2M hugepages automatically, and we have to restart the system.
* While debugging on the testbed, we found that Trex was killed by something (some application).
Please help to check whether any other program is running on the testbed.
Thanks a lot.
Regards,
Zhaoyan Chen
From: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Sent: Wednesday, March 18, 2020 9:04 PM
To: Brandon Lo <blo@iol.unh.edu>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Brandon, we have almost finished a workaround.
Maybe tomorrow, you could recover Intel’s testbed. I will let you know soon.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Wednesday, March 18, 2020 3:34 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Have you finished making changes on the Intel machine?
I will turn on the machine on March 3rd for testing if you do not have any issues with it.
Please let me know if you need anything else.
Thanks
On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
Yes, it’s a weird issue. It is also mixed up with our DTS upgrade and Trex upgrade.
So we are reviewing our DTS script, the different Trex versions, and the CI calling procedure.
Anyway, we are focusing on this task now and will let you know of any update.
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Tuesday, March 10, 2020 10:46 PM
To: David Marchand <david.marchand@redhat.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
What is the current status of the Intel 82599ES?
Were there any configuration changes made to fix performance issues?
Thanks
On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu> wrote:
Hi David,
This was just a weird issue with the packet generator not cleaning itself after a test fast enough before another test.
I'll rerun the tests that were affected and keep an eye out to see if it's stable enough to be put back online.
Thanks
On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com> wrote:
On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu> wrote:
>
> Hi David and Zhaoyan,
>
>
> Yes, those results are related to the Intel machine; I have disabled testing for the Intel testbed.
>
> The 82599ES machine is now available for ssh and modifications.
Any news about this?
I received a failure on a patch of mine (changing macros in an ARM header).
https://lab.dpdk.org/results/dashboard/patchsets/9900/
But this time, it is with the 40G Intel nic test.
--
David Marchand
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu
[-- Attachment #1.2: Type: text/html, Size: 108194 bytes --]
[-- Attachment #2: image001.jpg --]
[-- Type: image/jpeg, Size: 41815 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-02 18:39 ` Brandon Lo
2020-04-03 1:14 ` Ma, LihongX
2020-04-15 9:39 ` Ma, LihongX
@ 2020-04-28 3:39 ` Ma, LihongX
2020-04-28 16:51 ` Brandon Lo
2 siblings, 1 reply; 42+ messages in thread
From: Ma, LihongX @ 2020-04-28 3:39 UTC (permalink / raw)
To: Brandon Lo
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim, Lin, Xueqin
[-- Attachment #1: Type: text/plain, Size: 14780 bytes --]
Hi, Brandon
I found that the NNT baseline has changed as expected, but the FVL baseline is still the same as before.
Can you help to check it and change the baseline as expected?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Friday, April 3, 2020 2:39 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I have changed the baselines to reflect the new expected values.
The performance tests should work as expected and pass.
We will email again in the future if we come across any problems.
Feel free to email us as well if you would like to make any other changes.
Thank you for all your help
On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
Thanks for your recommendations; I have made the changes.
Since the throughput value of nic_single_core is proportional to the CPU frequency,
I recommend changing the baseline according to our report system.
On our 2.50 GHz system, the baseline values are as below:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562
64        2048     41.439
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608
64        2048     47.73
The testbed at UNH is a 2.1 GHz CPU server, so the expected values should be:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562 / 2.5 * 2.1 = 44.152
64        2048     41.439 / 2.5 * 2.1 = 34.809
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608 / 2.5 * 2.1 = 50.071
64        2048     47.73 / 2.5 * 2.1 = 40.093
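The frequency scaling applied to the values above can be sketched in Python as follows. This is only an illustration: the name scale_baseline and the dictionary layout are hypothetical (they are not part of DTS or the lab CI), and the sketch assumes the linear relation to CPU frequency stated in this mail.

```python
# Hypothetical helper, not part of DTS or the UNH CI.
# Assumes throughput scales linearly with CPU frequency, as stated in the mail.

REFERENCE_GHZ = 2.5  # CPU frequency of the Intel report system
TARGET_GHZ = 2.1     # CPU frequency of the UNH testbed


def scale_baseline(reference_value: float,
                   reference_ghz: float = REFERENCE_GHZ,
                   target_ghz: float = TARGET_GHZ) -> float:
    """Scale a reference baseline to the target CPU frequency (3 decimals)."""
    return round(reference_value / reference_ghz * target_ghz, 3)


# (nic, pkt_size, trd/rxd) -> expected_value measured on the 2.5 GHz system
reference_baselines = {
    ("NNT", 64, 512): 52.562,
    ("NNT", 64, 2048): 41.439,
    ("FVL", 64, 512): 59.608,
    ("FVL", 64, 2048): 47.73,
}

scaled_baselines = {
    key: scale_baseline(value) for key, value in reference_baselines.items()
}

for (nic, pkt_size, descriptors), value in scaled_baselines.items():
    print(f"{nic} pkt_size={pkt_size} trd/rxd={descriptors}: {value}")
```

Keeping the reference values and the two frequencies as data makes it easy to recompute the expected values if either system changes.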
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Tuesday, March 31, 2020 9:42 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
To make changes to either Intel machine, please reboot using the command "reboot_to_rw" as root to reboot the machine into read/write mode.
This command will also disable any testing on the machine.
To re-enable the machine, please run "reboot_to_ro" as root, and it will save all of the changes that you've made and re-enable testing on the machine.
I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro" instead of the normal "reboot" while you're making changes.
After you're done, please let me know. I'll have to manually run a test and update the baseline using our internal CI.
Thank you
On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Please let me know how to make changes on this machine that gets reset (IP, access, ...) and how to disable it.
After that, please help to change the baseline.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Thursday, March 26, 2020 11:39 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, we have a system in place that resets any changes made while testing is enabled for a machine.
If you would like, I can disable testing and allow you to make permanent changes.
I can also reset the baseline of Intel 10G test performance once you make these changes.
Please let me know if you would like to make permanent changes on the Intel 10G so I can disable it for you.
Thanks
On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Thanks, Brandon.
That's good. We have made changes on the 10G testbed.
I monitored several execution results and found that the 10G results always show a gap of -0.9% to -1.x% against the expected number. With a +/-1% tolerance, that can lead to occasional failures, so I suggest adjusting the expected number. I don't know where the expected number comes from; as far as I know, it is a dynamic number that depends on the baseline. Please help to clarify.
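To illustrate why a steady gap of about -1% leads to intermittent failures, here is a minimal Python sketch of a tolerance check. The function name and the sample numbers are hypothetical; only the +/-1% tolerance is taken from the mail above, and the lab's actual comparison logic is not shown in this thread.

```python
# Hypothetical tolerance check; the real CI comparison may differ.

def within_tolerance(measured: float, expected: float,
                     tolerance_pct: float = 1.0) -> bool:
    """Return True if measured deviates from expected by at most tolerance_pct."""
    deviation_pct = (measured - expected) / expected * 100.0
    return abs(deviation_pct) <= tolerance_pct


expected = 100.0  # hypothetical expected value, chosen for easy percentages
for measured in (99.1, 98.8):  # gaps of -0.9% and -1.2%
    status = "pass" if within_tolerance(measured, expected) else "fail"
    print(f"measured={measured} expected={expected}: {status}")
```

A result that sits just inside the tolerance on one run and just outside on the next produces the occasional failures described, which is why lowering the expected number is suggested.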
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 24, 2020 9:31 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
I have enabled the 10G Intel machine for testing.
If you would like to make any more changes, please let me know so I can perform the necessary steps to prepare the machine for changes.
Please feel free to let me know if you need anything.
Thank you
On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
For 10G, please enable it. Our code is at the original path, /opt/test-harness/dts.
For 40G, please keep it running and see whether any issue appears. In any case, we have modified the DTS code at /opt/test-harness/dts-new-suite; if we meet the same problem, use this new DTS instead.
Thanks a lot
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Saturday, March 21, 2020 1:49 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, the 40G machine is stable enough to be put on production dashboard to run tests which may cause Trex to be killed.
Should I disable the 40G Intel machine for you to make changes?
Also, just for confirmation: on the 10G machine, is the folder that you are using for the testing located in /opt/test-harness/dts-2020-3-4, or are you still using the one in the standard /opt/test-harness/dts folder?
If everything is ok, I will enable the 10G machine for production testing.
Thank you very much
On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Brandon,
We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G). Could you please help to recover them?
However, we met some problems on the FVL (40G) testbed. Could you please check them before recovering it:
* Sometimes the 1G hugepages are changed to 2M hugepages automatically, and we have to restart the system.
* While debugging on the testbed, we found that Trex was killed by someone or some application.
Please help to check whether any other program is running on the testbed.
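One way to notice the hugepage change described above is to read the default hugepage size from /proc/meminfo before a run. A minimal Python sketch follows; the function name is hypothetical, and it assumes the testbed's default hugepage size is meant to be 1G, which this thread does not confirm. Note that the Hugepagesize field reports only the default size.

```python
import re
from typing import Optional

ONE_GIB_KB = 1048576  # 1G hugepage size expressed in kB


def default_hugepage_kb(meminfo_text: str) -> Optional[int]:
    """Extract the default hugepage size in kB from /proc/meminfo content."""
    match = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


# Sample text for illustration; on the testbed the content would come from
# open("/proc/meminfo").read().
sample = "MemTotal:       65536000 kB\nHugepagesize:       2048 kB\n"

size_kb = default_hugepage_kb(sample)
if size_kb != ONE_GIB_KB:
    print(f"default hugepage size is {size_kb} kB, expected {ONE_GIB_KB} kB")
```

Running such a check before each test would show whether the size changes between runs or only after a reboot.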
Thanks a lot.
Regards,
Zhaoyan Chen
From: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Sent: Wednesday, March 18, 2020 9:04 PM
To: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Brandon, we have almost finished a workaround.
You could probably recover Intel's testbed tomorrow. I will let you know soon.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Wednesday, March 18, 2020 3:34 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Have you finished making changes on the Intel machine?
I will turn on the machine on March 3rd for testing if you do not have any issues with it.
Please let me know if you need anything else.
Thanks
On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Yes, it's a weird issue. It is also mixed up with our DTS upgrade and Trex upgrade.
So we are reviewing our DTS script, the different Trex versions, and the CI calling procedure.
Anyway, we are focusing on this task now and will let you know of any update.
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 10, 2020 10:46 PM
To: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
How is the current status of the Intel 82599ES?
Were there any configuration changes made to fix performance issues?
Thanks
On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
Hi David,
This was just a weird issue with the packet generator not cleaning itself after a test fast enough before another test.
I'll rerun the tests that were affected and keep an eye out to see if it's stable enough to be put back online.
Thanks
On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>> wrote:
On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
>
> Hi David and Zhaoyan,
>
>
> Yes, those results are related to the Intel machine; I have disabled testing for the Intel testbed.
>
> The 82599ES machine is now available for ssh and modifications.
Any news about this?
I received a failure on a patch of mine (changing macros in a ARM header).
https://lab.dpdk.org/results/dashboard/patchsets/9900/
But this time, it is with the 40G Intel nic test.
--
David Marchand
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu<mailto:blo@iol.unh.edu>
www.iol.unh.edu<http://www.iol.unh.edu/>
[-- Attachment #2: Type: text/html, Size: 81936 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-28 3:39 ` Ma, LihongX
@ 2020-04-28 16:51 ` Brandon Lo
2020-04-29 5:30 ` Ma, LihongX
0 siblings, 1 reply; 42+ messages in thread
From: Brandon Lo @ 2020-04-28 16:51 UTC (permalink / raw)
To: Ma, LihongX
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim, Lin, Xueqin
[-- Attachment #1: Type: text/plain, Size: 14137 bytes --]
Hi Lihong,
Sorry about that, I have reset the baseline to the values you sent in the
previous email.
I'll look to rerun tests that have failed due to the incorrect baseline.
Thanks for letting me know,
Brandon
On Mon, Apr 27, 2020 at 11:39 PM Ma, LihongX <lihongx.ma@intel.com> wrote:
> Hi, Brandon
>
> The NNT baseline has changed as expected, but the FVL baseline is still the
> same as before.
>
> Could you please check it and update the FVL baseline to the expected values?
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Friday, April 3, 2020 2:39 AM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> I have changed the baselines to reflect the new expected values.
>
> The performance tests should work as expected and pass.
>
>
>
> We will email again in the future if we come across any problems.
>
> Feel free to email us as well if you would like to make any other changes.
>
>
>
> Thank you for all your help
>
>
>
> On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
> Thanks for your recommendations; I have made the changes.
>
> Since the throughput value of nic_single_core is proportional to the CPU
> frequency,
>
> I recommend changing the baseline according to our report system.
>
> On our 2.50 GHz system, the baseline values are as below:
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562
> 64        2048     41.439
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608
> 64        2048     47.73
>
>
>
> The testbed at UNH is a 2.1 GHz CPU server, so the expected values should
> be:
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562 / 2.5 * 2.1 = 44.152
> 64        2048     41.439 / 2.5 * 2.1 = 34.809
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608 / 2.5 * 2.1 = 50.071
> 64        2048     47.73 / 2.5 * 2.1 = 40.093
>
>
>
>
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Tuesday, March 31, 2020 9:42 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> To make changes to either Intel machine, please reboot using the command
> "reboot_to_rw" as root to reboot the machine into read/write mode.
>
> This command will also disable any testing on the machine.
>
>
>
> To re-enable the machine, please run "reboot_to_ro" as root, and it will
> save all of the changes that you've made and re-enable testing on the
> machine.
>
> I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro"
> instead of the normal "reboot" while you're making changes.
>
>
>
> After you're done, please let me know. I'll have to manually run a test
> and update the baseline using our internal CI.
>
>
>
> Thank you
>
>
>
> On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Please let me know how to make changes on this machine that gets reset
> (IP, access, ...) and how to disable it.
>
> After that, please help to change the baseline.
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Thursday, March 26, 2020 11:39 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, we have a system in place that resets any changes made while
> testing is enabled for a machine.
>
> If you would like, I can disable testing and allow you to make permanent
> changes.
>
>
>
> I can also reset the baseline of Intel 10G test performance once you make
> these changes.
>
> Please let me know if you would like to make permanent changes on the
> Intel 10G so I can disable it for you.
>
>
>
> Thanks
>
>
>
> On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Thanks, Brandon.
>
> That's good. We have made changes on the 10G testbed.
>
> I monitored several execution results and found that the 10G results
> always show a gap of -0.9% to -1.x% against the expected number. With a
> +/-1% tolerance, that can lead to occasional failures, so I suggest
> adjusting the expected number. I don't know where the expected number
> comes from; as far as I know, it is a dynamic number that depends on the
> baseline. Please help to clarify.
>
>
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 24, 2020 9:31 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> I have enabled the 10G Intel machine for testing.
>
> If you would like to make any more changes, please let me know so I can
> perform the necessary steps to prepare the machine for changes.
>
> Please feel free to let me know if you need anything.
>
>
>
> Thank you
>
>
>
> On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> For 10G, please enable it. Our code is at the original path,
> /opt/test-harness/dts.
>
> For 40G, please keep it running and see whether any issue appears. In any
> case, we have modified the DTS code at /opt/test-harness/dts-new-suite; if
> we meet the same problem, use this new DTS instead.
>
>
>
> Thanks a lot
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Saturday, March 21, 2020 1:49 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, the 40G machine is stable enough to be put on production
> dashboard to run tests which may cause Trex to be killed.
>
> Should I disable the 40G Intel machine for you to make changes?
>
>
>
> Also, just for confirmation: on the 10G machine, is the folder that you
> are using for the testing located in /opt/test-harness/dts-2020-3-4, or
> are you still using the one in the standard /opt/test-harness/dts
> folder?
>
>
>
> If everything is ok, I will enable the 10G machine for production testing.
>
>
>
> Thank you very much
>
>
>
> On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Brandon,
>
>
>
> We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G).
> Could you please help to recover them?
>
> However, we met some problems on the FVL (40G) testbed. Could you please
> check them before recovering it:
>
>    - Sometimes the 1G hugepages are changed to 2M hugepages automatically,
>    and we have to restart the system.
>    - While debugging on the testbed, we found that Trex was killed by
>    someone or some application.
>
> Please help to check whether any other program is running on the testbed.
>
>
>
> Thanks a lot.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Sent:* Wednesday, March 18, 2020 9:04 PM
> *To:* Brandon Lo <blo@iol.unh.edu>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Subject:* RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Brandon, we have almost finished a workaround.
>
> You could probably recover Intel's testbed tomorrow. I will let you know
> soon.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Wednesday, March 18, 2020 3:34 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Have you finished making changes on the Intel machine?
>
> I will turn on the machine on March 3rd for testing if you do not have any
> issues with it.
>
> Please let me know if you need anything else.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Yes, it's a weird issue. It is also mixed up with our DTS upgrade and Trex
> upgrade.
>
> So we are reviewing our DTS script, the different Trex versions, and the CI
> calling procedure.
>
> Anyway, we are focusing on this task now and will let you know of any
> update.
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 10, 2020 10:46 PM
> *To:* David Marchand <david.marchand@redhat.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> How is the current status of the Intel 82599ES?
>
> Were there any configuration changes made to fix performance issues?
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu> wrote:
>
> Hi David,
>
>
>
> This was just a weird issue with the packet generator not cleaning itself
> after a test fast enough before another test.
>
> I'll rerun the tests that were affected and keep an eye out to see if it's
> stable enough to be put back online.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com>
> wrote:
>
> On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu> wrote:
> >
> > Hi David and Zhaoyan,
> >
> >
> > Yes, those results are related to the Intel machine; I have disabled
> testing for the Intel testbed.
> >
> > The 82599ES machine is now available for ssh and modifications.
>
> Any news about this?
>
> I received a failure on a patch of mine (changing macros in a ARM header).
> https://lab.dpdk.org/results/dashboard/patchsets/9900/
>
> But this time, it is with the 40G Intel nic test.
>
> --
> David Marchand
>
>
>
>
> --
>
> Brandon Lo
>
> UNH InterOperability Laboratory
>
> 21 Madbury Rd, Suite 100, Durham, NH 03824
>
> blo@iol.unh.edu
>
> www.iol.unh.edu
>
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu
[-- Attachment #2: Type: text/html, Size: 65457 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-28 16:51 ` Brandon Lo
@ 2020-04-29 5:30 ` Ma, LihongX
2020-04-30 14:19 ` Brandon Lo
0 siblings, 1 reply; 42+ messages in thread
From: Ma, LihongX @ 2020-04-29 5:30 UTC (permalink / raw)
To: Brandon Lo
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim, Lin, Xueqin
[-- Attachment #1.1: Type: text/plain, Size: 16513 bytes --]
Hi, Brandon
I checked the new FVL result and found that the expected value still has not changed.
The log shows that the expected value is still:
[cid:image001.png@01D61E29.70A1C270]
Could you please double-check it? Is there any difference between FVL and NNT?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Wednesday, April 29, 2020 12:52 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
Sorry about that, I have reset the baseline to the values you sent in the previous email.
I'll look to rerun tests that have failed due to the incorrect baseline.
Thanks for letting me know,
Brandon
On Mon, Apr 27, 2020 at 11:39 PM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
The NNT baseline has changed as expected, but the FVL baseline is still the same as before.
Could you please check it and update the FVL baseline to the expected values?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Friday, April 3, 2020 2:39 AM
To: Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I have changed the baselines to reflect the new expected values.
The performance tests should work as expected and pass.
We will email again in the future if we come across any problems.
Feel free to email us as well if you would like to make any other changes.
Thank you for all your help
On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
Thanks for your recommendations; I have made the changes.
Since the throughput value of nic_single_core is proportional to the CPU frequency,
I recommend changing the baseline according to our report system.
On our 2.50 GHz system, the baseline values are as below:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562
64        2048     41.439
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608
64        2048     47.73
The testbed at UNH is a 2.1 GHz CPU server, so the expected values should be:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562 / 2.5 * 2.1 = 44.152
64        2048     41.439 / 2.5 * 2.1 = 34.809
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608 / 2.5 * 2.1 = 50.071
64        2048     47.73 / 2.5 * 2.1 = 40.093
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Tuesday, March 31, 2020 9:42 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
To make changes to either Intel machine, please reboot using the command "reboot_to_rw" as root to reboot the machine into read/write mode.
This command will also disable any testing on the machine.
To re-enable the machine, please run "reboot_to_ro" as root, and it will save all of the changes that you've made and re-enable testing on the machine.
I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro" instead of the normal "reboot" while you're making changes.
After you're done, please let me know. I'll have to manually run a test and update the baseline using our internal CI.
Thank you
On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Please let me know how to make changes to this machine that gets reset (IP/access...) and how to disable it.
After that please help to change the baseline.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Thursday, March 26, 2020 11:39 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, we have a system in place that resets any changes made while testing is enabled for a machine.
If you would like, I can disable testing and allow you to make permanent changes.
I can also reset the baseline of Intel 10G test performance once you make these changes.
Please let me know if you would like to make permanent changes on the Intel 10G so I can disable it for you.
Thanks
On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Thanks. Brandon.
That’s good. We have made changes on the 10G testbed.
I monitored several execution results and found that the 10G results always show a gap of -0.9% to -1.x% against the expected number. This can lead to occasional failures at the +-1% threshold, so I suggest adjusting the expected number. Where does the expected number come from? As far as I know, it is a dynamic number that depends on the baseline. Please help to clarify, thanks.
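To illustrate why a steady gap of about -1% gives intermittent failures, here is a minimal sketch of a tolerance check. It assumes the pass criterion is a symmetric +-1% band around the expected number; the function names and the sample numbers are hypothetical and are not taken from the lab's scripts or from real runs.

```python
# Sketch: percentage gap against the expected number and a +-1% pass check.
# The tolerance semantics and all sample values are assumptions.


def deviation_percent(measured, expected):
    """Return the gap of measured vs. expected, in percent of expected."""
    return (measured - expected) / expected * 100.0


def within_tolerance(measured, expected, tolerance_percent=1.0):
    """True when the absolute gap does not exceed the tolerance."""
    return abs(deviation_percent(measured, expected)) <= tolerance_percent


expected = 50.0                 # hypothetical expected number
for measured in (49.55, 49.4):  # hypothetical results, about -0.9% and -1.2%
    gap = deviation_percent(measured, expected)
    verdict = "pass" if within_tolerance(measured, expected) else "fail"
    print(f"measured={measured} gap={gap:.2f}% {verdict}")
```

With results sitting this close to the edge of the band, ordinary run-to-run noise is enough to flip the verdict, which is why adjusting the expected number is suggested.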
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 24, 2020 9:31 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
I have enabled the 10G Intel machine for testing.
If you would like to make any more changes, please let me know so I can perform the necessary steps to prepare the machine for changes.
Please feel free to let me know if you need anything.
Thank you
On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
For 10G, please enable it. our code is at original path /opt/test-harness/dts.
For 40G, please keep running. and see if any issue. But, anyway, we have modified the DTS code at /opt/test-harness/dts-new-suite. If we met same problem, then use this new DTS instead.
Thanks a lot
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Saturday, March 21, 2020 1:49 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, the 40G machine is stable enough to be put on production dashboard to run tests which may cause Trex to be killed.
Should I disable the 40G Intel machine for you to make changes?
Also, just for confirmation: on the 10G machine, is the folder that you are using for the testing located in /opt/test-harness/dts-2020-3-4, or are you still using the one in the standard /opt/test-harness/dts folder?
If everything is ok, I will enable the 10G machine for production testing.
Thank you very much
On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Brandon,
We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G). Could you please help to recover them?
But on the FVL (40G) testbed we met some problems; could you please help to check them before recovering it?
* Sometimes the 1G hugepages are changed to 2M hugepages automatically, and we have to restart the system.
* While debugging on the testbed, we found that Trex was killed by someone (some application).
Please help to check whether any other program is running on the testbed.
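One way to catch the hugepage change before a run is to read the default hugepage size from /proc/meminfo. The sketch below assumes the testbed is booted with 1G as the default hugepage size, so that the Hugepagesize line reads 1048576 kB; the function names are illustrative and this is not part of DTS.

```python
# Sketch: detect whether the default hugepage size is still 1G.
# Assumes a Linux host whose default hugepage size is configured as 1G.
import re

ONE_GIB_IN_KB = 1048576


def default_hugepage_kb(meminfo_text):
    """Parse the 'Hugepagesize' line of /proc/meminfo and return the size in kB."""
    match = re.search(r"^Hugepagesize:\s+(\d+)\s+kB$", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no Hugepagesize line found")
    return int(match.group(1))


def is_1g_default(meminfo_text):
    """True when the default hugepage size is 1G."""
    return default_hugepage_kb(meminfo_text) == ONE_GIB_IN_KB


# On the testbed itself the text would come from the real file, for example:
#   with open("/proc/meminfo") as f:
#       print(is_1g_default(f.read()))
```

A check like this could run before each test so that a fallback to 2M pages is reported instead of showing up as a throughput drop.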
Thanks a lot.
Regards,
Zhaoyan Chen
From: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Sent: Wednesday, March 18, 2020 9:04 PM
To: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Brandon, we almost made a workaround.
Maybe tomorrow, you could recover Intel’s testbed. I will let you know soon.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Wednesday, March 18, 2020 3:34 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Have you finished making changes on the Intel machine?
I will turn on the machine on March 3rd for testing if you do not have any issues with it.
Please let me know if you need anything else.
Thanks
On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Yes, it’s a weird issue. It is also mixed up with our DTS upgrade and Trex upgrade.
So we are reviewing our DTS script, the different Trex versions, and the CI calling procedure.
We are focusing on this task now and will let you know of any update.
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 10, 2020 10:46 PM
To: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
What is the current status of the Intel 82599ES?
Were there any configuration changes made to fix performance issues?
Thanks
On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
Hi David,
This was just a weird issue with the packet generator not cleaning up after a test quickly enough before the next test started.
I'll rerun the tests that were affected and keep an eye out to see if it's stable enough to be put back online.
Thanks
On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>> wrote:
On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
>
> Hi David and Zhaoyan,
>
>
> Yes, those results are related to the Intel machine; I have disabled testing for the Intel testbed.
>
> The 82599ES machine is now available for ssh and modifications.
Any news about this?
I received a failure on a patch of mine (changing macros in an ARM header).
https://lab.dpdk.org/results/dashboard/patchsets/9900/
But this time, it is with the 40G Intel nic test.
--
David Marchand
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu<mailto:blo@iol.unh.edu>
www.iol.unh.edu<http://www.iol.unh.edu/>
[-- Attachment #1.2: Type: text/html, Size: 87012 bytes --]
[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 4767 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-29 5:30 ` Ma, LihongX
@ 2020-04-30 14:19 ` Brandon Lo
2020-05-06 13:05 ` Brandon Lo
0 siblings, 1 reply; 42+ messages in thread
From: Brandon Lo @ 2020-04-30 14:19 UTC (permalink / raw)
To: Ma, LihongX
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim, Lin, Xueqin
[-- Attachment #1.1: Type: text/plain, Size: 15690 bytes --]
Hi Lihong,
The expected value was reset by one of our internal scripts.
I believe that I have resolved this issue for the future by ensuring that
the baseline that you sent me will not be overwritten automatically.
I will continue to monitor this expected throughput in case of any issues.
Thanks for your patience,
Brandon
On Wed, Apr 29, 2020 at 1:30 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
> Hi, Brandon
>
>
>
> I checked the new FVL result and found that the expected value has still not
> changed.
>
> From the log, the expected value is still:
>
>
>
> Can you help to double-check it? Is there any difference between FVL and
> NNT?
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Wednesday, April 29, 2020 12:52 AM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> Sorry about that, I have reset the baseline to the values you sent in the
> previous email.
>
> I'll look to rerun tests that have failed due to the incorrect baseline.
>
>
>
> Thanks for letting me know,
>
> Brandon
>
>
>
> On Mon, Apr 27, 2020 at 11:39 PM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
> I find that the NNT baseline has changed as expected, but the FVL baseline is
> still the same as before.
>
> Can you help to check it and change the baseline as expected?
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Friday, April 3, 2020 2:39 AM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> I have changed the baselines to reflect the new expected values.
>
> The performance tests should work as expected and pass.
>
>
>
> We will email again in the future if we come across any problems.
>
> Feel free to email us as well if you would like to make any other changes.
>
>
>
> Thank you for all your help
>
>
>
> On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
> Thanks for your recommendations; I have made the changes.
>
> Because the throughput of nic_single_core is proportional to the CPU
> frequency, I recommend changing the baseline according to our report system.
>
>
>
> On our 2.50 GHz system, the baseline values are as follows:
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562
> 64        2048     41.439
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608
> 64        2048     47.73
>
>
>
> For the testbed at UNH, which is a 2.1 GHz CPU server, the expected numbers
> should be:
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562 / 2.5 * 2.1 = 44.152
> 64        2048     41.439 / 2.5 * 2.1 = 34.809
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608 / 2.5 * 2.1 = 50.071
> 64        2048     47.73 / 2.5 * 2.1 = 40.093
>
>
>
>
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Tuesday, March 31, 2020 9:42 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> To make changes to either Intel machine, please reboot using the command
> "reboot_to_rw" as root to reboot the machine into read/write mode.
>
> This command will also disable any testing on the machine.
>
>
>
> To re-enable the machine, please run "reboot_to_ro" as root, and it will
> save all of the changes that you've made and re-enable testing on the
> machine.
>
> I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro"
> instead of the normal "reboot" while you're making changes.
>
>
>
> After you're done, please let me know. I'll have to manually run a test
> and update the baseline using our internal CI.
>
>
>
> Thank you
>
>
>
> On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Please let me know how to make changes to this machine that gets reset
> (IP/access...) and how to disable it.
>
>
>
> After that please help to change the baseline.
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Thursday, March 26, 2020 11:39 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, we have a system in place that resets any changes made while
> testing is enabled for a machine.
>
> If you would like, I can disable testing and allow you to make permanent
> changes.
>
>
>
> I can also reset the baseline of Intel 10G test performance once you make
> these changes.
>
> Please let me know if you would like to make permanent changes on the
> Intel 10G so I can disable it for you.
>
>
>
> Thanks
>
>
>
> On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Thanks. Brandon.
>
>
>
> That’s good. We have made changes on the 10G testbed.
>
>
>
> I monitored several execution results and found that the 10G results always
> show a gap of -0.9% to -1.x% against the expected number. This can lead to
> occasional failures at the +-1% threshold, so I suggest adjusting the
> expected number. Where does the expected number come from? As far as I know,
> it is a dynamic number that depends on the baseline. Please help to clarify,
> thanks.
>
>
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 24, 2020 9:31 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> I have enabled the 10G Intel machine for testing.
>
> If you would like to make any more changes, please let me know so I can
> perform the necessary steps to prepare the machine for changes.
>
> Please feel free to let me know if you need anything.
>
>
>
> Thank you
>
>
>
> On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> For 10G, please enable it. our code is at original path
> */opt/test-harness/dts.*
>
>
>
> For 40G, please keep running. and see if any issue. But, anyway, we have
> modified the DTS code at /opt/test-harness/dts-new-suite. If we met same
> problem, then use this new DTS instead.
>
>
>
> Thanks a lot
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Saturday, March 21, 2020 1:49 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, the 40G machine is stable enough to be put on production
> dashboard to run tests which may cause Trex to be killed.
>
> Should I disable the 40G Intel machine for you to make changes?
>
>
>
> Also, just for confirmation: on the 10G machine, is the folder that you
> are using for the testing located in */opt/test-harness/dts-2020-3-4, o*r
> are you still using the one in the standard */opt/test-harness/dts*
> folder?
>
>
>
> If everything is ok, I will enable the 10G machine for production testing.
>
>
>
> Thank you very much
>
>
>
> On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Brandon,
>
>
>
> We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G).
> Could you please help to recover them?
>
>
>
> But on the FVL (40G) testbed we met some problems; could you please help to
> check them before recovering it?
>
> - Sometimes the 1G hugepages are changed to 2M hugepages automatically, and
> we have to restart the system.
> - While debugging on the testbed, we found that Trex was killed by someone
> (some application).
>
> Please help to check whether any other program is running on the testbed.
>
>
>
> Thanks a lot.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Sent:* Wednesday, March 18, 2020 9:04 PM
> *To:* Brandon Lo <blo@iol.unh.edu>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Subject:* RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Brandon, we almost made a workaround.
>
>
>
> Maybe tomorrow, you could recover Intel’s testbed. I will let you know
> soon.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Wednesday, March 18, 2020 3:34 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Have you finished making changes on the Intel machine?
>
> I will turn on the machine on March 3rd for testing if you do not have any
> issues with it.
>
> Please let me know if you need anything else.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Yes, it’s a weird issue. It is also mixed up with our DTS upgrade and Trex
> upgrade.
>
> So we are reviewing our DTS script, the different Trex versions, and the CI
> calling procedure.
>
>
>
> We are focusing on this task now and will let you know of any update.
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 10, 2020 10:46 PM
> *To:* David Marchand <david.marchand@redhat.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> How is the current status of the Intel 82599ES?
>
> Were there any configuration changes made to fix performance issues?
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu> wrote:
>
> Hi David,
>
>
>
> This was just a weird issue with the packet generator not cleaning itself
> after a test fast enough before another test.
>
> I'll rerun the tests that were affected and keep an eye out to see if it's
> stable enough to be put back online.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com>
> wrote:
>
> On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu> wrote:
> >
> > Hi David and Zhaoyan,
> >
> >
> > Yes, those results are related to the Intel machine; I have disabled
> testing for the Intel testbed.
> >
> > The 82599ES machine is now available for ssh and modifications.
>
> Any news about this?
>
> I received a failure on a patch of mine (changing macros in an ARM header).
> https://lab.dpdk.org/results/dashboard/patchsets/9900/
>
> But this time, it is with the 40G Intel nic test.
>
> --
> David Marchand
>
>
>
>
> --
>
> Brandon Lo
>
> UNH InterOperability Laboratory
>
> 21 Madbury Rd, Suite 100, Durham, NH 03824
>
> blo@iol.unh.edu
>
> www.iol.unh.edu
>
>
>
>
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu
[-- Attachment #1.2: Type: text/html, Size: 71056 bytes --]
[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 4767 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-04-30 14:19 ` Brandon Lo
@ 2020-05-06 13:05 ` Brandon Lo
2020-05-08 8:02 ` Ma, LihongX
2020-05-12 5:41 ` Ma, LihongX
0 siblings, 2 replies; 42+ messages in thread
From: Brandon Lo @ 2020-05-06 13:05 UTC (permalink / raw)
To: Ma, LihongX
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim, Lin, Xueqin
[-- Attachment #1.1: Type: text/plain, Size: 17038 bytes --]
Hi Lihong,
Just a further update: we have noticed that there is another internal
script that is used to calculate baselines that will pull a newer baseline
if it is found.
We are looking to solve the issues that we are having with baselines and
will get back to you.
Thanks for your patience,
Brandon
On Thu, Apr 30, 2020 at 10:19 AM Brandon Lo <blo@iol.unh.edu> wrote:
> Hi Lihong,
>
> The expected value was reset by one of our internal scripts.
> I believe that I have resolved this issue for the future by ensuring that
> the baseline that you sent me will not be overwritten automatically.
>
> I will continue to monitor this expected throughput in case of any issues.
>
> Thanks for your patience,
> Brandon
>
> On Wed, Apr 29, 2020 at 1:30 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
>> Hi, Brandon
>>
>>
>>
>> I checked the new FVL result and found that the expected value has still not
>> changed.
>>
>> From the log, the expected value is still:
>>
>>
>>
>> Can you help to double-check it? Is there any difference between FVL and
>> NNT?
>>
>>
>>
>> Regards,
>>
>> Ma,lihong
>>
>>
>>
>> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
>> *Sent:* Wednesday, April 29, 2020 12:52 AM
>> *To:* Ma, LihongX <lihongx.ma@intel.com>
>> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
>> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
>> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
>> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
>> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
>> tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
>> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>>
>>
>>
>> Hi Lihong,
>>
>>
>>
>> Sorry about that, I have reset the baseline to the values you sent in the
>> previous email.
>>
>> I'll look to rerun tests that have failed due to the incorrect baseline.
>>
>>
>>
>> Thanks for letting me know,
>>
>> Brandon
>>
>>
>>
>> On Mon, Apr 27, 2020 at 11:39 PM Ma, LihongX <lihongx.ma@intel.com>
>> wrote:
>>
>> Hi, Brandon
>>
>> I find that the NNT baseline has changed as expected, but the FVL baseline is
>> still the same as before.
>>
>> Can you help to check it and change the baseline as expected?
>>
>>
>>
>> Regards,
>>
>> Ma,lihong
>>
>>
>>
>> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
>> *Sent:* Friday, April 3, 2020 2:39 AM
>> *To:* Ma, LihongX <lihongx.ma@intel.com>
>> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
>> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
>> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
>> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
>> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
>> tim.odriscoll@intel.com>
>> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>>
>>
>>
>> Hi Lihong,
>>
>>
>>
>> I have changed the baselines to reflect the new expected values.
>>
>> The performance tests should work as expected and pass.
>>
>>
>>
>> We will email again in the future if we come across any problems.
>>
>> Feel free to email us as well if you would like to make any other changes.
>>
>>
>>
>> Thank you for all your help
>>
>>
>>
>> On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
>>
>> Hi, Brandon
>>
>> Thanks for your recommendations; I have made the changes.
>>
>> Because the throughput of nic_single_core is proportional to the CPU
>> frequency, I recommend changing the baseline according to our report system.
>>
>>
>>
>> On our 2.50 GHz system, the baseline values are as follows:
>>
>> NNT:
>>
>> pkt_size  trd/rxd  expected_value
>> 64        512      52.562
>> 64        2048     41.439
>>
>> FVL:
>>
>> pkt_size  trd/rxd  expected_value
>> 64        512      59.608
>> 64        2048     47.73
>>
>>
>>
>> For the testbed at UNH, which is a 2.1 GHz CPU server, the expected numbers
>> should be:
>>
>> NNT:
>>
>> pkt_size  trd/rxd  expected_value
>> 64        512      52.562 / 2.5 * 2.1 = 44.152
>> 64        2048     41.439 / 2.5 * 2.1 = 34.809
>>
>> FVL:
>>
>> pkt_size  trd/rxd  expected_value
>> 64        512      59.608 / 2.5 * 2.1 = 50.071
>> 64        2048     47.73 / 2.5 * 2.1 = 40.093
>>
>>
>>
>>
>>
>>
>>
>> Regards,
>>
>> Ma,lihong
>>
>>
>>
>> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
>> *Sent:* Tuesday, March 31, 2020 9:42 PM
>> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
>> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
>> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
>> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
>> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
>> tim.odriscoll@intel.com>
>> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>>
>>
>>
>> Hi Zhaoyan,
>>
>>
>>
>> To make changes to either Intel machine, please reboot using the command
>> "reboot_to_rw" as root to reboot the machine into read/write mode.
>>
>> This command will also disable any testing on the machine.
>>
>>
>>
>> To re-enable the machine, please run "reboot_to_ro" as root, and it will
>> save all of the changes that you've made and re-enable testing on the
>> machine.
>>
>> I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro"
>> instead of the normal "reboot" while you're making changes.
>>
>>
>>
>> After you're done, please let me know. I'll have to manually run a test
>> and update the baseline using our internal CI.
>>
>>
>>
>> Thank you
>>
>>
>>
>> On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> wrote:
>>
>> Hi, Brandon,
>>
>>
>>
>> Please let me know how to make changes to this machine that gets reset
>> (IP/access...) and how to disable it.
>>
>>
>>
>> After that, please help to change the baseline.
>>
>>
>>
>>
>>
>> *Regards,*
>>
>> *Zhaoyan Chen*
>>
>>
>>
>> *From:* Brandon Lo <blo@iol.unh.edu>
>> *Sent:* Thursday, March 26, 2020 11:39 PM
>> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
>> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
>> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
>> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
>> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
>> tim.odriscoll@intel.com>
>> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>>
>>
>>
>> Hi Zhaoyan,
>>
>>
>>
>> Currently, we have a system in place that resets any changes made while
>> testing is enabled for a machine.
>>
>> If you would like, I can disable testing and allow you to make permanent
>> changes.
>>
>>
>>
>> I can also reset the baseline of Intel 10G test performance once you make
>> these changes.
>>
>> Please let me know if you would like to make permanent changes on the
>> Intel 10G so I can disable it for you.
>>
>>
>>
>> Thanks
>>
>>
>>
>> On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> wrote:
>>
>> Thanks. Brandon.
>>
>>
>>
>> That’s good. We have made changes on the 10G testbed.
>>
>>
>>
>> I monitored several execution results and found that the 10G results
>> always have a gap of -0.9% to -1.x% against the expected number, so with
>> the +/-1% tolerance we could sometimes see failures. I suggest adjusting
>> the expected number. I don’t know where the expected number comes from; as
>> far as I know it is a dynamic number that depends on the baseline. Please
>> help to clarify, thanks.
>>
>>
>>
>>
>>
>> Thanks.
>>
>>
>>
>> *Regards,*
>>
>> *Zhaoyan Chen*
>>
>>
>>
>> *From:* Brandon Lo <blo@iol.unh.edu>
>> *Sent:* Tuesday, March 24, 2020 9:31 PM
>> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
>> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
>> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
>> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
>> XuemingX <xuemingx.zhang@intel.com>
>> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>>
>>
>>
>> Hi Zhaoyan,
>>
>>
>>
>> I have enabled the 10G Intel machine for testing.
>>
>> If you would like to make any more changes, please let me know so I can
>> perform the necessary steps to prepare the machine for changes.
>>
>> Please feel free to let me know if you need anything.
>>
>>
>>
>> Thank you
>>
>>
>>
>> On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> wrote:
>>
>> Hi, Brandon,
>>
>>
>>
>> For 10G, please enable it. Our code is at the original path
>> /opt/test-harness/dts.
>>
>>
>>
>> For 40G, please keep it running and see if there is any issue. In any
>> case, we have modified the DTS code at /opt/test-harness/dts-new-suite. If
>> we meet the same problem, then use this new DTS instead.
>>
>>
>>
>> Thanks a lot
>>
>>
>>
>> *Regards,*
>>
>> *Zhaoyan Chen*
>>
>>
>>
>> *From:* Brandon Lo <blo@iol.unh.edu>
>> *Sent:* Saturday, March 21, 2020 1:49 AM
>> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
>> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
>> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
>> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
>> XuemingX <xuemingx.zhang@intel.com>
>> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>>
>>
>>
>> Hi Zhaoyan,
>>
>>
>>
>> Currently, the 40G machine is stable enough to be put on production
>> dashboard to run tests which may cause Trex to be killed.
>>
>> Should I disable the 40G Intel machine for you to make changes?
>>
>>
>>
>> Also, just for confirmation: on the 10G machine, is the folder that you
>> are using for the testing located in /opt/test-harness/dts-2020-3-4, or
>> are you still using the one in the standard /opt/test-harness/dts
>> folder?
>>
>>
>>
>> If everything is ok, I will enable the 10G machine for production testing.
>>
>>
>>
>> Thank you very much
>>
>>
>>
>> On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> wrote:
>>
>> Brandon,
>>
>>
>>
>> We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G).
>> Could you please help to recover them?
>>
>>
>>
>> But for the FVL (40G) testbed we met some problems; could you please help
>> to check them before recovering it?
>>
>> - Sometimes the 1G hugepages are changed to 2M hugepages
>> automatically... we have to restart the system.
>> - While we were debugging on the testbed, we found that Trex was killed
>> by someone (or some app).
>>
>> Please help to check whether any other program is running on the testbed.
>>
>>
>>
>> Thanks a lot.
>>
>>
>>
>>
>>
>>
>>
>> *Regards,*
>>
>> *Zhaoyan Chen*
>>
>>
>>
>> *From:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> *Sent:* Wednesday, March 18, 2020 9:04 PM
>> *To:* Brandon Lo <blo@iol.unh.edu>
>> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
>> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
>> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
>> Qian Q <qian.q.xu@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> *Subject:* RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>>
>>
>>
>> Brandon, we have almost finished a workaround.
>>
>>
>>
>> Maybe tomorrow you can recover Intel’s testbed. I will let you know
>> soon.
>>
>>
>>
>>
>>
>>
>>
>> *Regards,*
>>
>> *Zhaoyan Chen*
>>
>>
>>
>> *From:* Brandon Lo <blo@iol.unh.edu>
>> *Sent:* Wednesday, March 18, 2020 3:34 AM
>> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
>> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
>> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
>> Qian Q <qian.q.xu@intel.com>
>> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>>
>>
>>
>> Hi Zhaoyan,
>>
>>
>>
>> Have you finished making changes on the Intel machine?
>>
>> I will turn on the machine on March 3rd for testing if you do not have
>> any issues with it.
>>
>> Please let me know if you need anything else.
>>
>>
>>
>> Thanks
>>
>>
>>
>> On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
>> wrote:
>>
>> Hi, Brandon,
>>
>>
>>
>> Yes, it’s a weird issue, and it is also mixed up with our DTS upgrade and
>> Trex upgrade.
>>
>> So we are reviewing our DTS script, the different Trex versions, and the
>> CI calling procedure.
>>
>>
>>
>> Anyway, we are focusing on this task at the moment; we will let you know
>> of any update.
>>
>>
>>
>> Thanks.
>>
>>
>>
>> *Regards,*
>>
>> *Zhaoyan Chen*
>>
>>
>>
>> *From:* Brandon Lo <blo@iol.unh.edu>
>> *Sent:* Tuesday, March 10, 2020 10:46 PM
>> *To:* David Marchand <david.marchand@redhat.com>
>> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; dpdklab@iol.unh.edu;
>> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
>> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
>> Qian Q <qian.q.xu@intel.com>
>> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>>
>>
>>
>> Hi Zhaoyan,
>>
>>
>>
>> How is the current status of the Intel 82599ES?
>>
>> Were there any configuration changes made to fix performance issues?
>>
>>
>>
>> Thanks
>>
>>
>>
>> On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu> wrote:
>>
>> Hi David,
>>
>>
>>
>> This was just a weird issue with the packet generator not cleaning itself
>> after a test fast enough before another test.
>>
>> I'll rerun the tests that were affected and keep an eye out to see if
>> it's stable enough to be put back online.
>>
>>
>>
>> Thanks
>>
>>
>>
>> On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com>
>> wrote:
>>
>> On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu> wrote:
>> >
>> > Hi David and Zhaoyan,
>> >
>> >
>> > Yes, those results are related to the Intel machine; I have disabled
>> testing for the Intel testbed.
>> >
>> > The 82599ES machine is now available for ssh and modifications.
>>
>> Any news about this?
>>
>> I received a failure on a patch of mine (changing macros in an ARM header).
>> https://lab.dpdk.org/results/dashboard/patchsets/9900/
>>
>> But this time, it is with the 40G Intel nic test.
>>
>> --
>> David Marchand
>>
>>
>>
>>
>> --
>>
>> Brandon Lo
>>
>> UNH InterOperability Laboratory
>>
>> 21 Madbury Rd, Suite 100, Durham, NH 03824
>>
>> blo@iol.unh.edu
>>
>> www.iol.unh.edu
>>
>
>
> --
>
> Brandon Lo
>
> UNH InterOperability Laboratory
>
> 21 Madbury Rd, Suite 100, Durham, NH 03824
>
> blo@iol.unh.edu
>
> www.iol.unh.edu
>
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu
[-- Attachment #1.2: Type: text/html, Size: 73215 bytes --]
[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 4767 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-05-06 13:05 ` Brandon Lo
@ 2020-05-08 8:02 ` Ma, LihongX
2020-05-12 5:41 ` Ma, LihongX
1 sibling, 0 replies; 42+ messages in thread
From: Ma, LihongX @ 2020-05-08 8:02 UTC (permalink / raw)
To: Brandon Lo
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim, Lin, Xueqin
[-- Attachment #1.1: Type: text/plain, Size: 18759 bytes --]
Thanks Brandon, I will wait for your reply.
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Wednesday, May 6, 2020 9:05 PM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
Just a further update: we have noticed that there is another internal script that is used to calculate baselines that will pull a newer baseline if it is found.
We are looking to solve the issues that we are having with baselines and will get back to you.
Thanks for your patience,
Brandon
On Thu, Apr 30, 2020 at 10:19 AM Brandon Lo <blo@iol.unh.edu> wrote:
Hi Lihong,
The expected value was reset by one of our internal scripts.
I believe that I have resolved this issue for the future by ensuring that the baseline that you sent me will not be overwritten automatically.
I will continue to monitor this expected throughput in case of any issues.
Thanks for your patience,
Brandon
On Wed, Apr 29, 2020 at 1:30 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
I checked the new FVL result and found that the expected value has still not changed.
The log shows that the expected value is still:
[cid:image001.png@01D62552.0A1802A0]
Can you help double-check it? Is there any difference between FVL and NNT?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Wednesday, April 29, 2020 12:52 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
Sorry about that, I have reset the baseline to the values you sent in the previous email.
I'll look to rerun tests that have failed due to the incorrect baseline.
Thanks for letting me know,
Brandon
On Mon, Apr 27, 2020 at 11:39 PM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
I see that the NNT baseline has changed as expected, but the FVL baseline is still the same as before.
Can you help check it and change the baseline to the expected values?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Friday, April 3, 2020 2:39 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I have changed the baselines to reflect the new expected values.
The performance tests should work as expected and pass.
We will email again in the future if we come across any problems.
Feel free to email us as well if you would like to make any other changes.
Thank you for all your help
On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
Thanks for your recommendations; I have made the changes.
The throughput value of nic_single_core is proportional to the CPU frequency, so I recommend changing the baseline according to our report system.
On our 2.50 GHz system, the baseline values are as below:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562
64        2048     41.439
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608
64        2048     47.73
For the testbed at UNH, which is a 2.1 GHz CPU server, the expected numbers should be:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562 / 2.5 * 2.1 = 44.152
64        2048     41.439 / 2.5 * 2.1 = 34.809
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608 / 2.5 * 2.1 = 50.071
64        2048     47.73 / 2.5 * 2.1 = 40.093
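The frequency scaling used for the numbers above can be sketched as follows. This is only an illustration under the assumption stated in the mail (throughput scales linearly with CPU frequency); the function and variable names are hypothetical and are not part of DTS or the lab CI.

```python
# Hypothetical helper, not part of DTS or the UNH CI scripts.
def scale_baseline(reference_value, reference_ghz, target_ghz):
    """Scale a throughput baseline linearly with CPU frequency."""
    return round(reference_value / reference_ghz * target_ghz, 3)


REFERENCE_GHZ = 2.5  # system used for the reported baselines
TARGET_GHZ = 2.1     # UNH testbed

# (nic, pkt_size, trd/rxd) -> expected_value on the 2.5 GHz system
reference = {
    ("NNT", 64, 512): 52.562,
    ("NNT", 64, 2048): 41.439,
    ("FVL", 64, 512): 59.608,
    ("FVL", 64, 2048): 47.73,
}

scaled = {
    key: scale_baseline(value, REFERENCE_GHZ, TARGET_GHZ)
    for key, value in reference.items()
}

for (nic, pkt_size, descriptors), value in scaled.items():
    print(nic, pkt_size, descriptors, value)
```

The scaled values reproduce the four expected numbers listed for the 2.1 GHz testbed.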
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Tuesday, March 31, 2020 9:42 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
To make changes to either Intel machine, please reboot using the command "reboot_to_rw" as root to reboot the machine into read/write mode.
This command will also disable any testing on the machine.
To re-enable the machine, please run "reboot_to_ro" as root, and it will save all of the changes that you've made and re-enable testing on the machine.
I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro" instead of the normal "reboot" while you're making changes.
After you're done, please let me know. I'll have to manually run a test and update the baseline using our internal CI.
Thank you
On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
Please let me know how to make changes to this machine that gets reset (IP/access...) and how to disable it.
After that, please help to change the baseline.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Thursday, March 26, 2020 11:39 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, we have a system in place that resets any changes made while testing is enabled for a machine.
If you would like, I can disable testing and allow you to make permanent changes.
I can also reset the baseline of Intel 10G test performance once you make these changes.
Please let me know if you would like to make permanent changes on the Intel 10G so I can disable it for you.
Thanks
On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Thanks. Brandon.
That’s good. We have made changes on the 10G testbed.
I monitored several execution results and found that the 10G results always have a gap of -0.9% to -1.x% against the expected number, so with the +/-1% tolerance we could sometimes see failures. I suggest adjusting the expected number. I don’t know where the expected number comes from; as far as I know it is a dynamic number that depends on the baseline. Please help to clarify, thanks.
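To illustrate why a result that sits about 1% below the expected number fails only some of the time, here is a minimal sketch of a percentage-gap check. It assumes a symmetric 1% tolerance as mentioned above; the function names and the sample numbers are made up for illustration and do not come from the lab's CI.

```python
# Hypothetical check; the real CI comparison may differ.
def percent_gap(measured, expected):
    """Signed gap of measured against expected, in percent."""
    return (measured - expected) / expected * 100.0


def within_tolerance(measured, expected, tolerance_pct=1.0):
    """True when the measured value is within +/- tolerance_pct of expected."""
    return abs(percent_gap(measured, expected)) <= tolerance_pct


# Illustrative values only: two runs against an expected value of 100.0.
expected = 100.0
for measured in (99.1, 98.8):
    verdict = "pass" if within_tolerance(measured, expected) else "fail"
    print(f"{measured}: gap {percent_gap(measured, expected):.2f}% -> {verdict}")
```

A run at about -0.9% passes while a run at about -1.2% fails, so a result that hovers around the tolerance boundary produces intermittent failures until the expected number is adjusted.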
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Tuesday, March 24, 2020 9:31 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
I have enabled the 10G Intel machine for testing.
If you would like to make any more changes, please let me know so I can perform the necessary steps to prepare the machine for changes.
Please feel free to let me know if you need anything.
Thank you
On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
For 10G, please enable it. Our code is at the original path /opt/test-harness/dts.
For 40G, please keep it running and see if there is any issue. In any case, we have modified the DTS code at /opt/test-harness/dts-new-suite. If we meet the same problem, then use this new DTS instead.
Thanks a lot
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Saturday, March 21, 2020 1:49 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, the 40G machine is stable enough to be put on production dashboard to run tests which may cause Trex to be killed.
Should I disable the 40G Intel machine for you to make changes?
Also, just for confirmation: on the 10G machine, is the folder that you are using for the testing located in /opt/test-harness/dts-2020-3-4, or are you still using the one in the standard /opt/test-harness/dts folder?
If everything is ok, I will enable the 10G machine for production testing.
Thank you very much
On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Brandon,
We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G). Could you please help to recover them?
But for the FVL (40G) testbed we met some problems; could you please help to check them before recovering it?
* Sometimes the 1G hugepages are changed to 2M hugepages automatically... we have to restart the system.
* While we were debugging on the testbed, we found that Trex was killed by someone (or some app).
Please help to check whether any other program is running on the testbed.
Thanks a lot.
Regards,
Zhaoyan Chen
From: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Sent: Wednesday, March 18, 2020 9:04 PM
To: Brandon Lo <blo@iol.unh.edu>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Brandon, we have almost finished a workaround.
Maybe tomorrow you can recover Intel’s testbed. I will let you know soon.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Wednesday, March 18, 2020 3:34 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Have you finished making changes on the Intel machine?
I will turn on the machine on March 3rd for testing if you do not have any issues with it.
Please let me know if you need anything else.
Thanks
On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
Yes, it’s a weird issue, and it is also mixed up with our DTS upgrade and Trex upgrade.
So we are reviewing our DTS script, the different Trex versions, and the CI calling procedure.
Anyway, we are focusing on this task at the moment; we will let you know of any update.
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Tuesday, March 10, 2020 10:46 PM
To: David Marchand <david.marchand@redhat.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
How is the current status of the Intel 82599ES?
Were there any configuration changes made to fix performance issues?
Thanks
On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu> wrote:
Hi David,
This was just a weird issue with the packet generator not cleaning itself after a test fast enough before another test.
I'll rerun the tests that were affected and keep an eye out to see if it's stable enough to be put back online.
Thanks
On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com> wrote:
On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu> wrote:
>
> Hi David and Zhaoyan,
>
>
> Yes, those results are related to the Intel machine; I have disabled testing for the Intel testbed.
>
> The 82599ES machine is now available for ssh and modifications.
Any news about this?
I received a failure on a patch of mine (changing macros in an ARM header).
https://lab.dpdk.org/results/dashboard/patchsets/9900/
But this time, it is with the 40G Intel nic test.
--
David Marchand
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu
[-- Attachment #1.2: Type: text/html, Size: 95596 bytes --]
[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 4767 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-05-06 13:05 ` Brandon Lo
2020-05-08 8:02 ` Ma, LihongX
@ 2020-05-12 5:41 ` Ma, LihongX
2020-05-14 16:12 ` Brandon Lo
1 sibling, 1 reply; 42+ messages in thread
From: Ma, LihongX @ 2020-05-12 5:41 UTC (permalink / raw)
To: Brandon Lo
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim, Lin, Xueqin
[-- Attachment #1.1: Type: text/plain, Size: 19921 bytes --]
Hi, Brandon
Thanks for your help in checking the baseline issues. I find the FVL environment is OK: the expected value has been changed to the value we wanted.
But the NNT baseline has changed; can you help check it?
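A check like the one described here (FVL correct, NNT changed) could be automated by comparing the baselines currently in use against the intended ones. The sketch below is only an assumption about how such a comparison might look: the intended values are the 2.1 GHz numbers from earlier in this thread, while `find_drift` and the "current" snapshot are hypothetical placeholders, not real lab data.

```python
# Intended baselines for the 2.1 GHz UNH testbed, as given earlier in the thread.
INTENDED = {
    ("NNT", 64, 512): 44.152,
    ("NNT", 64, 2048): 34.809,
    ("FVL", 64, 512): 50.071,
    ("FVL", 64, 2048): 40.093,
}


def find_drift(current, intended=INTENDED, eps=1e-3):
    """Return {key: (intended, current)} for entries that differ or are missing."""
    drift = {}
    for key, want in intended.items():
        have = current.get(key)
        if have is None or abs(have - want) > eps:
            drift[key] = (want, have)
    return drift


# Placeholder snapshot: FVL matches, NNT has been overwritten (made-up values).
current = dict(INTENDED)
current[("NNT", 64, 512)] = 45.0
current[("NNT", 64, 2048)] = 35.5

for key, (want, have) in sorted(find_drift(current).items()):
    print(key, "intended", want, "current", have)
```

Running such a comparison after every baseline update would flag the case where one NIC is corrected while another is silently reset.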
Thanks
Regards,
Ma,lihong
From: Ma, LihongX
Sent: Friday, May 8, 2020 4:02 PM
To: 'Brandon Lo' <blo@iol.unh.edu>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Thanks, Brandon. I will wait for your reply.
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Wednesday, May 6, 2020 9:05 PM
To: Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>; Lin, Xueqin <xueqin.lin@intel.com<mailto:xueqin.lin@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
Just a further update: we have noticed that there is another internal script that is used to calculate baselines that will pull a newer baseline if it is found.
We are looking to solve the issues that we are having with baselines and will get back to you.
Thanks for your patience,
Brandon
On Thu, Apr 30, 2020 at 10:19 AM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
Hi Lihong,
The expected value was reset by one of our internal scripts.
I believe that I have resolved this issue for the future by ensuring that the baseline that you sent me will not be overwritten automatically.
I will continue to monitor this expected throughput in case of any issues.
Thanks for your patience,
Brandon
On Wed, Apr 29, 2020 at 1:30 AM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
I checked the new FVL result and found that the expected value has still not changed.
From the log, the expected value is still:
[cid:image001.png@01D62862.A8915090]
Can you help double-check it? Is there any difference between FVL and NNT?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Wednesday, April 29, 2020 12:52 AM
To: Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>; Lin, Xueqin <xueqin.lin@intel.com<mailto:xueqin.lin@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
Sorry about that, I have reset the baseline to the values you sent in the previous email.
I'll look to rerun tests that have failed due to the incorrect baseline.
Thanks for letting me know,
Brandon
On Mon, Apr 27, 2020 at 11:39 PM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
I found that the NNT baseline has changed as expected, but the FVL baseline is still the same as before.
Can you help check it and change the baseline as expected?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Friday, April 3, 2020 2:39 AM
To: Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I have changed the baselines to reflect the new expected values.
The performance tests should work as expected and pass.
We will email again in the future if we come across any problems.
Feel free to email us as well if you would like to make any other changes.
Thank you for all your help
On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>> wrote:
Hi, Brandon
Thanks for your recommendations; I have made the changes.
The throughput value of nic_single_core is proportional to the CPU frequency,
so I recommend changing the baseline according to our report system.
On our 2.50 GHz system, the baseline values are as below:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562
64        2048     41.439
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608
64        2048     47.73
The testbed at UNH is a 2.1 GHz CPU server, so the expected numbers should be:
NNT:
pkt_size  trd/rxd  expected_value
64        512      52.562 / 2.5 * 2.1 = 44.152
64        2048     41.439 / 2.5 * 2.1 = 34.809
FVL:
pkt_size  trd/rxd  expected_value
64        512      59.608 / 2.5 * 2.1 = 50.071
64        2048     47.73 / 2.5 * 2.1 = 40.093
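The frequency scaling used for these expected values can be reproduced with a short script. This is only a sketch of the arithmetic quoted in this message, assuming throughput scales linearly with CPU frequency as stated; the names `REFERENCE_BASELINES` and `scale_baseline` are made up for illustration and are not part of DTS or the lab's CI.

```python
# Sketch: rescale single-core throughput baselines by CPU frequency.
# Assumption (from the message above): throughput is proportional to CPU frequency.

REFERENCE_GHZ = 2.5  # CPU frequency of the reference system
TARGET_GHZ = 2.1     # CPU frequency of the UNH testbed

# (nic, pkt_size, trd/rxd) -> expected_value measured on the reference system
REFERENCE_BASELINES = {
    ("NNT", 64, 512): 52.562,
    ("NNT", 64, 2048): 41.439,
    ("FVL", 64, 512): 59.608,
    ("FVL", 64, 2048): 47.73,
}


def scale_baseline(value, reference_ghz=REFERENCE_GHZ, target_ghz=TARGET_GHZ):
    """Scale a baseline linearly from the reference frequency to the target one."""
    return round(value / reference_ghz * target_ghz, 3)


# Scaled baselines for the target testbed, keyed like REFERENCE_BASELINES.
SCALED_BASELINES = {
    key: scale_baseline(value) for key, value in REFERENCE_BASELINES.items()
}

if __name__ == "__main__":
    for (nic, pkt_size, descriptors), value in SCALED_BASELINES.items():
        print(f"{nic} pkt_size={pkt_size} trd/rxd={descriptors} expected_value={value}")
```

Keeping the reference values and the two frequencies as data makes it easy to recompute the table if either system changes.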
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu<mailto:blo@iol.unh.edu>]
Sent: Tuesday, March 31, 2020 9:42 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
To make changes to either Intel machine, please reboot using the command "reboot_to_rw" as root to reboot the machine into read/write mode.
This command will also disable any testing on the machine.
To re-enable the machine, please run "reboot_to_ro" as root, and it will save all of the changes that you've made and re-enable testing on the machine.
I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro" instead of the normal "reboot" while you're making changes.
After you're done, please let me know. I'll have to manually run a test and update the baseline using our internal CI.
Thank you
On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Please let me know how to make changes to this machine that resets changes (IP/access...) and how to disable it.
After that, please help change the baseline.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Thursday, March 26, 2020 11:39 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>; O'Driscoll, Tim <tim.odriscoll@intel.com<mailto:tim.odriscoll@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, we have a system in place that resets any changes made while testing is enabled for a machine.
If you would like, I can disable testing and allow you to make permanent changes.
I can also reset the baseline of Intel 10G test performance once you make these changes.
Please let me know if you would like to make permanent changes on the Intel 10G so I can disable it for you.
Thanks
On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Thanks. Brandon.
That’s good. We have made changes on the 10G testbed.
I monitored several execution results and found that the 10G results always have a gap of -0.9% to -1.x% against the expected number, so it could sometimes lead to failures (±1%). I suggest adjusting the expected number. I don’t know where the expected number comes from. As far as I know, it is a dynamic number that depends on the baseline. Please help clarify, thanks.
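The effect described here can be illustrated with a small check. This is a hedged sketch only: it assumes the pass/fail rule is a symmetric percentage band around the expected number (the ±1% mentioned above), and the function names and sample numbers are hypothetical, not taken from the lab's CI or from real test runs.

```python
# Sketch: compare a measured throughput against an expected value with a
# percentage tolerance. The 1% default is an assumption based on the "+-1%"
# mentioned in the message; the real CI rule may differ.


def percent_gap(measured, expected):
    """Return the signed gap of measured vs expected, in percent."""
    return (measured - expected) / expected * 100.0


def within_tolerance(measured, expected, tolerance_pct=1.0):
    """True if the measured value is within +/- tolerance_pct of expected."""
    return abs(percent_gap(measured, expected)) <= tolerance_pct


if __name__ == "__main__":
    expected = 100.0  # illustrative number, not a real baseline
    for measured in (99.1, 98.8):  # illustrative gaps of about -0.9% and -1.2%
        gap = percent_gap(measured, expected)
        verdict = "pass" if within_tolerance(measured, expected) else "fail"
        print(f"measured={measured} gap={gap:.2f}% -> {verdict}")
```

A result that sits consistently near -1% will flip between pass and fail from run to run, which is why adjusting the expected number is suggested rather than widening the band.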
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 24, 2020 9:31 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
I have enabled the 10G Intel machine for testing.
If you would like to make any more changes, please let me know so I can perform the necessary steps to prepare the machine for changes.
Please feel free to let me know if you need anything.
Thank you
On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
For 10G, please enable it. Our code is at the original path, /opt/test-harness/dts.
For 40G, please keep it running and see if any issue occurs. In any case, we have modified the DTS code at /opt/test-harness/dts-new-suite. If we meet the same problem, then use this new DTS instead.
Thanks a lot
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Saturday, March 21, 2020 1:49 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Ma, LihongX <lihongx.ma@intel.com<mailto:lihongx.ma@intel.com>>; Zhang, XuemingX <xuemingx.zhang@intel.com<mailto:xuemingx.zhang@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, the 40G machine is stable enough to be put on production dashboard to run tests which may cause Trex to be killed.
Should I disable the 40G Intel machine for you to make changes?
Also, just for confirmation: on the 10G machine, is the folder that you are using for the testing located in /opt/test-harness/dts-2020-3-4, or are you still using the one in the standard /opt/test-harness/dts folder?
If everything is ok, I will enable the 10G machine for production testing.
Thank you very much
On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Brandon,
We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G). Could you please help to recover them?
But for the FVL (40G) testbed we met some problems. Could you please help check them before recovering it?
* Sometimes the 1G hugepages are changed to 2M hugepages automatically, and we have to restart the system.
* While debugging on the testbed, we found that Trex was killed by someone (some app).
Please help check whether any other program is running on the testbed.
Thanks a lot.
Regards,
Zhaoyan Chen
From: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Sent: Wednesday, March 18, 2020 9:04 PM
To: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>; Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Brandon, we have almost finished a workaround.
Maybe tomorrow you can recover Intel’s testbed. I will let you know soon.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Wednesday, March 18, 2020 3:34 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>
Cc: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Have you finished making changes on the Intel machine?
I will turn on the machine on March 3rd for testing if you do not have any issues with it.
Please let me know if you need anything else.
Thanks
On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>> wrote:
Hi, Brandon,
Yes, it’s a weird issue. It was also mixed up with our DTS upgrade and Trex upgrade.
So we are reviewing our DTS script, the different Trex versions, and the CI calling procedure.
Anyway, we are currently focusing on this task and will let you know of any update.
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>>
Sent: Tuesday, March 10, 2020 10:46 PM
To: David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com<mailto:zhaoyan.chen@intel.com>>; dpdklab@iol.unh.edu<mailto:dpdklab@iol.unh.edu>; Lincoln Lavoie <lylavoie@iol.unh.edu<mailto:lylavoie@iol.unh.edu>>; Thomas Monjalon <thomas@monjalon.net<mailto:thomas@monjalon.net>>; ci@dpdk.org<mailto:ci@dpdk.org>; Tu, Lijuan <lijuan.tu@intel.com<mailto:lijuan.tu@intel.com>>; Xu, Qian Q <qian.q.xu@intel.com<mailto:qian.q.xu@intel.com>>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
What is the current status of the Intel 82599ES?
Were there any configuration changes made to fix performance issues?
Thanks
On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
Hi David,
This was just a weird issue with the packet generator not cleaning itself after a test fast enough before another test.
I'll rerun the tests that were affected and keep an eye out to see if it's stable enough to be put back online.
Thanks
On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com<mailto:david.marchand@redhat.com>> wrote:
On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu<mailto:blo@iol.unh.edu>> wrote:
>
> Hi David and Zhaoyan,
>
>
> Yes, those results are related to the Intel machine; I have disabled testing for the Intel testbed.
>
> The 82599ES machine is now available for ssh and modifications.
Any news about this?
I received a failure on a patch of mine (changing macros in an ARM header).
https://lab.dpdk.org/results/dashboard/patchsets/9900/
But this time, it is with the 40G Intel nic test.
--
David Marchand
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu<mailto:blo@iol.unh.edu>
www.iol.unh.edu<http://www.iol.unh.edu/>
[-- Attachment #1.2: Type: text/html, Size: 100596 bytes --]
[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 4767 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-05-12 5:41 ` Ma, LihongX
@ 2020-05-14 16:12 ` Brandon Lo
2020-05-15 1:30 ` Ma, LihongX
0 siblings, 1 reply; 42+ messages in thread
From: Brandon Lo @ 2020-05-14 16:12 UTC (permalink / raw)
To: Ma, LihongX
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim, Lin, Xueqin
[-- Attachment #1.1: Type: text/plain, Size: 18369 bytes --]
Hi Lihong,
I've applied the same fix to the Intel 10G machine.
It should be pulling the correct baseline values from now on.
Thanks,
Brandon
On Tue, May 12, 2020 at 1:41 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
> Hi, Brandon
>
>
>
> Thanks for your help checking the baseline issues. The FVL environment is
> now OK: the expected value has been changed to the value we wanted.
>
> But the NNT baseline has changed; can you help check it?
>
>
>
> Thanks
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Ma, LihongX
> *Sent:* Friday, May 8, 2020 4:02 PM
> *To:* 'Brandon Lo' <blo@iol.unh.edu>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
> *Subject:* RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Thanks, Brandon. I will wait for your reply.
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu <blo@iol.unh.edu>]
> *Sent:* Wednesday, May 6, 2020 9:05 PM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> Just a further update: we have noticed that there is another internal
> script that is used to calculate baselines that will pull a newer baseline
> if it is found.
>
> We are looking to solve the issues that we are having with baselines and
> will get back to you.
>
>
>
> Thanks for your patience,
>
> Brandon
>
>
>
> On Thu, Apr 30, 2020 at 10:19 AM Brandon Lo <blo@iol.unh.edu> wrote:
>
> Hi Lihong,
>
>
>
> The expected value was reset by one of our internal scripts.
>
> I believe that I have resolved this issue for the future by ensuring that
> the baseline that you sent me will not be overwritten automatically.
>
>
>
> I will continue to monitor this expected throughput in case of any issues.
>
>
>
> Thanks for your patience,
>
> Brandon
>
>
>
> On Wed, Apr 29, 2020 at 1:30 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
>
>
> I checked the new FVL result and found that the expected value has still
> not changed.
>
> From the log, the expected value is still:
>
>
>
> Can you help double-check it? Is there any difference between FVL and
> NNT?
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Wednesday, April 29, 2020 12:52 AM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> Sorry about that, I have reset the baseline to the values you sent in the
> previous email.
>
> I'll look to rerun tests that have failed due to the incorrect baseline.
>
>
>
> Thanks for letting me know,
>
> Brandon
>
>
>
> On Mon, Apr 27, 2020 at 11:39 PM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
> I found that the NNT baseline has changed as expected, but the FVL baseline
> is still the same as before.
>
> Can you help check it and change the baseline as expected?
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Friday, April 3, 2020 2:39 AM
> *To:* Ma, LihongX <lihongx.ma@intel.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <
> david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <
> lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org;
> Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>;
> Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Lihong,
>
>
>
> I have changed the baselines to reflect the new expected values.
>
> The performance tests should work as expected and pass.
>
>
>
> We will email again in the future if we come across any problems.
>
> Feel free to email us as well if you would like to make any other changes.
>
>
>
> Thank you for all your help
>
>
>
> On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
>
> Hi, Brandon
>
> Thanks for your recommendations; I have made the changes.
>
> The throughput value of nic_single_core is proportional to the CPU
> frequency, so I recommend changing the baseline according to our report
> system.
>
>
>
> On our 2.50 GHz system, the baseline values are as below:
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562
> 64        2048     41.439
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608
> 64        2048     47.73
>
>
>
> The testbed at UNH is a 2.1 GHz CPU server, so the expected numbers
> should be:
>
> NNT:
>
> pkt_size  trd/rxd  expected_value
> 64        512      52.562 / 2.5 * 2.1 = 44.152
> 64        2048     41.439 / 2.5 * 2.1 = 34.809
>
> FVL:
>
> pkt_size  trd/rxd  expected_value
> 64        512      59.608 / 2.5 * 2.1 = 50.071
> 64        2048     47.73 / 2.5 * 2.1 = 40.093
>
>
>
>
>
>
>
> Regards,
>
> Ma,lihong
>
>
>
> *From:* Brandon Lo [mailto:blo@iol.unh.edu]
> *Sent:* Tuesday, March 31, 2020 9:42 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> To make changes to either Intel machine, please reboot using the command
> "reboot_to_rw" as root to reboot the machine into read/write mode.
>
> This command will also disable any testing on the machine.
>
>
>
> To re-enable the machine, please run "reboot_to_ro" as root, and it will
> save all of the changes that you've made and re-enable testing on the
> machine.
>
> I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro"
> instead of the normal "reboot" while you're making changes.
>
>
>
> After you're done, please let me know. I'll have to manually run a test
> and update the baseline using our internal CI.
>
>
>
> Thank you
>
>
>
> On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Please let me know how to make changes to this machine that resets
> changes (IP/access...) and how to disable it.
>
>
>
> After that, please help change the baseline.
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Thursday, March 26, 2020 11:39 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <
> tim.odriscoll@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, we have a system in place that resets any changes made while
> testing is enabled for a machine.
>
> If you would like, I can disable testing and allow you to make permanent
> changes.
>
>
>
> I can also reset the baseline of Intel 10G test performance once you make
> these changes.
>
> Please let me know if you would like to make permanent changes on the
> Intel 10G so I can disable it for you.
>
>
>
> Thanks
>
>
>
> On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Thanks. Brandon.
>
>
>
> That’s good. We have made changes on the 10G testbed.
>
>
>
> I monitored several execution results and found that the 10G results
> always have a gap of -0.9% to -1.x% against the expected number, so it
> could sometimes lead to failures (±1%). I suggest adjusting the expected
> number. I don’t know where the expected number comes from. As far as I
> know, it is a dynamic number that depends on the baseline. Please help
> clarify, thanks.
>
>
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 24, 2020 9:31 PM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> I have enabled the 10G Intel machine for testing.
>
> If you would like to make any more changes, please let me know so I can
> perform the necessary steps to prepare the machine for changes.
>
> Please feel free to let me know if you need anything.
>
>
>
> Thank you
>
>
>
> On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> For 10G, please enable it. Our code is at the original path,
> /opt/test-harness/dts.
>
>
>
> For 40G, please keep it running and see if any issue occurs. In any case,
> we have modified the DTS code at /opt/test-harness/dts-new-suite. If we
> meet the same problem, then use this new DTS instead.
>
>
>
> Thanks a lot
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Saturday, March 21, 2020 1:49 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang,
> XuemingX <xuemingx.zhang@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Currently, the 40G machine is stable enough to be put on production
> dashboard to run tests which may cause Trex to be killed.
>
> Should I disable the 40G Intel machine for you to make changes?
>
>
>
> Also, just for confirmation: on the 10G machine, is the folder that you
> are using for the testing located in /opt/test-harness/dts-2020-3-4, or
> are you still using the one in the standard /opt/test-harness/dts
> folder?
>
>
>
> If everything is ok, I will enable the 10G machine for production testing.
>
>
>
> Thank you very much
>
>
>
> On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Brandon,
>
>
>
> We worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G).
> Could you please help to recover them?
>
>
>
> But for the FVL (40G) testbed we met some problems. Could you please help
> check them before recovering it?
>
> - Sometimes the 1G hugepages are changed to 2M hugepages
> automatically, and we have to restart the system.
> - While debugging on the testbed, we found that Trex was killed by
> someone (some app).
>
> Please help check whether any other program is running on the testbed.
>
>
>
> Thanks a lot.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Sent:* Wednesday, March 18, 2020 9:04 PM
> *To:* Brandon Lo <blo@iol.unh.edu>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Subject:* RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Brandon, we have almost finished a workaround.
>
>
>
> Maybe tomorrow you can recover Intel’s testbed. I will let you know
> soon.
>
>
>
>
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Wednesday, March 18, 2020 3:34 AM
> *To:* Chen, Zhaoyan <zhaoyan.chen@intel.com>
> *Cc:* David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> Have you finished making changes on the Intel machine?
>
> I will turn on the machine on March 3rd for testing if you do not have any
> issues with it.
>
> Please let me know if you need anything else.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com>
> wrote:
>
> Hi, Brandon,
>
>
>
> Yes, it’s a weird issue. It is also mixed up with our DTS upgrade and Trex
> upgrade.
>
> So we are reviewing our DTS script, the different Trex versions, and the CI
> calling procedure.
>
>
>
> Anyway, we are focusing on this task at the moment and will let you know
> of any update.
>
>
>
> Thanks.
>
>
>
> *Regards,*
>
> *Zhaoyan Chen*
>
>
>
> *From:* Brandon Lo <blo@iol.unh.edu>
> *Sent:* Tuesday, March 10, 2020 10:46 PM
> *To:* David Marchand <david.marchand@redhat.com>
> *Cc:* Chen, Zhaoyan <zhaoyan.chen@intel.com>; dpdklab@iol.unh.edu;
> Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <
> thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu,
> Qian Q <qian.q.xu@intel.com>
> *Subject:* Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
>
>
>
> Hi Zhaoyan,
>
>
>
> What is the current status of the Intel 82599ES?
>
> Were there any configuration changes made to fix performance issues?
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu> wrote:
>
> Hi David,
>
>
>
> This was just a weird issue with the packet generator not cleaning up
> after a test fast enough before the next test started.
>
> I'll rerun the tests that were affected and keep an eye out to see if it's
> stable enough to be put back online.
>
>
>
> Thanks
>
>
>
> On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com>
> wrote:
>
> On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu> wrote:
> >
> > Hi David and Zhaoyan,
> >
> >
> > Yes, those results are related to the Intel machine; I have disabled
> testing for the Intel testbed.
> >
> > The 82599ES machine is now available for ssh and modifications.
>
> Any news about this?
>
> I received a failure on a patch of mine (changing macros in an ARM header).
> https://lab.dpdk.org/results/dashboard/patchsets/9900/
>
> But this time, it is with the 40G Intel nic test.
>
> --
> David Marchand
>
>
>
>
> --
>
> Brandon Lo
>
> UNH InterOperability Laboratory
>
> 21 Madbury Rd, Suite 100, Durham, NH 03824
>
> blo@iol.unh.edu
>
> www.iol.unh.edu
>
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu
[-- Attachment #1.2: Type: text/html, Size: 80881 bytes --]
[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 4767 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [dpdk-ci] [dpdklab] Re: Intel performance test is failing
2020-05-14 16:12 ` Brandon Lo
@ 2020-05-15 1:30 ` Ma, LihongX
0 siblings, 0 replies; 42+ messages in thread
From: Ma, LihongX @ 2020-05-15 1:30 UTC (permalink / raw)
To: Brandon Lo
Cc: Chen, Zhaoyan, David Marchand, dpdklab, Lincoln Lavoie,
Thomas Monjalon, ci, Tu, Lijuan, Xu, Qian Q, Zhang, XuemingX,
O'Driscoll, Tim, Lin, Xueqin
[-- Attachment #1.1: Type: text/plain, Size: 21397 bytes --]
Thanks Brandon
The baseline values are now correct on both NNT and FVL.
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Friday, May 15, 2020 12:13 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I've applied the same fix to the Intel 10G machine.
It should be pulling the correct baseline values from now on.
Thanks,
Brandon
On Tue, May 12, 2020 at 1:41 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
Thanks for your help checking the baseline issues. The FVL environment is OK: the expected value has been changed to the value we wanted.
But the NNT baseline has changed. Can you help to check it?
Thanks
Regards,
Ma,lihong
From: Ma, LihongX
Sent: Friday, May 8, 2020 4:02 PM
To: 'Brandon Lo' <blo@iol.unh.edu>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Thanks Brandon, I will wait for your reply.
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Wednesday, May 6, 2020 9:05 PM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
Just a further update: we have noticed that there is another internal script that is used to calculate baselines that will pull a newer baseline if it is found.
We are looking to solve the issues that we are having with baselines and will get back to you.
Thanks for your patience,
Brandon
On Thu, Apr 30, 2020 at 10:19 AM Brandon Lo <blo@iol.unh.edu> wrote:
Hi Lihong,
The expected value was reset by one of our internal scripts.
I believe that I have resolved this issue for the future by ensuring that the baseline that you sent me will not be overwritten automatically.
I will continue to monitor this expected throughput in case of any issues.
Thanks for your patience,
Brandon
On Wed, Apr 29, 2020 at 1:30 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
I checked the new FVL result and found that the expected value has still not changed.
According to the log, the expected value is still:
[cid:image001.png@01D62A9B.7B8F1380]
Can you help to double-check it? Is there any difference between FVL and NNT?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Wednesday, April 29, 2020 12:52 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
Sorry about that, I have reset the baseline to the values you sent in the previous email.
I'll look to rerun tests that have failed due to the incorrect baseline.
Thanks for letting me know,
Brandon
On Mon, Apr 27, 2020 at 11:39 PM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
I see that the NNT baseline has changed as expected, but the FVL baseline is still the same as before.
Can you help to check it and change the baseline as expected?
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Friday, April 3, 2020 2:39 AM
To: Ma, LihongX <lihongx.ma@intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Lihong,
I have changed the baselines to reflect the new expected values.
The performance tests should work as expected and pass.
We will email again in the future if we come across any problems.
Feel free to email us as well if you would like to make any other changes.
Thank you for all your help
On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma@intel.com> wrote:
Hi, Brandon
Thanks for your recommendations, I have made the changes.
The throughput value of nic_single_core is proportional to the CPU frequency.
I therefore recommend changing the baseline according to our report system.
On our 2.50GHz system, the baseline values are as below:
NNT:
pkt_size  txd/rxd  expected_value
64        512      52.562
64        2048     41.439
FVL:
pkt_size  txd/rxd  expected_value
64        512      59.608
64        2048     47.73
For the testbed in UNH, it’s a 2.1GHz CPU server, so the expected numbers should be:
NNT:
pkt_size  txd/rxd  expected_value
64        512      52.562 / 2.5 * 2.1 = 44.152
64        2048     41.439 / 2.5 * 2.1 = 34.809
FVL:
pkt_size  txd/rxd  expected_value
64        512      59.608 / 2.5 * 2.1 = 50.071
64        2048     47.73 / 2.5 * 2.1 = 40.093
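The scaling above can be reproduced with a short script. This is only a sketch of the arithmetic in this email, assuming that throughput scales linearly with CPU frequency as stated; the function and variable names are made up for illustration and are not part of DTS or the lab's CI.

```python
def scale_baseline(reference_value, reference_ghz, target_ghz, ndigits=3):
    """Scale a throughput baseline linearly with CPU frequency.

    Assumes throughput is proportional to frequency, as stated in the email.
    """
    return round(reference_value / reference_ghz * target_ghz, ndigits)


REFERENCE_GHZ = 2.5  # system on which the reference baselines were measured
TARGET_GHZ = 2.1     # testbed in UNH

# (NIC, packet size, descriptor count) -> expected value on the 2.5 GHz system
reference_baselines = {
    ("NNT", 64, 512): 52.562,
    ("NNT", 64, 2048): 41.439,
    ("FVL", 64, 512): 59.608,
    ("FVL", 64, 2048): 47.73,
}

# Same keys, with every value rescaled to the 2.1 GHz testbed
scaled_baselines = {
    key: scale_baseline(value, REFERENCE_GHZ, TARGET_GHZ)
    for key, value in reference_baselines.items()
}

for (nic, pkt_size, descriptors), value in scaled_baselines.items():
    print(f"{nic} pkt_size={pkt_size} descriptors={descriptors}: {value}")
```

The four printed values should match the ones computed by hand in the tables above.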
Regards,
Ma,lihong
From: Brandon Lo [mailto:blo@iol.unh.edu]
Sent: Tuesday, March 31, 2020 9:42 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
To make changes to either Intel machine, please reboot using the command "reboot_to_rw" as root to reboot the machine into read/write mode.
This command will also disable any testing on the machine.
To re-enable the machine, please run "reboot_to_ro" as root, and it will save all of the changes that you've made and re-enable testing on the machine.
I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro" instead of the normal "reboot" while you're making changes.
After you're done, please let me know. I'll have to manually run a test and update the baseline using our internal CI.
Thank you
On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
Please let me know how to make changes on this machine that resets itself (IP/access...) and how to disable testing on it.
After that, please help to change the baseline.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Thursday, March 26, 2020 11:39 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, we have a system in place that resets any changes made while testing is enabled for a machine.
If you would like, I can disable testing and allow you to make permanent changes.
I can also reset the baseline of Intel 10G test performance once you make these changes.
Please let me know if you would like to make permanent changes on the Intel 10G so I can disable it for you.
Thanks
On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Thanks, Brandon.
That’s good. We have made changes on the 10G testbed.
I monitored several execution results and found that the 10G results always show a gap of -0.9% to -1.x% against the expected number. With a +-1% tolerance, this can lead to occasional failures, so I suggest adjusting the expected number. Where does the expected number come from? As far as I know, it is a dynamic number that depends on the baseline. Please help to clarify, thanks.
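To illustrate why a constant gap of about -1% produces intermittent failures, here is a minimal sketch of a tolerance check. It assumes the pass criterion is a symmetric +-1% band around the expected number; the function name and the sample values are hypothetical and are not taken from the lab's CI scripts.

```python
def within_tolerance(measured, expected, tolerance_pct=1.0):
    """Return (passed, delta_pct) for a measured throughput.

    delta_pct is the relative difference from the expected value in percent;
    the run passes when its absolute value is within tolerance_pct.
    """
    delta_pct = (measured - expected) / expected * 100.0
    return abs(delta_pct) <= tolerance_pct, delta_pct


# Hypothetical numbers: the same expected value with two slightly different runs
expected_value = 50.0
for measured_value in (49.55, 49.4):
    passed, delta = within_tolerance(measured_value, expected_value)
    print(f"measured={measured_value} delta={delta:.2f}% passed={passed}")
```

With these sample values, a run that is 0.9% below the expected number passes and one that is 1.2% below fails, so results hovering around -1% would flip between the two outcomes.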
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Tuesday, March 24, 2020 9:31 PM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
I have enabled the 10G Intel machine for testing.
If you would like to make any more changes, please let me know so I can perform the necessary steps to prepare the machine for changes.
Please feel free to let me know if you need anything.
Thank you
On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
For 10G, please enable it. Our code is at the original path, /opt/test-harness/dts.
For 40G, please keep it running and see whether any issue occurs. In any case, we have modified the DTS code at /opt/test-harness/dts-new-suite. If we meet the same problem again, then use this new DTS instead.
Thanks a lot
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Saturday, March 21, 2020 1:49 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Ma, LihongX <lihongx.ma@intel.com>; Zhang, XuemingX <xuemingx.zhang@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Currently, the 40G machine is considered stable enough to be on the production dashboard and is running tests, which may be what causes Trex to be killed.
Should I disable the 40G Intel machine for you to make changes?
Also, just for confirmation: on the 10G machine, is the folder that you are using for the testing located in /opt/test-harness/dts-2020-3-4, or are you still using the one in the standard /opt/test-harness/dts folder?
If everything is ok, I will enable the 10G machine for production testing.
Thank you very much
On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Brandon,
We have worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G). Could you please help to recover them?
For the FVL (40G) testbed, however, we met some problems. Could you please help to check them before recovering it?
* Sometimes the 1G hugepages are changed to 2M hugepages automatically, and we have to restart the system.
* While we were debugging on the testbed, we found that Trex was killed by some other application.
Please help to check whether any other program is running on the testbed.
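One way to notice the hugepage change before a test run is to read the default hugepage size that the Linux kernel reports in /proc/meminfo. The sketch below only parses the `Hugepagesize:` line from text passed in; the helper names and the sample text are illustrative, and it only sees the default size, so a system with 1G pages reserved but a 2M default would need a different check.

```python
import re


def default_hugepage_size_kb(meminfo_text):
    """Return the default hugepage size in kB, or None if it is not reported."""
    match = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def hugepages_are_1g(meminfo_text):
    """True when the default hugepage size is 1 GiB (1048576 kB)."""
    return default_hugepage_size_kb(meminfo_text) == 1024 * 1024


# Sample text in the /proc/meminfo format, used here instead of the real file
sample_meminfo = (
    "MemTotal:       65807556 kB\n"
    "HugePages_Total:      16\n"
    "Hugepagesize:    1048576 kB\n"
)
print(hugepages_are_1g(sample_meminfo))
```

On the testbed itself, the text would come from reading /proc/meminfo, and a False result before a run would indicate that the machine has fallen back to a different default size.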
Thanks a lot.
Regards,
Zhaoyan Chen
From: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Sent: Wednesday, March 18, 2020 9:04 PM
To: Brandon Lo <blo@iol.unh.edu>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Brandon, we have almost finished a workaround.
You may be able to recover Intel’s testbed tomorrow. I will let you know soon.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Wednesday, March 18, 2020 3:34 AM
To: Chen, Zhaoyan <zhaoyan.chen@intel.com>
Cc: David Marchand <david.marchand@redhat.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
Have you finished making changes on the Intel machine?
I will turn on the machine on March 3rd for testing if you do not have any issues with it.
Please let me know if you need anything else.
Thanks
On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen@intel.com> wrote:
Hi, Brandon,
Yes, it’s a weird issue. It is also mixed up with our DTS upgrade and Trex upgrade.
So we are reviewing our DTS script, the different Trex versions, and the CI calling procedure.
Anyway, we are focusing on this task at the moment and will let you know of any update.
Thanks.
Regards,
Zhaoyan Chen
From: Brandon Lo <blo@iol.unh.edu>
Sent: Tuesday, March 10, 2020 10:46 PM
To: David Marchand <david.marchand@redhat.com>
Cc: Chen, Zhaoyan <zhaoyan.chen@intel.com>; dpdklab@iol.unh.edu; Lincoln Lavoie <lylavoie@iol.unh.edu>; Thomas Monjalon <thomas@monjalon.net>; ci@dpdk.org; Tu, Lijuan <lijuan.tu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing
Hi Zhaoyan,
What is the current status of the Intel 82599ES?
Were there any configuration changes made to fix performance issues?
Thanks
On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo@iol.unh.edu> wrote:
Hi David,
This was just a weird issue with the packet generator not cleaning up after a test fast enough before the next test started.
I'll rerun the tests that were affected and keep an eye out to see if it's stable enough to be put back online.
Thanks
On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand@redhat.com> wrote:
On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo@iol.unh.edu> wrote:
>
> Hi David and Zhaoyan,
>
>
> Yes, those results are related to the Intel machine; I have disabled testing for the Intel testbed.
>
> The 82599ES machine is now available for ssh and modifications.
Any news about this?
I received a failure on a patch of mine (changing macros in an ARM header).
https://lab.dpdk.org/results/dashboard/patchsets/9900/
But this time, it is with the 40G Intel nic test.
--
David Marchand
--
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu
[-- Attachment #1.2: Type: text/html, Size: 105970 bytes --]
[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 4767 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread