From: Zachary
Organization: Cas-Well
Date: Mon, 2 Sep 2013 11:22:14 +0800
Message-ID: <52240466.7050907@cas-well.com>
Cc: Yannic.Chou, Alan Yu
Subject: [dpdk-dev] DPDK & QPI performance issue in Romley platform.
List-Id: patches and discussions about DPDK
Hi,

I have a question about DPDK and QPI performance on the Romley platform.
Recently I used the DPDK example l2fwd to test DPDK performance on my Romley platform.
When the forwarding cores are on a different CPU socket from the one the NIC slot is attached to, performance drops dramatically.
Is this expected? Is there a way to confirm that crossing sockets is the cause?

In my opinion this should not happen, because QPI should have enough bandwidth to handle this case.
I am surprised by our results and cannot explain them.
Could someone help me solve this problem?

Thanks a lot!


My testing environment is described below:

Platform:      Romley
CPU:           E5-2643 * 2
RAM:           Transcend 8GB PC3-1600 DDR3 * 8
OS:            Fedora Core 14
DPDK:          v1.3.1r2, example/l2fwd
Slot setting:  SlotA is controlled by CPU1 directly.
               SlotB is controlled by CPU0 directly.
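
The slot-to-socket mapping above can usually be cross-checked against what the kernel reports for each NIC. The following is a minimal sketch, assuming a Linux system that exposes the `numa_node` attribute under `/sys/bus/pci/devices`; the PCI address used here is a placeholder and is not taken from this post.

```python
from pathlib import Path


def pci_numa_node(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the NUMA node the kernel reports for a PCI device, or None.

    A value of -1 means the kernel has no locality information for the device.
    None means the attribute could not be read at all.
    """
    path = Path(sysfs_root) / pci_addr / "numa_node"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


# "0000:03:00.0" is a hypothetical address; substitute the NIC's real address.
print(pci_numa_node("0000:03:00.0"))
```

If the value printed for a port in SlotA differs from the one for a port in SlotB, that would be consistent with the slot setting listed above.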

DPDK pre-setting:
a. BIOS setting:
    HT=disable
b. Kernel parameters:
    isolcpus=2,3,6,7
    default_hugepagesz=1024M
    hugepagesz=1024M
    hugepages=16
c. OS setting:
    service avahi-daemon stop
    service NetworkManager stop
    service iptables stop
    service acpid stop
    selinux disable
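
Which socket each isolated core belongs to depends on how the kernel enumerates the cores, and that is not stated above. As a rough sketch, assuming Linux exposes `/sys/devices/system/node/nodeN/cpulist`, the mapping could be checked as follows; the helper names are illustrative only.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as "0-3,8" into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def core_to_node(sysfs_root="/sys/devices/system/node"):
    """Map each CPU id to its NUMA node by reading nodeN/cpulist."""
    mapping = {}
    for node_dir in Path(sysfs_root).glob("node[0-9]*"):
        cpulist = node_dir / "cpulist"
        if not cpulist.is_file():
            continue
        node = int(node_dir.name[len("node"):])
        for cpu in parse_cpulist(cpulist.read_text()):
            mapping[cpu] = node
    return mapping


isolated = parse_cpulist("2,3,6,7")  # the isolcpus value from this post
nodes = core_to_node()
for cpu in isolated:
    print(cpu, nodes.get(cpu, "unknown"))
```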


Example program commands:
a. SlotB(CPU0) -> CPU1
    #>./l2fwd -c 0xc -n 4 -- -q 1 -p 0xc

b. SlotA(CPU1) -> CPU0
    #>./l2fwd -c 0xc0 -n 4 -- -q 1 -p 0xc0
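
For reference, `-c` is the EAL core mask and `-p` is the l2fwd port mask, both given as hexadecimal bitmaps. The sketch below only decodes the two masks used in the commands above into bit positions. It says nothing about which socket those cores sit on.

```python
def mask_to_ids(mask):
    """Return the set bit positions of an integer mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


for label, mask in (("a", 0xC), ("b", 0xC0)):
    print(label, hex(mask), mask_to_ids(mask))
```

Command a therefore selects cores 2 and 3 (and ports 2 and 3), and command b selects cores 6 and 7 (and ports 6 and 7).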

Results:
     Frame size: 128 bytes

CPU Affinity    Slot A (CPU1)    Slot B (CPU0)
CPU0            15.9%            96.49%
CPU1            90.88%           24.78%
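
To put a number on the drop, the remote-socket figure can be divided by the local-socket figure for each slot. This is only a sketch: the post does not say what the percentages measure, so the code assumes that higher is better and that both values in a column use the same unit.

```python
# Figures copied from the results table in this post (128-byte frames).
results = {
    "Slot A (CPU1)": {"CPU0": 15.9, "CPU1": 90.88},
    "Slot B (CPU0)": {"CPU0": 96.49, "CPU1": 24.78},
}
# The CPU each slot is attached to, per the slot setting in this post.
local_cpu = {"Slot A (CPU1)": "CPU1", "Slot B (CPU0)": "CPU0"}


def remote_to_local_ratio(slot):
    """Ratio of the cross-socket figure to the same-socket figure for a slot."""
    local = local_cpu[slot]
    remote = "CPU0" if local == "CPU1" else "CPU1"
    return results[slot][remote] / results[slot][local]


for slot in results:
    print(f"{slot}: remote/local = {remote_to_local_ratio(slot):.1%}")
```

With these figures the cross-socket runs reach about 17.5% (Slot A) and 25.7% (Slot B) of the same-socket figure.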


