Subject: Re: [dpdk-dev] Performances are not scale with multiple ports
From: Emre Eraltan
Date: Mon, 27 May 2013 20:15:23 -0700
To: Shinae Woo
Cc: dev@dpdk.org
Message-ID: <51A4214B.8040703@6wind.com>

Hello Shinae,

Did you try to use the testpmd tool with multiple queues per port? It gives you more flexibility than the l2fwd app.
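
For example, something along these lines brings up every port with four RX queues spread across the polling cores (a sketch only: check testpmd --help on your release for the exact --rxq/--txq, --nb-cores and --rss-udp spellings):

  ./build/app/testpmd -c 0xff -n 3 -- -i --rxq=4 --txq=4 --nb-cores=7 --rss-udp
  testpmd> set fwd rxonly
  testpmd> start
  testpmd> show port stats all

Here -c 0xff gives testpmd eight lcores; one runs the interactive shell while the other seven poll the queues, and "set fwd rxonly" reproduces your receive-only test.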

You need to trigger the RSS feature of the NIC by sending different streams (by changing the destination port, for instance, or any other field of the 5-tuple). This will load-balance your packets among several cores, so that multiple queues are polled by different cores. Otherwise, you will use only one core (or thread, if HT is enabled) per port on the RX side.
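
If you would rather keep a modified l2fwd, the same effect comes from configuring each port for RSS and giving every polling lcore its own RX queue. A minimal sketch of that setup (illustrative only: the struct fields, RSS flags, and NULL queue-configuration defaults below follow later DPDK 1.x releases and will need adjusting on 1.2.3):

#include <rte_ethdev.h>

/* Enable RSS so the NIC hashes each packet's 5-tuple and spreads
 * the flows across nb_rxq receive queues. */
static const struct rte_eth_conf port_conf = {
	.rxmode = {
		.mq_mode = ETH_MQ_RX_RSS,
	},
	.rx_adv_conf = {
		.rss_conf = {
			.rss_key = NULL,  /* keep the driver's default hash key */
			.rss_hf  = ETH_RSS_IPV4 | ETH_RSS_IPV4_TCP | ETH_RSS_IPV4_UDP,
		},
	},
};

static int
setup_rss_port(uint8_t port_id, uint16_t nb_rxq, struct rte_mempool *pool)
{
	uint16_t q;
	int ret;

	ret = rte_eth_dev_configure(port_id, nb_rxq, 1, &port_conf);
	if (ret < 0)
		return ret;

	/* One RX queue per polling lcore: each lcore then calls
	 * rte_eth_rx_burst(port_id, its_own_queue, ...) in its loop. */
	for (q = 0; q < nb_rxq; q++) {
		ret = rte_eth_rx_queue_setup(port_id, q, 128,
				rte_eth_dev_socket_id(port_id), NULL, pool);
		if (ret < 0)
			return ret;
	}
	ret = rte_eth_tx_queue_setup(port_id, 0, 256,
			rte_eth_dev_socket_id(port_id), NULL);
	if (ret < 0)
		return ret;

	return rte_eth_dev_start(port_id);
}

With nb_rxq equal to the number of polling cores per port, the RX work for one port is no longer serialized on a single core, which is the limit you are hitting with l2fwd.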

Best Regards,
Emre
-- 
Emre ERALTAN
6WIND Field Application Engineer

On 27/05/2013 20:05, Shinae Woo wrote:
Thanks for sharing, Naoto.

So in your experiments, the forwarding performance still does not reach the line rate.

Your perf record shows that the CPU spends most of its time polling for received packets,
with no other heavy operation.
So even though the application is polling for packets as fast as it can,
the forwarder still misses packets somewhere outside the application.

The DPDK documentation shows 160 Mpps forwarding performance on 2 sockets,
but I can only reach 13 Mpps with 2 ports.
Even after doubling the number of ports to 4, the performance is still less than 17 Mpps.

I want to know where the bottleneck lies in my environment, and
how I can reproduce the performance figures DPDK has published.

Thank you,
Shinae



On Tue, May 28, 2013 at 11:30 AM, Naoto MATSUMOTO <n-matsumoto@sakura.ad.jp> wrote:

FYI: Disruptive IP Networking with Intel DPDK on Linux
http://slidesha.re/SeVFZo


On Tue, 28 May 2013 11:26:30 +0900
Shinae Woo <shinae2012@gmail.com> wrote:

> Hello, all.
>
> I have been playing with the dpdk-1.2.3r1 examples.
>
> But I cannot achieve line-rate packet receive performance,
> and the performance does not scale with multiple ports.
>
> For example, with the l2fwd example, I have tested two cases, with 2 ports
> and with 4 ports,
> using the following command lines:
>
> ./build/l2fwd -cf -n3 -- -p3
> ./build/l2fwd -cf -n3 -- -pf
>
> But in both cases, the aggregate performance does not scale.
>
> == experiment environment ==
> - Two Intel 82599 NICs (total 4 ports)
> - Intel Xeon X5690  @ 3.47GHz * 2 (total 12 cores)
> - 1024 * 2MB hugepages
> - Linux 2.6.38-15-server
> - Each port receiving 10 Gbps of 64-byte packets, i.e. 14.88 Mpps.
>
> *1. Packet forwarding performance*
>
> In the 2-port case, receive performance is 13 Mpps;
> in the 4-port case, it is not 26 Mpps but only 16.8 Mpps.
>
> Port statistics ====================================
> Statistics for port 0 ------------------------------
> Packets sent:                  4292256
> Packets received:              6517396
> Packets dropped:               2224776
> Statistics for port 1 ------------------------------
> Packets sent:                  4291840
> Packets received:              6517044
> Packets dropped:               2225556
> Aggregate statistics ===============================
> Total packets sent:            8584128
> Total packets received:       13034472
> Total packets dropped:         4450332
> ====================================================
>
> Port statistics ====================================
> Statistics for port 0 ------------------------------
> Packets sent:                  1784064
> Packets received:              2632700
> Packets dropped:                848128
> Statistics for port 1 ------------------------------
> Packets sent:                  1784104
> Packets received:              2632196
> Packets dropped:                848596
> Statistics for port 2 ------------------------------
> Packets sent:                  3587616
> Packets received:              5816344
> Packets dropped:               2200176
> Statistics for port 3 ------------------------------
> Packets sent:                  3587712
> Packets received:              5787848
> Packets dropped:               2228684
> Aggregate statistics ===============================
> Total packets sent:           10743560
> Total packets received:       16869152
> Total packets dropped:         6125608
> ====================================================
>
> *2. Packet receiving performance*
> I modified the code to only receive packets (no forwarding);
> the performance still does not scale: 13.3 Mpps and 18 Mpps respectively.
>
> Port statistics ====================================
> Statistics for port 0 ------------------------------
> Packets sent:                        0
> Packets received:              6678860
> Packets dropped:                     0
> Statistics for port 1 ------------------------------
> Packets sent:                        0
> Packets received:              6646120
> Packets dropped:                     0
> Aggregate statistics ===============================
> Total packets sent:                  0
> Total packets received:       13325012
> Total packets dropped:               0
> ====================================================
>
> Port statistics ====================================
> Statistics for port 0 ------------------------------
> Packets sent:                        0
> Packets received:              3129624
> Packets dropped:                     0
> Statistics for port 1 ------------------------------
> Packets sent:                        0
> Packets received:              3131292
> Packets dropped:                     0
> Statistics for port 2 ------------------------------
> Packets sent:                        0
> Packets received:              6260908
> Packets dropped:                     0
> Statistics for port 3 ------------------------------
> Packets sent:                        0
> Packets received:              6238764
> Packets dropped:                     0
> Aggregate statistics ===============================
> Total packets sent:                  0
> Total packets received:       18760640
> Total packets dropped:               0
> ====================================================
>
> My questions are:
> 1. How can I achieve the full 14.88 Mpps receive rate on each port?
>     What might be the bottleneck in the current environment?
> 2. Why does the performance not scale with multiple ports?
>     I expected doubling the ports to double the receive performance,
>     but it does not. I am curious about what is limiting the packet
> receiving performance.
>
> Thanks,
> Shinae

--
SAKURA Internet Inc. / Senior Researcher
Naoto MATSUMOTO <n-matsumoto@sakura.ad.jp>
SAKURA Internet Research Center <http://research.sakura.ad.jp/>

