[dpdk-dev] lpm performance

DPDK patches and discussions

* [dpdk-dev] lpm performance
@ 2016-09-19 22:18 张伟
  2016-09-20  9:41 ` Andriy Berestovskyy
  0 siblings, 1 reply; 5+ messages in thread
From: 张伟 @ 2016-09-19 22:18 UTC
  To: dev, mhall, nikita

Hi all, 

Has anyone tested IPv4 LPM performance? If so, what throughput did you
get? I can reach almost 10 Gbit/s with 64-byte packets, although before
the test I expected less than that. I thought performance would not
depend on the number of rule entries, but that throughput would depend
on whether a lookup has to consult the second-level table, TBL8. Is my
understanding correct? I added the route entries below, following
slide 10 of this deck:
http://www.slideshare.net/garyachy/understanding-ddpd-algorithmics

struct ipv4_lpm_route ipv4_lpm_route_array[] = {
        {IPv4(192, 168, 0, 0), 16, 0},
        {IPv4(192, 168, 1, 0), 24, 1},
        {IPv4(192, 168, 1, 1), 32, 2}
};

I then send a flow with dst IP 192.168.1.2.

That lookup should go through the second-level table, yet the
throughput is still 10 Gbit/s. Is something wrong with my setup, or can
LPM really sustain 10 Gbit/s with 64-byte packets?

Thanks,


* Re: [dpdk-dev] lpm performance
  2016-09-19 22:18 [dpdk-dev] lpm performance 张伟
@ 2016-09-20  9:41 ` Andriy Berestovskyy
  2016-09-20 10:47   ` 张伟
  0 siblings, 1 reply; 5+ messages in thread
From: Andriy Berestovskyy @ 2016-09-20  9:41 UTC
  To: 张伟; +Cc: dev, Matthew Hall, nikita

Hey,
You are correct. An LPM lookup needs either one memory read (TBL24)
or two (TBL24 + TBL8). Performance also drops once you have a variety
of destination addresses instead of just one, because of cache misses.

In your case, dst IP 192.168.1.2 takes two memory reads (TBL24 + TBL8),
because the 192.168.1.0/24 block contains the more specific route
192.168.1.1/32.
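
To make the one-read versus two-read distinction concrete, here is a
minimal, self-contained sketch of a DIR-24-8 style table. It is not the
rte_lpm implementation: the struct layout, the names lpm_sketch,
lpm_add, lpm_lookup, IP4 and NO_ROUTE, and the restriction that routes
are added in order of increasing prefix length are all assumptions made
for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified DIR-24-8 sketch. Layout and names are illustrative,
 * not the real rte_lpm structures. */
#define TBL24_ENTRIES (1u << 24)
#define TBL8_GROUP    256u
#define MAX_GROUPS    16u
#define EXT_FLAG      0x8000u  /* TBL24 entry holds a TBL8 group index */
#define NO_ROUTE      0x7FFFu  /* reserved value: no matching prefix */
#define IP4(a, b, c, d) (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
                         ((uint32_t)(c) << 8) | (uint32_t)(d))

struct lpm_sketch {
    uint16_t *tbl24;                        /* indexed by the top 24 bits */
    uint16_t tbl8[MAX_GROUPS * TBL8_GROUP]; /* group * 256 + low 8 bits */
    uint32_t groups_used;
};

static struct lpm_sketch *lpm_create(void)
{
    struct lpm_sketch *t = malloc(sizeof(*t));
    if (t == NULL)
        return NULL;
    t->tbl24 = malloc(TBL24_ENTRIES * sizeof(uint16_t));
    if (t->tbl24 == NULL) {
        free(t);
        return NULL;
    }
    for (uint32_t i = 0; i < TBL24_ENTRIES; i++)
        t->tbl24[i] = NO_ROUTE;
    t->groups_used = 0;
    return t;
}

static void lpm_free(struct lpm_sketch *t)
{
    if (t != NULL) {
        free(t->tbl24);
        free(t);
    }
}

/* Assumption: callers add routes sorted by increasing depth, so a
 * shorter prefix never has to be pushed into an existing TBL8 group. */
static int lpm_add(struct lpm_sketch *t, uint32_t ip, uint8_t depth,
                   uint16_t hop)
{
    uint32_t first, count, group, idx;

    if (depth == 0 || depth > 32 || hop >= NO_ROUTE)
        return -1;
    if (depth <= 24) {
        /* The prefix covers 2^(24 - depth) consecutive TBL24 entries. */
        count = 1u << (24 - depth);
        first = (ip >> 8) & ~(count - 1u);
        for (uint32_t i = 0; i < count; i++)
            t->tbl24[first + i] = hop;
        return 0;
    }
    idx = ip >> 8;
    if (!(t->tbl24[idx] & EXT_FLAG)) {
        if (t->groups_used == MAX_GROUPS)
            return -1;
        group = t->groups_used++;
        /* Seed the new group with the covering shorter-prefix result. */
        for (uint32_t i = 0; i < TBL8_GROUP; i++)
            t->tbl8[group * TBL8_GROUP + i] = t->tbl24[idx];
        t->tbl24[idx] = (uint16_t)(EXT_FLAG | group);
    }
    group = t->tbl24[idx] & 0x7FFFu;
    count = 1u << (32 - depth);
    first = (ip & 0xFFu) & ~(count - 1u);
    for (uint32_t i = 0; i < count; i++)
        t->tbl8[group * TBL8_GROUP + first + i] = hop;
    return 0;
}

/* Returns the next hop (or NO_ROUTE) and reports the number of table reads. */
static uint16_t lpm_lookup(const struct lpm_sketch *t, uint32_t ip, int *reads)
{
    uint16_t e = t->tbl24[ip >> 8];

    *reads = 1;
    if (e & EXT_FLAG) {
        e = t->tbl8[(uint32_t)(e & 0x7FFFu) * TBL8_GROUP + (ip & 0xFFu)];
        *reads = 2;
    }
    return e;
}
```

With the three routes from the original mail added in the order shown
there, a lookup of 192.168.1.2 in this sketch takes two reads and
returns next hop 1, while 192.168.2.7 takes one read and returns next
hop 0. With a single destination address both entries presumably stay
in cache, so the second read costs very little.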

Regards,
Andriy

-- 
Andriy Berestovskyy


* Re: [dpdk-dev] lpm performance
  2016-09-20  9:41 ` Andriy Berestovskyy
@ 2016-09-20 10:47   ` 张伟
  2016-09-20 14:41     ` Andriy Berestovskyy
  0 siblings, 1 reply; 5+ messages in thread
From: 张伟 @ 2016-09-20 10:47 UTC
  To: Andriy Berestovskyy; +Cc: dev, Matthew Hall, nikita

Thanks so much for your reply! How do you usually test LPM performance
with a variety of destination addresses? Which tool do you use to send
the traffic? How many flow rules do you add, and what performance do
you get?



* Re: [dpdk-dev] lpm performance
  2016-09-20 10:47   ` 张伟
@ 2016-09-20 14:41     ` Andriy Berestovskyy
  2016-09-21  2:42       ` 张伟
  0 siblings, 1 reply; 5+ messages in thread
From: Andriy Berestovskyy @ 2016-09-20 14:41 UTC
  To: 张伟; +Cc: dev, Matthew Hall, nikita

AFAIR Intel hardware should reach 10 Gbit/s line rate (i.e. ~14.88
Mpps with 64-byte packets) with one flow and LPM quite easily. Sorry,
I don't have numbers at hand to share.
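
For reference, the line-rate figure follows from simple arithmetic:
each Ethernet frame occupies an extra 20 bytes on the wire (7-byte
preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap).
A small sketch, where the function name line_rate_pps is only
illustrative:

```c
#include <stdint.h>

/* Per-frame wire overhead: 7 B preamble + 1 B SFD + 12 B inter-frame gap. */
#define ETH_WIRE_OVERHEAD 20u

/* Maximum frames per second for a given link speed and frame size
 * (frame size includes the 4-byte FCS, e.g. 64 for minimum-size frames). */
static uint64_t line_rate_pps(uint64_t link_bps, uint32_t frame_bytes)
{
    return link_bps / ((uint64_t)(frame_bytes + ETH_WIRE_OVERHEAD) * 8u);
}
```

line_rate_pps(10000000000, 64) gives 14880952, which leaves roughly
67 ns per packet for the whole receive, lookup and transmit path.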

For tools, see pktgen-dpdk or TRex. For the number of flows and the
overall benchmarking methodology, see RFC 2544.
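
As a rough illustration of the RFC 2544 throughput test: the goal is
the highest offered rate at which no frames are lost, found by rerunning
trials at raised or lowered rates. The sketch below uses a binary
search, which is a common way to do this but is my choice here, not
something the RFC mandates. trial_fn, find_zero_loss_rate and
fake_dut_trial are hypothetical names; in a real setup the trial
callback would drive the traffic generator and return the number of
lost frames.

```c
#include <stdint.h>

/* Hypothetical callback: run one trial at rate_pps, return frames lost. */
typedef uint64_t (*trial_fn)(uint64_t rate_pps, void *ctx);

/* Binary search for the highest rate with zero loss, to within
 * resolution_pps. Invariant: 'lo' passes, 'hi' fails. */
static uint64_t find_zero_loss_rate(trial_fn trial, void *ctx,
                                    uint64_t max_pps, uint64_t resolution_pps)
{
    uint64_t lo = 0, hi = max_pps;

    if (resolution_pps == 0)
        resolution_pps = 1;
    if (trial(max_pps, ctx) == 0)
        return max_pps;            /* device keeps up with line rate */
    while (hi - lo > resolution_pps) {
        uint64_t mid = lo + (hi - lo) / 2;
        if (trial(mid, ctx) == 0)
            lo = mid;              /* no loss: try a higher rate */
        else
            hi = mid;              /* loss seen: back off */
    }
    return lo;
}

/* Stand-in for a device under test: ctx points to its capacity in pps,
 * and everything offered above that capacity is dropped. */
static uint64_t fake_dut_trial(uint64_t rate_pps, void *ctx)
{
    uint64_t capacity = *(const uint64_t *)ctx;

    return rate_pps > capacity ? rate_pps - capacity : 0;
}
```

With the stand-in device set to a capacity of 9,000,000 pps and a
maximum of 14,880,952 pps, the search converges on 9,000,000 pps.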

Andriy


-- 
Andriy Berestovskyy


* Re: [dpdk-dev] lpm performance
  2016-09-20 14:41     ` Andriy Berestovskyy
@ 2016-09-21  2:42       ` 张伟
  0 siblings, 0 replies; 5+ messages in thread
From: 张伟 @ 2016-09-21  2:42 UTC
  To: Andriy Berestovskyy; +Cc: dev, Matthew Hall, nikita

Got it. Thanks for your guidance!

