DPDK patches and discussions
* [dpdk-dev] VMXNET3 on vmware, ping delay
@ 2015-06-25  9:14 Vass, Sandor (Nokia - HU/Budapest)
  2015-06-25 15:18 ` Matthew Hall
  0 siblings, 1 reply; 10+ messages in thread
From: Vass, Sandor (Nokia - HU/Budapest) @ 2015-06-25  9:14 UTC (permalink / raw)
  To: dev

Hello,
I would like to create an IP packet processing program and I chose DPDK because it is promising with regard to speed.

I am trying to build a test environment to make development cheaper (so we don't have to buy HW for each developer), so I created a test setup with:
- VMWare Workstation 11
- using DPDK 2.0.0
- with linux kernel 3.10.0, CentOS7
- gcc 4.8.3
- the standard, CentOS7-provided VMXNET3 driver, with the uio_pci_generic kernel module
(Should I use vmxnet3-usermap.ko with DPDK 2.0.0? Where is it, and how could I compile it?)

I set up 3 machines:
- set all machines' network interface type to VMXNET3
- set up one machine (C1) for issuing ping, its interface has an IP: 192.168.3.21
- set up one machine (C2) for being the ping target, its interface has an IP: 192.168.3.23
- set up one machine (BR) to act as an L2 bridge using some of the example applications provided. DPDK is compiled properly, 256 x 2MB hugepages are created, and the example application runs without (major) errors.
- the three machines are connected linearly, C1 - BR - C2, using two private networks (VMnet2 and VMnet3), one on each side of BR, so the VMs are connected by vSwitches

The ping reply arrives and definitely goes through BR (verified with extra console logs), but there are unexpected delays with examples/skeleton/basicfwd...
[root@localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=1018 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=18.7 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=8.87 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=1010 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=10.2 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=1012 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=12.7 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=1049 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=49.8 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=9.02 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=8.74 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=1007 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=8.03 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=18 ttl=64 time=8.96 ms
64 bytes from 192.168.3.23: icmp_seq=19 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=20 ttl=64 time=9.27 ms
64 bytes from 192.168.3.23: icmp_seq=21 ttl=64 time=1008 ms
...

When I switched BR to multi_process/client_server_mp with 2 client processes, the result was almost the same:
[root@localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=3.50 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=1002 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=3.94 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=1001 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=1010 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=2003 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=2.29 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=3002 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=2.66 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=3003 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=2.87 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=3003 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=2.88 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=1001 ms
64 bytes from 192.168.3.23: icmp_seq=18 ttl=64 time=2.70 ms
...

And when I switched BR to testpmd, the ping result was more or less normal (every command-line switch left at its default):
[root@localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=3.52 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=33.2 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=3.97 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=25.5 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=61.1 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=36.3 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=35.5 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=33.0 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=5.32 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=14.6 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=34.5 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=4.67 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=55.0 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=4.93 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=5.98 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=5.41 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=21.0 ms
...

Though I think these values are still quite high, I can accept that since this is a virtualized environment.

Could someone please explain to me what is going on with the basicfwd and client-server examples? According to my understanding each packet should go through BR as fast as possible, but it seems that rte_eth_rx_burst retrieves packets only when there are at least 2 packets on the RX queue of the NIC. At least most of the time: according to my console log there are rare cases when it retrieves just 1 packet, and sometimes 3 packets are retrieved...
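
For reference, the kind of forwarding loop that runs on BR, a minimal sketch
along the lines of examples/skeleton/basicfwd (initialization and error
handling omitted):

#include <stdint.h>

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* Forward everything received on one port straight out of the other.
 * rte_eth_rx_burst() does not wait: it returns whatever is currently in
 * the RX ring, anything from 0 up to BURST_SIZE packets. */
static void
bridge_loop(uint8_t port_a, uint8_t port_b)
{
        struct rte_mbuf *bufs[BURST_SIZE];
        const uint8_t ports[2] = { port_a, port_b };
        uint16_t i;
        int p;

        for (;;) {
                for (p = 0; p < 2; p++) {
                        const uint16_t nb_rx = rte_eth_rx_burst(ports[p], 0,
                                        bufs, BURST_SIZE);
                        if (nb_rx == 0)
                                continue;

                        /* send out of the other port */
                        const uint16_t nb_tx = rte_eth_tx_burst(ports[p ^ 1],
                                        0, bufs, nb_rx);

                        /* drop whatever the TX ring could not take */
                        for (i = nb_tx; i < nb_rx; i++)
                                rte_pktmbuf_free(bufs[i]);
                }
        }
}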

What is the difference that makes testpmd work without major delay while the others don't?



Thanks,
Sandor


* Re: [dpdk-dev] VMXNET3 on vmware, ping delay
  2015-06-25  9:14 [dpdk-dev] VMXNET3 on vmware, ping delay Vass, Sandor (Nokia - HU/Budapest)
@ 2015-06-25 15:18 ` Matthew Hall
  2015-06-25 15:46   ` Avi Kivity
  2015-06-25 20:56   ` Patel, Rashmin N
  0 siblings, 2 replies; 10+ messages in thread
From: Matthew Hall @ 2015-06-25 15:18 UTC (permalink / raw)
  To: Vass, Sandor (Nokia - HU/Budapest); +Cc: dev

On Thu, Jun 25, 2015 at 09:14:53AM +0000, Vass, Sandor (Nokia - HU/Budapest) wrote:
> According to my understanding each packet should go 
> through BR as fast as possible, but it seems that the rte_eth_rx_burst 
> retrieves packets only when there are at least 2 packets on the RX queue of 
> the NIC. At least most of the times as there are cases (rarely - according 
> to my console log) when it can retrieve 1 packet also and sometimes only 3 
> packets can be retrieved...

By default DPDK is optimized for throughput not latency. Try a test with 
heavier traffic.

There is also some work going on now for DPDK interrupt-driven mode, which 
will work more like traditional Ethernet drivers instead of polling mode 
Ethernet drivers.

Though I'm not an expert on it, there is also a series of ways to optimize for 
latency, which hopefully some others could discuss... or maybe search the 
archives / web site / Intel tuning documentation.

Matthew.


* Re: [dpdk-dev] VMXNET3 on vmware, ping delay
  2015-06-25 15:18 ` Matthew Hall
@ 2015-06-25 15:46   ` Avi Kivity
  2015-06-25 16:37     ` Matthew Hall
  2015-06-25 18:44     ` Thomas Monjalon
  2015-06-25 20:56   ` Patel, Rashmin N
  1 sibling, 2 replies; 10+ messages in thread
From: Avi Kivity @ 2015-06-25 15:46 UTC (permalink / raw)
  To: Matthew Hall, Vass, Sandor (Nokia - HU/Budapest); +Cc: dev



On 06/25/2015 06:18 PM, Matthew Hall wrote:
> On Thu, Jun 25, 2015 at 09:14:53AM +0000, Vass, Sandor (Nokia - HU/Budapest) wrote:
>> According to my understanding each packet should go
>> through BR as fast as possible, but it seems that the rte_eth_rx_burst
>> retrieves packets only when there are at least 2 packets on the RX queue of
>> the NIC. At least most of the times as there are cases (rarely - according
>> to my console log) when it can retrieve 1 packet also and sometimes only 3
>> packets can be retrieved...
> By default DPDK is optimized for throughput not latency. Try a test with
> heavier traffic.
>
> There is also some work going on now for DPDK interrupt-driven mode, which
> will work more like traditional Ethernet drivers instead of polling mode
> Ethernet drivers.
>
> Though I'm not an expert on it, there is also a series of ways to optimize for
> latency, which hopefully some others could discuss... or maybe search the
> archives / web site / Intel tuning documentation.
>

What would be useful is a runtime switch between polling and interrupt 
modes. This way, if the load is low you use interrupts, and as
mitigation, you switch to poll mode until the load drops again.
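
Roughly like this, just a sketch of the idea; wait_for_rx_interrupt() below is
a hypothetical placeholder for whatever per-queue wakeup mechanism DPDK would
expose, not an existing call:

#include <stdint.h>

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32
#define IDLE_LIMIT 1000   /* empty polls tolerated before arming interrupts */

/* Hypothetical: arm the RX interrupt on (port, queue) and block until a
 * packet arrives.  Placeholder only, not part of today's DPDK API. */
void wait_for_rx_interrupt(uint8_t port, uint16_t queue);

static void
rx_loop(uint8_t port)
{
        struct rte_mbuf *bufs[BURST_SIZE];
        uint32_t idle = 0;
        uint16_t i;

        for (;;) {
                const uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs,
                                BURST_SIZE);

                if (nb_rx > 0) {
                        idle = 0;               /* loaded: stay in poll mode */
                        for (i = 0; i < nb_rx; i++)
                                rte_pktmbuf_free(bufs[i]);  /* app work here */
                } else if (++idle > IDLE_LIMIT) {
                        wait_for_rx_interrupt(port, 0); /* idle: sleep until traffic */
                        idle = 0;
                }
        }
}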


* Re: [dpdk-dev] VMXNET3 on vmware, ping delay
  2015-06-25 15:46   ` Avi Kivity
@ 2015-06-25 16:37     ` Matthew Hall
  2015-06-25 18:44     ` Thomas Monjalon
  1 sibling, 0 replies; 10+ messages in thread
From: Matthew Hall @ 2015-06-25 16:37 UTC (permalink / raw)
  To: Avi Kivity; +Cc: dev

On Thu, Jun 25, 2015 at 06:46:30PM +0300, Avi Kivity wrote:
> What would be useful is a runtime switch between polling and interrupt
> modes. This way, if the load is low you use interrupts, and as mitigation,
> you switch to poll mode until the load drops again.

Yes... I believe this is part of the plan. Though obviously I didn't work on
it personally; I am still using the classic simple modes until I get my app to
a feature-complete level first.

In addition the *power* examples use adaptive polling to reduce CPU load to 
fit the current traffic profile.
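
The rough idea there (not the actual l3fwd-power code, just a sketch of the
pattern) is to poll flat out while packets are arriving and back off with
progressively longer sleeps while the queue stays empty:

#include <stdint.h>
#include <unistd.h>

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE   32
#define MAX_SLEEP_US 100   /* cap so the worst-case added latency stays small */

static void
adaptive_rx_loop(uint8_t port)
{
        struct rte_mbuf *bufs[BURST_SIZE];
        uint32_t sleep_us = 0;
        uint16_t i;

        for (;;) {
                const uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs,
                                BURST_SIZE);

                if (nb_rx > 0) {
                        sleep_us = 0;           /* traffic: poll at full speed */
                        for (i = 0; i < nb_rx; i++)
                                rte_pktmbuf_free(bufs[i]);  /* app work here */
                } else {
                        if (sleep_us < MAX_SLEEP_US)
                                sleep_us++;     /* idle: back off gradually */
                        usleep(sleep_us);       /* yields the CPU, unlike rte_delay_us() */
                }
        }
}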

Matthew.


* Re: [dpdk-dev] VMXNET3 on vmware, ping delay
  2015-06-25 15:46   ` Avi Kivity
  2015-06-25 16:37     ` Matthew Hall
@ 2015-06-25 18:44     ` Thomas Monjalon
  2015-06-25 18:54       ` Matthew Hall
  2015-06-25 19:20       ` Avi Kivity
  1 sibling, 2 replies; 10+ messages in thread
From: Thomas Monjalon @ 2015-06-25 18:44 UTC (permalink / raw)
  To: Avi Kivity; +Cc: dev

2015-06-25 18:46, Avi Kivity:
> On 06/25/2015 06:18 PM, Matthew Hall wrote:
> > On Thu, Jun 25, 2015 at 09:14:53AM +0000, Vass, Sandor (Nokia - HU/Budapest) wrote:
> >> According to my understanding each packet should go
> >> through BR as fast as possible, but it seems that the rte_eth_rx_burst
> >> retrieves packets only when there are at least 2 packets on the RX queue of
> >> the NIC. At least most of the times as there are cases (rarely - according
> >> to my console log) when it can retrieve 1 packet also and sometimes only 3
> >> packets can be retrieved...
> > By default DPDK is optimized for throughput not latency. Try a test with
> > heavier traffic.
> >
> > There is also some work going on now for DPDK interrupt-driven mode, which
> > will work more like traditional Ethernet drivers instead of polling mode
> > Ethernet drivers.
> >
> > Though I'm not an expert on it, there is also a series of ways to optimize for
> > latency, which hopefully some others could discuss... or maybe search the
> > archives / web site / Intel tuning documentation.
> >
> 
> What would be useful is a runtime switch between polling and interrupt 
> modes. This way, if the load is low you use interrupts, and as
> mitigation, you switch to poll mode until the load drops again.

DPDK is not a stack. It's up to the DPDK application to poll or use interrupts
when needed.


* Re: [dpdk-dev] VMXNET3 on vmware, ping delay
  2015-06-25 18:44     ` Thomas Monjalon
@ 2015-06-25 18:54       ` Matthew Hall
  2015-06-25 19:20       ` Avi Kivity
  1 sibling, 0 replies; 10+ messages in thread
From: Matthew Hall @ 2015-06-25 18:54 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Thu, Jun 25, 2015 at 08:44:51PM +0200, Thomas Monjalon wrote:
> DPDK is not a stack.

Hi Thomas,

Don't worry too much about that challenge.

When I get my app feature complete, I think we can change that.

Same for Avi and the server frameworks they are making at Cloudius. ;)

Matthew.


* Re: [dpdk-dev] VMXNET3 on vmware, ping delay
  2015-06-25 18:44     ` Thomas Monjalon
  2015-06-25 18:54       ` Matthew Hall
@ 2015-06-25 19:20       ` Avi Kivity
  1 sibling, 0 replies; 10+ messages in thread
From: Avi Kivity @ 2015-06-25 19:20 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On 06/25/2015 09:44 PM, Thomas Monjalon wrote:
> 2015-06-25 18:46, Avi Kivity:
>> On 06/25/2015 06:18 PM, Matthew Hall wrote:
>>> On Thu, Jun 25, 2015 at 09:14:53AM +0000, Vass, Sandor (Nokia - HU/Budapest) wrote:
>>>> According to my understanding each packet should go
>>>> through BR as fast as possible, but it seems that the rte_eth_rx_burst
>>>> retrieves packets only when there are at least 2 packets on the RX queue of
>>>> the NIC. At least most of the times as there are cases (rarely - according
>>>> to my console log) when it can retrieve 1 packet also and sometimes only 3
>>>> packets can be retrieved...
>>> By default DPDK is optimized for throughput not latency. Try a test with
>>> heavier traffic.
>>>
>>> There is also some work going on now for DPDK interrupt-driven mode, which
>>> will work more like traditional Ethernet drivers instead of polling mode
>>> Ethernet drivers.
>>>
>>> Though I'm not an expert on it, there is also a series of ways to optimize for
>>> latency, which hopefully some others could discuss... or maybe search the
>>> archives / web site / Intel tuning documentation.
>>>
>> What would be useful is a runtime switch between polling and interrupt
>> modes. This way, if the load is low you use interrupts, and as
>> mitigation, you switch to poll mode until the load drops again.
> DPDK is not a stack. It's up to the DPDK application to poll or use interrupts
> when needed.

As long as DPDK provides a mechanism for a runtime switch, the 
application can do that.


* Re: [dpdk-dev] VMXNET3 on vmware, ping delay
  2015-06-25 15:18 ` Matthew Hall
  2015-06-25 15:46   ` Avi Kivity
@ 2015-06-25 20:56   ` Patel, Rashmin N
  2015-06-25 21:13     ` Vass, Sandor (Nokia - HU/Budapest)
  1 sibling, 1 reply; 10+ messages in thread
From: Patel, Rashmin N @ 2015-06-25 20:56 UTC (permalink / raw)
  To: Matthew Hall, Vass, Sandor (Nokia - HU/Budapest); +Cc: dev

For tuning ESXi and the vSwitch for latency-sensitive workloads, there is the following paper published by VMware that you can try out: https://www.vmware.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-Workloads.pdf

In this setup (VMware and a DPDK VM using VMXNET3), the overall latency lies in the vmware-native-driver/vmkernel/vmxnet3-backend/vmx-emulation threads in ESXi. So you can tune ESXi better (as explained in the above white paper) and/or make sure these important threads are not starved, which improves latency and, in some cases of this setup, throughput.

Thanks,
Rashmin

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matthew Hall
Sent: Thursday, June 25, 2015 8:19 AM
To: Vass, Sandor (Nokia - HU/Budapest)
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] VMXNET3 on vmware, ping delay

On Thu, Jun 25, 2015 at 09:14:53AM +0000, Vass, Sandor (Nokia - HU/Budapest) wrote:
> According to my understanding each packet should go through BR as fast 
> as possible, but it seems that the rte_eth_rx_burst retrieves packets 
> only when there are at least 2 packets on the RX queue of the NIC. At 
> least most of the times as there are cases (rarely - according to my 
> console log) when it can retrieve 1 packet also and sometimes only 3 
> packets can be retrieved...

By default DPDK is optimized for throughput not latency. Try a test with heavier traffic.

There is also some work going on now for DPDK interrupt-driven mode, which will work more like traditional Ethernet drivers instead of polling mode Ethernet drivers.

Though I'm not an expert on it, there is also a series of ways to optimize for latency, which hopefully some others could discuss... or maybe search the archives / web site / Intel tuning documentation.

Matthew.


* Re: [dpdk-dev] VMXNET3 on vmware, ping delay
  2015-06-25 20:56   ` Patel, Rashmin N
@ 2015-06-25 21:13     ` Vass, Sandor (Nokia - HU/Budapest)
  2015-06-25 22:36       ` Matthew Hall
  0 siblings, 1 reply; 10+ messages in thread
From: Vass, Sandor (Nokia - HU/Budapest) @ 2015-06-25 21:13 UTC (permalink / raw)
  To: ext Patel, Rashmin N, Matthew Hall; +Cc: dev

It seems I have found the cause, but I still don't understand the reason.
So, let me describe my setup a bit further. I installed VMware Workstation on my laptop. It has a mobile i5 CPU: 2 cores with hyperthreading, so basically 4 logical cores.
In VMware I assigned 1 CPU with one core to the C1 and C2 nodes; BR has one CPU with 4 cores allocated (the maximum possible value).

If I execute 'basicfwd' or the multi-process master (and two clients) on any of the cores [2,3,4], then the ping reply is received immediately (less than 0.5ms) and the transfer speed is immediately high (starting from ~30MB/s and finishing at around 80-90MB/s with both basicfwd and testpmd *).

If I allocate them on core 1 (the clients on any other cores), then the ping behaves as I originally described: 1 sec delays. When I tried to transfer a bigger file (I used scp), it started really slowly (some 16-32KB/s) and sometimes even stalled. Later on it got faster, as Matthew wrote, but it didn't go above 20-30MB/s.

testpmd worked from the start.
This is because testpmd requires 2 cores to be specified and I always passed '-c 3'. Checking with top, it could be seen that it always used CPU #2 (top showed the second CPU at 100% utilization).

Can anyone tell me the reason for this behavior? Using CPU 1 there are huge latencies; using the other CPUs everything works as expected...
Checking on the laptop (Windows Task Manager) it could be seen that none of the VMs was utilizing any one host CPU to 100%. The DPDK processes' 100% utilization was somehow distributed amongst the physical CPU cores, so no single core was used exclusively by a VM. Why is the situation different when I use the first CPU on BR rather than the others? It doesn't seem that C1 and C2 are blocking that CPU. Anyway, the host OS already uses all the cores (though not heavily).


Rashmin, thanks for the docs. I think I already saw that one but I didn't take it too seriously. I thought latency tuning in VMware ESXi only makes sense when one wants to go from 5ms to 0.5ms, but I had 1000ms latency at low load... I will check whether those params apply to Workstation at all.


*) The top speed of the multi-process master-client example was around 20-30 MB/s, reached immediately. I think this is a normal limitation because the processes have to talk to each other through shared memory, so it is slower anyway. I didn't test its speed when the master process was bound to core 1.

Sandor

-----Original Message-----
From: ext Patel, Rashmin N [mailto:rashmin.n.patel@intel.com] 
Sent: Thursday, June 25, 2015 10:56 PM
To: Matthew Hall; Vass, Sandor (Nokia - HU/Budapest)
Cc: dev@dpdk.org
Subject: RE: [dpdk-dev] VMXNET3 on vmware, ping delay

For tuning ESXi and the vSwitch for latency-sensitive workloads, there is the following paper published by VMware that you can try out: https://www.vmware.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-Workloads.pdf

In this setup (VMware and a DPDK VM using VMXNET3), the overall latency lies in the vmware-native-driver/vmkernel/vmxnet3-backend/vmx-emulation threads in ESXi. So you can tune ESXi better (as explained in the above white paper) and/or make sure these important threads are not starved, which improves latency and, in some cases of this setup, throughput.

Thanks,
Rashmin

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matthew Hall
Sent: Thursday, June 25, 2015 8:19 AM
To: Vass, Sandor (Nokia - HU/Budapest)
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] VMXNET3 on vmware, ping delay

On Thu, Jun 25, 2015 at 09:14:53AM +0000, Vass, Sandor (Nokia - HU/Budapest) wrote:
> According to my understanding each packet should go through BR as fast 
> as possible, but it seems that the rte_eth_rx_burst retrieves packets 
> only when there are at least 2 packets on the RX queue of the NIC. At 
> least most of the times as there are cases (rarely - according to my 
> console log) when it can retrieve 1 packet also and sometimes only 3 
> packets can be retrieved...

By default DPDK is optimized for throughput not latency. Try a test with heavier traffic.

There is also some work going on now for DPDK interrupt-driven mode, which will work more like traditional Ethernet drivers instead of polling mode Ethernet drivers.

Though I'm not an expert on it, there is also a series of ways to optimize for latency, which hopefully some others could discuss... or maybe search the archives / web site / Intel tuning documentation.

Matthew.


* Re: [dpdk-dev] VMXNET3 on vmware, ping delay
  2015-06-25 21:13     ` Vass, Sandor (Nokia - HU/Budapest)
@ 2015-06-25 22:36       ` Matthew Hall
  0 siblings, 0 replies; 10+ messages in thread
From: Matthew Hall @ 2015-06-25 22:36 UTC (permalink / raw)
  To: Vass, Sandor (Nokia - HU/Budapest); +Cc: dev

On Thu, Jun 25, 2015 at 09:13:59PM +0000, Vass, Sandor (Nokia - HU/Budapest) wrote:
> Can anyone tell me the reason of this behavior? Using CPU 1 there are huge 
> latencies, using other CPUs everything work as expected...

One possible guess at what could be related: normally DPDK uses "Core #0" as the
"master lcore".

That core's behavior is ever so slightly different from the other cores.
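
If that is what's happening, one way to test it is to keep the busy polling off
the master lcore entirely: launch the forwarding loop on a worker lcore and
leave lcore 0 for housekeeping. A rough sketch, assuming the usual EAL launch
calls (lcore_main stands in for the forwarding loop):

#include <rte_eal.h>
#include <rte_launch.h>
#include <rte_lcore.h>

/* Stand-in for the forwarding loop (e.g. the basicfwd-style loop). */
static int
lcore_main(void *arg)
{
        (void)arg;
        /* ... poll rte_eth_rx_burst()/rte_eth_tx_burst() here ... */
        return 0;
}

int
main(int argc, char **argv)
{
        unsigned lcore_id;

        if (rte_eal_init(argc, argv) < 0)
                return -1;

        /* Run the polling loop on the first slave (non-master) lcore so the
         * master lcore, usually core 0, is not spinning at 100%. */
        RTE_LCORE_FOREACH_SLAVE(lcore_id) {
                rte_eal_remote_launch(lcore_main, NULL, lcore_id);
                break;
        }

        rte_eal_mp_wait_lcore();
        return 0;
}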

Matthew.


end of thread

Thread overview: 10+ messages
2015-06-25  9:14 [dpdk-dev] VMXNET3 on vmware, ping delay Vass, Sandor (Nokia - HU/Budapest)
2015-06-25 15:18 ` Matthew Hall
2015-06-25 15:46   ` Avi Kivity
2015-06-25 16:37     ` Matthew Hall
2015-06-25 18:44     ` Thomas Monjalon
2015-06-25 18:54       ` Matthew Hall
2015-06-25 19:20       ` Avi Kivity
2015-06-25 20:56   ` Patel, Rashmin N
2015-06-25 21:13     ` Vass, Sandor (Nokia - HU/Budapest)
2015-06-25 22:36       ` Matthew Hall
