DPDK patches and discussions
* [dpdk-dev] DCA
@ 2015-04-20 10:07 Vlad Zolotarov
  2015-04-20 10:50 ` Bruce Richardson
  0 siblings, 1 reply; 7+ messages in thread
From: Vlad Zolotarov @ 2015-04-20 10:07 UTC (permalink / raw)
  To: dev

Hi,
I would like to ask if there is any reason why DPDK doesn't have support
for the DCA feature?

thanks,
vlad


* Re: [dpdk-dev] DCA
  2015-04-20 10:07 [dpdk-dev] DCA Vlad Zolotarov
@ 2015-04-20 10:50 ` Bruce Richardson
  2015-04-21  8:51   ` Vlad Zolotarov
  0 siblings, 1 reply; 7+ messages in thread
From: Bruce Richardson @ 2015-04-20 10:50 UTC (permalink / raw)
  To: Vlad Zolotarov; +Cc: dev

On Mon, Apr 20, 2015 at 01:07:59PM +0300, Vlad Zolotarov wrote:
> Hi,
> I would like to ask if there is any reason why DPDK doesn't have support for
> the DCA feature?
> 
> thanks,
> vlad

On modern platforms with DDIO, the data written by the NIC automatically goes
into the CPU's cache without us needing to use DCA.

/Bruce


* Re: [dpdk-dev] DCA
  2015-04-20 10:50 ` Bruce Richardson
@ 2015-04-21  8:51   ` Vlad Zolotarov
  2015-04-21  9:27     ` Bruce Richardson
  0 siblings, 1 reply; 7+ messages in thread
From: Vlad Zolotarov @ 2015-04-21  8:51 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev



On 04/20/15 13:50, Bruce Richardson wrote:
> On Mon, Apr 20, 2015 at 01:07:59PM +0300, Vlad Zolotarov wrote:
>> Hi,
>> I would like to ask if there is any reason why DPDK doesn't have support for
>> the DCA feature?
>>
>> thanks,
>> vlad
> On modern platforms with DDIO, the data written by the NIC automatically goes
> into the CPU's cache without us needing to use DCA.
Thanks for the reply, Bruce.
One question though. According to the DDIO documentation it only affects
the CPUs "local" to the NIC. DCA, on the other hand, may be configured to
work with any CPU. Modern platforms usually have a few NUMA nodes, and the
requirement to bind network handling threads only to CPUs "local" to the
NIC is very limiting.

Could you please comment on this?

thanks in advance,
vlad

>
> /Bruce


* Re: [dpdk-dev] DCA
  2015-04-21  8:51   ` Vlad Zolotarov
@ 2015-04-21  9:27     ` Bruce Richardson
  2015-04-21  9:47       ` Vlad Zolotarov
  2015-04-21 17:44       ` Matthew Hall
  0 siblings, 2 replies; 7+ messages in thread
From: Bruce Richardson @ 2015-04-21  9:27 UTC (permalink / raw)
  To: Vlad Zolotarov; +Cc: dev

On Tue, Apr 21, 2015 at 11:51:40AM +0300, Vlad Zolotarov wrote:
> 
> 
> On 04/20/15 13:50, Bruce Richardson wrote:
> >On Mon, Apr 20, 2015 at 01:07:59PM +0300, Vlad Zolotarov wrote:
> >>Hi,
> >>I would like to ask if there is any reason why DPDK doesn't have support for
> >>the DCA feature?
> >>
> >>thanks,
> >>vlad
> >On modern platforms with DDIO, the data written by the NIC automatically goes
> >into the CPU's cache without us needing to use DCA.
> Thanks for the reply, Bruce.
> One question though. According to the DDIO documentation it only affects
> the CPUs "local" to the NIC. DCA, on the other hand, may be configured to
> work with any CPU. Modern platforms usually have a few NUMA nodes, and the
> requirement to bind network handling threads only to CPUs "local" to the
> NIC is very limiting.
>
> Could you please comment on this?
> 
> thanks in advance,
> vlad
>
My main comment is that yes, you are correct. DDIO only works with the local
socket, while DCA can be made to work with remote sockets. If you need to do
polling on a device from a remote socket you may need to look at DCA.

Can you perhaps comment on the use case where you find this binding limiting?
Modern platforms have multiple NUMA nodes, but they generally also have PCI
slots connected to those NUMA nodes, so can't you have your NIC ports
similarly NUMA-partitioned?
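
For illustration, a minimal and untested sketch of how an application might
check whether the lcore that polls a port sits on the NIC's own socket, using
rte_eth_dev_socket_id() and rte_lcore_to_socket_id(), could look like this:

#include <stdio.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>

/* Warn when the lcore that will poll a port sits on a different NUMA
 * socket than the NIC itself; with DDIO the local-socket case is the
 * one that benefits, the remote case is where DCA would have helped. */
static void
check_port_locality(unsigned port_id, unsigned lcore_id)
{
        int port_socket = rte_eth_dev_socket_id(port_id);    /* NUMA node of the NIC */
        int lcore_socket = rte_lcore_to_socket_id(lcore_id); /* NUMA node of the core */

        if (port_socket < 0)
                printf("port %u: NUMA node unknown\n", port_id);
        else if (port_socket != lcore_socket)
                printf("port %u (socket %d) polled from remote lcore %u (socket %d)\n",
                       port_id, port_socket, lcore_id, lcore_socket);
}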

/Bruce


* Re: [dpdk-dev] DCA
  2015-04-21  9:27     ` Bruce Richardson
@ 2015-04-21  9:47       ` Vlad Zolotarov
  2015-04-21 17:44       ` Matthew Hall
  1 sibling, 0 replies; 7+ messages in thread
From: Vlad Zolotarov @ 2015-04-21  9:47 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev



On 04/21/15 12:27, Bruce Richardson wrote:
> On Tue, Apr 21, 2015 at 11:51:40AM +0300, Vlad Zolotarov wrote:
>>
>> On 04/20/15 13:50, Bruce Richardson wrote:
>>> On Mon, Apr 20, 2015 at 01:07:59PM +0300, Vlad Zolotarov wrote:
>>>> Hi,
>>>> I would like to ask if there is any reason why DPDK doesn't have support for
>>>> the DCA feature?
>>>>
>>>> thanks,
>>>> vlad
>>> On modern platforms with DDIO, the data written by the NIC automatically goes
>>> into the CPU's cache without us needing to use DCA.
>> Thanks for the reply, Bruce.
>> One question though. According to the DDIO documentation it only affects
>> the CPUs "local" to the NIC. DCA, on the other hand, may be configured to
>> work with any CPU. Modern platforms usually have a few NUMA nodes, and the
>> requirement to bind network handling threads only to CPUs "local" to the
>> NIC is very limiting.
>>
>> Could you please comment on this?
>>
>> thanks in advance,
>> vlad
>>
> My main comment is that yes, you are correct. DDIO only works with the local
> socket, while DCA can be made to work with remote sockets. If you need to do
> polling on a device from a remote socket you may need to look at DCA.
>
> Can you perhaps comment on the use case where you find this binding limiting?
> Modern platforms have multiple NUMA nodes, but they generally also have PCI
> slots connected to those NUMA nodes, so can't you have your NIC ports
> similarly NUMA-partitioned?

The immediate example where this could be problematic is an AWS guest with
Enhanced Networking: on a c3.8xlarge instance you get 2 NUMA nodes, 32 CPU
cores, and you can bind as many Intel 82599 VFs as you need, each providing
4 Rx and 4 Tx queues. AFAIR nothing is promised about the locality of the
PFs the VFs belong to. To utilize all CPUs we'll need 4 or 8 VFs depending
on the queue layout we decide on (a separate CPU for each queue, or a
separate CPU for each Rx + Tx queue pair). In this case you may get
completely different NUMA layouts:
     - all VFs reside on the same PF: half of the queues will be remote to
one of the NUMA nodes, or all of them are remote to all CPUs.
     - VFs come from two PFs, which may or may not reside on the same NUMA
nodes as the CPUs...
     - VFs come from more than two different PFs...

So, in the above example DCA would cover all our needs, while DDIO won't be
able to cover them in most cases.
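
Just to make the layout decision concrete, here is a rough, untested sketch of
the kind of NUMA-aware port-to-lcore assignment described above; note that
pick_lcore_on_socket() is a hypothetical helper, not a DPDK API:

#include <rte_ethdev.h>
#include <rte_lcore.h>

/* Hypothetical helper: returns a free lcore on the given socket, or any
 * free lcore when socket is -1 or the local socket has none left. */
unsigned pick_lcore_on_socket(int socket);

static void
assign_ports_to_lcores(void)
{
        unsigned port;

        for (port = 0; port < rte_eth_dev_count(); port++) {
                int socket = rte_eth_dev_socket_id(port); /* -1 when locality is unknown */
                unsigned lcore = pick_lcore_on_socket(socket);

                /* ... remember the (port, lcore) pair for the polling loop ... */
                (void)lcore;
        }
}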

>
> /Bruce


* Re: [dpdk-dev] DCA
  2015-04-21  9:27     ` Bruce Richardson
  2015-04-21  9:47       ` Vlad Zolotarov
@ 2015-04-21 17:44       ` Matthew Hall
  2015-04-22  9:10         ` Bruce Richardson
  1 sibling, 1 reply; 7+ messages in thread
From: Matthew Hall @ 2015-04-21 17:44 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Tue, Apr 21, 2015 at 10:27:48AM +0100, Bruce Richardson wrote:
> Can you perhaps comment on the use case where you find this binding limiting?
> Modern platforms have multiple NUMA nodes, but they generally also have PCI
> slots connected to those NUMA nodes, so can't you have your NIC ports
> similarly NUMA-partitioned?

Hi Bruce,

I was wondering if you have tried to do this on COTS (commercial
off-the-shelf) hardware before. What I found each time I tried it was that
PCIe slots are not very evenly distributed across the NUMA nodes, unlike what
you'd expect.

Sometimes the PCIe lanes on CPU 0 get partly used up by Super IO or other
integrated peripherals. Other times the motherboards give you 2 x8 slots when
you needed 1 x16, or they give you a bunch of x4 slots when you needed x8, etc.

It's actually pretty difficult to find the mapping in the first place, and
even when you do, it's even harder to get the right slots for your cards and
so on. In the ixgbe kernel driver you'll sometimes get some cryptic debug
prints when it's been munged and performance will suffer. But in the ixgbe
PMD driver you're mostly on your own.
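
For what it's worth, one way to dig the mapping out by hand is the numa_node
attribute the kernel exposes in sysfs; an untested sketch (the BDF string
passed in is just an example, not tied to any particular board):

#include <stdio.h>

/* Read the NUMA node the kernel reports for a PCI device given its BDF,
 * e.g. "0000:01:00.0"; returns -1 when the attribute is missing or the
 * node is unknown. */
static int
pci_numa_node(const char *bdf)
{
        char path[128];
        int node = -1;
        FILE *f;

        snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/numa_node", bdf);
        f = fopen(path, "r");
        if (f == NULL)
                return -1;
        if (fscanf(f, "%d", &node) != 1)
                node = -1;
        fclose(f);
        return node;
}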

Matthew.


* Re: [dpdk-dev] DCA
  2015-04-21 17:44       ` Matthew Hall
@ 2015-04-22  9:10         ` Bruce Richardson
  0 siblings, 0 replies; 7+ messages in thread
From: Bruce Richardson @ 2015-04-22  9:10 UTC (permalink / raw)
  To: Matthew Hall; +Cc: dev

On Tue, Apr 21, 2015 at 10:44:54AM -0700, Matthew Hall wrote:
> On Tue, Apr 21, 2015 at 10:27:48AM +0100, Bruce Richardson wrote:
> > Can you perhaps comment on the use case where you find this binding limiting?
> > Modern platforms have multiple NUMA nodes, but they generally also have PCI
> > slots connected to those NUMA nodes, so can't you have your NIC ports
> > similarly NUMA-partitioned?
> 
> Hi Bruce,
> 
> I was wondering if you have tried to do this on COTS (commercial
> off-the-shelf) hardware before. What I found each time I tried it was that
> PCIe slots are not very evenly distributed across the NUMA nodes, unlike what
> you'd expect.
> 

I doubt I've tried it on regular commercial boards as much as you guys have,
though it does happen!

> Sometimes the PCIe lanes on CPU 0 get partly used up by Super IO or other
> integrated peripherals. Other times the motherboards give you 2 x8 slots when
> you needed 1 x16, or they give you a bunch of x4 slots when you needed x8, etc.

Point taken!

> 
> It's actually pretty difficult to find the mapping in the first place, and
> even when you do, it's even harder to get the right slots for your cards and
> so on. In the ixgbe kernel driver you'll sometimes get some cryptic debug
> prints when it's been munged and performance will suffer. But in the ixgbe
> PMD driver you're mostly on your own.

It was to try to make the NUMA mapping of PCI devices clearer that we added
the printing of the NUMA node on PCI scan:

EAL: PCI device 0000:86:00.0 on NUMA socket 1
EAL:   probe driver: 8086:154a rte_ixgbe_pmd
EAL:   PCI memory mapped at 0x7fb452f04000
EAL:   PCI memory mapped at 0x7fb453004000

Is there something more than this you feel we could do in the PMD to help with
slot identification?

/Bruce

> 
> Matthew.

