DPDK usage discussions
* [dpdk-users] KNI Threads/Cores
@ 2016-06-08 16:30 Cliff Burdick
  2016-06-08 16:45 ` Matt Laswell
  2016-06-08 19:31 ` Ferruh Yigit
  0 siblings, 2 replies; 7+ messages in thread
From: Cliff Burdick @ 2016-06-08 16:30 UTC (permalink / raw)
  To: users

Hi, I have an application running on a two-socket system where I plan to
transmit and receive a fairly large amount of traffic per core. Each core
currently handles a single queue of either TX or RX for a given port, and
across all the cores I may be processing up to 12 ports. I also need to
handle things like ARP and ping, so I'm going to add the KNI driver for
that. Since the amount of traffic I expect to forward to Linux is very
small, it seems like I should be able to dedicate one lcore per socket to
this functionality and have the dataplane cores pass the traffic off to
that core using rte_kni_tx_burst().

My first question is: is this possible at all? It seems like I can
configure the KNI driver to start in "single thread" mode. From there, I
want to initialize one KNI device for each port and have one kernel lcore
per processor handle that traffic. I believe that if I call rte_kni_alloc()
with core_id set to the kernel lcore for each device, I'll end up with
something like 6 KNI devices on socket 1 handled by lcore 0 and 6 KNI
devices on socket 2 handled by lcore 31, as an example. Then the threads
handling the dataplane TX/RX can simply be passed a pointer to their
respective rte_kni device. Does this sound correct?

Also, the sample application says the core affinity needs to be set using
taskset. Is that already taken care of by conf.core_id in rte_kni_alloc, or
do I still need to set it?
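
Roughly what I have in mind, as a sketch only (core and port numbers are
just examples, mempool and port setup are omitted, and I'm going from the
rte_kni_conf/rte_kni_alloc API as I read it in the docs):

#include <stdio.h>
#include <string.h>
#include <rte_kni.h>
#include <rte_mempool.h>

#define NB_PORTS 12

static struct rte_kni *kni_dev[NB_PORTS];

static void
setup_kni(struct rte_mempool *pool)
{
    unsigned port;

    rte_kni_init(NB_PORTS);

    for (port = 0; port < NB_PORTS; port++) {
        struct rte_kni_conf conf;

        memset(&conf, 0, sizeof(conf));
        snprintf(conf.name, RTE_KNI_NAMESIZE, "vEth%u", port);
        conf.mbuf_size = 2048;
        /* kernel-side work for ports 0-5 on lcore 0 (socket 1),
         * ports 6-11 on lcore 31 (socket 2) */
        conf.core_id = (port < 6) ? 0 : 31;

        kni_dev[port] = rte_kni_alloc(pool, &conf, NULL);
    }
}

Each dataplane thread would then just be handed a pointer to its port's
kni_dev[] entry and call rte_kni_tx_burst() on it.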

Thanks

* Re: [dpdk-users] KNI Threads/Cores
  2016-06-08 16:30 [dpdk-users] KNI Threads/Cores Cliff Burdick
@ 2016-06-08 16:45 ` Matt Laswell
  2016-06-08 19:31   ` Cliff Burdick
  2016-06-08 19:31 ` Ferruh Yigit
  1 sibling, 1 reply; 7+ messages in thread
From: Matt Laswell @ 2016-06-08 16:45 UTC (permalink / raw)
  To: Cliff Burdick; +Cc: users

Hey Cliff,

I have a similar use case in my application.  If you're willing to dedicate
an lcore per socket, another way to approach what you're describing is to
create a KNI interface thread that talks to the other cores via message
rings.  That is, the cores that are interacting with the NIC read a bunch
of packets, determine if any of them need to go to KNI and, if so, enqueue
them using rte_ring_enqueue().  They also do a periodic rte_ring_dequeue()
on another queue to accept back any packets that come back from KNI.

The KNI interface process, meanwhile, just loops along, taking packets in
from the NIC interface threads via rte_ring_dequeue() and sending them to
KNI, and taking packets from KNI and returning them to the NIC interface
threads via rte_ring_enqueue().

I've found that this sort of scheme works well, and is reasonably clean
architecturally.  Also, I found that calls into KNI can at times be very
slow.  In my application, I would periodically see KNI calls take 50-100K
cycles, which can cause congestion if you're handling large volumes of
traffic.  Letting a non-critical thread handle this interface was a big win
for me.

This leaves the kernel side processing out, of course.  But if the traffic
going to the kernel is lightweight, you likely don't need a dedicated core
for the kernel-side RX and TX work.
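
In rough, untested form (placeholder names, most error handling omitted,
using the ring and KNI calls as I remember them), the two sides look
something like this:

#include <errno.h>
#include <rte_ring.h>
#include <rte_kni.h>
#include <rte_mbuf.h>

#define KNI_BURST 32

/* NIC lcore side: a packet classified as "for the kernel" gets handed off. */
static void
nic_core_to_kni(struct rte_ring *to_kni, struct rte_mbuf *m)
{
    if (rte_ring_enqueue(to_kni, m) == -ENOBUFS)
        rte_pktmbuf_free(m);            /* ring full: drop rather than stall */
}

/* Dedicated KNI lcore: shuttle packets in both directions. */
static int
kni_lcore_loop(struct rte_kni *kni, struct rte_ring *to_kni,
               struct rte_ring *from_kni)
{
    struct rte_mbuf *pkts[KNI_BURST];
    unsigned nb, done;

    for (;;) {
        /* dataplane -> kernel */
        nb = rte_ring_dequeue_burst(to_kni, (void **)pkts, KNI_BURST);
        if (nb > 0) {
            done = rte_kni_tx_burst(kni, pkts, nb);
            while (done < nb)
                rte_pktmbuf_free(pkts[done++]);   /* KNI FIFO full */
        }

        /* kernel -> dataplane */
        nb = rte_kni_rx_burst(kni, pkts, KNI_BURST);
        if (nb > 0) {
            done = rte_ring_enqueue_burst(from_kni, (void **)pkts, nb);
            while (done < nb)
                rte_pktmbuf_free(pkts[done++]);   /* return ring full */
        }

        rte_kni_handle_request(kni);   /* service link up/down, MTU, etc. */
    }

    return 0;
}

The rte_kni_handle_request() call keeps kernel-side requests (interface
up/down, MTU changes) serviced from the same non-critical thread.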

--
Matt Laswell
Principal Software Engineer
infinite io

On Wed, Jun 8, 2016 at 11:30 AM, Cliff Burdick <shaklee3@gmail.com> wrote:

> Hi, I have an application with two sockets where each core I'm planning to
> transmit and receive a fairly large amount of traffic per core. Each core
> right now handles a single queue of either TX or RX of a given port. Across
> all the cores, I may be processing up to 12 ports. I also need to handle
> things like ARP and ping, so I'm going to add in the KNI driver to handle
> that. Since the amount of traffic I'm expecting that I'll need to forward
> to Linux is very small, it seems like I should be able to dedicate one
> lcore per socket to handle this functionality and have the dataplane cores
> pass the traffic off to this core using rte_kni_tx_burst().
>
> My question is, first of all, is this possible? It seems like I can
> configure the KNI driver to start in "single thread" mode. From that point,
> I want to initialize one KNI device for each port, and have each kernel
> lcore on each processor handle that traffic. I believe if I call
> rte_kni_alloc with core_id set to the kernel lcore for each device, then in
> the end I'll have something like 6 KNI devices on socket one being handled
> by lcore 0, and 6 KNI devices on socket 2 being handled by lcore 31 as an
> example. Then my threads that are handling the dataplane tx/rx can simply
> be passed a pointer to their respective rte_kni device. Does this sound
> correct?
>
> Also, the sample says the core affinity needs to be set using taskset. Is
> that already taken care of with conf.core_id in rte_kni_alloc or do I still
> need to set it?
>
> Thanks
>

* Re: [dpdk-users] KNI Threads/Cores
  2016-06-08 16:45 ` Matt Laswell
@ 2016-06-08 19:31   ` Cliff Burdick
  0 siblings, 0 replies; 7+ messages in thread
From: Cliff Burdick @ 2016-06-08 19:31 UTC (permalink / raw)
  To: Matt Laswell; +Cc: users

Thanks Matt! I will try that. It seems very clean.

On Wed, Jun 8, 2016 at 9:45 AM, Matt Laswell <laswell@infinite.io> wrote:

> Hey Cliff,
>
> I have a similar use case in my application.  If you're willing to
> dedicate an lcore per socket, another way to approach what you're
> describing is to create a KNI interface thread that talks to the other
> cores via message rings.  That is, the cores that are interacting with the
> NIC read a bunch of packets, determine if any of them need to go to KNI
> and, if so, enqueue them using rte_ring_enqueue().  They also do a periodic
> rte_ring_dequeue() on another queue to accept back any packets that come
> back from KNI.
>
> The KNI interface process, meanwhile, just loops along, taking packets in
> from the NIC interface threads via rte_ring_dequeue() and sending them to
> KNI, and taking packets from KNI and returning them to the NIC interface
> threads via rte_ring_enqueue().
>
> I've found that this sort of scheme works well, and is reasonably clean
> architecturally.  Also, I found that calls into KNI can at times be very
> slow.  In my application, I would periodically see KNI calls take 50-100K
> cycles, which can cause congestion if you're handling large volumes of
> traffic.  Letting a non-critical thread handle this interface was a big win
> for me.
>
> This leaves the kernel side processing out, of course.  But if the traffic
> going to the kernel is lightweight, you likely don't need a dedicated core
> for the kernel-side RX and TX work.
>
> --
> Matt Laswell
> Principal Software Engineer
> infinite io
>
> On Wed, Jun 8, 2016 at 11:30 AM, Cliff Burdick <shaklee3@gmail.com> wrote:
>
>> Hi, I have an application with two sockets where each core I'm planning to
>> transmit and receive a fairly large amount of traffic per core. Each core
>> right now handles a single queue of either TX or RX of a given port.
>> Across
>> all the cores, I may be processing up to 12 ports. I also need to handle
>> things like ARP and ping, so I'm going to add in the KNI driver to handle
>> that. Since the amount of traffic I'm expecting that I'll need to forward
>> to Linux is very small, it seems like I should be able to dedicate one
>> lcore per socket to handle this functionality and have the dataplane cores
>> pass the traffic off to this core using rte_kni_tx_burst().
>>
>> My question is, first of all, is this possible? It seems like I can
>> configure the KNI driver to start in "single thread" mode. From that
>> point,
>> I want to initialize one KNI device for each port, and have each kernel
>> lcore on each processor handle that traffic. I believe if I call
>> rte_kni_alloc with core_id set to the kernel lcore for each device, then
>> in
>> the end I'll have something like 6 KNI devices on socket one being handled
>> by lcore 0, and 6 KNI devices on socket 2 being handled by lcore 31 as an
>> example. Then my threads that are handling the dataplane tx/rx can simply
>> be passed a pointer to their respective rte_kni device. Does this sound
>> correct?
>>
>> Also, the sample says the core affinity needs to be set using taskset. Is
>> that already taken care of with conf.core_id in rte_kni_alloc or do I
>> still
>> need to set it?
>>
>> Thanks
>>
>
>

* Re: [dpdk-users] KNI Threads/Cores
  2016-06-08 16:30 [dpdk-users] KNI Threads/Cores Cliff Burdick
  2016-06-08 16:45 ` Matt Laswell
@ 2016-06-08 19:31 ` Ferruh Yigit
  2016-06-08 19:48   ` Cliff Burdick
  1 sibling, 1 reply; 7+ messages in thread
From: Ferruh Yigit @ 2016-06-08 19:31 UTC (permalink / raw)
  To: Cliff Burdick, users

On 6/8/2016 5:30 PM, Cliff Burdick wrote:
> Hi, I have an application with two sockets where each core I'm planning to
> transmit and receive a fairly large amount of traffic per core. Each core
> right now handles a single queue of either TX or RX of a given port. Across
> all the cores, I may be processing up to 12 ports. I also need to handle
> things like ARP and ping, so I'm going to add in the KNI driver to handle
> that. Since the amount of traffic I'm expecting that I'll need to forward
> to Linux is very small, it seems like I should be able to dedicate one
> lcore per socket to handle this functionality and have the dataplane cores
> pass the traffic off to this core using rte_kni_tx_burst().
> 
> My question is, first of all, is this possible? It seems like I can
> configure the KNI driver to start in "single thread" mode. From that point,
> I want to initialize one KNI device for each port, and have each kernel
> lcore on each processor handle that traffic. I believe if I call
> rte_kni_alloc with core_id set to the kernel lcore for each device, then in
> the end I'll have something like 6 KNI devices on socket one being handled
> by lcore 0, and 6 KNI devices on socket 2 being handled by lcore 31 as an
> example. Then my threads that are handling the dataplane tx/rx can simply
> be passed a pointer to their respective rte_kni device. Does this sound
> correct?

If the rte_kni module is used in "single thread" mode, the kernel core_id
is not used at all. In single-thread mode a single kernel thread is
created; it serves all KNI devices and cannot be pinned to any specific
lcore.

For what you have described, you first need to insert the module with the
kthread_mode=multiple parameter. This creates one kernel thread per KNI
interface. I guess it is possible to provide the same
rte_kni_conf->core_id for several of them, and yes, rte_kni_conf->force_bind
is required, otherwise core_id has no effect. Following your example, the
first 6 KNI devices would have core_id set to 0 and the other 6 KNI devices
core_id set to 31, all with force_bind set. This will create 12 kernel
threads and bind 6 of them to core 0 and the other 6 to core 31.
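
Untested, but roughly like below (core ids and names only as example):

/* module inserted with one kernel thread per KNI interface:
 *     insmod rte_kni.ko kthread_mode=multiple
 */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <rte_kni.h>
#include <rte_mempool.h>

static struct rte_kni *
kni_alloc_bound(struct rte_mempool *pool, unsigned port_id,
                uint32_t kernel_core)
{
    struct rte_kni_conf conf;

    memset(&conf, 0, sizeof(conf));
    snprintf(conf.name, RTE_KNI_NAMESIZE, "vEth%u", port_id);
    conf.mbuf_size = 2048;
    conf.core_id = kernel_core;    /* 0 for first 6 ports, 31 for other 6 */
    conf.force_bind = 1;           /* without force_bind, core_id has no effect */

    return rte_kni_alloc(pool, &conf, NULL);
}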

> 
> Also, the sample says the core affinity needs to be set using taskset. Is
> that already taken care of with conf.core_id in rte_kni_alloc or do I still
> need to set it?
> 
> Thanks
> 

* Re: [dpdk-users] KNI Threads/Cores
  2016-06-08 19:31 ` Ferruh Yigit
@ 2016-06-08 19:48   ` Cliff Burdick
  2016-06-08 19:49     ` Cliff Burdick
  2016-06-09 16:29     ` Ferruh Yigit
  0 siblings, 2 replies; 7+ messages in thread
From: Cliff Burdick @ 2016-06-08 19:48 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: users

Hi Yigit, when you say "I guess it's possible", is that not common? I would
think that the amount of traffic people want to forward to Linux in normal
DPDK applications would be quite small. If you had to dedicate a core to
KNI for each interface, that would get very wasteful.

On Wed, Jun 8, 2016 at 12:31 PM, Ferruh Yigit <ferruh.yigit@intel.com>
wrote:

> On 6/8/2016 5:30 PM, Cliff Burdick wrote:
> > Hi, I have an application with two sockets where each core I'm planning
> to
> > transmit and receive a fairly large amount of traffic per core. Each core
> > right now handles a single queue of either TX or RX of a given port.
> Across
> > all the cores, I may be processing up to 12 ports. I also need to handle
> > things like ARP and ping, so I'm going to add in the KNI driver to handle
> > that. Since the amount of traffic I'm expecting that I'll need to forward
> > to Linux is very small, it seems like I should be able to dedicate one
> > lcore per socket to handle this functionality and have the dataplane
> cores
> > pass the traffic off to this core using rte_kni_tx_burst().
> >
> > My question is, first of all, is this possible? It seems like I can
> > configure the KNI driver to start in "single thread" mode. From that
> point,
> > I want to initialize one KNI device for each port, and have each kernel
> > lcore on each processor handle that traffic. I believe if I call
> > rte_kni_alloc with core_id set to the kernel lcore for each device, then
> in
> > the end I'll have something like 6 KNI devices on socket one being
> handled
> > by lcore 0, and 6 KNI devices on socket 2 being handled by lcore 31 as an
> > example. Then my threads that are handling the dataplane tx/rx can simply
> > be passed a pointer to their respective rte_kni device. Does this sound
> > correct?
>
> If the rte_kni module is used in "single thread" mode, the kernel core_id
> is not used at all. In single-thread mode a single kernel thread is
> created; it serves all KNI devices and cannot be pinned to any specific
> lcore.
>
> For what you have described, you first need to insert the module with the
> kthread_mode=multiple parameter. This creates one kernel thread per KNI
> interface. I guess it is possible to provide the same
> rte_kni_conf->core_id for several of them, and yes, rte_kni_conf->force_bind
> is required, otherwise core_id has no effect. Following your example, the
> first 6 KNI devices would have core_id set to 0 and the other 6 KNI devices
> core_id set to 31, all with force_bind set. This will create 12 kernel
> threads and bind 6 of them to core 0 and the other 6 to core 31.
>
> >
> > Also, the sample says the core affinity needs to be set using taskset. Is
> > that already taken care of with conf.core_id in rte_kni_alloc or do I
> still
> > need to set it?
> >
> > Thanks
> >
>
>

* Re: [dpdk-users] KNI Threads/Cores
  2016-06-08 19:48   ` Cliff Burdick
@ 2016-06-08 19:49     ` Cliff Burdick
  2016-06-09 16:29     ` Ferruh Yigit
  1 sibling, 0 replies; 7+ messages in thread
From: Cliff Burdick @ 2016-06-08 19:49 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: users

Ferruh, sorry.

On Wed, Jun 8, 2016 at 12:48 PM, Cliff Burdick <shaklee3@gmail.com> wrote:

> Hi Yigit, when you say "I guess it's possible" is that not common? I would
> think that the amount of traffic people would want to forward to Linux for
> normal DPDK applications would be quite small. If you had to have a core
> dedicated for KNI on each interface that would get very wasteful.
>
> On Wed, Jun 8, 2016 at 12:31 PM, Ferruh Yigit <ferruh.yigit@intel.com>
> wrote:
>
>> On 6/8/2016 5:30 PM, Cliff Burdick wrote:
>> > Hi, I have an application with two sockets where each core I'm planning
>> to
>> > transmit and receive a fairly large amount of traffic per core. Each
>> core
>> > right now handles a single queue of either TX or RX of a given port.
>> Across
>> > all the cores, I may be processing up to 12 ports. I also need to handle
>> > things like ARP and ping, so I'm going to add in the KNI driver to
>> handle
>> > that. Since the amount of traffic I'm expecting that I'll need to
>> forward
>> > to Linux is very small, it seems like I should be able to dedicate one
>> > lcore per socket to handle this functionality and have the dataplane
>> cores
>> > pass the traffic off to this core using rte_kni_tx_burst().
>> >
>> > My question is, first of all, is this possible? It seems like I can
>> > configure the KNI driver to start in "single thread" mode. From that
>> point,
>> > I want to initialize one KNI device for each port, and have each kernel
>> > lcore on each processor handle that traffic. I believe if I call
>> > rte_kni_alloc with core_id set to the kernel lcore for each device,
>> then in
>> > the end I'll have something like 6 KNI devices on socket one being
>> handled
>> > by lcore 0, and 6 KNI devices on socket 2 being handled by lcore 31 as
>> an
>> > example. Then my threads that are handling the dataplane tx/rx can
>> simply
>> > be passed a pointer to their respective rte_kni device. Does this sound
>> > correct?
>>
>> If the rte_kni module is used in "single thread" mode, the kernel core_id
>> is not used at all. In single-thread mode a single kernel thread is
>> created; it serves all KNI devices and cannot be pinned to any specific
>> lcore.
>>
>> For what you have described, you first need to insert the module with the
>> kthread_mode=multiple parameter. This creates one kernel thread per KNI
>> interface. I guess it is possible to provide the same
>> rte_kni_conf->core_id for several of them, and yes, rte_kni_conf->force_bind
>> is required, otherwise core_id has no effect. Following your example, the
>> first 6 KNI devices would have core_id set to 0 and the other 6 KNI devices
>> core_id set to 31, all with force_bind set. This will create 12 kernel
>> threads and bind 6 of them to core 0 and the other 6 to core 31.
>>
>> >
>> > Also, the sample says the core affinity needs to be set using taskset.
>> Is
>> > that already taken care of with conf.core_id in rte_kni_alloc or do I
>> still
>> > need to set it?
>> >
>> > Thanks
>> >
>>
>>
>

* Re: [dpdk-users] KNI Threads/Cores
  2016-06-08 19:48   ` Cliff Burdick
  2016-06-08 19:49     ` Cliff Burdick
@ 2016-06-09 16:29     ` Ferruh Yigit
  1 sibling, 0 replies; 7+ messages in thread
From: Ferruh Yigit @ 2016-06-09 16:29 UTC (permalink / raw)
  To: Cliff Burdick; +Cc: users

On 6/8/2016 8:48 PM, Cliff Burdick wrote:
> Hi Yigit, when you say "I guess it's possible" is that not common? I
> would think that the amount of traffic people would want to forward to
> Linux for normal DPDK applications would be quite small. If you had to
> have a core dedicated for KNI on each interface that would get very
> wasteful.
> 

Hi Cliff,

I am not aware of any similar usage and I haven't tested it myself, but
according to the source code, it should work as described for your use case.

Regards,
ferruh

Thread overview: 7+ messages
2016-06-08 16:30 [dpdk-users] KNI Threads/Cores Cliff Burdick
2016-06-08 16:45 ` Matt Laswell
2016-06-08 19:31   ` Cliff Burdick
2016-06-08 19:31 ` Ferruh Yigit
2016-06-08 19:48   ` Cliff Burdick
2016-06-08 19:49     ` Cliff Burdick
2016-06-09 16:29     ` Ferruh Yigit
