* [dpdk-dev] Recent changes related to interrupt thread
@ 2015-11-16 12:32 Rahul Lakkireddy
  2015-11-16 13:48 ` Thomas Monjalon
  0 siblings, 1 reply; 6+ messages in thread
From: Rahul Lakkireddy @ 2015-11-16 12:32 UTC (permalink / raw)
  To: David Marchand, dev; +Cc: Felix Marti, Kumar Sanghvi, Nirranjan Kirubaharan
Hi,
I notice that the following changeset:
Fixes: fd6949c55c9a ("eal: fix io permission for virtio interrupt
handler")
has moved the initialization of the interrupt thread to after the master
lcore has been initialized.  However, this causes the interrupt thread
to _inherit_ the affinity of the master lcore. Hence, this seems to
make all interrupts be handled by _only_ the master lcore. Because
of this change, it seems that alarm interrupts are now also handled
by the master lcore only, IIUC.
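The inheritance described above can be reproduced outside DPDK. The
following is a minimal sketch, assuming Linux with glibc; the function
names are made up for illustration and are not EAL code. It pins the
calling thread to one CPU, creates a thread with default attributes,
and reports whether the new thread ended up with the same mask.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Child thread: store its own CPU mask where the parent can read it. */
static void *record_affinity(void *arg)
{
	cpu_set_t *out = arg;

	assert(out != NULL);
	CPU_ZERO(out);
	pthread_getaffinity_np(pthread_self(), sizeof(*out), out);
	return NULL;
}

/*
 * Hypothetical helper: pin the caller to the first CPU it is allowed to
 * use, spawn a thread, then restore the caller's original mask.
 * Returns 1 if the child's mask equals the pinned mask, 0 if it differs,
 * -1 on error.
 */
int child_inherits_pinned_affinity(void)
{
	cpu_set_t orig, pinned, child;
	pthread_t tid;
	int cpu, ret;

	if (pthread_getaffinity_np(pthread_self(), sizeof(orig), &orig) != 0)
		return -1;

	/* Find the first CPU in the caller's current mask. */
	for (cpu = 0; cpu < CPU_SETSIZE && !CPU_ISSET(cpu, &orig); cpu++)
		;
	if (cpu == CPU_SETSIZE)
		return -1;

	CPU_ZERO(&pinned);
	CPU_SET(cpu, &pinned);
	if (pthread_setaffinity_np(pthread_self(), sizeof(pinned), &pinned) != 0)
		return -1;

	CPU_ZERO(&child);
	ret = pthread_create(&tid, NULL, record_affinity, &child);
	if (ret == 0)
		pthread_join(tid, NULL);

	/* Put the caller back on its original CPU set. */
	pthread_setaffinity_np(pthread_self(), sizeof(orig), &orig);

	if (ret != 0)
		return -1;
	return CPU_EQUAL(&pinned, &child) ? 1 : 0;
}
```

On Linux, a thread created by pthread_create() starts with a copy of
its creator's affinity mask, which is why the order of the master lcore
setup and the interrupt thread creation matters here.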
We are seeing a performance regression for the cxgbe PMD after this
commit, since the cxgbe PMD relies on an alarm to periodically transmit
pending coalesced packets.
Also, this perf degradation is only seen if there's a queue allocated
on the master lcore, such as in l3fwd app.  If the master lcore has
been skipped, then no degradation in perf is seen since only the alarm
will run on the master lcore.
So, is it intended that this change makes all interrupts, including
alarm interrupts, be handled by _only_ the master lcore?
BTW, I have tried setting the affinity to all cpus instead in
eal_intr_init() and this seems to restore the perf. Perhaps it's
better to move the master lcore initialization to after the interrupt
thread has been initialized as well? Thoughts?
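For reference, the "affinity to all cpus" workaround could look roughly
like the sketch below. This is an illustration under assumptions, not
the actual change tested: allow_all_cpus() is a hypothetical helper, and
where exactly it would be called from the EAL interrupt code is not
shown.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/*
 * Hypothetical helper: let thread `tid` run on every CPU.
 * All bits of the mask are set; the kernel intersects the request with
 * the CPUs that are actually present and permitted for the thread.
 * Returns 0 on success, or an errno value from pthread_setaffinity_np().
 */
int allow_all_cpus(pthread_t tid)
{
	cpu_set_t all;
	int cpu;

	CPU_ZERO(&all);
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		CPU_SET(cpu, &all);

	return pthread_setaffinity_np(tid, sizeof(all), &all);
}
```

Called on the interrupt thread right after it is created, something
like this would undo the mask inherited from the master lcore while
leaving the master lcore itself pinned.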
Thanks,
Rahul
* Re: [dpdk-dev] Recent changes related to interrupt thread
  2015-11-16 12:32 [dpdk-dev] Recent changes related to interrupt thread Rahul Lakkireddy
@ 2015-11-16 13:48 ` Thomas Monjalon
  2015-11-16 17:06   ` Stephen Hemminger
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Monjalon @ 2015-11-16 13:48 UTC (permalink / raw)
  To: Rahul Lakkireddy; +Cc: dev, Nirranjan Kirubaharan, Felix Marti, Kumar Sanghvi
Hi,
2015-11-16 18:02, Rahul Lakkireddy:
> Hi,
> 
> I notice that the following changeset:
> 
> Fixes: fd6949c55c9a ("eal: fix io permission for virtio interrupt
> handler")
> 
> has moved the initialization of the interrupt thread to after the master
> lcore has been initialized.  However, this causes the interrupt thread
> to _inherit_ the affinity of the master lcore. Hence, this seems to
> make all interrupts to be handled by _only_ the master lcore. Because
> of this change, it seems that now alarm interrupts would also be handled
> by master lcore only, IIUC.
> 
> We are seeing a performance regression for cxgbe PMD after this commit
> since, cxgbe PMD relies on alarm to periodically transmit pending
> coalesced packets.
> 
> Also, this perf degradation is only seen if there's a queue allocated
> on the master lcore, such as in l3fwd app.  If the master lcore has
> been skipped, then no degradation in perf is seen since only the alarm
> will run on the master lcore.
> 
> So, is the change done to make all interrupts, including alarm
> interrupts, be handled by _only_ the master lcore intended?
No, it was not intended. The idea was for the interrupt thread to
inherit settings (iopl) from the device initialization.
However, a DPDK driver is not really supposed to rely on interrupt
performance, so having interrupts managed on any core was more or less
a side effect.
> BTW, I have tried setting the affinity to all cpus instead in
> eal_intr_init() and this seems to restore the perf back. Perhaps it's
> better to move the master lcore initialization to after the interrupt
> thread has been initialized as well? Thoughts?
Yes, I think it's possible.
We can also imagine a command line option to set the interrupt affinity,
with a default which mimics the old behaviour.
In order to make this conversation clearer, and for later reference,
below is the DPDK init call tree:
start
	driver constructor (if .a)
		rte_eal_driver_register
main
	rte_eal_init
		eal_parse_args
		rte_eal_pci_init
		rte_eal_memory_init
		eal_plugins_init
			dlopen
				driver constructor (if .so)
					rte_eal_driver_register
		eal_thread_init_master
			eal_thread_set_affinity
		rte_eal_dev_init
			driver->init
				PMD init
					rte_eth_driver_register
		rte_eal_intr_init
			pthread_create
				eal_intr_thread_main
					eal_intr_handle_interrupts
		pthread_create
		rte_eal_pci_probe
			driver->devinit
				rte_eth_dev_init
					rte_eth_dev_allocate
					eth_drv->eth_dev_init
* Re: [dpdk-dev] Recent changes related to interrupt thread
  2015-11-16 13:48 ` Thomas Monjalon
@ 2015-11-16 17:06   ` Stephen Hemminger
  2015-11-16 17:19     ` Ananyev, Konstantin
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2015-11-16 17:06 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Nirranjan Kirubaharan, Felix Marti, Kumar Sanghvi
With the new interrupt mode, the interrupt thread needs some rework anyway.
Ideally, there would be multiple interrupt threads, one per core;
then use SMP affinity to align the MSI-X interrupt for the device queue
to run on the core that is processing that queue.
This would require new APIs to set SMP affinity (a wrapper around
/proc/irq) and an API to tell DPDK which lcore is going to process an
RX (and TX) queue.
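As an illustration of what a wrapper around /proc/irq might look like,
here is a minimal sketch. It assumes the standard Linux
/proc/irq/<N>/smp_affinity_list file; the function names are
hypothetical and nothing like this is claimed to exist in DPDK. How the
IRQ number of a given MSI-X vector would be obtained is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the procfs path that controls the CPU
 * affinity of one IRQ. Returns 0 on success, -1 if `buf` is too small.
 */
int irq_affinity_path(unsigned int irq, char *buf, size_t len)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "/proc/irq/%u/smp_affinity_list", irq);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: steer `irq` to a single CPU by writing the CPU
 * number to smp_affinity_list. Needs root; returns 0 on success, -1 on
 * any failure (no such IRQ, no permission, write rejected).
 */
int irq_set_affinity(unsigned int irq, unsigned int cpu)
{
	char path[64];
	FILE *f;
	int ok;

	if (irq_affinity_path(irq, path, sizeof(path)) != 0)
		return -1;

	f = fopen(path, "w");
	if (f == NULL)
		return -1;

	ok = fprintf(f, "%u\n", cpu) > 0;
	/* The kernel may reject the value only when the buffer is flushed. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Note that a daemon such as irqbalance may later rewrite the same file,
so a real implementation would have to account for that.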
* Re: [dpdk-dev] Recent changes related to interrupt thread
  2015-11-16 17:06   ` Stephen Hemminger
@ 2015-11-16 17:19     ` Ananyev, Konstantin
  2015-11-16 17:40       ` Stephen Hemminger
  0 siblings, 1 reply; 6+ messages in thread
From: Ananyev, Konstantin @ 2015-11-16 17:19 UTC (permalink / raw)
  To: Stephen Hemminger, Thomas Monjalon
  Cc: dev, Felix Marti, Nirranjan Kirubaharan, Kumar Sanghvi
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen Hemminger
> Sent: Monday, November 16, 2015 5:07 PM
> To: Thomas Monjalon
> Cc: dev@dpdk.org; Nirranjan Kirubaharan; Felix Marti; Kumar Sanghvi
> Subject: Re: [dpdk-dev] Recent changes related to interrupt thread
> 
> With the new interrupt mode, the interrupt thread needs some rework anyway.
> Ideally, there would be multiple interrupt threads, one per core;
> then use SMP affinity to align the MSI-x interrupt for the device queue
> to run on the core that is processing that queue.
> 
> This would require new API's to do SMP affinity, wrapper around /proc/irq
> and an API to tell DPDK which lcore is being to process a RX (and TX)
> queue.
There is no one-to-one mapping between lcores and device queues.
Any lcore can do RX/TX on the device queue.
Of course it is preferable to do it from a core on the same socket, but
it is not required.
You can even have multiple threads doing RX/TX from/to the same queue,
as long as you provide some sync mechanism between them.
Konstantin 
* Re: [dpdk-dev] Recent changes related to interrupt thread
  2015-11-16 17:19     ` Ananyev, Konstantin
@ 2015-11-16 17:40       ` Stephen Hemminger
       [not found]         ` <2601191342CEEE43887BDE71AB97725836AC98E9@irsmsx105.ger.corp.intel.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2015-11-16 17:40 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: dev, Felix Marti, Nirranjan Kirubaharan, Kumar Sanghvi
I was thinking of something like:
rte_intr_affinity(portid, queueid, lcoreid)
And per-lcore interrupt threads.
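To make the proposal a bit more concrete, the bookkeeping behind such a
call might be sketched as below. Everything here is hypothetical: the
sk_ prefix, the table sizes and the "any lcore" sentinel are invented
for the sketch, and it only records the requested mapping without
touching any interrupt or thread.

```c
#include <assert.h>
#include <stdint.h>

#define SK_MAX_PORTS   32          /* assumed limit, for the sketch only */
#define SK_MAX_QUEUES  16          /* assumed limit, for the sketch only */
#define SK_LCORE_ANY   UINT32_MAX  /* no preference recorded */

/* Requested interrupt lcore per (port, queue); filled in lazily. */
static uint32_t sk_intr_lcore[SK_MAX_PORTS][SK_MAX_QUEUES];
static int sk_table_ready;

static void sk_table_init(void)
{
	unsigned int p, q;

	if (sk_table_ready)
		return;
	for (p = 0; p < SK_MAX_PORTS; p++)
		for (q = 0; q < SK_MAX_QUEUES; q++)
			sk_intr_lcore[p][q] = SK_LCORE_ANY;
	sk_table_ready = 1;
}

/* Record that interrupts of (port_id, queue_id) should go to lcore_id. */
int sk_intr_affinity(uint16_t port_id, uint16_t queue_id, uint32_t lcore_id)
{
	if (port_id >= SK_MAX_PORTS || queue_id >= SK_MAX_QUEUES)
		return -1;
	sk_table_init();
	sk_intr_lcore[port_id][queue_id] = lcore_id;
	return 0;
}

/* Look up the recorded lcore, or SK_LCORE_ANY if none was set. */
uint32_t sk_intr_lcore_get(uint16_t port_id, uint16_t queue_id)
{
	if (port_id >= SK_MAX_PORTS || queue_id >= SK_MAX_QUEUES)
		return SK_LCORE_ANY;
	sk_table_init();
	return sk_intr_lcore[port_id][queue_id];
}
```

A per-lcore interrupt thread would then consult this table to decide
which queue interrupts it should wait on.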
* Re: [dpdk-dev] Recent changes related to interrupt thread
       [not found]         ` <2601191342CEEE43887BDE71AB97725836AC98E9@irsmsx105.ger.corp.intel.com>
@ 2015-11-17 11:48           ` Ananyev, Konstantin
  0 siblings, 0 replies; 6+ messages in thread
From: Ananyev, Konstantin @ 2015-11-17 11:48 UTC (permalink / raw)
  To: stephen; +Cc: dev
 
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Monday, November 16, 2015 5:40 PM
> To: Ananyev, Konstantin
> Cc: Thomas Monjalon; dev@dpdk.org; Nirranjan Kirubaharan; Felix Marti; Kumar Sanghvi
> Subject: Re: [dpdk-dev] Recent changes related to interrupt thread
> 
> I was thinking of something like:
> 
> rte_intr_affinity(portid, queueid, lcoreid)
> 
> And per-lcore interrupt threads.
But it's probably too expensive to have an interrupt thread for each lcore.
Also, we now have the ability to run several lcores over one physical core.
Perhaps 2 new API functions instead:
one to create a new intr thread (so the user can create as many as needed),
and a second to bind a <portid>,<queueid> interrupt to a particular
interrupt thread?
In that case, if the user doesn't want to create extra interrupt threads
at all and just wants to call rte_epoll_wait() manually, that is possible
too.
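The "wait manually" option can be pictured with plain Linux primitives.
The sketch below is an assumption-laden stand-in, not DPDK code:
epoll_wait() takes the place of rte_epoll_wait(), an eventfd takes the
place of a queue interrupt fd, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Register `fd` (standing in for a queue interrupt fd) with `epfd`. */
int register_event_fd(int epfd, int fd)
{
	struct epoll_event ev;

	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	ev.data.fd = fd;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/*
 * Wait up to `timeout_ms` for one registered fd to fire, from whatever
 * thread the application chooses. Returns the eventfd counter that was
 * read, 0 on timeout, -1 on error.
 */
int64_t wait_one_event(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	uint64_t count;
	int n;

	n = epoll_wait(epfd, &ev, 1, timeout_ms);
	if (n < 0)
		return -1;
	if (n == 0)
		return 0;

	/* Reading the eventfd returns and clears its counter. */
	if (read(ev.data.fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return -1;
	return (int64_t)count;
}
```

With this shape, no dedicated interrupt thread is needed: the thread
that calls the wait function is the one that handles the event.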
Konstantin