DPDK patches and discussions
From: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
To: "Van Haaren, Harry" <harry.van.haaren@intel.com>,
	"Eads, Gage" <gage.eads@intel.com>,
	"jerin.jacob@caviumnetworks.com" <jerin.jacob@caviumnetworks.com>,
	"santosh.shukla@caviumnetworks.com"
	<santosh.shukla@caviumnetworks.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 2/2] event/sw: use dynamically-sized IQs
Date: Mon, 8 Jan 2018 21:35:30 +0530
Message-ID: <20180108160529.gven7vlrbmrrlw2p@Pavan-LT>
In-Reply-To: <E923DB57A917B54B9182A2E928D00FA650FEB40B@IRSMSX102.ger.corp.intel.com>

On Mon, Jan 08, 2018 at 03:50:24PM +0000, Van Haaren, Harry wrote:
> > From: Pavan Nikhilesh [mailto:pbhagavatula@caviumnetworks.com]
> > Sent: Monday, January 8, 2018 3:32 PM
> > To: Eads, Gage <gage.eads@intel.com>; Van Haaren, Harry
> > <harry.van.haaren@intel.com>; jerin.jacob@caviumnetworks.com;
> > santosh.shukla@caviumnetworks.com
> > Cc: dev@dpdk.org
> > Subject: Re: [PATCH 2/2] event/sw: use dynamically-sized IQs
> >
> > On Wed, Nov 29, 2017 at 09:08:34PM -0600, Gage Eads wrote:
> > > This commit introduces dynamically-sized IQs, by switching the underlying
> > > data structure from a fixed-size ring to a linked list of queue 'chunks.'
>
> <snip>
>
> > Sw eventdev crashes when used alongside Rx adapter. The crash happens when
> > pumping traffic at > 1.4mpps. This commit seems responsible for this.
> >
> >
> > Apply the following Rx adapter patch
> > http://dpdk.org/dev/patchwork/patch/31977/
> > Command used:
> > ./build/eventdev_pipeline_sw_pmd -c 0xfffff8 --vdev="event_sw" -- -r0x800
> > -t0x100 -w F000 -e 0x10
>
> Applied the patch to current master, recompiled; cannot reproduce here..
>
By master, you mean dpdk-next-eventdev, right?
> Is it 100% reproducible and "instant" or can it take some time to occur there?
>
It is instant.
>
> > Backtrace:
> >
> > Thread 4 "lcore-slave-4" received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0xffffb6c8f040 (LWP 25291)]
> > 0x0000aaaaaadcc0d4 in iq_dequeue_burst (count=48, ev=0xffffb6c8dd38,
> > iq=0xffff9f764720, sw=0xffff9f332600) at
> > /root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/iq_chunk.h:142
> > 142 ev[total++] = current->events[index++];
>
> Could we get the output of (gdb) info locals?
>

Thread 4 "lcore-slave-4" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xffffb6c8f040 (LWP 19751)]
0x0000aaaaaadcc0d4 in iq_dequeue_burst (count=48, ev=0xffffb6c8dd38,
iq=0xffff9f764620, sw=0xffff9f332500) at
/root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/iq_chunk.h:142
142 ev[total++] = current->events[index++];

(gdb) info locals
next = 0x7000041400be73b
current = 0x7000041400be73b
total = 36
index = 1
(gdb)
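
For readers following the thread, here is a minimal sketch of a queue built
as a linked list of fixed-size chunks, the kind of structure the commit
message describes. It is not the iq_chunk.h code: struct chunk, struct queue,
CHUNK_EVENTS, q_init, q_enqueue and q_dequeue_burst are all hypothetical
names chosen for illustration. It only shows where the chunk pointer is
followed during a burst dequeue, which is the kind of statement the faulting
line above belongs to. If that pointer holds an implausible value (the
`current` printed above is not even 8-byte aligned), the first read through
it can fault.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_EVENTS 4 /* hypothetical chunk capacity, not the driver's value */

/* One fixed-size chunk; chunks are chained through 'next'. */
struct chunk {
	uint64_t events[CHUNK_EVENTS];
	struct chunk *next;
};

/* Hypothetical queue: dequeue from head chunk, enqueue to tail chunk. */
struct queue {
	struct chunk *head;
	struct chunk *tail;
	unsigned int head_idx; /* next slot to read in head */
	unsigned int tail_idx; /* next slot to write in tail */
	unsigned int count;    /* events currently queued */
};

static int q_init(struct queue *q)
{
	struct chunk *c = calloc(1, sizeof(*c));

	if (c == NULL)
		return -1;
	q->head = q->tail = c;
	q->head_idx = q->tail_idx = 0;
	q->count = 0;
	return 0;
}

static int q_enqueue(struct queue *q, uint64_t ev)
{
	/* Tail chunk full: link a fresh chunk so the queue grows on demand. */
	if (q->tail_idx == CHUNK_EVENTS) {
		struct chunk *c = calloc(1, sizeof(*c));

		if (c == NULL)
			return -1;
		q->tail->next = c;
		q->tail = c;
		q->tail_idx = 0;
	}
	q->tail->events[q->tail_idx++] = ev;
	q->count++;
	return 0;
}

static unsigned int q_dequeue_burst(struct queue *q, uint64_t *ev,
				    unsigned int n)
{
	unsigned int total = 0;

	if (n > q->count)
		n = q->count;
	while (total < n) {
		/* Head chunk drained: follow 'next' and release the old chunk.
		 * A stale or overwritten 'next' makes the read below fault. */
		if (q->head_idx == CHUNK_EVENTS) {
			struct chunk *done = q->head;

			q->head = done->next;
			q->head_idx = 0;
			free(done);
		}
		ev[total++] = q->head->events[q->head_idx++];
	}
	q->count -= total;
	return total;
}
```

In this sketch the bound on `count` guarantees that `next` is only followed
when a later chunk exists. Code of this shape has no locking, so it assumes
a single thread touches a given queue at a time.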


Noticed another crash:

Thread 4 "lcore-slave-4" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xffffb6c8f040 (LWP 19690)]
0x0000aaaaaadcfb78 in iq_alloc_chunk (sw=0xffff9f332500) at
/root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/iq_chunk.h:63
63		sw->chunk_list_head = chunk->next;

(gdb) info locals
chunk = 0x14340000119

(gdb) bt
#0  0x0000aaaaaadcfb78 in iq_alloc_chunk (sw=0xffff9f332500) at
/root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/iq_chunk.h:63
#1  iq_enqueue (ev=0xffff9f3967c0, iq=0xffff9f764620, sw=0xffff9f332500) at
/root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/iq_chunk.h:95
#2  __pull_port_lb (allow_reorder=0, port_id=5, sw=0xffff9f332500) at
/root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/sw_evdev_scheduler.c:463
#3  sw_schedule_pull_port_no_reorder (sw=0xffff9f332500, port_id=5) at
/root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/sw_evdev_scheduler.c:486
#4  0x0000aaaaaadd0608 in sw_event_schedule (dev=0xaaaaaafbd200
<rte_event_devices>) at
/root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/sw_evdev_scheduler.c:554
#5  0x0000aaaaaadca008 in sw_sched_service_func (args=0xaaaaaafbd200
<rte_event_devices>) at
/root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/sw_evdev.c:767
#6  0x0000aaaaaab54740 in rte_service_runner_do_callback (s=0xffff9fffdf80,
cs=0xffff9ffef900, service_idx=0) at
/root/clean/rebase/dpdk-next-eventdev/lib/librte_eal/common/rte_service.c:349
#7  0x0000aaaaaab54868 in service_run (i=0, cs=0xffff9ffef900,
service_mask=18446744073709551615) at
/root/clean/rebase/dpdk-next-eventdev/lib/librte_eal/common/rte_service.c:376
#8  0x0000aaaaaab54954 in rte_service_run_iter_on_app_lcore (id=0,
serialize_mt_unsafe=1) at
/root/clean/rebase/dpdk-next-eventdev/lib/librte_eal/common/rte_service.c:405
#9  0x0000aaaaaaaef04c in schedule_devices (lcore_id=4) at
/root/clean/rebase/dpdk-next-eventdev/examples/eventdev_pipeline_sw_pmd/main.c:223
#10 0x0000aaaaaaaef234 in worker (arg=0xffff9f331c80) at
/root/clean/rebase/dpdk-next-eventdev/examples/eventdev_pipeline_sw_pmd/main.c:274
#11 0x0000aaaaaab4382c in eal_thread_loop (arg=0x0) at
/root/clean/rebase/dpdk-next-eventdev/lib/librte_eal/linuxapp/eal/eal_thread.c:182
#12 0x0000ffffb7e46d64 in start_thread () from /usr/lib/libpthread.so.0
#13 0x0000ffffb7da8bbc in thread_start () from /usr/lib/libc.so.6


>
>
> > (gdb) bt
> > #0  0x0000aaaaaadcc0d4 in iq_dequeue_burst (count=48, ev=0xffffb6c8dd38,
> > iq=0xffff9f764720, sw=0xffff9f332600) at
> > /root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/iq_chunk.h:142
> > #1  sw_schedule_atomic_to_cq (sw=0xffff9f332600, qid=0xffff9f764700,
> > iq_num=0,
> > count=48) at
> > /root/clean/rebase/dpdk-next-
> > eventdev/drivers/event/sw/sw_evdev_scheduler.c:74
> > #2  0x0000aaaaaadcdc44 in sw_schedule_qid_to_cq (sw=0xffff9f332600) at
> > /root/clean/rebase/dpdk-next-
> > eventdev/drivers/event/sw/sw_evdev_scheduler.c:262
> > #3  0x0000aaaaaadd069c in sw_event_schedule (dev=0xaaaaaafbd200
> > <rte_event_devices>) at
> > /root/clean/rebase/dpdk-next-
> > eventdev/drivers/event/sw/sw_evdev_scheduler.c:564
> > #4  0x0000aaaaaadca008 in sw_sched_service_func (args=0xaaaaaafbd200
> > <rte_event_devices>) at
> > /root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/sw_evdev.c:767
> > #5  0x0000aaaaaab54740 in rte_service_runner_do_callback (s=0xffff9fffdf80,
> > cs=0xffff9ffef900, service_idx=0) at
> > /root/clean/rebase/dpdk-next-
> > eventdev/lib/librte_eal/common/rte_service.c:349
> > #6  0x0000aaaaaab54868 in service_run (i=0, cs=0xffff9ffef900,
> > service_mask=18446744073709551615) at
> > /root/clean/rebase/dpdk-next-
> > eventdev/lib/librte_eal/common/rte_service.c:376
> > #7  0x0000aaaaaab54954 in rte_service_run_iter_on_app_lcore (id=0,
> > serialize_mt_unsafe=1) at
> > /root/clean/rebase/dpdk-next-
> > eventdev/lib/librte_eal/common/rte_service.c:405
> > #8  0x0000aaaaaaaef04c in schedule_devices (lcore_id=4) at
> > /root/clean/rebase/dpdk-next-
> > eventdev/examples/eventdev_pipeline_sw_pmd/main.c:223
> > #9  0x0000aaaaaaaef234 in worker (arg=0xffff9f331d80) at
> > /root/clean/rebase/dpdk-next-
> > eventdev/examples/eventdev_pipeline_sw_pmd/main.c:274
> > #10 0x0000aaaaaab4382c in eal_thread_loop (arg=0x0) at
> > /root/clean/rebase/dpdk-next-
> > eventdev/lib/librte_eal/linuxapp/eal/eal_thread.c:182
> > #11 0x0000ffffb7e46d64 in start_thread () from /usr/lib/libpthread.so.0
> > #12 0x0000ffffb7da8bbc in thread_start () from /usr/lib/libc.so.6
> >
> > Segfault seems to happen in sw_event_schedule and only happens under high
> > traffic load.
>
> I've added -n 0 to the command line allowing it to run forever,
> and after ~2 mins it's still happily forwarding pkts at ~10G line rate here.
>

On arm64 the crash is instant even without -n0.

>
> > Thanks,
> > Pavan
>
> Thanks for reporting - I'm afraid I'll have to ask a few questions to identify why I can't reproduce here before I can dig in and identify a fix.
>
> Anything special about the system that it is on?

Running on arm64 octeontx with 8x10G connected.

> What traffic pattern is being sent to the app?

Using something similar to trafficgen, IPv4/UDP pkts.

   0:00:51     958245 |0xB00   2816|0xB10   2832|0xB20   2848|0xB30   2864|0xC00 * 3072|0xC10 * 3088|0xC20 * 3104|0xC30 * 3120|    Totals
Port Status           |XFI30     Up|XFI31     Up|XFI32     Up|XFI33     Up|XFI40     Up|XFI41     Up|XFI42     Up|XFI43     Up|
 1:Total TX packets   |  7197041566|  5194976604|  5120240981|  4424870160|  5860892739|  5191225514|  5126500427|  4429259828|42545007819
 3:Total RX packets   |   358886055|   323055411|   321000948|   277179800|   387486466|   350278086|   348080242|   295460613|2661427621
 6:TX packet rate     |           0|           0|           0|           0|           0|           0|           0|           0|         0
 7:TX octet rate      |           0|           0|           0|           0|           0|           0|           0|           0|         0
 8:TX bit rate, Mbps  |           0|           0|           0|           0|           0|           0|           0|           0|         0
10:RX packet rate     |           0|           0|           0|           0|           0|           0|           0|           0|         0
11:RX octet rate      |           0|           0|           0|           0|           0|           0|           0|           0|         0
12:RX bit rate, Mbps  |           0|           0|           0|           0|           0|           0|           0|           0|         0
36:tx.size            |          60|          60|          60|          60|          60|          60|          60|          60|
37:tx.type            |    IPv4+UDP|    IPv4+UDP|    IPv4+UDP|    IPv4+UDP|    IPv4+UDP|    IPv4+UDP|    IPv4+UDP|    IPv4+UDP|
38:tx.payload         |         abc|         abc|         abc|         abc|         abc|         abc|         abc|         abc|
47:dest.mac           |   fb71189c0|   fb71189d0|   fb71189e0|   fb71189bf|   fb7118ac0|   fb7118ad0|   fb7118ae0|   fb7118abf|
51:src.mac            |   fb71189bf|   fb71189cf|   fb71189df|   fb71189ef|   fb7118abf|   fb7118acf|   fb7118adf|   fb7118aef|
55:dest.ip            |   11.1.0.99|  11.17.0.99|  11.33.0.99|   11.0.0.99|   14.1.0.99|  14.17.0.99|  14.33.0.99|   14.0.0.99|
59:src.ip             |   11.0.0.99|  11.16.0.99|  11.32.0.99|  11.48.0.99|   14.0.0.99|  14.16.0.99|  14.32.0.99|  14.48.0.99|
73:bridge             |         off|         off|         off|         off|         off|         off|         off|         off|
77:validate packets   |         off|         off|         off|         off|         off|         off|         off|         off|

Thanks,
Pavan.

>
> Thanks
>
>
> <snip>
>

Thread overview: 11+ messages
2017-11-30  3:08 [dpdk-dev] [PATCH 1/2] event/sw: fix queue memory leak and multi-link bug Gage Eads
2017-11-30  3:08 ` [dpdk-dev] [PATCH 2/2] event/sw: use dynamically-sized IQs Gage Eads
2017-12-07 17:15   ` Van Haaren, Harry
2017-12-09  9:26     ` Jerin Jacob
2018-01-08 15:32   ` Pavan Nikhilesh
2018-01-08 15:50     ` Van Haaren, Harry
2018-01-08 16:05       ` Pavan Nikhilesh [this message]
2018-01-08 18:36         ` Eads, Gage
2018-01-09  7:12           ` Pavan Nikhilesh
2017-12-07 17:15 ` [dpdk-dev] [PATCH 1/2] event/sw: fix queue memory leak and multi-link bug Van Haaren, Harry
2017-12-09  9:26   ` Jerin Jacob
