From: "Van Haaren, Harry"
To: Pavan Nikhilesh, "Eads, Gage", "jerin.jacob@caviumnetworks.com",
 "santosh.shukla@caviumnetworks.com"
CC: "dev@dpdk.org"
Date: Mon, 8 Jan 2018 15:50:24 +0000
Subject: Re: [dpdk-dev] [PATCH 2/2] event/sw: use dynamically-sized IQs

> From: Pavan Nikhilesh [mailto:pbhagavatula@caviumnetworks.com]
> Sent: Monday, January 8, 2018 3:32 PM
> To: Eads, Gage; Van Haaren, Harry; jerin.jacob@caviumnetworks.com;
> santosh.shukla@caviumnetworks.com
> Cc: dev@dpdk.org
> Subject: Re: [PATCH 2/2] event/sw: use dynamically-sized IQs
>
> On Wed, Nov 29, 2017 at 09:08:34PM -0600, Gage Eads wrote:
> > This commit introduces dynamically-sized IQs, by switching the underlying
> > data structure from a fixed-size ring to a linked list of queue 'chunks.'
>
> The sw eventdev crashes when used alongside the Rx adapter. The crash
> happens when pumping traffic at > 1.4 Mpps. This commit seems to be
> responsible for it.
>
> Apply the following Rx adapter patch:
> http://dpdk.org/dev/patchwork/patch/31977/
>
> Command used:
> ./build/eventdev_pipeline_sw_pmd -c 0xfffff8 --vdev="event_sw" -- -r0x800
> -t0x100 -w F000 -e 0x10

Applied the patch to current master and recompiled; I cannot reproduce the
crash here.

Is it 100% reproducible and "instant", or can it take some time to occur
there?
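For anyone following the thread: the quoted commit message above replaces the
single fixed-size ring per IQ with a linked list of fixed-size event chunks.
A minimal sketch of such a layout follows; the type names, field names and
chunk size are assumptions for illustration, not the actual
drivers/event/sw/iq_chunk.h definitions:

/*
 * Illustrative sketch only: type names, field names and IQ_CHUNK_SIZE are
 * assumptions for discussion, not the actual drivers/event/sw/iq_chunk.h
 * definitions.
 */
#include <stdint.h>
#include <rte_eventdev.h>        /* struct rte_event */

#define IQ_CHUNK_SIZE 128        /* events per chunk; real value may differ */

/* One chunk of an internal queue: a small event array plus a link. */
struct iq_chunk {
	struct iq_chunk *next;                  /* next chunk, or NULL */
	struct rte_event events[IQ_CHUNK_SIZE]; /* events stored in this chunk */
};

/* Per-IQ bookkeeping: head/tail chunks and read/write offsets. */
struct iq_head {
	struct iq_chunk *head;     /* chunk events are dequeued from */
	struct iq_chunk *tail;     /* chunk new events are appended to */
	uint16_t head_idx;         /* next slot to read in 'head' */
	uint16_t tail_idx;         /* next free slot in 'tail' */
	uint32_t count;            /* total events across all chunks */
};

In a layout like this, enqueue appends at the tail and takes a fresh chunk
from a pre-allocated pool when the tail fills, while dequeue consumes from the
head, so the IQ grows and shrinks with the backlog instead of being bounded by
a fixed ring size.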
> Backtrace:
>
> Thread 4 "lcore-slave-4" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0xffffb6c8f040 (LWP 25291)]
> 0x0000aaaaaadcc0d4 in iq_dequeue_burst (count=48, ev=0xffffb6c8dd38,
>     iq=0xffff9f764720, sw=0xffff9f332600) at
>     /root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/iq_chunk.h:142
> 142         ev[total++] = current->events[index++];

Could we get the output of (gdb) info locals?

> (gdb) bt
> #0  0x0000aaaaaadcc0d4 in iq_dequeue_burst (count=48, ev=0xffffb6c8dd38,
>     iq=0xffff9f764720, sw=0xffff9f332600) at
>     /root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/iq_chunk.h:142
> #1  sw_schedule_atomic_to_cq (sw=0xffff9f332600, qid=0xffff9f764700,
>     iq_num=0, count=48) at
>     /root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/sw_evdev_scheduler.c:74
> #2  0x0000aaaaaadcdc44 in sw_schedule_qid_to_cq (sw=0xffff9f332600) at
>     /root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/sw_evdev_scheduler.c:262
> #3  0x0000aaaaaadd069c in sw_event_schedule (dev=0xaaaaaafbd200) at
>     /root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/sw_evdev_scheduler.c:564
> #4  0x0000aaaaaadca008 in sw_sched_service_func (args=0xaaaaaafbd200) at
>     /root/clean/rebase/dpdk-next-eventdev/drivers/event/sw/sw_evdev.c:767
> #5  0x0000aaaaaab54740 in rte_service_runner_do_callback (s=0xffff9fffdf80,
>     cs=0xffff9ffef900, service_idx=0) at
>     /root/clean/rebase/dpdk-next-eventdev/lib/librte_eal/common/rte_service.c:349
> #6  0x0000aaaaaab54868 in service_run (i=0, cs=0xffff9ffef900,
>     service_mask=18446744073709551615) at
>     /root/clean/rebase/dpdk-next-eventdev/lib/librte_eal/common/rte_service.c:376
> #7  0x0000aaaaaab54954 in rte_service_run_iter_on_app_lcore (id=0,
>     serialize_mt_unsafe=1) at
>     /root/clean/rebase/dpdk-next-eventdev/lib/librte_eal/common/rte_service.c:405
> #8  0x0000aaaaaaaef04c in schedule_devices (lcore_id=4) at
>     /root/clean/rebase/dpdk-next-eventdev/examples/eventdev_pipeline_sw_pmd/main.c:223
> #9  0x0000aaaaaaaef234 in worker (arg=0xffff9f331d80) at
>     /root/clean/rebase/dpdk-next-eventdev/examples/eventdev_pipeline_sw_pmd/main.c:274
> #10 0x0000aaaaaab4382c in eal_thread_loop (arg=0x0) at
>     /root/clean/rebase/dpdk-next-eventdev/lib/librte_eal/linuxapp/eal/eal_thread.c:182
> #11 0x0000ffffb7e46d64 in start_thread () from /usr/lib/libpthread.so.0
> #12 0x0000ffffb7da8bbc in thread_start () from /usr/lib/libc.so.6
>
> The segfault seems to happen in sw_event_schedule and only happens under
> high traffic load.

I've added -n 0 to the command line, allowing it to run forever, and after
~2 minutes it is still happily forwarding packets at ~10G line rate here.

> Thanks,
> Pavan

Thanks for reporting - I'm afraid I'll have to ask a few questions to identify
why I can't reproduce here before I can dig in and identify a fix.

Anything special about the system that it is on?

What traffic pattern is being sent to the app?

Thanks
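For completeness, here is a rough sketch of how a burst dequeue over such a
chunk list typically looks, reusing the illustrative structures sketched
earlier in this mail rather than the real iq_dequeue_burst() code. The
commented copy statement marks where a NULL or already-recycled chunk pointer
would produce exactly the SIGSEGV shown in the backtrace:

/*
 * Rough sketch of a chunk-walking burst dequeue, reusing the illustrative
 * struct iq_chunk / struct iq_head / IQ_CHUNK_SIZE from the sketch above.
 * This is NOT the actual iq_dequeue_burst() implementation.
 */
static inline uint16_t
iq_dequeue_burst_sketch(struct iq_head *iq, struct rte_event *ev,
			uint16_t count)
{
	struct iq_chunk *current = iq->head;
	uint16_t index = iq->head_idx;
	uint16_t total = 0;

	/* never dequeue more events than the IQ claims to hold */
	if (count > iq->count)
		count = iq->count;

	while (total < count) {
		if (index == IQ_CHUNK_SIZE) {
			/*
			 * Current chunk exhausted: step to the next chunk.
			 * (The real code would also return the exhausted
			 * chunk to its free pool here.)
			 */
			current = current->next;
			index = 0;
		}
		/*
		 * If 'current' is NULL, or points at a chunk that was already
		 * recycled because the count bookkeeping and the chunk list
		 * disagree, this load faults - matching the SIGSEGV at
		 * iq_chunk.h:142 in the backtrace above.
		 */
		ev[total++] = current->events[index++];
	}

	/* publish the new read position */
	iq->head = current;
	iq->head_idx = index;
	iq->count -= total;
	return total;
}

In a loop of this shape, any mismatch between the count bookkeeping and the
chunk list (for example a chunk returned to the free pool too early, or an
enqueue path racing the scheduler) would make the walk step past the last
valid chunk under load.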