From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <martin_curran-gray@keysight.com>
Received: from cos-us-iron02k.cos.keysight.com
 (cos-us-iron02k.cos.keysight.com [192.25.5.36])
 by dpdk.org (Postfix) with ESMTP id 063F83239
 for <users@dpdk.org>; Mon, 22 Aug 2016 16:06:28 +0200 (CEST)
Received: from wcosexch03k.cos.is.keysight.com (HELO 2k10hubs.keysight.com)
 ([156.140.24.22])
 by cos-us-iron02k.cos.keysight.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 22 Aug 2016 08:06:27 -0600
Received: from wcosexch02k.cos.is.keysight.com ([169.254.2.97]) by
 wcosexch03k.cos.is.keysight.com ([156.140.24.22]) with mapi id
 14.03.0279.002; Mon, 22 Aug 2016 08:06:27 -0600
From: <martin_curran-gray@keysight.com>
To: <shreyansh.jain@nxp.com>, <users@dpdk.org>
Thread-Topic: segfault with dpdk 16.07 in rte_mempool_populate_phys
Thread-Index: AdH4kORR/4kmue64QRuvda8Df4kUXwBTgvEAAASWNaAAoiZ4wAAA7B8g
Date: Mon, 22 Aug 2016 14:06:26 +0000
Message-ID: <22C95CA62CBADB498D32A348F0F073BC20AE2B33@wcosexch02k.cos.is.keysight.com>
References: <22C95CA62CBADB498D32A348F0F073BC20AE2583@wcosexch02k.cos.is.keysight.com>
 <DB5PR0401MB2054C29AE2C86DCA191A48C490160@DB5PR0401MB2054.eurprd04.prod.outlook.com>
 <22C95CA62CBADB498D32A348F0F073BC20AE28CB@wcosexch02k.cos.is.keysight.com>
 <DB5PR0401MB2054344A6B0FD51589E0C79C90E80@DB5PR0401MB2054.eurprd04.prod.outlook.com>
In-Reply-To: <DB5PR0401MB2054344A6B0FD51589E0C79C90E80@DB5PR0401MB2054.eurprd04.prod.outlook.com>
Accept-Language: en-GB, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [156.140.13.70]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-users] segfault with dpdk 16.07 in
	rte_mempool_populate_phys
X-BeenThere: users@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: usage discussions <users.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/users>,
 <mailto:users-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/users/>
List-Post: <mailto:users@dpdk.org>
List-Help: <mailto:users-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/users>,
 <mailto:users-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Aug 2016 14:06:29 -0000

Hi Shreyansh,

Thanks for the update.

I got a bit further: the ops->alloc function pointer is not being set up; it is left at 0.

I'm trying to figure out what the ip_reassembly example does that I'm not doing, since the ip_reassembly program works fine.

Debug from my app

	 at start of rte_mempool_create
	 at start of rte_mempool_populate_default
	 at start of rte_mempool_populate_phys

	rte_mempool_ops_alloc pos 1
	 mp pointer here is 2781207808
	 ops is 3069532288
	rte_mempool_ops_alloc pos 2
	   does ops->alloc exist?
	   ops->alloc is 0

When it then tries to call ops->alloc, it segfaults.
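
The crash described here is a call through a function pointer that is still 0. A minimal sketch of a guard that turns that into an error return is shown below. The names demo_pool, demo_ops and demo_ops_alloc are hypothetical stand-ins that only mimic the shape of a mempool ops table; this is not DPDK's code, and it does not explain why the pointer was never set.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for a pool and its ops table; not DPDK's real types. */
struct demo_pool {
    int ops_index;
};

struct demo_ops {
    const char *name;
    int (*alloc)(struct demo_pool *mp);   /* stays NULL if never registered */
};

/* Call ops->alloc only if it was set; otherwise return an error
 * instead of jumping to address 0. */
static int demo_ops_alloc(const struct demo_ops *ops, struct demo_pool *mp)
{
    if (ops == NULL || ops->alloc == NULL) {
        fprintf(stderr, "ops table entry has no alloc callback\n");
        return -ENOTSUP;
    }
    return ops->alloc(mp);
}

/* Trivial callback used to show the working case. */
static int demo_ring_alloc(struct demo_pool *mp)
{
    (void)mp;
    return 0;
}
```

With a populated entry such as { "demo_ring", demo_ring_alloc } the wrapper returns whatever the callback returns; with a zeroed entry it returns -ENOTSUP rather than crashing.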

Debug from ip_reassembly

	 at start of rte_mempool_create
	 at start of rte_mempool_populate_default
	 at start of rte_mempool_populate_phys

	rte_mempool_ops_alloc pos 1
	 mp pointer here is 2410384960
	 ops is 8063296

	rte_mempool_ops_alloc pos 2
	   does ops->alloc exist?
	   ops->alloc is 4526304
	 at end of rte_mempool_populate_phys
	 at end of rte_mempool_populate_default
	 at end of rte_mempool_create
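
One possible explanation for the difference between the two traces, which nothing in this thread confirms, is that the ops table is filled in by constructor functions that run before main(), and that the registering code never made it into the failing binary. The sketch below models that pattern with hypothetical names (demo_table, demo_register, demo_get_ops); it assumes GCC or Clang for the constructor attribute and is not DPDK's implementation.

```c
#include <stddef.h>

#define DEMO_MAX_OPS 4

struct demo_ops {
    const char *name;
    int (*alloc)(void *pool);
};

/* Zero-initialised table: a slot that was never registered has alloc == NULL. */
static struct demo_ops demo_table[DEMO_MAX_OPS];
static unsigned demo_num_ops;

/* Copy an ops entry into the next free slot; returns its index or -1. */
static int demo_register(const struct demo_ops *ops)
{
    if (demo_num_ops >= DEMO_MAX_OPS || ops->alloc == NULL)
        return -1;
    demo_table[demo_num_ops] = *ops;
    return (int)demo_num_ops++;
}

static int demo_ring_alloc(void *pool)
{
    (void)pool;
    return 0;
}

static const struct demo_ops demo_ring_ops = { "demo_ring", demo_ring_alloc };

/* Runs before main(), but only if this object file is part of the final binary. */
__attribute__((constructor))
static void demo_ring_register(void)
{
    demo_register(&demo_ring_ops);
}

/* Raw table lookup: returns the slot whether or not anything was registered there. */
static const struct demo_ops *demo_get_ops(unsigned index)
{
    return index < DEMO_MAX_OPS ? &demo_table[index] : NULL;
}
```

In this model, slot 0 has a valid alloc pointer because the constructor ran, while slot 1 still reads as 0, which is the same symptom as the "ops->alloc is 0" line in the failing trace.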


-----Original Message-----
From: Shreyansh Jain [mailto:shreyansh.jain@nxp.com]
Sent: 22 August 2016 14:59
To: CURRAN-GRAY,MARTIN (K-Scotland,ex1) <martin_curran-gray@keysight.com>; users@dpdk.org
Subject: RE: segfault with dpdk 16.07 in rte_mempool_populate_phys

Hi Martin,

See inline.
(Also, please don't remove the mail thread text in replies, as it loses context.)

> -----Original Message-----
> From: martin_curran-gray@keysight.com [mailto:martin_curran-gray@keysight.com]
> Sent: Friday, August 19, 2016 1:58 PM
> To: Shreyansh Jain <shreyansh.jain@nxp.com>; users@dpdk.org
> Subject: RE: segfault with dpdk 16.07 in rte_mempool_populate_phys
>
> Hi Shreyansh,
>
> Thanks for your reply,
>
> Hmmm, I had wondered if the debug output from 16.7 was reduced
> compared to 2.2.0, but perhaps this is what I should have been
> concentrating on, rather than the core later
>
>
> On a vm running our app using 2.2.0 at startup, I see:
>=20
> dpdk: In dpdk_init_eal core_mask is  79, master_core_id  is 0
> EAL: Detected lcore 0 as core 0 on socket 0
> EAL: Detected lcore 1 as core 0 on socket 0
> EAL: Detected lcore 2 as core 0 on socket 0
> EAL: Detected lcore 3 as core 0 on socket 0
> EAL: Detected lcore 4 as core 0 on socket 0
> EAL: Detected lcore 5 as core 0 on socket 0
> EAL: Detected lcore 6 as core 0 on socket 0
> EAL: Support maximum 32 logical core(s) by configuration.
> EAL: Detected 7 lcore(s)
> EAL: Setting up physically contiguous memory...
> EAL: Ask a virtual area of 0x40000000 bytes
> EAL: Virtual area found at 0x7f2735600000 (size = 0x40000000)
> EAL: Requesting 512 pages of size 2MB from socket 0
> EAL: TSC frequency is ~2094950 KHz
> EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
> EAL: Master lcore 0 is ready (tid=9a11c720;cpuset=[0])
> EAL: Failed to set thread name for interrupt handling
> EAL: Cannot set name for lcore thread
> EAL: Cannot set name for lcore thread
> EAL: Cannot set name for lcore thread
> EAL: Cannot set name for lcore thread
> EAL: lcore 4 is ready (tid=33ff7700;cpuset=[4])
> EAL: lcore 3 is ready (tid=349f8700;cpuset=[3])
> EAL: lcore 6 is ready (tid=32bf5700;cpuset=[6])
> EAL: lcore 5 is ready (tid=335f6700;cpuset=[5])
> EAL: PCI device 0000:00:07.0 on NUMA socket -1
> EAL:   probe driver: 8086:1521 rte_igb_pmd
> EAL:   Not managed by a supported kernel driver, skipped
> EAL: PCI device 0000:00:08.0 on NUMA socket -1
> EAL:   probe driver: 8086:1572 rte_i40e_pmd
> EAL:   PCI memory mapped at 0x7f27319f5000
> EAL:   PCI memory mapped at 0x7f279a33c000
> PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 8000224e
>
> However on my vm running our app but with 16.7 I see much less EAL
> output, the other stuff is printf output I put in the dpdk code to try
> and figure out where it was going wrong
>=20
> dpdk: In dpdk_init_eal core_mask is  79, master_core_id  is 0
> EAL: Detected 7 lcore(s)
> EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
>
> dpdk_init_memory_pools  position 1
> dpdk_init_memory_pools  position 2
> dpdk_init_memory_pools  position 3
>
>   about to call ret_mempool_create
>
>   name               Error Ind Mempool
>   number             8
>   element size       256
>   cache size         4
>   private data size  4
>   mp_init            1158173360
>   mp_init_arg        0
>   obj_init           1158173120
>   obj_init_arg       0
>   socket_id          4294967295
>   flags              0
>
>
>  at start of rte_mempool_create
>  at start of rte_mempool_populate_default
>  at start of rte_mempool_populate_phys
>
>
> Is this just down to a change of the debug output from within the EAL,
> or is something going fundamentally wrong.

The number of messages (especially lcore detection, etc.) has definitely been reduced in 16.07.
As far as I remember, lcore detection, VFIO support and eventually the application-specific log were what got printed. As soon as I have access to a vanilla 16.07 app, I will post the output (on host only). But it seems fine to me as of now.

>
> There is no output about the individual detected lcores, and no
> output about setting up physically contiguous memory, etc.

Which is OK, I think. Most of the INFO messages have been moved to DEBUG, which is why you won't see the 2.2.0 messages.

>
> However if my call to rte_eal_init hadn't worked, I shouldn't have got
> as far as trying to call rte_mempool_create
>
> We check for a return of rte_eal_init of < 0 and if so, we rte_exit.
>
> I'll have a look over the newer documentation for the debug output

For the stack trace that you dumped in the previous email, would it be possible to recompile without the optimization flags and dump it again?
It is possible that the core is hitting some path that prevents a clean exit.

>
> Thanks
>
> Martin
>

-
Shreyansh