From: "Wiles, Keith"
To: Matteo Lanzuisi
CC: Olivier Matz, "dev@dpdk.org"
Date: Tue, 21 Aug 2018 12:17:49 +0000
Subject: Re: [dpdk-dev] Multi-thread mempool usage

> On Aug 21, 2018, at 7:01 AM, Matteo Lanzuisi wrote:
>
> Hi
>
> On 20/08/2018 18:03, Wiles, Keith wrote:
>>> On Aug 20, 2018, at 9:47 AM, Matteo Lanzuisi wrote:
>>>
>>> Hello Olivier,
>>>
>>> On 13/08/2018 23:54, Olivier Matz wrote:
>>>
>>>> Hello Matteo,
>>>>
>>>> On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:
>>>>
>>>>> Any suggestion? Any idea about this behaviour?
>>>>>
>>>>> On 08/08/2018 11:56, Matteo Lanzuisi wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> recently I began using the "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on
>>>>>> RedHat 7.5, kernel 3.10.0-862.6.3.el7.x86_64, while porting an
>>>>>> application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
>>>>>>
>>>>>> This application is made up of one or more threads (each one on a
>>>>>> different logical core) reading packets from i40e interfaces.
>>>>>>
>>>>>> Each thread can run the following code when it receives a specific
>>>>>> packet:
>>>>>>
>>>>>> RTE_LCORE_FOREACH(lcore_id)
>>>>>> {
>>>>>>     result = rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool,
>>>>>>             (VOID_P *) &new_work);  // mempools are created one per logical core
>>>>>>     if (((uint64_t)(new_work)) < 0x7f0000000000)
>>>>>>         printf("Result %d, lcore di partenza %u, lcore di ricezione %u, pointer %p\n",
>>>>>>             result, rte_lcore_id(), lcore_id, new_work);
>>>>>>         // debug print: on my server it should never fire, but with more
>>>>>>         // than one thread it always fires on the last logical core!!!!
>>>>>>
>>>> Here, checking the value of new_work before ensuring that result == 0
>>>> looks wrong to me. At least, new_work should be set to NULL before
>>>> calling rte_mempool_get().
>>>>
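To make Olivier's point concrete, here is a minimal sketch of that pattern
(the cea_main_lcore_conf names come from your snippet; everything else is my
assumption about the surrounding code):

    void *new_work = NULL;   /* poison the pointer before the call */
    int result;

    result = rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool,
                             &new_work);
    if (result == 0 && new_work != NULL) {
        /* success: only on this path is new_work guaranteed valid */
    } else {
        /* rte_mempool_get() returned -ENOENT: the pool is exhausted and
         * new_work was never written, so it is still NULL */
    }

With the pointer poisoned to NULL, a stale value from a previous iteration
can never be mistaken for a fresh allocation.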
>>> I put the check after result == 0, and just before the rte_mempool_get()
>>> I set new_work to NULL, but nothing changed.
>>> The first time something goes wrong, the print is
>>>
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>>
>>> Sorry for the Italian-language print :) it means that the application is
>>> sending a message from logical core 1 to logical core 2, it's the 635th
>>> time, the result is 0 and the pointer is 0x880002, while all the pointers
>>> before were 0x7ffxxxxxx.
>>> One strange thing is that this behaviour always happens from logical core 1
>>> to logical core 2 when the counter is 635!!! (Sending messages from 2 to 1,
>>> or 1 to 1, or 2 to 2 is all ok.)
>>> Another strange thing is that the pointers from counter 636 to 640 are
>>> NULL, and from 641 they are good again... as you can see here (I attached
>>> the result of a test without the "if" check on the value of new_work, and
>>> only for messages from lcore 1 to lcore 2):
>>>
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 627, pointer 0x7ffe8a261880
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 628, pointer 0x7ffe8a261900
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 629, pointer 0x7ffe8a261980
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 630, pointer 0x7ffe8a261a00
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 631, pointer 0x7ffe8a261a80
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 632, pointer 0x7ffe8a261b00
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 633, pointer 0x7ffe8a261b80
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 634, pointer 0x7ffe8a261c00
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 636, pointer (nil)
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 637, pointer (nil)
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 638, pointer (nil)
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 639, pointer (nil)
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 640, pointer (nil)
>>>
>> This sure does seem like a memory-overwrite problem, with maybe a memset(0)
>> in the mix as well. Have you tried using hardware breakpoints to catch the
>> 0x880002 or 0x00 being written into this range?
> I put some breakpoints and found this:
>
> 1 - using pointer 0x880002, the output is (the pointer falls in the middle
> of two rwlocks):
>
> (gdb) awatch *0x880002
> Hardware access (read/write) watchpoint 1: *0x880002
> (gdb) c
> Continuing.
> [New Thread 0x7fffeded5700 (LWP 19969)]
> [New Thread 0x7fffed6d4700 (LWP 19970)]
> [New Thread 0x7fffeced3700 (LWP 19971)]
> [New Thread 0x7fffec6d2700 (LWP 19972)]
> [New Thread 0x7fffebed1700 (LWP 19973)]
> [New Thread 0x7fffeb6d0700 (LWP 19974)]
> Hardware access (read/write) watchpoint 1: *0x880002
>
> Value = 0
> rte_rwlock_init (rwl=0x880000)
>     at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
> 81      }
> (gdb) c
> Continuing.
> Hardware access (read/write) watchpoint 1: *0x880002

These are most likely false-positive hits and not the issue; as you note, the
address falls between two static rwlocks, so the hits are just those locks
being initialized at startup.
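If you want to confirm that 0x880002 is not mempool memory at all, here is a
rough sketch that walks the pool's backing memory chunks (untested, written
against the 17.11 headers; the pool pointer and "suspect" are placeholders):

    #include <stdio.h>
    #include <rte_mempool.h>

    struct range_check {
        void *ptr;    /* suspect pointer, e.g. (void *)0x880002 */
        int   found;
    };

    /* called once per memory chunk backing the pool */
    static void
    check_chunk(struct rte_mempool *mp, void *opaque,
                struct rte_mempool_memhdr *memhdr, unsigned mem_idx)
    {
        struct range_check *rc = opaque;
        char *start = memhdr->addr;

        if ((char *)rc->ptr >= start && (char *)rc->ptr < start + memhdr->len)
            rc->found = 1;
    }

    static void
    where_is(struct rte_mempool *mp, void *suspect)
    {
        struct range_check rc = { .ptr = suspect, .found = 0 };

        rte_mempool_mem_iter(mp, check_chunk, &rc);
        printf("%p is %s this mempool's memory\n",
               suspect, rc.found ? "inside" : "outside");
    }

If 0x880002 comes back "outside", the bad value is being written into your
ring or your work structure rather than being handed out by the mempool.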
>
> Value = 0
> rte_rwlock_init (rwl=0x880004)
>     at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
> 81      }
> (gdb) c
> Continuing.
>
> 2 - when using pointers less than or equal to 0x7ffe8a261d64 (in the range
> of the mempool), gdb reports nothing about them (I don't use them, I just
> get them from the pool and then put them back in the pool);
>
> 3 - when using pointer 0x7ffe8a261d65 or greater, this is the output of gdb:
>
> (gdb) awatch *(int *)0x7ffe8a261d65
> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
> (gdb) c
> Continuing.
> [New Thread 0x7fffeded5700 (LWP 17689)]
> [New Thread 0x7fffed6d4700 (LWP 17690)]
> [New Thread 0x7fffeced3700 (LWP 17691)]
> [New Thread 0x7fffec6d2700 (LWP 17692)]
> [New Thread 0x7fffebed1700 (LWP 17693)]
> [New Thread 0x7fffeb6d0700 (LWP 17694)]
> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>
> Value = 0
> 0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>     iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
> 140         STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
> (gdb) where
> #0  0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>     iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>     iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
>     elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
>     mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>     obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>     at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
> (gdb) c
> Continuing.
> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65

This seems to be just the creation of your pktmbuf pool. The STAILQ_INSERT_TAIL
at rte_mempool.c:140 is putting each element onto the mempool's own element
list (mp->elt_list) while the pool is being populated.
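For reference, what gdb is catching there is the per-element header being
linked at populate time. Simplified from the 17.11 rte_mempool.h (abridged,
for illustration only):

    struct rte_mempool_objhdr {
        STAILQ_ENTRY(rte_mempool_objhdr) next;  /* links element into mp->elt_list */
        struct rte_mempool *mp;                 /* pool owning this element */
        rte_iova_t iova;                        /* IO address of the element */
    };

The header sits immediately BEFORE the object it describes, and inserting a
new element updates the previous header's next pointer. That is why writes
land a few bytes below each object pointer (below obj=0x7ffe8a261d80 in your
trace), and why the watchpoint at 0x7ffe8a261d65 fires while the pool is being
created, long before your threads touch it.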
>
> Old value = 0
> New value = -402653184
> 0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>     iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
> 140         STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
> (gdb) where
> #0  0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>     iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>     iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
>     elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
>     mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>     obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>     at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
> (gdb) c
> Continuing.
>
> What do you think? Is it normal that mempool_add_elem is called only on
> certain pointers of the mempool?
> I attached the initialization of the mempool. Can this be wrong?

Every mempool created with a nonzero cache size has two queues to put memory
on: a per-lcore cache, used as a fast-access queue, and the main mempool list.
When an lcore's cache becomes empty it is refilled from the main list, and
when it holds more entries than it was created with, the extra entries are
pushed back to the main list of mempool data.

The only time the rwlock is touched is to get/put items on the main mempool.

Just as a data point, have you tried this app on 18.08 yet? I do not see the
problem yet, sorry.
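To make the cache behaviour visible, here is a sketch of the creation call
(the element type is yours; the cache value of 256 is just an example, and I
pass NULL init callbacks because the elements are plain structs, not mbufs --
rte_pktmbuf_pool_init/rte_pktmbuf_init assume the pool holds rte_mbufs):

    struct rte_mempool *mp;

    mp = rte_mempool_create("cmd_pool",
            65535,                     /* n: number of elements */
            sizeof(CEA_DECODE_CMD_T),  /* element size (24 bytes here) */
            256,                       /* cache_size: per-lcore fast queue */
            0,                         /* private_data_size */
            NULL, NULL,                /* pool init callback + argument */
            NULL, NULL,                /* per-object init callback + argument */
            rte_socket_id(),
            0);                        /* flags 0 = multi-producer/multi-consumer */

With cache_size 0 every get/put goes straight to the main list; with a nonzero
cache_size each lcore works out of its private queue and only spills to, or
refills from, the main list in bursts.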
>>>>=20 >>> Here is the initialization of this mempool >>>=20 >>> cea_main_cmd_pool[i] =3D rte_mempool_create(pool_name, >>> (unsigned int) (ikco_cmd_buffers - 1), // 65536 - 1 in this= case >>> sizeof (CEA_DECODE_CMD_T), // 24 bytes >>> 0, 0, >>> rte_pktmbuf_pool_init, NULL, >>> rte_pktmbuf_init, NULL, >>> rte_socket_id(), 0); >>>=20 >>>> Another thing that could be checked: at all the places where you >>>> return your work object to the mempool, you should add a check >>>> that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG >>>> could do the trick: it adds some additional checks when doing >>>> mempool operations. >>>>=20 >>> I think I have already answered this point with the prints up in the em= ail. >>>=20 >>> What do you think about this behaviour? >>>=20 >>> Regards, >>> Matteo >>>=20 >>>>>> result =3D >>>>>> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring, >>>>>> (VOID_P) new_work); // enqueues the gotten buffer on the rings of= all >>>>>> lcores >>>>>> // check on result value ... >>>>>> } >>>>>> else >>>>>> { >>>>>> // do something if result !=3D 0 ... >>>>>> } >>>>>> } >>>>>>=20 >>>>>> This code worked perfectly (never had an issue) on dpdk-2.2.0, while= if >>>>>> I use more than 1 thread doing these operations on dpdk-17.11 it hap= pens >>>>>> that after some times the "new_work" pointer is not a good one, and = the >>>>>> application crashes when using that pointer. >>>>>>=20 >>>>>> It seems that these lines cannot be used by more than one thread >>>>>> simultaneously. I also used many 2017 and 2018 dpdk versions without >>>>>> success. >>>>>>=20 >>>>>> Is this code possible on the new dpdk versions? Or have I to change = my >>>>>> application so that this code is called just by one lcore at a time? >>>>>>=20 >>>> Assuming the mempool is properly initialized, I don't see any reason >>>> why it would not work. There has been a lot of changes in mempool betw= een >>>> dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the same. >>>>=20 >>>> If the comments above do not help to solve the issue, it could be help= ful >>>> to try to reproduce the issue in a minimal program, so we can help to >>>> review it. >>>>=20 >>>> Regards, >>>> Olivier >>>>=20 >> Regards, >> Keith >>=20 >>=20 >>=20 >>=20 > Regards, >=20 > Matteo >=20 Regards, Keith