DPDK patches and discussions
* [dpdk-dev] Multi-thread mempool usage
@ 2018-08-08  9:56 Matteo Lanzuisi
  2018-08-13 13:20 ` Matteo Lanzuisi
  0 siblings, 1 reply; 13+ messages in thread
From: Matteo Lanzuisi @ 2018-08-08  9:56 UTC (permalink / raw)
  To: dev

Hi all,

I recently began using the "dpdk-17.11-11.el7.x86_64" rpm (Red Hat rpm) on
Red Hat 7.5, kernel 3.10.0-862.6.3.el7.x86_64, while porting an
application from RH6 to RH7. On RH6 I used dpdk-2.2.0.

This application is made up of one or more threads (each one on a
different logical core) reading packets from i40e interfaces.

Each thread may run the following code when it receives a specific
packet:

RTE_LCORE_FOREACH(lcore_id)
{
         // mempools are created one for each logical core
         result = rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool,
                                  (VOID_P *) &new_work);

         // debug print: on my server this should never happen, but with
         // multiple threads it always happens on the last logical core!!!!
         if (((uint64_t)(new_work)) < 0x7f0000000000)
             printf("Result %d, lcore di partenza %u, lcore di ricezione %u, pointer %p\n",
                    result, rte_lcore_id(), lcore_id, new_work);

         if (result == 0)
         {
             // usage of the memory obtained from the mempool...
             // <<<<<- here is where the application crashes!!!!
             new_work->command = command;

             // enqueues the obtained buffer on the rings of all lcores
             result = rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
                                       (VOID_P) new_work);
             // check on the result value ...
         }
         else
         {
             // do something if result != 0 ...
         }
}
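
(For context, the receiving lcore later dequeues the work item from its own
ring and, when done, returns it to the same per-lcore mempool; roughly like
this, just a sketch with the real processing omitted and the variable names
made up:)

void *work = NULL;
unsigned int me = rte_lcore_id();

// consumer side (sketch): take a command from this lcore's ring,
// process it, then give the buffer back to this lcore's mempool
if (rte_ring_dequeue(cea_main_lcore_conf[me].de_conf.cmd_ring, &work) == 0)
{
    // ... process ((CEA_DECODE_CMD_T *) work)->command ...
    rte_mempool_put(cea_main_lcore_conf[me].de_conf.cmd_pool, work);
}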

This code worked perfectly (I never had an issue) on dpdk-2.2.0, but if
I use more than one thread doing these operations on dpdk-17.11, after
some time the "new_work" pointer is no longer a good one, and the
application crashes when that pointer is used.

It seems that these lines cannot be used by more than one thread
simultaneously. I also tried many 2017 and 2018 DPDK versions without
success.

Is this code valid on the newer DPDK versions, or do I have to change my
application so that this code is called by only one lcore at a time?

Matteo


* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-08  9:56 [dpdk-dev] Multi-thread mempool usage Matteo Lanzuisi
@ 2018-08-13 13:20 ` Matteo Lanzuisi
  2018-08-13 21:54   ` Olivier Matz
  0 siblings, 1 reply; 13+ messages in thread
From: Matteo Lanzuisi @ 2018-08-13 13:20 UTC (permalink / raw)
  To: dev

Any suggestions? Any ideas about this behaviour?

On 08/08/2018 11:56, Matteo Lanzuisi wrote:
> Hi all,
>
> recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on 
> RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an 
> application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
>
> This application is made up by one or more threads (each one on a 
> different logical core) reading packets from i40e interfaces.
>
> Each thread can call the following code lines when receiving a 
> specific packet:
>
> RTE_LCORE_FOREACH(lcore_id)
> {
>         result = 
> rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, 
> (VOID_P *) &new_work);        // mempools are created one for each 
> logical core
>         if (((uint64_t)(new_work)) < 0x7f0000000000)
>             printf("Result %d, lcore di partenza %u, lcore di 
> ricezione %u, pointer %p\n", result, rte_lcore_id(), lcore_id, 
> new_work);    // debug print, on my server it should never happen but 
> with multi-thread happens always on the last logical core!!!!
>         if (result == 0)
>         {
>             new_work->command = command; // usage of the memory gotten 
> from the mempool... <<<<<- here is where the application crashes!!!!
>             result = 
> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring, 
> (VOID_P) new_work);    // enqueues the gotten buffer on the rings of 
> all lcores
>             // check on result value ...
>         }
>         else
>         {
>             // do something if result != 0 ...
>         }
> }
>
> This code worked perfectly (never had an issue) on dpdk-2.2.0, while 
> if I use more than 1 thread doing these operations on dpdk-17.11 it 
> happens that after some times the "new_work" pointer is not a good 
> one, and the application crashes when using that pointer.
>
> It seems that these lines cannot be used by more than one thread 
> simultaneously. I also used many 2017 and 2018 dpdk versions without 
> success.
>
> Is this code possible on the new dpdk versions? Or have I to change my 
> application so that this code is called just by one lcore at a time?
>
> Matteo
>
>
>



* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-13 13:20 ` Matteo Lanzuisi
@ 2018-08-13 21:54   ` Olivier Matz
  2018-08-20 14:47     ` Matteo Lanzuisi
  0 siblings, 1 reply; 13+ messages in thread
From: Olivier Matz @ 2018-08-13 21:54 UTC (permalink / raw)
  To: Matteo Lanzuisi; +Cc: dev

Hello Matteo,

On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:
> Any suggestion? any idea about this behaviour?
> 
> Il 08/08/2018 11:56, Matteo Lanzuisi ha scritto:
> > Hi all,
> > 
> > recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on
> > RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an
> > application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
> > 
> > This application is made up by one or more threads (each one on a
> > different logical core) reading packets from i40e interfaces.
> > 
> > Each thread can call the following code lines when receiving a specific
> > packet:
> > 
> > RTE_LCORE_FOREACH(lcore_id)
> > {
> >         result =
> > rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P
> > *) &new_work);        // mempools are created one for each logical core
> >         if (((uint64_t)(new_work)) < 0x7f0000000000)
> >             printf("Result %d, lcore di partenza %u, lcore di ricezione
> > %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    //
> > debug print, on my server it should never happen but with multi-thread
> > happens always on the last logical core!!!!

Here, checking the value of new_work before ensuring that result == 0
looks wrong to me. At least, new_work should be set to NULL before
calling rte_mempool_get().
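
Something like this (just a sketch; "pool" stands for your per-lcore mempool):

    void *new_work = NULL;                      /* initialize before the get */
    int ret = rte_mempool_get(pool, &new_work);

    if (ret == 0 && new_work != NULL) {
            /* only dereference new_work here, after checking ret */
    } else {
            /* pool is empty (ret == -ENOENT): do not touch new_work */
    }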

> >         if (result == 0)
> >         {
> >             new_work->command = command; // usage of the memory gotten
> > from the mempool... <<<<<- here is where the application crashes!!!!

Do you know why it crashes? Is it that new_work is NULL?

Can you check how the mempool is initialized? It should be multi-consumer
and, depending on your use case, single- or multi-producer.

Another thing that could be checked: at all the places where you
return your work object to the mempool, you should add a check
that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG
could do the trick: it adds some additional checks when doing
mempool operations.
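
To illustrate the flags (just a sketch, not your code): with flags = 0 the
pool is created multi-producer/multi-consumer, which is what you need when
several lcores get/put on the same pool; MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET
would make it single-producer/single-consumer and unsafe in that case.

#include <rte_mempool.h>
#include <rte_lcore.h>

/* sketch only, names made up */
static struct rte_mempool *
create_cmd_pool(const char *name, unsigned int n, unsigned int elt_size)
{
        return rte_mempool_create(name, n, elt_size,
                                  0,          /* cache_size */
                                  0,          /* private_data_size */
                                  NULL, NULL, /* no pool constructor needed for plain structs */
                                  NULL, NULL, /* no per-object constructor needed */
                                  rte_socket_id(),
                                  0);         /* flags: 0 = multi-producer/multi-consumer */
}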

> >             result =
> > rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
> > (VOID_P) new_work);    // enqueues the gotten buffer on the rings of all
> > lcores
> >             // check on result value ...
> >         }
> >         else
> >         {
> >             // do something if result != 0 ...
> >         }
> > }
> > 
> > This code worked perfectly (never had an issue) on dpdk-2.2.0, while if
> > I use more than 1 thread doing these operations on dpdk-17.11 it happens
> > that after some times the "new_work" pointer is not a good one, and the
> > application crashes when using that pointer.
> > 
> > It seems that these lines cannot be used by more than one thread
> > simultaneously. I also used many 2017 and 2018 dpdk versions without
> > success.
> > 
> > Is this code possible on the new dpdk versions? Or have I to change my
> > application so that this code is called just by one lcore at a time?

Assuming the mempool is properly initialized, I don't see any reason
why it would not work. There have been a lot of changes in mempool between
dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the same.

If the comments above do not help to solve the issue, it could be helpful
to try to reproduce the issue in a minimal program, so we can help to
review it.

Regards,
Olivier


* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-13 21:54   ` Olivier Matz
@ 2018-08-20 14:47     ` Matteo Lanzuisi
  2018-08-20 16:03       ` Wiles, Keith
  0 siblings, 1 reply; 13+ messages in thread
From: Matteo Lanzuisi @ 2018-08-20 14:47 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev

Hello Olivier,

On 13/08/2018 23:54, Olivier Matz wrote:
> Hello Matteo,
>
> On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:
>> Any suggestion? any idea about this behaviour?
>>
>> Il 08/08/2018 11:56, Matteo Lanzuisi ha scritto:
>>> Hi all,
>>>
>>> recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on
>>> RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an
>>> application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
>>>
>>> This application is made up by one or more threads (each one on a
>>> different logical core) reading packets from i40e interfaces.
>>>
>>> Each thread can call the following code lines when receiving a specific
>>> packet:
>>>
>>> RTE_LCORE_FOREACH(lcore_id)
>>> {
>>>          result =
>>> rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P
>>> *) &new_work);        // mempools are created one for each logical core
>>>          if (((uint64_t)(new_work)) < 0x7f0000000000)
>>>              printf("Result %d, lcore di partenza %u, lcore di ricezione
>>> %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    //
>>> debug print, on my server it should never happen but with multi-thread
>>> happens always on the last logical core!!!!
> Here, checking the value of new_work looks wrong to me, before
> ensuring that result == 0. At least, new_work should be set to
> NULL before calling rte_mempool_get().
I moved the check after result == 0, and just before the rte_mempool_get()
I set new_work to NULL, but nothing changed.
The first time something goes wrong, the print is:

Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002

Sorry for the Italian-language print :) it means that the application is
sending a message from logical core 1 to logical core 2, it's the 635th
time, the result is 0 and the pointer is 0x880002, while all the pointers
before were 0x7ffxxxxxx.
One strange thing is that this behaviour always happens from logical
core 1 to logical core 2 when the counter is 635!!! (Sending messages
from 2 to 1, or 1 to 1, or 2 to 2 is all ok.)
Another strange thing is that the pointers from counter 636 to 640 are NULL,
and from 641 they are good again... as you can see below (I attached the
result of a test without the "if" check on the value of new_work, and only
for messages from lcore 1 to lcore 2):

Result 0, lcore di partenza 1, lcore di ricezione 2, counter 627, pointer 0x7ffe8a261880
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 628, pointer 0x7ffe8a261900
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 629, pointer 0x7ffe8a261980
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 630, pointer 0x7ffe8a261a00
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 631, pointer 0x7ffe8a261a80
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 632, pointer 0x7ffe8a261b00
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 633, pointer 0x7ffe8a261b80
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 634, pointer 0x7ffe8a261c00
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 636, pointer (nil)
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 637, pointer (nil)
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 638, pointer (nil)
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 639, pointer (nil)
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 640, pointer (nil)
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 641, pointer 0x7ffe8a262b00
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 642, pointer 0x7ffe8a262b80
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 643, pointer 0x7ffe8a262d00
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 644, pointer 0x7ffe8a262d80
Result 0, lcore di partenza 1, lcore di ricezione 2, counter 645, pointer 0x7ffe8a262e00

>
>>>          if (result == 0)
>>>          {
>>>              new_work->command = command; // usage of the memory gotten
>>> from the mempool... <<<<<- here is where the application crashes!!!!
> Do you know why it crashes? Is it that new_work is NULL?
The pointer is not NULL, but it is not sequential with the others (0x880002,
as written before in this email). It seems to be in a memory zone that is
not in the DPDK hugepages, or something similar.
If I use this pointer the application crashes.
>
> Can you check how the mempool is initialized? It should be in multi
> consumer and depending on your use case, single or multi producer.
Here is the initialization of this mempool:

cea_main_cmd_pool[i] = rte_mempool_create(pool_name,
             (unsigned int) (ikco_cmd_buffers - 1), // 65536 - 1 in this case
             sizeof (CEA_DECODE_CMD_T),             // 24 bytes
             0, 0,
             rte_pktmbuf_pool_init, NULL,
             rte_pktmbuf_init, NULL,
             rte_socket_id(), 0);
>
> Another thing that could be checked: at all the places where you
> return your work object to the mempool, you should add a check
> that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG
> could do the trick: it adds some additional checks when doing
> mempool operations.
I think I have already answered this point with the prints earlier in this email.

What do you think about this behaviour?

Regards,
Matteo
>
>>>              result =
>>> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
>>> (VOID_P) new_work);    // enqueues the gotten buffer on the rings of all
>>> lcores
>>>              // check on result value ...
>>>          }
>>>          else
>>>          {
>>>              // do something if result != 0 ...
>>>          }
>>> }
>>>
>>> This code worked perfectly (never had an issue) on dpdk-2.2.0, while if
>>> I use more than 1 thread doing these operations on dpdk-17.11 it happens
>>> that after some times the "new_work" pointer is not a good one, and the
>>> application crashes when using that pointer.
>>>
>>> It seems that these lines cannot be used by more than one thread
>>> simultaneously. I also used many 2017 and 2018 dpdk versions without
>>> success.
>>>
>>> Is this code possible on the new dpdk versions? Or have I to change my
>>> application so that this code is called just by one lcore at a time?
> Assuming the mempool is properly initialized, I don't see any reason
> why it would not work. There has been a lot of changes in mempool between
> dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the same.
>
> If the comments above do not help to solve the issue, it could be helpful
> to try to reproduce the issue in a minimal program, so we can help to
> review it.
>
> Regards,
> Olivier
>
>


* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-20 14:47     ` Matteo Lanzuisi
@ 2018-08-20 16:03       ` Wiles, Keith
  2018-08-21 12:01         ` Matteo Lanzuisi
  0 siblings, 1 reply; 13+ messages in thread
From: Wiles, Keith @ 2018-08-20 16:03 UTC (permalink / raw)
  To: Matteo Lanzuisi; +Cc: Olivier Matz, dev



> On Aug 20, 2018, at 9:47 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
> 
> Hello Olivier,
> 
> Il 13/08/2018 23:54, Olivier Matz ha scritto:
>> Hello Matteo,
>> 
>> On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:
>>> Any suggestion? any idea about this behaviour?
>>> 
>>> Il 08/08/2018 11:56, Matteo Lanzuisi ha scritto:
>>>> Hi all,
>>>> 
>>>> recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on
>>>> RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an
>>>> application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
>>>> 
>>>> This application is made up by one or more threads (each one on a
>>>> different logical core) reading packets from i40e interfaces.
>>>> 
>>>> Each thread can call the following code lines when receiving a specific
>>>> packet:
>>>> 
>>>> RTE_LCORE_FOREACH(lcore_id)
>>>> {
>>>>         result =
>>>> rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P
>>>> *) &new_work);        // mempools are created one for each logical core
>>>>         if (((uint64_t)(new_work)) < 0x7f0000000000)
>>>>             printf("Result %d, lcore di partenza %u, lcore di ricezione
>>>> %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    //
>>>> debug print, on my server it should never happen but with multi-thread
>>>> happens always on the last logical core!!!!
>> Here, checking the value of new_work looks wrong to me, before
>> ensuring that result == 0. At least, new_work should be set to
>> NULL before calling rte_mempool_get().
> I put the check after result == 0, and just before the rte_mempool_get() I set new_work to NULL, but nothing changed.
> The first time something goes wrong the print is
> 
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
> 
> Sorry for the italian language print :) it means that application is sending a message from the logical core 1 to the logical core 2, it's the 635th time, the result is 0 and the pointer is 0x880002 while all pointers before were 0x7ffxxxxxx.
> One strange thing is that this behaviour happens always from the logical core 1 to the logical core 2 when the counter is 635!!! (Sending messages from 2 to 1 or 1 to 1 or 2 to 2 is all ok)
> Another strange thing is that pointers from counter 636 to 640 are NULL, and from 641 begin again to be good... as you can see here following (I attached the result of a test without the "if" of the check on the value of new_work, and only for messages from the lcore 1 to lcore 2)
> 
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 627, pointer 0x7ffe8a261880
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 628, pointer 0x7ffe8a261900
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 629, pointer 0x7ffe8a261980
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 630, pointer 0x7ffe8a261a00
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 631, pointer 0x7ffe8a261a80
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 632, pointer 0x7ffe8a261b00
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 633, pointer 0x7ffe8a261b80
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 634, pointer 0x7ffe8a261c00
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 636, pointer (nil)
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 637, pointer (nil)
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 638, pointer (nil)
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 639, pointer (nil)
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 640, pointer (nil)

This sure does seem like a memory overwrite problem, maybe with a memset(0) in the mix as well. Have you tried using hardware breakpoints to catch 0x880002 or 0x00 being written into this range?

> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 641, pointer 0x7ffe8a262b00
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 642, pointer 0x7ffe8a262b80
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 643, pointer 0x7ffe8a262d00
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 644, pointer 0x7ffe8a262d80
> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 645, pointer 0x7ffe8a262e00
> 
>> 
>>>>         if (result == 0)
>>>>         {
>>>>             new_work->command = command; // usage of the memory gotten
>>>> from the mempool... <<<<<- here is where the application crashes!!!!
>> Do you know why it crashes? Is it that new_work is NULL?
> The pointer is not NULL but is not sequential to the others (0x880002 as written before in this email). It seems to be in a memory zone not in DPDK hugepages or something similar.
> If I use this pointer the application crashes.
>> 
>> Can you check how the mempool is initialized? It should be in multi
>> consumer and depending on your use case, single or multi producer.
> Here is the initialization of this mempool
> 
> cea_main_cmd_pool[i] = rte_mempool_create(pool_name,
>             (unsigned int) (ikco_cmd_buffers - 1), // 65536 - 1 in this case
>             sizeof (CEA_DECODE_CMD_T), // 24 bytes
>             0, 0,
>             rte_pktmbuf_pool_init, NULL,
>             rte_pktmbuf_init, NULL,
>             rte_socket_id(), 0);
>> 
>> Another thing that could be checked: at all the places where you
>> return your work object to the mempool, you should add a check
>> that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG
>> could do the trick: it adds some additional checks when doing
>> mempool operations.
> I think I have already answered this point with the prints up in the email.
> 
> What do you think about this behaviour?
> 
> Regards,
> Matteo
>> 
>>>>             result =
>>>> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
>>>> (VOID_P) new_work);    // enqueues the gotten buffer on the rings of all
>>>> lcores
>>>>             // check on result value ...
>>>>         }
>>>>         else
>>>>         {
>>>>             // do something if result != 0 ...
>>>>         }
>>>> }
>>>> 
>>>> This code worked perfectly (never had an issue) on dpdk-2.2.0, while if
>>>> I use more than 1 thread doing these operations on dpdk-17.11 it happens
>>>> that after some times the "new_work" pointer is not a good one, and the
>>>> application crashes when using that pointer.
>>>> 
>>>> It seems that these lines cannot be used by more than one thread
>>>> simultaneously. I also used many 2017 and 2018 dpdk versions without
>>>> success.
>>>> 
>>>> Is this code possible on the new dpdk versions? Or have I to change my
>>>> application so that this code is called just by one lcore at a time?
>> Assuming the mempool is properly initialized, I don't see any reason
>> why it would not work. There has been a lot of changes in mempool between
>> dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the same.
>> 
>> If the comments above do not help to solve the issue, it could be helpful
>> to try to reproduce the issue in a minimal program, so we can help to
>> review it.
>> 
>> Regards,
>> Olivier

Regards,
Keith


* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-20 16:03       ` Wiles, Keith
@ 2018-08-21 12:01         ` Matteo Lanzuisi
  2018-08-21 12:17           ` Wiles, Keith
  0 siblings, 1 reply; 13+ messages in thread
From: Matteo Lanzuisi @ 2018-08-21 12:01 UTC (permalink / raw)
  To: Wiles, Keith; +Cc: Olivier Matz, dev

Hi

On 20/08/2018 18:03, Wiles, Keith wrote:
>
>> On Aug 20, 2018, at 9:47 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
>>
>> Hello Olivier,
>>
>> Il 13/08/2018 23:54, Olivier Matz ha scritto:
>>> Hello Matteo,
>>>
>>> On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:
>>>> Any suggestion? any idea about this behaviour?
>>>>
>>>> Il 08/08/2018 11:56, Matteo Lanzuisi ha scritto:
>>>>> Hi all,
>>>>>
>>>>> recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on
>>>>> RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an
>>>>> application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
>>>>>
>>>>> This application is made up by one or more threads (each one on a
>>>>> different logical core) reading packets from i40e interfaces.
>>>>>
>>>>> Each thread can call the following code lines when receiving a specific
>>>>> packet:
>>>>>
>>>>> RTE_LCORE_FOREACH(lcore_id)
>>>>> {
>>>>>          result =
>>>>> rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P
>>>>> *) &new_work);        // mempools are created one for each logical core
>>>>>          if (((uint64_t)(new_work)) < 0x7f0000000000)
>>>>>              printf("Result %d, lcore di partenza %u, lcore di ricezione
>>>>> %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    //
>>>>> debug print, on my server it should never happen but with multi-thread
>>>>> happens always on the last logical core!!!!
>>> Here, checking the value of new_work looks wrong to me, before
>>> ensuring that result == 0. At least, new_work should be set to
>>> NULL before calling rte_mempool_get().
>> I put the check after result == 0, and just before the rte_mempool_get() I set new_work to NULL, but nothing changed.
>> The first time something goes wrong the print is
>>
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>
>> Sorry for the italian language print :) it means that application is sending a message from the logical core 1 to the logical core 2, it's the 635th time, the result is 0 and the pointer is 0x880002 while all pointers before were 0x7ffxxxxxx.
>> One strange thing is that this behaviour happens always from the logical core 1 to the logical core 2 when the counter is 635!!! (Sending messages from 2 to 1 or 1 to 1 or 2 to 2 is all ok)
>> Another strange thing is that pointers from counter 636 to 640 are NULL, and from 641 begin again to be good... as you can see here following (I attached the result of a test without the "if" of the check on the value of new_work, and only for messages from the lcore 1 to lcore 2)
>>
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 627, pointer 0x7ffe8a261880
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 628, pointer 0x7ffe8a261900
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 629, pointer 0x7ffe8a261980
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 630, pointer 0x7ffe8a261a00
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 631, pointer 0x7ffe8a261a80
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 632, pointer 0x7ffe8a261b00
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 633, pointer 0x7ffe8a261b80
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 634, pointer 0x7ffe8a261c00
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 636, pointer (nil)
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 637, pointer (nil)
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 638, pointer (nil)
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 639, pointer (nil)
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 640, pointer (nil)
> This sure does seem like a memory over write problem, with maybe a memset(0) in the mix as well. Have you tried using hardware break points with the 0x880002 or 0x00 being written into this range?
I set some hardware watchpoints and found this:

1 - using pointer 0x880002, the output is (the pointer falls in the middle
of two rwlocks):

(gdb) awatch *0x880002
Hardware access (read/write) watchpoint 1: *0x880002
(gdb) c
Continuing.
[New Thread 0x7fffeded5700 (LWP 19969)]
[New Thread 0x7fffed6d4700 (LWP 19970)]
[New Thread 0x7fffeced3700 (LWP 19971)]
[New Thread 0x7fffec6d2700 (LWP 19972)]
[New Thread 0x7fffebed1700 (LWP 19973)]
[New Thread 0x7fffeb6d0700 (LWP 19974)]
Hardware access (read/write) watchpoint 1: *0x880002

Value = 0
rte_rwlock_init (rwl=0x880000 <ikco_sdkif_actlist_lock+677024>)
     at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
81    }
(gdb) c
Continuing.
Hardware access (read/write) watchpoint 1: *0x880002

Value = 0
rte_rwlock_init (rwl=0x880004 <ikco_sdkif_actlist_lock+677028>)
     at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
81    }
(gdb) c
Continuing.

2 - when using pointers less than or equal to 0x7ffe8a261d64 (in the range
of the mempool), gdb reports nothing about them (I don't use them, I just
get them from the pool and then put them back in the pool);

3 - when using pointer 0x7ffe8a261d65 or greater, this is the output of gdb:

(gdb) awatch *(int *)0x7ffe8a261d65
Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
(gdb) c
Continuing.
[New Thread 0x7fffeded5700 (LWP 17689)]
[New Thread 0x7fffed6d4700 (LWP 17690)]
[New Thread 0x7fffeced3700 (LWP 17691)]
[New Thread 0x7fffec6d2700 (LWP 17692)]
[New Thread 0x7fffebed1700 (LWP 17693)]
[New Thread 0x7fffeb6d0700 (LWP 17694)]
Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65

Value = 0
0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
     iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
(gdb) where
#0  0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
     iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
#1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
     iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
#2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
#3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
     elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
     mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
     obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
#4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
     at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
(gdb) c
Continuing.
Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65

Old value = 0
New value = -402653184
0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
     iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
(gdb) where
#0  0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
     iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
#1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
     iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
#2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
#3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
     elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
     mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
     obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
#4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
     at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
(gdb) c
Continuing.

What do you think? Is it normal that mempool_add_elem is called only for
certain pointers of the mempool?
I attached the initialization of the mempool. Can this be wrong?

>
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 641, pointer 0x7ffe8a262b00
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 642, pointer 0x7ffe8a262b80
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 643, pointer 0x7ffe8a262d00
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 644, pointer 0x7ffe8a262d80
>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 645, pointer 0x7ffe8a262e00
>>
>>>>>          if (result == 0)
>>>>>          {
>>>>>              new_work->command = command; // usage of the memory gotten
>>>>> from the mempool... <<<<<- here is where the application crashes!!!!
>>> Do you know why it crashes? Is it that new_work is NULL?
>> The pointer is not NULL but is not sequential to the others (0x880002 as written before in this email). It seems to be in a memory zone not in DPDK hugepages or something similar.
>> If I use this pointer the application crashes.
>>> Can you check how the mempool is initialized? It should be in multi
>>> consumer and depending on your use case, single or multi producer.
>> Here is the initialization of this mempool
>>
>> cea_main_cmd_pool[i] = rte_mempool_create(pool_name,
>>              (unsigned int) (ikco_cmd_buffers - 1), // 65536 - 1 in this case
>>              sizeof (CEA_DECODE_CMD_T), // 24 bytes
>>              0, 0,
>>              rte_pktmbuf_pool_init, NULL,
>>              rte_pktmbuf_init, NULL,
>>              rte_socket_id(), 0);
>>> Another thing that could be checked: at all the places where you
>>> return your work object to the mempool, you should add a check
>>> that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG
>>> could do the trick: it adds some additional checks when doing
>>> mempool operations.
>> I think I have already answered this point with the prints up in the email.
>>
>> What do you think about this behaviour?
>>
>> Regards,
>> Matteo
>>>>>              result =
>>>>> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
>>>>> (VOID_P) new_work);    // enqueues the gotten buffer on the rings of all
>>>>> lcores
>>>>>              // check on result value ...
>>>>>          }
>>>>>          else
>>>>>          {
>>>>>              // do something if result != 0 ...
>>>>>          }
>>>>> }
>>>>>
>>>>> This code worked perfectly (never had an issue) on dpdk-2.2.0, while if
>>>>> I use more than 1 thread doing these operations on dpdk-17.11 it happens
>>>>> that after some times the "new_work" pointer is not a good one, and the
>>>>> application crashes when using that pointer.
>>>>>
>>>>> It seems that these lines cannot be used by more than one thread
>>>>> simultaneously. I also used many 2017 and 2018 dpdk versions without
>>>>> success.
>>>>>
>>>>> Is this code possible on the new dpdk versions? Or have I to change my
>>>>> application so that this code is called just by one lcore at a time?
>>> Assuming the mempool is properly initialized, I don't see any reason
>>> why it would not work. There has been a lot of changes in mempool between
>>> dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the same.
>>>
>>> If the comments above do not help to solve the issue, it could be helpful
>>> to try to reproduce the issue in a minimal program, so we can help to
>>> review it.
>>>
>>> Regards,
>>> Olivier
> Regards,
> Keith
>
>
>
Regards,

Matteo


* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-21 12:01         ` Matteo Lanzuisi
@ 2018-08-21 12:17           ` Wiles, Keith
  2018-08-21 12:44             ` Matteo Lanzuisi
  0 siblings, 1 reply; 13+ messages in thread
From: Wiles, Keith @ 2018-08-21 12:17 UTC (permalink / raw)
  To: Matteo Lanzuisi; +Cc: Olivier Matz, dev



> On Aug 21, 2018, at 7:01 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
> 
> Hi
> 
> Il 20/08/2018 18:03, Wiles, Keith ha scritto:
>>> On Aug 20, 2018, at 9:47 AM, Matteo Lanzuisi <m.lanzuisi@resi.it>
>>>  wrote:
>>> 
>>> Hello Olivier,
>>> 
>>> Il 13/08/2018 23:54, Olivier Matz ha scritto:
>>> 
>>>> Hello Matteo,
>>>> 
>>>> On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:
>>>> 
>>>>> Any suggestion? any idea about this behaviour?
>>>>> 
>>>>> Il 08/08/2018 11:56, Matteo Lanzuisi ha scritto:
>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on
>>>>>> RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an
>>>>>> application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
>>>>>> 
>>>>>> This application is made up by one or more threads (each one on a
>>>>>> different logical core) reading packets from i40e interfaces.
>>>>>> 
>>>>>> Each thread can call the following code lines when receiving a specific
>>>>>> packet:
>>>>>> 
>>>>>> RTE_LCORE_FOREACH(lcore_id)
>>>>>> {
>>>>>>         result =
>>>>>> rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P
>>>>>> *) &new_work);        // mempools are created one for each logical core
>>>>>>         if (((uint64_t)(new_work)) < 0x7f0000000000)
>>>>>>             printf("Result %d, lcore di partenza %u, lcore di ricezione
>>>>>> %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    //
>>>>>> debug print, on my server it should never happen but with multi-thread
>>>>>> happens always on the last logical core!!!!
>>>>>> 
>>>> Here, checking the value of new_work looks wrong to me, before
>>>> ensuring that result == 0. At least, new_work should be set to
>>>> NULL before calling rte_mempool_get().
>>>> 
>>> I put the check after result == 0, and just before the rte_mempool_get() I set new_work to NULL, but nothing changed.
>>> The first time something goes wrong the print is
>>> 
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>> 
>>> Sorry for the italian language print :) it means that application is sending a message from the logical core 1 to the logical core 2, it's the 635th time, the result is 0 and the pointer is 0x880002 while all pointers before were 0x7ffxxxxxx.
>>> One strange thing is that this behaviour happens always from the logical core 1 to the logical core 2 when the counter is 635!!! (Sending messages from 2 to 1 or 1 to 1 or 2 to 2 is all ok)
>>> Another strange thing is that pointers from counter 636 to 640 are NULL, and from 641 begin again to be good... as you can see here following (I attached the result of a test without the "if" of the check on the value of new_work, and only for messages from the lcore 1 to lcore 2)
>>> 
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 627, pointer 0x7ffe8a261880
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 628, pointer 0x7ffe8a261900
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 629, pointer 0x7ffe8a261980
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 630, pointer 0x7ffe8a261a00
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 631, pointer 0x7ffe8a261a80
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 632, pointer 0x7ffe8a261b00
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 633, pointer 0x7ffe8a261b80
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 634, pointer 0x7ffe8a261c00
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 636, pointer (nil)
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 637, pointer (nil)
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 638, pointer (nil)
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 639, pointer (nil)
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 640, pointer (nil)
>>> 
>> This sure does seem like a memory over write problem, with maybe a memset(0) in the mix as well. Have you tried using hardware break points with the 0x880002 or 0x00 being written into this range?
> I put some breakpoints and found this:
> 
> 1 - using pointer 0x880002, the output is (the pointer comes in the middle of two rwlock):
> 
> (gdb) awatch *0x880002
> Hardware access (read/write) watchpoint 1: *0x880002
> (gdb) c
> Continuing.
> [New Thread 0x7fffeded5700 (LWP 19969)]
> [New Thread 0x7fffed6d4700 (LWP 19970)]
> [New Thread 0x7fffeced3700 (LWP 19971)]
> [New Thread 0x7fffec6d2700 (LWP 19972)]
> [New Thread 0x7fffebed1700 (LWP 19973)]
> [New Thread 0x7fffeb6d0700 (LWP 19974)]
> Hardware access (read/write) watchpoint 1: *0x880002
> 
> Value = 0
> rte_rwlock_init (rwl=0x880000 <ikco_sdkif_actlist_lock+677024>)
>     at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
> 81    }
> (gdb) c
> Continuing.
> Hardware access (read/write) watchpoint 1: *0x880002

These are most likely false positive hits and not the issue.
> 
> Value = 0
> rte_rwlock_init (rwl=0x880004 <ikco_sdkif_actlist_lock+677028>)
>     at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
> 81    }
> (gdb) c
> Continuing.
> 
> 2 - when using pointers minor or equal than 0x7ffe8a261d64 (in the range of the mempool), gdb tells nothing about them (I don't use them, I just get them from the pool and the put them in the pool again);
> 
> 3 - when using pointer 0x7ffe8a261d65 or greater, this is the output of gdb:
> 
> (gdb) awatch *(int *)0x7ffe8a261d65
> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
> (gdb) c
> Continuing.
> [New Thread 0x7fffeded5700 (LWP 17689)]
> [New Thread 0x7fffed6d4700 (LWP 17690)]
> [New Thread 0x7fffeced3700 (LWP 17691)]
> [New Thread 0x7fffec6d2700 (LWP 17692)]
> [New Thread 0x7fffebed1700 (LWP 17693)]
> [New Thread 0x7fffeb6d0700 (LWP 17694)]
> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
> 
> Value = 0
> 0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80, 
>     iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
> (gdb) where
> #0  0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80, 
>     iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "", 
>     iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>, 
>     elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>, 
>     mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0, 
>     obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>     at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
> (gdb) c
> Continuing.
> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65

This seems to be just creating a pktmbuf pool. The STAILQ_INSERT_TAIL is just putting the mempool on the main tailq list for mempools in DPDK.

> 
> Old value = 0
> New value = -402653184
> 0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00, 
>     iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
> (gdb) where
> #0  0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00, 
>     iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "", 
>     iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>, 
>     elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>, 
>     mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0, 
>     obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>     at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
> (gdb) c
> Continuing.
> 
> What do you think? It is normal that the mempool_add_elem is called only on certain pointers of the mempool? 
> I attached the initialization of the mempool. Can this be wrong? 

All mempools with a cache size have two queues to put memory on: one is the per-lcore list, which is used as a fast-access queue. When the cache becomes empty, or holds more entries than it was created with, the extra entries are pushed to the main list of mempool data.

The only time that rwlock is touched is to get/put items on the main mempool.
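
In code terms it looks roughly like this (a sketch; "mp" is any pool created with a non-zero cache_size, while your cmd pool has cache_size = 0, so every get/put goes straight to the shared pool):

    void *obj = NULL;

    if (rte_mempool_get(mp, &obj) == 0) {  /* served from this lcore's cache when possible */
            /* ... use obj ... */
            rte_mempool_put(mp, obj);      /* goes back to the cache; it is flushed to the
                                            * shared pool only when the cache overflows */
    }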

Just as a data point, have you tried this app on 18.08 yet? I do not see the problem yet, sorry.

> 
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 641, pointer 0x7ffe8a262b00
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 642, pointer 0x7ffe8a262b80
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 643, pointer 0x7ffe8a262d00
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 644, pointer 0x7ffe8a262d80
>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 645, pointer 0x7ffe8a262e00
>>> 
>>> 
>>>>>>         if (result == 0)
>>>>>>         {
>>>>>>             new_work->command = command; // usage of the memory gotten
>>>>>> from the mempool... <<<<<- here is where the application crashes!!!!
>>>>>> 
>>>> Do you know why it crashes? Is it that new_work is NULL?
>>>> 
>>> The pointer is not NULL but is not sequential to the others (0x880002 as written before in this email). It seems to be in a memory zone not in DPDK hugepages or something similar.
>>> If I use this pointer the application crashes.
>>> 
>>>> Can you check how the mempool is initialized? It should be in multi
>>>> consumer and depending on your use case, single or multi producer.
>>>> 
>>> Here is the initialization of this mempool
>>> 
>>> cea_main_cmd_pool[i] = rte_mempool_create(pool_name,
>>>             (unsigned int) (ikco_cmd_buffers - 1), // 65536 - 1 in this case
>>>             sizeof (CEA_DECODE_CMD_T), // 24 bytes
>>>             0, 0,
>>>             rte_pktmbuf_pool_init, NULL,
>>>             rte_pktmbuf_init, NULL,
>>>             rte_socket_id(), 0);
>>> 
>>>> Another thing that could be checked: at all the places where you
>>>> return your work object to the mempool, you should add a check
>>>> that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG
>>>> could do the trick: it adds some additional checks when doing
>>>> mempool operations.
>>>> 
>>> I think I have already answered this point with the prints up in the email.
>>> 
>>> What do you think about this behaviour?
>>> 
>>> Regards,
>>> Matteo
>>> 
>>>>>>             result =
>>>>>> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
>>>>>> (VOID_P) new_work);    // enqueues the gotten buffer on the rings of all
>>>>>> lcores
>>>>>>             // check on result value ...
>>>>>>         }
>>>>>>         else
>>>>>>         {
>>>>>>             // do something if result != 0 ...
>>>>>>         }
>>>>>> }
>>>>>> 
>>>>>> This code worked perfectly (never had an issue) on dpdk-2.2.0, while if
>>>>>> I use more than 1 thread doing these operations on dpdk-17.11 it happens
>>>>>> that after some times the "new_work" pointer is not a good one, and the
>>>>>> application crashes when using that pointer.
>>>>>> 
>>>>>> It seems that these lines cannot be used by more than one thread
>>>>>> simultaneously. I also used many 2017 and 2018 dpdk versions without
>>>>>> success.
>>>>>> 
>>>>>> Is this code possible on the new dpdk versions? Or have I to change my
>>>>>> application so that this code is called just by one lcore at a time?
>>>>>> 
>>>> Assuming the mempool is properly initialized, I don't see any reason
>>>> why it would not work. There has been a lot of changes in mempool between
>>>> dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the same.
>>>> 
>>>> If the comments above do not help to solve the issue, it could be helpful
>>>> to try to reproduce the issue in a minimal program, so we can help to
>>>> review it.
>>>> 
>>>> Regards,
>>>> Olivier
>>>> 
>> Regards,
>> Keith
>> 
>> 
>> 
>> 
> Regards,
> 
> Matteo
> 

Regards,
Keith


* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-21 12:17           ` Wiles, Keith
@ 2018-08-21 12:44             ` Matteo Lanzuisi
  2018-08-21 12:51               ` Wiles, Keith
  0 siblings, 1 reply; 13+ messages in thread
From: Matteo Lanzuisi @ 2018-08-21 12:44 UTC (permalink / raw)
  To: Wiles, Keith; +Cc: Olivier Matz, dev

On 21/08/2018 14:17, Wiles, Keith wrote:
>
>> On Aug 21, 2018, at 7:01 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
>>
>> Hi
>>
>> Il 20/08/2018 18:03, Wiles, Keith ha scritto:
>>>> On Aug 20, 2018, at 9:47 AM, Matteo Lanzuisi <m.lanzuisi@resi.it>
>>>>   wrote:
>>>>
>>>> Hello Olivier,
>>>>
>>>> Il 13/08/2018 23:54, Olivier Matz ha scritto:
>>>>
>>>>> Hello Matteo,
>>>>>
>>>>> On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:
>>>>>
>>>>>> Any suggestion? any idea about this behaviour?
>>>>>>
>>>>>> Il 08/08/2018 11:56, Matteo Lanzuisi ha scritto:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on
>>>>>>> RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an
>>>>>>> application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
>>>>>>>
>>>>>>> This application is made up by one or more threads (each one on a
>>>>>>> different logical core) reading packets from i40e interfaces.
>>>>>>>
>>>>>>> Each thread can call the following code lines when receiving a specific
>>>>>>> packet:
>>>>>>>
>>>>>>> RTE_LCORE_FOREACH(lcore_id)
>>>>>>> {
>>>>>>>          result =
>>>>>>> rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P
>>>>>>> *) &new_work);        // mempools are created one for each logical core
>>>>>>>          if (((uint64_t)(new_work)) < 0x7f0000000000)
>>>>>>>              printf("Result %d, lcore di partenza %u, lcore di ricezione
>>>>>>> %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    //
>>>>>>> debug print, on my server it should never happen but with multi-thread
>>>>>>> happens always on the last logical core!!!!
>>>>>>>
>>>>> Here, checking the value of new_work looks wrong to me, before
>>>>> ensuring that result == 0. At least, new_work should be set to
>>>>> NULL before calling rte_mempool_get().
>>>>>
>>>> I put the check after result == 0, and just before the rte_mempool_get() I set new_work to NULL, but nothing changed.
>>>> The first time something goes wrong the print is
>>>>
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>>>
>>>> Sorry for the italian language print :) it means that application is sending a message from the logical core 1 to the logical core 2, it's the 635th time, the result is 0 and the pointer is 0x880002 while all pointers before were 0x7ffxxxxxx.
>>>> One strange thing is that this behaviour happens always from the logical core 1 to the logical core 2 when the counter is 635!!! (Sending messages from 2 to 1 or 1 to 1 or 2 to 2 is all ok)
>>>> Another strange thing is that pointers from counter 636 to 640 are NULL, and from 641 begin again to be good... as you can see here following (I attached the result of a test without the "if" of the check on the value of new_work, and only for messages from the lcore 1 to lcore 2)
>>>>
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 627, pointer 0x7ffe8a261880
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 628, pointer 0x7ffe8a261900
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 629, pointer 0x7ffe8a261980
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 630, pointer 0x7ffe8a261a00
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 631, pointer 0x7ffe8a261a80
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 632, pointer 0x7ffe8a261b00
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 633, pointer 0x7ffe8a261b80
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 634, pointer 0x7ffe8a261c00
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 636, pointer (nil)
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 637, pointer (nil)
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 638, pointer (nil)
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 639, pointer (nil)
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 640, pointer (nil)
>>>>
>>> This sure does seem like a memory over write problem, with maybe a memset(0) in the mix as well. Have you tried using hardware break points with the 0x880002 or 0x00 being written into this range?
>> I put some breakpoints and found this:
>>
>> 1 - using pointer 0x880002, the output is (the pointer comes in the middle of two rwlock):
>>
>> (gdb) awatch *0x880002
>> Hardware access (read/write) watchpoint 1: *0x880002
>> (gdb) c
>> Continuing.
>> [New Thread 0x7fffeded5700 (LWP 19969)]
>> [New Thread 0x7fffed6d4700 (LWP 19970)]
>> [New Thread 0x7fffeced3700 (LWP 19971)]
>> [New Thread 0x7fffec6d2700 (LWP 19972)]
>> [New Thread 0x7fffebed1700 (LWP 19973)]
>> [New Thread 0x7fffeb6d0700 (LWP 19974)]
>> Hardware access (read/write) watchpoint 1: *0x880002
>>
>> Value = 0
>> rte_rwlock_init (rwl=0x880000 <ikco_sdkif_actlist_lock+677024>)
>>      at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>> 81    }
>> (gdb) c
>> Continuing.
>> Hardware access (read/write) watchpoint 1: *0x880002
> These are most likely false positive hits and not the issue.
>> Value = 0
>> rte_rwlock_init (rwl=0x880004 <ikco_sdkif_actlist_lock+677028>)
>>      at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>> 81    }
>> (gdb) c
>> Continuing.
>>
>> 2 - when using pointers minor or equal than 0x7ffe8a261d64 (in the range of the mempool), gdb tells nothing about them (I don't use them, I just get them from the pool and the put them in the pool again);
>>
>> 3 - when using pointer 0x7ffe8a261d65 or greater, this is the output of gdb:
>>
>> (gdb) awatch *(int *)0x7ffe8a261d65
>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>> (gdb) c
>> Continuing.
>> [New Thread 0x7fffeded5700 (LWP 17689)]
>> [New Thread 0x7fffed6d4700 (LWP 17690)]
>> [New Thread 0x7fffeced3700 (LWP 17691)]
>> [New Thread 0x7fffec6d2700 (LWP 17692)]
>> [New Thread 0x7fffebed1700 (LWP 17693)]
>> [New Thread 0x7fffeb6d0700 (LWP 17694)]
>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>
>> Value = 0
>> 0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>>      iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
>> (gdb) where
>> #0  0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>>      iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>>      iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
>> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
>>      elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
>>      mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>>      obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>>      at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>> (gdb) c
>> Continuing.
>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
> This seems to be just creating a pktmbuf pool. The STAILQ_INSERT_TAILQ is just putting the mempool on the main tailq list for mempools in DPDK.
>
>> Old value = 0
>> New value = -402653184
>> 0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>>      iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
>> (gdb) where
>> #0  0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>>      iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>>      iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
>> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
>>      elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
>>      mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>>      obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>>      at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>> (gdb) c
>> Continuing.
>>
>> What do you think? It is normal that the mempool_add_elem is called only on certain pointers of the mempool?
>> I attached the initialization of the mempool. Can this be wrong?
> All mempools with a cache size will have two queue to put memory on, one is the per lcore list and that one is used as a fast access queue. When the cache becomes empty or has more entries then the cache was created with then it pushed the extra entries to the main list of mempool data.
Why do you say "mempools with a cache size"? In my initialization this 
mempool has cache_size = 0.
>
> The only time that rwlock is touched is to get/put items on the main mempool.
>
> Just as a data point have you tried this app on 18.08 yet? I do not see the problem yet, sorry.
I'll try 18.08 and let you know
>
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 641, pointer 0x7ffe8a262b00
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 642, pointer 0x7ffe8a262b80
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 643, pointer 0x7ffe8a262d00
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 644, pointer 0x7ffe8a262d80
>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 645, pointer 0x7ffe8a262e00
>>>>
>>>>
>>>>>>>          if (result == 0)
>>>>>>>          {
>>>>>>>              new_work->command = command; // usage of the memory gotten
>>>>>>> from the mempool... <<<<<- here is where the application crashes!!!!
>>>>>>>
>>>>> Do you know why it crashes? Is it that new_work is NULL?
>>>>>
>>>> The pointer is not NULL but is not sequential to the others (0x880002 as written before in this email). It seems to be in a memory zone not in DPDK hugepages or something similar.
>>>> If I use this pointer the application crashes.
>>>>
>>>>> Can you check how the mempool is initialized? It should be in multi
>>>>> consumer and depending on your use case, single or multi producer.
>>>>>
>>>> Here is the initialization of this mempool
>>>>
>>>> cea_main_cmd_pool[i] = rte_mempool_create(pool_name,
>>>>              (unsigned int) (ikco_cmd_buffers - 1), // 65536 - 1 in this case
>>>>              sizeof (CEA_DECODE_CMD_T), // 24 bytes
>>>>              0, 0,
>>>>              rte_pktmbuf_pool_init, NULL,
>>>>              rte_pktmbuf_init, NULL,
>>>>              rte_socket_id(), 0);
>>>>
>>>>> Another thing that could be checked: at all the places where you
>>>>> return your work object to the mempool, you should add a check
>>>>> that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG
>>>>> could do the trick: it adds some additional checks when doing
>>>>> mempool operations.
>>>>>
>>>> I think I have already answered this point with the prints up in the email.
>>>>
>>>> What do you think about this behaviour?
>>>>
>>>> Regards,
>>>> Matteo
>>>>
>>>>>>>              result =
>>>>>>> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
>>>>>>> (VOID_P) new_work);    // enqueues the gotten buffer on the rings of all
>>>>>>> lcores
>>>>>>>              // check on result value ...
>>>>>>>          }
>>>>>>>          else
>>>>>>>          {
>>>>>>>              // do something if result != 0 ...
>>>>>>>          }
>>>>>>> }
>>>>>>>
>>>>>>> This code worked perfectly (never had an issue) on dpdk-2.2.0, while if
>>>>>>> I use more than 1 thread doing these operations on dpdk-17.11 it happens
>>>>>>> that after some times the "new_work" pointer is not a good one, and the
>>>>>>> application crashes when using that pointer.
>>>>>>>
>>>>>>> It seems that these lines cannot be used by more than one thread
>>>>>>> simultaneously. I also used many 2017 and 2018 dpdk versions without
>>>>>>> success.
>>>>>>>
>>>>>>> Is this code possible on the new dpdk versions? Or have I to change my
>>>>>>> application so that this code is called just by one lcore at a time?
>>>>>>>
>>>>> Assuming the mempool is properly initialized, I don't see any reason
>>>>> why it would not work. There has been a lot of changes in mempool between
>>>>> dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the same.
>>>>>
>>>>> If the comments above do not help to solve the issue, it could be helpful
>>>>> to try to reproduce the issue in a minimal program, so we can help to
>>>>> review it.
>>>>>
>>>>> Regards,
>>>>> Olivier
>>>>>
>>> Regards,
>>> Keith
>>>
>>>
>>>
>>>
>> Regards,
>>
>> Matteo
>>
> Regards,
> Keith
>
>
>
Regards,
Matteo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-21 12:44             ` Matteo Lanzuisi
@ 2018-08-21 12:51               ` Wiles, Keith
  2018-08-21 14:46                 ` Matteo Lanzuisi
  0 siblings, 1 reply; 13+ messages in thread
From: Wiles, Keith @ 2018-08-21 12:51 UTC (permalink / raw)
  To: Matteo Lanzuisi; +Cc: Olivier Matz, dev



> On Aug 21, 2018, at 7:44 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
> 
> Il 21/08/2018 14:17, Wiles, Keith ha scritto:
>> 
>>> On Aug 21, 2018, at 7:01 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
>>> 
>>> Hi
>>> 
>>> Il 20/08/2018 18:03, Wiles, Keith ha scritto:
>>>>> On Aug 20, 2018, at 9:47 AM, Matteo Lanzuisi <m.lanzuisi@resi.it>
>>>>>  wrote:
>>>>> 
>>>>> Hello Olivier,
>>>>> 
>>>>> Il 13/08/2018 23:54, Olivier Matz ha scritto:
>>>>> 
>>>>>> Hello Matteo,
>>>>>> 
>>>>>> On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:
>>>>>> 
>>>>>>> Any suggestion? any idea about this behaviour?
>>>>>>> 
>>>>>>> Il 08/08/2018 11:56, Matteo Lanzuisi ha scritto:
>>>>>>> 
>>>>>>>> Hi all,
>>>>>>>> 
>>>>>>>> recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on
>>>>>>>> RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an
>>>>>>>> application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
>>>>>>>> 
>>>>>>>> This application is made up by one or more threads (each one on a
>>>>>>>> different logical core) reading packets from i40e interfaces.
>>>>>>>> 
>>>>>>>> Each thread can call the following code lines when receiving a specific
>>>>>>>> packet:
>>>>>>>> 
>>>>>>>> RTE_LCORE_FOREACH(lcore_id)
>>>>>>>> {
>>>>>>>>         result =
>>>>>>>> rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P
>>>>>>>> *) &new_work);        // mempools are created one for each logical core
>>>>>>>>         if (((uint64_t)(new_work)) < 0x7f0000000000)
>>>>>>>>             printf("Result %d, lcore di partenza %u, lcore di ricezione
>>>>>>>> %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    //
>>>>>>>> debug print, on my server it should never happen but with multi-thread
>>>>>>>> happens always on the last logical core!!!!
>>>>>>>> 
>>>>>> Here, checking the value of new_work looks wrong to me, before
>>>>>> ensuring that result == 0. At least, new_work should be set to
>>>>>> NULL before calling rte_mempool_get().
>>>>>> 
>>>>> I put the check after result == 0, and just before the rte_mempool_get() I set new_work to NULL, but nothing changed.
>>>>> The first time something goes wrong the print is
>>>>> 
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>>>> 
>>>>> Sorry for the italian language print :) it means that application is sending a message from the logical core 1 to the logical core 2, it's the 635th time, the result is 0 and the pointer is 0x880002 while all pointers before were 0x7ffxxxxxx.
>>>>> One strange thing is that this behaviour happens always from the logical core 1 to the logical core 2 when the counter is 635!!! (Sending messages from 2 to 1 or 1 to 1 or 2 to 2 is all ok)
>>>>> Another strange thing is that pointers from counter 636 to 640 are NULL, and from 641 begin again to be good... as you can see here following (I attached the result of a test without the "if" of the check on the value of new_work, and only for messages from the lcore 1 to lcore 2)
>>>>> 
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 627, pointer 0x7ffe8a261880
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 628, pointer 0x7ffe8a261900
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 629, pointer 0x7ffe8a261980
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 630, pointer 0x7ffe8a261a00
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 631, pointer 0x7ffe8a261a80
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 632, pointer 0x7ffe8a261b00
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 633, pointer 0x7ffe8a261b80
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 634, pointer 0x7ffe8a261c00
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 636, pointer (nil)
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 637, pointer (nil)
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 638, pointer (nil)
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 639, pointer (nil)
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 640, pointer (nil)
>>>>> 
>>>> This sure does seem like a memory over write problem, with maybe a memset(0) in the mix as well. Have you tried using hardware break points with the 0x880002 or 0x00 being written into this range?
>>> I put some breakpoints and found this:
>>> 
>>> 1 - using pointer 0x880002, the output is (the pointer comes in the middle of two rwlock):
>>> 
>>> (gdb) awatch *0x880002
>>> Hardware access (read/write) watchpoint 1: *0x880002
>>> (gdb) c
>>> Continuing.
>>> [New Thread 0x7fffeded5700 (LWP 19969)]
>>> [New Thread 0x7fffed6d4700 (LWP 19970)]
>>> [New Thread 0x7fffeced3700 (LWP 19971)]
>>> [New Thread 0x7fffec6d2700 (LWP 19972)]
>>> [New Thread 0x7fffebed1700 (LWP 19973)]
>>> [New Thread 0x7fffeb6d0700 (LWP 19974)]
>>> Hardware access (read/write) watchpoint 1: *0x880002
>>> 
>>> Value = 0
>>> rte_rwlock_init (rwl=0x880000 <ikco_sdkif_actlist_lock+677024>)
>>>     at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>>> 81    }
>>> (gdb) c
>>> Continuing.
>>> Hardware access (read/write) watchpoint 1: *0x880002
>> These are most likely false positive hits and not the issue.
>>> Value = 0
>>> rte_rwlock_init (rwl=0x880004 <ikco_sdkif_actlist_lock+677028>)
>>>     at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>>> 81    }
>>> (gdb) c
>>> Continuing.
>>> 
>>> 2 - when using pointers minor or equal than 0x7ffe8a261d64 (in the range of the mempool), gdb tells nothing about them (I don't use them, I just get them from the pool and the put them in the pool again);
>>> 
>>> 3 - when using pointer 0x7ffe8a261d65 or greater, this is the output of gdb:
>>> 
>>> (gdb) awatch *(int *)0x7ffe8a261d65
>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>> (gdb) c
>>> Continuing.
>>> [New Thread 0x7fffeded5700 (LWP 17689)]
>>> [New Thread 0x7fffed6d4700 (LWP 17690)]
>>> [New Thread 0x7fffeced3700 (LWP 17691)]
>>> [New Thread 0x7fffec6d2700 (LWP 17692)]
>>> [New Thread 0x7fffebed1700 (LWP 17693)]
>>> [New Thread 0x7fffeb6d0700 (LWP 17694)]
>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>> 
>>> Value = 0
>>> 0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>>>     iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
>>> (gdb) where
>>> #0  0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>>>     iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>>>     iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>>>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
>>> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>>>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
>>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
>>>     elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
>>>     mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>>>     obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>>>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
>>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>>>     at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>>> (gdb) c
>>> Continuing.
>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>> This seems to be just creating a pktmbuf pool. The STAILQ_INSERT_TAILQ is just putting the mempool on the main tailq list for mempools in DPDK.
>> 
>>> Old value = 0
>>> New value = -402653184
>>> 0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>>>     iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
>>> (gdb) where
>>> #0  0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>>>     iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>>>     iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>>>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
>>> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>>>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
>>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
>>>     elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
>>>     mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>>>     obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>>>     at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
>>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>>>     at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>>> (gdb) c
>>> Continuing.
>>> 
>>> What do you think? It is normal that the mempool_add_elem is called only on certain pointers of the mempool?
>>> I attached the initialization of the mempool. Can this be wrong?
>> All mempools with a cache size will have two queue to put memory on, one is the per lcore list and that one is used as a fast access queue. When the cache becomes empty or has more entries then the cache was created with then it pushed the extra entries to the main list of mempool data.
> Why do you say "mempools with a cache size" ? In my initialization this mempool has cache_size = 0

If you give a cache size then you will have a cache list per lcore; in your case you do not have a cache. BTW, not having a cache will affect performance a great deal.
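
As a point of reference, a cache-enabled variant of the creation call quoted in this thread might look like the sketch below. The cache size of 256 and the NULL callbacks are assumptions for illustration only (cache_size is bounded by RTE_MEMPOOL_CACHE_MAX_SIZE), and flags = 0 keeps the default multi-producer/multi-consumer behaviour:

    /* Illustrative sketch, not the thread's actual init: same element count and
     * size, but with a per-lcore cache so most get/put calls stay lcore-local. */
    cea_main_cmd_pool[i] = rte_mempool_create(pool_name,
                (unsigned int) (ikco_cmd_buffers - 1), /* 65536 - 1 elements */
                sizeof (CEA_DECODE_CMD_T),             /* 24 bytes each */
                256,                                   /* cache_size: per-lcore cache (assumed value) */
                0,                                     /* private_data_size */
                NULL, NULL, NULL, NULL,                /* pool/object init callbacks omitted here */
                rte_socket_id(), 0);                   /* flags = 0: multi-producer/multi-consumer */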

>> 
>> The only time that rwlock is touched is to get/put items on the main mempool.
>> 
>> Just as a data point have you tried this app on 18.08 yet? I do not see the problem yet, sorry.
> I'll try 18.08 and let you know
>> 
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 641, pointer 0x7ffe8a262b00
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 642, pointer 0x7ffe8a262b80
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 643, pointer 0x7ffe8a262d00
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 644, pointer 0x7ffe8a262d80
>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 645, pointer 0x7ffe8a262e00
>>>>> 
>>>>> 
>>>>>>>>         if (result == 0)
>>>>>>>>         {
>>>>>>>>             new_work->command = command; // usage of the memory gotten
>>>>>>>> from the mempool... <<<<<- here is where the application crashes!!!!
>>>>>>>> 
>>>>>> Do you know why it crashes? Is it that new_work is NULL?
>>>>>> 
>>>>> The pointer is not NULL but is not sequential to the others (0x880002 as written before in this email). It seems to be in a memory zone not in DPDK hugepages or something similar.
>>>>> If I use this pointer the application crashes.
>>>>> 
>>>>>> Can you check how the mempool is initialized? It should be in multi
>>>>>> consumer and depending on your use case, single or multi producer.
>>>>>> 
>>>>> Here is the initialization of this mempool
>>>>> 
>>>>> cea_main_cmd_pool[i] = rte_mempool_create(pool_name,
>>>>>             (unsigned int) (ikco_cmd_buffers - 1), // 65536 - 1 in this case
>>>>>             sizeof (CEA_DECODE_CMD_T), // 24 bytes
>>>>>             0, 0,
>>>>>             rte_pktmbuf_pool_init, NULL,
>>>>>             rte_pktmbuf_init, NULL,
>>>>>             rte_socket_id(), 0);
>>>>> 
>>>>>> Another thing that could be checked: at all the places where you
>>>>>> return your work object to the mempool, you should add a check
>>>>>> that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG
>>>>>> could do the trick: it adds some additional checks when doing
>>>>>> mempool operations.
>>>>>> 
>>>>> I think I have already answered this point with the prints up in the email.
>>>>> 
>>>>> What do you think about this behaviour?
>>>>> 
>>>>> Regards,
>>>>> Matteo
>>>>> 
>>>>>>>>             result =
>>>>>>>> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
>>>>>>>> (VOID_P) new_work);    // enqueues the gotten buffer on the rings of all
>>>>>>>> lcores
>>>>>>>>             // check on result value ...
>>>>>>>>         }
>>>>>>>>         else
>>>>>>>>         {
>>>>>>>>             // do something if result != 0 ...
>>>>>>>>         }
>>>>>>>> }
>>>>>>>> 
>>>>>>>> This code worked perfectly (never had an issue) on dpdk-2.2.0, while if
>>>>>>>> I use more than 1 thread doing these operations on dpdk-17.11 it happens
>>>>>>>> that after some times the "new_work" pointer is not a good one, and the
>>>>>>>> application crashes when using that pointer.
>>>>>>>> 
>>>>>>>> It seems that these lines cannot be used by more than one thread
>>>>>>>> simultaneously. I also used many 2017 and 2018 dpdk versions without
>>>>>>>> success.
>>>>>>>> 
>>>>>>>> Is this code possible on the new dpdk versions? Or have I to change my
>>>>>>>> application so that this code is called just by one lcore at a time?
>>>>>>>> 
>>>>>> Assuming the mempool is properly initialized, I don't see any reason
>>>>>> why it would not work. There has been a lot of changes in mempool between
>>>>>> dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the same.
>>>>>> 
>>>>>> If the comments above do not help to solve the issue, it could be helpful
>>>>>> to try to reproduce the issue in a minimal program, so we can help to
>>>>>> review it.
>>>>>> 
>>>>>> Regards,
>>>>>> Olivier
>>>>>> 
>>>> Regards,
>>>> Keith
>>>> 
>>>> 
>>>> 
>>>> 
>>> Regards,
>>> 
>>> Matteo
>>> 
>> Regards,
>> Keith
>> 
>> 
>> 
> Regards,
> Matteo

Regards,
Keith

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-21 12:51               ` Wiles, Keith
@ 2018-08-21 14:46                 ` Matteo Lanzuisi
  2018-08-24 14:44                   ` Matteo Lanzuisi
  0 siblings, 1 reply; 13+ messages in thread
From: Matteo Lanzuisi @ 2018-08-21 14:46 UTC (permalink / raw)
  To: Wiles, Keith; +Cc: Olivier Matz, dev

Il 21/08/2018 14:51, Wiles, Keith ha scritto:
>
>> On Aug 21, 2018, at 7:44 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
>>
>> Il 21/08/2018 14:17, Wiles, Keith ha scritto:
>>>> On Aug 21, 2018, at 7:01 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
>>>>
>>>> Hi
>>>>
>>>> Il 20/08/2018 18:03, Wiles, Keith ha scritto:
>>>>>> On Aug 20, 2018, at 9:47 AM, Matteo Lanzuisi <m.lanzuisi@resi.it>
>>>>>>   wrote:
>>>>>>
>>>>>> Hello Olivier,
>>>>>>
>>>>>> Il 13/08/2018 23:54, Olivier Matz ha scritto:
>>>>>>
>>>>>>> Hello Matteo,
>>>>>>>
>>>>>>> On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:
>>>>>>>
>>>>>>>> Any suggestion? any idea about this behaviour?
>>>>>>>>
>>>>>>>> Il 08/08/2018 11:56, Matteo Lanzuisi ha scritto:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on
>>>>>>>>> RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an
>>>>>>>>> application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
>>>>>>>>>
>>>>>>>>> This application is made up by one or more threads (each one on a
>>>>>>>>> different logical core) reading packets from i40e interfaces.
>>>>>>>>>
>>>>>>>>> Each thread can call the following code lines when receiving a specific
>>>>>>>>> packet:
>>>>>>>>>
>>>>>>>>> RTE_LCORE_FOREACH(lcore_id)
>>>>>>>>> {
>>>>>>>>>          result =
>>>>>>>>> rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P
>>>>>>>>> *) &new_work);        // mempools are created one for each logical core
>>>>>>>>>          if (((uint64_t)(new_work)) < 0x7f0000000000)
>>>>>>>>>              printf("Result %d, lcore di partenza %u, lcore di ricezione
>>>>>>>>> %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    //
>>>>>>>>> debug print, on my server it should never happen but with multi-thread
>>>>>>>>> happens always on the last logical core!!!!
>>>>>>>>>
>>>>>>> Here, checking the value of new_work looks wrong to me, before
>>>>>>> ensuring that result == 0. At least, new_work should be set to
>>>>>>> NULL before calling rte_mempool_get().
>>>>>>>
>>>>>> I put the check after result == 0, and just before the rte_mempool_get() I set new_work to NULL, but nothing changed.
>>>>>> The first time something goes wrong the print is
>>>>>>
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>>>>>
>>>>>> Sorry for the italian language print :) it means that application is sending a message from the logical core 1 to the logical core 2, it's the 635th time, the result is 0 and the pointer is 0x880002 while all pointers before were 0x7ffxxxxxx.
>>>>>> One strange thing is that this behaviour happens always from the logical core 1 to the logical core 2 when the counter is 635!!! (Sending messages from 2 to 1 or 1 to 1 or 2 to 2 is all ok)
>>>>>> Another strange thing is that pointers from counter 636 to 640 are NULL, and from 641 begin again to be good... as you can see here following (I attached the result of a test without the "if" of the check on the value of new_work, and only for messages from the lcore 1 to lcore 2)
>>>>>>
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 627, pointer 0x7ffe8a261880
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 628, pointer 0x7ffe8a261900
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 629, pointer 0x7ffe8a261980
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 630, pointer 0x7ffe8a261a00
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 631, pointer 0x7ffe8a261a80
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 632, pointer 0x7ffe8a261b00
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 633, pointer 0x7ffe8a261b80
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 634, pointer 0x7ffe8a261c00
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 636, pointer (nil)
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 637, pointer (nil)
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 638, pointer (nil)
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 639, pointer (nil)
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 640, pointer (nil)
>>>>>>
>>>>> This sure does seem like a memory over write problem, with maybe a memset(0) in the mix as well. Have you tried using hardware break points with the 0x880002 or 0x00 being written into this range?
>>>> I put some breakpoints and found this:
>>>>
>>>> 1 - using pointer 0x880002, the output is (the pointer comes in the middle of two rwlock):
>>>>
>>>> (gdb) awatch *0x880002
>>>> Hardware access (read/write) watchpoint 1: *0x880002
>>>> (gdb) c
>>>> Continuing.
>>>> [New Thread 0x7fffeded5700 (LWP 19969)]
>>>> [New Thread 0x7fffed6d4700 (LWP 19970)]
>>>> [New Thread 0x7fffeced3700 (LWP 19971)]
>>>> [New Thread 0x7fffec6d2700 (LWP 19972)]
>>>> [New Thread 0x7fffebed1700 (LWP 19973)]
>>>> [New Thread 0x7fffeb6d0700 (LWP 19974)]
>>>> Hardware access (read/write) watchpoint 1: *0x880002
>>>>
>>>> Value = 0
>>>> rte_rwlock_init (rwl=0x880000 <ikco_sdkif_actlist_lock+677024>)
>>>>      at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>>>> 81    }
>>>> (gdb) c
>>>> Continuing.
>>>> Hardware access (read/write) watchpoint 1: *0x880002
>>> These are most likely false positive hits and not the issue.
>>>> Value = 0
>>>> rte_rwlock_init (rwl=0x880004 <ikco_sdkif_actlist_lock+677028>)
>>>>      at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>>>> 81    }
>>>> (gdb) c
>>>> Continuing.
>>>>
>>>> 2 - when using pointers minor or equal than 0x7ffe8a261d64 (in the range of the mempool), gdb tells nothing about them (I don't use them, I just get them from the pool and the put them in the pool again);
>>>>
>>>> 3 - when using pointer 0x7ffe8a261d65 or greater, this is the output of gdb:
>>>>
>>>> (gdb) awatch *(int *)0x7ffe8a261d65
>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>>> (gdb) c
>>>> Continuing.
>>>> [New Thread 0x7fffeded5700 (LWP 17689)]
>>>> [New Thread 0x7fffed6d4700 (LWP 17690)]
>>>> [New Thread 0x7fffeced3700 (LWP 17691)]
>>>> [New Thread 0x7fffec6d2700 (LWP 17692)]
>>>> [New Thread 0x7fffebed1700 (LWP 17693)]
>>>> [New Thread 0x7fffeb6d0700 (LWP 17694)]
>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>>>
>>>> Value = 0
>>>> 0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>>>>      iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
>>>> (gdb) where
>>>> #0  0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>>>>      iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>>>>      iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
>>>> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
>>>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
>>>>      elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
>>>>      mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>>>>      obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
>>>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>>>>      at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>>>> (gdb) c
>>>> Continuing.
>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>> This seems to be just creating a pktmbuf pool. The STAILQ_INSERT_TAILQ is just putting the mempool on the main tailq list for mempools in DPDK.
>>>
>>>> Old value = 0
>>>> New value = -402653184
>>>> 0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>>>>      iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
>>>> (gdb) where
>>>> #0  0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>>>>      iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>>>>      iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
>>>> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
>>>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
>>>>      elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
>>>>      mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>>>>      obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
>>>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>>>>      at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>>>> (gdb) c
>>>> Continuing.
>>>>
>>>> What do you think? It is normal that the mempool_add_elem is called only on certain pointers of the mempool?
>>>> I attached the initialization of the mempool. Can this be wrong?
>>> All mempools with a cache size will have two queue to put memory on, one is the per lcore list and that one is used as a fast access queue. When the cache becomes empty or has more entries then the cache was created with then it pushed the extra entries to the main list of mempool data.
>> Why do you say "mempools with a cache size" ? In my initialization this mempool has cache_size = 0
> If you give a cache size then you will have a cache list per lcore, in your case you do not have a cache. BTW not having a cache will effect performance a great deal.
>
>>> The only time that rwlock is touched is to get/put items on the main mempool.
>>>
>>> Just as a data point have you tried this app on 18.08 yet? I do not see the problem yet, sorry.
>> I'll try 18.08 and let you know

Hi,

I tried 18.08 but the described behaviour did not change. I am thinking 
about some overflow in my own code, but running valgrind on my 
application reports nothing more, and that seems strange to me.
Is there any particular way to debug memory issues in a DPDK application 
apart from valgrind?
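
One option already mentioned earlier in the thread is the mempool library's own debug support: building DPDK with CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=y adds header/trailer cookies to every object, so a get or put on a corrupted element panics at the faulty operation instead of much later. A rough sketch of the extra checks that then become useful follows; the call site is an assumption, and the pool pointer is the per-lcore pool from this thread:

    #include <stdio.h>
    #include <rte_mempool.h>

    /* Periodic sanity check on one of the per-lcore command pools; with
     * CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=y the audit also verifies the per-object
     * cookies, otherwise it only checks caches and counters. */
    struct rte_mempool *mp = cea_main_lcore_conf[lcore_id].de_conf.cmd_pool;
    rte_mempool_audit(mp);            /* panics if an inconsistency is found */
    rte_mempool_dump(stdout, mp);     /* prints pool state and statistics */
    printf("cmd_pool: %u available, %u in use\n",
           rte_mempool_avail_count(mp), rte_mempool_in_use_count(mp));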

Regards,
Matteo

>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 641, pointer 0x7ffe8a262b00
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 642, pointer 0x7ffe8a262b80
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 643, pointer 0x7ffe8a262d00
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 644, pointer 0x7ffe8a262d80
>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 645, pointer 0x7ffe8a262e00
>>>>>>
>>>>>>
>>>>>>>>>          if (result == 0)
>>>>>>>>>          {
>>>>>>>>>              new_work->command = command; // usage of the memory gotten
>>>>>>>>> from the mempool... <<<<<- here is where the application crashes!!!!
>>>>>>>>>
>>>>>>> Do you know why it crashes? Is it that new_work is NULL?
>>>>>>>
>>>>>> The pointer is not NULL but is not sequential to the others (0x880002 as written before in this email). It seems to be in a memory zone not in DPDK hugepages or something similar.
>>>>>> If I use this pointer the application crashes.
>>>>>>
>>>>>>> Can you check how the mempool is initialized? It should be in multi
>>>>>>> consumer and depending on your use case, single or multi producer.
>>>>>>>
>>>>>> Here is the initialization of this mempool
>>>>>>
>>>>>> cea_main_cmd_pool[i] = rte_mempool_create(pool_name,
>>>>>>              (unsigned int) (ikco_cmd_buffers - 1), // 65536 - 1 in this case
>>>>>>              sizeof (CEA_DECODE_CMD_T), // 24 bytes
>>>>>>              0, 0,
>>>>>>              rte_pktmbuf_pool_init, NULL,
>>>>>>              rte_pktmbuf_init, NULL,
>>>>>>              rte_socket_id(), 0);
>>>>>>
>>>>>>> Another thing that could be checked: at all the places where you
>>>>>>> return your work object to the mempool, you should add a check
>>>>>>> that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG
>>>>>>> could do the trick: it adds some additional checks when doing
>>>>>>> mempool operations.
>>>>>>>
>>>>>> I think I have already answered this point with the prints up in the email.
>>>>>>
>>>>>> What do you think about this behaviour?
>>>>>>
>>>>>> Regards,
>>>>>> Matteo
>>>>>>
>>>>>>>>>              result =
>>>>>>>>> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
>>>>>>>>> (VOID_P) new_work);    // enqueues the gotten buffer on the rings of all
>>>>>>>>> lcores
>>>>>>>>>              // check on result value ...
>>>>>>>>>          }
>>>>>>>>>          else
>>>>>>>>>          {
>>>>>>>>>              // do something if result != 0 ...
>>>>>>>>>          }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> This code worked perfectly (never had an issue) on dpdk-2.2.0, while if
>>>>>>>>> I use more than 1 thread doing these operations on dpdk-17.11 it happens
>>>>>>>>> that after some times the "new_work" pointer is not a good one, and the
>>>>>>>>> application crashes when using that pointer.
>>>>>>>>>
>>>>>>>>> It seems that these lines cannot be used by more than one thread
>>>>>>>>> simultaneously. I also used many 2017 and 2018 dpdk versions without
>>>>>>>>> success.
>>>>>>>>>
>>>>>>>>> Is this code possible on the new dpdk versions? Or have I to change my
>>>>>>>>> application so that this code is called just by one lcore at a time?
>>>>>>>>>
>>>>>>> Assuming the mempool is properly initialized, I don't see any reason
>>>>>>> why it would not work. There has been a lot of changes in mempool between
>>>>>>> dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the same.
>>>>>>>
>>>>>>> If the comments above do not help to solve the issue, it could be helpful
>>>>>>> to try to reproduce the issue in a minimal program, so we can help to
>>>>>>> review it.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Olivier
>>>>>>>
>>>>> Regards,
>>>>> Keith
>>>>>
>>>>>
>>>>>
>>>>>
>>>> Regards,
>>>>
>>>> Matteo
>>>>
>>> Regards,
>>> Keith
>>>
>>>
>>>
>> Regards,
>> Matteo
> Regards,
> Keith
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-21 14:46                 ` Matteo Lanzuisi
@ 2018-08-24 14:44                   ` Matteo Lanzuisi
  2018-08-24 16:47                     ` Wiles, Keith
  0 siblings, 1 reply; 13+ messages in thread
From: Matteo Lanzuisi @ 2018-08-24 14:44 UTC (permalink / raw)
  To: Wiles, Keith; +Cc: Olivier Matz, dev

Hi,

I used valgrind again for a very long time, and it reported nothing 
strange happening in my code.
After that, I changed my code this way:

unsigned lcore_id_start = rte_lcore_id();
RTE_LCORE_FOREACH(lcore_id)
{
    // <--------- before this change, every lcore could use its own mempool and enqueue to its own ring
    if (lcore_id_start != lcore_id)
    {
        new_work = NULL;
        // mempools are created one for each logical core
        result = rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool,
                                 (VOID_P *) &new_work);
        if (result == 0)
        {
            // debug print, on my server it should never happen but with multi-thread it happens always on the last logical core!!!!
            if (((uint64_t)(new_work)) < 0x7f0000000000)
                printf("Result %d, lcore di partenza %u, lcore di ricezione %u, pointer %p\n",
                       result, rte_lcore_id(), lcore_id, new_work);
            // usage of the memory gotten from the mempool... <<<<<- here is where the application crashes!!!!
            new_work->command = command;
            // enqueues the gotten buffer on the rings of all lcores
            result = rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
                                      (VOID_P) new_work);
            // check on result value ...
        }
        else
        {
            // do something if result != 0 ...
        }
    }
    else
    {
        // don't use the mempool but call a function instead ...
    }
}

and now it all goes well.
Is it possible that sending to itself could generate this issue?
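
For reference, the receive side of this pattern is presumably something like the sketch below; this is an assumption about the application, not code taken from this thread. With the original loop, the sending lcore would also get from its own pool and enqueue to its own ring, and the same lcore would later dequeue and put the buffer back, so both the pool and the ring have to support multi-producer/multi-consumer access (flags = 0 at creation keeps that default).

    /* Hypothetical consumer loop on each lcore (assumed, not from the thread):
     * drain the lcore's own command ring and return every buffer to the
     * lcore's own mempool once the command has been handled. */
    CEA_DECODE_CMD_T *work = NULL;
    while (rte_ring_dequeue(cea_main_lcore_conf[rte_lcore_id()].de_conf.cmd_ring,
                            (VOID_P *) &work) == 0)
    {
        // ... handle work->command ...
        rte_mempool_put(cea_main_lcore_conf[rte_lcore_id()].de_conf.cmd_pool,
                        (VOID_P) work);   /* buffer goes back to the pool it came from */
    }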

Regards,
Matteo

Il 21/08/2018 16:46, Matteo Lanzuisi ha scritto:
> Il 21/08/2018 14:51, Wiles, Keith ha scritto:
>>
>>> On Aug 21, 2018, at 7:44 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> 
>>> wrote:
>>>
>>> Il 21/08/2018 14:17, Wiles, Keith ha scritto:
>>>>> On Aug 21, 2018, at 7:01 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> 
>>>>> wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> Il 20/08/2018 18:03, Wiles, Keith ha scritto:
>>>>>>> On Aug 20, 2018, at 9:47 AM, Matteo Lanzuisi <m.lanzuisi@resi.it>
>>>>>>>   wrote:
>>>>>>>
>>>>>>> Hello Olivier,
>>>>>>>
>>>>>>> Il 13/08/2018 23:54, Olivier Matz ha scritto:
>>>>>>>
>>>>>>>> Hello Matteo,
>>>>>>>>
>>>>>>>> On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:
>>>>>>>>
>>>>>>>>> Any suggestion? any idea about this behaviour?
>>>>>>>>>
>>>>>>>>> Il 08/08/2018 11:56, Matteo Lanzuisi ha scritto:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat 
>>>>>>>>>> rpm) on
>>>>>>>>>> RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an
>>>>>>>>>> application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
>>>>>>>>>>
>>>>>>>>>> This application is made up by one or more threads (each one 
>>>>>>>>>> on a
>>>>>>>>>> different logical core) reading packets from i40e interfaces.
>>>>>>>>>>
>>>>>>>>>> Each thread can call the following code lines when receiving 
>>>>>>>>>> a specific
>>>>>>>>>> packet:
>>>>>>>>>>
>>>>>>>>>> RTE_LCORE_FOREACH(lcore_id)
>>>>>>>>>> {
>>>>>>>>>>          result =
>>>>>>>>>> rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, 
>>>>>>>>>> (VOID_P
>>>>>>>>>> *) &new_work);        // mempools are created one for each 
>>>>>>>>>> logical core
>>>>>>>>>>          if (((uint64_t)(new_work)) < 0x7f0000000000)
>>>>>>>>>>              printf("Result %d, lcore di partenza %u, lcore 
>>>>>>>>>> di ricezione
>>>>>>>>>> %u, pointer %p\n", result, rte_lcore_id(), lcore_id, 
>>>>>>>>>> new_work);    //
>>>>>>>>>> debug print, on my server it should never happen but with 
>>>>>>>>>> multi-thread
>>>>>>>>>> happens always on the last logical core!!!!
>>>>>>>>>>
>>>>>>>> Here, checking the value of new_work looks wrong to me, before
>>>>>>>> ensuring that result == 0. At least, new_work should be set to
>>>>>>>> NULL before calling rte_mempool_get().
>>>>>>>>
>>>>>>> I put the check after result == 0, and just before the 
>>>>>>> rte_mempool_get() I set new_work to NULL, but nothing changed.
>>>>>>> The first time something goes wrong the print is
>>>>>>>
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 635, pointer 0x880002
>>>>>>>
>>>>>>> Sorry for the italian language print :) it means that 
>>>>>>> application is sending a message from the logical core 1 to the 
>>>>>>> logical core 2, it's the 635th time, the result is 0 and the 
>>>>>>> pointer is 0x880002 while all pointers before were 0x7ffxxxxxx.
>>>>>>> One strange thing is that this behaviour happens always from the 
>>>>>>> logical core 1 to the logical core 2 when the counter is 635!!! 
>>>>>>> (Sending messages from 2 to 1 or 1 to 1 or 2 to 2 is all ok)
>>>>>>> Another strange thing is that pointers from counter 636 to 640 
>>>>>>> are NULL, and from 641 begin again to be good... as you can see 
>>>>>>> here following (I attached the result of a test without the "if" 
>>>>>>> of the check on the value of new_work, and only for messages 
>>>>>>> from the lcore 1 to lcore 2)
>>>>>>>
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 627, pointer 0x7ffe8a261880
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 628, pointer 0x7ffe8a261900
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 629, pointer 0x7ffe8a261980
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 630, pointer 0x7ffe8a261a00
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 631, pointer 0x7ffe8a261a80
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 632, pointer 0x7ffe8a261b00
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 633, pointer 0x7ffe8a261b80
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 634, pointer 0x7ffe8a261c00
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 635, pointer 0x880002
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 636, pointer (nil)
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 637, pointer (nil)
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 638, pointer (nil)
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 639, pointer (nil)
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 640, pointer (nil)
>>>>>>>
>>>>>> This sure does seem like a memory over write problem, with maybe 
>>>>>> a memset(0) in the mix as well. Have you tried using hardware 
>>>>>> break points with the 0x880002 or 0x00 being written into this 
>>>>>> range?
>>>>> I put some breakpoints and found this:
>>>>>
>>>>> 1 - using pointer 0x880002, the output is (the pointer comes in 
>>>>> the middle of two rwlock):
>>>>>
>>>>> (gdb) awatch *0x880002
>>>>> Hardware access (read/write) watchpoint 1: *0x880002
>>>>> (gdb) c
>>>>> Continuing.
>>>>> [New Thread 0x7fffeded5700 (LWP 19969)]
>>>>> [New Thread 0x7fffed6d4700 (LWP 19970)]
>>>>> [New Thread 0x7fffeced3700 (LWP 19971)]
>>>>> [New Thread 0x7fffec6d2700 (LWP 19972)]
>>>>> [New Thread 0x7fffebed1700 (LWP 19973)]
>>>>> [New Thread 0x7fffeb6d0700 (LWP 19974)]
>>>>> Hardware access (read/write) watchpoint 1: *0x880002
>>>>>
>>>>> Value = 0
>>>>> rte_rwlock_init (rwl=0x880000 <ikco_sdkif_actlist_lock+677024>)
>>>>>      at 
>>>>> /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>>>>> 81    }
>>>>> (gdb) c
>>>>> Continuing.
>>>>> Hardware access (read/write) watchpoint 1: *0x880002
>>>> These are most likely false positive hits and not the issue.
>>>>> Value = 0
>>>>> rte_rwlock_init (rwl=0x880004 <ikco_sdkif_actlist_lock+677028>)
>>>>>      at 
>>>>> /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>>>>> 81    }
>>>>> (gdb) c
>>>>> Continuing.
>>>>>
>>>>> 2 - when using pointers minor or equal than 0x7ffe8a261d64 (in the 
>>>>> range of the mempool), gdb tells nothing about them (I don't use 
>>>>> them, I just get them from the pool and the put them in the pool 
>>>>> again);
>>>>>
>>>>> 3 - when using pointer 0x7ffe8a261d65 or greater, this is the 
>>>>> output of gdb:
>>>>>
>>>>> (gdb) awatch *(int *)0x7ffe8a261d65
>>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>>>> (gdb) c
>>>>> Continuing.
>>>>> [New Thread 0x7fffeded5700 (LWP 17689)]
>>>>> [New Thread 0x7fffed6d4700 (LWP 17690)]
>>>>> [New Thread 0x7fffeced3700 (LWP 17691)]
>>>>> [New Thread 0x7fffec6d2700 (LWP 17692)]
>>>>> [New Thread 0x7fffebed1700 (LWP 17693)]
>>>>> [New Thread 0x7fffeb6d0700 (LWP 17694)]
>>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>>>>
>>>>> Value = 0
>>>>> 0x00007ffff3798c21 in mempool_add_elem 
>>>>> (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>>>>>      iova=iova@entry=4465237376) at 
>>>>> /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
>>>>> (gdb) where
>>>>> #0  0x00007ffff3798c21 in mempool_add_elem 
>>>>> (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>>>>>      iova=iova@entry=4465237376) at 
>>>>> /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova 
>>>>> (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>>>>>      iova=4465087808, len=8388480, free_cb=<optimized out>, 
>>>>> opaque=<optimized out>)
>>>>>      at 
>>>>> /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
>>>>> #2  0x00007ffff379967d in rte_mempool_populate_default 
>>>>> (mp=mp@entry=0x7ffebfd8d6c0)
>>>>>      at 
>>>>> /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
>>>>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized 
>>>>> out>, n=<optimized out>,
>>>>>      elt_size=<optimized out>, cache_size=<optimized out>, 
>>>>> private_data_size=<optimized out>,
>>>>>      mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>>>>>      obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, 
>>>>> socket_id=0, flags=0)
>>>>>      at 
>>>>> /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
>>>>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>>>>>      at 
>>>>> /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>>>>> (gdb) c
>>>>> Continuing.
>>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>>> This seems to be just creating a pktmbuf pool. The 
>>>> STAILQ_INSERT_TAILQ is just putting the mempool on the main tailq 
>>>> list for mempools in DPDK.
>>>>
>>>>> Old value = 0
>>>>> New value = -402653184
>>>>> 0x00007ffff3798c24 in mempool_add_elem 
>>>>> (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>>>>>      iova=iova@entry=4465237504) at 
>>>>> /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
>>>>> (gdb) where
>>>>> #0  0x00007ffff3798c24 in mempool_add_elem 
>>>>> (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>>>>>      iova=iova@entry=4465237504) at 
>>>>> /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova 
>>>>> (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>>>>>      iova=4465087808, len=8388480, free_cb=<optimized out>, 
>>>>> opaque=<optimized out>)
>>>>>      at 
>>>>> /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
>>>>> #2  0x00007ffff379967d in rte_mempool_populate_default 
>>>>> (mp=mp@entry=0x7ffebfd8d6c0)
>>>>>      at 
>>>>> /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
>>>>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized 
>>>>> out>, n=<optimized out>,
>>>>>      elt_size=<optimized out>, cache_size=<optimized out>, 
>>>>> private_data_size=<optimized out>,
>>>>>      mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>>>>>      obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, 
>>>>> socket_id=0, flags=0)
>>>>>      at 
>>>>> /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
>>>>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>>>>>      at 
>>>>> /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>>>>> (gdb) c
>>>>> Continuing.
>>>>>
>>>>> What do you think? It is normal that the mempool_add_elem is 
>>>>> called only on certain pointers of the mempool?
>>>>> I attached the initialization of the mempool. Can this be wrong?
>>>> All mempools with a cache size will have two queue to put memory 
>>>> on, one is the per lcore list and that one is used as a fast access 
>>>> queue. When the cache becomes empty or has more entries then the 
>>>> cache was created with then it pushed the extra entries to the main 
>>>> list of mempool data.
>>> Why do you say "mempools with a cache size" ? In my initialization 
>>> this mempool has cache_size = 0
>> If you give a cache size then you will have a cache list per lcore, 
>> in your case you do not have a cache. BTW not having a cache will 
>> effect performance a great deal.
>>
>>>> The only time that rwlock is touched is to get/put items on the 
>>>> main mempool.
>>>>
>>>> Just as a data point have you tried this app on 18.08 yet? I do not 
>>>> see the problem yet, sorry.
>>> I'll try 18.08 and let you know
>
> Hi ,
>
> I tried 18.08 but nothing changed about the described behaviour. I'm 
> thinking about some overflow in my code lines but using valgrind on my 
> application tells me nothing more and it seems strange to me.
> Is there any particular way to debug memory issues on dpdk application 
> apart from valgrind?
>
> Regards,
> Matteo
>
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 641, pointer 0x7ffe8a262b00
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 642, pointer 0x7ffe8a262b80
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 643, pointer 0x7ffe8a262d00
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 644, pointer 0x7ffe8a262d80
>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 
>>>>>>> 645, pointer 0x7ffe8a262e00
>>>>>>>
>>>>>>>
>>>>>>>>>>          if (result == 0)
>>>>>>>>>>          {
>>>>>>>>>>              new_work->command = command; // usage of the 
>>>>>>>>>> memory gotten
>>>>>>>>>> from the mempool... <<<<<- here is where the application 
>>>>>>>>>> crashes!!!!
>>>>>>>>>>
>>>>>>>> Do you know why it crashes? Is it that new_work is NULL?
>>>>>>>>
>>>>>>> The pointer is not NULL but is not sequential to the others 
>>>>>>> (0x880002 as written before in this email). It seems to be in a 
>>>>>>> memory zone not in DPDK hugepages or something similar.
>>>>>>> If I use this pointer the application crashes.
>>>>>>>
>>>>>>>> Can you check how the mempool is initialized? It should be in 
>>>>>>>> multi
>>>>>>>> consumer and depending on your use case, single or multi producer.
>>>>>>>>
>>>>>>> Here is the initialization of this mempool
>>>>>>>
>>>>>>> cea_main_cmd_pool[i] = rte_mempool_create(pool_name,
>>>>>>>              (unsigned int) (ikco_cmd_buffers - 1), // 65536 - 1 
>>>>>>> in this case
>>>>>>>              sizeof (CEA_DECODE_CMD_T), // 24 bytes
>>>>>>>              0, 0,
>>>>>>>              rte_pktmbuf_pool_init, NULL,
>>>>>>>              rte_pktmbuf_init, NULL,
>>>>>>>              rte_socket_id(), 0);
>>>>>>>
>>>>>>>> Another thing that could be checked: at all the places where you
>>>>>>>> return your work object to the mempool, you should add a check
>>>>>>>> that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG
>>>>>>>> could do the trick: it adds some additional checks when doing
>>>>>>>> mempool operations.
>>>>>>>>
>>>>>>> I think I have already answered this point with the prints up in 
>>>>>>> the email.
>>>>>>>
>>>>>>> What do you think about this behaviour?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Matteo
>>>>>>>
>>>>>>>>>>              result =
>>>>>>>>>> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
>>>>>>>>>> (VOID_P) new_work);    // enqueues the gotten buffer on the 
>>>>>>>>>> rings of all
>>>>>>>>>> lcores
>>>>>>>>>>              // check on result value ...
>>>>>>>>>>          }
>>>>>>>>>>          else
>>>>>>>>>>          {
>>>>>>>>>>              // do something if result != 0 ...
>>>>>>>>>>          }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> This code worked perfectly (never had an issue) on 
>>>>>>>>>> dpdk-2.2.0, while if
>>>>>>>>>> I use more than 1 thread doing these operations on dpdk-17.11 
>>>>>>>>>> it happens
>>>>>>>>>> that after some times the "new_work" pointer is not a good 
>>>>>>>>>> one, and the
>>>>>>>>>> application crashes when using that pointer.
>>>>>>>>>>
>>>>>>>>>> It seems that these lines cannot be used by more than one thread
>>>>>>>>>> simultaneously. I also used many 2017 and 2018 dpdk versions 
>>>>>>>>>> without
>>>>>>>>>> success.
>>>>>>>>>>
>>>>>>>>>> Is this code possible on the new dpdk versions? Or have I to 
>>>>>>>>>> change my
>>>>>>>>>> application so that this code is called just by one lcore at 
>>>>>>>>>> a time?
>>>>>>>>>>
>>>>>>>> Assuming the mempool is properly initialized, I don't see any 
>>>>>>>> reason
>>>>>>>> why it would not work. There has been a lot of changes in 
>>>>>>>> mempool between
>>>>>>>> dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the 
>>>>>>>> same.
>>>>>>>>
>>>>>>>> If the comments above do not help to solve the issue, it could 
>>>>>>>> be helpful
>>>>>>>> to try to reproduce the issue in a minimal program, so we can 
>>>>>>>> help to
>>>>>>>> review it.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Olivier
>>>>>>>>
>>>>>> Regards,
>>>>>> Keith
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> Regards,
>>>>>
>>>>> Matteo
>>>>>
>>>> Regards,
>>>> Keith
>>>>
>>>>
>>>>
>>> Regards,
>>> Matteo
>> Regards,
>> Keith
>>
>>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-24 14:44                   ` Matteo Lanzuisi
@ 2018-08-24 16:47                     ` Wiles, Keith
  2018-08-27 15:34                       ` Matteo Lanzuisi
  0 siblings, 1 reply; 13+ messages in thread
From: Wiles, Keith @ 2018-08-24 16:47 UTC (permalink / raw)
  To: Matteo Lanzuisi; +Cc: Olivier Matz, dev



> On Aug 24, 2018, at 9:44 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
> 
> Hi,
> 
> I used valgrind again for a very long time, and it reported nothing wrong in my code.
> After that, I changed my code as follows:
> 
>  unsigned            lcore_id_start = rte_lcore_id();
> RTE_LCORE_FOREACH(lcore_id)
> {
>         if (lcore_id_start != lcore_id) // <--------- before this change, every lcore could use its own mempool and enqueue to its own ring

Something in the back of my head tells me this is correct, but I have no real reason :-(

If this works then I guess it is OK, but it would be nice to understand why this fix helps. Unless you have another thread running on this lcore doing a get/put, I do not see the problem.
>         {
>                 new_work = NULL;
>                 result = rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P *) &new_work);        // mempools are created one for each logical core
>                 if (result == 0)
>                 {
>                     if (((uint64_t)(new_work)) < 0x7f0000000000)
>                         printf("Result %d, lcore di partenza %u, lcore di ricezione %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    // debug print, on my server it should never happen but with multi-thread happens always on the last logical core!!!!
>                     new_work->command = command; // usage of the memory gotten from the mempool... <<<<<- here is where the application crashes!!!!
>                     result = rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring, (VOID_P) new_work);    // enqueues the gotten buffer on the rings       of all lcores
>                     // check on result value ...
>                 }
>                 else
>                 {
>                     // do something if result != 0 ...
>                 }
>         }
>         else
>         {
>               // don't use mempool but call a function instead ....
>         }
> }
> 
> and now everything works fine.
> Is it possible that an lcore sending to itself could cause this issue?
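I would not expect it to; a get followed by a put on the same lcore's pool is the simplest case. If you want to pin it down, a self-contained sketch along these lines, run on a single lcore after rte_eal_init(), should show whether the get/enqueue/dequeue/put cycle on an lcore's own pool and ring can misbehave at all (names, sizes and the iteration count are placeholders):

    /* headers: stdio.h, stdint.h, rte_mempool.h, rte_ring.h, rte_lcore.h */
    static void
    selftest_one_lcore(void)
    {
            struct rte_mempool *pool = rte_mempool_create("selftest_pool",
                    65535, 24 /* command struct size */, 0, 0,
                    NULL, NULL, NULL, NULL, rte_socket_id(), 0);
            struct rte_ring *ring = rte_ring_create("selftest_ring",
                    65536, rte_socket_id(), 0);
            void *obj = NULL;
            unsigned long i;

            if (pool == NULL || ring == NULL) {
                    printf("selftest setup failed\n");
                    return;
            }
            for (i = 0; i < 100000000UL; i++) {
                    /* the suspected pattern: get from this lcore's own pool,
                     * enqueue to its own ring, drain it and put it back */
                    if (rte_mempool_get(pool, &obj) != 0)
                            break;
                    /* same heuristic address check as the debug print above */
                    if ((uintptr_t)obj < 0x7f0000000000UL)
                            printf("bad pointer %p at iteration %lu\n", obj, i);
                    if (rte_ring_enqueue(ring, obj) != 0)
                            break;
                    if (rte_ring_dequeue(ring, &obj) != 0)
                            break;
                    rte_mempool_put(pool, obj);
            }
    }

If that loop runs clean for a long time, the bad pointers are more likely coming from something else writing over the pool or the ring, not from the get/put path itself.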
> 
> Regards,
> Matteo
> 
> Il 21/08/2018 16:46, Matteo Lanzuisi ha scritto:
>> Il 21/08/2018 14:51, Wiles, Keith ha scritto: 
>>> 
>>>> On Aug 21, 2018, at 7:44 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote: 
>>>> 
>>>> Il 21/08/2018 14:17, Wiles, Keith ha scritto: 
>>>>>> On Aug 21, 2018, at 7:01 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote: 
>>>>>> 
>>>>>> Hi 
>>>>>> 
>>>>>> Il 20/08/2018 18:03, Wiles, Keith ha scritto: 
>>>>>>>> On Aug 20, 2018, at 9:47 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> 
>>>>>>>>   wrote: 
>>>>>>>> 
>>>>>>>> Hello Olivier, 
>>>>>>>> 
>>>>>>>> Il 13/08/2018 23:54, Olivier Matz ha scritto: 
>>>>>>>> 
>>>>>>>>> Hello Matteo, 
>>>>>>>>> 
>>>>>>>>> On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote: 
>>>>>>>>> 
>>>>>>>>>> Any suggestion? any idea about this behaviour? 
>>>>>>>>>> 
>>>>>>>>>> Il 08/08/2018 11:56, Matteo Lanzuisi ha scritto: 
>>>>>>>>>> 
>>>>>>>>>>> Hi all, 
>>>>>>>>>>> 
>>>>>>>>>>> recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on 
>>>>>>>>>>> RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an 
>>>>>>>>>>> application from RH6 to RH7. On RH6 I used dpdk-2.2.0. 
>>>>>>>>>>> 
>>>>>>>>>>> This application is made up by one or more threads (each one on a 
>>>>>>>>>>> different logical core) reading packets from i40e interfaces. 
>>>>>>>>>>> 
>>>>>>>>>>> Each thread can call the following code lines when receiving a specific 
>>>>>>>>>>> packet: 
>>>>>>>>>>> 
>>>>>>>>>>> RTE_LCORE_FOREACH(lcore_id) 
>>>>>>>>>>> { 
>>>>>>>>>>>          result = 
>>>>>>>>>>> rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P 
>>>>>>>>>>> *) &new_work);        // mempools are created one for each logical core 
>>>>>>>>>>>          if (((uint64_t)(new_work)) < 0x7f0000000000) 
>>>>>>>>>>>              printf("Result %d, lcore di partenza %u, lcore di ricezione 
>>>>>>>>>>> %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    // 
>>>>>>>>>>> debug print, on my server it should never happen but with multi-thread 
>>>>>>>>>>> happens always on the last logical core!!!! 
>>>>>>>>>>> 
>>>>>>>>> Here, checking the value of new_work looks wrong to me, before 
>>>>>>>>> ensuring that result == 0. At least, new_work should be set to 
>>>>>>>>> NULL before calling rte_mempool_get(). 
>>>>>>>>> 
>>>>>>>> I put the check after result == 0, and just before the rte_mempool_get() I set new_work to NULL, but nothing changed. 
>>>>>>>> The first time something goes wrong the print is 
>>>>>>>> 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002 
>>>>>>>> 
>>>>>>>> Sorry for the italian language print :) it means that application is sending a message from the logical core 1 to the logical core 2, it's the 635th time, the result is 0 and the pointer is 0x880002 while all pointers before were 0x7ffxxxxxx. 
>>>>>>>> One strange thing is that this behaviour happens always from the logical core 1 to the logical core 2 when the counter is 635!!! (Sending messages from 2 to 1 or 1 to 1 or 2 to 2 is all ok) 
>>>>>>>> Another strange thing is that pointers from counter 636 to 640 are NULL, and from 641 begin again to be good... as you can see here following (I attached the result of a test without the "if" of the check on the value of new_work, and only for messages from the lcore 1 to lcore 2) 
>>>>>>>> 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 627, pointer 0x7ffe8a261880 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 628, pointer 0x7ffe8a261900 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 629, pointer 0x7ffe8a261980 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 630, pointer 0x7ffe8a261a00 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 631, pointer 0x7ffe8a261a80 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 632, pointer 0x7ffe8a261b00 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 633, pointer 0x7ffe8a261b80 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 634, pointer 0x7ffe8a261c00 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 636, pointer (nil) 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 637, pointer (nil) 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 638, pointer (nil) 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 639, pointer (nil) 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 640, pointer (nil) 
>>>>>>>> 
>>>>>>> This sure does seem like a memory over write problem, with maybe a memset(0) in the mix as well. Have you tried using hardware break points with the 0x880002 or 0x00 being written into this range? 
>>>>>> I put some breakpoints and found this: 
>>>>>> 
>>>>>> 1 - using pointer 0x880002, the output is (the pointer comes in the middle of two rwlock): 
>>>>>> 
>>>>>> (gdb) awatch *0x880002 
>>>>>> Hardware access (read/write) watchpoint 1: *0x880002 
>>>>>> (gdb) c 
>>>>>> Continuing. 
>>>>>> [New Thread 0x7fffeded5700 (LWP 19969)] 
>>>>>> [New Thread 0x7fffed6d4700 (LWP 19970)] 
>>>>>> [New Thread 0x7fffeced3700 (LWP 19971)] 
>>>>>> [New Thread 0x7fffec6d2700 (LWP 19972)] 
>>>>>> [New Thread 0x7fffebed1700 (LWP 19973)] 
>>>>>> [New Thread 0x7fffeb6d0700 (LWP 19974)] 
>>>>>> Hardware access (read/write) watchpoint 1: *0x880002 
>>>>>> 
>>>>>> Value = 0 
>>>>>> rte_rwlock_init (rwl=0x880000 <ikco_sdkif_actlist_lock+677024>) 
>>>>>>      at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>>>>>> 81    } 
>>>>>> (gdb) c 
>>>>>> Continuing. 
>>>>>> Hardware access (read/write) watchpoint 1: *0x880002 
>>>>> These are most likely false positive hits and not the issue. 
>>>>>> Value = 0 
>>>>>> rte_rwlock_init (rwl=0x880004 <ikco_sdkif_actlist_lock+677028>) 
>>>>>>      at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>>>>>> 81    } 
>>>>>> (gdb) c 
>>>>>> Continuing. 
>>>>>> 
>>>>>> 2 - when using pointers minor or equal than 0x7ffe8a261d64 (in the range of the mempool), gdb tells nothing about them (I don't use them, I just get them from the pool and the put them in the pool again); 
>>>>>> 
>>>>>> 3 - when using pointer 0x7ffe8a261d65 or greater, this is the output of gdb: 
>>>>>> 
>>>>>> (gdb) awatch *(int *)0x7ffe8a261d65 
>>>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65 
>>>>>> (gdb) c 
>>>>>> Continuing. 
>>>>>> [New Thread 0x7fffeded5700 (LWP 17689)] 
>>>>>> [New Thread 0x7fffed6d4700 (LWP 17690)] 
>>>>>> [New Thread 0x7fffeced3700 (LWP 17691)] 
>>>>>> [New Thread 0x7fffec6d2700 (LWP 17692)] 
>>>>>> [New Thread 0x7fffebed1700 (LWP 17693)] 
>>>>>> [New Thread 0x7fffeb6d0700 (LWP 17694)] 
>>>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65 
>>>>>> 
>>>>>> Value = 0 
>>>>>> 0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80, 
>>>>>>      iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140 
>>>>>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next); 
>>>>>> (gdb) where 
>>>>>> #0  0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80, 
>>>>>>      iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140 
>>>>>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "", 
>>>>>>      iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>) 
>>>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424 
>>>>>> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0) 
>>>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624 
>>>>>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>, 
>>>>>>      elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>, 
>>>>>>      mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0, 
>>>>>>      obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0) 
>>>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952 
>>>>>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8) 
>>>>>>      at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>>>>>> (gdb) c 
>>>>>> Continuing. 
>>>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65 
>>>>> This seems to be just creating a pktmbuf pool. The STAILQ_INSERT_TAILQ is just putting the mempool on the main tailq list for mempools in DPDK. 
>>>>> 
>>>>>> Old value = 0 
>>>>>> New value = -402653184 
>>>>>> 0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00, 
>>>>>>      iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140 
>>>>>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next); 
>>>>>> (gdb) where 
>>>>>> #0  0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00, 
>>>>>>      iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140 
>>>>>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "", 
>>>>>>      iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>) 
>>>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424 
>>>>>> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0) 
>>>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624 
>>>>>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>, 
>>>>>>      elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>, 
>>>>>>      mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0, 
>>>>>>      obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0) 
>>>>>>      at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952 
>>>>>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8) 
>>>>>>      at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>>>>>> (gdb) c 
>>>>>> Continuing. 
>>>>>> 
>>>>>> What do you think? It is normal that the mempool_add_elem is called only on certain pointers of the mempool? 
>>>>>> I attached the initialization of the mempool. Can this be wrong? 
>>>>> All mempools with a cache size will have two queue to put memory on, one is the per lcore list and that one is used as a fast access queue. When the cache becomes empty or has more entries then the cache was created with then it pushed the extra entries to the main list of mempool data. 
>>>> Why do you say "mempools with a cache size" ? In my initialization this mempool has cache_size = 0 
>>> If you give a cache size then you will have a cache list per lcore, in your case you do not have a cache. BTW not having a cache will effect performance a great deal. 
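To make that concrete, the same create call with a per-lcore cache would look roughly like this (the cache size is just an example value and must not exceed RTE_MEMPOOL_CACHE_MAX_SIZE; the pktmbuf init callbacks are left out on the assumption that these elements are plain command structs, not mbufs):

    struct rte_mempool *mp = rte_mempool_create(pool_name,
            (unsigned int)(ikco_cmd_buffers - 1),  /* 65536 - 1, as before */
            sizeof(CEA_DECODE_CMD_T),              /* 24 bytes */
            256,                                   /* per-lcore cache size */
            0,                                     /* private data size */
            NULL, NULL, NULL, NULL,
            rte_socket_id(), 0);
    /* the cache is only used from EAL threads with a valid lcore_id; gets and
     * puts then mostly stay in the local cache instead of hitting the shared
     * pool ring on every call */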
>>> 
>>>>> The only time that rwlock is touched is to get/put items on the main mempool. 
>>>>> 
>>>>> Just as a data point have you tried this app on 18.08 yet? I do not see the problem yet, sorry. 
>>>> I'll try 18.08 and let you know 
>> 
>> Hi , 
>> 
>> I tried 18.08 but nothing changed about the described behaviour. I'm thinking about some overflow in my code lines but using valgrind on my application tells me nothing more and it seems strange to me. 
>> Is there any particular way to debug memory issues on dpdk application apart from valgrind? 
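Besides valgrind, building DPDK with RTE_LIBRTE_MEMPOOL_DEBUG (as Olivier suggested) adds cookies around every mempool object and makes get/put panic as soon as one of them is corrupted. You can also audit the pools and watch their counters from a slow control loop; a rough sketch, with a hypothetical helper name:

    #include <rte_mempool.h>
    #include <rte_log.h>

    /* periodic sanity check on one of the command pools; with the debug
     * config enabled the audit also verifies the per-object cookies */
    static void
    check_pool(struct rte_mempool *mp)
    {
            rte_mempool_audit(mp);          /* panics on inconsistencies */
            RTE_LOG(INFO, USER1, "%s: %u objects available, %u in use\n",
                    mp->name,
                    rte_mempool_avail_count(mp),
                    rte_mempool_in_use_count(mp));
    }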
>> 
>> Regards, 
>> Matteo 
>> 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 641, pointer 0x7ffe8a262b00 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 642, pointer 0x7ffe8a262b80 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 643, pointer 0x7ffe8a262d00 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 644, pointer 0x7ffe8a262d80 
>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 645, pointer 0x7ffe8a262e00 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>          if (result == 0) 
>>>>>>>>>>>          { 
>>>>>>>>>>>              new_work->command = command; // usage of the memory gotten 
>>>>>>>>>>> from the mempool... <<<<<- here is where the application crashes!!!! 
>>>>>>>>>>> 
>>>>>>>>> Do you know why it crashes? Is it that new_work is NULL? 
>>>>>>>>> 
>>>>>>>> The pointer is not NULL but is not sequential to the others (0x880002 as written before in this email). It seems to be in a memory zone not in DPDK hugepages or something similar. 
>>>>>>>> If I use this pointer the application crashes. 
>>>>>>>> 
>>>>>>>>> Can you check how the mempool is initialized? It should be in multi 
>>>>>>>>> consumer and depending on your use case, single or multi producer. 
>>>>>>>>> 
>>>>>>>> Here is the initialization of this mempool 
>>>>>>>> 
>>>>>>>> cea_main_cmd_pool[i] = rte_mempool_create(pool_name, 
>>>>>>>>              (unsigned int) (ikco_cmd_buffers - 1), // 65536 - 1 in this case 
>>>>>>>>              sizeof (CEA_DECODE_CMD_T), // 24 bytes 
>>>>>>>>              0, 0, 
>>>>>>>>              rte_pktmbuf_pool_init, NULL, 
>>>>>>>>              rte_pktmbuf_init, NULL, 
>>>>>>>>              rte_socket_id(), 0); 
>>>>>>>> 
>>>>>>>>> Another thing that could be checked: at all the places where you 
>>>>>>>>> return your work object to the mempool, you should add a check 
>>>>>>>>> that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG 
>>>>>>>>> could do the trick: it adds some additional checks when doing 
>>>>>>>>> mempool operations. 
>>>>>>>>> 
>>>>>>>> I think I have already answered this point with the prints up in the email. 
>>>>>>>> 
>>>>>>>> What do you think about this behaviour? 
>>>>>>>> 
>>>>>>>> Regards, 
>>>>>>>> Matteo 
>>>>>>>> 
>>>>>>>>>>>              result = 
>>>>>>>>>>> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring, 
>>>>>>>>>>> (VOID_P) new_work);    // enqueues the gotten buffer on the rings of all 
>>>>>>>>>>> lcores 
>>>>>>>>>>>              // check on result value ... 
>>>>>>>>>>>          } 
>>>>>>>>>>>          else 
>>>>>>>>>>>          { 
>>>>>>>>>>>              // do something if result != 0 ... 
>>>>>>>>>>>          } 
>>>>>>>>>>> } 
>>>>>>>>>>> 
>>>>>>>>>>> This code worked perfectly (never had an issue) on dpdk-2.2.0, while if 
>>>>>>>>>>> I use more than 1 thread doing these operations on dpdk-17.11 it happens 
>>>>>>>>>>> that after some times the "new_work" pointer is not a good one, and the 
>>>>>>>>>>> application crashes when using that pointer. 
>>>>>>>>>>> 
>>>>>>>>>>> It seems that these lines cannot be used by more than one thread 
>>>>>>>>>>> simultaneously. I also used many 2017 and 2018 dpdk versions without 
>>>>>>>>>>> success. 
>>>>>>>>>>> 
>>>>>>>>>>> Is this code possible on the new dpdk versions? Or have I to change my 
>>>>>>>>>>> application so that this code is called just by one lcore at a time? 
>>>>>>>>>>> 
>>>>>>>>> Assuming the mempool is properly initialized, I don't see any reason 
>>>>>>>>> why it would not work. There has been a lot of changes in mempool between 
>>>>>>>>> dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the same. 
>>>>>>>>> 
>>>>>>>>> If the comments above do not help to solve the issue, it could be helpful 
>>>>>>>>> to try to reproduce the issue in a minimal program, so we can help to 
>>>>>>>>> review it. 
>>>>>>>>> 
>>>>>>>>> Regards, 
>>>>>>>>> Olivier 
>>>>>>>>> 
>>>>>>> Regards, 
>>>>>>> Keith 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> Regards, 
>>>>>> 
>>>>>> Matteo 
>>>>>> 
>>>>> Regards, 
>>>>> Keith 
>>>>> 
>>>>> 
>>>>> 
>>>> Regards, 
>>>> Matteo 
>>> Regards, 
>>> Keith 
>>> 
>>> 
>> 
>> 

Regards,
Keith

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dpdk-dev] Multi-thread mempool usage
  2018-08-24 16:47                     ` Wiles, Keith
@ 2018-08-27 15:34                       ` Matteo Lanzuisi
  0 siblings, 0 replies; 13+ messages in thread
From: Matteo Lanzuisi @ 2018-08-27 15:34 UTC (permalink / raw)
  To: Wiles, Keith; +Cc: Olivier Matz, dev

Hi,

I apologize for my last email: it was a false positive; sometimes the change 
worked and sometimes it did not.
The real problem was a memory overflow in my code, where part of a 
memzone was overwritten by a memcpy. This never showed up on RedHat 6 
with dpdk-2.2.0; I think that is because of changes in hugepage management 
between the 2.2.0 and 17.07 dpdk versions.
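In case it helps anyone else, a minimal sketch of the kind of guard that would have caught it on my side (the names and sizes here are made up):

    #include <string.h>
    #include <rte_memzone.h>
    #include <rte_log.h>

    /* copy into a memzone only if the write stays inside the reserved area */
    static int
    mz_write(const struct rte_memzone *mz, size_t off, const void *src, size_t len)
    {
            if (off + len > mz->len) {
                    RTE_LOG(ERR, USER1,
                            "write of %zu bytes at offset %zu overflows memzone %s (%zu bytes)\n",
                            len, off, mz->name, (size_t)mz->len);
                    return -1;
            }
            memcpy((char *)mz->addr + off, src, len);
            return 0;
    }

    /* usage, after rte_eal_init():
     *     const struct rte_memzone *mz =
     *             rte_memzone_reserve("state_zone", 4096, rte_socket_id(), 0);
     *     if (mz == NULL || mz_write(mz, offset, buf, buf_len) != 0)
     *             handle the error
     */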

Thank you for your time and patience,
Matteo

Il 24/08/2018 18:47, Wiles, Keith ha scritto:
>
>> On Aug 24, 2018, at 9:44 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
>>
>> Hi,
>>
>> I used valgrind again for a very long time, and it reported nothing wrong in my code.
>> After that, I changed my code as follows:
>>
>>   unsigned            lcore_id_start = rte_lcore_id();
>> RTE_LCORE_FOREACH(lcore_id)
>> {
>>          if (lcore_id_start != lcore_id) // <--------- before this change, every lcore could use its own mempool and enqueue to its own ring
> Something in the back of my head tells me this is correct, but I have no real reason :-(
>
> If this works then I guess it is OK, but it would be nice to understand why this fix helps. Unless you have another thread running on this lcore doing a get/put, I do not see the problem.
>>          {
>>                  new_work = NULL;
>>                  result = rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P *) &new_work);        // mempools are created one for each logical core
>>                  if (result == 0)
>>                  {
>>                      if (((uint64_t)(new_work)) < 0x7f0000000000)
>>                          printf("Result %d, lcore di partenza %u, lcore di ricezione %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    // debug print, on my server it should never happen but with multi-thread happens always on the last logical core!!!!
>>                      new_work->command = command; // usage of the memory gotten from the mempool... <<<<<- here is where the application crashes!!!!
>>                      result = rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring, (VOID_P) new_work);    // enqueues the gotten buffer on the rings       of all lcores
>>                      // check on result value ...
>>                  }
>>                  else
>>                  {
>>                      // do something if result != 0 ...
>>                  }
>>          }
>>          else
>>          {
>>                // don't use mempool but call a function instead ....
>>          }
>> }
>>
>> and now everything works fine.
>> Is it possible that an lcore sending to itself could cause this issue?
>>
>> Regards,
>> Matteo
>>
>> Il 21/08/2018 16:46, Matteo Lanzuisi ha scritto:
>>> Il 21/08/2018 14:51, Wiles, Keith ha scritto:
>>>>> On Aug 21, 2018, at 7:44 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
>>>>>
>>>>> Il 21/08/2018 14:17, Wiles, Keith ha scritto:
>>>>>>> On Aug 21, 2018, at 7:01 AM, Matteo Lanzuisi <m.lanzuisi@resi.it> wrote:
>>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> Il 20/08/2018 18:03, Wiles, Keith ha scritto:
>>>>>>>>> On Aug 20, 2018, at 9:47 AM, Matteo Lanzuisi <m.lanzuisi@resi.it>
>>>>>>>>>    wrote:
>>>>>>>>>
>>>>>>>>> Hello Olivier,
>>>>>>>>>
>>>>>>>>> Il 13/08/2018 23:54, Olivier Matz ha scritto:
>>>>>>>>>
>>>>>>>>>> Hello Matteo,
>>>>>>>>>>
>>>>>>>>>> On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:
>>>>>>>>>>
>>>>>>>>>>> Any suggestion? any idea about this behaviour?
>>>>>>>>>>>
>>>>>>>>>>> Il 08/08/2018 11:56, Matteo Lanzuisi ha scritto:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on
>>>>>>>>>>>> RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an
>>>>>>>>>>>> application from RH6 to RH7. On RH6 I used dpdk-2.2.0.
>>>>>>>>>>>>
>>>>>>>>>>>> This application is made up by one or more threads (each one on a
>>>>>>>>>>>> different logical core) reading packets from i40e interfaces.
>>>>>>>>>>>>
>>>>>>>>>>>> Each thread can call the following code lines when receiving a specific
>>>>>>>>>>>> packet:
>>>>>>>>>>>>
>>>>>>>>>>>> RTE_LCORE_FOREACH(lcore_id)
>>>>>>>>>>>> {
>>>>>>>>>>>>           result =
>>>>>>>>>>>> rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P
>>>>>>>>>>>> *) &new_work);        // mempools are created one for each logical core
>>>>>>>>>>>>           if (((uint64_t)(new_work)) < 0x7f0000000000)
>>>>>>>>>>>>               printf("Result %d, lcore di partenza %u, lcore di ricezione
>>>>>>>>>>>> %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);    //
>>>>>>>>>>>> debug print, on my server it should never happen but with multi-thread
>>>>>>>>>>>> happens always on the last logical core!!!!
>>>>>>>>>>>>
>>>>>>>>>> Here, checking the value of new_work looks wrong to me, before
>>>>>>>>>> ensuring that result == 0. At least, new_work should be set to
>>>>>>>>>> NULL before calling rte_mempool_get().
>>>>>>>>>>
>>>>>>>>> I put the check after result == 0, and just before the rte_mempool_get() I set new_work to NULL, but nothing changed.
>>>>>>>>> The first time something goes wrong the print is
>>>>>>>>>
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>>>>>>>>
>>>>>>>>> Sorry for the italian language print :) it means that application is sending a message from the logical core 1 to the logical core 2, it's the 635th time, the result is 0 and the pointer is 0x880002 while all pointers before were 0x7ffxxxxxx.
>>>>>>>>> One strange thing is that this behaviour happens always from the logical core 1 to the logical core 2 when the counter is 635!!! (Sending messages from 2 to 1 or 1 to 1 or 2 to 2 is all ok)
>>>>>>>>> Another strange thing is that pointers from counter 636 to 640 are NULL, and from 641 begin again to be good... as you can see here following (I attached the result of a test without the "if" of the check on the value of new_work, and only for messages from the lcore 1 to lcore 2)
>>>>>>>>>
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 627, pointer 0x7ffe8a261880
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 628, pointer 0x7ffe8a261900
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 629, pointer 0x7ffe8a261980
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 630, pointer 0x7ffe8a261a00
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 631, pointer 0x7ffe8a261a80
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 632, pointer 0x7ffe8a261b00
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 633, pointer 0x7ffe8a261b80
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 634, pointer 0x7ffe8a261c00
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 0x880002
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 636, pointer (nil)
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 637, pointer (nil)
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 638, pointer (nil)
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 639, pointer (nil)
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 640, pointer (nil)
>>>>>>>>>
>>>>>>>> This sure does seem like a memory over write problem, with maybe a memset(0) in the mix as well. Have you tried using hardware break points with the 0x880002 or 0x00 being written into this range?
>>>>>>> I put some breakpoints and found this:
>>>>>>>
>>>>>>> 1 - using pointer 0x880002, the output is (the pointer comes in the middle of two rwlock):
>>>>>>>
>>>>>>> (gdb) awatch *0x880002
>>>>>>> Hardware access (read/write) watchpoint 1: *0x880002
>>>>>>> (gdb) c
>>>>>>> Continuing.
>>>>>>> [New Thread 0x7fffeded5700 (LWP 19969)]
>>>>>>> [New Thread 0x7fffed6d4700 (LWP 19970)]
>>>>>>> [New Thread 0x7fffeced3700 (LWP 19971)]
>>>>>>> [New Thread 0x7fffec6d2700 (LWP 19972)]
>>>>>>> [New Thread 0x7fffebed1700 (LWP 19973)]
>>>>>>> [New Thread 0x7fffeb6d0700 (LWP 19974)]
>>>>>>> Hardware access (read/write) watchpoint 1: *0x880002
>>>>>>>
>>>>>>> Value = 0
>>>>>>> rte_rwlock_init (rwl=0x880000 <ikco_sdkif_actlist_lock+677024>)
>>>>>>>       at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>>>>>>> 81    }
>>>>>>> (gdb) c
>>>>>>> Continuing.
>>>>>>> Hardware access (read/write) watchpoint 1: *0x880002
>>>>>> These are most likely false positive hits and not the issue.
>>>>>>> Value = 0
>>>>>>> rte_rwlock_init (rwl=0x880004 <ikco_sdkif_actlist_lock+677028>)
>>>>>>>       at /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/generic/rte_rwlock.h:81
>>>>>>> 81    }
>>>>>>> (gdb) c
>>>>>>> Continuing.
>>>>>>>
>>>>>>> 2 - when using pointers minor or equal than 0x7ffe8a261d64 (in the range of the mempool), gdb tells nothing about them (I don't use them, I just get them from the pool and the put them in the pool again);
>>>>>>>
>>>>>>> 3 - when using pointer 0x7ffe8a261d65 or greater, this is the output of gdb:
>>>>>>>
>>>>>>> (gdb) awatch *(int *)0x7ffe8a261d65
>>>>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>>>>>> (gdb) c
>>>>>>> Continuing.
>>>>>>> [New Thread 0x7fffeded5700 (LWP 17689)]
>>>>>>> [New Thread 0x7fffed6d4700 (LWP 17690)]
>>>>>>> [New Thread 0x7fffeced3700 (LWP 17691)]
>>>>>>> [New Thread 0x7fffec6d2700 (LWP 17692)]
>>>>>>> [New Thread 0x7fffebed1700 (LWP 17693)]
>>>>>>> [New Thread 0x7fffeb6d0700 (LWP 17694)]
>>>>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>>>>>>
>>>>>>> Value = 0
>>>>>>> 0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>>>>>>>       iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>>>>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
>>>>>>> (gdb) where
>>>>>>> #0  0x00007ffff3798c21 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261d80,
>>>>>>>       iova=iova@entry=4465237376) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>>>>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>>>>>>>       iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>>>>>>>       at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
>>>>>>> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>>>>>>>       at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
>>>>>>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
>>>>>>>       elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
>>>>>>>       mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>>>>>>>       obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>>>>>>>       at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
>>>>>>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>>>>>>>       at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>>>>>>> (gdb) c
>>>>>>> Continuing.
>>>>>>> Hardware access (read/write) watchpoint 1: *(int *)0x7ffe8a261d65
>>>>>> This seems to be just creating a pktmbuf pool. The STAILQ_INSERT_TAILQ is just putting the mempool on the main tailq list for mempools in DPDK.
>>>>>>
>>>>>>> Old value = 0
>>>>>>> New value = -402653184
>>>>>>> 0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>>>>>>>       iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>>>>> 140        STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
>>>>>>> (gdb) where
>>>>>>> #0  0x00007ffff3798c24 in mempool_add_elem (mp=mp@entry=0x7ffebfd8d6c0, obj=obj@entry=0x7ffe8a261e00,
>>>>>>>       iova=iova@entry=4465237504) at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:140
>>>>>>> #1  0x00007ffff37990f0 in rte_mempool_populate_iova (mp=0x7ffebfd8d6c0, vaddr=0x7ffe8a23d540 "",
>>>>>>>       iova=4465087808, len=8388480, free_cb=<optimized out>, opaque=<optimized out>)
>>>>>>>       at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:424
>>>>>>> #2  0x00007ffff379967d in rte_mempool_populate_default (mp=mp@entry=0x7ffebfd8d6c0)
>>>>>>>       at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:624
>>>>>>> #3  0x00007ffff3799e89 in rte_mempool_create (name=<optimized out>, n=<optimized out>,
>>>>>>>       elt_size=<optimized out>, cache_size=<optimized out>, private_data_size=<optimized out>,
>>>>>>>       mp_init=0x7ffff444e410 <rte_pktmbuf_pool_init>, mp_init_arg=0x0,
>>>>>>>       obj_init=0x7ffff444e330 <rte_pktmbuf_init>, obj_init_arg=0x0, socket_id=0, flags=0)
>>>>>>>       at /usr/src/debug/dpdk-17.11/lib/librte_mempool/rte_mempool.c:952
>>>>>>> #4  0x0000000000548a52 in main (argc=16, argv=0x7fffffffe3c8)
>>>>>>>       at /root/gemini-cea-4.6.0/msrc/sys/com/linux-dpdk/cea-app/../../../../sys/com/linux-dpdk/cea-app/main.c:2360
>>>>>>> (gdb) c
>>>>>>> Continuing.
>>>>>>>
>>>>>>> What do you think? It is normal that the mempool_add_elem is called only on certain pointers of the mempool?
>>>>>>> I attached the initialization of the mempool. Can this be wrong?
>>>>>> All mempools with a cache size will have two queue to put memory on, one is the per lcore list and that one is used as a fast access queue. When the cache becomes empty or has more entries then the cache was created with then it pushed the extra entries to the main list of mempool data.
>>>>> Why do you say "mempools with a cache size" ? In my initialization this mempool has cache_size = 0
>>>> If you give a cache size then you will have a cache list per lcore, in your case you do not have a cache. BTW not having a cache will effect performance a great deal.
>>>>
>>>>>> The only time that rwlock is touched is to get/put items on the main mempool.
>>>>>>
>>>>>> Just as a data point have you tried this app on 18.08 yet? I do not see the problem yet, sorry.
>>>>> I'll try 18.08 and let you know
>>> Hi ,
>>>
>>> I tried 18.08 but nothing changed about the described behaviour. I'm thinking about some overflow in my code lines but using valgrind on my application tells me nothing more and it seems strange to me.
>>> Is there any particular way to debug memory issues on dpdk application apart from valgrind?
>>>
>>> Regards,
>>> Matteo
>>>
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 641, pointer 0x7ffe8a262b00
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 642, pointer 0x7ffe8a262b80
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 643, pointer 0x7ffe8a262d00
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 644, pointer 0x7ffe8a262d80
>>>>>>>>> Result 0, lcore di partenza 1, lcore di ricezione 2, counter 645, pointer 0x7ffe8a262e00
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>           if (result == 0)
>>>>>>>>>>>>           {
>>>>>>>>>>>>               new_work->command = command; // usage of the memory gotten
>>>>>>>>>>>> from the mempool... <<<<<- here is where the application crashes!!!!
>>>>>>>>>>>>
>>>>>>>>>> Do you know why it crashes? Is it that new_work is NULL?
>>>>>>>>>>
>>>>>>>>> The pointer is not NULL but is not sequential to the others (0x880002 as written before in this email). It seems to be in a memory zone not in DPDK hugepages or something similar.
>>>>>>>>> If I use this pointer the application crashes.
>>>>>>>>>
>>>>>>>>>> Can you check how the mempool is initialized? It should be in multi
>>>>>>>>>> consumer and depending on your use case, single or multi producer.
>>>>>>>>>>
>>>>>>>>> Here is the initialization of this mempool
>>>>>>>>>
>>>>>>>>> cea_main_cmd_pool[i] = rte_mempool_create(pool_name,
>>>>>>>>>               (unsigned int) (ikco_cmd_buffers - 1), // 65536 - 1 in this case
>>>>>>>>>               sizeof (CEA_DECODE_CMD_T), // 24 bytes
>>>>>>>>>               0, 0,
>>>>>>>>>               rte_pktmbuf_pool_init, NULL,
>>>>>>>>>               rte_pktmbuf_init, NULL,
>>>>>>>>>               rte_socket_id(), 0);
>>>>>>>>>
>>>>>>>>>> Another thing that could be checked: at all the places where you
>>>>>>>>>> return your work object to the mempool, you should add a check
>>>>>>>>>> that it is not NULL. Or just enabling RTE_LIBRTE_MEMPOOL_DEBUG
>>>>>>>>>> could do the trick: it adds some additional checks when doing
>>>>>>>>>> mempool operations.
>>>>>>>>>>
>>>>>>>>> I think I have already answered this point with the prints up in the email.
>>>>>>>>>
>>>>>>>>> What do you think about this behaviour?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Matteo
>>>>>>>>>
>>>>>>>>>>>>               result =
>>>>>>>>>>>> rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring,
>>>>>>>>>>>> (VOID_P) new_work);    // enqueues the gotten buffer on the rings of all
>>>>>>>>>>>> lcores
>>>>>>>>>>>>               // check on result value ...
>>>>>>>>>>>>           }
>>>>>>>>>>>>           else
>>>>>>>>>>>>           {
>>>>>>>>>>>>               // do something if result != 0 ...
>>>>>>>>>>>>           }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> This code worked perfectly (never had an issue) on dpdk-2.2.0, while if
>>>>>>>>>>>> I use more than 1 thread doing these operations on dpdk-17.11 it happens
>>>>>>>>>>>> that after some times the "new_work" pointer is not a good one, and the
>>>>>>>>>>>> application crashes when using that pointer.
>>>>>>>>>>>>
>>>>>>>>>>>> It seems that these lines cannot be used by more than one thread
>>>>>>>>>>>> simultaneously. I also used many 2017 and 2018 dpdk versions without
>>>>>>>>>>>> success.
>>>>>>>>>>>>
>>>>>>>>>>>> Is this code possible on the new dpdk versions? Or have I to change my
>>>>>>>>>>>> application so that this code is called just by one lcore at a time?
>>>>>>>>>>>>
>>>>>>>>>> Assuming the mempool is properly initialized, I don't see any reason
>>>>>>>>>> why it would not work. There has been a lot of changes in mempool between
>>>>>>>>>> dpdk-2.2.0 and dpdk-17.11, but this behavior should remain the same.
>>>>>>>>>>
>>>>>>>>>> If the comments above do not help to solve the issue, it could be helpful
>>>>>>>>>> to try to reproduce the issue in a minimal program, so we can help to
>>>>>>>>>> review it.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Olivier
>>>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Keith
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Matteo
>>>>>>>
>>>>>> Regards,
>>>>>> Keith
>>>>>>
>>>>>>
>>>>>>
>>>>> Regards,
>>>>> Matteo
>>>> Regards,
>>>> Keith
>>>>
>>>>
>>>
> Regards,
> Keith
>
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-08-27 15:34 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-08  9:56 [dpdk-dev] Multi-thread mempool usage Matteo Lanzuisi
2018-08-13 13:20 ` Matteo Lanzuisi
2018-08-13 21:54   ` Olivier Matz
2018-08-20 14:47     ` Matteo Lanzuisi
2018-08-20 16:03       ` Wiles, Keith
2018-08-21 12:01         ` Matteo Lanzuisi
2018-08-21 12:17           ` Wiles, Keith
2018-08-21 12:44             ` Matteo Lanzuisi
2018-08-21 12:51               ` Wiles, Keith
2018-08-21 14:46                 ` Matteo Lanzuisi
2018-08-24 14:44                   ` Matteo Lanzuisi
2018-08-24 16:47                     ` Wiles, Keith
2018-08-27 15:34                       ` Matteo Lanzuisi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).