DPDK patches and discussions
* Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
@ 2023-03-30  5:00 Prashant Upadhyaya
  2023-03-30  8:00 ` Bruce Richardson
  0 siblings, 1 reply; 12+ messages in thread
From: Prashant Upadhyaya @ 2023-03-30  5:00 UTC (permalink / raw)
  To: dev

Hi,

While trying to port some code to VPP (which uses DPDK as the backend
driver), I am running into a problem: calls to APIs like
rte_timer_subsystem_init and rte_hash_create are failing while
allocating memory.

This is presumably because VPP inits the EAL with the following arguments --

--in-memory --no-telemetry --file-prefix vpp

Is there something that can be done, e.g. passing some more parameters
in the EAL initialization, which hopefully wouldn't break VPP but would
also be friendly to the RTE timer and hash functions? Any advice would
be appreciated.

Regards
-Prashant


* Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
  2023-03-30  5:00 Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP Prashant Upadhyaya
@ 2023-03-30  8:00 ` Bruce Richardson
  2023-03-30  8:27   ` Prashant Upadhyaya
  0 siblings, 1 reply; 12+ messages in thread
From: Bruce Richardson @ 2023-03-30  8:00 UTC (permalink / raw)
  To: Prashant Upadhyaya; +Cc: dev

On Thu, Mar 30, 2023 at 10:30:24AM +0530, Prashant Upadhyaya wrote:
> Hi,
> 
> While trying to port some code to VPP (which uses DPDK as the backend
> driver), I am running into a problem that calls to API's like
> rte_timer_subsystem_init, rte_hash_create are failing while allocation
> of memory.
> 
> This is presumably because VPP inits the EAL with the following arguments --
> 
> -in-memory --no-telemetry --file-prefix vpp
> 
> Is  there is something that can be done eg. passing some more parms in
> the EAL initialization which hopefully wouldn't break VPP but will
> also be friendly to the RTE timer and hash functions too, that would
> be great, so requesting some advice here.
> 
Hi,

Can you provide some more details on the errors you are receiving? Have
you been able to dig a little deeper into what might be causing the
memory failures? The above flags alone are unlikely to cause issues with
the hash or timer libraries, for example.

/Bruce


* Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
  2023-03-30  8:00 ` Bruce Richardson
@ 2023-03-30  8:27   ` Prashant Upadhyaya
  2023-03-30  9:20     ` Bruce Richardson
  0 siblings, 1 reply; 12+ messages in thread
From: Prashant Upadhyaya @ 2023-03-30  8:27 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

Hi,

The hash creation API throws the following error --
RING: Cannot reserve memory for tailq
HASH: memory allocation failed

The timer subsystem init API throws this error --
EAL: memzone_reserve_aligned_thread_unsafe(): Number of requested memzone segments exceeds RTE_MAX_MEMZONE

I checked the code, and apparently the memzone and rte_zmalloc related
APIs are unable to allocate memory.

Regards
-Prashant

On Thu, Mar 30, 2023 at 1:30 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
> Hi,
>
> can you provide some more details on what the errors are that you are
> receiving? Have you been able to dig a little deeper into what might be
> causing the memory failures? The above flags alone are unlikely to cause
> issues with hash or timer libraries, for example.
>
> /Bruce


* Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
  2023-03-30  8:27   ` Prashant Upadhyaya
@ 2023-03-30  9:20     ` Bruce Richardson
  2023-03-30 13:12       ` Prashant Upadhyaya
  0 siblings, 1 reply; 12+ messages in thread
From: Bruce Richardson @ 2023-03-30  9:20 UTC (permalink / raw)
  To: Prashant Upadhyaya; +Cc: dev

On Thu, Mar 30, 2023 at 01:57:52PM +0530, Prashant Upadhyaya wrote:
> Hi,
> 

FYI, when replying on list, it's best not to top-post, but put your replies
below the email snippet you are replying to.

> The hash creation API throws the following error --
> RING: Cannot reserve memory for tailq
> HASH: memory allocation failed
> 
> The timer subsystem init api throws this error --
> EAL: memzone_reserve_aligned_thread_unsafe(): Number of requested
> memzone segments exceeds RTE_MAX_MEMZONE
> 

Can you try increasing RTE_MAX_MEMZONE? It's defined in DPDK's rte_config.h
file, so edit that and then rebuild DPDK. [If you are using the built-in
DPDK from VPP, you may need to create a patch for this, add it to the VPP
patches directory and then do a VPP rebuild.]

Let's see if we can get rid of at least one of the error messages. :-)

/Bruce

> I did check the code and apparently the memzone and rte zmalloc
> related api's are not being able to allocate memory.
> 
> Regards
> -Prashant


* Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
  2023-03-30  9:20     ` Bruce Richardson
@ 2023-03-30 13:12       ` Prashant Upadhyaya
  2023-03-30 13:17         ` Bruce Richardson
  0 siblings, 1 reply; 12+ messages in thread
From: Prashant Upadhyaya @ 2023-03-30 13:12 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Thu, Mar 30, 2023 at 2:50 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Thu, Mar 30, 2023 at 01:57:52PM +0530, Prashant Upadhyaya wrote:
> > Hi,
> >
>
> FYI, when replying on list, it's best not to top-post, but put your replies
> below the email snippet you are replying to.
>
> > The hash creation API throws the following error --
> > RING: Cannot reserve memory for tailq
> > HASH: memory allocation failed
> >
> > The timer subsystem init api throws this error --
> > EAL: memzone_reserve_aligned_thread_unsafe(): Number of requested
> > memzone segments exceeds RTE_MAX_MEMZONE
> >
>
> Can you try increasing RTE_MAX_MEMZONE? It's defined in DPDK's rte_config.h
> file, so edit that and then rebuild DPDK. [If you are using the built-in
> DPDK from VPP, you may need to create a patch for this, add it to the VPP
> patches directory and then do a VPP rebuild.]
>
> Let's see if we can get rid of at least one of the error messages. :-)
>
> /Bruce

Thanks Bruce, the error comes from the following function in
lib/eal/common/eal_common_memzone.c:
memzone_reserve_aligned_thread_unsafe

The condition which triggers the error is the following:
if (arr->count >= arr->len)
So I printed both of these values inside the function, and got the
following output:

vpp[14728]: dpdk: EAL init args: --in-memory --no-telemetry --file-prefix vpp
[New Thread 0x7fffa67b6700 (LWP 14732)]
count: 0 len: 2560
count: 1 len: 2560
count: 2 len: 2560
[New Thread 0x7fffa5fb5700 (LWP 14733)]
[New Thread 0x7fffa5db4700 (LWP 14734)]
count: 3 len: 2560
count: 4 len: 2560
### This is where I call rte_timer_subsystem_init from my code. The
lines above must come from other code in VPP/EAL init; the line below
is surely due to my call to rte_timer_subsystem_init.
count: 0 len: 0

As you can see, both values come out as zero -- is this expected? I
thought arr->len should have been non-zero.
I must add that the thread which calls rte_timer_subsystem_init is
possibly different from the one which did the EAL init; do you think
that might be a problem?
I have yet to increase the value of RTE_MAX_MEMZONE, but wanted to share
the above first for any suggestions.

Regards
-Prashant


* Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
  2023-03-30 13:12       ` Prashant Upadhyaya
@ 2023-03-30 13:17         ` Bruce Richardson
  2023-03-30 13:37           ` Prashant Upadhyaya
  0 siblings, 1 reply; 12+ messages in thread
From: Bruce Richardson @ 2023-03-30 13:17 UTC (permalink / raw)
  To: Prashant Upadhyaya; +Cc: dev

On Thu, Mar 30, 2023 at 06:42:58PM +0530, Prashant Upadhyaya wrote:
> Thanks Bruce, the error comes from the following function in
> lib/eal/common/eal_common_memzone.c
> memzone_reserve_aligned_thread_unsafe
> 
> The condition which spits out the error is the following
> if (arr->count >= arr->len)
> So I printed both of the above values inside this function, and the
> following output came
> 
> vpp[14728]: dpdk: EAL init args: --in-memory --no-telemetry --file-prefix vpp
> [New Thread 0x7fffa67b6700 (LWP 14732)]
> count: 0 len: 2560
> count: 1 len: 2560
> count: 2 len: 2560
> [New Thread 0x7fffa5fb5700 (LWP 14733)]
> [New Thread 0x7fffa5db4700 (LWP 14734)]
> count: 3 len: 2560
> count: 4 len: 2560
> ### this is the place where I call rte_timer_subsystem_init from my
> code, the above must be coming from any other code from VPP/EAL init,
> the line below is surely because of my call to
> rte_timer_subsystem_init
> count: 0 len: 0
> 
> So as you can see that both values are coming to be zero -- is this
> expected ? I thought the arr->len should have been non zero.
> I must add that the thread which is calling the
> rte_timer_subsystem_init is possibly different than the one which did
> the eal init, do you think that might be a problem...
> I am yet to increase the value of RTE_MAX_MEMZONE, but wanted to share
> the above first for any suggestions.
> 
Given the lengths you printed above, increasing the MAX_MEMZONE will not
help things. Is the init call which is failing coming from a non-DPDK
thread?
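
To illustrate why the printed values point away from RTE_MAX_MEMZONE, here
is a minimal, self-contained C model of the check quoted above
(arr->count >= arr->len). The struct and function names are hypothetical
stand-ins chosen for this sketch, not DPDK's own types; only the comparison
itself is taken from the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two fields printed in the debug output;
 * this is NOT the real DPDK structure. */
struct memzone_array_model {
    unsigned int count; /* entries currently in use */
    unsigned int len;   /* capacity of the array */
};

/* Mirrors the quoted condition `if (arr->count >= arr->len)`:
 * returns true when a new reservation would be refused. */
static bool memzone_array_is_full(const struct memzone_array_model *arr)
{
    return arr->count >= arr->len;
}
```

With count 4 and len 2560 (the last healthy line of the output) the
function returns false, so a reservation would proceed. With count 0 and
len 0 it returns true, because 0 >= 0, so the "exceeds RTE_MAX_MEMZONE"
message is printed even though nothing has been reserved. A larger
compile-time maximum would not change that outcome for an array whose len
reads as 0; why it reads as 0 is not established at this point in the
thread.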


* Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
  2023-03-30 13:17         ` Bruce Richardson
@ 2023-03-30 13:37           ` Prashant Upadhyaya
  2023-03-30 14:02             ` Bruce Richardson
  0 siblings, 1 reply; 12+ messages in thread
From: Prashant Upadhyaya @ 2023-03-30 13:37 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Thu, Mar 30, 2023 at 6:47 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Thu, Mar 30, 2023 at 06:42:58PM +0530, Prashant Upadhyaya wrote:
> > Thanks Bruce, the error comes from the following function in
> > lib/eal/common/eal_common_memzone.c
> > memzone_reserve_aligned_thread_unsafe
> >
> > The condition which spits out the error is the following
> > if (arr->count >= arr->len)
> > So I printed both of the above values inside this function, and the
> > following output came
> >
> > vpp[14728]: dpdk: EAL init args: --in-memory --no-telemetry --file-prefix vpp
> > [New Thread 0x7fffa67b6700 (LWP 14732)]
> > count: 0 len: 2560
> > count: 1 len: 2560
> > count: 2 len: 2560
> > [New Thread 0x7fffa5fb5700 (LWP 14733)]
> > [New Thread 0x7fffa5db4700 (LWP 14734)]
> > count: 3 len: 2560
> > count: 4 len: 2560
> > ### this is the place where I call rte_timer_subsystem_init from my
> > code, the above must be coming from any other code from VPP/EAL init,
> > the line below is surely because of my call to
> > rte_timer_subsystem_init
> > count: 0 len: 0
> >
> > So as you can see that both values are coming to be zero -- is this
> > expected ? I thought the arr->len should have been non zero.
> > I must add that the thread which is calling the
> > rte_timer_subsystem_init is possibly different than the one which did
> > the eal init, do you think that might be a problem...
> > I am yet to increase the value of RTE_MAX_MEMZONE, but wanted to share
> > the above first for any suggestions.
> >
> Given the lengths you printed above, increasing the MAX_MEMZONE will not
> help things. Is the init call which is failing coming from a non-DPDK
> thread?

Likely yes; at the moment I am calling it from a CLI which I have added to VPP.
Assuming this is the case, do you foresee a problem?


* Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
  2023-03-30 13:37           ` Prashant Upadhyaya
@ 2023-03-30 14:02             ` Bruce Richardson
  2023-03-31  9:41               ` Prashant Upadhyaya
  0 siblings, 1 reply; 12+ messages in thread
From: Bruce Richardson @ 2023-03-30 14:02 UTC (permalink / raw)
  To: Prashant Upadhyaya; +Cc: dev

On Thu, Mar 30, 2023 at 07:07:23PM +0530, Prashant Upadhyaya wrote:
> On Thu, Mar 30, 2023 at 6:47 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > Given the lengths you printed above, increasing the MAX_MEMZONE will not
> > help things. Is the init call which is failing coming from a non-DPDK
> > thread?
> 
> Likely yes, at the moment I am calling it from a CLI which I have added in VPP.
> Assuming this is the case, do you foresee a problem ?

That could well be the cause, yes. With non-DPDK threads, the memory NUMA
node/socket-id entries could be invalid, causing the DPDK memory
allocation to look for memory heaps on non-existent NUMA nodes.
Can you try using the rte_thread_register API in your thread before
calling the init functions, and see if that helps?

/Bruce


* Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
  2023-03-30 14:02             ` Bruce Richardson
@ 2023-03-31  9:41               ` Prashant Upadhyaya
  2023-03-31 10:49                 ` Bruce Richardson
  0 siblings, 1 reply; 12+ messages in thread
From: Prashant Upadhyaya @ 2023-03-31  9:41 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Thu, Mar 30, 2023 at 7:34 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Thu, Mar 30, 2023 at 07:07:23PM +0530, Prashant Upadhyaya wrote:
> > On Thu, Mar 30, 2023 at 6:47 PM Bruce Richardson
> > <bruce.richardson@intel.com> wrote:
> > >
> > > Given the lengths you printed above, increasing the MAX_MEMZONE will not
> > > help things. Is the init call which is failing coming from a non-DPDK
> > > thread?
> >
> > Likely yes, at the moment I am calling it from a CLI which I have added in VPP.
> > Assuming this is the case, do you foresee a problem ?
>
> Could well be a possible cause, yes. With non-DPDK threads, the memory NUMA
> node/socket-id entries could be invalid, and cause the DPDK memory
> allocation to look for memory heaps on non-existent NUMA nodes.
> Can you try using rte_thread_register API in your thread before calling the
> init functions and see if that helps.
>
> /Bruce

Still no luck!
I tried two things --
First, I tried to make the calls from VPP's fastpath thread (which
I hoped would be a true DPDK thread internally), but the calls failed
like before.
Second, I tried calling rte_thread_register on this fastpath thread
before making the calls -- this did not help either, same problem.
It appears that VPP's memory management has done something so that
these rte calls are not able to access the expected data structures at
the DPDK level.
It appears I am the only guy in the world trying to make these rte
calls from VPP plugins. I checked on the VPP mailing list too, and the
only suggestion I got was to replace the DPDK memzone/memory allocator
functions with those of VPP. This becomes intricate work, as I call rte
timer, hash and rcu functions in my code.
Can I patch DPDK in a generic fashion to reserve memory from a VPP
allocator function, or even malloc, at some centralized place or set
of functions in DPDK?
But the larger question is what we are running into here that is
causing these issues.

Regards
-Prashant

* Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
  2023-03-31  9:41               ` Prashant Upadhyaya
@ 2023-03-31 10:49                 ` Bruce Richardson
  2023-04-14  4:25                   ` Prashant Upadhyaya
  0 siblings, 1 reply; 12+ messages in thread
From: Bruce Richardson @ 2023-03-31 10:49 UTC (permalink / raw)
  To: Prashant Upadhyaya; +Cc: dev

On Fri, Mar 31, 2023 at 03:11:18PM +0530, Prashant Upadhyaya wrote:
> Can I patch DPDK in a generic fashion to reserve memory from a VPP
> allocator function, or even from malloc, at some centralized place or
> set of functions in DPDK?

If doing any such patching, the patches should add new init functions to
DPDK to allow init with already-allocated memory, or with a custom memory
allocator to allow reserving X amount of memory. That's the only way I can
see to do a generic version here.
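
As an illustration of the two options mentioned above (init with
already-allocated memory, or init with a custom memory allocator), here is a
minimal C sketch. Every name in it (demo_table, demo_table_footprint,
demo_table_init_in, demo_table_init_with, demo_alloc_fn, demo_malloc_alloc)
is hypothetical and does not exist in DPDK; it only shows the general shape
such init functions might take, not how DPDK would implement them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator hook; not a DPDK type. */
typedef void *(*demo_alloc_fn)(size_t size, void *ctx);

/* Hypothetical structure that lives entirely in caller-provided memory. */
struct demo_table {
    uint32_t capacity; /* number of slots */
    uint32_t used;     /* slots currently in use */
    uint64_t slots[];  /* storage follows the header */
};

/* Bytes the caller must provide for a table with `capacity` slots. */
size_t demo_table_footprint(uint32_t capacity)
{
    return sizeof(struct demo_table) + (size_t)capacity * sizeof(uint64_t);
}

/* Option 1: initialise inside memory the caller has already allocated.
 * `mem` must be suitably aligned for struct demo_table. */
struct demo_table *demo_table_init_in(void *mem, size_t mem_len,
                                      uint32_t capacity)
{
    if (mem == NULL || mem_len < demo_table_footprint(capacity))
        return NULL;

    struct demo_table *t = mem;
    t->capacity = capacity;
    t->used = 0;
    memset(t->slots, 0, (size_t)capacity * sizeof(uint64_t));
    return t;
}

/* Option 2: obtain the memory through a caller-supplied allocator. */
struct demo_table *demo_table_init_with(demo_alloc_fn alloc, void *ctx,
                                        uint32_t capacity)
{
    assert(alloc != NULL);
    size_t need = demo_table_footprint(capacity);
    /* A NULL return from the allocator is rejected by demo_table_init_in. */
    return demo_table_init_in(alloc(need, ctx), need, capacity);
}

/* Example allocator backed by plain malloc; ctx is unused. */
void *demo_malloc_alloc(size_t size, void *ctx)
{
    (void)ctx;
    return malloc(size);
}
```

With this shape, a host application such as VPP could pass in memory from its
own allocator, and the library would never need to reserve memory itself.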

> But the larger question is what we are running into here that is
> causing these issues.
> 

It's a very good question. I don't know enough about VPP to answer that,
but I suggest you ask on the VPP mailing list, as it's likely others in the
VPP community may be better able to help. I would suggest doing this before
looking into any patching of DPDK; there may be easier workarounds if we
know the exact root cause of the issue.

Regards,
/Bruce

* Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
  2023-03-31 10:49                 ` Bruce Richardson
@ 2023-04-14  4:25                   ` Prashant Upadhyaya
  2023-04-14  9:07                     ` Bruce Richardson
  0 siblings, 1 reply; 12+ messages in thread
From: Prashant Upadhyaya @ 2023-04-14  4:25 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

Hi,

I finally got past this issue.
This is probably not of interest to the DPDK community, but I am
recording the root cause here.
In VPP, there is a DPDK plugin (dpdk_plugin.so) which links with DPDK.
It does not export the DPDK symbols because of the way VPP loads this
shared object.
If I link with DPDK in my own plugin, this effectively creates a second
copy of DPDK, which leads to the issues I was seeing.
As advised by the VPP community, there is a way in VPP to make its DPDK
plugin link against the system-installed DPDK libs, and to make my own
plugin link against the same system-installed DPDK libs. Then
everything works as expected.

Regards
-Prashant
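
The root cause described above can be modelled in plain C. This is a
simplified sketch, not DPDK code: zone_array, plugin_copy, private_copy,
model_eal_init and model_memzone_reserve are hypothetical stand-ins, and only
the `count >= len` check and the value 2560 are taken from the output posted
earlier in the thread. It shows one plausible reading of the "count: 0 len: 0"
line: a second copy of the library has its own state, on which init never
ran, so every reservation against it fails.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the memzone bookkeeping array; only the two
 * fields printed in the thread (count, len) are modelled. */
struct zone_array {
    unsigned int count; /* zones reserved so far */
    unsigned int len;   /* capacity, set during init */
};

/* Each linked copy of a library carries its own file-scope state.
 * Objects at file scope start out zero-initialised. */
struct zone_array plugin_copy;  /* copy that the DPDK plugin initialises */
struct zone_array private_copy; /* second copy: init never ran on it */

/* Models EAL init populating the array; 2560 is the len seen in the thread. */
void model_eal_init(struct zone_array *arr)
{
    assert(arr != NULL);
    arr->count = 0;
    arr->len = 2560;
}

/* Same shape as the check quoted in the thread: fail when count >= len. */
int model_memzone_reserve(struct zone_array *arr)
{
    assert(arr != NULL);
    if (arr->count >= arr->len) {
        fprintf(stderr, "count: %u len: %u, reserve fails\n",
                arr->count, arr->len);
        return -1;
    }
    arr->count++;
    return 0;
}
```

Calling model_eal_init(&plugin_copy) and then reserving from plugin_copy
succeeds, while reserving from private_copy fails immediately because
0 >= 0. Linking both plugins against one shared set of DPDK libs removes the
second copy, which matches the fix reported above.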

* Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP
  2023-04-14  4:25                   ` Prashant Upadhyaya
@ 2023-04-14  9:07                     ` Bruce Richardson
  0 siblings, 0 replies; 12+ messages in thread
From: Bruce Richardson @ 2023-04-14  9:07 UTC (permalink / raw)
  To: Prashant Upadhyaya; +Cc: dev

Thank you for sharing the solution with us. It could well be of help to
others.

Thread overview: 12+ messages
2023-03-30  5:00 Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP Prashant Upadhyaya
2023-03-30  8:00 ` Bruce Richardson
2023-03-30  8:27   ` Prashant Upadhyaya
2023-03-30  9:20     ` Bruce Richardson
2023-03-30 13:12       ` Prashant Upadhyaya
2023-03-30 13:17         ` Bruce Richardson
2023-03-30 13:37           ` Prashant Upadhyaya
2023-03-30 14:02             ` Bruce Richardson
2023-03-31  9:41               ` Prashant Upadhyaya
2023-03-31 10:49                 ` Bruce Richardson
2023-04-14  4:25                   ` Prashant Upadhyaya
2023-04-14  9:07                     ` Bruce Richardson
