DPDK patches and discussions
* [dpdk-dev] DPDK (and rte_*alloc family) friendly Valgrind
@ 2016-02-10 22:54 Luca Boccassi
  2016-02-11  7:34 ` Thomas Monjalon
  0 siblings, 1 reply; 7+ messages in thread
From: Luca Boccassi @ 2016-02-10 22:54 UTC (permalink / raw)
  To: dev

Hello all,

I created a set of patches for Valgrind that add support for the
rte_*alloc family of functions. We use it for memcheck (I added support
for all the other Valgrind tools like cachegrind as well, but it's
less tested), and find it extremely useful, since the vanilla version
cannot intercept and report leaks caused by rte_*alloc functions from
librte_malloc.

While at FOSDEM last week I mentioned this to Mark Gray and Kevin
Traynor after their presentations (sorry it took a while to remember to
push this up :-) ), and they thought it would be useful for other
developers of DPDK-based applications, so I've uploaded it to GitHub [1].
I've also sent the patches upstream a while ago [2].

It works with both a statically and a dynamically linked DPDK
library.

To use it with a statically linked DPDK, pass the following parameter to
Valgrind: --soname-synonyms=somalloc=NONE

To use it with a dynamically linked DPDK no additional parameter is
needed, but make sure that the SONAME matches either "lib*dpdk.so*" if
building a single .so or "librte_malloc.so*" if building each library
individually. If it doesn't match either of these patterns, then you can
manually patch the file include/pub_tool_redir.h and rebuild.
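
The two SONAME patterns above use shell-style wildcards. As a rough way to check a library name before running Valgrind, the sketch below tests a SONAME against both patterns with POSIX fnmatch(3). It assumes fnmatch's wildcard semantics are close enough to Valgrind's own matcher for these simple patterns; the function name soname_is_covered is made up for illustration and is not part of Valgrind or DPDK.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical helper, not part of Valgrind or DPDK: returns 1 if the
 * SONAME matches one of the two patterns quoted above, 0 otherwise.
 * fnmatch(3) is only an approximation of Valgrind's wildcard matching. */
int soname_is_covered(const char *soname)
{
    static const char *const patterns[] = {
        "lib*dpdk.so*",       /* DPDK built as a single combined .so */
        "librte_malloc.so*",  /* DPDK built as one .so per library */
    };
    size_t i;

    for (i = 0; i < sizeof(patterns) / sizeof(patterns[0]); i++) {
        if (fnmatch(patterns[i], soname, 0) == 0)
            return 1;
    }
    return 0;
}
```

The SONAME itself can be read from the dynamic section of the library, for instance with readelf -d, and passed to this helper as a string.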

Please feel free to provide comments, feedback, or patches! All bug
reports will be promptly redirected to /dev/null :-)

-- 
Kind regards,
Luca Boccassi
Brocade Communications Systems

[1] https://github.com/bluca/valgrind-dpdk
[2] https://bugs.kde.org/show_bug.cgi?id=350405


* Re: [dpdk-dev] DPDK (and rte_*alloc family) friendly Valgrind
  2016-02-10 22:54 [dpdk-dev] DPDK (and rte_*alloc family) friendly Valgrind Luca Boccassi
@ 2016-02-11  7:34 ` Thomas Monjalon
  2016-02-13  6:47   ` Matthew Hall
  2016-02-13 12:30   ` Luca Boccassi
  0 siblings, 2 replies; 7+ messages in thread
From: Thomas Monjalon @ 2016-02-11  7:34 UTC (permalink / raw)
  To: Luca Boccassi; +Cc: dev

2016-02-10 22:54, Luca Boccassi:
> I created a set of patches for Valgrind that add support for the
> rte_*alloc family of functions. We use it for memcheck (I added support
> for all the other Valgrind tools like cachegrind as well, but it's
> less tested), and find it extremely useful, since the vanilla version
> cannot intercept and report leaks caused by rte_*alloc functions from
> librte_malloc.

Thank you Luca.
I think it deserves to be visible in the DPDK doc.
What about adding some explanations in
http://dpdk.org/doc/guides/prog_guide/profile_app.html
or
http://dpdk.org/doc/guides/prog_guide/env_abstraction_layer.html#malloc
?


* Re: [dpdk-dev] DPDK (and rte_*alloc family) friendly Valgrind
  2016-02-11  7:34 ` Thomas Monjalon
@ 2016-02-13  6:47   ` Matthew Hall
  2016-02-13 12:15     ` Luca Boccassi
  2016-02-13 12:30   ` Luca Boccassi
  1 sibling, 1 reply; 7+ messages in thread
From: Matthew Hall @ 2016-02-13  6:47 UTC (permalink / raw)
  To: Luca Boccassi; +Cc: <dev@dpdk.org>

2016-02-10 22:54, Luca Boccassi:
> I created a set of patches for Valgrind that add support for the rte_*alloc family of functions. We use it for memcheck

Hi Luca,

This is awesome stuff:

==18730== Source and destination overlap in memcpy(0x6851c00, 0x6851c00, 4144)
==18730==    at 0x4C30573: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==18730==    by 0x787195: sort_by_physaddr (eal_memory.c:747)
==18730==    by 0x788314: rte_eal_hugepage_init (eal_memory.c:1198)
==18730==    by 0x8F1888: rte_eal_memory_init (eal_common_memory.c:145)
==18730==    by 0x751BB6: rte_eal_init (eal.c:793)
==18730==    by 0x4D9E8A: main (sdn_sensor.c:555)

==18730== Thread 3:
==18730== Syscall param epoll_ctl(event) points to uninitialised byte(s)
==18730==    at 0x60C249A: epoll_ctl (syscall-template.S:81)
==18730==    by 0x72F88E: eal_intr_thread_main (eal_interrupts.c:844)
==18730==    by 0x581A6A9: start_thread (pthread_create.c:333)
==18730==    by 0x60C1EEC: clone (clone.S:109)
==18730==  Address 0x8400b38 is on thread 3's stack
==18730==  in frame #1, created by eal_intr_thread_main (eal_interrupts.c:801)
==18730==

I'll be running my app with this special valgrind heavily and patching these from now on.

I was wondering, is there a way to get the patchset on its own branch versus the master valgrind in your repository? If so it will be easier for the rest of the community to assist you with rebasing it periodically to the upstream valgrind. This would be easy if you keep their upstream code in one branch and the patches applied in another branch. Then we can update the master one somehow and rebase the patched one to this new code...

Matthew.


* Re: [dpdk-dev] DPDK (and rte_*alloc family) friendly Valgrind
  2016-02-13  6:47   ` Matthew Hall
@ 2016-02-13 12:15     ` Luca Boccassi
  0 siblings, 0 replies; 7+ messages in thread
From: Luca Boccassi @ 2016-02-13 12:15 UTC (permalink / raw)
  To: mhall; +Cc: dev

On Fri, 2016-02-12 at 22:47 -0800, Matthew Hall wrote:
> 2016-02-10 22:54, Luca Boccassi:
> > I created a set of patches for Valgrind that add support for the rte_*alloc family of functions. We use it for memcheck
> 
> Hi Luca,
> 
> This is awesome stuff:
> 
> ==18730== Source and destination overlap in memcpy(0x6851c00, 0x6851c00, 4144)
> ==18730==    at 0x4C30573: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==18730==    by 0x787195: sort_by_physaddr (eal_memory.c:747)
> ==18730==    by 0x788314: rte_eal_hugepage_init (eal_memory.c:1198)
> ==18730==    by 0x8F1888: rte_eal_memory_init (eal_common_memory.c:145)
> ==18730==    by 0x751BB6: rte_eal_init (eal.c:793)
> ==18730==    by 0x4D9E8A: main (sdn_sensor.c:555)
> 
> ==18730== Thread 3:
> ==18730== Syscall param epoll_ctl(event) points to uninitialised byte(s)
> ==18730==    at 0x60C249A: epoll_ctl (syscall-template.S:81)
> ==18730==    by 0x72F88E: eal_intr_thread_main (eal_interrupts.c:844)
> ==18730==    by 0x581A6A9: start_thread (pthread_create.c:333)
> ==18730==    by 0x60C1EEC: clone (clone.S:109)
> ==18730==  Address 0x8400b38 is on thread 3's stack
> ==18730==  in frame #1, created by eal_intr_thread_main (eal_interrupts.c:801)
> ==18730==
> 
> I'll be running my app with this special valgrind heavily and patching these from now on.

Hi Matthew,

Very cool, happy to help :-)

Unrelated to Valgrind but related to debugging tools: I've been playing
with our DPDK application and GCC's address sanitizer, and I would
recommend looking into it. It cannot be turned on in production as it
kills performance, but for test builds it's very useful as it will abort
when out-of-bounds reads/writes or use-after-free happen. To use it
simply add -fsanitize=address to the CFLAGS.
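
As an illustration of the kind of bug this catches, here is a minimal sketch. It assumes a GCC or Clang build; the function names are made up for the example. built_with_asan() reports whether the sanitizer was enabled at compile time, and read_past_end() contains a deliberate out-of-bounds read that an ASan build is expected to abort on.

```c
#include <stdlib.h>

/* Returns 1 if this file was compiled with -fsanitize=address, else 0.
 * GCC defines __SANITIZE_ADDRESS__; Clang exposes __has_feature. */
int built_with_asan(void)
{
#if defined(__SANITIZE_ADDRESS__)
    return 1;
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
    return 1;
#  else
    return 0;
#  endif
#else
    return 0;
#endif
}

/* Deliberately buggy, for demonstration only: reads one element past the
 * end of a heap buffer of n ints. Without ASan the behaviour is undefined. */
int read_past_end(size_t n)
{
    int *buf = calloc(n, sizeof(*buf));
    int value;

    if (buf == NULL)
        return -1;
    value = buf[n];  /* out-of-bounds read: valid indices are 0..n-1 */
    free(buf);
    return value;
}
```

Calling read_past_end() from a test binary built with -g -fsanitize=address should produce a heap-buffer-overflow report pointing at the marked line.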

> I was wondering, is there a way to get the patchset on its own branch versus the master valgrind in your repository? If so it will be easier for the rest of the community to assist you with rebasing it periodically to the upstream valgrind. This would be easy if you keep their upstream code in one branch and the patches applied in another branch. Then we can update the master one somehow and rebase the patched one to this new code...

Sure thing, good idea! I've pushed an "upstream" branch on the commit
that did the initial import. This way we can follow the common workflow
of importing upstream changes into their own branch, and then merging
into the master branch, which has the patches. Does that sound
reasonable?

-- 
Kind regards,
Luca Boccassi
Brocade Communications Systems


* Re: [dpdk-dev] DPDK (and rte_*alloc family) friendly Valgrind
  2016-02-11  7:34 ` Thomas Monjalon
  2016-02-13  6:47   ` Matthew Hall
@ 2016-02-13 12:30   ` Luca Boccassi
  2016-02-13 19:59     ` Matthew Hall
  2016-02-15  9:16     ` Thomas Monjalon
  1 sibling, 2 replies; 7+ messages in thread
From: Luca Boccassi @ 2016-02-13 12:30 UTC (permalink / raw)
  To: thomas.monjalon; +Cc: dev

On Thu, 2016-02-11 at 08:34 +0100, Thomas Monjalon wrote:
> 2016-02-10 22:54, Luca Boccassi:
> > I created a set of patches for Valgrind that add support for the
> > rte_*alloc family of functions. We use it for memcheck (I added support
> > for all the other Valgrind tools like cachegrind as well, but it's
> > less tested), and find it extremely useful, since the vanilla version
> > cannot intercept and report leaks caused by rte_*alloc functions from
> > librte_malloc.
> 
> Thank you Luca.
> I think it deserves to be visible in the DPDK doc.
> What about adding some explanations in
> > http://dpdk.org/doc/guides/prog_guide/profile_app.html
> > or
> > http://dpdk.org/doc/guides/prog_guide/env_abstraction_layer.html#malloc
> ?

Hi Thomas,

Thanks, anything I could help with for that to happen?

Also, a few words about the actual implementation.

Valgrind re-implements the whole *alloc and friends internally. There is
a common framework shared between the various tools, and each builds on
top of it.

What I've done is to map the various rte_*alloc/free functions on top of
Valgrind's implementation of posix_memalign/free. This was done in order
to respect the cache alignment parameter of rte_malloc and friends. I've
tested to make sure that this works correctly, as we rely heavily upon
it.

I have not, however, implemented support for NUMA sockets. There is no
such concept inside Valgrind's framework at the moment, so it would be a
monumental task. The NUMA socket parameter will simply be ignored. I do
not believe it would be very useful to implement support for it, as it
doesn't add much. For the purpose of memory leak detection, I don't
think it matters much on which socket a memory block is allocated.

This might have an effect on cachegrind though, so it's worth noting and
bearing it in mind when using cachegrind rather than memcheck.

I've added a note on Github.

-- 
Kind regards,
Luca Boccassi


* Re: [dpdk-dev] DPDK (and rte_*alloc family) friendly Valgrind
  2016-02-13 12:30   ` Luca Boccassi
@ 2016-02-13 19:59     ` Matthew Hall
  2016-02-15  9:16     ` Thomas Monjalon
  1 sibling, 0 replies; 7+ messages in thread
From: Matthew Hall @ 2016-02-13 19:59 UTC (permalink / raw)
  To: Luca Boccassi; +Cc: dev

On Feb 13, 2016, at 4:30 AM, Luca Boccassi <lboccass@Brocade.com> wrote:
> I have not, however, implemented support for NUMA sockets. There is no
> such concept inside Valgrind's framework at the moment, so it would be a
> monumental task.

There is a way to mark the mallocs and frees from inside a custom allocator instead of remapping to valgrind's allocator. jemalloc uses this if you enable it. I use jemalloc with my DPDK code for all the variable-sized mallocs as I prefer it to librte_malloc, and valgrind works fine on all those allocs because jemalloc calls the hinter functions. 

include/jemalloc/internal/jemalloc_internal.h
look for #ifdef JEMALLOC_VALGRIND
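
A rough sketch of what such marking can look like, using the client-request macros VALGRIND_MALLOCLIKE_BLOCK and VALGRIND_FREELIKE_BLOCK from valgrind/valgrind.h. Whether jemalloc uses exactly these calls is an assumption on my part; the tiny bump allocator and its pool_* names are invented for the example. If the Valgrind header is not installed, the annotations compile away to nothing.

```c
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#    define SKETCH_HAVE_VALGRIND 1
#  endif
#endif
#ifndef SKETCH_HAVE_VALGRIND
/* Header not found: turn the annotations into no-ops. */
#  define VALGRIND_MALLOCLIKE_BLOCK(addr, size, rz, zeroed) ((void)0)
#  define VALGRIND_FREELIKE_BLOCK(addr, rz) ((void)0)
#endif

#define POOL_SIZE  4096
#define POOL_ALIGN 16

static unsigned char pool[POOL_SIZE];  /* backing store of the toy allocator */
static size_t pool_used;               /* bytes handed out so far */

/* Hand out the next chunk of the pool and tell Valgrind about it. */
void *pool_alloc(size_t size)
{
    size_t rounded;
    void *ptr;

    if (size == 0 || size > POOL_SIZE)
        return NULL;
    /* Keep chunk offsets at multiples of POOL_ALIGN. */
    rounded = (size + POOL_ALIGN - 1) & ~(size_t)(POOL_ALIGN - 1);
    if (rounded > POOL_SIZE - pool_used)
        return NULL;
    ptr = &pool[pool_used];
    pool_used += rounded;
    /* Register the block: no red zone, contents not claimed as zeroed. */
    VALGRIND_MALLOCLIKE_BLOCK(ptr, size, 0, 0);
    return ptr;
}

/* A bump allocator never reuses memory; only the bookkeeping is undone. */
void pool_free(void *ptr)
{
    if (ptr == NULL)
        return;
    VALGRIND_FREELIKE_BLOCK(ptr, 0);
}
```

With this approach the allocator keeps its own placement logic, so something like a NUMA-aware pool would not need to be remapped onto Valgrind's allocator. A real integration would likely also mark the unused pool as inaccessible through the memcheck-specific requests, which this sketch leaves out.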

> This might have an effect on cachegrind though, so it's worth noting and
> bearing it in mind when using cachegrind rather than memcheck.

I am not sure that's much of a limitation really, because I don't think anybody would use cachegrind on DPDK code. Instead you would use VTune, which is freely available for open-source, or the perf subsystem to monitor the cache performance counters. The only thing I am aware of that Valgrind does, and that the performance hardware cannot also do, is memcheck. Unless I missed anything.

Either way this is very handy to have.

Matthew.


* Re: [dpdk-dev] DPDK (and rte_*alloc family) friendly Valgrind
  2016-02-13 12:30   ` Luca Boccassi
  2016-02-13 19:59     ` Matthew Hall
@ 2016-02-15  9:16     ` Thomas Monjalon
  1 sibling, 0 replies; 7+ messages in thread
From: Thomas Monjalon @ 2016-02-15  9:16 UTC (permalink / raw)
  To: Luca Boccassi; +Cc: dev

2016-02-13 12:30, Luca Boccassi:
> On Thu, 2016-02-11 at 08:34 +0100, Thomas Monjalon wrote:
> > 2016-02-10 22:54, Luca Boccassi:
> > > I created a set of patches for Valgrind that add support for the
> > > rte_*alloc family of functions. We use it for memcheck (I added support
> > > for all the other Valgrind tools like cachegrind as well, but it's
> > > less tested), and find it extremely useful, since the vanilla version
> > > cannot intercept and report leaks caused by rte_*alloc functions from
> > > librte_malloc.
> > 
> > Thank you Luca.
> > I think it deserves to be visible in the DPDK doc.
> > What about adding some explanations in
> > > http://dpdk.org/doc/guides/prog_guide/profile_app.html
> > > or
> > > http://dpdk.org/doc/guides/prog_guide/env_abstraction_layer.html#malloc
> > ?
> 
> Hi Thomas,
> 
> Thanks, anything I could help with for that to happen?

Yes, the documentation is in the git tree.
If you have time, it would be nice to send a patch on this list to
point to your patches and explain how it works (the notes below can be included).
The guide for doc contribution is http://dpdk.org/doc/guides/contributing/documentation.html

> Also, a few words about the actual implementation.
> 
> Valgrind re-implements the whole *alloc and friends internally. There is
> a common framework shared between the various tools, and each builds on
> top of it.
> 
> What I've done is to map the various rte_*alloc/free functions on top of
> Valgrind's implementation of posix_memalign/free. This was done in order
> to respect the cache alignment parameter of rte_malloc and friends. I've
> tested to make sure that this works correctly, as we rely heavily upon
> it.
> 
> I have not, however, implemented support for NUMA sockets. There is no
> such concept inside Valgrind's framework at the moment, so it would be a
> monumental task. The NUMA socket parameter will simply be ignored. I do
> not believe it would be very useful to implement support for it, as it
> doesn't add much. For the purpose of memory leak detection, I don't
> think it matters much on which socket a memory block is allocated.
> 
> This might have an effect on cachegrind though, so it's worth noting and
> bearing it in mind when using cachegrind rather than memcheck.
> 
> I've added a note on Github.

Thanks
