DPDK patches and discussions
From: David Marchand <david.marchand@redhat.com>
To: lic121 <chengtcli@qq.com>
Cc: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>, dev <dev@dpdk.org>
Subject: Re: [PATCH] eal: zero out new added memory
Date: Mon, 29 Aug 2022 13:57:48 +0200
Message-ID: <CAJFAV8x8c7xth4WmXPG3bautS43YBS3OKfDmKekozW1U=ri+jg@mail.gmail.com>
In-Reply-To: <tencent_93CEF4C2700DE476C4A48B3A1BD1A21F210A@qq.com>

On Mon, Aug 29, 2022 at 1:38 PM lic121 <chengtcli@qq.com> wrote:
>
> On Mon, Aug 29, 2022 at 01:18:36AM +0000, lic121 wrote:
> > On Sat, Aug 27, 2022 at 05:56:54PM +0300, Dmitry Kozlyuk wrote:
> > > 2022-08-27 13:31 (UTC+0000), lic121:
> > > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> > > > > 2022-08-27 09:25 (UTC+0000), chengtcli@qq.com:
> > > > > > From: lic121 <lic121@chinatelecom.cn>
> > > > > >
> > > > > > > When RTE_MALLOC_DEBUG is not configured, rte_zmalloc_socket() doesn't
> > > > > > > zero out allocated memory, because memory is zeroed out when freed
> > > > > > > in malloc_elem_free(). But it seems the initially allocated memory is
> > > > > > > not zeroed out as expected.
> > > > > > >
> > > > > > > This patch zeroes out the initially allocated memory in
> > > > > > > malloc_heap_add_memory().
> > > > > >
> > > > > > With dpdk 20.11.5, "QLogic Corp. FastLinQ QL41000" probe triggers
> > > > > > this problem.
> > > > > > ```
> > > > > >   Stack trace of thread 412780:
> > > > > >   #0  0x0000000000e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
> > > > > >   #1  0x0000000000e4df54 ecore_get_hw_info (dpdk-testpmd)
> > > > > >   #2  0x0000000000e504aa ecore_hw_prepare (dpdk-testpmd)
> > > > > >   #3  0x0000000000e8a7ca qed_probe (dpdk-testpmd)
> > > > > >   #4  0x0000000000e83c59 qede_common_dev_init (dpdk-testpmd)
> > > > > >   #5  0x0000000000e84c8e qede_eth_dev_init (dpdk-testpmd)
> > > > > >   #6  0x00000000009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
> > > > > >   #7  0x00000000009734e3 rte_bus_probe (dpdk-testpmd)
> > > > > >   #8  0x00000000009933bd rte_eal_init (dpdk-testpmd)
> > > > > >   #9  0x000000000041768f main (dpdk-testpmd)
> > > > > >   #10 0x00007f41a7001b17 __libc_start_main (libc.so.6)
> > > > > >   #11 0x000000000067e34a _start (dpdk-testpmd)
> > > > > > ```
> > > > > >
> > > > > > Signed-off-by: lic121 <lic121@chinatelecom.cn>
> > > > > > ---
> > > > > >  lib/librte_eal/common/malloc_heap.c | 8 ++++++++
> > > > > >  1 file changed, 8 insertions(+)
> > > > > >
> > > > > > diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
> > > > > > index f4e20ea..1607401 100644
> > > > > > --- a/lib/librte_eal/common/malloc_heap.c
> > > > > > +++ b/lib/librte_eal/common/malloc_heap.c
> > > > > > @@ -96,11 +96,19 @@
> > > > > >               void *start, size_t len)
> > > > > >  {
> > > > > >       struct malloc_elem *elem = start;
> > > > > > +     void *ptr;
> > > > > > > +     size_t data_len;
> > > > > > +
> > > > > >
> > > > > >       malloc_elem_init(elem, heap, msl, len, elem, len);
> > > > > >
> > > > > >       malloc_elem_insert(elem);
> > > > > >
> > > > > > +     /* Zero out new added memory. */
> > > > > > > +     ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
> > > > > > +     data_len = elem->size - MALLOC_ELEM_OVERHEAD;
> > > > > > +     memset(ptr, 0, data_len);
> > > > > > +
> > > > > >       elem = malloc_elem_join_adjacent_free(elem);
> > > > > >
> > > > > >       malloc_elem_free_list_insert(elem);
> > > > >
> > > > > Hi,
> > > > >
> > > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > > and DPDK ensures that files in hugetlbfs are not re-mapped.
> > > > > What makes you think that it is not zeroed?
> > > > > Were you able to catch [start; start+len) contain non-zero bytes
> > > > > at the start of this function?
> > > > > If so, is it system memory (not an external heap)?
> > > > > If so, what is the CPU, kernel, any custom settings?
> > > > >
> > > > > Can it be the PMD or the app that uses rte_malloc instead of rte_zmalloc?
> > > > >
> > > > > This patch cannot be accepted as-is anyway:
> > > > > 1. It zeroes memory even if the code was called not via rte_zmalloc().
> > > > > 2. It leads to zeroing on both alloc and free, which is suboptimal.
> > > >
> > > > Hi Dmitry, thanks for the review.
> > > >
> > > > In rte_eth_dev_pci_allocate(), immediately after rte_zmalloc_socket() [1],
> > > > I printed the content in gdb. It's not zero.
> > > >
> > > > print ((struct qede_dev *)(eth_dev->data->dev_private))->edev->p_iov_info
> > > >
> > > > cpu: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
> > > > kernel: 4.19.90-2102
> > > >
> > > > [1]
> > > > https://github.com/DPDK/dpdk/blob/v20.11/lib/librte_ethdev/rte_ethdev_pci.h#L91-L93
> > >
> > > Sorry, it seems that something is wrong with your debug.
> > > Your link is for 20.11.0.
> > > In 20.11.5 (apparently always) struct qede_dev::edev is not a pointer [2].
> > > Even if it were, in zeroed memory it would be a NULL pointer, and
> > > reading a member would give a random value at NULL + some offset.
> > > I suggest printing the content of the allocated memory with rte_hexdump().
> > >
> >
> > Sorry, I didn't describe my debugging clearly. At first I debugged with
> > version 20.11.0 and found that the rte_zmalloc_socket() memory is dirty.
> > Then I tried 20.11.5; I didn't debug on 20.11.5, but the behavior is the
> > same (the NIC failed to be probed). So in the commit message I said
> > v20.11.5 has the issue, but when describing my debugging I used the
> > 20.11.0 URL.
> >
> > More debug info:
> > 1. I reproduced the issue tens of times; every time the printed variable
> > has the same value.
> > 2. After searching for malloc_heap_add_memory, I found three places
> > that call this function to add memory: malloc_add_seg(),
> > alloc_pages_on_heap() and malloc_heap_add_external_memory(). First, I
> > zeroed out memory only for malloc_add_seg(); it didn't fix the issue.
> > Then I zeroed out memory in malloc_heap_add_memory() to cover all three
> > cases; this time the NIC was probed successfully.
> > 3. Once the NIC was probed, I rolled back my fix and tried to reproduce
> > the issue, but I couldn't reproduce it anymore. So my guess is that the
> > memory allocated when probing the qede NIC is at a fixed memory location,
> > because every time in my debugging the printed variable had the same
> > value. After I zeroed out the allocated memory once, I couldn't reproduce
> > the issue anymore.
> >
> > > [2]:
> > > http://git.dpdk.org/dpdk-stable/tree/drivers/net/qede/qede_ethdev.h?h=v20.11.5#n223
>
> Today we probably hit the same issue with an Intel E810 NIC: the E810
> can be probed on some hosts but not on others. On the same host, one
> E810 may be probed while another can't be.
> After I applied this patch, the issue no longer occurs.

Are you perhaps running your DPDK application from inside a container?
I remember tracking an issue which had to do with reusing a "dirty"
hugepage file (because SELinux forbade destroying those files).
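
One way to confirm whether the memory is already dirty when it reaches the
heap, along the lines of Dmitry's question about non-zero bytes in
[start; start+len), is to scan the range before it is used. This is a
minimal sketch in plain C, not DPDK code: first_nonzero_offset() and
report_if_dirty() are hypothetical helpers introduced here for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of DPDK): return the offset of the first
 * non-zero byte in [start, start + len), or len if the range is all zero.
 */
static size_t
first_nonzero_offset(const void *start, size_t len)
{
	const uint8_t *p = start;
	size_t i;

	for (i = 0; i < len; i++) {
		if (p[i] != 0)
			return i;
	}
	return len;
}

/*
 * Hypothetical helper: print the first dirty byte of a range to stderr.
 * Returns 1 if the range contains a non-zero byte, 0 if it is all zero.
 */
static int
report_if_dirty(const char *what, const void *start, size_t len)
{
	size_t off = first_nonzero_offset(start, len);

	if (off == len)
		return 0;
	fprintf(stderr, "%s: non-zero byte 0x%02x at offset %zu of %zu\n",
		what, ((const uint8_t *)start)[off], off, len);
	return 1;
}
```

Calling report_if_dirty("malloc_heap_add_memory", start, len) at the top of
malloc_heap_add_memory() would be one place to try it; that call site is
only a suggestion and has not been tested against the DPDK tree.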


-- 
David Marchand


Thread overview: 16+ messages
2022-08-27  9:25 chengtcli
2022-08-27  9:57 ` Dmitry Kozlyuk
2022-08-27 13:31   ` lic121
2022-08-27 14:56     ` Dmitry Kozlyuk
2022-08-29  1:18       ` lic121
2022-08-29 11:37         ` lic121
2022-08-29 11:57           ` David Marchand [this message]
2022-08-29 12:37             ` Morten Brørup
2022-08-29 12:43               ` David Marchand
2022-08-29 12:49               ` Dmitry Kozlyuk
2022-08-30  1:11                 ` lic121
2022-08-30  9:49                   ` lic121
2022-08-30 10:59                     ` Dmitry Kozlyuk
2022-08-30 12:47                       ` lic121
2022-08-30 12:53                       ` lic121
2022-09-03 13:53                         ` lic121
