DPDK patches and discussions
From: bugzilla@dpdk.org
To: dev@dpdk.org
Subject: [dpdk-dev] [Bug 786] dynamic memory model may cause potential DMA silent error
Date: Tue, 10 Aug 2021 08:00:58 +0000
Message-ID: <bug-786-3@http.bugs.dpdk.org/>

https://bugs.dpdk.org/show_bug.cgi?id=786

            Bug ID: 786
           Summary: dynamic memory model may cause potential DMA silent
                    error
           Product: DPDK
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: core
          Assignee: dev@dpdk.org
          Reporter: changpeng.liu@intel.com
  Target Milestone: ---

In some very rare situations the vfio dynamic memory model has an issue that
can cause the DMA engine to place data in the wrong IO buffer. These are the
steps we used to identify the issue:

1. Start the application and call rte_zmalloc to allocate IO buffers. Hotplug
one NVMe drive; DPDK then registers the existing memory region with the kernel
vfio driver via the dma_map ioctl. We added a trace before this ioctl:
DPDK dma_map vaddr: 0x200000200000, iova: 0x200000200000, size: 0x14200000,
ret: 0

2. Call rte_free to free some memory buffers. DPDK calls dma_unmap on the vfio
driver and releases the related hugepage files:
DPDK dma_unmap iova: 0x20000a400000, size: 0x0, ret: 0

The return value is 0, which means success, but the unmapped size is 0: the
kernel vfio driver did not actually unmap anything, because the IOVA range does
not match the range that was previously mapped. Newer DPDK versions print an
error in this case.
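
The point of step 2 is that a zero return value alone does not prove the range
was unmapped. The VFIO unmap ioctl writes the number of bytes it actually
unmapped back into the size field of its argument, so a caller can compare that
value with what it asked for. A minimal sketch of such a check follows; the
enum and the function name classify_unmap are hypothetical and are not DPDK or
kernel identifiers.

```c
#include <stdint.h>

/* Hypothetical result codes for illustration only. */
enum unmap_status {
    UNMAP_OK,      /* call succeeded and the whole range was removed */
    UNMAP_FAILED,  /* the call itself returned an error */
    UNMAP_SHORT    /* call returned 0 but removed fewer bytes than requested */
};

/*
 * ret:       return value of the unmap call
 * requested: number of bytes the caller asked to unmap
 * reported:  number of bytes the driver says it unmapped
 */
enum unmap_status classify_unmap(int ret, uint64_t requested, uint64_t reported)
{
    if (ret != 0)
        return UNMAP_FAILED;
    if (reported != requested)
        return UNMAP_SHORT;   /* the "ret: 0, size: 0x0" case from the trace */
    return UNMAP_OK;
}
```

With the values from the trace (ret 0, reported size 0x0) and any non-zero
requested size, this returns UNMAP_SHORT rather than UNMAP_OK.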

3. Call rte_zmalloc again. DPDK creates new hugepage files, remaps them at the
previous virtual address, and then calls dma_map to register them with the
kernel vfio driver:

DPDK dma_map vaddr: 0x20000a400000, iova: 0x20000a400000, size: 0x400000,
ret=-1, errno was set to EEXIST

but DPDK ignores this errno, so rte_zmalloc returns success.
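
The three steps can be replayed against a toy mapping table to show why the
second dma_map collides with the first. This is only a sketch of the behavior
described in this report, not the kernel vfio driver's code: every toy_* name
is hypothetical, and the unmap request size of 0x400000 is an assumption (the
trace only shows the size reported back).

```c
#include <errno.h>
#include <stdint.h>

/* Toy stand-in for the driver's DMA mapping table (hypothetical). */
#define TOY_MAX_MAPS 16

struct toy_map {
    uint64_t iova;
    uint64_t size;
    int used;
};

struct toy_map toy_table[TOY_MAX_MAPS];

/* Map [iova, iova + size). Fails with EEXIST if it overlaps an entry. */
int toy_dma_map(uint64_t iova, uint64_t size)
{
    int free_slot = -1;

    for (int i = 0; i < TOY_MAX_MAPS; i++) {
        if (!toy_table[i].used) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        if (iova < toy_table[i].iova + toy_table[i].size &&
            toy_table[i].iova < iova + size) {
            errno = EEXIST;
            return -1;
        }
    }
    if (free_slot < 0) {
        errno = ENOSPC;
        return -1;
    }
    toy_table[free_slot] = (struct toy_map){ iova, size, 1 };
    return 0;
}

/* Remove entries whose start lies in [iova, iova + size). A request that
 * begins in the middle of an entry removes nothing but still returns 0;
 * *unmapped reports the bytes actually removed. */
int toy_dma_unmap(uint64_t iova, uint64_t size, uint64_t *unmapped)
{
    *unmapped = 0;
    for (int i = 0; i < TOY_MAX_MAPS; i++) {
        if (toy_table[i].used && toy_table[i].iova >= iova &&
            toy_table[i].iova < iova + size) {
            *unmapped += toy_table[i].size;
            toy_table[i].used = 0;
        }
    }
    return 0;
}

/* Replay steps 1-3; returns the errno of the final map, or 0 on success. */
int toy_replay(uint64_t *unmapped)
{
    toy_dma_map(0x200000200000ULL, 0x14200000ULL);            /* step 1 */
    toy_dma_unmap(0x20000a400000ULL, 0x400000ULL, unmapped);  /* step 2 */
    if (toy_dma_map(0x20000a400000ULL, 0x400000ULL) != 0)     /* step 3 */
        return errno;
    return 0;
}
```

Calling toy_replay() from a main function leaves the step 1 entry in place
after step 2, so step 3 overlaps it and fails with EEXIST. A caller that
discards that error ends up in the state described in this report.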

If the newly allocated memory is then used as an NVMe IO buffer, the DMA engine
may move data to the previously pinned pages, because the kernel vfio driver
did not update the memory map. Nothing in the IO stack prints a warning.

The static memory model can be used as a workaround.
