From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Frank Du" <frank.du@intel.com>, <dev@dpdk.org>
Cc: <ciara.loftus@intel.com>, <ferruh.yigit@amd.com>
Subject: RE: [PATCH v4] net/af_xdp: fix umem map size for zero copy
Date: Thu, 23 May 2024 11:22:37 +0200
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35E9F492@smartserver.smartshare.dk>
In-Reply-To: <20240523080751.2347970-1-frank.du@intel.com>
> From: Frank Du [mailto:frank.du@intel.com]
> Sent: Thursday, 23 May 2024 10.08
>
> The current calculation assumes that the mbufs are contiguous. However,
> this assumption is incorrect when the mbuf memory spans across huge page.
> To ensure that each mbuf resides exclusively within a single page, there
> are deliberate spacing gaps when allocating mbufs across the boundaries.
I agree that this patch is an improvement over what existed previously.
But I still don't understand the patch description. To me, it looks like the patch adds a missing check for contiguous memory, and the patch itself has nothing to do with huge pages. Anyway, if the maintainer agrees with the description, I don't mind not grasping it. ;-)
However, while trying to understand what is happening, I think I found one more (pre-existing) bug.
I will illustrate it with an example inline below.
>
> Correct to directly read the size from the mempool memory chunk.
>
> Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
> Cc: stable@dpdk.org
>
> Signed-off-by: Frank Du <frank.du@intel.com>
>
> ---
> v2:
> * Add virtual contiguous detect for for multiple memhdrs
> v3:
> * Use RTE_ALIGN_FLOOR to get the aligned addr
> * Add check on the first memhdr of memory chunks
> v4:
> * Replace the iterating with simple nb_mem_chunks check
> ---
> drivers/net/af_xdp/rte_eth_af_xdp.c | 33 +++++++++++++++++++++++------
> 1 file changed, 26 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c
> b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index 6ba455bb9b..d0431ec089 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -1040,16 +1040,32 @@ eth_link_update(struct rte_eth_dev *dev __rte_unused,
> }
>
> #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
> -static inline uintptr_t get_base_addr(struct rte_mempool *mp, uint64_t
> *align)
> +static inline uintptr_t
> +get_memhdr_info(const struct rte_mempool *mp, uint64_t *align, size_t *len)
> {
> struct rte_mempool_memhdr *memhdr;
> uintptr_t memhdr_addr, aligned_addr;
>
> + if (mp->nb_mem_chunks != 1) {
> + /*
> + * The mempool with multiple chunks is not virtual contiguous but
> + * xsk umem only support single virtual region mapping.
> + */
> + AF_XDP_LOG(ERR, "The mempool contain multiple %u memory
> chunks\n",
> + mp->nb_mem_chunks);
> + return 0;
> + }
> +
> + /* Get the mempool base addr and align from the header now */
> memhdr = STAILQ_FIRST(&mp->mem_list);
> + if (!memhdr) {
> + AF_XDP_LOG(ERR, "The mempool is not populated\n");
> + return 0;
> + }
> memhdr_addr = (uintptr_t)memhdr->addr;
> - aligned_addr = memhdr_addr & ~(getpagesize() - 1);
> + aligned_addr = RTE_ALIGN_FLOOR(memhdr_addr, getpagesize());
> *align = memhdr_addr - aligned_addr;
> -
> + *len = memhdr->len;
> return aligned_addr;
On x86_64, the page size is 4 KB = 0x1000.
Let's look at an example where memhdr->addr is not aligned to the page size:
In the example,
memhdr->addr is 0x700100, and
memhdr->len is 0x20000.
Then
aligned_addr becomes 0x700000,
*align becomes 0x100, and
*len becomes 0x20000.
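
The arithmetic in this example can be reproduced with a small standalone sketch. It is not the driver code: `struct floor_region` and `floor_region()` are hypothetical names, and the page size is passed in as a parameter instead of being read from getpagesize(). The mask is assumed to be equivalent to RTE_ALIGN_FLOOR for a power-of-two page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three values discussed in the example. */
struct floor_region {
	uintptr_t base;  /* page-aligned address, rounded down */
	uint64_t align;  /* bytes between base and the real chunk start */
	uint64_t size;   /* chunk length plus the leading gap */
};

/*
 * Round the chunk address down to a page boundary and grow the size
 * by the gap, mirroring what the patched code computes.
 */
static struct floor_region
floor_region(uintptr_t chunk_addr, size_t chunk_len, uintptr_t page_size)
{
	struct floor_region r;

	/* The mask below is only valid for a power-of-two page size. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	r.base = chunk_addr & ~(page_size - 1);
	r.align = (uint64_t)(chunk_addr - r.base);
	r.size = (uint64_t)chunk_len + r.align;
	return r;
}
```

Calling floor_region(0x700100, 0x20000, 0x1000) gives base 0x700000, align 0x100 and size 0x20100, so the region starts 0x100 bytes before the memory chunk.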
> }
>
> @@ -1126,6 +1142,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals
> *internals,
> void *base_addr = NULL;
> struct rte_mempool *mb_pool = rxq->mb_pool;
> uint64_t umem_size, align = 0;
> + size_t len = 0;
>
> if (internals->shared_umem) {
> if (get_shared_umem(rxq, internals->if_name, &umem) < 0)
> @@ -1157,10 +1174,12 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals
> *internals,
> }
>
> umem->mb_pool = mb_pool;
> - base_addr = (void *)get_base_addr(mb_pool, &align);
> - umem_size = (uint64_t)mb_pool->populated_size *
> - (uint64_t)usr_config.frame_size +
> - align;
> + base_addr = (void *)get_memhdr_info(mb_pool, &align, &len);
> + if (!base_addr) {
> + AF_XDP_LOG(ERR, "The memory pool can't be mapped as
> umem\n");
> + goto err;
> + }
> + umem_size = (uint64_t)len + align;
Here, umem_size becomes 0x20100.
>
> ret = xsk_umem__create(&umem->umem, base_addr, umem_size,
> &rxq->fq, &rxq->cq, &usr_config);
Here, xsk_umem__create() is called with base_addr (0x700000), which precedes the address of the memory chunk (0x700100).
This looks like a bug that could cause a buffer underrun, i.e. will it access memory starting at base_addr, before the chunk begins?
If I'm correct, the code should probably do this for alignment instead:
aligned_addr = RTE_ALIGN_CEIL(memhdr_addr, getpagesize());
*align = aligned_addr - memhdr_addr;
umem_size = (uint64_t)len - align;
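
As a rough illustration of this suggestion only, the sketch below rounds the address up instead of down and shrinks the size accordingly. `struct ceil_region` and `ceil_region()` are hypothetical names, the page size is a parameter, and the rounding expression is assumed to match RTE_ALIGN_CEIL for a power-of-two page size. Whether this is the right fix for AF_XDP is not established here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the values of the suggested alternative. */
struct ceil_region {
	uintptr_t base;  /* page-aligned address, rounded up */
	uint64_t align;  /* bytes skipped at the start of the chunk */
	uint64_t size;   /* chunk length minus the skipped bytes */
};

/*
 * Round the chunk address up to a page boundary so the region never
 * starts before the chunk, then subtract the skipped bytes from the size.
 */
static struct ceil_region
ceil_region(uintptr_t chunk_addr, size_t chunk_len, uintptr_t page_size)
{
	struct ceil_region r;

	/* The mask below is only valid for a power-of-two page size. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	r.base = (chunk_addr + page_size - 1) & ~(page_size - 1);
	r.align = (uint64_t)(r.base - chunk_addr);

	/* The chunk must be at least as long as the skipped prefix. */
	assert(r.align <= (uint64_t)chunk_len);
	r.size = (uint64_t)chunk_len - r.align;
	return r;
}
```

With the same inputs as the example above (0x700100, 0x20000, page size 0x1000), this gives base 0x701000, align 0xf00 and size 0x1f100, so the region stays entirely inside the memory chunk at the cost of the first 0xf00 bytes.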
Disclaimer: I don't know much about the AF_XDP implementation, so maybe I just don't understand what is going on.
> --
> 2.34.1