DPDK patches and discussions
From: "Du, Frank" <frank.du@intel.com>
To: "Ferruh Yigit" <ferruh.yigit@amd.com>,
	"Morten Brørup" <mb@smartsharesystems.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: "Loftus, Ciara" <ciara.loftus@intel.com>
Subject: RE: [PATCH v4] net/af_xdp: fix umem map size for zero copy
Date: Fri, 24 May 2024 01:05:26 +0000
Message-ID: <PH0PR11MB477535C8BC39D0D07E355AFE80F52@PH0PR11MB4775.namprd11.prod.outlook.com>
In-Reply-To: <ddf06836-546d-4672-8b51-0fd67f4c36d9@amd.com>

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Thursday, May 23, 2024 9:32 PM
> To: Morten Brørup <mb@smartsharesystems.com>; Du, Frank
> <frank.du@intel.com>; dev@dpdk.org
> Cc: Loftus, Ciara <ciara.loftus@intel.com>
> Subject: Re: [PATCH v4] net/af_xdp: fix umem map size for zero copy
> 
> On 5/23/2024 10:22 AM, Morten Brørup wrote:
> >> From: Frank Du [mailto:frank.du@intel.com]
> >> Sent: Thursday, 23 May 2024 10.08
> >>
> >> The current calculation assumes that the mbufs are contiguous.
> >> However, this assumption is incorrect when the mbuf memory spans across
> >> huge pages.
> >> To ensure that each mbuf resides exclusively within a single page,
> >> there are deliberate spacing gaps when allocating mbufs across the
> >> boundaries.
> >
> > I agree that this patch is an improvement on what existed previously.
> > But I still don't understand the patch description. To me, it looks
> > like the patch adds a missing check for contiguous memory, and the
> > patch itself has nothing to do with huge pages. Anyway, if the
> > maintainer agrees with the description, I don't mind not grasping it.
> > ;-)
> >
> > However, while trying to understand what is happening, I think I found one
> > more (already existing) bug.
> > I will show through an example inline below.
> >
> >>
> >> Correct to directly read the size from the mempool memory chunk.
> >>
> >> Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
> >> Cc: stable@dpdk.org
> >>
> >> Signed-off-by: Frank Du <frank.du@intel.com>
> >>
> >> ---
> >> v2:
> >> * Add virtual contiguous detection for multiple memhdrs
> >> v3:
> >> * Use RTE_ALIGN_FLOOR to get the aligned addr
> >> * Add check on the first memhdr of memory chunks
> >> v4:
> >> * Replace the iterating with simple nb_mem_chunks check
> >> ---
> >>  drivers/net/af_xdp/rte_eth_af_xdp.c | 33 +++++++++++++++++++++++------
> >>  1 file changed, 26 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
> >> index 6ba455bb9b..d0431ec089 100644
> >> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> >> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> >> @@ -1040,16 +1040,32 @@ eth_link_update(struct rte_eth_dev *dev __rte_unused,
> >>  }
> >>
> >>  #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
> >> -static inline uintptr_t get_base_addr(struct rte_mempool *mp, uint64_t *align)
> >> +static inline uintptr_t
> >> +get_memhdr_info(const struct rte_mempool *mp, uint64_t *align,
> >> +size_t *len)
> >>  {
> >>  	struct rte_mempool_memhdr *memhdr;
> >>  	uintptr_t memhdr_addr, aligned_addr;
> >>
> >> +	if (mp->nb_mem_chunks != 1) {
> >> +		/*
> >> +		 * The mempool with multiple chunks is not virtual contiguous but
> >> +		 * xsk umem only support single virtual region mapping.
> >> +		 */
> >> +		AF_XDP_LOG(ERR, "The mempool contain multiple %u memory chunks\n",
> >> +				   mp->nb_mem_chunks);
> >> +		return 0;
> >> +	}
> >> +
> >> +	/* Get the mempool base addr and align from the header now */
> >>  	memhdr = STAILQ_FIRST(&mp->mem_list);
> >> +	if (!memhdr) {
> >> +		AF_XDP_LOG(ERR, "The mempool is not populated\n");
> >> +		return 0;
> >> +	}
> >>  	memhdr_addr = (uintptr_t)memhdr->addr;
> >> -	aligned_addr = memhdr_addr & ~(getpagesize() - 1);
> >> +	aligned_addr = RTE_ALIGN_FLOOR(memhdr_addr, getpagesize());
> >>  	*align = memhdr_addr - aligned_addr;
> >> -
> >> +	*len = memhdr->len;
> >>  	return aligned_addr;
> >
> > On x86_64, the page size is 4 KB = 0x1000.
> >
> > Let's look at an example where memhdr->addr is not aligned to the page size:
> >
> > In the example,
> > memhdr->addr is 0x700100, and
> > memhdr->len is 0x20000.
> >
> > Then
> > aligned_addr becomes 0x700000,
> > *align becomes 0x100, and
> > *len becomes 0x20000.
> >
> >>  }
> >>
> >> @@ -1126,6 +1142,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals,
> >>  	void *base_addr = NULL;
> >>  	struct rte_mempool *mb_pool = rxq->mb_pool;
> >>  	uint64_t umem_size, align = 0;
> >> +	size_t len = 0;
> >>
> >>  	if (internals->shared_umem) {
> >>  		if (get_shared_umem(rxq, internals->if_name, &umem) < 0)
> >> @@ -1157,10 +1174,12 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals,
> >>  		}
> >>
> >>  		umem->mb_pool = mb_pool;
> >> -		base_addr = (void *)get_base_addr(mb_pool, &align);
> >> -		umem_size = (uint64_t)mb_pool->populated_size *
> >> -				(uint64_t)usr_config.frame_size +
> >> -				align;
> >> +		base_addr = (void *)get_memhdr_info(mb_pool, &align, &len);
> >> +		if (!base_addr) {
> >> +			AF_XDP_LOG(ERR, "The memory pool can't be mapped as umem\n");
> >> +			goto err;
> >> +		}
> >> +		umem_size = (uint64_t)len + align;
> >
> > Here, umem_size becomes 0x20100.
> >
> >>
> >>  		ret = xsk_umem__create(&umem->umem, base_addr, umem_size,
> >>  				&rxq->fq, &rxq->cq, &usr_config);
> >
> > Here, xsk_umem__create() is called with the base_address (0x700000)
> > preceding the address of the memory chunk (0x700100).
> > It looks like a bug, causing a buffer underrun. I.e. will it access memory
> > starting at base_address?
> >
> 
> I already asked about this on v2; Frank mentioned that the area is not
> accessed and that having the gap is safe.

xsk_umem__create() requires a base address that is aligned to a page boundary.
The area between 0x700000 and 0x700100 is never accessed, because the memory
pointer for each XSK TX/RX descriptor is derived from the mbuf data area.
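
As a minimal sketch of the arithmetic in the example above (not the driver code itself): the chunk address is rounded down to a page boundary, and the gap is added to the mapping length so the whole chunk stays covered. The names umem_map, umem_map_floor and umem_offset are hypothetical helpers introduced only for illustration; the page size is passed in explicitly instead of calling getpagesize().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the values get_memhdr_info() reports. */
struct umem_map {
	uintptr_t base;  /* page-aligned address handed to the umem mapping */
	uint64_t align;  /* gap between base and the start of the chunk */
	uint64_t size;   /* mapping length: chunk length plus the gap */
};

/* Round the chunk address down to a page boundary (page_size: power of two). */
static struct umem_map
umem_map_floor(uintptr_t chunk_addr, size_t chunk_len, uintptr_t page_size)
{
	struct umem_map m;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	m.base = chunk_addr & ~(page_size - 1);
	m.align = chunk_addr - m.base;
	m.size = (uint64_t)chunk_len + m.align;
	return m;
}

/*
 * Offset of a buffer relative to the umem base. For any buffer inside the
 * chunk the result is >= m->align, so the gap itself is never addressed.
 */
static uint64_t
umem_offset(const struct umem_map *m, uintptr_t buf_addr)
{
	return (uint64_t)(buf_addr - m->base);
}
```

With the example values (chunk at 0x700100, length 0x20000, 4 KB pages) this gives base 0x700000, align 0x100 and size 0x20100, and the first byte of the chunk sits at offset 0x100 from the base.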

> 
> > If I'm correct, the code should probably do this for alignment instead:
> >
> > aligned_addr = RTE_ALIGN_CEIL(memhdr_addr, getpagesize());
> > *align = aligned_addr - memhdr_addr;
> > umem_size = (uint64_t)len - align;
> >
> >
> > Disclaimer: I don't know much about the AF_XDP implementation, so maybe I
> > just don't understand what is going on.
> >
> >> --
> >> 2.34.1
> >
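
For comparison, a rough sketch of what the round-up alternative quoted above would compute for the same example values. The helper names align_ceil, ceil_skipped_bytes and ceil_umem_size are hypothetical and only illustrate the arithmetic; whether mbuf objects actually live in the skipped bytes depends on the mempool layout and is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of page_size (a power of two). */
static uintptr_t
align_ceil(uintptr_t addr, uintptr_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (addr + page_size - 1) & ~(page_size - 1);
}

/* Bytes at the start of the chunk that a round-up mapping leaves out. */
static uint64_t
ceil_skipped_bytes(uintptr_t chunk_addr, uintptr_t page_size)
{
	return (uint64_t)(align_ceil(chunk_addr, page_size) - chunk_addr);
}

/* Mapping length under the round-up scheme: chunk length minus skipped bytes. */
static uint64_t
ceil_umem_size(uintptr_t chunk_addr, size_t chunk_len, uintptr_t page_size)
{
	return (uint64_t)chunk_len - ceil_skipped_bytes(chunk_addr, page_size);
}
```

For a chunk at 0x700100 with length 0x20000 and 4 KB pages, the base moves up to 0x701000, the first 0xF00 bytes of the chunk fall outside the mapping, and the mapped length shrinks to 0x1F100. Rounding down avoids excluding any part of the chunk at the cost of mapping a small unused gap in front of it.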


