DPDK patches and discussions
 help / color / mirror / Atom feed
From: Ferruh Yigit <ferruh.yigit@intel.com>
To: Junxiao Shi <git@mail1.yoursunny.com>, dev@dpdk.org
Cc: sunnylandh@gmail.com, Olivier MATZ <olivier.matz@6wind.com>,
	Jerin Jacob <jerin.jacob@caviumnetworks.com>,
	Igor Ryzhov <iryzhov@nfware.com>,
	Vamsi Attunuru <vattunuru@marvell.com>
Subject: Re: [dpdk-dev] [RFC PATCH v2] kni: properly translate pa2va for cloned mbuf
Date: Tue, 4 Feb 2020 16:17:41 +0000	[thread overview]
Message-ID: <bbf5daab-da55-a6a8-93ae-510764aad120@intel.com> (raw)
In-Reply-To: <202001281642.00SGgPfT032152@lectura.cs.arizona.edu>

On 1/28/2020 4:16 PM, Junxiao Shi wrote:
> Previously, KNI kernel module uses the difference between m->buf_addr
> and m->buf_iova to calculate userspace virtual address from physical
> address. This works for direct mbufs, but does not work for indirect
> (cloned) mbufs that come from another mempool. Transmitting a cloned
> mbuf may cause segmentation fault in userspace.

Yes, indirect buffers from another mempool won't work, I think first thing to do
is detect these mbufs and return error to prevent the crash.

> 
> Now, userspace KNI library writes the virtual address of each mbuf
> in m->userdata field, and KNI kernel module uses this field to restore
> virtual address before putting mbuf into free_q. This approach works
> for both direct and indirect mbufs.

Overwriting the mbuf metadata is not ideal and in some cases it may be
unacceptable so I think we can't make it default behavior.


But why not have this as option, need to find a way to pass this option to kni
library.

And kni library may add a flag to mbuf to say the 'userdata' is used to as 'va'
before sending packet to kernel, this can make kernel code unified which can use
'pa2va()' when there is no flag and 'kva->va' when there is.
Adding this flag to mbuf.h prevents the any possible conflict in the future if
@Oliver accepts this.

And in mbuf, not sure if there is flag for mbuf that marks the 'userdata' have
valid value or not. But detecting it in userspace and not overwriting the value
but returning error in that case can be better option.

What do you think?

> 
> NOTE TO REVIEWER - DO NOT MERGE
> The idea of this change is at https://bugs.dpdk.org/show_bug.cgi?id=183#c4
> Test case is at https://bugs.dpdk.org/show_bug.cgi?id=183#c5
> I only modified kni_net_rx_normal function.
> If this approach is acceptable, I will modify kni_net_rx_lo_fifo,
> kni_net_rx_lo_fifo_skb, and kni_fifo_trans_pa2va(rx_q) as well.
> 
> Bugzilla ID: 183
> 
> Signed-off-by: Junxiao Shi <git@mail1.yoursunny.com>
> ---
>  kernel/linux/kni/kni_net.c                        | 4 ++--
>  lib/librte_eal/linux/eal/include/rte_kni_common.h | 3 ++-
>  lib/librte_kni/rte_kni.c                          | 8 ++++++--
>  3 files changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
> index 97fe85b..d783545 100644
> --- a/kernel/linux/kni/kni_net.c
> +++ b/kernel/linux/kni/kni_net.c
> @@ -377,7 +377,7 @@ kni_net_rx_normal(struct kni_dev *kni)
>  		kva = get_kva(kni, kni->pa[i]);
>  		len = kva->pkt_len;
>  		data_kva = get_data_kva(kni, kva);
> -		kni->va[i] = pa2va(kni->pa[i], kva);
> +		kni->va[i] = kva->va;
>  
>  		skb = netdev_alloc_skb(dev, len);
>  		if (!skb) {
> @@ -403,7 +403,7 @@ kni_net_rx_normal(struct kni_dev *kni)
>  				kva = pa2kva(kva->next);
>  				data_kva = kva2data_kva(kva);
>  				/* Convert physical address to virtual address */
> -				prev_kva->next = pa2va(prev_kva->next, kva);
> +				prev_kva->next = kva->va;
>  			}
>  		}
>  
> diff --git a/lib/librte_eal/linux/eal/include/rte_kni_common.h b/lib/librte_eal/linux/eal/include/rte_kni_common.h
> index 7313ef5..c694a1d 100644
> --- a/lib/librte_eal/linux/eal/include/rte_kni_common.h
> +++ b/lib/librte_eal/linux/eal/include/rte_kni_common.h
> @@ -86,7 +86,8 @@ struct rte_kni_mbuf {
>  	uint16_t data_len;      /**< Amount of data in segment buffer. */
>  
>  	/* fields on second cache line */
> -	char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_MIN_SIZE)));
> +	void *va __attribute__((__aligned__(RTE_CACHE_LINE_MIN_SIZE)));
> +	         /**< Virtual address of this mbuf in userspace (overwrites userdata). */
>  	void *pool;
>  	void *next;             /**< Physical address of next mbuf in kernel. */
>  };
> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
> index e388751..463485f 100644
> --- a/lib/librte_kni/rte_kni.c
> +++ b/lib/librte_kni/rte_kni.c
> @@ -359,13 +359,15 @@ va2pa(struct rte_mbuf *m)
>  static void *
>  va2pa_all(struct rte_mbuf *mbuf)
>  {
> -	void *phy_mbuf = va2pa(mbuf);
> +	void *phy_mbuf = (void*)rte_mempool_virt2iova(mbuf);
>  	struct rte_mbuf *next = mbuf->next;
>  	while (next) {
> -		mbuf->next = va2pa(next);
> +		mbuf->userdata = mbuf;
> +		mbuf->next = (void*)rte_mempool_virt2iova(next);
>  		mbuf = next;
>  		next = mbuf->next;
>  	}
> +	mbuf->userdata = mbuf;
>  	return phy_mbuf;
>  }
>  
> @@ -652,6 +654,8 @@ kni_allocate_mbufs(struct rte_kni *kni)
>  			 offsetof(struct rte_kni_mbuf, buf_addr));
>  	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, next) !=
>  			 offsetof(struct rte_kni_mbuf, next));
> +	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, userdata) !=
> +			 offsetof(struct rte_kni_mbuf, va));
>  	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_off) !=
>  			 offsetof(struct rte_kni_mbuf, data_off));
>  	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_len) !=
> 


  reply	other threads:[~2020-02-04 16:17 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-10 18:58 [dpdk-dev] [RFC PATCH] " Junxiao Shi
2020-01-28 16:16 ` [dpdk-dev] [RFC PATCH v2] " Junxiao Shi
2020-02-04 16:17   ` Ferruh Yigit [this message]
2020-02-04 16:49     ` yoursunny
2020-02-04 17:58       ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bbf5daab-da55-a6a8-93ae-510764aad120@intel.com \
    --to=ferruh.yigit@intel.com \
    --cc=dev@dpdk.org \
    --cc=git@mail1.yoursunny.com \
    --cc=iryzhov@nfware.com \
    --cc=jerin.jacob@caviumnetworks.com \
    --cc=olivier.matz@6wind.com \
    --cc=sunnylandh@gmail.com \
    --cc=vattunuru@marvell.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).