From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id B5DFBA0535; Tue, 4 Feb 2020 17:17:48 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 058FF1C214; Tue, 4 Feb 2020 17:17:48 +0100 (CET) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 2C6721C20E for ; Tue, 4 Feb 2020 17:17:45 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 04 Feb 2020 08:17:45 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,402,1574150400"; d="scan'208";a="224343483" Received: from fyigit-mobl.ger.corp.intel.com (HELO [10.252.22.38]) ([10.252.22.38]) by orsmga008.jf.intel.com with ESMTP; 04 Feb 2020 08:17:42 -0800 To: Junxiao Shi , dev@dpdk.org Cc: sunnylandh@gmail.com, Olivier MATZ , Jerin Jacob , Igor Ryzhov , Vamsi Attunuru References: <201909101917.x8AJH281027807@lectura.cs.arizona.edu> <202001281642.00SGgPfT032152@lectura.cs.arizona.edu> From: Ferruh Yigit Autocrypt: addr=ferruh.yigit@intel.com; prefer-encrypt=mutual; keydata= mQINBFXZCFABEADCujshBOAaqPZpwShdkzkyGpJ15lmxiSr3jVMqOtQS/sB3FYLT0/d3+bvy qbL9YnlbPyRvZfnP3pXiKwkRoR1RJwEo2BOf6hxdzTmLRtGtwWzI9MwrUPj6n/ldiD58VAGQ +iR1I/z9UBUN/ZMksElA2D7Jgg7vZ78iKwNnd+vLBD6I61kVrZ45Vjo3r+pPOByUBXOUlxp9 GWEKKIrJ4eogqkVNSixN16VYK7xR+5OUkBYUO+sE6etSxCr7BahMPKxH+XPlZZjKrxciaWQb +dElz3Ab4Opl+ZT/bK2huX+W+NJBEBVzjTkhjSTjcyRdxvS1gwWRuXqAml/sh+KQjPV1PPHF YK5LcqLkle+OKTCa82OvUb7cr+ALxATIZXQkgmn+zFT8UzSS3aiBBohg3BtbTIWy51jNlYdy ezUZ4UxKSsFuUTPt+JjHQBvF7WKbmNGS3fCid5Iag4tWOfZoqiCNzxApkVugltxoc6rG2TyX CmI2rP0mQ0GOsGXA3+3c1MCdQFzdIn/5tLBZyKy4F54UFo35eOX8/g7OaE+xrgY/4bZjpxC1 1pd66AAtKb3aNXpHvIfkVV6NYloo52H+FUE5ZDPNCGD0/btFGPWmWRmkPybzColTy7fmPaGz cBcEEqHK4T0aY4UJmE7Ylvg255Kz7s6wGZe6IR3N0cKNv++O7QARAQABtCVGZXJydWggWWln aXQgPGZlcnJ1aC55aWdpdEBpbnRlbC5jb20+iQJUBBMBCgA+AhsDAh4BAheABQsJCAcDBRUK CQgLBRYCAwEAFiEE0jZTh0IuwoTjmYHH+TPrQ98TYR8FAl1meboFCQlupOoACgkQ+TPrQ98T YR9ACBAAv2tomhyxY0Tp9Up7mNGLfEdBu/7joB/vIdqMRv63ojkwr9orQq5V16V/25+JEAD0 60cKodBDM6HdUvqLHatS8fooWRueSXHKYwJ3vxyB2tWDyZrLzLI1jxEvunGodoIzUOtum0Ce gPynnfQCelXBja0BwLXJMplM6TY1wXX22ap0ZViC0m714U5U4LQpzjabtFtjT8qOUR6L7hfy YQ72PBuktGb00UR/N5UrR6GqB0x4W41aZBHXfUQnvWIMmmCrRUJX36hOTYBzh+x86ULgg7H2 1499tA4o6rvE13FiGccplBNWCAIroAe/G11rdoN5NBgYVXu++38gTa/MBmIt6zRi6ch15oLA Ln2vHOdqhrgDuxjhMpG2bpNE36DG/V9WWyWdIRlz3NYPCDM/S3anbHlhjStXHOz1uHOnerXM 1jEjcsvmj1vSyYoQMyRcRJmBZLrekvgZeh7nJzbPHxtth8M7AoqiZ/o/BpYU+0xZ+J5/szWZ aYxxmIRu5ejFf+Wn9s5eXNHmyqxBidpCWvcbKYDBnkw2+Y9E5YTpL0mS0dCCOlrO7gca27ux ybtbj84aaW1g0CfIlUnOtHgMCmz6zPXThb+A8H8j3O6qmPoVqT3qnq3Uhy6GOoH8Fdu2Vchh TWiF5yo+pvUagQP6LpslffufSnu+RKAagkj7/RSuZV25Ag0EV9ZMvgEQAKc0Db17xNqtSwEv mfp4tkddwW9XA0tWWKtY4KUdd/jijYqc3fDD54ESYpV8QWj0xK4YM0dLxnDU2IYxjEshSB1T qAatVWz9WtBYvzalsyTqMKP3w34FciuL7orXP4AibPtrHuIXWQOBECcVZTTOdZYGAzaYzxiA ONzF9eTiwIqe9/oaOjTwTLnOarHt16QApTYQSnxDUQljeNvKYt1lZE/gAUUxNLWsYyTT+22/ vU0GDUahsJxs1+f1yEr+OGrFiEAmqrzpF0lCS3f/3HVTU6rS9cK3glVUeaTF4+1SK5ZNO35p iVQCwphmxa+dwTG/DvvHYCtgOZorTJ+OHfvCnSVjsM4kcXGjJPy3JZmUtyL9UxEbYlrffGPQ I3gLXIGD5AN5XdAXFCjjaID/KR1c9RHd7Oaw0Pdcq9UtMLgM1vdX8RlDuMGPrj5sQrRVbgYH fVU/TQCk1C9KhzOwg4Ap2T3tE1umY/DqrXQgsgH71PXFucVjOyHMYXXugLT8YQ0gcBPHy9mZ qw5mgOI5lCl6d4uCcUT0l/OEtPG/rA1lxz8ctdFBVOQOxCvwRG2QCgcJ/UTn5vlivul+cThi 6ERPvjqjblLncQtRg8izj2qgmwQkvfj+h7Ex88bI8iWtu5+I3K3LmNz/UxHBSWEmUnkg4fJl Rr7oItHsZ0ia6wWQ8lQnABEBAAGJAjwEGAEKACYCGwwWIQTSNlOHQi7ChOOZgcf5M+tD3xNh HwUCXWZ5wAUJB3FgggAKCRD5M+tD3xNhH2O+D/9OEz62YuJQLuIuOfL67eFTIB5/1+0j8Tsu o2psca1PUQ61SZJZOMl6VwNxpdvEaolVdrpnSxUF31kPEvR0Igy8HysQ11pj8AcgH0a9FrvU /8k2Roccd2ZIdpNLkirGFZR7LtRw41Kt1Jg+lafI0efkiHKMT/6D/P1EUp1RxOBNtWGV2hrd 0Yg9ds+VMphHHU69fDH02SwgpvXwG8Qm14Zi5WQ66R4CtTkHuYtA63sS17vMl8fDuTCtvfPF HzvdJLIhDYN3Mm1oMjKLlq4PUdYh68Fiwm+boJoBUFGuregJFlO3hM7uHBDhSEnXQr5mqpPM 6R/7Q5BjAxrwVBisH0yQGjsWlnysRWNfExAE2sRePSl0or9q19ddkRYltl6X4FDUXy2DTXa9 a+Fw4e1EvmcF3PjmTYs9IE3Vc64CRQXkhujcN4ZZh5lvOpU8WgyDxFq7bavFnSS6kx7Tk29/ wNJBp+cf9qsQxLbqhW5kfORuZGecus0TLcmpZEFKKjTJBK9gELRBB/zoN3j41hlEl7uTUXTI JQFLhpsFlEdKLujyvT/aCwP3XWT+B2uZDKrMAElF6ltpTxI53JYi22WO7NH7MR16Fhi4R6vh FHNBOkiAhUpoXRZXaCR6+X4qwA8CwHGqHRBfYFSU/Ulq1ZLR+S3hNj2mbnSx0lBs1eEqe2vh cA== Message-ID: Date: Tue, 4 Feb 2020 16:17:41 +0000 MIME-Version: 1.0 In-Reply-To: <202001281642.00SGgPfT032152@lectura.cs.arizona.edu> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [RFC PATCH v2] kni: properly translate pa2va for cloned mbuf X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 1/28/2020 4:16 PM, Junxiao Shi wrote: > Previously, KNI kernel module uses the difference between m->buf_addr > and m->buf_iova to calculate userspace virtual address from physical > address. This works for direct mbufs, but does not work for indirect > (cloned) mbufs that come from another mempool. Transmitting a cloned > mbuf may cause segmentation fault in userspace. Yes, indirect buffers from another mempool won't work, I think first thing to do is detect these mbufs and return error to prevent the crash. > > Now, userspace KNI library writes the virtual address of each mbuf > in m->userdata field, and KNI kernel module uses this field to restore > virtual address before putting mbuf into free_q. This approach works > for both direct and indirect mbufs. Overwriting the mbuf metadata is not ideal and in some cases it may be unacceptable so I think we can't make it default behavior. But why not have this as option, need to find a way to pass this option to kni library. And kni library may add a flag to mbuf to say the 'userdata' is used to as 'va' before sending packet to kernel, this can make kernel code unified which can use 'pa2va()' when there is no flag and 'kva->va' when there is. Adding this flag to mbuf.h prevents the any possible conflict in the future if @Oliver accepts this. And in mbuf, not sure if there is flag for mbuf that marks the 'userdata' have valid value or not. But detecting it in userspace and not overwriting the value but returning error in that case can be better option. What do you think? > > NOTE TO REVIEWER - DO NOT MERGE > The idea of this change is at https://bugs.dpdk.org/show_bug.cgi?id=183#c4 > Test case is at https://bugs.dpdk.org/show_bug.cgi?id=183#c5 > I only modified kni_net_rx_normal function. > If this approach is acceptable, I will modify kni_net_rx_lo_fifo, > kni_net_rx_lo_fifo_skb, and kni_fifo_trans_pa2va(rx_q) as well. > > Bugzilla ID: 183 > > Signed-off-by: Junxiao Shi > --- > kernel/linux/kni/kni_net.c | 4 ++-- > lib/librte_eal/linux/eal/include/rte_kni_common.h | 3 ++- > lib/librte_kni/rte_kni.c | 8 ++++++-- > 3 files changed, 10 insertions(+), 5 deletions(-) > > diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c > index 97fe85b..d783545 100644 > --- a/kernel/linux/kni/kni_net.c > +++ b/kernel/linux/kni/kni_net.c > @@ -377,7 +377,7 @@ kni_net_rx_normal(struct kni_dev *kni) > kva = get_kva(kni, kni->pa[i]); > len = kva->pkt_len; > data_kva = get_data_kva(kni, kva); > - kni->va[i] = pa2va(kni->pa[i], kva); > + kni->va[i] = kva->va; > > skb = netdev_alloc_skb(dev, len); > if (!skb) { > @@ -403,7 +403,7 @@ kni_net_rx_normal(struct kni_dev *kni) > kva = pa2kva(kva->next); > data_kva = kva2data_kva(kva); > /* Convert physical address to virtual address */ > - prev_kva->next = pa2va(prev_kva->next, kva); > + prev_kva->next = kva->va; > } > } > > diff --git a/lib/librte_eal/linux/eal/include/rte_kni_common.h b/lib/librte_eal/linux/eal/include/rte_kni_common.h > index 7313ef5..c694a1d 100644 > --- a/lib/librte_eal/linux/eal/include/rte_kni_common.h > +++ b/lib/librte_eal/linux/eal/include/rte_kni_common.h > @@ -86,7 +86,8 @@ struct rte_kni_mbuf { > uint16_t data_len; /**< Amount of data in segment buffer. */ > > /* fields on second cache line */ > - char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_MIN_SIZE))); > + void *va __attribute__((__aligned__(RTE_CACHE_LINE_MIN_SIZE))); > + /**< Virtual address of this mbuf in userspace (overwrites userdata). */ > void *pool; > void *next; /**< Physical address of next mbuf in kernel. */ > }; > diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c > index e388751..463485f 100644 > --- a/lib/librte_kni/rte_kni.c > +++ b/lib/librte_kni/rte_kni.c > @@ -359,13 +359,15 @@ va2pa(struct rte_mbuf *m) > static void * > va2pa_all(struct rte_mbuf *mbuf) > { > - void *phy_mbuf = va2pa(mbuf); > + void *phy_mbuf = (void*)rte_mempool_virt2iova(mbuf); > struct rte_mbuf *next = mbuf->next; > while (next) { > - mbuf->next = va2pa(next); > + mbuf->userdata = mbuf; > + mbuf->next = (void*)rte_mempool_virt2iova(next); > mbuf = next; > next = mbuf->next; > } > + mbuf->userdata = mbuf; > return phy_mbuf; > } > > @@ -652,6 +654,8 @@ kni_allocate_mbufs(struct rte_kni *kni) > offsetof(struct rte_kni_mbuf, buf_addr)); > RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, next) != > offsetof(struct rte_kni_mbuf, next)); > + RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, userdata) != > + offsetof(struct rte_kni_mbuf, va)); > RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_off) != > offsetof(struct rte_kni_mbuf, data_off)); > RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_len) != >