From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id C7215A0537; Tue, 4 Feb 2020 18:58:09 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 4AF4F1C226; Tue, 4 Feb 2020 18:58:09 +0100 (CET) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by dpdk.org (Postfix) with ESMTP id B219B1C20C for ; Tue, 4 Feb 2020 18:58:07 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 04 Feb 2020 09:58:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,402,1574150400"; d="scan'208";a="224374984" Received: from fyigit-mobl.ger.corp.intel.com (HELO [10.252.22.38]) ([10.252.22.38]) by orsmga008.jf.intel.com with ESMTP; 04 Feb 2020 09:58:04 -0800 To: yoursunny Cc: Junxiao Shi , dev@dpdk.org, Olivier MATZ , Jerin Jacob , Igor Ryzhov , Vamsi Attunuru References: <201909101917.x8AJH281027807@lectura.cs.arizona.edu> <202001281642.00SGgPfT032152@lectura.cs.arizona.edu> From: Ferruh Yigit Autocrypt: addr=ferruh.yigit@intel.com; prefer-encrypt=mutual; keydata= mQINBFXZCFABEADCujshBOAaqPZpwShdkzkyGpJ15lmxiSr3jVMqOtQS/sB3FYLT0/d3+bvy qbL9YnlbPyRvZfnP3pXiKwkRoR1RJwEo2BOf6hxdzTmLRtGtwWzI9MwrUPj6n/ldiD58VAGQ +iR1I/z9UBUN/ZMksElA2D7Jgg7vZ78iKwNnd+vLBD6I61kVrZ45Vjo3r+pPOByUBXOUlxp9 GWEKKIrJ4eogqkVNSixN16VYK7xR+5OUkBYUO+sE6etSxCr7BahMPKxH+XPlZZjKrxciaWQb +dElz3Ab4Opl+ZT/bK2huX+W+NJBEBVzjTkhjSTjcyRdxvS1gwWRuXqAml/sh+KQjPV1PPHF YK5LcqLkle+OKTCa82OvUb7cr+ALxATIZXQkgmn+zFT8UzSS3aiBBohg3BtbTIWy51jNlYdy ezUZ4UxKSsFuUTPt+JjHQBvF7WKbmNGS3fCid5Iag4tWOfZoqiCNzxApkVugltxoc6rG2TyX CmI2rP0mQ0GOsGXA3+3c1MCdQFzdIn/5tLBZyKy4F54UFo35eOX8/g7OaE+xrgY/4bZjpxC1 1pd66AAtKb3aNXpHvIfkVV6NYloo52H+FUE5ZDPNCGD0/btFGPWmWRmkPybzColTy7fmPaGz cBcEEqHK4T0aY4UJmE7Ylvg255Kz7s6wGZe6IR3N0cKNv++O7QARAQABtCVGZXJydWggWWln aXQgPGZlcnJ1aC55aWdpdEBpbnRlbC5jb20+iQJUBBMBCgA+AhsDAh4BAheABQsJCAcDBRUK CQgLBRYCAwEAFiEE0jZTh0IuwoTjmYHH+TPrQ98TYR8FAl1meboFCQlupOoACgkQ+TPrQ98T YR9ACBAAv2tomhyxY0Tp9Up7mNGLfEdBu/7joB/vIdqMRv63ojkwr9orQq5V16V/25+JEAD0 60cKodBDM6HdUvqLHatS8fooWRueSXHKYwJ3vxyB2tWDyZrLzLI1jxEvunGodoIzUOtum0Ce gPynnfQCelXBja0BwLXJMplM6TY1wXX22ap0ZViC0m714U5U4LQpzjabtFtjT8qOUR6L7hfy YQ72PBuktGb00UR/N5UrR6GqB0x4W41aZBHXfUQnvWIMmmCrRUJX36hOTYBzh+x86ULgg7H2 1499tA4o6rvE13FiGccplBNWCAIroAe/G11rdoN5NBgYVXu++38gTa/MBmIt6zRi6ch15oLA Ln2vHOdqhrgDuxjhMpG2bpNE36DG/V9WWyWdIRlz3NYPCDM/S3anbHlhjStXHOz1uHOnerXM 1jEjcsvmj1vSyYoQMyRcRJmBZLrekvgZeh7nJzbPHxtth8M7AoqiZ/o/BpYU+0xZ+J5/szWZ aYxxmIRu5ejFf+Wn9s5eXNHmyqxBidpCWvcbKYDBnkw2+Y9E5YTpL0mS0dCCOlrO7gca27ux ybtbj84aaW1g0CfIlUnOtHgMCmz6zPXThb+A8H8j3O6qmPoVqT3qnq3Uhy6GOoH8Fdu2Vchh TWiF5yo+pvUagQP6LpslffufSnu+RKAagkj7/RSuZV25Ag0EV9ZMvgEQAKc0Db17xNqtSwEv mfp4tkddwW9XA0tWWKtY4KUdd/jijYqc3fDD54ESYpV8QWj0xK4YM0dLxnDU2IYxjEshSB1T qAatVWz9WtBYvzalsyTqMKP3w34FciuL7orXP4AibPtrHuIXWQOBECcVZTTOdZYGAzaYzxiA ONzF9eTiwIqe9/oaOjTwTLnOarHt16QApTYQSnxDUQljeNvKYt1lZE/gAUUxNLWsYyTT+22/ vU0GDUahsJxs1+f1yEr+OGrFiEAmqrzpF0lCS3f/3HVTU6rS9cK3glVUeaTF4+1SK5ZNO35p iVQCwphmxa+dwTG/DvvHYCtgOZorTJ+OHfvCnSVjsM4kcXGjJPy3JZmUtyL9UxEbYlrffGPQ I3gLXIGD5AN5XdAXFCjjaID/KR1c9RHd7Oaw0Pdcq9UtMLgM1vdX8RlDuMGPrj5sQrRVbgYH fVU/TQCk1C9KhzOwg4Ap2T3tE1umY/DqrXQgsgH71PXFucVjOyHMYXXugLT8YQ0gcBPHy9mZ qw5mgOI5lCl6d4uCcUT0l/OEtPG/rA1lxz8ctdFBVOQOxCvwRG2QCgcJ/UTn5vlivul+cThi 6ERPvjqjblLncQtRg8izj2qgmwQkvfj+h7Ex88bI8iWtu5+I3K3LmNz/UxHBSWEmUnkg4fJl Rr7oItHsZ0ia6wWQ8lQnABEBAAGJAjwEGAEKACYCGwwWIQTSNlOHQi7ChOOZgcf5M+tD3xNh HwUCXWZ5wAUJB3FgggAKCRD5M+tD3xNhH2O+D/9OEz62YuJQLuIuOfL67eFTIB5/1+0j8Tsu o2psca1PUQ61SZJZOMl6VwNxpdvEaolVdrpnSxUF31kPEvR0Igy8HysQ11pj8AcgH0a9FrvU /8k2Roccd2ZIdpNLkirGFZR7LtRw41Kt1Jg+lafI0efkiHKMT/6D/P1EUp1RxOBNtWGV2hrd 0Yg9ds+VMphHHU69fDH02SwgpvXwG8Qm14Zi5WQ66R4CtTkHuYtA63sS17vMl8fDuTCtvfPF HzvdJLIhDYN3Mm1oMjKLlq4PUdYh68Fiwm+boJoBUFGuregJFlO3hM7uHBDhSEnXQr5mqpPM 6R/7Q5BjAxrwVBisH0yQGjsWlnysRWNfExAE2sRePSl0or9q19ddkRYltl6X4FDUXy2DTXa9 a+Fw4e1EvmcF3PjmTYs9IE3Vc64CRQXkhujcN4ZZh5lvOpU8WgyDxFq7bavFnSS6kx7Tk29/ wNJBp+cf9qsQxLbqhW5kfORuZGecus0TLcmpZEFKKjTJBK9gELRBB/zoN3j41hlEl7uTUXTI JQFLhpsFlEdKLujyvT/aCwP3XWT+B2uZDKrMAElF6ltpTxI53JYi22WO7NH7MR16Fhi4R6vh FHNBOkiAhUpoXRZXaCR6+X4qwA8CwHGqHRBfYFSU/Ulq1ZLR+S3hNj2mbnSx0lBs1eEqe2vh cA== Message-ID: <40ed7e79-cd8a-bf8b-da4b-acb797e574c0@intel.com> Date: Tue, 4 Feb 2020 17:58:03 +0000 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [RFC PATCH v2] kni: properly translate pa2va for cloned mbuf X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 2/4/2020 4:49 PM, yoursunny wrote: > Hi Ferruh > > Thanks for your review. > > On Tue, Feb 4, 2020 at 11:17 AM Ferruh Yigit wrote: >> >> On 1/28/2020 4:16 PM, Junxiao Shi wrote: >>> Previously, KNI kernel module uses the difference between m->buf_addr >>> and m->buf_iova to calculate userspace virtual address from physical >>> address. This works for direct mbufs, but does not work for indirect >>> (cloned) mbufs that come from another mempool. Transmitting a cloned >>> mbuf may cause segmentation fault in userspace. >> >> Yes, indirect buffers from another mempool won't work, I think first thing to do >> is detect these mbufs and return error to prevent the crash. >> >>> >>> Now, userspace KNI library writes the virtual address of each mbuf >>> in m->userdata field, and KNI kernel module uses this field to restore >>> virtual address before putting mbuf into free_q. This approach works >>> for both direct and indirect mbufs. >> >> Overwriting the mbuf metadata is not ideal and in some cases it may be >> unacceptable so I think we can't make it default behavior. >> > > I believe that after an mbuf has been passed to ethdev for > transmission, it is owned by ethdev, and the user application should > not use it anymore. > This is because, the mbuf could be free'd by ethdev after > transmission, so that it is unsafe for the user application to > dereference its pointer. > Notice that in case the user application has created indirect mbufs > pointing to the same buffer, only the mbuf passed to ethdev would have > its m->userdata field overwritten; other mbufs still held by the > application are unaffected. > This is why I decided to overwrite the m->userdata field. > > Can you explain, in what case overwriting mbuf metadata would be unacceptable? If the mbuf sent to kernel space returns back to DPDK application and if next hope was expecting that metadata field. 'kni_net_rx_lo_fifo()' can cause that but that is mainly for debug, not sure if that counts really. I am not quite sure if any packet sent to kernel can return back to circulation, but trying to be more defensive here, and protect the mbuf metada as much as possible and not overwrite it via KNI data. That is why suggesting to enable it via an option instead of making it default. And 'userdata' seems not really used much as of now, but what if it has been changed to dynamic field, which was mentioned in the past, and that field end up being something required by kernel side? Long shot? Probably. In that case we can make that option default, but I think better than risking breaking some possible usage. > >> >> But why not have this as option, need to find a way to pass this option to kni >> library. >> >> And kni library may add a flag to mbuf to say the 'userdata' is used to as 'va' >> before sending packet to kernel, this can make kernel code unified which can use >> 'pa2va()' when there is no flag and 'kva->va' when there is. >> Adding this flag to mbuf.h prevents the any possible conflict in the future if >> @Oliver accepts this. >> >> And in mbuf, not sure if there is flag for mbuf that marks the 'userdata' have >> valid value or not. But detecting it in userspace and not overwriting the value >> but returning error in that case can be better option. >> >> What do you think? >> >>> >>> NOTE TO REVIEWER - DO NOT MERGE >>> The idea of this change is at https://bugs.dpdk.org/show_bug.cgi?id=183#c4 >>> Test case is at https://bugs.dpdk.org/show_bug.cgi?id=183#c5 >>> I only modified kni_net_rx_normal function. >>> If this approach is acceptable, I will modify kni_net_rx_lo_fifo, >>> kni_net_rx_lo_fifo_skb, and kni_fifo_trans_pa2va(rx_q) as well. >>> >>> Bugzilla ID: 183 >>> >>> Signed-off-by: Junxiao Shi >>> --- >>> kernel/linux/kni/kni_net.c | 4 ++-- >>> lib/librte_eal/linux/eal/include/rte_kni_common.h | 3 ++- >>> lib/librte_kni/rte_kni.c | 8 ++++++-- >>> 3 files changed, 10 insertions(+), 5 deletions(-) >>> >>> diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c >>> index 97fe85b..d783545 100644 >>> --- a/kernel/linux/kni/kni_net.c >>> +++ b/kernel/linux/kni/kni_net.c >>> @@ -377,7 +377,7 @@ kni_net_rx_normal(struct kni_dev *kni) >>> kva = get_kva(kni, kni->pa[i]); >>> len = kva->pkt_len; >>> data_kva = get_data_kva(kni, kva); >>> - kni->va[i] = pa2va(kni->pa[i], kva); >>> + kni->va[i] = kva->va; >>> >>> skb = netdev_alloc_skb(dev, len); >>> if (!skb) { >>> @@ -403,7 +403,7 @@ kni_net_rx_normal(struct kni_dev *kni) >>> kva = pa2kva(kva->next); >>> data_kva = kva2data_kva(kva); >>> /* Convert physical address to virtual address */ >>> - prev_kva->next = pa2va(prev_kva->next, kva); >>> + prev_kva->next = kva->va; >>> } >>> } >>> >>> diff --git a/lib/librte_eal/linux/eal/include/rte_kni_common.h b/lib/librte_eal/linux/eal/include/rte_kni_common.h >>> index 7313ef5..c694a1d 100644 >>> --- a/lib/librte_eal/linux/eal/include/rte_kni_common.h >>> +++ b/lib/librte_eal/linux/eal/include/rte_kni_common.h >>> @@ -86,7 +86,8 @@ struct rte_kni_mbuf { >>> uint16_t data_len; /**< Amount of data in segment buffer. */ >>> >>> /* fields on second cache line */ >>> - char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_MIN_SIZE))); >>> + void *va __attribute__((__aligned__(RTE_CACHE_LINE_MIN_SIZE))); >>> + /**< Virtual address of this mbuf in userspace (overwrites userdata). */ >>> void *pool; >>> void *next; /**< Physical address of next mbuf in kernel. */ >>> }; >>> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c >>> index e388751..463485f 100644 >>> --- a/lib/librte_kni/rte_kni.c >>> +++ b/lib/librte_kni/rte_kni.c >>> @@ -359,13 +359,15 @@ va2pa(struct rte_mbuf *m) >>> static void * >>> va2pa_all(struct rte_mbuf *mbuf) >>> { >>> - void *phy_mbuf = va2pa(mbuf); >>> + void *phy_mbuf = (void*)rte_mempool_virt2iova(mbuf); >>> struct rte_mbuf *next = mbuf->next; >>> while (next) { >>> - mbuf->next = va2pa(next); >>> + mbuf->userdata = mbuf; >>> + mbuf->next = (void*)rte_mempool_virt2iova(next); >>> mbuf = next; >>> next = mbuf->next; >>> } >>> + mbuf->userdata = mbuf; >>> return phy_mbuf; >>> } >>> >>> @@ -652,6 +654,8 @@ kni_allocate_mbufs(struct rte_kni *kni) >>> offsetof(struct rte_kni_mbuf, buf_addr)); >>> RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, next) != >>> offsetof(struct rte_kni_mbuf, next)); >>> + RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, userdata) != >>> + offsetof(struct rte_kni_mbuf, va)); >>> RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_off) != >>> offsetof(struct rte_kni_mbuf, data_off)); >>> RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_len) != >>> >>