From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 2E269A0535; Tue, 4 Feb 2020 17:50:03 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 01CE11C21A; Tue, 4 Feb 2020 17:50:03 +0100 (CET) Received: from mail-oi1-f195.google.com (mail-oi1-f195.google.com [209.85.167.195]) by dpdk.org (Postfix) with ESMTP id 1AA671C1FF for ; Tue, 4 Feb 2020 17:50:01 +0100 (CET) Received: by mail-oi1-f195.google.com with SMTP id z2so19096076oih.6 for ; Tue, 04 Feb 2020 08:50:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=9a4gQCm+SsQDaboo5Fv9Pmb3BOMUlcv0FfGX3AgJfME=; b=QqhW1NfkjaQsHQ2aEDunVNbwY2GdalB1hgdWGMKAO8/3fRCU070e+lWDGufbA7QyrX BD57vS3dCMuBwQK6YTzAPkpeB3o/fIZspVYDiVXbjcA2BziK/pGfW/DANz3VodBv5Jv7 YJz80JppN8rWj2TqvdG+x6iwcMtfJ9VO4ACOZoTxVGLkLlNzIJki24qgr1jqhbCBoNHv vJ+MIDsckwUA7D2SLkmx4798ZAbJgbHS1AecdepVAmKuzDHtwhT/iAWboeVQtVMYMk+c yPpxU2pKufFTUmaHRhmag4BgoHpmeuggo16LIRinK0hS+tWsQA5jhthwAL7Uov3C6ogW UEdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=9a4gQCm+SsQDaboo5Fv9Pmb3BOMUlcv0FfGX3AgJfME=; b=Yq/Ux7hUEI1THC2q8UfDDR8mZGBuMKevsbCaOIBKj73S0ga6AGObXC+XuSE7n0BAnd ZqA+wRXe2j/yDu3x251P9Ye/xmJNFDgj0ixrXOod+cczC9C0x/KDnwyWvSIfIIVCxHKy IZYK/RfhnmTcodg5rlRfLud6uMf9Wc+A9MSivBJR9KOtyDYjR09adSo/ZEDGbHcYhcSM UYYKAWAdxBbHKBCtBRpYkUki0kuFJMW0bgNtcqFELalKhnuDA3KHYIcXa1wKgBk9mwA3 ZxA2ZnOFxJmlTKp4Q8GVnbqeW15J11OFRtZWMh/VPPfdwB+8+/weiW/aFUa+rDN8qXD6 Qikw== X-Gm-Message-State: APjAAAVTLyViJo/VuFi4ynq38sT5l0ziarMKzbi29zNwqb9xd9V16112 2jSU1pNHAXwkq0JWPDoMHnOwQL2+7AzUGzI+fo8= X-Google-Smtp-Source: APXvYqxl7SQMlqbkxd94MimkwpZyRHzXhp/K0foMdKb/JFfQQrp0gAwXUHWV1Sk0LUIDGc6TjtVTYEMXYdpBzhSdGjc= X-Received: by 2002:a54:4086:: with SMTP id i6mr4096938oii.65.1580835000228; Tue, 04 Feb 2020 08:50:00 -0800 (PST) MIME-Version: 1.0 References: <201909101917.x8AJH281027807@lectura.cs.arizona.edu> <202001281642.00SGgPfT032152@lectura.cs.arizona.edu> In-Reply-To: From: yoursunny Date: Tue, 4 Feb 2020 11:49:49 -0500 Message-ID: To: Ferruh Yigit Cc: Junxiao Shi , dev@dpdk.org, Olivier MATZ , Jerin Jacob , Igor Ryzhov , Vamsi Attunuru Content-Type: text/plain; charset="UTF-8" Subject: Re: [dpdk-dev] [RFC PATCH v2] kni: properly translate pa2va for cloned mbuf X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Ferruh Thanks for your review. On Tue, Feb 4, 2020 at 11:17 AM Ferruh Yigit wrote: > > On 1/28/2020 4:16 PM, Junxiao Shi wrote: > > Previously, KNI kernel module uses the difference between m->buf_addr > > and m->buf_iova to calculate userspace virtual address from physical > > address. This works for direct mbufs, but does not work for indirect > > (cloned) mbufs that come from another mempool. Transmitting a cloned > > mbuf may cause segmentation fault in userspace. > > Yes, indirect buffers from another mempool won't work, I think first thing to do > is detect these mbufs and return error to prevent the crash. > > > > > Now, userspace KNI library writes the virtual address of each mbuf > > in m->userdata field, and KNI kernel module uses this field to restore > > virtual address before putting mbuf into free_q. This approach works > > for both direct and indirect mbufs. > > Overwriting the mbuf metadata is not ideal and in some cases it may be > unacceptable so I think we can't make it default behavior. > I believe that after an mbuf has been passed to ethdev for transmission, it is owned by ethdev, and the user application should not use it anymore. This is because, the mbuf could be free'd by ethdev after transmission, so that it is unsafe for the user application to dereference its pointer. Notice that in case the user application has created indirect mbufs pointing to the same buffer, only the mbuf passed to ethdev would have its m->userdata field overwritten; other mbufs still held by the application are unaffected. This is why I decided to overwrite the m->userdata field. Can you explain, in what case overwriting mbuf metadata would be unacceptable? > > But why not have this as option, need to find a way to pass this option to kni > library. > > And kni library may add a flag to mbuf to say the 'userdata' is used to as 'va' > before sending packet to kernel, this can make kernel code unified which can use > 'pa2va()' when there is no flag and 'kva->va' when there is. > Adding this flag to mbuf.h prevents the any possible conflict in the future if > @Oliver accepts this. > > And in mbuf, not sure if there is flag for mbuf that marks the 'userdata' have > valid value or not. But detecting it in userspace and not overwriting the value > but returning error in that case can be better option. > > What do you think? > > > > > NOTE TO REVIEWER - DO NOT MERGE > > The idea of this change is at https://bugs.dpdk.org/show_bug.cgi?id=183#c4 > > Test case is at https://bugs.dpdk.org/show_bug.cgi?id=183#c5 > > I only modified kni_net_rx_normal function. > > If this approach is acceptable, I will modify kni_net_rx_lo_fifo, > > kni_net_rx_lo_fifo_skb, and kni_fifo_trans_pa2va(rx_q) as well. > > > > Bugzilla ID: 183 > > > > Signed-off-by: Junxiao Shi > > --- > > kernel/linux/kni/kni_net.c | 4 ++-- > > lib/librte_eal/linux/eal/include/rte_kni_common.h | 3 ++- > > lib/librte_kni/rte_kni.c | 8 ++++++-- > > 3 files changed, 10 insertions(+), 5 deletions(-) > > > > diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c > > index 97fe85b..d783545 100644 > > --- a/kernel/linux/kni/kni_net.c > > +++ b/kernel/linux/kni/kni_net.c > > @@ -377,7 +377,7 @@ kni_net_rx_normal(struct kni_dev *kni) > > kva = get_kva(kni, kni->pa[i]); > > len = kva->pkt_len; > > data_kva = get_data_kva(kni, kva); > > - kni->va[i] = pa2va(kni->pa[i], kva); > > + kni->va[i] = kva->va; > > > > skb = netdev_alloc_skb(dev, len); > > if (!skb) { > > @@ -403,7 +403,7 @@ kni_net_rx_normal(struct kni_dev *kni) > > kva = pa2kva(kva->next); > > data_kva = kva2data_kva(kva); > > /* Convert physical address to virtual address */ > > - prev_kva->next = pa2va(prev_kva->next, kva); > > + prev_kva->next = kva->va; > > } > > } > > > > diff --git a/lib/librte_eal/linux/eal/include/rte_kni_common.h b/lib/librte_eal/linux/eal/include/rte_kni_common.h > > index 7313ef5..c694a1d 100644 > > --- a/lib/librte_eal/linux/eal/include/rte_kni_common.h > > +++ b/lib/librte_eal/linux/eal/include/rte_kni_common.h > > @@ -86,7 +86,8 @@ struct rte_kni_mbuf { > > uint16_t data_len; /**< Amount of data in segment buffer. */ > > > > /* fields on second cache line */ > > - char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_MIN_SIZE))); > > + void *va __attribute__((__aligned__(RTE_CACHE_LINE_MIN_SIZE))); > > + /**< Virtual address of this mbuf in userspace (overwrites userdata). */ > > void *pool; > > void *next; /**< Physical address of next mbuf in kernel. */ > > }; > > diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c > > index e388751..463485f 100644 > > --- a/lib/librte_kni/rte_kni.c > > +++ b/lib/librte_kni/rte_kni.c > > @@ -359,13 +359,15 @@ va2pa(struct rte_mbuf *m) > > static void * > > va2pa_all(struct rte_mbuf *mbuf) > > { > > - void *phy_mbuf = va2pa(mbuf); > > + void *phy_mbuf = (void*)rte_mempool_virt2iova(mbuf); > > struct rte_mbuf *next = mbuf->next; > > while (next) { > > - mbuf->next = va2pa(next); > > + mbuf->userdata = mbuf; > > + mbuf->next = (void*)rte_mempool_virt2iova(next); > > mbuf = next; > > next = mbuf->next; > > } > > + mbuf->userdata = mbuf; > > return phy_mbuf; > > } > > > > @@ -652,6 +654,8 @@ kni_allocate_mbufs(struct rte_kni *kni) > > offsetof(struct rte_kni_mbuf, buf_addr)); > > RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, next) != > > offsetof(struct rte_kni_mbuf, next)); > > + RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, userdata) != > > + offsetof(struct rte_kni_mbuf, va)); > > RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_off) != > > offsetof(struct rte_kni_mbuf, data_off)); > > RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_len) != > > >