From: <vattunuru@marvell.com>
To: <dev@dpdk.org>
CC: <thomas@monjalon.net>, <jerinj@marvell.com>, <kirankumark@marvell.com>,
 <olivier.matz@6wind.com>, <ferruh.yigit@intel.com>,
 <anatoly.burakov@intel.com>, <arybchenko@solarflare.com>,
 <stephen@networkplumber.org>, <david.marchand@redhat.com>,
 Vamsi Attunuru <vattunuru@marvell.com>
Date: Fri, 15 Nov 2019 16:48:06 +0530
Message-ID: <20191115111807.20935-2-vattunuru@marvell.com>
In-Reply-To: <20191115111807.20935-1-vattunuru@marvell.com>
References: <20191105110416.8955-1-vattunuru@marvell.com>
 <20191115111807.20935-1-vattunuru@marvell.com>
Subject: [dpdk-dev] [PATCH v13 1/2] kni: support IOVA mode in kernel module

From: Vamsi Attunuru <vattunuru@marvell.com>

Add support for the kernel module to work in IOVA = VA mode by providing
address translation routines that convert an IOVA (here, a user space VA)
to a kernel virtual address.

Compared with IOVA = PA mode, KNI netdev ports see no performance impact
with this approach, and this patch does not affect the performance of
IOVA = PA mode either.

This approach does not work with kernel versions older than 4.6.0.

Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
---
 kernel/linux/kni/compat.h                         | 14 +++++
 kernel/linux/kni/kni_dev.h                        | 42 +++++++++++++++
 kernel/linux/kni/kni_misc.c                       | 39 ++++++++++----
 kernel/linux/kni/kni_net.c                        | 62 ++++++++++++++++++-----
 lib/librte_eal/linux/eal/include/rte_kni_common.h |  1 +
 5 files changed, 136 insertions(+), 22 deletions(-)

diff --git a/kernel/linux/kni/compat.h b/kernel/linux/kni/compat.h
index 562d8bf..062b170 100644
--- a/kernel/linux/kni/compat.h
+++ b/kernel/linux/kni/compat.h
@@ -121,3 +121,17 @@
 #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 11, 0)
 #define HAVE_SIGNAL_FUNCTIONS_OWN_HEADER
 #endif
+
+#if KERNEL_VERSION(4, 6, 0) <= LINUX_VERSION_CODE
+
+#define HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+
+#if KERNEL_VERSION(4, 9, 0) > LINUX_VERSION_CODE
+#define GET_USER_PAGES_REMOTE_API_V1
+#elif KERNEL_VERSION(4, 9, 0) == LINUX_VERSION_CODE
+#define GET_USER_PAGES_REMOTE_API_V2
+#else
+#define GET_USER_PAGES_REMOTE_API_V3
+#endif
+
+#endif
diff --git a/kernel/linux/kni/kni_dev.h b/kernel/linux/kni/kni_dev.h
index c1ca678..fb641b6 100644
--- a/kernel/linux/kni/kni_dev.h
+++ b/kernel/linux/kni/kni_dev.h
@@ -41,6 +41,8 @@ struct kni_dev {
 	/* kni list */
 	struct list_head list;
 
+	uint8_t iova_mode;
+
 	uint32_t core_id;            /* Core ID to bind */
 	char name[RTE_KNI_NAMESIZE]; /* Network device name */
 	struct task_struct *pthread;
@@ -84,8 +86,48 @@ struct kni_dev {
 	void *va[MBUF_BURST_SZ];
 	void *alloc_pa[MBUF_BURST_SZ];
 	void *alloc_va[MBUF_BURST_SZ];
+
+	struct task_struct *usr_tsk;
 };
 
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+static inline phys_addr_t iova_to_phys(struct task_struct *tsk,
+				       unsigned long iova)
+{
+	phys_addr_t offset, phys_addr;
+	struct page *page = NULL;
+	long ret;
+
+	offset = iova & (PAGE_SIZE - 1);
+
+	/* Read one page struct info */
+#ifdef GET_USER_PAGES_REMOTE_API_V3
+	ret = get_user_pages_remote(tsk, tsk->mm, iova, 1,
+				    FOLL_TOUCH, &page, NULL, NULL);
+#endif
+#ifdef GET_USER_PAGES_REMOTE_API_V2
+	ret = get_user_pages_remote(tsk, tsk->mm, iova, 1,
+				    FOLL_TOUCH, &page, NULL);
+#endif
+#ifdef GET_USER_PAGES_REMOTE_API_V1
+	ret = get_user_pages_remote(tsk, tsk->mm, iova, 1,
+				    0, 0, &page, NULL);
+#endif
+	if (ret < 0)
+		return 0;
+
+	phys_addr = page_to_phys(page) | offset;
+	put_page(page);
+
+	return phys_addr;
+}
+
+static inline void *iova_to_kva(struct task_struct *tsk, unsigned long iova)
+{
+	return phys_to_virt(iova_to_phys(tsk, iova));
+}
+#endif
+
 void kni_net_release_fifo_phy(struct kni_dev *kni);
 void kni_net_rx(struct kni_dev *kni);
 void kni_net_init(struct net_device *dev);
diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
index 84ef03b..cda71bd 100644
--- a/kernel/linux/kni/kni_misc.c
+++ b/kernel/linux/kni/kni_misc.c
@@ -348,15 +348,36 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 	strncpy(kni->name, dev_info.name, RTE_KNI_NAMESIZE);
 
 	/* Translate user space info into kernel space info */
-	kni->tx_q = phys_to_virt(dev_info.tx_phys);
-	kni->rx_q = phys_to_virt(dev_info.rx_phys);
-	kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
-	kni->free_q = phys_to_virt(dev_info.free_phys);
-
-	kni->req_q = phys_to_virt(dev_info.req_phys);
-	kni->resp_q = phys_to_virt(dev_info.resp_phys);
-	kni->sync_va = dev_info.sync_va;
-	kni->sync_kva = phys_to_virt(dev_info.sync_phys);
+	if (dev_info.iova_mode) {
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+		kni->tx_q = iova_to_kva(current, dev_info.tx_phys);
+		kni->rx_q = iova_to_kva(current, dev_info.rx_phys);
+		kni->alloc_q = iova_to_kva(current, dev_info.alloc_phys);
+		kni->free_q = iova_to_kva(current, dev_info.free_phys);
+
+		kni->req_q = iova_to_kva(current, dev_info.req_phys);
+		kni->resp_q = iova_to_kva(current, dev_info.resp_phys);
+		kni->sync_va = dev_info.sync_va;
+		kni->sync_kva = iova_to_kva(current, dev_info.sync_phys);
+		kni->usr_tsk = current;
+		kni->iova_mode = 1;
+#else
+		pr_err("KNI module does not support IOVA to VA translation\n");
+		return -EINVAL;
+#endif
+	} else {
+
+		kni->tx_q = phys_to_virt(dev_info.tx_phys);
+		kni->rx_q = phys_to_virt(dev_info.rx_phys);
+		kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
+		kni->free_q = phys_to_virt(dev_info.free_phys);
+
+		kni->req_q = phys_to_virt(dev_info.req_phys);
+		kni->resp_q = phys_to_virt(dev_info.resp_phys);
+		kni->sync_va = dev_info.sync_va;
+		kni->sync_kva = phys_to_virt(dev_info.sync_phys);
+		kni->iova_mode = 0;
+	}
 
 	kni->mbuf_size = dev_info.mbuf_size;
 
diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
index f25b127..1ba9b1b 100644
--- a/kernel/linux/kni/kni_net.c
+++ b/kernel/linux/kni/kni_net.c
@@ -36,6 +36,22 @@ static void kni_net_rx_normal(struct kni_dev *kni);
 /* kni rx function pointer, with default to normal rx */
 static kni_net_rx_t kni_net_rx_func = kni_net_rx_normal;
 
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+/* iova to kernel virtual address */
+static inline void *
+iova2kva(struct kni_dev *kni, void *iova)
+{
+	return phys_to_virt(iova_to_phys(kni->usr_tsk, (unsigned long)iova));
+}
+
+static inline void *
+iova2data_kva(struct kni_dev *kni, struct rte_kni_mbuf *m)
+{
+	return phys_to_virt(iova_to_phys(kni->usr_tsk, m->buf_physaddr) +
+			    m->data_off);
+}
+#endif
+
 /* physical address to kernel virtual address */
 static void *
 pa2kva(void *pa)
@@ -62,6 +78,26 @@ kva2data_kva(struct rte_kni_mbuf *m)
 	return phys_to_virt(m->buf_physaddr + m->data_off);
 }
 
+static inline void *
+get_kva(struct kni_dev *kni, void *pa)
+{
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+	if (kni->iova_mode == 1)
+		return iova2kva(kni, pa);
+#endif
+	return pa2kva(pa);
+}
+
+static inline void *
+get_data_kva(struct kni_dev *kni, void *pkt_kva)
+{
+#ifdef HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+	if (kni->iova_mode == 1)
+		return iova2data_kva(kni, pkt_kva);
+#endif
+	return kva2data_kva(pkt_kva);
+}
+
 /*
  * It can be called to process the request.
  */
@@ -178,7 +214,7 @@ kni_fifo_trans_pa2va(struct kni_dev *kni,
 			return;
 
 		for (i = 0; i < num_rx; i++) {
-			kva = pa2kva(kni->pa[i]);
+			kva = get_kva(kni, kni->pa[i]);
 			kni->va[i] = pa2va(kni->pa[i], kva);
 
 			kva_nb_segs = kva->nb_segs;
@@ -266,8 +302,8 @@ kni_net_tx(struct sk_buff *skb, struct net_device *dev)
 	if (likely(ret == 1)) {
 		void *data_kva;
 
-		pkt_kva = pa2kva(pkt_pa);
-		data_kva = kva2data_kva(pkt_kva);
+		pkt_kva = get_kva(kni, pkt_pa);
+		data_kva = get_data_kva(kni, pkt_kva);
 		pkt_va = pa2va(pkt_pa, pkt_kva);
 
 		len = skb->len;
@@ -338,9 +374,9 @@ kni_net_rx_normal(struct kni_dev *kni)
 
 	/* Transfer received packets to netif */
 	for (i = 0; i < num_rx; i++) {
-		kva = pa2kva(kni->pa[i]);
+		kva = get_kva(kni, kni->pa[i]);
 		len = kva->pkt_len;
-		data_kva = kva2data_kva(kva);
+		data_kva = get_data_kva(kni, kva);
 		kni->va[i] = pa2va(kni->pa[i], kva);
 
 		skb = netdev_alloc_skb(dev, len);
@@ -437,9 +473,9 @@ kni_net_rx_lo_fifo(struct kni_dev *kni)
 	num = ret;
 	/* Copy mbufs */
 	for (i = 0; i < num; i++) {
-		kva = pa2kva(kni->pa[i]);
+		kva = get_kva(kni, kni->pa[i]);
 		len = kva->data_len;
-		data_kva = kva2data_kva(kva);
+		data_kva = get_data_kva(kni, kva);
 		kni->va[i] = pa2va(kni->pa[i], kva);
 
 		while (kva->next) {
@@ -449,8 +485,8 @@ kni_net_rx_lo_fifo(struct kni_dev *kni)
 			kva = next_kva;
 		}
 
-		alloc_kva = pa2kva(kni->alloc_pa[i]);
-		alloc_data_kva = kva2data_kva(alloc_kva);
+		alloc_kva = get_kva(kni, kni->alloc_pa[i]);
+		alloc_data_kva = get_data_kva(kni, alloc_kva);
 		kni->alloc_va[i] = pa2va(kni->alloc_pa[i], alloc_kva);
 
 		memcpy(alloc_data_kva, data_kva, len);
@@ -517,9 +553,9 @@ kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
 
 	/* Copy mbufs to sk buffer and then call tx interface */
 	for (i = 0; i < num; i++) {
-		kva = pa2kva(kni->pa[i]);
+		kva = get_kva(kni, kni->pa[i]);
 		len = kva->pkt_len;
-		data_kva = kva2data_kva(kva);
+		data_kva = get_data_kva(kni, kva);
 		kni->va[i] = pa2va(kni->pa[i], kva);
 
 		skb = netdev_alloc_skb(dev, len);
@@ -550,8 +586,8 @@ kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
 					break;
 
 				prev_kva = kva;
-				kva = pa2kva(kva->next);
-				data_kva = kva2data_kva(kva);
+				kva = get_kva(kni, kva->next);
+				data_kva = get_data_kva(kni, kva);
 				/* Convert physical address to virtual address */
 				prev_kva->next = pa2va(prev_kva->next, kva);
 			}
diff --git a/lib/librte_eal/linux/eal/include/rte_kni_common.h b/lib/librte_eal/linux/eal/include/rte_kni_common.h
index 46f75a7..2427a96 100644
--- a/lib/librte_eal/linux/eal/include/rte_kni_common.h
+++ b/lib/librte_eal/linux/eal/include/rte_kni_common.h
@@ -125,6 +125,7 @@ struct rte_kni_device_info {
 	unsigned int min_mtu;
 	unsigned int max_mtu;
 	uint8_t mac_addr[6];
+	uint8_t iova_mode;
 };
 
 #define KNI_DEVICE "kni"
-- 
2.8.4
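
For readers following the translation path, the address arithmetic in
iova_to_phys() can be modeled in user space. The sketch below is
illustrative only and is not part of the patch: SKETCH_PAGE_SIZE,
sketch_page_offset() and sketch_compose_phys() are hypothetical names, and
a fixed 4 KiB page is assumed in place of the kernel's PAGE_SIZE. It only
mirrors the "mask off the in-page offset, then OR it into the page's
physical base" step; the page lookup itself (get_user_pages_remote() in
the patch) is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages (the kernel uses PAGE_SIZE). */
#define SKETCH_PAGE_SIZE ((uint64_t)1 << 12)

/* Hypothetical helper: in-page offset of an address, i.e. the
 * "iova & (PAGE_SIZE - 1)" step of iova_to_phys(). */
static inline uint64_t
sketch_page_offset(uint64_t iova)
{
	return iova & (SKETCH_PAGE_SIZE - 1);
}

/* Hypothetical helper: combine the physical base of the backing page
 * with the in-page offset, i.e. the "page_to_phys(page) | offset" step.
 * page_phys_base stands in for the value page_to_phys() would return. */
static inline uint64_t
sketch_compose_phys(uint64_t page_phys_base, uint64_t iova)
{
	/* A page base is page aligned, so OR and ADD are equivalent here. */
	assert((page_phys_base & (SKETCH_PAGE_SIZE - 1)) == 0);
	return page_phys_base | sketch_page_offset(iova);
}
```

For example, with a page whose physical base is 0x12345000, the user
space address 0x7f0000001234 has offset 0x234 and maps to 0x12345234
under these assumptions.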