From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kiran Kumar K <kirankumark@marvell.com>
Date: Mon, 22 Apr 2019 11:45:33 +0530
Message-ID: <20190422061533.17538-1-kirankumark@marvell.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190422043912.18060-1-kirankumark@marvell.com>
References: <20190422043912.18060-1-kirankumark@marvell.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Subject: [dpdk-dev] [PATCH v5] kni: add IOVA va support for kni
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

With the current KNI implementation, the kernel module works only in
IOVA=PA mode. This patch adds support for the kernel module to work in
IOVA=VA mode as well.

The idea is to get the physical address from the IOVA address using the
iommu_iova_to_phys API, and then use phys_to_virt to convert that
physical address to a kernel virtual address.

We compared the performance of this approach against IOVA=PA mode and
observed no difference; the kernel itself appears to be the dominant
overhead.

This approach does not work with kernel versions older than 4.4.0
because of API compatibility issues.

Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
---
V5 changes:
 * Fixed build issue with 32-bit build

V4 changes:
 * Fixed build issues with older kernel versions
 * This approach will only work with kernels above 4.4.0

V3 changes:
 * Added new approach to make KNI work in IOVA=VA mode using the
   iommu_iova_to_phys API.
 kernel/linux/kni/kni_dev.h                    |  4 +
 kernel/linux/kni/kni_misc.c                   | 63 ++++++++++++---
 kernel/linux/kni/kni_net.c                    | 76 +++++++++++++++----
 lib/librte_eal/linux/eal/eal.c                |  9 ---
 .../linux/eal/include/rte_kni_common.h        |  1 +
 lib/librte_kni/rte_kni.c                      |  2 +
 6 files changed, 122 insertions(+), 33 deletions(-)

diff --git a/kernel/linux/kni/kni_dev.h b/kernel/linux/kni/kni_dev.h
index df46aa70e..9c4944921 100644
--- a/kernel/linux/kni/kni_dev.h
+++ b/kernel/linux/kni/kni_dev.h
@@ -23,6 +23,7 @@
 #include <linux/wait.h>
 #include <linux/netdevice.h>
 #include <linux/spinlock.h>
+#include <linux/iommu.h>
 #include <linux/list.h>
 
 #define KNI_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
@@ -39,6 +40,9 @@ struct kni_dev {
 	/* kni list */
 	struct list_head list;
 
+	uint8_t iova_mode;
+	struct iommu_domain *domain;
+
 	struct net_device_stats stats;
 	int status;
 	uint16_t group_id;	/* Group ID of a group of KNI devices */
diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
index 31845e10f..9e90af31b 100644
--- a/kernel/linux/kni/kni_misc.c
+++ b/kernel/linux/kni/kni_misc.c
@@ -306,10 +306,12 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 	struct rte_kni_device_info dev_info;
 	struct net_device *net_dev = NULL;
 	struct kni_dev *kni, *dev, *n;
+	struct pci_dev *pci = NULL;
+	struct iommu_domain *domain = NULL;
+	phys_addr_t phys_addr;
 #ifdef RTE_KNI_KMOD_ETHTOOL
 	struct pci_dev *found_pci = NULL;
 	struct net_device *lad_dev = NULL;
-	struct pci_dev *pci = NULL;
 #endif
 
 	pr_info("Creating kni...\n");
@@ -368,15 +370,56 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 	strncpy(kni->name, dev_info.name, RTE_KNI_NAMESIZE);
 
 	/* Translate user space info into kernel space info */
-	kni->tx_q = phys_to_virt(dev_info.tx_phys);
-	kni->rx_q = phys_to_virt(dev_info.rx_phys);
-	kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
-	kni->free_q = phys_to_virt(dev_info.free_phys);
-
-	kni->req_q = phys_to_virt(dev_info.req_phys);
-	kni->resp_q = phys_to_virt(dev_info.resp_phys);
-	kni->sync_va = dev_info.sync_va;
-	kni->sync_kva = phys_to_virt(dev_info.sync_phys);
+
+	if (dev_info.iova_mode) {
+#if KERNEL_VERSION(4, 4, 0) > LINUX_VERSION_CODE
+		(void)pci;
+		pr_err("Kernel version is not supported\n");
+		return -EINVAL;
+#else
+		pci = pci_get_device(dev_info.vendor_id,
+				     dev_info.device_id, NULL);
+		while (pci) {
+			if ((pci->bus->number == dev_info.bus) &&
+			    (PCI_SLOT(pci->devfn) == dev_info.devid) &&
+			    (PCI_FUNC(pci->devfn) == dev_info.function)) {
+				domain = iommu_get_domain_for_dev(&pci->dev);
+				break;
+			}
+			pci = pci_get_device(dev_info.vendor_id,
+					     dev_info.device_id, pci);
+		}
+#endif
+		kni->domain = domain;
+		phys_addr = iommu_iova_to_phys(domain, dev_info.tx_phys);
+		kni->tx_q = phys_to_virt(phys_addr);
+		phys_addr = iommu_iova_to_phys(domain, dev_info.rx_phys);
+		kni->rx_q = phys_to_virt(phys_addr);
+		phys_addr = iommu_iova_to_phys(domain, dev_info.alloc_phys);
+		kni->alloc_q = phys_to_virt(phys_addr);
+		phys_addr = iommu_iova_to_phys(domain, dev_info.free_phys);
+		kni->free_q = phys_to_virt(phys_addr);
+		phys_addr = iommu_iova_to_phys(domain, dev_info.req_phys);
+		kni->req_q = phys_to_virt(phys_addr);
+		phys_addr = iommu_iova_to_phys(domain, dev_info.resp_phys);
+		kni->resp_q = phys_to_virt(phys_addr);
+		kni->sync_va = dev_info.sync_va;
+		phys_addr = iommu_iova_to_phys(domain, dev_info.sync_phys);
+		kni->sync_kva = phys_to_virt(phys_addr);
+		kni->iova_mode = 1;
+
+	} else {
+		kni->tx_q = phys_to_virt(dev_info.tx_phys);
+		kni->rx_q = phys_to_virt(dev_info.rx_phys);
+		kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
+		kni->free_q = phys_to_virt(dev_info.free_phys);
+
+		kni->req_q = phys_to_virt(dev_info.req_phys);
+		kni->resp_q = phys_to_virt(dev_info.resp_phys);
+		kni->sync_va = dev_info.sync_va;
+		kni->sync_kva = phys_to_virt(dev_info.sync_phys);
+		kni->iova_mode = 0;
+	}
 
 	kni->mbuf_size = dev_info.mbuf_size;
diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
index be9e6b0b9..e77a28066 100644
--- a/kernel/linux/kni/kni_net.c
+++ b/kernel/linux/kni/kni_net.c
@@ -35,6 +35,22 @@ static void kni_net_rx_normal(struct kni_dev *kni);
 /* kni rx function pointer, with default to normal rx */
 static kni_net_rx_t kni_net_rx_func = kni_net_rx_normal;
 
+/* iova to kernel virtual address */
+static void *
+iova2kva(struct kni_dev *kni, void *pa)
+{
+	return phys_to_virt(iommu_iova_to_phys(kni->domain,
+					       (uintptr_t)pa));
+}
+
+static void *
+iova2data_kva(struct kni_dev *kni, struct rte_kni_mbuf *m)
+{
+	return phys_to_virt((iommu_iova_to_phys(kni->domain,
+					(uintptr_t)m->buf_physaddr) +
+			     m->data_off));
+}
+
 /* physical address to kernel virtual address */
 static void *
 pa2kva(void *pa)
@@ -186,7 +202,10 @@ kni_fifo_trans_pa2va(struct kni_dev *kni,
 		return;
 
 	for (i = 0; i < num_rx; i++) {
-		kva = pa2kva(kni->pa[i]);
+		if (likely(kni->iova_mode == 1))
+			kva = iova2kva(kni, kni->pa[i]);
+		else
+			kva = pa2kva(kni->pa[i]);
 		kni->va[i] = pa2va(kni->pa[i], kva);
 	}
 
@@ -263,8 +282,13 @@ kni_net_tx(struct sk_buff *skb, struct net_device *dev)
 	if (likely(ret == 1)) {
 		void *data_kva;
 
-		pkt_kva = pa2kva(pkt_pa);
-		data_kva = kva2data_kva(pkt_kva);
+		if (likely(kni->iova_mode == 1)) {
+			pkt_kva = iova2kva(kni, pkt_pa);
+			data_kva = iova2data_kva(kni, pkt_kva);
+		} else {
+			pkt_kva = pa2kva(pkt_pa);
+			data_kva = kva2data_kva(pkt_kva);
+		}
 		pkt_va = pa2va(pkt_pa, pkt_kva);
 
 		len = skb->len;
@@ -335,9 +359,14 @@ kni_net_rx_normal(struct kni_dev *kni)
 
 	/* Transfer received packets to netif */
 	for (i = 0; i < num_rx; i++) {
-		kva = pa2kva(kni->pa[i]);
+		if (likely(kni->iova_mode == 1)) {
+			kva = iova2kva(kni, kni->pa[i]);
+			data_kva = iova2data_kva(kni, kva);
+		} else {
+			kva = pa2kva(kni->pa[i]);
+			data_kva = kva2data_kva(kva);
+		}
 		len = kva->pkt_len;
-		data_kva = kva2data_kva(kva);
 		kni->va[i] = pa2va(kni->pa[i], kva);
 
 		skb = dev_alloc_skb(len + 2);
@@ -434,13 +463,20 @@ kni_net_rx_lo_fifo(struct kni_dev *kni)
 		num = ret;
 	/* Copy mbufs */
 	for (i = 0; i < num; i++) {
-		kva = pa2kva(kni->pa[i]);
+
+		if (likely(kni->iova_mode == 1)) {
+			kva = iova2kva(kni, kni->pa[i]);
+			data_kva = iova2data_kva(kni, kva);
+			alloc_kva = iova2kva(kni, kni->alloc_pa[i]);
+			alloc_data_kva = iova2data_kva(kni, alloc_kva);
+		} else {
+			kva = pa2kva(kni->pa[i]);
+			data_kva = kva2data_kva(kva);
+			alloc_kva = pa2kva(kni->alloc_pa[i]);
+			alloc_data_kva = kva2data_kva(alloc_kva);
+		}
 		len = kva->pkt_len;
-		data_kva = kva2data_kva(kva);
 		kni->va[i] = pa2va(kni->pa[i], kva);
-
-		alloc_kva = pa2kva(kni->alloc_pa[i]);
-		alloc_data_kva = kva2data_kva(alloc_kva);
 		kni->alloc_va[i] = pa2va(kni->alloc_pa[i], alloc_kva);
 
 		memcpy(alloc_data_kva, data_kva, len);
@@ -507,9 +543,15 @@ kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
 
 	/* Copy mbufs to sk buffer and then call tx interface */
 	for (i = 0; i < num; i++) {
-		kva = pa2kva(kni->pa[i]);
+
+		if (likely(kni->iova_mode == 1)) {
+			kva = iova2kva(kni, kni->pa[i]);
+			data_kva = iova2data_kva(kni, kva);
+		} else {
+			kva = pa2kva(kni->pa[i]);
+			data_kva = kva2data_kva(kva);
+		}
 		len = kva->pkt_len;
-		data_kva = kva2data_kva(kva);
 		kni->va[i] = pa2va(kni->pa[i], kva);
 
 		skb = dev_alloc_skb(len + 2);
@@ -545,8 +587,14 @@ kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
 			if (!kva->next)
 				break;
 
-			kva = pa2kva(va2pa(kva->next, kva));
-			data_kva = kva2data_kva(kva);
+			if (likely(kni->iova_mode == 1)) {
+				kva = iova2kva(kni,
+					       va2pa(kva->next, kva));
+				data_kva = iova2data_kva(kni, kva);
+			} else {
+				kva = pa2kva(va2pa(kva->next, kva));
+				data_kva = kva2data_kva(kva);
+			}
 		}
 	}
 
diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c
index f7ae62d7b..8fac6707d 100644
--- a/lib/librte_eal/linux/eal/eal.c
+++ b/lib/librte_eal/linux/eal/eal.c
@@ -1040,15 +1040,6 @@ rte_eal_init(int argc, char **argv)
 		/* autodetect the IOVA mapping mode (default is RTE_IOVA_PA) */
 		rte_eal_get_configuration()->iova_mode =
 			rte_bus_get_iommu_class();
-
-		/* Workaround for KNI which requires physical address to work */
-		if (rte_eal_get_configuration()->iova_mode == RTE_IOVA_VA &&
-				rte_eal_check_module("rte_kni") == 1) {
-			rte_eal_get_configuration()->iova_mode = RTE_IOVA_PA;
-			RTE_LOG(WARNING, EAL,
-				"Some devices want IOVA as VA but PA will be used because.. "
-				"KNI module inserted\n");
-		}
 	} else {
 		rte_eal_get_configuration()->iova_mode =
 			internal_config.iova_mode;
diff --git a/lib/librte_eal/linux/eal/include/rte_kni_common.h b/lib/librte_eal/linux/eal/include/rte_kni_common.h
index 5afa08713..79ee4bc5a 100644
--- a/lib/librte_eal/linux/eal/include/rte_kni_common.h
+++ b/lib/librte_eal/linux/eal/include/rte_kni_common.h
@@ -128,6 +128,7 @@ struct rte_kni_device_info {
 	unsigned mbuf_size;
 	unsigned int mtu;
 	char mac_addr[6];
+	uint8_t iova_mode;
 };
 
 #define KNI_DEVICE "kni"
diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 946459c79..ec8f23694 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -304,6 +304,8 @@ rte_kni_alloc(struct rte_mempool *pktmbuf_pool,
 	kni->group_id = conf->group_id;
 	kni->mbuf_size = conf->mbuf_size;
 
+	dev_info.iova_mode = (rte_eal_iova_mode() == RTE_IOVA_VA) ? 1 : 0;
+
 	ret = ioctl(kni_fd, RTE_KNI_IOCTL_CREATE, &dev_info);
 	if (ret < 0)
 		goto ioctl_fail;
-- 
2.17.1