From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id BEE3BC47A for ; Fri, 19 Feb 2016 06:06:22 +0100 (CET) Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga102.fm.intel.com with ESMTP; 18 Feb 2016 21:06:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.22,469,1449561600"; d="scan'208";a="749153274" Received: from irvmail001.ir.intel.com ([163.33.26.43]) by orsmga003.jf.intel.com with ESMTP; 18 Feb 2016 21:06:20 -0800 Received: from sivswdev02.ir.intel.com (sivswdev02.ir.intel.com [10.237.217.46]) by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id u1J56Jl8030954; Fri, 19 Feb 2016 05:06:19 GMT Received: from sivswdev02.ir.intel.com (localhost [127.0.0.1]) by sivswdev02.ir.intel.com with ESMTP id u1J56JXi014858; Fri, 19 Feb 2016 05:06:19 GMT Received: (from fyigit@localhost) by sivswdev02.ir.intel.com with id u1J56J90014854; Fri, 19 Feb 2016 05:06:19 GMT From: Ferruh Yigit To: dev@dpdk.org Date: Fri, 19 Feb 2016 05:05:48 +0000 Message-Id: <1455858349-14639-2-git-send-email-ferruh.yigit@intel.com> X-Mailer: git-send-email 1.7.4.1 In-Reply-To: <1455858349-14639-1-git-send-email-ferruh.yigit@intel.com> References: <1453912360-18179-1-git-send-email-ferruh.yigit@intel.com> <1455858349-14639-1-git-send-email-ferruh.yigit@intel.com> Subject: [dpdk-dev] [PATCH v2 1/2] kdp: add kernel data path kernel module X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Feb 2016 05:06:23 -0000 This kernel module is based on the KNI module, but it is a stripped-down version that handles only data messages; no control functionality is provided. The KNI FIFO implementation is kept exactly the same, but the ethtool-related code is removed and the virtual network management code is simplified. 
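For readers not familiar with the KNI FIFO, the ring semantics carried over into kdp_fifo.h can be sketched in plain userspace C. This is only an illustration under stated assumptions: the demo_fifo names are hypothetical and not part of this patch, the ring length is assumed to be a power of two (the index masking depends on it), and there is a single producer and a single consumer.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative userspace copy of the ring layout; not the kernel header. */
struct demo_fifo {
	volatile unsigned write; /* next position to be written */
	volatile unsigned read;  /* next position to be read */
	unsigned len;            /* number of slots, assumed power of two */
	void *volatile buffer[]; /* slots holding opaque pointers */
};

/* Allocate an empty ring with len slots (hypothetical helper). */
static struct demo_fifo *demo_fifo_create(unsigned len)
{
	struct demo_fifo *f;

	/* The masking below only works for power-of-two lengths. */
	assert(len >= 2 && (len & (len - 1)) == 0);
	f = calloc(1, sizeof(*f) + len * sizeof(void *));
	if (f != NULL)
		f->len = len;
	return f;
}

/* Add up to num elements; return how many were actually written. */
static unsigned demo_fifo_put(struct demo_fifo *f, void **data, unsigned num)
{
	unsigned i;
	unsigned w = f->write;
	unsigned r = f->read;
	unsigned next = w;

	for (i = 0; i < num; i++) {
		next = (next + 1) & (f->len - 1);
		if (next == r) /* writing would catch up with the reader */
			break;
		f->buffer[w] = data[i];
		w = next;
	}
	f->write = w;
	return i;
}

/* Remove up to num elements; return how many were actually read. */
static unsigned demo_fifo_get(struct demo_fifo *f, void **data, unsigned num)
{
	unsigned i;
	unsigned r = f->read;
	unsigned w = f->write;

	for (i = 0; i < num; i++) {
		if (r == w) /* ring is empty */
			break;
		data[i] = f->buffer[r];
		r = (r + 1) & (f->len - 1);
	}
	f->read = r;
	return i;
}

/* Number of elements currently stored. */
static unsigned demo_fifo_count(struct demo_fifo *f)
{
	return (f->len + f->write - f->read) & (f->len - 1);
}

/* Number of elements that can still be stored. */
static unsigned demo_fifo_free_count(struct demo_fifo *f)
{
	return (f->read - f->write - 1) & (f->len - 1);
}
```

Because write == read means empty, one slot always stays unused: a ring created with len slots holds at most len - 1 pointers.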
This module contains the kernel support needed to create network devices, plus a simple driver for a virtual network device. Instead of talking to real hardware, the driver simply puts packets to and gets packets from a FIFO. The FIFO is created and owned by the userspace application, which in this case is the KDP PMD. In the long term this patch is intended to replace KNI, and KNI will be deprecated. Signed-off-by: Ferruh Yigit --- v2: * Use rtnetlink to create interfaces * include modules.h to prevent compile error in old kernels --- MAINTAINERS | 4 + config/common_linuxapp | 8 +- lib/librte_eal/linuxapp/Makefile | 5 +- lib/librte_eal/linuxapp/eal/Makefile | 3 +- .../linuxapp/eal/include/exec-env/rte_kdp_common.h | 139 ++++ lib/librte_eal/linuxapp/kdp/Makefile | 55 ++ lib/librte_eal/linuxapp/kdp/kdp_dev.h | 78 ++ lib/librte_eal/linuxapp/kdp/kdp_fifo.h | 91 +++ lib/librte_eal/linuxapp/kdp/kdp_net.c | 862 +++++++++++++++++++++ 9 files changed, 1242 insertions(+), 3 deletions(-) create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h create mode 100644 lib/librte_eal/linuxapp/kdp/Makefile create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_dev.h create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_fifo.h create mode 100644 lib/librte_eal/linuxapp/kdp/kdp_net.c diff --git a/MAINTAINERS b/MAINTAINERS index 628bc05..05ffe26 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -257,6 +257,10 @@ F: app/test/test_kni.c F: examples/kni/ F: doc/guides/sample_app_ug/kernel_nic_interface.rst +Linux KDP +M: Ferruh Yigit +F: lib/librte_eal/linuxapp/kdp/ + Linux AF_PACKET M: John W. Linville F: drivers/net/af_packet/ diff --git a/config/common_linuxapp b/config/common_linuxapp index f1638db..e1b5032 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -1,6 +1,6 @@ # BSD LICENSE # -# Copyright(c) 2010-2015 Intel Corporation. All rights reserved. +# Copyright(c) 2010-2016 Intel Corporation. All rights reserved. # All rights reserved. 
# # Redistribution and use in source and binary forms, with or without @@ -314,6 +314,12 @@ CONFIG_RTE_LIBRTE_PMD_XENVIRT=n CONFIG_RTE_LIBRTE_PMD_NULL=y # +# Compile KDP PMD +# +CONFIG_RTE_KDP_KMOD=y +CONFIG_RTE_KDP_PREEMPT_DEFAULT=y + +# # Do prefetch of packet data within PMD driver receive function # CONFIG_RTE_PMD_PACKET_PREFETCH=y diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile index d9c5233..e3f91a7 100644 --- a/lib/librte_eal/linuxapp/Makefile +++ b/lib/librte_eal/linuxapp/Makefile @@ -1,6 +1,6 @@ # BSD LICENSE # -# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. +# Copyright(c) 2010-2016 Intel Corporation. All rights reserved. # All rights reserved. # # Redistribution and use in source and binary forms, with or without @@ -38,6 +38,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal ifeq ($(CONFIG_RTE_KNI_KMOD),y) DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kni endif +ifeq ($(CONFIG_RTE_KDP_KMOD),y) +DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kdp +endif ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y) DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += xen_dom0 endif diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile index 6e26250..a70b793 100644 --- a/lib/librte_eal/linuxapp/eal/Makefile +++ b/lib/librte_eal/linuxapp/eal/Makefile @@ -1,6 +1,6 @@ # BSD LICENSE # -# Copyright(c) 2010-2015 Intel Corporation. All rights reserved. +# Copyright(c) 2010-2016 Intel Corporation. All rights reserved. # All rights reserved. 
# # Redistribution and use in source and binary forms, with or without @@ -121,6 +121,7 @@ CFLAGS_eal_thread.o += -Wno-return-type endif INC := rte_interrupts.h rte_kni_common.h rte_dom0_common.h +INC += rte_kdp_common.h SYMLINK-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP)-include/exec-env := \ $(addprefix include/exec-env/,$(INC)) diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h new file mode 100644 index 0000000..0334876 --- /dev/null +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kdp_common.h @@ -0,0 +1,139 @@ +/*- + * This file is provided under a dual BSD/LGPLv2 license. When using or + * redistributing this file, you may do so under either license. + * + * GNU LESSER GENERAL PUBLIC LICENSE + * + * Copyright(c) 2016 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of version 2.1 of the GNU Lesser General Public License + * as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public License + * along with this program; + * + * Contact Information: + * Intel Corporation + * + * + * BSD LICENSE + * + * Copyright(c) 2016 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + */ + +#ifndef _RTE_KDP_COMMON_H_ +#define _RTE_KDP_COMMON_H_ + +/** + * KDP name + */ +#define RTE_KDP_NAMESIZE 32 + +#define KDP_DEVICE "kdp" + +/* + * Fifo struct mapped in a shared memory. It describes a circular buffer FIFO + * Write and read should wrap around. Fifo is empty when write == read + * Writing should never overwrite the read position + */ +struct rte_kdp_fifo { + volatile unsigned write; /**< Next position to be written*/ + volatile unsigned read; /**< Next position to be read */ + unsigned len; /**< Circular buffer length */ + unsigned elem_size; /**< Pointer size - for 32/64 bit OS */ + void * volatile buffer[0]; /**< The buffer contains mbuf pointers */ +}; + +/* + * The kernel image of the rte_mbuf struct, with only the relevant fields. 
+ * Padding is necessary to assure the offsets of these fields + */ +struct rte_kdp_mbuf { + void *buf_addr __attribute__((__aligned__(RTE_CACHE_LINE_SIZE))); + char pad0[10]; + + /**< Start address of data in segment buffer. */ + uint16_t data_off; + char pad1[4]; + uint64_t ol_flags; /**< Offload features. */ + char pad2[4]; + + /**< Total pkt len: sum of all segment data_len. */ + uint32_t pkt_len; + + /**< Amount of data in segment buffer. */ + uint16_t data_len; + + /* fields on second cache line */ + char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_SIZE))); + void *pool; + void *next; +}; + +/* + * Struct used to create a KDP device. Passed to the kernel in IOCTL call + */ +struct rte_kdp_device_info { + char name[RTE_KDP_NAMESIZE]; /**< Network device name for KDP */ + + phys_addr_t tx_phys; + phys_addr_t rx_phys; + phys_addr_t alloc_phys; + phys_addr_t free_phys; + + /* mbuf mempool */ + void *mbuf_va; + phys_addr_t mbuf_phys; + + uint16_t port_id; /**< Group ID */ + uint32_t core_id; /**< core ID to bind for kernel thread */ + + uint8_t force_bind : 1; /**< Flag for kernel thread binding */ + + /* mbuf size */ + unsigned mbuf_size; +}; + +enum { + IFLA_KDP_UNSPEC, + IFLA_KDP_PORTID, + IFLA_KDP_DEVINFO, + __IFLA_KDP_MAX, +}; +#define IFLA_KDP_MAX (__IFLA_KDP_MAX - 1) + +#endif /* _RTE_KDP_COMMON_H_ */ diff --git a/lib/librte_eal/linuxapp/kdp/Makefile b/lib/librte_eal/linuxapp/kdp/Makefile new file mode 100644 index 0000000..3897dc6 --- /dev/null +++ b/lib/librte_eal/linuxapp/kdp/Makefile @@ -0,0 +1,55 @@ +# BSD LICENSE +# +# Copyright(c) 2016 Intel Corporation. All rights reserved. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. 
+# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +include $(RTE_SDK)/mk/rte.vars.mk + +# +# module name and path +# +MODULE = rte_kdp + +# +# CFLAGS +# +MODULE_CFLAGS += -I$(SRCDIR) --param max-inline-insns-single=50 +MODULE_CFLAGS += -I$(RTE_OUTPUT)/include +MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h +MODULE_CFLAGS += -Wall -Werror + +# this lib needs main eal +DEPDIRS-y += lib/librte_eal/linuxapp/eal + +# +# all source are stored in SRCS-y +# +SRCS-y += kdp_net.c + +include $(RTE_SDK)/mk/rte.module.mk diff --git a/lib/librte_eal/linuxapp/kdp/kdp_dev.h b/lib/librte_eal/linuxapp/kdp/kdp_dev.h new file mode 100644 index 0000000..61f4288 --- /dev/null +++ b/lib/librte_eal/linuxapp/kdp/kdp_dev.h @@ -0,0 +1,78 @@ +/*- + * GPL LICENSE SUMMARY + * + * Copyright(c) 2016 Intel Corporation. All rights reserved. 
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of version 2 of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; + * + * Contact Information: + * Intel Corporation + */ + +#ifndef _KDP_DEV_H_ +#define _KDP_DEV_H_ + +#include + +/** + * A structure describing the private information for a kdp device. + */ +struct kdp_dev { + /* kdp list */ + struct list_head list; + + struct net_device_stats stats; + uint16_t port_id; /* Group ID of a group of KDP devices */ + unsigned core_id; /* Core ID to bind */ + char name[RTE_KDP_NAMESIZE]; /* Network device name */ + struct task_struct *pthread; + + /* wait queue for req/resp */ + wait_queue_head_t wq; + struct mutex sync_lock; + + /* kdp device */ + struct net_device *net_dev; + + /* queue for packets to be sent out */ + void *tx_q; + + /* queue for the packets received */ + void *rx_q; + + /* queue for the allocated mbufs that can be used to save sk buffs */ + void *alloc_q; + + /* free queue for the mbufs to be freed */ + void *free_q; + + void *sync_kva; + void *sync_va; + + void *mbuf_kva; + void *mbuf_va; + + /* mbuf size */ + unsigned mbuf_size; +}; + +#define KDP_ERR(args...) printk(KERN_ERR "KDP: " args) +#define KDP_PRINT(args...) printk(KERN_DEBUG "KDP: " args) + +#ifdef RTE_KDP_KO_DEBUG +#define KDP_DBG(args...) printk(KERN_DEBUG "KDP: " args) +#else +#define KDP_DBG(args...) 
+#endif + +#endif diff --git a/lib/librte_eal/linuxapp/kdp/kdp_fifo.h b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h new file mode 100644 index 0000000..a5fe080 --- /dev/null +++ b/lib/librte_eal/linuxapp/kdp/kdp_fifo.h @@ -0,0 +1,91 @@ +/*- + * GPL LICENSE SUMMARY + * + * Copyright(c) 2016 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of version 2 of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; + * + * Contact Information: + * Intel Corporation + */ + +#ifndef _KDP_FIFO_H_ +#define _KDP_FIFO_H_ + +#include + +/** + * Adds num elements into the fifo. Return the number actually written + */ +static inline unsigned +kdp_fifo_put(struct rte_kdp_fifo *fifo, void **data, unsigned num) +{ + unsigned i = 0; + unsigned fifo_write = fifo->write; + unsigned fifo_read = fifo->read; + unsigned new_write = fifo_write; + + for (i = 0; i < num; i++) { + new_write = (new_write + 1) & (fifo->len - 1); + + if (new_write == fifo_read) + break; + fifo->buffer[fifo_write] = data[i]; + fifo_write = new_write; + } + fifo->write = fifo_write; + + return i; +} + +/** + * Get up to num elements from the fifo. 
Return the number actually read + */ +static inline unsigned +kdp_fifo_get(struct rte_kdp_fifo *fifo, void **data, unsigned num) +{ + unsigned i = 0; + unsigned new_read = fifo->read; + unsigned fifo_write = fifo->write; + + for (i = 0; i < num; i++) { + if (new_read == fifo_write) + break; + + data[i] = fifo->buffer[new_read]; + new_read = (new_read + 1) & (fifo->len - 1); + } + fifo->read = new_read; + + return i; +} + +/** + * Get the num of elements in the fifo + */ +static inline unsigned +kdp_fifo_count(struct rte_kdp_fifo *fifo) +{ + return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1); +} + +/** + * Get the num of available elements in the fifo + */ +static inline unsigned +kdp_fifo_free_count(struct rte_kdp_fifo *fifo) +{ + return (fifo->read - fifo->write - 1) & (fifo->len - 1); +} + +#endif /* _KDP_FIFO_H_ */ diff --git a/lib/librte_eal/linuxapp/kdp/kdp_net.c b/lib/librte_eal/linuxapp/kdp/kdp_net.c new file mode 100644 index 0000000..08229f1 --- /dev/null +++ b/lib/librte_eal/linuxapp/kdp/kdp_net.c @@ -0,0 +1,862 @@ +/*- + * GPL LICENSE SUMMARY + * + * Copyright(c) 2016 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of version 2 of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; + * + * Contact Information: + * Intel Corporation + */ + +/* + * This code is inspired from the book "Linux Device Drivers" by + * Alessandro Rubini and Jonathan Corbet, published by O'Reilly & Associates + */ + +#include +#include +#include /* eth_type_trans */ +#include +#include + +#include "kdp_fifo.h" +#include "kdp_dev.h" + +#define WD_TIMEOUT 5 /*jiffies */ +#define MBUF_BURST_SZ 32 + +#define KDP_RX_LOOP_NUM 1000 +#define KDP_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */ + +static struct task_struct *kdp_kthread; +static struct rw_semaphore kdp_list_lock; +static struct list_head kdp_list_head; + +/* loopback mode */ +static char *lo_mode; + +/* Kernel thread mode */ +static char *kthread_mode; +static unsigned multiple_kthread_on; + +/* typedef for rx function */ +typedef void (*kdp_net_rx_t)(struct kdp_dev *kdp); + +/* + * Open and close + */ +static int kdp_net_open(struct net_device *dev) +{ + random_ether_addr(dev->dev_addr); + netif_start_queue(dev); + + return 0; +} + +static int kdp_net_release(struct net_device *dev) +{ + netif_stop_queue(dev); /* can't transmit any more */ + + return 0; +} + +/* + * Configuration changes (passed on by ifconfig) + */ +static int kdp_net_config(struct net_device *dev, struct ifmap *map) +{ + if (dev->flags & IFF_UP) /* can't act on a running interface */ + return -EBUSY; + + /* ignore other fields */ + return 0; +} + +/* + * Transmit a packet (called by the kernel) + */ +static int kdp_net_tx(struct sk_buff *skb, struct net_device *dev) +{ + int len = 0; + unsigned ret; + struct kdp_dev *kdp = netdev_priv(dev); + struct rte_kdp_mbuf *pkt_kva = NULL; + struct rte_kdp_mbuf *pkt_va = NULL; + + dev->trans_start = jiffies; /* save the timestamp */ + + /* Check if the length of skb is less than mbuf size */ + if (skb->len > kdp->mbuf_size) + goto drop; + + /** + * Check if it has at least one free entry in tx_q and + * one 
entry in alloc_q. + */ + if (kdp_fifo_free_count(kdp->tx_q) == 0 || + kdp_fifo_count(kdp->alloc_q) == 0) { + /** + * If no free entry in tx_q or no entry in alloc_q, + * drops skb and goes out. + */ + goto drop; + } + + /* dequeue a mbuf from alloc_q */ + ret = kdp_fifo_get(kdp->alloc_q, (void **)&pkt_va, 1); + if (likely(ret == 1)) { + void *data_kva; + + pkt_kva = (void *)pkt_va - kdp->mbuf_va + kdp->mbuf_kva; + data_kva = pkt_kva->buf_addr + pkt_kva->data_off - kdp->mbuf_va + + kdp->mbuf_kva; + + len = skb->len; + memcpy(data_kva, skb->data, len); + if (unlikely(len < ETH_ZLEN)) { + memset(data_kva + len, 0, ETH_ZLEN - len); + len = ETH_ZLEN; + } + pkt_kva->pkt_len = len; + pkt_kva->data_len = len; + + /* enqueue mbuf into tx_q */ + ret = kdp_fifo_put(kdp->tx_q, (void **)&pkt_va, 1); + if (unlikely(ret != 1)) { + /* Failing should not happen */ + KDP_ERR("Fail to enqueue mbuf into tx_q\n"); + goto drop; + } + } else { + /* Failing should not happen */ + KDP_ERR("Fail to dequeue mbuf from alloc_q\n"); + goto drop; + } + + /* Free skb and update statistics */ + dev_kfree_skb(skb); + kdp->stats.tx_bytes += len; + kdp->stats.tx_packets++; + + return NETDEV_TX_OK; + +drop: + /* Free skb and update statistics */ + dev_kfree_skb(skb); + kdp->stats.tx_dropped++; + + return NETDEV_TX_OK; +} + +static int kdp_net_change_mtu(struct net_device *dev, int new_mtu) +{ + KDP_DBG("kdp_net_change_mtu new mtu %d to be set\n", new_mtu); + + dev->mtu = new_mtu; + + return 0; +} + +/* + * Ioctl commands + */ +static int kdp_net_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) +{ + KDP_DBG("kdp_net_ioctl %d\n", + ((struct kdp_dev *)netdev_priv(dev))->port_id); + + return 0; +} + +static void kdp_net_set_rx_mode(struct net_device *dev) +{ +} + +/* + * Return statistics to the caller + */ +static struct net_device_stats *kdp_net_stats(struct net_device *dev) +{ + struct kdp_dev *kdp = netdev_priv(dev); + + return &kdp->stats; +} + +/* + * Deal with a transmit timeout. 
+ */ +static void kdp_net_tx_timeout(struct net_device *dev) +{ + struct kdp_dev *kdp = netdev_priv(dev); + + KDP_DBG("Transmit timeout at %ld, latency %ld\n", jiffies, + jiffies - dev->trans_start); + + kdp->stats.tx_errors++; + netif_wake_queue(dev); +} + +/** + * kdp_net_set_mac - Change the Ethernet Address of the KDP NIC + * @netdev: network interface device structure + * @p: pointer to an address structure + * + * Returns 0 on success, negative on failure + **/ +static int kdp_net_set_mac(struct net_device *netdev, void *p) +{ + struct sockaddr *addr = p; + if (!is_valid_ether_addr((unsigned char *)(addr->sa_data))) + return -EADDRNOTAVAIL; + memcpy(netdev->dev_addr, addr->sa_data, netdev->addr_len); + + return 0; +} + +#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3, 9, 0)) +static int kdp_net_change_carrier(struct net_device *dev, bool new_carrier) +{ + if (new_carrier) + netif_carrier_on(dev); + else + netif_carrier_off(dev); + + return 0; +} +#endif + +static const struct net_device_ops kdp_net_netdev_ops = { + .ndo_open = kdp_net_open, + .ndo_stop = kdp_net_release, + .ndo_set_config = kdp_net_config, + .ndo_start_xmit = kdp_net_tx, + .ndo_change_mtu = kdp_net_change_mtu, + .ndo_do_ioctl = kdp_net_ioctl, + .ndo_set_rx_mode = kdp_net_set_rx_mode, + .ndo_get_stats = kdp_net_stats, + .ndo_tx_timeout = kdp_net_tx_timeout, + .ndo_set_mac_address = kdp_net_set_mac, +#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3, 9, 0)) + .ndo_change_carrier = kdp_net_change_carrier, +#endif +}; + +/* + * Fill the eth header + */ +static int kdp_net_header(struct sk_buff *skb, struct net_device *dev, + unsigned short type, const void *daddr, + const void *saddr, unsigned int len) +{ + struct ethhdr *eth = (struct ethhdr *) skb_push(skb, ETH_HLEN); + + memcpy(eth->h_source, saddr ? saddr : dev->dev_addr, dev->addr_len); + memcpy(eth->h_dest, daddr ? 
daddr : dev->dev_addr, dev->addr_len); + eth->h_proto = htons(type); + + return dev->hard_header_len; +} + +/* + * Re-fill the eth header + */ +#if (LINUX_VERSION_CODE < KERNEL_VERSION(4, 1, 0)) +static int kdp_net_rebuild_header(struct sk_buff *skb) +{ + struct net_device *dev = skb->dev; + struct ethhdr *eth = (struct ethhdr *) skb->data; + + memcpy(eth->h_source, dev->dev_addr, dev->addr_len); + memcpy(eth->h_dest, dev->dev_addr, dev->addr_len); + + return 0; +} +#endif /* < 4.1.0 */ + +static const struct header_ops kdp_net_header_ops = { + .create = kdp_net_header, +#if (LINUX_VERSION_CODE < KERNEL_VERSION(4, 1, 0)) + .rebuild = kdp_net_rebuild_header, +#endif /* < 4.1.0 */ + .cache = NULL, /* disable caching */ +}; + +static void kdp_net_setup(struct net_device *dev) +{ + struct kdp_dev *kdp; + + ether_setup(dev); + dev->netdev_ops = &kdp_net_netdev_ops; + dev->header_ops = &kdp_net_header_ops; + dev->watchdog_timeo = WD_TIMEOUT; + + kdp = netdev_priv(dev); + init_waitqueue_head(&kdp->wq); + mutex_init(&kdp->sync_lock); + + dev->flags |= IFF_UP; +} + +/* + * RX: normal working mode + */ +static void kdp_net_rx_normal(struct kdp_dev *kdp) +{ + unsigned ret; + uint32_t len; + unsigned i, num_rx, num_fq; + struct rte_kdp_mbuf *kva; + struct rte_kdp_mbuf *va[MBUF_BURST_SZ]; + void *data_kva; + unsigned mbuf_burst_size = MBUF_BURST_SZ; + + struct sk_buff *skb; + struct net_device *dev = kdp->net_dev; + + /* Get the number of free entries in free_q */ + num_fq = kdp_fifo_free_count(kdp->free_q); + if (num_fq == 0) { + /* No room on the free_q, bail out */ + return; + } + + /* Calculate the number of entries to dequeue from rx_q */ + num_rx = min(num_fq, mbuf_burst_size); + + /* Burst dequeue from rx_q */ + num_rx = kdp_fifo_get(kdp->rx_q, (void **)va, num_rx); + if (num_rx == 0) + return; + + /* Transfer received packets to netif */ + for (i = 0; i < num_rx; i++) { + kva = (void *)va[i] - kdp->mbuf_va + kdp->mbuf_kva; + len = kva->data_len; + data_kva = 
kva->buf_addr + kva->data_off - kdp->mbuf_va + + kdp->mbuf_kva; + + skb = dev_alloc_skb(len + 2); + if (!skb) { + KDP_ERR("Out of mem, dropping pkts\n"); + /* Update statistics */ + kdp->stats.rx_dropped++; + } else { + /* Align IP on 16B boundary */ + skb_reserve(skb, 2); + memcpy(skb_put(skb, len), data_kva, len); + skb->dev = dev; + skb->protocol = eth_type_trans(skb, dev); + skb->ip_summed = CHECKSUM_UNNECESSARY; + + /* Call netif interface */ + netif_rx(skb); + + /* Update statistics */ + kdp->stats.rx_bytes += len; + kdp->stats.rx_packets++; + } + } + + /* Burst enqueue mbufs into free_q */ + ret = kdp_fifo_put(kdp->free_q, (void **)va, num_rx); + if (ret != num_rx) + /* Failing should not happen */ + KDP_ERR("Fail to enqueue entries into free_q\n"); +} + +/* + * RX: loopback with enqueue/dequeue fifos. + */ +static void kdp_net_rx_lo_fifo(struct kdp_dev *kdp) +{ + unsigned ret; + uint32_t len; + unsigned i, num, num_rq, num_tq, num_aq, num_fq; + struct rte_kdp_mbuf *kva; + struct rte_kdp_mbuf *va[MBUF_BURST_SZ]; + void *data_kva; + struct rte_kdp_mbuf *alloc_kva; + struct rte_kdp_mbuf *alloc_va[MBUF_BURST_SZ]; + void *alloc_data_kva; + unsigned mbuf_burst_size = MBUF_BURST_SZ; + + /* Get the number of entries in rx_q */ + num_rq = kdp_fifo_count(kdp->rx_q); + + /* Get the number of free entries in tx_q */ + num_tq = kdp_fifo_free_count(kdp->tx_q); + + /* Get the number of entries in alloc_q */ + num_aq = kdp_fifo_count(kdp->alloc_q); + + /* Get the number of free entries in free_q */ + num_fq = kdp_fifo_free_count(kdp->free_q); + + /* Calculate the number of entries to be dequeued from rx_q */ + num = min(num_rq, num_tq); + num = min(num, num_aq); + num = min(num, num_fq); + num = min(num, mbuf_burst_size); + + /* Return if no entry to dequeue from rx_q */ + if (num == 0) + return; + + /* Burst dequeue from rx_q */ + ret = kdp_fifo_get(kdp->rx_q, (void **)va, num); + if (ret == 0) + return; /* Failing should not happen */ + + /* Dequeue entries from alloc_q 
*/ + ret = kdp_fifo_get(kdp->alloc_q, (void **)alloc_va, num); + if (ret) { + num = ret; + /* Copy mbufs */ + for (i = 0; i < num; i++) { + kva = (void *)va[i] - kdp->mbuf_va + kdp->mbuf_kva; + len = kva->pkt_len; + data_kva = kva->buf_addr + kva->data_off - + kdp->mbuf_va + kdp->mbuf_kva; + + alloc_kva = (void *)alloc_va[i] - kdp->mbuf_va + + kdp->mbuf_kva; + alloc_data_kva = alloc_kva->buf_addr + + alloc_kva->data_off - kdp->mbuf_va + + kdp->mbuf_kva; + memcpy(alloc_data_kva, data_kva, len); + alloc_kva->pkt_len = len; + alloc_kva->data_len = len; + + kdp->stats.tx_bytes += len; + kdp->stats.rx_bytes += len; + } + + /* Burst enqueue mbufs into tx_q */ + ret = kdp_fifo_put(kdp->tx_q, (void **)alloc_va, num); + if (ret != num) + /* Failing should not happen */ + KDP_ERR("Fail to enqueue mbufs into tx_q\n"); + } + + /* Burst enqueue mbufs into free_q */ + ret = kdp_fifo_put(kdp->free_q, (void **)va, num); + if (ret != num) + /* Failing should not happen */ + KDP_ERR("Fail to enqueue mbufs into free_q\n"); + + /** + * Update statistic, and enqueue/dequeue failure is impossible, + * as all queues are checked at first. + */ + kdp->stats.tx_packets += num; + kdp->stats.rx_packets += num; +} + +/* + * RX: loopback with enqueue/dequeue fifos and sk buffer copies. 
+ */ +static void kdp_net_rx_lo_fifo_skb(struct kdp_dev *kdp) +{ + unsigned ret; + uint32_t len; + unsigned i, num_rq, num_fq, num; + struct rte_kdp_mbuf *kva; + struct rte_kdp_mbuf *va[MBUF_BURST_SZ]; + void *data_kva; + struct sk_buff *skb; + struct net_device *dev = kdp->net_dev; + unsigned mbuf_burst_size = MBUF_BURST_SZ; + + /* Get the number of entries in rx_q */ + num_rq = kdp_fifo_count(kdp->rx_q); + + /* Get the number of free entries in free_q */ + num_fq = kdp_fifo_free_count(kdp->free_q); + + /* Calculate the number of entries to dequeue from rx_q */ + num = min(num_rq, num_fq); + num = min(num, mbuf_burst_size); + + /* Return if no entry to dequeue from rx_q */ + if (num == 0) + return; + + /* Burst dequeue mbufs from rx_q */ + ret = kdp_fifo_get(kdp->rx_q, (void **)va, num); + if (ret == 0) + return; + + /* Copy mbufs to sk buffer and then call tx interface */ + for (i = 0; i < num; i++) { + kva = (void *)va[i] - kdp->mbuf_va + kdp->mbuf_kva; + len = kva->data_len; + data_kva = kva->buf_addr + kva->data_off - kdp->mbuf_va + + kdp->mbuf_kva; + + skb = dev_alloc_skb(len + 2); + if (skb == NULL) + KDP_ERR("Out of mem, dropping pkts\n"); + else { + /* Align IP on 16B boundary */ + skb_reserve(skb, 2); + memcpy(skb_put(skb, len), data_kva, len); + skb->dev = dev; + skb->ip_summed = CHECKSUM_UNNECESSARY; + dev_kfree_skb(skb); + } + + /* Simulate real usage, allocate/copy skb twice */ + skb = dev_alloc_skb(len + 2); + if (skb == NULL) { + KDP_ERR("Out of mem, dropping pkts\n"); + kdp->stats.rx_dropped++; + } else { + /* Align IP on 16B boundary */ + skb_reserve(skb, 2); + memcpy(skb_put(skb, len), data_kva, len); + skb->dev = dev; + skb->ip_summed = CHECKSUM_UNNECESSARY; + + kdp->stats.rx_bytes += len; + kdp->stats.rx_packets++; + + /* call tx interface */ + kdp_net_tx(skb, dev); + } + } + + /* enqueue all the mbufs from rx_q into free_q */ + ret = kdp_fifo_put(kdp->free_q, (void **)&va, num); + if (ret != num) + /* Failing should not happen */ + 
KDP_ERR("Fail to enqueue mbufs into free_q\n"); +} + +/* kdp rx function pointer, with default to normal rx */ +static kdp_net_rx_t kdp_net_rx_func = kdp_net_rx_normal; + +/* rx interface */ +static void kdp_net_rx(struct kdp_dev *kdp) +{ + /** + * It doesn't need to check if it is NULL pointer, + * as it has a default value + */ + (*kdp_net_rx_func)(kdp); +} + +static int kdp_thread_single(void *data) +{ + struct kdp_dev *dev; + int j; + + while (!kthread_should_stop()) { + down_read(&kdp_list_lock); + for (j = 0; j < KDP_RX_LOOP_NUM; j++) { + list_for_each_entry(dev, &kdp_list_head, list) { + kdp_net_rx(dev); + } + } + up_read(&kdp_list_lock); +#ifdef RTE_KDP_PREEMPT_DEFAULT + /* reschedule out for a while */ + schedule_timeout_interruptible( + usecs_to_jiffies(KDP_KTHREAD_RESCHEDULE_INTERVAL)); +#endif + } + + return 0; +} + +static int kdp_thread_multiple(void *param) +{ + int j; + struct kdp_dev *dev = (struct kdp_dev *)param; + + while (!kthread_should_stop()) { + for (j = 0; j < KDP_RX_LOOP_NUM; j++) + kdp_net_rx(dev); + +#ifdef RTE_KDP_PREEMPT_DEFAULT + schedule_timeout_interruptible( + usecs_to_jiffies(KDP_KTHREAD_RESCHEDULE_INTERVAL)); +#endif + } + + return 0; +} + +static void kdp_setup(struct kdp_dev *kdp, + struct rte_kdp_device_info *info) +{ + kdp->port_id = info->port_id; + kdp->core_id = info->core_id; + strncpy(kdp->name, info->name, RTE_KDP_NAMESIZE); + + /* Translate user space info into kernel space info */ + kdp->tx_q = phys_to_virt(info->tx_phys); + kdp->rx_q = phys_to_virt(info->rx_phys); + kdp->alloc_q = phys_to_virt(info->alloc_phys); + kdp->free_q = phys_to_virt(info->free_phys); + + kdp->mbuf_kva = phys_to_virt(info->mbuf_phys); + kdp->mbuf_va = info->mbuf_va; + + kdp->mbuf_size = info->mbuf_size; + + KDP_PRINT("tx_phys: 0x%016llx, tx_q addr: 0x%p\n", + (unsigned long long) info->tx_phys, kdp->tx_q); + KDP_PRINT("rx_phys: 0x%016llx, rx_q addr: 0x%p\n", + (unsigned long long) info->rx_phys, kdp->rx_q); + KDP_PRINT("alloc_phys: 0x%016llx, 
alloc_q addr: 0x%p\n", + (unsigned long long) info->alloc_phys, kdp->alloc_q); + KDP_PRINT("free_phys: 0x%016llx, free_q addr: 0x%p\n", + (unsigned long long) info->free_phys, kdp->free_q); + KDP_PRINT("mbuf_phys: 0x%016llx, mbuf_kva: 0x%p\n", + (unsigned long long) info->mbuf_phys, kdp->mbuf_kva); + KDP_PRINT("mbuf_va: 0x%p\n", info->mbuf_va); + KDP_PRINT("mbuf_size: %u\n", kdp->mbuf_size); +} + +static int create_kthread(struct kdp_dev *kdp, + struct rte_kdp_device_info *info) +{ + /** + * Create a new kernel thread for multiple mode, set its core affinity, + * and finally wake it up. + */ + if (multiple_kthread_on) { + kdp->pthread = kthread_create(kdp_thread_multiple, + (void *)kdp, "kdp_%s", kdp->name); + if (IS_ERR(kdp->pthread)) + return -ECANCELED; + + if (info->force_bind) + kthread_bind(kdp->pthread, kdp->core_id); + + wake_up_process(kdp->pthread); + + return 0; + } + + /* single thread */ + if (kdp_kthread == NULL) { + KDP_PRINT("Single kernel thread for all KDP devices\n"); + + /* Create kernel thread for RX */ + kdp_kthread = kthread_run(kdp_thread_single, NULL, + "kdp_single"); + if (IS_ERR(kdp_kthread)) { + KDP_ERR("Unable to create kernel thread\n"); + return PTR_ERR(kdp_kthread); + } + } + + return 0; +} + +static int kdp_net_newlink(struct net *net, struct net_device *dev, + struct nlattr *tb[], struct nlattr *data[]) +{ + struct rte_kdp_device_info dev_info; + struct kdp_dev *kdp; + int ret; + + kdp = netdev_priv(dev); + + if (data && data[IFLA_KDP_PORTID]) + kdp->port_id = nla_get_u8(data[IFLA_KDP_PORTID]); + else + goto error_free; + + if (data && data[IFLA_KDP_DEVINFO]) + memcpy(&dev_info, nla_data(data[IFLA_KDP_DEVINFO]), + sizeof(struct rte_kdp_device_info)); + else + goto error_free; + + /** + * Check if the cpu core id is valid for binding, + * for multiple kernel thread mode.
+ */ + if (multiple_kthread_on && dev_info.force_bind && + !cpu_online(dev_info.core_id)) { + KDP_ERR("cpu %u is not online\n", dev_info.core_id); + goto error_free; + } + + kdp->net_dev = dev; + kdp_setup(kdp, &dev_info); + + ret = register_netdevice(dev); + if (ret < 0) + goto error_free; + + ret = create_kthread(kdp, &dev_info); + if (ret < 0) + goto error_unregister; + + down_write(&kdp_list_lock); + list_add(&kdp->list, &kdp_list_head); + up_write(&kdp_list_lock); + + return 0; + +error_unregister: + unregister_netdevice(dev); /* RTNL is already held in ->newlink() */ +error_free: + /* a never-registered dev is freed by rtnl_newlink() on error */ + return -EINVAL; +} + +static void single_kthread_stop(void) +{ + /* Stop kernel thread for single mode */ + if (multiple_kthread_on == 0 && kdp_kthread != NULL) { + kthread_stop(kdp_kthread); + kdp_kthread = NULL; + } +} + +static void multiple_kthread_stop(struct kdp_dev *kdp) +{ + /* Stop kernel thread for multiple mode */ + if (multiple_kthread_on && kdp->pthread != NULL) { + kthread_stop(kdp->pthread); + kdp->pthread = NULL; + } +} + +static void kdp_net_dellink(struct net_device *dev, struct list_head *head) +{ + struct kdp_dev *kdp; + + kdp = netdev_priv(dev); + + down_write(&kdp_list_lock); + list_del(&kdp->list); + up_write(&kdp_list_lock); + + multiple_kthread_stop(kdp); + + down_write(&kdp_list_lock); + if (list_empty(&kdp_list_head)) + single_kthread_stop(); + up_write(&kdp_list_lock); + + unregister_netdevice_queue(dev, head); +} + +static struct rtnl_link_ops kdp_link_ops __read_mostly = { + .kind = KDP_DEVICE, + .priv_size = sizeof(struct kdp_dev), + .setup = kdp_net_setup, + .maxtype = IFLA_KDP_MAX, + .newlink = kdp_net_newlink, + .dellink = kdp_net_dellink, +}; + +static int __init +kdp_parse_kthread_mode(void) +{ + if (!kthread_mode) + return 0; + + if (strcmp(kthread_mode, "single") == 0) + return 0; + else if (strcmp(kthread_mode, "multiple") == 0) + multiple_kthread_on = 1; + else + return -1; + + return 0; +} + +static void kdp_net_config_lo_mode(char *lo_str) +{ + if (!lo_str) { +
KDP_PRINT("loopback disabled\n"); + return; + } + + if (!strcmp(lo_str, "lo_mode_none")) + KDP_PRINT("loopback disabled\n"); + else if (!strcmp(lo_str, "lo_mode_fifo")) { + KDP_PRINT("loopback mode=lo_mode_fifo enabled\n"); + kdp_net_rx_func = kdp_net_rx_lo_fifo; + } else if (!strcmp(lo_str, "lo_mode_fifo_skb")) { + KDP_PRINT("loopback mode=lo_mode_fifo_skb enabled\n"); + kdp_net_rx_func = kdp_net_rx_lo_fifo_skb; + } else + KDP_PRINT("Unrecognized parameter, loopback disabled\n"); +} + +static int __init kdp_init(void) +{ + if (kdp_parse_kthread_mode() < 0) { + KDP_ERR("Invalid parameter for kthread_mode\n"); + return -EINVAL; + } + + /* Configure the lo mode according to the input parameter */ + kdp_net_config_lo_mode(lo_mode); + + init_rwsem(&kdp_list_lock); + INIT_LIST_HEAD(&kdp_list_head); + + return rtnl_link_register(&kdp_link_ops); +} +module_init(kdp_init); + +static void kdp_release(void) +{ + struct kdp_dev *kdp, *n; + + single_kthread_stop(); + + down_write(&kdp_list_lock); + list_for_each_entry_safe(kdp, n, &kdp_list_head, list) { + multiple_kthread_stop(kdp); + list_del(&kdp->list); + } + up_write(&kdp_list_lock); +} + +static void __exit kdp_exit(void) +{ + kdp_release(); + rtnl_link_unregister(&kdp_link_ops); +} +module_exit(kdp_exit); + +module_param(lo_mode, charp, S_IRUGO | S_IWUSR); +MODULE_PARM_DESC(lo_mode, +"KDP loopback mode (default=lo_mode_none):\n" +" lo_mode_none Kernel loopback disabled\n" +" lo_mode_fifo Enable kernel loopback with fifo\n" +" lo_mode_fifo_skb Enable kernel loopback with fifo and skb buffer\n" +"\n" +); + +module_param(kthread_mode, charp, S_IRUGO); +MODULE_PARM_DESC(kthread_mode, +"Kernel thread mode (default=single):\n" +" single Single kernel thread mode enabled.\n" +" multiple Multiple kernel thread mode enabled.\n" +"\n" +); + +MODULE_LICENSE("Dual BSD/GPL"); +MODULE_AUTHOR("Intel Corporation"); +MODULE_DESCRIPTION("Kernel Module for managing kdp devices"); -- 2.5.0
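Editor's note: the two module parameters handled above (kthread_mode and lo_mode) follow a simple string-to-mode mapping. The userspace sketch below restates that mapping so it can be exercised outside the kernel. It is not part of the patch: the names parse_kthread_mode, parse_lo_mode, enum lo_choice and params_selftest are hypothetical, and the enum stands in for the rx function pointers that the module actually assigns. Unlike kdp_parse_kthread_mode(), the sketch writes its result through an out-parameter and resets it on every call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the rx handler selected by lo_mode. */
enum lo_choice { LO_NONE, LO_FIFO, LO_FIFO_SKB };

/*
 * Same decision table as kdp_parse_kthread_mode(): an unset parameter or
 * "single" keeps single-thread mode, "multiple" enables one thread per
 * device, and any other string is rejected with -1.
 */
static int parse_kthread_mode(const char *mode, int *multiple_on)
{
	*multiple_on = 0;
	if (mode == NULL || strcmp(mode, "single") == 0)
		return 0;
	if (strcmp(mode, "multiple") == 0) {
		*multiple_on = 1;
		return 0;
	}
	return -1;
}

/*
 * Same decision table as kdp_net_config_lo_mode(): unknown strings are not
 * an error, they fall back to loopback disabled.
 */
static enum lo_choice parse_lo_mode(const char *lo_str)
{
	if (lo_str == NULL || strcmp(lo_str, "lo_mode_none") == 0)
		return LO_NONE;
	if (strcmp(lo_str, "lo_mode_fifo") == 0)
		return LO_FIFO;
	if (strcmp(lo_str, "lo_mode_fifo_skb") == 0)
		return LO_FIFO_SKB;
	return LO_NONE;
}

/* Exercises every branch of both parsers. */
static void params_selftest(void)
{
	int on = -1;

	assert(parse_kthread_mode(NULL, &on) == 0 && on == 0);
	assert(parse_kthread_mode("single", &on) == 0 && on == 0);
	assert(parse_kthread_mode("multiple", &on) == 0 && on == 1);
	assert(parse_kthread_mode("both", &on) == -1 && on == 0);

	assert(parse_lo_mode(NULL) == LO_NONE);
	assert(parse_lo_mode("lo_mode_none") == LO_NONE);
	assert(parse_lo_mode("lo_mode_fifo") == LO_FIFO);
	assert(parse_lo_mode("lo_mode_fifo_skb") == LO_FIFO_SKB);
	assert(parse_lo_mode("typo") == LO_NONE);
}
```

The asymmetry is worth noting when loading the module: a bad kthread_mode value makes kdp_init() fail with -EINVAL, while a bad lo_mode value only logs a message and leaves the normal rx path in place.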