* [dpdk-dev] [PATCH 0/3] Support for Netronome's NFP-6xxx card
From: Alejandro.Lucero @ 2015-10-02 11:25 UTC (permalink / raw)
To: dev
From: "Alejandro.Lucero" <alejandro.lucero@netronome.com>
Alejandro.Lucero (3):
This patch adds a PMD driver for Netronome NFP PCI cards.
This patch adds a new UIO driver for Netronome NFP PCI cards.
Modifying configuration scripts for Netronome's nfp_uio driver.
config/common_linuxapp | 6 +
doc/guides/nics/nfp.rst | 248 +++
drivers/net/Makefile | 1 +
drivers/net/nfp/Makefile | 88 +
drivers/net/nfp/nfp_net.c | 2480 +++++++++++++++++++++++++++++
drivers/net/nfp/nfp_net_ctrl.h | 294 ++++
drivers/net/nfp/nfp_net_logs.h | 76 +
drivers/net/nfp/nfp_net_pmd.h | 415 +++++
lib/librte_eal/common/include/rte_pci.h | 1 +
lib/librte_eal/linuxapp/Makefile | 3 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 4 +
lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 2 +-
lib/librte_eal/linuxapp/nfp_uio/Makefile | 53 +
lib/librte_eal/linuxapp/nfp_uio/nfp_uio.c | 497 ++++++
lib/librte_ether/rte_ethdev.c | 1 +
mk/rte.app.mk | 1 +
tools/dpdk_nic_bind.py | 8 +-
tools/setup.sh | 122 +-
18 files changed, 4270 insertions(+), 30 deletions(-)
create mode 100644 doc/guides/nics/nfp.rst
create mode 100644 drivers/net/nfp/Makefile
create mode 100644 drivers/net/nfp/nfp_net.c
create mode 100644 drivers/net/nfp/nfp_net_ctrl.h
create mode 100644 drivers/net/nfp/nfp_net_logs.h
create mode 100644 drivers/net/nfp/nfp_net_pmd.h
create mode 100644 lib/librte_eal/linuxapp/nfp_uio/Makefile
create mode 100644 lib/librte_eal/linuxapp/nfp_uio/nfp_uio.c
--
1.7.9.5
* [dpdk-dev] [PATCH 1/3] This patch adds a PMD driver for Netronome NFP PCI cards.
From: Alejandro.Lucero @ 2015-10-02 11:25 UTC (permalink / raw)
To: dev
From: "Alejandro.Lucero" <alejandro.lucero@netronome.com>
Signed-off-by: Alejandro.Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rolf.Neugebauer <rolf.neugebauer@netronome.com>
---
config/common_linuxapp | 6 +
doc/guides/nics/nfp.rst | 248 ++++
drivers/net/Makefile | 1 +
drivers/net/nfp/Makefile | 88 ++
drivers/net/nfp/nfp_net.c | 2480 ++++++++++++++++++++++++++++++++++++++
drivers/net/nfp/nfp_net_ctrl.h | 294 +++++
drivers/net/nfp/nfp_net_logs.h | 76 ++
drivers/net/nfp/nfp_net_pmd.h | 415 +++++++
lib/librte_eal/linuxapp/Makefile | 3 +
mk/rte.app.mk | 1 +
10 files changed, 3612 insertions(+)
create mode 100644 doc/guides/nics/nfp.rst
create mode 100644 drivers/net/nfp/Makefile
create mode 100644 drivers/net/nfp/nfp_net.c
create mode 100644 drivers/net/nfp/nfp_net_ctrl.h
create mode 100644 drivers/net/nfp/nfp_net_logs.h
create mode 100644 drivers/net/nfp/nfp_net_pmd.h
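
Note on nfp_qcp_ptr_add() in nfp_net.c: the helper splits an increment
larger than NFP_QCP_MAX_ADD (0x7f) into several register writes. The
standalone sketch below only illustrates that splitting and is not part of
the patch. The names fake_writel(), writes[], nb_writes and
qcp_ptr_add_sketch() are hypothetical stand-ins introduced here; the real
helper writes to the queue controller through nn_writel().

```c
#include <assert.h>
#include <stdint.h>

#define QCP_MAX_ADD 0x7f   /* same limit as NFP_QCP_MAX_ADD in the patch */
#define MAX_RECORDED 64    /* hypothetical bound for this sketch only */

/* Hypothetical stand-in for nn_writel(): records each write in memory
 * instead of touching a device register. */
static uint32_t writes[MAX_RECORDED];
static unsigned int nb_writes;

static void fake_writel(uint32_t val)
{
    assert(nb_writes < MAX_RECORDED);
    writes[nb_writes++] = val;
}

/* Same loop shape as nfp_qcp_ptr_add(): emit full-size chunks while the
 * remaining value exceeds the per-transaction limit, then the remainder. */
static void qcp_ptr_add_sketch(uint32_t val)
{
    while (val > QCP_MAX_ADD) {
        fake_writel(QCP_MAX_ADD);
        val -= QCP_MAX_ADD;
    }
    fake_writel(val);
}
```

Under these assumptions, calling qcp_ptr_add_sketch(300) records three
writes (0x7f, 0x7f and 46), while a value of exactly 0x7f needs only one.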
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 0de43d5..d8d6384 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -108,6 +108,7 @@ CONFIG_RTE_LIBEAL_USE_HPET=n
CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n
CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
CONFIG_RTE_EAL_IGB_UIO=y
+CONFIG_RTE_EAL_NFP_UIO=y
CONFIG_RTE_EAL_VFIO=y
CONFIG_RTE_MALLOC_DEBUG=n
@@ -238,6 +239,11 @@ CONFIG_RTE_LIBRTE_ENIC_PMD=y
CONFIG_RTE_LIBRTE_ENIC_DEBUG=n
#
+# Compile burst-oriented Netronome PMD driver
+#
+CONFIG_RTE_LIBRTE_NFP_PMD=y
+
+#
# Compile burst-oriented VIRTIO PMD driver
#
CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst
new file mode 100644
index 0000000..df5a746
--- /dev/null
+++ b/doc/guides/nics/nfp.rst
@@ -0,0 +1,248 @@
+.. BSD LICENSE
+ Copyright(c) 2015 Netronome Systems, Inc. All rights reserved.
+ All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions
+ are met:
+
+ * Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ * Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in
+ the documentation and/or other materials provided with the
+ distribution.
+ * Neither the name of Intel Corporation nor the names of its
+ contributors may be used to endorse or promote products derived
+ from this software without specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+1. Intro
+========
+
+Netronome's sixth generation of flow processors packs 216 programmable
+cores and over 100 hardware accelerators that uniquely combine packet,
+flow, security and content processing in a single device that scales
+up to 400 Gbps.
+
+This document explains how to use DPDK with the Netronome Poll Mode
+Driver (PMD) supporting Netronome's Network Flow Processor 6xxx
+(NFP-6xxx).
+
+Currently the driver supports virtual functions (VFs) only.
+
+2. Dependencies
+===============
+
+Before using Netronome's DPDK PMD, some NFP-6xxx configuration that is
+not related to DPDK is required. The system requires
+installation of Netronome's BSP (Board Support Package) which includes
+Linux drivers, programs and libraries.
+
+If you have an NFP-6xxx device you should already have the code and
+documentation for doing this configuration. Contact
+support@netronome.com to obtain the latest available firmware.
+
+The NFP Linux kernel drivers (including the required PF driver for the
+NFP) are available on GitHub at
+https://github.com/Netronome/nfp-drv-kmods along with build
+instructions.
+
+DPDK runs in userspace and PMDs use the Linux kernel UIO interface to
+allow access to physical devices from userspace. The NFP PMD requires
+a separate UIO driver, nfp_uio, to perform correct
+initialization. This driver is part of the DPDK source tree and is
+equivalent to Intel's igb_uio driver.
+
+3. Building the software
+========================
+
+Netronome's PMD code is provided in the drivers/net/nfp directory and
+nfp_uio is present in the lib/librte_eal/linuxapp/nfp_uio directory. Both
+are part of the DPDK build if the common_linuxapp configuration file is
+used. If you use another configuration file and want to have NFP support
+just add:
+
+CONFIG_RTE_EAL_NFP_UIO=y
+CONFIG_RTE_LIBRTE_NFP_PMD=y
+
+Once DPDK is built all the DPDK apps and examples include support for
+the NFP PMD. The nfp_uio.ko module will be in the build/kmod directory or
+in the directory specified when building DPDK.
+
+
+4. System configuration
+=======================
+
+Using the NFP PMD is no different from using other PMDs. The usual steps are:
+
+1) Configure hugepages
+
+ All major Linux distributions have the hugepages functionality
+ enabled by default, and the system normally uses it for
+ transparent hugepages. DPDK, however, needs some hugepages
+ to be reserved explicitly and accessed through the
+ hugetlbfs file system. First the virtual file system needs to be
+ mounted:
+
+ mount -t hugetlbfs none /mnt/hugetlbfs
+
+ The command uses the common mount point for this file system and it
+ needs to be created if necessary.
+
+ Configuring hugepages is performed via sysfs:
+
+ /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+ This sysfs file is used to specify the number of hugepages to reserve.
+ For example:
+
+ echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+ This will reserve 2GB of memory using 1024 2MB hugepages. The file
+ may be read to see if the operation was performed correctly:
+
+ cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+ The number of unused hugepages may also be inspected.
+ Before executing the DPDK app it should match the value of
+ nr_hugepages.
+
+ cat /sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages
+
+ The hugepages reservation should be performed at system
+ initialisation and it is usual to use a kernel parameter for
+ configuration. If the reservation is attempted on a busy
+ system it will likely fail. Reserving memory for hugepages may
+ be done by adding the following to the grub kernel command line:
+
+ default_hugepagesz=2M hugepagesz=2M hugepages=1024
+
+ This will reserve 2 GB of memory using 2 MB hugepages.
+
+ Finally, for a NUMA system the allocation needs to be made on the
+ correct NUMA node. In a DPDK app there is a master core which will
+ (usually) perform memory allocation. It is important that some of the
+ hugepages are reserved on the NUMA memory node where the network
+ device is attached. This is because of a restriction in DPDK by which
+ TX and RX descriptor rings must be created on the master core.
+ Per-node allocation of hugepages may be inspected and controlled
+ using sysfs.
+
+ For example:
+
+ cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
+
+ For a NUMA system there will be a specific hugepage directory
+ per node allowing control of hugepage reservation. A common
+ problem may occur when hugepages reservation is performed
+ after the system has been working for some time. Configuration
+ using the global sysfs hugepage interface will succeed but the
+ per-node allocations may be unsatisfactory.
+
+ The number of hugepages that need to be reserved depends on how the
+ app uses TX and RX descriptors, and packet mbufs.
+
+2) Enable SR-IOV on the NFP-6xxx device
+
+ The current NFP PMD works with Virtual Functions (VFs) on an
+ NFP device. Make sure that one of the Physical Function (PF)
+ drivers from the above GitHub repository is installed and
+ loaded.
+
+ Virtual Functions need to be enabled before they can be used
+ with the PMD. Before enabling the VFs it is useful to obtain
+ information about the current NFP PCI device detected by the
+ system:
+
+ lspci -d19ee:
+
+ Now, for example, configure two virtual functions on an NFP-6xxx device
+ whose PCI system identity is "0000:03:00.0":
+
+ echo 2 > /sys/bus/pci/devices/0000:03:00.0/sriov_numvfs
+
+ The result of this command may be shown using lspci again:
+
+ lspci -d19ee: -k
+
+ Two new PCI devices should appear in the output of the above
+ command. The -k option shows the device driver, if any, that
+ devices are bound to. Depending on the modules loaded at this
+ point the new PCI devices may be bound to the nfp_netvf driver.
+
+3) To install the uio kernel module (manually)
+
+ All major Linux distributions have support for this kernel module so
+ it is straightforward to install it:
+
+ modprobe uio
+
+ The module should now be listed by the lsmod command.
+
+4) To install the nfp_uio kernel module (manually)
+
+ This module supports NFP-6xxx devices through the UIO interface.
+
+ After compilation the module should be in the build kmod directory or
+ wherever the build directory is configured.
+
+ cd build/kmod
+ insmod ./nfp_uio.ko
+
+ The module should now be listed by the lsmod command.
+
+ Depending on which NFP modules are loaded, nfp_uio may be
+ automatically bound to the NFP PCI devices by the system. Otherwise
+ the binding needs to be done explicitly. This is the case when
+ nfp_netvf, the Linux kernel driver for NFP VFs, was loaded when VFs
+ were created. As described later in this document this
+ configuration may also be performed using scripts provided by DPDK.
+
+ First the device needs to be unbound, for example from the
+ nfp_netvf driver:
+
+ echo 0000:03:08.0 > /sys/bus/pci/devices/0000:03:08.0/driver/unbind
+
+ lspci -d19ee: -k
+
+ The output of lspci should now show that 0000:03:08.0 is not bound to
+ any driver.
+
+ The next step is to add the NFP PCI ID to the NFP UIO driver:
+
+ echo 19ee 6003 > /sys/bus/pci/drivers/nfp_uio/new_id
+
+ And then to bind the device to the nfp_uio driver:
+
+ echo 0000:03:08.0 > /sys/bus/pci/drivers/nfp_uio/bind
+
+ lspci -d19ee: -k
+
+ lspci should now show the device bound to the nfp_uio driver.
+
+5) Using tools from DPDK source to install and bind modules
+
+ DPDK provides scripts which are useful for installing the UIO
+ modules and for binding the right device to those modules avoiding
+ doing so manually.
+
+ In the tools directory of the DPDK source there are two scripts:
+
+ setup.sh
+ dpdk_nic_bind.py
+
+ Configuration may be performed by running setup.sh which invokes
+ dpdk_nic_bind.py as needed. Executing setup.sh will display a menu
+ of configuration options.
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 5ebf963..bc08591 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -48,6 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += ring
DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio
DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3
DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt
+DIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += nfp
include $(RTE_SDK)/mk/rte.sharelib.mk
include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/nfp/Makefile b/drivers/net/nfp/Makefile
new file mode 100644
index 0000000..ef74e27
--- /dev/null
+++ b/drivers/net/nfp/Makefile
@@ -0,0 +1,88 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_nfp.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+#
+# Add extra flags for base driver files (also known as shared code)
+# to disable warnings
+#
+ifeq ($(CC), icc)
+CFLAGS_BASE_DRIVER = -wd593
+else ifeq ($(CC), clang)
+CFLAGS_BASE_DRIVER += -Wno-sign-compare
+CFLAGS_BASE_DRIVER += -Wno-unused-value
+CFLAGS_BASE_DRIVER += -Wno-unused-parameter
+CFLAGS_BASE_DRIVER += -Wno-strict-aliasing
+CFLAGS_BASE_DRIVER += -Wno-format
+CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers
+CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast
+CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
+else
+CFLAGS_BASE_DRIVER = -Wno-sign-compare
+CFLAGS_BASE_DRIVER += -Wno-unused-value
+CFLAGS_BASE_DRIVER += -Wno-unused-parameter
+CFLAGS_BASE_DRIVER += -Wno-strict-aliasing
+CFLAGS_BASE_DRIVER += -Wno-format
+CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers
+CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast
+CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
+CFLAGS_BASE_DRIVER += -Wno-format-security
+
+ifeq ($(shell test $(GCC_VERSION) -ge 44 && echo 1), 1)
+CFLAGS_BASE_DRIVER += -Wno-unused-but-set-variable
+endif
+
+endif
+OBJS_BASE_DRIVER=$(patsubst %.c,%.o,$(notdir $(wildcard $(RTE_SDK)/lib/librte_pmd_nfp/*.c)))
+$(foreach obj, $(OBJS_BASE_DRIVER), $(eval CFLAGS_$(obj)+=$(CFLAGS_BASE_DRIVER)))
+
+VPATH += $(RTE_SDK)/drivers/net/nfp/
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += nfp_net.c
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_eal lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_mempool lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_net lib/librte_malloc
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
new file mode 100644
index 0000000..f1cb2d6
--- /dev/null
+++ b/drivers/net/nfp/nfp_net.c
@@ -0,0 +1,2480 @@
+/*
+ * Copyright (c) 2014, 2015 Netronome Systems, Inc.
+ * All rights reserved.
+ *
+ * Small portions derived from code Copyright(c) 2010-2015 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ * this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution
+ *
+ * 3. Neither the name of the copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived from this
+ * software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+/*
+ * vim:shiftwidth=8:noexpandtab
+ *
+ * @file dpdk/pmd/nfp_net.c
+ *
+ * Netronome vNIC DPDK Poll-Mode Driver: Main entry point
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/socket.h>
+#include <sys/io.h>
+#include <assert.h>
+#include <time.h>
+#include <math.h>
+#include <inttypes.h>
+
+#include <rte_byteorder.h>
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_debug.h>
+#include <rte_ethdev.h>
+#include <rte_dev.h>
+#include <rte_ether.h>
+#include <rte_malloc.h>
+#include <rte_memzone.h>
+#include <rte_mempool.h>
+#include <rte_version.h>
+#include <rte_string_fns.h>
+#include <rte_alarm.h>
+
+#include "nfp_net_pmd.h"
+#include "nfp_net_logs.h"
+#include "nfp_net_ctrl.h"
+
+/*
+ * Prototypes
+ */
+static int nfp_net_init(struct rte_eth_dev *eth_dev);
+static int nfp_net_configure(struct rte_eth_dev *dev);
+static int nfp_net_start(struct rte_eth_dev *dev);
+static void nfp_net_stop(struct rte_eth_dev *dev);
+static void nfp_net_close(struct rte_eth_dev *dev);
+
+static void nfp_net_promisc_enable(struct rte_eth_dev *dev);
+static void nfp_net_promisc_disable(struct rte_eth_dev *dev);
+
+static int nfp_net_link_update(struct rte_eth_dev *dev, int wait_to_complete);
+static void nfp_net_stats_get(struct rte_eth_dev *dev,
+ struct rte_eth_stats *stats);
+static void nfp_net_stats_reset(struct rte_eth_dev *dev);
+
+static void nfp_net_infos_get(struct rte_eth_dev *dev,
+ struct rte_eth_dev_info *dev_info);
+static int nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
+static int nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq);
+static uint32_t nfp_net_rx_queue_count(struct rte_eth_dev *dev,
+ uint16_t queue_idx);
+static int nfp_net_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+ uint16_t nb_desc, unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf,
+ struct rte_mempool *mp);
+static void nfp_net_rx_queue_release(void *rxq);
+static int nfp_net_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+ uint16_t nb_desc, unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf);
+static void nfp_net_tx_queue_release(void *txq);
+
+static int nfp_net_tx_free_bufs(struct nfp_net_txq *txq);
+static uint16_t nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts);
+static uint16_t nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts);
+
+static void nfp_net_dev_interrupt_handler(struct rte_intr_handle *handle,
+ void *param);
+static void nfp_net_dev_interrupt_delayed_handler(void *param);
+
+/*
+ * The offset of the queue controller queues in the PCIe Target. These
+ * happen to be at the same offset on the NFP6000 and the NFP3200 so
+ * we use a single macro here.
+ */
+#define NFP_PCIE_QUEUE(_q) (0x80000 + (0x800 * ((_q) & 0xff)))
+
+/* Maximum value which can be added to a queue with one transaction */
+#define NFP_QCP_MAX_ADD 0x7f
+
+#define RTE_MBUF_DMA_ADDR_DEFAULT(mb) \
+ (uint64_t) ((mb)->buf_physaddr + RTE_PKTMBUF_HEADROOM)
+
+/* nfp_qcp_ptr - Read or Write Pointer of a queue */
+enum nfp_qcp_ptr {
+ NFP_QCP_READ_PTR = 0,
+ NFP_QCP_WRITE_PTR
+};
+
+/**
+ * nfp_qcp_ptr_add - Add the value to the selected pointer of a queue
+ * @q: Base address for queue structure
+ * @ptr: Add to the Read or Write pointer
+ * @val: Value to add to the queue pointer
+ *
+ * If @val is greater than @NFP_QCP_MAX_ADD multiple writes are performed.
+ */
+static inline void
+nfp_qcp_ptr_add(__u8 *q, enum nfp_qcp_ptr ptr, __u32 val)
+{
+ __u32 off;
+
+ if (ptr == NFP_QCP_READ_PTR)
+ off = NFP_QCP_QUEUE_ADD_RPTR;
+ else
+ off = NFP_QCP_QUEUE_ADD_WPTR;
+
+ while (val > NFP_QCP_MAX_ADD) {
+ nn_writel(rte_cpu_to_le_32(NFP_QCP_MAX_ADD), q + off);
+ val -= NFP_QCP_MAX_ADD;
+ }
+
+ nn_writel(rte_cpu_to_le_32(val), q + off);
+}
+
+/**
+ * nfp_qcp_read - Read the current Read/Write pointer value for a queue
+ * @q: Base address for queue structure
+ * @ptr: Read or Write pointer
+ */
+static inline __u32
+nfp_qcp_read(__u8 *q, enum nfp_qcp_ptr ptr)
+{
+ __u32 off;
+ __u32 val;
+
+ if (ptr == NFP_QCP_READ_PTR)
+ off = NFP_QCP_QUEUE_STS_LO;
+ else
+ off = NFP_QCP_QUEUE_STS_HI;
+
+ val = rte_le_to_cpu_32(nn_readl(q + off));
+
+ if (ptr == NFP_QCP_READ_PTR)
+ return val & NFP_QCP_QUEUE_STS_LO_READPTR_mask;
+ else
+ return val & NFP_QCP_QUEUE_STS_HI_WRITEPTR_mask;
+}
+
+/*
+ * Functions to read/write from/to Config BAR
+ * Performs any endian conversion necessary.
+ */
+static inline __u8
+nn_cfg_readb(struct nfp_net_hw *hw, int off)
+{
+ return nn_readb(hw->ctrl_bar + off);
+}
+
+static inline void
+nn_cfg_writeb(struct nfp_net_hw *hw, int off, __u8 val)
+{
+ nn_writeb(val, hw->ctrl_bar + off);
+}
+
+static inline __u32
+nn_cfg_readl(struct nfp_net_hw *hw, int off)
+{
+ return rte_le_to_cpu_32(nn_readl(hw->ctrl_bar + off));
+}
+
+static inline void
+nn_cfg_writel(struct nfp_net_hw *hw, int off, __u32 val)
+{
+ nn_writel(rte_cpu_to_le_32(val), hw->ctrl_bar + off);
+}
+
+static inline __u64
+nn_cfg_readq(struct nfp_net_hw *hw, int off)
+{
+ return rte_le_to_cpu_64(nn_readq(hw->ctrl_bar + off));
+}
+
+static inline void
+nn_cfg_writeq(struct nfp_net_hw *hw, int off, __u64 val)
+{
+ nn_writeq(rte_cpu_to_le_64(val), hw->ctrl_bar + off);
+}
+
+/*
+ * Reserve the memzone backing a hardware ring. If a zone with the
+ * same name already exists, it is looked up and reused.
+ */
+static const struct rte_memzone *
+ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
+ uint16_t queue_id, uint32_t ring_size, int socket_id)
+{
+ char z_name[RTE_MEMZONE_NAMESIZE];
+ const struct rte_memzone *mz;
+
+ snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
+ dev->driver->pci_drv.name,
+ ring_name, dev->data->port_id, queue_id);
+
+ mz = rte_memzone_lookup(z_name);
+ if (mz)
+ return mz;
+
+ return rte_memzone_reserve_aligned(z_name, ring_size, socket_id, 0,
+ NFP_MEMZONE_ALIGN);
+}
+
+/**
+ * Atomically reads link status information from global structure rte_eth_dev.
+ *
+ * @param dev
+ * - Pointer to the structure rte_eth_dev to read from.
+ * @param link - Pointer to the buffer to be saved with the link status.
+ *
+ * @return
+ * - On success, zero.
+ * - On failure, negative value.
+ */
+static inline int
+nfp_net_dev_atomic_read_link_status(struct rte_eth_dev *dev,
+ struct rte_eth_link *link)
+{
+ struct rte_eth_link *dst = link;
+ struct rte_eth_link *src = &(dev->data->dev_link);
+
+ if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
+ *(uint64_t *)src) == 0)
+ return -1;
+
+ return 0;
+}
+
+/**
+ * Atomically writes the link status information into global
+ * structure rte_eth_dev.
+ *
+ * @param dev
+ * - Pointer to the structure rte_eth_dev to write to.
+ * @param link - Pointer to the link status to be written.
+ *
+ * @return
+ * - On success, zero.
+ * - On failure, negative value.
+ */
+static inline int
+nfp_net_dev_atomic_write_link_status(struct rte_eth_dev *dev,
+ struct rte_eth_link *link)
+{
+ struct rte_eth_link *dst = &(dev->data->dev_link);
+ struct rte_eth_link *src = link;
+
+ if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
+ *(uint64_t *)src) == 0)
+ return -1;
+
+ return 0;
+}
+
+static void
+nfp_net_rx_queue_release_mbufs(struct nfp_net_rxq *rxq)
+{
+ unsigned i;
+
+ if (rxq->rxbufs == NULL)
+ return;
+
+ for (i = 0; i < rxq->rx_count; i++) {
+ if (rxq->rxbufs[i].mbuf != NULL) {
+ rte_pktmbuf_free_seg(rxq->rxbufs[i].mbuf);
+ rxq->rxbufs[i].mbuf = NULL;
+ }
+ }
+}
+
+static void
+nfp_net_rx_queue_release(void *rx_queue)
+{
+ struct nfp_net_rxq *rxq = rx_queue;
+ if (rxq != NULL) {
+ nfp_net_rx_queue_release_mbufs(rxq);
+ rte_free(rxq->rxbufs);
+ rte_free(rxq);
+ }
+}
+
+static void
+nfp_net_reset_rx_queue(struct nfp_net_rxq *rxq)
+{
+ nfp_net_rx_queue_release_mbufs(rxq);
+ rxq->wr_p = 0;
+ rxq->rd_p = 0;
+ rxq->nb_rx_hold = 0;
+}
+
+static void
+nfp_net_tx_queue_release_mbufs(struct nfp_net_txq *txq)
+{
+ unsigned i;
+
+ if (txq->txbufs == NULL)
+ return;
+
+ for (i = 0; i < txq->tx_count; i++) {
+ if (txq->txbufs[i].mbuf != NULL) {
+ rte_pktmbuf_free_seg(txq->txbufs[i].mbuf);
+ txq->txbufs[i].mbuf = NULL;
+ }
+ }
+}
+
+static void
+nfp_net_tx_queue_release(void *tx_queue)
+{
+ struct nfp_net_txq *txq = tx_queue;
+ if (txq != NULL) {
+ nfp_net_tx_queue_release_mbufs(txq);
+ rte_free(txq->txbufs);
+ rte_free(txq);
+ }
+}
+
+static void
+nfp_net_reset_tx_queue(struct nfp_net_txq *txq)
+{
+ nfp_net_tx_queue_release_mbufs(txq);
+ txq->wr_p = 0;
+ txq->rd_p = 0;
+ txq->tail = 0;
+}
+
+static int
+__nfp_net_reconfig(struct nfp_net_hw *hw, __u32 update)
+{
+ int cnt;
+ __u32 new;
+ struct timespec wait;
+
+ PMD_DRV_LOG(DEBUG, "Writing to the configuration queue (%p)...\n",
+ hw->qcp_cfg);
+
+ if (hw->qcp_cfg == NULL)
+ rte_panic("Bad configuration queue pointer\n");
+
+ nfp_qcp_ptr_add(hw->qcp_cfg, NFP_QCP_WRITE_PTR, 1);
+
+ wait.tv_sec = 0;
+ wait.tv_nsec = 1000000;
+
+ PMD_DRV_LOG(DEBUG, "Polling for update ack...\n");
+
+ /* Poll update field, waiting for NFP to ack the config */
+ for (cnt = 0; ; cnt++) {
+ new = nn_cfg_readl(hw, NFP_NET_CFG_UPDATE);
+ if (new == 0)
+ break;
+ if (new & NFP_NET_CFG_UPDATE_ERR) {
+ PMD_INIT_LOG(ERR, "Reconfig error: 0x%08x\n", new);
+ return -1;
+ }
+ if (cnt >= NFP_NET_POLL_TIMEOUT) {
+ PMD_INIT_LOG(ERR, "Reconfig timeout for 0x%08x after"
+ " %dms\n", update, cnt);
+ rte_panic("Exiting\n");
+ }
+ nanosleep(&wait, NULL); /* wait for 1ms */
+ }
+ PMD_DRV_LOG(DEBUG, "Ack DONE\n");
+ return 0;
+}
+
+/**
+ * Reconfigure the NIC
+ * @hw: device to reconfigure
+ * @ctrl: The value for the ctrl field in the BAR config
+ * @update: The value for the update field in the BAR config
+ *
+ * Write the update word to the BAR and ping the reconfig queue. Then poll
+ * until the firmware has acknowledged the update by zeroing the update word.
+ */
+static int
+nfp_net_reconfig(struct nfp_net_hw *hw, __u32 ctrl, __u32 update)
+{
+ int err;
+
+ PMD_DRV_LOG(DEBUG, "nfp_net_reconfig: ctrl=%08x update=%08x\n",
+ ctrl, update);
+
+ nn_cfg_writel(hw, NFP_NET_CFG_CTRL, ctrl);
+ nn_cfg_writel(hw, NFP_NET_CFG_UPDATE, update);
+
+ rte_wmb();
+
+ err = __nfp_net_reconfig(hw, update);
+
+ if (!err)
+ return 0;
+
+ /* Errors returned here are ones the caller can handle. On a
+ * timeout, rte_panic is called inside __nfp_net_reconfig */
+ PMD_INIT_LOG(ERR, "Error nfp_net reconfig for ctrl: %x update: %x\n",
+ ctrl, update);
+ return -EIO;
+}
+
+/* Configure an Ethernet device. This function must be invoked first
+ * before any other function in the Ethernet API. This function can
+ * also be re-invoked when a device is in the stopped state. */
+static int
+nfp_net_configure(struct rte_eth_dev *dev)
+{
+ struct rte_eth_conf *dev_conf;
+ struct rte_eth_rxmode *rxmode;
+ struct rte_eth_txmode *txmode;
+ uint32_t new_ctrl = 0;
+ uint32_t update = 0;
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ /* A DPDK app sends info about how many queues to use and how
+ * those queues need to be configured. This is used by the
+ * DPDK core and it makes sure no more queues than those
+ * advertised by the driver are requested. This function is
+ * called after that internal process */
+
+ PMD_INIT_LOG(DEBUG, "Configure\n");
+
+ dev_conf = &dev->data->dev_conf;
+ rxmode = &dev_conf->rxmode;
+ txmode = &dev_conf->txmode;
+
+ /* Checking TX mode */
+ if (txmode->mq_mode) {
+ PMD_INIT_LOG(INFO, "TX mq_mode DCB and VMDq not supported\n");
+ return -EINVAL;
+ }
+
+ /* Checking RX mode */
+ if (rxmode->mq_mode & ETH_MQ_RX_RSS) {
+ if (hw->cap & NFP_NET_CFG_CTRL_RSS) {
+ update = NFP_NET_CFG_UPDATE_RSS;
+ new_ctrl = NFP_NET_CFG_CTRL_RSS;
+ } else {
+ PMD_INIT_LOG(INFO, "RSS not supported\n");
+ return -EINVAL;
+ }
+ }
+
+ if (rxmode->split_hdr_size) {
+ PMD_INIT_LOG(INFO, "rxmode does not support split header\n");
+ return -EINVAL;
+ }
+
+ if (rxmode->hw_ip_checksum) {
+ if (hw->cap & NFP_NET_CFG_CTRL_RXCSUM)
+ new_ctrl |= NFP_NET_CFG_CTRL_RXCSUM;
+ else {
+ PMD_INIT_LOG(INFO, "RXCSUM not supported\n");
+ return -EINVAL;
+ }
+ }
+
+ if (rxmode->hw_vlan_filter) {
+ PMD_INIT_LOG(INFO, "VLAN filter not supported\n");
+ return -EINVAL;
+ }
+
+ if (rxmode->hw_vlan_strip) {
+ if (hw->cap & NFP_NET_CFG_CTRL_RXVLAN)
+ new_ctrl |= NFP_NET_CFG_CTRL_RXVLAN;
+ else {
+ PMD_INIT_LOG(INFO, "hw vlan strip not supported\n");
+ return -EINVAL;
+ }
+ }
+
+ if (rxmode->hw_vlan_extend) {
+ PMD_INIT_LOG(INFO, "VLAN extended not supported\n");
+ return -EINVAL;
+ }
+
+ /* Supporting VLAN insertion by default */
+ if (hw->cap & NFP_NET_CFG_CTRL_TXVLAN)
+ new_ctrl |= NFP_NET_CFG_CTRL_TXVLAN;
+
+ if (rxmode->jumbo_frame) {
+ /* this is handled in rte_eth_dev_configure */
+ }
+
+ if (rxmode->hw_strip_crc) {
+ PMD_INIT_LOG(INFO, "strip CRC not supported\n");
+ return -EINVAL;
+ }
+
+ if (rxmode->enable_scatter) {
+ PMD_INIT_LOG(INFO, "Scatter not supported\n");
+ return -EINVAL;
+ }
+
+ if (!new_ctrl)
+ return 0;
+
+ update |= NFP_NET_CFG_UPDATE_GEN;
+
+ nn_cfg_writel(hw, NFP_NET_CFG_CTRL, new_ctrl);
+ if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+ return -EIO;
+
+ hw->ctrl = new_ctrl;
+
+ return 0;
+}
+
+static void
+nfp_net_enable_queues(struct rte_eth_dev *dev)
+{
+ struct nfp_net_hw *hw;
+ uint64_t enabled_queues = 0;
+ int i;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ /* Enabling the required TX queues in the device */
+ for (i = 0; i < dev->data->nb_tx_queues; i++)
+ enabled_queues |= (1ULL << i);
+
+ nn_cfg_writeq(hw, NFP_NET_CFG_TXRS_ENABLE, enabled_queues);
+
+ enabled_queues = 0;
+
+ /* Enabling the required RX queues in the device */
+ for (i = 0; i < dev->data->nb_rx_queues; i++)
+ enabled_queues |= (1ULL << i);
+
+ nn_cfg_writeq(hw, NFP_NET_CFG_RXRS_ENABLE, enabled_queues);
+}
+
+static void
+nfp_net_disable_queues(struct rte_eth_dev *dev)
+{
+ struct nfp_net_hw *hw;
+ uint32_t new_ctrl, update = 0;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ nn_cfg_writeq(hw, NFP_NET_CFG_TXRS_ENABLE, 0);
+ nn_cfg_writeq(hw, NFP_NET_CFG_RXRS_ENABLE, 0);
+
+ new_ctrl = hw->ctrl & ~NFP_NET_CFG_CTRL_ENABLE;
+ update = NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING |
+ NFP_NET_CFG_UPDATE_MSIX;
+
+ if (hw->cap & NFP_NET_CFG_CTRL_RINGCFG)
+ new_ctrl &= ~NFP_NET_CFG_CTRL_RINGCFG;
+
+ /* If the reconfig fails, leave the hw state unchanged */
+ if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+ return;
+
+ hw->ctrl = new_ctrl;
+}
+
+static int
+nfp_net_rx_freelist_setup(struct rte_eth_dev *dev)
+{
+ int i;
+
+ for (i = 0; i < dev->data->nb_rx_queues; i++) {
+ if (nfp_net_rx_fill_freelist(dev->data->rx_queues[i]) < 0)
+ return -1;
+ }
+ return 0;
+}
+
+static void
+nfp_net_params_setup(struct nfp_net_hw *hw)
+{
+ uint32_t *mac_address;
+
+ nn_cfg_writel(hw, NFP_NET_CFG_MTU, hw->mtu);
+ nn_cfg_writel(hw, NFP_NET_CFG_FLBUFSZ, hw->flbufsz);
+
+ /* A MAC address is 6 bytes long; it is written to the device as
+ * two 32-bit words */
+ mac_address = (uint32_t *)(hw->mac_addr);
+
+ nn_cfg_writel(hw, NFP_NET_CFG_MACADDR,
+ rte_cpu_to_be_32(*mac_address));
+ nn_cfg_writel(hw, NFP_NET_CFG_MACADDR + 4,
+ rte_cpu_to_be_32(*(mac_address + 1)));
+}
+
+static void
+nfp_net_cfg_queue_setup(struct nfp_net_hw *hw)
+{
+ hw->qcp_cfg = hw->tx_bar + NFP_QCP_QUEUE_ADDR_SZ;
+}
+
+static int
+nfp_net_start(struct rte_eth_dev *dev)
+{
+ uint32_t new_ctrl, update = 0;
+ struct nfp_net_hw *hw;
+ int ret;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ PMD_INIT_LOG(DEBUG, "Start\n");
+
+ /* Disabling queues just in case... */
+ nfp_net_disable_queues(dev);
+
+ /* Writing configuration parameters in the device */
+ nfp_net_params_setup(hw);
+
+ /* Enabling the required queues in the device */
+ nfp_net_enable_queues(dev);
+
+ /* Enable device */
+ new_ctrl = hw->ctrl | NFP_NET_CFG_CTRL_ENABLE;
+ update = NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING |
+ NFP_NET_CFG_UPDATE_MSIX;
+
+ if (hw->cap & NFP_NET_CFG_CTRL_RINGCFG)
+ new_ctrl |= NFP_NET_CFG_CTRL_RINGCFG;
+
+ nn_cfg_writel(hw, NFP_NET_CFG_CTRL, new_ctrl);
+ if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+ return -EIO;
+
+ /* Allocating rte mbufs for the configured rx queues.
+ * This requires the queues to be enabled beforehand */
+ if (nfp_net_rx_freelist_setup(dev) < 0) {
+ ret = -ENOMEM;
+ goto error;
+ }
+
+ hw->ctrl = new_ctrl;
+
+ return 0;
+
+error:
+ /* An error returned by this function should make the app exit,
+ * with the system then releasing all the allocated memory,
+ * even memory coming from hugepages.
+ *
+ * The device could be enabled at this point with some queues
+ * ready for getting packets. This is the case if the call to
+ * nfp_net_rx_freelist_setup() succeeds for some queues but
+ * fails for subsequent queues.
+ *
+ * The app should exit anyway, but it is better to tell the
+ * device first. */
+ nfp_net_disable_queues(dev);
+
+ return ret;
+}
+
+/*
+ * Stop device: disable rx and tx functions to allow for reconfiguring.
+ */
+static void
+nfp_net_stop(struct rte_eth_dev *dev)
+{
+ int i;
+
+ PMD_INIT_LOG(DEBUG, "Stop\n");
+
+ nfp_net_disable_queues(dev);
+
+ /* Clear queues */
+ for (i = 0; i < dev->data->nb_tx_queues; i++) {
+ nfp_net_reset_tx_queue(
+ (struct nfp_net_txq *)dev->data->tx_queues[i]);
+ }
+
+ for (i = 0; i < dev->data->nb_rx_queues; i++) {
+ nfp_net_reset_rx_queue(
+ (struct nfp_net_rxq *)dev->data->rx_queues[i]);
+
+ }
+}
+
+/*
+ * Reset and stop device. The device can not be restarted.
+ */
+static void
+nfp_net_close(struct rte_eth_dev *dev)
+{
+ struct nfp_net_hw *hw;
+
+ PMD_INIT_LOG(DEBUG, "Close\n");
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ /* We assume that the DPDK application is stopping all the
+ * threads/queues before calling the device close function.
+ *
+ * It is not clear how close and stop are used in real life.
+ * No apps or examples use them at all. The only difference
+ * between them is that close also disables the LSC interrupt. */
+
+ nfp_net_stop(dev);
+
+ rte_intr_disable(&(dev->pci_dev->intr_handle));
+ nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);
+
+ /* The ixgbe PMD driver disables the pcie master on the
+ * device. The i40e does not... */
+}
+
+static void
+nfp_net_promisc_enable(struct rte_eth_dev *dev)
+{
+ uint32_t new_ctrl, update = 0;
+ struct nfp_net_hw *hw;
+
+ PMD_DRV_LOG(DEBUG, "Promiscuous mode enable\n");
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ if (!(hw->cap & NFP_NET_CFG_CTRL_PROMISC)) {
+ PMD_INIT_LOG(INFO, "Promiscuous mode not supported\n");
+ return;
+ }
+
+ if (hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) {
+ PMD_DRV_LOG(INFO, "Promiscuous mode already enabled\n");
+ return;
+ }
+
+ new_ctrl = hw->ctrl | NFP_NET_CFG_CTRL_PROMISC;
+ update = NFP_NET_CFG_UPDATE_GEN;
+
+ /* DPDK sets promiscuous mode on just after this call assuming
+ * it can not fail ... */
+ if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+ return;
+
+ hw->ctrl = new_ctrl;
+}
+
+static void
+nfp_net_promisc_disable(struct rte_eth_dev *dev)
+{
+ uint32_t new_ctrl, update = 0;
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ if ((hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) == 0) {
+ PMD_DRV_LOG(INFO, "Promiscuous mode already disabled\n");
+ return;
+ }
+
+ new_ctrl = hw->ctrl & ~NFP_NET_CFG_CTRL_PROMISC;
+ update = NFP_NET_CFG_UPDATE_GEN;
+
+ /* DPDK sets promiscuous mode off just before this call
+ * assuming it can not fail ... */
+ if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+ return;
+
+ hw->ctrl = new_ctrl;
+}
+
+/* Returns 0 if the link status changed, -1 if it did not.
+ *
+ * Waiting for completion may be needed as it can take up to 9 seconds to
+ * get the link status, but wait_to_complete is currently ignored.
+ */
+static int
+nfp_net_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complete)
+{
+ struct nfp_net_hw *hw;
+ struct rte_eth_link link, old;
+ uint32_t nn_link_status;
+
+ PMD_DRV_LOG(DEBUG, "Link update\n");
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ memset(&old, 0, sizeof(old));
+ nfp_net_dev_atomic_read_link_status(dev, &old);
+
+ nn_link_status = nn_cfg_readl(hw, NFP_NET_CFG_STS);
+
+ memset(&link, 0, sizeof(struct rte_eth_link));
+
+ if (nn_link_status & NFP_NET_CFG_STS_LINK)
+ link.link_status = 1;
+
+ link.link_duplex = ETH_LINK_FULL_DUPLEX;
+ /* Other cards can limit the tx and rx rate per VF */
+ link.link_speed = ETH_LINK_SPEED_40G;
+
+ if (old.link_status != link.link_status) {
+ nfp_net_dev_atomic_write_link_status(dev, &link);
+ if (link.link_status)
+ PMD_DRV_LOG(INFO, "NIC Link is Up\n");
+ else
+ PMD_DRV_LOG(INFO, "NIC Link is Down\n");
+ return 0;
+ }
+
+ return -1;
+}
+
+static void
+nfp_net_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+ int i;
+ struct nfp_net_hw *hw;
+ struct rte_eth_stats nfp_dev_stats;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
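+ /* Start from zeroed stats so that fields not set below are not
+ * copied to the caller uninitialized */
+ memset(&nfp_dev_stats, 0, sizeof(nfp_dev_stats));
+
+ /* RTE_ETHDEV_QUEUE_STAT_CNTRS default value is 16 */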
+
+ /* reading per RX ring stats */
+ for (i = 0; i < dev->data->nb_rx_queues; i++) {
+ if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+ break;
+
+ nfp_dev_stats.q_ipackets[i] =
+ nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i));
+
+ nfp_dev_stats.q_ipackets[i] -=
+ hw->eth_stats_base.q_ipackets[i];
+
+ nfp_dev_stats.q_ibytes[i] =
+ nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i) + 0x8);
+
+ nfp_dev_stats.q_ibytes[i] -=
+ hw->eth_stats_base.q_ibytes[i];
+ }
+
+ /* reading per TX ring stats */
+ for (i = 0; i < dev->data->nb_tx_queues; i++) {
+ if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+ break;
+
+ nfp_dev_stats.q_opackets[i] =
+ nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i));
+
+ nfp_dev_stats.q_opackets[i] -=
+ hw->eth_stats_base.q_opackets[i];
+
+ nfp_dev_stats.q_obytes[i] =
+ nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i) + 0x8);
+
+ nfp_dev_stats.q_obytes[i] -=
+ hw->eth_stats_base.q_obytes[i];
+ }
+
+ nfp_dev_stats.ipackets =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_FRAMES);
+
+ nfp_dev_stats.ipackets -= hw->eth_stats_base.ipackets;
+
+ nfp_dev_stats.ibytes =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_OCTETS);
+
+ nfp_dev_stats.ibytes -= hw->eth_stats_base.ibytes;
+
+ nfp_dev_stats.opackets =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_FRAMES);
+
+ nfp_dev_stats.opackets -= hw->eth_stats_base.opackets;
+
+ nfp_dev_stats.obytes =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_OCTETS);
+
+ nfp_dev_stats.obytes -= hw->eth_stats_base.obytes;
+
+ /* reading general device stats */
+ nfp_dev_stats.ierrors =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_ERRORS);
+
+ nfp_dev_stats.ierrors -= hw->eth_stats_base.ierrors;
+
+ nfp_dev_stats.oerrors =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_ERRORS);
+
+ nfp_dev_stats.oerrors -= hw->eth_stats_base.oerrors;
+
+ /* Multicast frames received */
+ nfp_dev_stats.imcasts =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES);
+
+ nfp_dev_stats.imcasts -= hw->eth_stats_base.imcasts;
+
+ /* RX ring mbuf allocation failures */
+ nfp_dev_stats.rx_nombuf = dev->data->rx_mbuf_alloc_failed;
+
+ nfp_dev_stats.imissed =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_DISCARDS);
+
+ nfp_dev_stats.imissed -= hw->eth_stats_base.imissed;
+
+ if (stats)
+ memcpy(stats, &nfp_dev_stats, sizeof(*stats));
+}
+
+static void
+nfp_net_stats_reset(struct rte_eth_dev *dev)
+{
+ int i;
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ /* hw->eth_stats_base records the per counter starting point.
+ * Lets update it now */
+
+ /* reading per RX ring stats */
+ for (i = 0; i < dev->data->nb_rx_queues; i++) {
+ if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+ break;
+
+ hw->eth_stats_base.q_ipackets[i] =
+ nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i));
+
+ hw->eth_stats_base.q_ibytes[i] =
+ nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i) + 0x8);
+ }
+
+ /* reading per TX ring stats */
+ for (i = 0; i < dev->data->nb_tx_queues; i++) {
+ if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+ break;
+
+ hw->eth_stats_base.q_opackets[i] =
+ nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i));
+
+ hw->eth_stats_base.q_obytes[i] =
+ nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i) + 0x8);
+ }
+
+ hw->eth_stats_base.ipackets =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_FRAMES);
+
+ hw->eth_stats_base.ibytes =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_OCTETS);
+
+ hw->eth_stats_base.opackets =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_FRAMES);
+
+ hw->eth_stats_base.obytes =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_OCTETS);
+
+ /* reading general device stats */
+ hw->eth_stats_base.ierrors =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_ERRORS);
+
+ hw->eth_stats_base.oerrors =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_ERRORS);
+
+ /* Multicast frames received */
+ hw->eth_stats_base.imcasts =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES);
+
+ /* RX ring mbuf allocation failures */
+ dev->data->rx_mbuf_alloc_failed = 0;
+
+ hw->eth_stats_base.imissed =
+ nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_DISCARDS);
+}
+
+static void
+nfp_net_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ dev_info->driver_name = dev->driver->pci_drv.name;
+ dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues;
+ dev_info->max_tx_queues = (uint16_t)hw->max_tx_queues;
+ dev_info->min_rx_bufsize = ETHER_MIN_MTU;
+ dev_info->max_rx_pktlen = hw->max_mtu;
+ /* Next should change when PF support is implemented */
+ dev_info->max_mac_addrs = 1;
+
+ if (hw->cap & NFP_NET_CFG_CTRL_RXVLAN)
+ dev_info->rx_offload_capa = DEV_RX_OFFLOAD_VLAN_STRIP;
+
+ if (hw->cap & NFP_NET_CFG_CTRL_RXCSUM)
+ dev_info->rx_offload_capa |= DEV_RX_OFFLOAD_IPV4_CKSUM |
+ DEV_RX_OFFLOAD_UDP_CKSUM |
+ DEV_RX_OFFLOAD_TCP_CKSUM;
+
+ if (hw->cap & NFP_NET_CFG_CTRL_TXVLAN)
+ dev_info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
+
+ if (hw->cap & NFP_NET_CFG_CTRL_TXCSUM)
+ dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_IPV4_CKSUM |
+ DEV_TX_OFFLOAD_UDP_CKSUM |
+ DEV_TX_OFFLOAD_TCP_CKSUM;
+
+ dev_info->default_rxconf = (struct rte_eth_rxconf) {
+ .rx_thresh = {
+ .pthresh = DEFAULT_RX_PTHRESH,
+ .hthresh = DEFAULT_RX_HTHRESH,
+ .wthresh = DEFAULT_RX_WTHRESH,
+ },
+ .rx_free_thresh = DEFAULT_RX_FREE_THRESH,
+ .rx_drop_en = 0,
+ };
+
+ dev_info->default_txconf = (struct rte_eth_txconf) {
+ .tx_thresh = {
+ .pthresh = DEFAULT_TX_PTHRESH,
+ .hthresh = DEFAULT_TX_HTHRESH,
+ .wthresh = DEFAULT_TX_WTHRESH,
+ },
+ .tx_free_thresh = DEFAULT_TX_FREE_THRESH,
+ .tx_rs_thresh = DEFAULT_TX_RSBIT_THRESH,
+ .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
+ ETH_TXQ_FLAGS_NOOFFLOADS,
+ };
+
+ dev_info->reta_size = NFP_NET_CFG_RSS_ITBL_SZ;
+#if RTE_VER_MAJOR == 2 && RTE_VER_MINOR >= 1
+ dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ;
+#endif
+}
+
+static uint32_t
+nfp_net_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_idx)
+{
+ struct nfp_net_rxq *rxq;
+ struct nfp_net_rx_desc *rxds;
+ __u32 idx;
+ __u32 count;
+
+ rxq = (struct nfp_net_rxq *)dev->data->rx_queues[queue_idx];
+
+ if (!rxq) {
+ PMD_INIT_LOG(ERR, "Bad queue: %u\n", queue_idx);
+ return 0;
+ }
+
+ idx = rxq->rd_p % rxq->rx_count;
+ rxds = &rxq->rxds[idx];
+
+ count = 0;
+
+ /* Other PMDs are just checking the DD bit in intervals of 4
+ * descriptors and counting all four if the first has the DD
+ * bit on. Of course, this is not accurate but can be good for
+ * performance. But ideally that should be done in descriptor
+ * chunks belonging to the same cache line */
+
+ while (count < rxq->rx_count) {
+
+ rxds = &rxq->rxds[idx];
+ if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) == 0)
+ break;
+
+ count++;
+ idx++;
+
+ /* Wrapping? */
+ if ((idx) == rxq->rx_count)
+ idx = 0;
+ }
+
+ return count;
+}
+
+static void
+nfp_net_dev_link_status_print(struct rte_eth_dev *dev)
+{
+ struct rte_eth_link link;
+
+ memset(&link, 0, sizeof(link));
+ nfp_net_dev_atomic_read_link_status(dev, &link);
+ if (link.link_status)
+ RTE_LOG(INFO, PMD, "Port %d: Link Up - speed %u Mbps - %s\n",
+ (int)(dev->data->port_id), (unsigned)link.link_speed,
+ link.link_duplex == ETH_LINK_FULL_DUPLEX
+ ? "full-duplex" : "half-duplex");
+ else
+ RTE_LOG(INFO, PMD, "Port %d: Link Down\n",
+ (int)(dev->data->port_id));
+
+ RTE_LOG(INFO, PMD, "PCI Address: %04d:%02d:%02d:%d\n",
+ dev->pci_dev->addr.domain, dev->pci_dev->addr.bus,
+ dev->pci_dev->addr.devid, dev->pci_dev->addr.function);
+}
+
+/*
+ * Interrupt configuration and handling
+ */
+
+/**
+ * nfp_net_irq_unmask - Unmask an interrupt
+ *
+ * If MSI-X auto-masking is enabled clear the mask bit, otherwise
+ * clear the ICR for the entry.
+ */
+static void
+nfp_net_irq_unmask(struct rte_eth_dev *dev)
+{
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ if (hw->ctrl & NFP_NET_CFG_CTRL_MSIXAUTO) {
+ /* If MSI-X auto-masking is used, clear the entry */
+ rte_wmb();
+ rte_intr_enable(&(dev->pci_dev->intr_handle));
+ } else {
+ /* Make sure all updates are written before un-masking */
+ rte_wmb();
+ nn_cfg_writeb(hw, NFP_NET_CFG_ICR(NFP_NET_IRQ_LSC_IDX),
+ NFP_NET_CFG_ICR_UNMASKED);
+ }
+}
+
+static void
+nfp_net_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+ void *param)
+{
+ int64_t timeout;
+ struct rte_eth_link link;
+ struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+ PMD_DRV_LOG(DEBUG, "We got an LSC interrupt\n");
+
+ /* get the link status */
+ memset(&link, 0, sizeof(link));
+ nfp_net_dev_atomic_read_link_status(dev, &link);
+
+ nfp_net_link_update(dev, 0);
+
+ /* link was down, so it is likely coming up */
+ if (!link.link_status) {
+ /* handle it 1 sec later, waiting for it to be stable */
+ timeout = NFP_NET_LINK_UP_CHECK_TIMEOUT;
+ /* link was up, so it is likely going down */
+ } else {
+ /* handle it 4 sec later, waiting for it to be stable */
+ timeout = NFP_NET_LINK_DOWN_CHECK_TIMEOUT;
+ }
+
+ if (rte_eal_alarm_set(timeout * 1000,
+ nfp_net_dev_interrupt_delayed_handler,
+ (void *)dev) < 0) {
+ RTE_LOG(ERR, PMD, "Error setting alarm\n");
+ /* Unmasking */
+ nfp_net_irq_unmask(dev);
+ }
+}
+
+/**
+ * Interrupt handler registered as an alarm callback for the delayed
+ * handling of a specific interrupt, waiting for the NIC state to become
+ * stable. As the NIC interrupt state is not stable for nfp right after the
+ * link goes down, it needs to wait 4 seconds to get a stable status.
+ *
+ * @param param The address of parameter (struct rte_eth_dev *)
+ *
+ * @return void
+ */
+static void
+nfp_net_dev_interrupt_delayed_handler(void *param)
+{
+ struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+ nfp_net_link_update(dev, 0);
+ _rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_LSC);
+
+ nfp_net_dev_link_status_print(dev);
+
+ /* Unmasking */
+ nfp_net_irq_unmask(dev);
+}
+
+static int
+nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+{
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ /* check that mtu is within the allowed range */
+ if ((mtu < ETHER_MIN_MTU) || ((uint32_t)mtu > hw->max_mtu))
+ return -EINVAL;
+
+ /* switch to jumbo mode if needed */
+ if ((uint32_t)mtu > ETHER_MAX_LEN)
+ dev->data->dev_conf.rxmode.jumbo_frame = 1;
+ else
+ dev->data->dev_conf.rxmode.jumbo_frame = 0;
+
+ /* update max frame size */
+ dev->data->dev_conf.rxmode.max_rx_pkt_len = (uint32_t)mtu;
+
+ /* writing to configuration space */
+ nn_cfg_writel(hw, NFP_NET_CFG_MTU, (uint32_t)mtu);
+
+ hw->mtu = mtu;
+
+ return 0;
+}
+
+static int
+nfp_net_rx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t queue_idx, uint16_t nb_desc,
+ unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf,
+ struct rte_mempool *mp)
+{
+ const struct rte_memzone *tz;
+ struct nfp_net_rxq *rxq;
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ PMD_INIT_FUNC_TRACE();
+
+ /* Validating number of descriptors */
+ if (((nb_desc * sizeof(struct nfp_net_rx_desc)) % 128) != 0 ||
+ (nb_desc > NFP_NET_MAX_RX_DESC) ||
+ (nb_desc < NFP_NET_MIN_RX_DESC)) {
+ RTE_LOG(ERR, PMD, "Wrong nb_desc value\n");
+ return (-EINVAL);
+ }
+
+ /* Free memory prior to re-allocation if needed. This is the case after
+ * calling nfp_net_stop */
+ if (dev->data->rx_queues[queue_idx] != NULL) {
+ nfp_net_rx_queue_release(dev->data->rx_queues[queue_idx]);
+ dev->data->rx_queues[queue_idx] = NULL;
+ }
+
+ /* Allocating rx queue data structure */
+ rxq = rte_zmalloc_socket("ethdev RX queue", sizeof(struct nfp_net_rxq),
+ RTE_CACHE_LINE_SIZE, socket_id);
+ if (rxq == NULL)
+ return (-ENOMEM);
+
+ /* Hw queues mapping based on firmware configuration */
+ rxq->qidx = queue_idx;
+ rxq->fl_qcidx = queue_idx * hw->stride_rx;
+ rxq->rx_qcidx = rxq->fl_qcidx + (hw->stride_rx - 1);
+ rxq->qcp_fl = hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->fl_qcidx);
+ rxq->qcp_rx = hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->rx_qcidx);
+
+ /* Tracking mbuf size for detecting a potential mbuf overflow due to
+ * NFP_NET_RX_OFFSET */
+ rxq->mem_pool = mp;
+ rxq->mbuf_size = rxq->mem_pool->elt_size;
+ rxq->mbuf_size -= (sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM);
+ hw->flbufsz = rxq->mbuf_size;
+
+ rxq->rx_count = nb_desc;
+ rxq->port_id = dev->data->port_id;
+ rxq->rx_free_thresh = rx_conf->rx_free_thresh;
+ rxq->drop_en = rx_conf->rx_drop_en;
+
+ /* Allocate RX ring hardware descriptors. A memzone large enough to
+ * handle the maximum ring size is allocated in order to allow for
+ * resizing in later calls to the queue setup function. */
+ tz = ring_dma_zone_reserve(dev, "rx_ring", queue_idx,
+ sizeof(struct nfp_net_rx_desc) * NFP_NET_MAX_RX_DESC,
+ socket_id);
+
+ if (tz == NULL) {
+ RTE_LOG(ERR, PMD, "Error allocating rx dma\n");
+ nfp_net_rx_queue_release(rxq);
+ return (-ENOMEM);
+ }
+
+ /* Saving physical and virtual addresses for the RX ring */
+ rxq->dma = (uint64_t) tz->phys_addr;
+ rxq->rxds = (struct nfp_net_rx_desc *) tz->addr;
+
+ /* mbuf pointers array for referencing mbufs linked to RX descriptors */
+ rxq->rxbufs = rte_zmalloc_socket("rxq->rxbufs",
+ sizeof(*rxq->rxbufs) * nb_desc,
+ RTE_CACHE_LINE_SIZE, socket_id);
+ if (rxq->rxbufs == NULL) {
+ nfp_net_rx_queue_release(rxq);
+ return (-ENOMEM);
+ }
+
+ PMD_RX_LOG(DEBUG, "rxbufs=%p hw_ring=%p dma_addr=0x%"PRIx64"\n",
+ rxq->rxbufs, rxq->rxds, (uint64_t)rxq->dma);
+
+ nfp_net_reset_rx_queue(rxq);
+
+ dev->data->rx_queues[queue_idx] = rxq;
+ rxq->hw = hw;
+
+ /* Telling the HW about the physical address of the RX ring and number
+ * of descriptors in log2 format */
+ nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(queue_idx), rxq->dma);
+ nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(queue_idx), log2(nb_desc));
+
+ return 0;
+}
+
+static int
+nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq)
+{
+ struct nfp_net_rx_buff *rxe = rxq->rxbufs;
+ uint64_t dma_addr;
+ unsigned i;
+
+ PMD_RX_LOG(DEBUG, "nfp_net_rx_fill_freelist for %u descriptors\n",
+ rxq->rx_count);
+
+ for (i = 0; i < rxq->rx_count; i++) {
+ struct nfp_net_rx_desc *rxd;
+ struct rte_mbuf *mbuf = rte_pktmbuf_alloc(rxq->mem_pool);
+
+ if (mbuf == NULL) {
+ RTE_LOG(ERR, PMD, "RX mbuf alloc failed queue_id=%u\n",
+ (unsigned) rxq->qidx);
+ return (-ENOMEM);
+ }
+
+ dma_addr = rte_cpu_to_le_64(RTE_MBUF_DMA_ADDR_DEFAULT(mbuf));
+
+ rxd = &rxq->rxds[i];
+ rxd->fld.dd = 0;
+ rxd->fld.dma_addr_hi = (dma_addr >> 32) & 0xff;
+ rxd->fld.dma_addr_lo = dma_addr & 0xffffffff;
+ rxe[i].mbuf = mbuf;
+ PMD_RX_LOG(DEBUG, "[%u]: %"PRIx64"\n", i, dma_addr);
+
+ rxq->wr_p++;
+ }
+
+ /* Make sure all writes are flushed before telling the hardware */
+ rte_wmb();
+
+ /* Not advertising the whole ring as the firmware gets confused if so */
+ PMD_RX_LOG(DEBUG, "Increment FL write pointer in %u\n",
+ rxq->rx_count - 1);
+
+ nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, rxq->rx_count - 1);
+
+ return 0;
+}
+
+static int
+nfp_net_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+ uint16_t nb_desc, unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf)
+{
+ const struct rte_memzone *tz;
+ struct nfp_net_txq *txq;
+ uint16_t tx_free_thresh;
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ PMD_INIT_FUNC_TRACE();
+
+ /* Validating number of descriptors */
+ if (((nb_desc * sizeof(struct nfp_net_tx_desc)) % 128) != 0 ||
+ (nb_desc > NFP_NET_MAX_TX_DESC) ||
+ (nb_desc < NFP_NET_MIN_TX_DESC)) {
+ RTE_LOG(ERR, PMD, "Wrong nb_desc value\n");
+ return -EINVAL;
+ }
+
+ tx_free_thresh = (uint16_t)((tx_conf->tx_free_thresh) ?
+ tx_conf->tx_free_thresh :
+ DEFAULT_TX_FREE_THRESH);
+
+ if (tx_free_thresh > (nb_desc)) {
+ RTE_LOG(ERR, PMD,
+ "tx_free_thresh must not exceed the number of TX "
+ "descriptors. (tx_free_thresh=%u port=%d "
+ "queue=%d)\n", (unsigned int)tx_free_thresh,
+ (int)dev->data->port_id, (int)queue_idx);
+ return -(EINVAL);
+ }
+
+ /* Free memory prior to re-allocation if needed. This is the case after
+ * calling nfp_net_stop */
+ if (dev->data->tx_queues[queue_idx] != NULL) {
+ PMD_TX_LOG(DEBUG, "Freeing memory prior to re-allocation %d\n",
+ queue_idx);
+ nfp_net_tx_queue_release(dev->data->tx_queues[queue_idx]);
+ dev->data->tx_queues[queue_idx] = NULL;
+ }
+
+ /* Allocating tx queue data structure */
+ txq = rte_zmalloc_socket("ethdev TX queue", sizeof(struct nfp_net_txq),
+ RTE_CACHE_LINE_SIZE, socket_id);
+ if (txq == NULL) {
+ RTE_LOG(ERR, PMD, "Error allocating tx queue structure\n");
+ return (-ENOMEM);
+ }
+
+ /* Allocate TX ring hardware descriptors. A memzone large enough to
+ * handle the maximum ring size is allocated in order to allow for
+ * resizing in later calls to the queue setup function. */
+ tz = ring_dma_zone_reserve(dev, "tx_ring", queue_idx,
+ sizeof(struct nfp_net_tx_desc) * NFP_NET_MAX_TX_DESC,
+ socket_id);
+ if (tz == NULL) {
+ RTE_LOG(ERR, PMD, "Error allocating tx dma\n");
+ nfp_net_tx_queue_release(txq);
+ return (-ENOMEM);
+ }
+
+ txq->tx_count = nb_desc;
+ txq->tx_free_thresh = tx_free_thresh;
+ txq->tx_pthresh = tx_conf->tx_thresh.pthresh;
+ txq->tx_hthresh = tx_conf->tx_thresh.hthresh;
+ txq->tx_wthresh = tx_conf->tx_thresh.wthresh;
+
+ /* queue mapping based on firmware configuration */
+ txq->qidx = queue_idx;
+ txq->tx_qcidx = queue_idx * hw->stride_tx;
+ txq->qcp_q = hw->tx_bar + NFP_QCP_QUEUE_OFF(txq->tx_qcidx);
+
+ txq->port_id = dev->data->port_id;
+ txq->txq_flags = tx_conf->txq_flags;
+
+ /* Saving physical and virtual addresses for the TX ring */
+ txq->dma = (uint64_t) tz->phys_addr;
+ txq->txds = (struct nfp_net_tx_desc *) tz->addr;
+
+ /* mbuf pointers array for referencing mbufs linked to TX descriptors */
+ txq->txbufs = rte_zmalloc_socket("txq->txbufs",
+ sizeof(*txq->txbufs) * nb_desc,
+ RTE_CACHE_LINE_SIZE, socket_id);
+ if (txq->txbufs == NULL) {
+ nfp_net_tx_queue_release(txq);
+ return (-ENOMEM);
+ }
+ PMD_TX_LOG(DEBUG, "txbufs=%p hw_ring=%p dma_addr=0x%"PRIx64"\n",
+ txq->txbufs, txq->txds, (uint64_t)txq->dma);
+
+ nfp_net_reset_tx_queue(txq);
+
+ dev->data->tx_queues[queue_idx] = txq;
+ txq->hw = hw;
+
+ /* Telling the HW about the physical address of the TX ring and number
+ * of descriptors in log2 format */
+ nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(queue_idx), txq->dma);
+ nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(queue_idx), log2(nb_desc));
+
+ return 0;
+}
+
+/**
+ * nfp_net_tx_cksum - Set TX CSUM offload flags in TX descriptor
+ */
+static inline void
+nfp_net_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_tx_desc *txd,
+ struct rte_mbuf *mb)
+{
+ uint64_t ol_flags;
+ struct nfp_net_hw *hw = txq->hw;
+
+ if (!(hw->cap & NFP_NET_CFG_CTRL_TXCSUM))
+ return;
+
+ ol_flags = mb->ol_flags;
+
+ /* IPv6 has no header checksum, so only IPv4 needs this flag */
+ if (ol_flags & PKT_TX_IP_CKSUM)
+ txd->flags |= PCIE_DESC_TX_IP4_CSUM;
+
+ switch (ol_flags & PKT_TX_L4_MASK) {
+ case PKT_TX_UDP_CKSUM:
+ txd->flags |= PCIE_DESC_TX_UDP_CSUM;
+ break;
+ case PKT_TX_TCP_CKSUM:
+ txd->flags |= PCIE_DESC_TX_TCP_CSUM;
+ break;
+ }
+
+ txd->flags |= PCIE_DESC_TX_CSUM;
+}
+
+/**
+ * nfp_net_rx_cksum - set mbuf checksum flags based on RX descriptor flags
+ */
+static inline void
+nfp_net_rx_cksum(struct nfp_net_rxq *rxq, struct nfp_net_rx_desc *rxd,
+ struct rte_mbuf *mb)
+{
+ struct nfp_net_hw *hw = rxq->hw;
+
+ if (!(hw->ctrl & NFP_NET_CFG_CTRL_RXCSUM))
+ return;
+
+ /* If IPv4 and IP checksum error, fail */
+ if ((rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM) &&
+ !(rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM_OK))
+ mb->ol_flags |= PKT_RX_IP_CKSUM_BAD;
+
+ /* If neither UDP nor TCP, return */
+ if (!(rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM) &&
+ !(rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM))
+ return;
+
+ if ((rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM) &&
+ !(rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM_OK))
+ mb->ol_flags |= PKT_RX_L4_CKSUM_BAD;
+
+ if ((rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM) &&
+ !(rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM_OK))
+ mb->ol_flags |= PKT_RX_L4_CKSUM_BAD;
+
+ return;
+}
+
+#define HASH_OFFSET ((uint8_t *)mbuf->buf_addr + mbuf->data_off - 4)
+#define HASH_TYPE_OFFSET ((uint8_t *)mbuf->buf_addr + mbuf->data_off - 8)
+
+/**
+ * nfp_net_set_hash - Set mbuf hash data
+ *
+ * The RSS hash and hash-type are pre-pended to the packet data.
+ * Extract and decode it and set the mbuf fields.
+ */
+static inline void
+nfp_net_set_hash(struct nfp_net_rxq *rxq, struct nfp_net_rx_desc *rxd,
+ struct rte_mbuf *mbuf)
+{
+ uint32_t hash;
+ uint32_t hash_type;
+ struct nfp_net_hw *hw = rxq->hw;
+
+ if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+ return;
+
+ if (!(rxd->rxd.flags & PCIE_DESC_RX_RSS))
+ return;
+
+ hash = rte_be_to_cpu_32(*(uint32_t *)HASH_OFFSET);
+ hash_type = rte_be_to_cpu_32(*(uint32_t *)HASH_TYPE_OFFSET);
+
+ /* hash type shares the same word with input port info
+ * 31:8: input port
+ * 7:0: hash type */
+ hash_type &= 0xff;
+ mbuf->hash.rss = hash;
+ mbuf->ol_flags |= PKT_RX_RSS_HASH;
+
+ switch (hash_type) {
+ case NFP_NET_RSS_IPV4:
+ mbuf->packet_type |= RTE_PTYPE_INNER_L3_IPV4;
+ break;
+ case NFP_NET_RSS_IPV6:
+ mbuf->packet_type |= RTE_PTYPE_INNER_L3_IPV6;
+ break;
+ case NFP_NET_RSS_IPV6_EX:
+ mbuf->packet_type |= RTE_PTYPE_INNER_L3_IPV6_EXT;
+ break;
+ default:
+ mbuf->packet_type |= RTE_PTYPE_INNER_L4_MASK;
+ }
+}
+
+/**
+ * nfp_net_check_port - Set mbuf in_port field
+ *
+ */
+static void
+nfp_net_check_port(struct nfp_net_rx_desc *rxd, struct rte_mbuf *mbuf)
+{
+ uint32_t port;
+
+ if (!(rxd->rxd.flags & PCIE_DESC_RX_INGRESS_PORT)) {
+ mbuf->port = 0;
+ return;
+ }
+
+ port = rte_be_to_cpu_32(*(uint32_t *)((uint8_t *)mbuf->buf_addr +
+ mbuf->data_off - 8));
+
+ /* input port info shares the same word with hash type
+ * 31:8: input port
+ * 7:0: hash type */
+ port = (uint8_t)(port >> 8);
+ mbuf->port = port;
+}
+
+static inline void
+nfp_net_mbuf_alloc_failed(struct nfp_net_rxq *rxq)
+{
+ rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed++;
+}
+
+#define DESC_META_LEN(d) (d->rxd.meta_len_dd & PCIE_DESC_RX_META_LEN_MASK)
+
+/* RX path design:
+ *
+ * There are some decisions to take:
+ * 1) How to check the DD bit of RX descriptors
+ * 2) How and when to allocate new mbufs
+ *
+ * The current implementation checks a single DD bit per loop iteration.
+ * As each descriptor is 8 bytes, it is likely a good idea to check all the
+ * descriptors in a single cache line instead. Tests with this change have
+ * not shown any performance improvement, but it requires further
+ * investigation. For example, depending on which descriptor is next, fewer
+ * than 8 descriptors may remain in the same cache line, and handling that
+ * case implies extra work which could be counterproductive by itself.
+ * Indeed, the latest firmware changes do just this: they write several
+ * descriptors with the DD bit set at once, saving PCIe bandwidth and DMA
+ * operations from the NFP.
+ *
+ * Mbuf allocation is done when a new packet is received. The descriptor
+ * is then linked with the new mbuf and the old one is given to the user.
+ * The main drawback of this design is that per-packet mbuf allocation is
+ * heavier than the bulk allocation DPDK allows with rte_mempool_get_bulk.
+ * From the cache point of view, allocating the mbuf early on, as we do
+ * now, does not seem to have any benefit at all. Again, tests with bulk
+ * allocation have not shown any improvement. Also, rte_mempool_get_bulk
+ * returns all or nothing, so the implications of this type of allocation
+ * should be studied in depth. */
+
+static uint16_t
+nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+{
+ struct nfp_net_rxq *rxq;
+ struct nfp_net_rx_desc *rxds;
+ struct nfp_net_rx_buff *rxb;
+ struct nfp_net_hw *hw;
+ struct rte_mbuf *mb;
+ struct rte_mbuf *new_mb;
+ int idx;
+ uint16_t nb_hold;
+ uint64_t dma_addr;
+ int avail;
+
+ rxq = rx_queue;
+ if (unlikely(rxq == NULL)) {
+ /* DPDK just checks the queue is lower than max queues
+ * enabled. But the queue needs to be configured. The
+ * return type is unsigned, so report no packets rather
+ * than a negative errno */
+ RTE_LOG(ERR, PMD, "RX Bad queue\n");
+ return 0;
+ }
+
+ hw = rxq->hw;
+ avail = 0;
+ nb_hold = 0;
+
+ while (avail < nb_pkts) {
+ idx = rxq->rd_p % rxq->rx_count;
+
+ rxb = &rxq->rxbufs[idx];
+ if (unlikely(!rxb)) {
+ RTE_LOG(ERR, PMD, "rxb does not exist!\n");
+ break;
+ }
+
+ rxds = &rxq->rxds[idx];
+ if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) == 0)
+ break;
+
+ /* Memory barrier to ensure that we won't do other
+ * reads before the DD bit. */
+ rte_rmb();
+
+ /* We got a packet. Let's allocate a new mbuf for refilling the
+ * free descriptor ring as soon as possible */
+ new_mb = rte_pktmbuf_alloc(rxq->mem_pool);
+ if (unlikely(new_mb == NULL)) {
+ RTE_LOG(DEBUG, PMD, "RX mbuf alloc failed port_id=%u "
+ "queue_id=%u\n", (unsigned) rxq->port_id,
+ (unsigned) rxq->qidx);
+ nfp_net_mbuf_alloc_failed(rxq);
+ break;
+ }
+
+ nb_hold++;
+
+ /* Grab the mbuf and refill the descriptor with the
+ * previously allocated mbuf */
+ mb = rxb->mbuf;
+ rxb->mbuf = new_mb;
+
+ PMD_RX_LOG(DEBUG, "Packet len: %u, mbuf_size: %u\n",
+ rxds->rxd.data_len, rxq->mbuf_size);
+
+ /* Size of this segment */
+ mb->data_len = rxds->rxd.data_len - DESC_META_LEN(rxds);
+ /* Size of the whole packet. We just support 1 segment */
+ mb->pkt_len = rxds->rxd.data_len - DESC_META_LEN(rxds);
+
+ if (unlikely((mb->data_len + NFP_NET_RX_OFFSET) >
+ rxq->mbuf_size)) {
+
+ /* This should not happen and the user has the
+ * responsibility of avoiding it. But we have
+ * to give some info about the error */
+ RTE_LOG(ERR, PMD,
+ "mbuf overflow likely due to NFP_NET_RX_OFFSET\n"
+ "\t\tYour mbuf size should have extra space for"
+ " NFP_NET_RX_OFFSET=%u bytes.\n"
+ "\t\tCurrently you just have %u bytes available"
+ " but the received packet is %u bytes long\n",
+ NFP_NET_RX_OFFSET,
+ rxq->mbuf_size - NFP_NET_RX_OFFSET,
+ mb->data_len);
+ return avail;
+ }
+
+ /* Filling the received mbuf with packet info */
+ mb->data_off = RTE_PKTMBUF_HEADROOM + NFP_NET_RX_OFFSET;
+
+ /* No scatter mode supported */
+ mb->nb_segs = 1;
+ mb->next = NULL;
+
+ /* Checking the RSS flag */
+ nfp_net_set_hash(rxq, rxds, mb);
+
+ /* Checking the checksum flag */
+ nfp_net_rx_cksum(rxq, rxds, mb);
+
+ /* Checking the port flag */
+ nfp_net_check_port(rxds, mb);
+
+ if ((rxds->rxd.flags & PCIE_DESC_RX_VLAN) &&
+ (hw->ctrl & NFP_NET_CFG_CTRL_RXVLAN)) {
+ mb->vlan_tci = rte_le_to_cpu_16(rxds->rxd.vlan);
+ mb->ol_flags |= PKT_RX_VLAN_PKT;
+ }
+
+ /* Adding the mbuf to the mbuf array passed by the app */
+ rx_pkts[avail++] = mb;
+
+ /* Now resetting and updating the descriptor */
+ rxds->vals[0] = 0;
+ rxds->vals[1] = 0;
+ dma_addr = rte_cpu_to_le_64(RTE_MBUF_DMA_ADDR_DEFAULT(new_mb));
+ rxds->fld.dd = 0;
+ rxds->fld.dma_addr_hi = (dma_addr >> 32) & 0xff;
+ rxds->fld.dma_addr_lo = dma_addr & 0xffffffff;
+
+ rxq->rd_p++;
+ }
+
+ if (nb_hold == 0)
+ return nb_hold;
+
+ PMD_RX_LOG(DEBUG, "RX port_id=%u queue_id=%u, %d packets received\n",
+ (unsigned) rxq->port_id, (unsigned) rxq->qidx, nb_hold);
+
+ nb_hold += rxq->nb_rx_hold;
+
+ /* FL descriptors need to be written before incrementing the
+ * FL queue WR pointer */
+ rte_wmb();
+ if (nb_hold > rxq->rx_free_thresh) {
+ PMD_RX_LOG(DEBUG, "port=%u queue=%u nb_hold=%u avail=%u\n",
+ (unsigned) rxq->port_id, (unsigned) rxq->qidx,
+ (unsigned) nb_hold, (unsigned) avail);
+ nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, nb_hold);
+ nb_hold = 0;
+ }
+ rxq->nb_rx_hold = nb_hold;
+
+ return avail;
+}
+
+
+/**
+ * nfp_net_tx_free_bufs - Check for descriptors with a complete
+ * status
+ *
+ * @txq: TX queue to work with
+ *
+ * Returns number of descriptors freed
+ */
+int
+nfp_net_tx_free_bufs(struct nfp_net_txq *txq)
+{
+ uint32_t qcp_rd_p;
+ int todo;
+
+ PMD_TX_LOG(DEBUG, "queue %u. Check for descriptor with a complete"
+ " status\n", txq->qidx);
+
+ /* Work out how many packets have been sent */
+ qcp_rd_p = nfp_qcp_read(txq->qcp_q, NFP_QCP_READ_PTR);
+
+ if (qcp_rd_p == txq->qcp_rd_p) {
+ PMD_TX_LOG(DEBUG, "queue %u: It seems harrier is not sending "
+ "packets (%u, %u)\n", txq->qidx,
+ qcp_rd_p, txq->qcp_rd_p);
+ return 0;
+ }
+
+ if (qcp_rd_p > txq->qcp_rd_p)
+ todo = qcp_rd_p - txq->qcp_rd_p;
+ else
+ todo = qcp_rd_p + txq->tx_count - txq->qcp_rd_p;
+
+ PMD_TX_LOG(DEBUG, "qcp_rd_p %u, txq->qcp_rd_p: %u, qcp->rd_p: %u\n",
+ qcp_rd_p, txq->qcp_rd_p, txq->rd_p);
+
+ if (todo == 0)
+ return todo;
+
+ txq->qcp_rd_p += todo;
+ txq->qcp_rd_p %= txq->tx_count;
+ txq->rd_p += todo;
+
+ return todo;
+}
+
+/* Always leave some descriptors free to avoid wrap-around confusion */
+#define FREE_TX_DESC(t) ((t)->tx_count - ((t)->wr_p - (t)->rd_p) - 8)
+
+/**
+ * nfp_net_txq_full - Check if the number of free TX descriptors
+ * is below tx_free_thresh
+ *
+ * @txq: TX queue to check
+ *
+ * This function uses the host copy of the read/write pointers
+ */
+static inline
+int nfp_net_txq_full(struct nfp_net_txq *txq)
+{
+ return FREE_TX_DESC(txq) < txq->tx_free_thresh;
+}
+
+static uint16_t
+nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+ struct nfp_net_txq *txq;
+ struct nfp_net_hw *hw;
+ struct nfp_net_tx_desc *txds;
+ struct rte_mbuf *pkt;
+ uint64_t dma_addr;
+ int pkt_size, dma_size;
+ uint16_t free_descs, issued_descs;
+ struct rte_mbuf **lmbuf;
+ int i;
+
+ txq = tx_queue;
+ hw = txq->hw;
+ txds = &txq->txds[txq->tail];
+
+ PMD_TX_LOG(DEBUG, "working for queue %u at pos %u and %u packets\n",
+ txq->qidx, txq->tail, nb_pkts);
+
+ if ((FREE_TX_DESC(txq) < nb_pkts) || (nfp_net_txq_full(txq)))
+ nfp_net_tx_free_bufs(txq);
+
+ free_descs = (uint16_t)FREE_TX_DESC(txq);
+ if (unlikely(free_descs == 0))
+ return 0;
+
+ pkt = *tx_pkts;
+
+ i = 0;
+ issued_descs = 0;
+ PMD_TX_LOG(DEBUG, "queue: %u. Sending %u packets\n",
+ txq->qidx, nb_pkts);
+ /* Sending packets */
+ while ((i < nb_pkts) && free_descs) {
+
+ /* Grabbing the mbuf linked to the current descriptor */
+ lmbuf = &txq->txbufs[txq->tail].mbuf;
+ /* Warming the cache for releasing the mbuf later on */
+ RTE_MBUF_PREFETCH_TO_FREE(*lmbuf);
+
+ pkt = *(tx_pkts + i);
+
+ if (unlikely((pkt->nb_segs > 1) &&
+ !(hw->cap & NFP_NET_CFG_CTRL_GATHER))) {
+ PMD_INIT_LOG(INFO, "NFP_NET_CFG_CTRL_GATHER not set\n");
+ rte_panic("Multisegment packet unsupported\n");
+ }
+
+ /* Checking if we have enough descriptors */
+ if (unlikely(pkt->nb_segs > free_descs))
+ goto xmit_end;
+
+ /* Checksum and VLAN flags just in the first descriptor for a
+ * multisegment packet */
+ nfp_net_tx_cksum(txq, txds, pkt);
+
+ if ((pkt->ol_flags & PKT_TX_VLAN_PKT) &&
+ (hw->cap & NFP_NET_CFG_CTRL_TXVLAN)) {
+ txds->flags |= PCIE_DESC_TX_VLAN;
+ txds->vlan = pkt->vlan_tci;
+ }
+
+ if (pkt->ol_flags & PKT_TX_TCP_SEG)
+ rte_panic("TSO is not supported\n");
+
+ /* mbuf data_len is the data in one segment and pkt_len data
+ * in the whole packet. When the packet is just one segment,
+ * then data_len = pkt_len */
+ pkt_size = pkt->pkt_len;
+
+ while (pkt_size) {
+ /* Releasing mbuf which was prefetched above */
+ if (*lmbuf)
+ rte_pktmbuf_free_seg(*lmbuf);
+
+ dma_size = pkt->data_len;
+ dma_addr = RTE_MBUF_DATA_DMA_ADDR(pkt);
+ PMD_TX_LOG(DEBUG, "Working with mbuf at dma address:"
+ "%"PRIx64"\n", dma_addr);
+
+ /* Filling descriptors fields */
+ txds->dma_len = dma_size;
+ txds->data_len = pkt->pkt_len;
+ txds->dma_addr_hi = (dma_addr >> 32) & 0xff;
+ txds->dma_addr_lo = (dma_addr & 0xffffffff);
+ ASSERT(free_descs > 0);
+ free_descs--;
+
+ /* Linking mbuf with descriptor for being released
+ * next time descriptor is used */
+ *lmbuf = pkt;
+
+ txq->wr_p++;
+ txq->tail++;
+ if (unlikely(txq->tail == txq->tx_count)) /* wrapping?*/
+ txq->tail = 0;
+
+ pkt_size -= dma_size;
+ if (!pkt_size) {
+ /* End of packet */
+ txds->offset_eop |= PCIE_DESC_TX_EOP;
+ } else {
+ txds->offset_eop &= PCIE_DESC_TX_OFFSET_MASK;
+ pkt = pkt->next;
+ }
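+ /* Referencing next free TX descriptor and the mbuf slot
+ * linked to it, so each segment releases its own slot */
+ txds = &txq->txds[txq->tail];
+ lmbuf = &txq->txbufs[txq->tail].mbuf;
+ issued_descs++;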
+ }
+ i++;
+ }
+
+xmit_end:
+ /* Increment write pointers. Force memory write before we let HW know */
+ rte_wmb();
+ nfp_qcp_ptr_add(txq->qcp_q, NFP_QCP_WRITE_PTR, issued_descs);
+
+ return i;
+}
+
+static void
+nfp_net_vlan_offload_set(struct rte_eth_dev *dev, int mask)
+{
+ uint32_t new_ctrl, update;
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+ new_ctrl = 0;
+
+ if ((mask & ETH_VLAN_FILTER_OFFLOAD) ||
+ (mask & ETH_VLAN_EXTEND_OFFLOAD))
+ RTE_LOG(INFO, PMD, "No support for ETH_VLAN_FILTER_OFFLOAD or"
+ " ETH_VLAN_EXTEND_OFFLOAD\n");
+
+ /* Enable vlan strip if it is not configured yet */
+ if ((mask & ETH_VLAN_STRIP_OFFLOAD) &&
+ !(hw->ctrl & NFP_NET_CFG_CTRL_RXVLAN))
+ new_ctrl = hw->ctrl | NFP_NET_CFG_CTRL_RXVLAN;
+
+ /* Disable vlan strip just if it is configured */
+ if (!(mask & ETH_VLAN_STRIP_OFFLOAD) &&
+ (hw->ctrl & NFP_NET_CFG_CTRL_RXVLAN))
+ new_ctrl = hw->ctrl & ~NFP_NET_CFG_CTRL_RXVLAN;
+
+ if (new_ctrl == 0)
+ return;
+
+ update = NFP_NET_CFG_UPDATE_GEN;
+
+ if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+ return;
+
+ hw->ctrl = new_ctrl;
+}
+
+/**
+ * Update Redirection Table(RETA) of Receive Side Scaling of Ethernet device
+ */
+static int
+nfp_net_reta_update(struct rte_eth_dev *dev,
+ struct rte_eth_rss_reta_entry64 *reta_conf,
+ uint16_t reta_size)
+{
+ uint32_t reta, mask;
+ int i, j;
+ int idx, shift;
+ uint32_t update;
+ struct nfp_net_hw *hw =
+ NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+ return -EINVAL;
+
+ if (reta_size != NFP_NET_CFG_RSS_ITBL_SZ) {
+ RTE_LOG(ERR, PMD, "The size of hash lookup table configured "
+ "(%d) doesn't match the number hardware can support "
+ "(%d)\n", reta_size, NFP_NET_CFG_RSS_ITBL_SZ);
+ return -EINVAL;
+ }
+
+ /* Update Redirection Table. There are 128 8bit-entries which can be
+ * managed as 32 32bit-entries */
+ for (i = 0; i < reta_size; i += 4) {
+ /* Handling 4 RSS entries per loop */
+ idx = i / RTE_RETA_GROUP_SIZE;
+ shift = i % RTE_RETA_GROUP_SIZE;
+ mask = (uint8_t)((reta_conf[idx].mask >> shift) & 0xF);
+
+ if (!mask)
+ continue;
+
+ reta = 0;
+ /* If all 4 entries were set, don't need read RETA register */
+ if (mask != 0xF)
+ reta = nn_cfg_readl(hw, NFP_NET_CFG_RSS_ITBL + i);
+
+ for (j = 0; j < 4; j++) {
+ if (!(mask & (0x1 << j)))
+ continue;
+ if (mask != 0xF)
+ /* Clearing the entry bits */
+ reta &= ~(0xFFu << (8 * j));
+ reta |= (uint32_t)reta_conf[idx].reta[shift + j] << (8 * j);
+ }
+ nn_cfg_writel(hw, NFP_NET_CFG_RSS_ITBL + i, reta);
+ }
+
+ update = NFP_NET_CFG_UPDATE_RSS;
+
+ if (nfp_net_reconfig(hw, hw->ctrl, update) < 0)
+ return -EIO;
+
+ return 0;
+}
+
+/**
+ * Query Redirection Table(RETA) of Receive Side Scaling of Ethernet device.
+ */
+static int
+nfp_net_reta_query(struct rte_eth_dev *dev,
+ struct rte_eth_rss_reta_entry64 *reta_conf,
+ uint16_t reta_size)
+{
+ uint8_t i, j, mask;
+ int idx, shift;
+ uint32_t reta;
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+ return -EINVAL;
+
+ if (reta_size != NFP_NET_CFG_RSS_ITBL_SZ) {
+ RTE_LOG(ERR, PMD, "The size of hash lookup table configured "
+ "(%d) doesn't match the number hardware can support "
+ "(%d)\n", reta_size, NFP_NET_CFG_RSS_ITBL_SZ);
+ return -EINVAL;
+ }
+
+ /* Reading Redirection Table. There are 128 8bit-entries which can be
+ * managed as 32 32bit-entries */
+ for (i = 0; i < reta_size; i += 4) {
+ /* Handling 4 RSS entries per loop */
+ idx = i / RTE_RETA_GROUP_SIZE;
+ shift = i % RTE_RETA_GROUP_SIZE;
+ mask = (uint8_t)((reta_conf[idx].mask >> shift) & 0xF);
+
+ if (!mask)
+ continue;
+
+ reta = nn_cfg_readl(hw, NFP_NET_CFG_RSS_ITBL + i);
+ for (j = 0; j < 4; j++) {
+ if (!(mask & (0x1 << j)))
+ continue;
+ reta_conf[idx].reta[shift + j] =
+ (uint8_t)((reta >> (8 * j)) & 0xFF);
+ }
+ }
+ return 0;
+}
+
+static int
+nfp_net_rss_hash_update(struct rte_eth_dev *dev,
+ struct rte_eth_rss_conf *rss_conf)
+{
+ uint32_t update;
+ uint32_t cfg_rss_ctrl = 0;
+ uint8_t key;
+ uint64_t rss_hf;
+ int i;
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ rss_hf = rss_conf->rss_hf;
+
+ /* Checking if RSS is enabled */
+ if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS)) {
+ if (rss_hf != 0) { /* Enable RSS? */
+ RTE_LOG(ERR, PMD, "RSS unsupported\n");
+ return -EINVAL;
+ }
+ return 0; /* Nothing to do */
+ }
+
+ if (rss_conf->rss_key_len > NFP_NET_CFG_RSS_KEY_SZ) {
+ RTE_LOG(ERR, PMD, "hash key too long\n");
+ return -EINVAL;
+ }
+
+ if (rss_hf & ETH_RSS_IPV4)
+ cfg_rss_ctrl |= NFP_NET_CFG_RSS_IPV4 |
+ NFP_NET_CFG_RSS_IPV4_TCP |
+ NFP_NET_CFG_RSS_IPV4_UDP;
+
+ if (rss_hf & ETH_RSS_IPV6)
+ cfg_rss_ctrl |= NFP_NET_CFG_RSS_IPV6 |
+ NFP_NET_CFG_RSS_IPV6_TCP |
+ NFP_NET_CFG_RSS_IPV6_UDP;
+
+ /* configuring where to apply the RSS hash */
+ nn_cfg_writel(hw, NFP_NET_CFG_RSS_CTRL, cfg_rss_ctrl);
+
+ /* Writing the key byte by byte */
+ for (i = 0; i < rss_conf->rss_key_len; i++) {
+ memcpy(&key, &rss_conf->rss_key[i], 1);
+ nn_cfg_writeb(hw, NFP_NET_CFG_RSS_KEY + i, key);
+ }
+
+ /* Writing the key size */
+ nn_cfg_writeb(hw, NFP_NET_CFG_RSS_KEY_SZ, rss_conf->rss_key_len);
+
+ update = NFP_NET_CFG_UPDATE_RSS;
+
+ if (nfp_net_reconfig(hw, hw->ctrl, update) < 0)
+ return -EIO;
+
+ return 0;
+}
+
+static int
+nfp_net_rss_hash_conf_get(struct rte_eth_dev *dev,
+ struct rte_eth_rss_conf *rss_conf)
+{
+
+ uint64_t rss_hf;
+ uint32_t cfg_rss_ctrl;
+ uint8_t key;
+ int i;
+ struct nfp_net_hw *hw;
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+ if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+ return -EINVAL;
+
+ rss_hf = rss_conf->rss_hf;
+ cfg_rss_ctrl = nn_cfg_readl(hw, NFP_NET_CFG_RSS_CTRL);
+
+ if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV4)
+ rss_hf |= ETH_RSS_NONFRAG_IPV4_TCP | ETH_RSS_NONFRAG_IPV4_UDP;
+
+ if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV4_TCP)
+ rss_hf |= ETH_RSS_NONFRAG_IPV4_TCP;
+
+ if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV6_TCP)
+ rss_hf |= ETH_RSS_NONFRAG_IPV6_TCP;
+
+ if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV4_UDP)
+ rss_hf |= ETH_RSS_NONFRAG_IPV4_UDP;
+
+ if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV6_UDP)
+ rss_hf |= ETH_RSS_NONFRAG_IPV6_UDP;
+
+ if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV6)
+ rss_hf |= ETH_RSS_NONFRAG_IPV6_TCP | ETH_RSS_NONFRAG_IPV6_UDP;
+
+ /* Propagate the current RSS hash functions to the caller */
+ rss_conf->rss_hf = rss_hf;
+
+ /* Reading the key size */
+ rss_conf->rss_key_len = nn_cfg_readl(hw, NFP_NET_CFG_RSS_KEY_SZ);
+
+ /* Reading the key byte by byte */
+ for (i = 0; i < rss_conf->rss_key_len; i++) {
+ key = nn_cfg_readb(hw, NFP_NET_CFG_RSS_KEY + i);
+ memcpy(&rss_conf->rss_key[i], &key, 1);
+ }
+
+ return 0;
+}
+
+/*
+ * Initialise and register driver with DPDK Application
+ */
+static struct eth_dev_ops nfp_net_eth_dev_ops = {
+ .dev_configure = nfp_net_configure,
+ .dev_start = nfp_net_start,
+ .dev_stop = nfp_net_stop,
+ .dev_close = nfp_net_close,
+
+ .promiscuous_enable = nfp_net_promisc_enable,
+ .promiscuous_disable = nfp_net_promisc_disable,
+
+ .link_update = nfp_net_link_update,
+
+ .stats_get = nfp_net_stats_get,
+ .stats_reset = nfp_net_stats_reset,
+
+ .dev_infos_get = nfp_net_infos_get,
+ .mtu_set = nfp_net_dev_mtu_set,
+
+ .vlan_offload_set = nfp_net_vlan_offload_set,
+
+ .reta_update = nfp_net_reta_update,
+ .reta_query = nfp_net_reta_query,
+ .rss_hash_update = nfp_net_rss_hash_update,
+ .rss_hash_conf_get = nfp_net_rss_hash_conf_get,
+
+ .rx_queue_setup = nfp_net_rx_queue_setup,
+ .rx_queue_release = nfp_net_rx_queue_release,
+ .rx_queue_count = nfp_net_rx_queue_count,
+ .tx_queue_setup = nfp_net_tx_queue_setup,
+ .tx_queue_release = nfp_net_tx_queue_release,
+
+ .mac_addr_add = NULL,
+ .mac_addr_remove = NULL,
+};
+
+
+static int
+__nfp_net_init(struct rte_eth_dev *eth_dev)
+{
+ struct rte_pci_device *pci_dev;
+ struct nfp_net_hw *hw;
+
+ uint32_t tx_bar_off, rx_bar_off;
+ uint32_t start_q;
+ int stride = 4;
+
+ PMD_INIT_FUNC_TRACE();
+
+ hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
+
+ eth_dev->dev_ops = &nfp_net_eth_dev_ops;
+ eth_dev->rx_pkt_burst = &nfp_net_recv_pkts;
+ eth_dev->tx_pkt_burst = &nfp_net_xmit_pkts;
+
+ /* For secondary processes, the primary has done all the work */
+ if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+ return 0;
+
+ pci_dev = eth_dev->pci_dev;
+ hw->device_id = pci_dev->id.device_id;
+ hw->vendor_id = pci_dev->id.vendor_id;
+ hw->subsystem_device_id = pci_dev->id.subsystem_device_id;
+ hw->subsystem_vendor_id = pci_dev->id.subsystem_vendor_id;
+
+ PMD_INIT_LOG(DEBUG, "nfp_net: device (%u:%u) %u:%u:%u:%u\n",
+ pci_dev->id.vendor_id, pci_dev->id.device_id,
+ pci_dev->addr.domain, pci_dev->addr.bus,
+ pci_dev->addr.devid, pci_dev->addr.function);
+
+ hw->ctrl_bar = (uint8_t *)pci_dev->mem_resource[0].addr;
+ if (!hw->ctrl_bar) {
+ RTE_LOG(ERR, PMD,
+ "hw->ctrl_bar is NULL. BAR0 not configured\n");
+ return -ENODEV;
+ }
+ hw->max_rx_queues = nn_cfg_readl(hw, NFP_NET_CFG_MAX_RXRINGS);
+ hw->max_tx_queues = nn_cfg_readl(hw, NFP_NET_CFG_MAX_TXRINGS);
+
+ /* Work out where in the BAR the queues start. */
+ switch (pci_dev->id.device_id) {
+ case PCI_DEVICE_ID_NFP6000_VF_NIC:
+ start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_TXQ);
+ tx_bar_off = NFP_PCIE_QUEUE(start_q);
+ start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_RXQ);
+ rx_bar_off = NFP_PCIE_QUEUE(start_q);
+ break;
+ default:
+ RTE_LOG(ERR, PMD, "nfp_net: no device ID matching\n");
+ return -ENODEV;
+ }
+
+ PMD_INIT_LOG(DEBUG, "tx_bar_off: 0x%08x\n", tx_bar_off);
+ PMD_INIT_LOG(DEBUG, "rx_bar_off: 0x%08x\n", rx_bar_off);
+
+ hw->tx_bar = (uint8_t *)pci_dev->mem_resource[2].addr + tx_bar_off;
+ hw->rx_bar = (uint8_t *)pci_dev->mem_resource[2].addr + rx_bar_off;
+
+ PMD_INIT_LOG(DEBUG, "ctrl_bar: %p, tx_bar: %p, rx_bar: %p\n",
+ hw->ctrl_bar, hw->tx_bar, hw->rx_bar);
+
+ nfp_net_cfg_queue_setup(hw);
+
+ /* Get some of the read-only fields from the config BAR */
+ hw->ver = nn_cfg_readl(hw, NFP_NET_CFG_VERSION);
+ hw->cap = nn_cfg_readl(hw, NFP_NET_CFG_CAP);
+ hw->max_mtu = nn_cfg_readl(hw, NFP_NET_CFG_MAX_MTU);
+ hw->mtu = hw->max_mtu;
+
+ PMD_INIT_LOG(INFO, "VER: %#x, Maximum supported MTU: %d\n",
+ hw->ver, hw->max_mtu);
+ PMD_INIT_LOG(INFO, "CAP: %#x, %s%s%s%s%s%s%s%s%s\n", hw->cap,
+ hw->cap & NFP_NET_CFG_CTRL_PROMISC ? "PROMISC " : "",
+ hw->cap & NFP_NET_CFG_CTRL_RXCSUM ? "RXCSUM " : "",
+ hw->cap & NFP_NET_CFG_CTRL_TXCSUM ? "TXCSUM " : "",
+ hw->cap & NFP_NET_CFG_CTRL_RXVLAN ? "RXVLAN " : "",
+ hw->cap & NFP_NET_CFG_CTRL_TXVLAN ? "TXVLAN " : "",
+ hw->cap & NFP_NET_CFG_CTRL_SCATTER ? "SCATTER " : "",
+ hw->cap & NFP_NET_CFG_CTRL_GATHER ? "GATHER " : "",
+ hw->cap & NFP_NET_CFG_CTRL_LSO ? "TSO " : "",
+ hw->cap & NFP_NET_CFG_CTRL_RSS ? "RSS " : "");
+
+ pci_dev = eth_dev->pci_dev;
+ hw->ctrl = 0;
+
+ hw->stride_rx = stride;
+ hw->stride_tx = stride;
+
+ PMD_INIT_LOG(INFO, "max_rx_queues: %u, max_tx_queues: %u\n",
+ hw->max_rx_queues, hw->max_tx_queues);
+
+ /* Allocating memory for mac addr */
+ eth_dev->data->mac_addrs = rte_zmalloc("mac_addr", ETHER_ADDR_LEN, 0);
+ if (eth_dev->data->mac_addrs == NULL) {
+ PMD_INIT_LOG(ERR, "Failed to allocate space for MAC address");
+ return -ENOMEM;
+ }
+
+ /* Using random mac addresses for VFs */
+ eth_random_addr(&hw->mac_addr[0]);
+
+ /* Copying mac address to DPDK eth_dev struct */
+ ether_addr_copy((struct ether_addr *)hw->mac_addr,
+ &eth_dev->data->mac_addrs[0]);
+
+ PMD_INIT_LOG(INFO, "port %d VendorID=0x%x DeviceID=0x%x "
+ "mac=%02x:%02x:%02x:%02x:%02x:%02x",
+ eth_dev->data->port_id, pci_dev->id.vendor_id,
+ pci_dev->id.device_id,
+ hw->mac_addr[0], hw->mac_addr[1], hw->mac_addr[2],
+ hw->mac_addr[3], hw->mac_addr[4], hw->mac_addr[5]);
+
+ /* Registering LSC interrupt handler */
+ rte_intr_callback_register(&(pci_dev->intr_handle),
+ nfp_net_dev_interrupt_handler,
+ (void *)eth_dev);
+
+ /* enable uio intr after callback register */
+ rte_intr_enable(&(pci_dev->intr_handle));
+
+ /* Telling the firmware about the LSC interrupt entry */
+ nn_cfg_writeb(hw, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
+
+ /* Recording current stats counters values */
+ nfp_net_stats_reset(eth_dev);
+
+ return 0;
+}
+
+static int
+nfp_net_init(struct rte_eth_dev *eth_dev)
+{
+ return __nfp_net_init(eth_dev);
+}
+
+static struct rte_pci_id pci_id_nfp_net_map[] = {
+ {
+ .vendor_id = PCI_VENDOR_ID_NETRONOME,
+ .device_id = PCI_DEVICE_ID_NFP6000_PF_NIC,
+ .subsystem_vendor_id = PCI_ANY_ID,
+ .subsystem_device_id = PCI_ANY_ID,
+ },
+ {
+ .vendor_id = PCI_VENDOR_ID_NETRONOME,
+ .device_id = PCI_DEVICE_ID_NFP6000_VF_NIC,
+ .subsystem_vendor_id = PCI_ANY_ID,
+ .subsystem_device_id = PCI_ANY_ID,
+ },
+ {
+ .vendor_id = 0,
+ },
+};
+
+static struct eth_driver rte_nfp_net_pmd = {
+ {
+ .name = "rte_nfp_net_pmd",
+ .id_table = pci_id_nfp_net_map,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+ },
+ .eth_dev_init = nfp_net_init,
+ .dev_private_size = sizeof(struct nfp_net_adapter),
+};
+
+static int
+nfp_net_pmd_init(const char *name __rte_unused,
+ const char *params __rte_unused)
+{
+ PMD_INIT_FUNC_TRACE();
+ PMD_INIT_LOG(INFO, "librte_pmd_nfp_net version %s\n",
+ NFP_NET_PMD_VERSION);
+
+ rte_eth_driver_register(&rte_nfp_net_pmd);
+ return 0;
+}
+
+static struct rte_driver rte_nfp_net_driver = {
+ .type = PMD_PDEV,
+ .init = nfp_net_pmd_init,
+};
+
+PMD_REGISTER_DRIVER(rte_nfp_net_driver);
+
+/*
+ * Local variables:
+ * c-file-style: "Linux"
+ * indent-tabs-mode: t
+ * End:
+ */
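
The descriptor accounting in nfp_net_tx_free_bufs() and FREE_TX_DESC() above
can be exercised on its own. What follows is a minimal sketch, not driver
code: struct toy_txq, toy_free_descs() and toy_tx_free_bufs() are hypothetical
names, the hardware read pointer is passed in as a parameter instead of being
read with nfp_qcp_read(), and all counters are assumed to be 32-bit unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct nfp_net_txq: only the counters used here */
struct toy_txq {
	uint32_t tx_count;  /* ring size in descriptors */
	uint32_t wr_p;      /* host write pointer, free running */
	uint32_t rd_p;      /* host read pointer, free running */
	uint32_t qcp_rd_p;  /* last hardware read pointer seen, 0..tx_count-1 */
};

/* Same arithmetic as FREE_TX_DESC(): 8 descriptors are always kept back */
static inline uint32_t toy_free_descs(const struct toy_txq *q)
{
	return q->tx_count - (q->wr_p - q->rd_p) - 8;
}

/* Same arithmetic as nfp_net_tx_free_bufs(); hw_rd_p is the value the
 * driver would obtain from the queue controller (assumed < tx_count) */
static inline uint32_t toy_tx_free_bufs(struct toy_txq *q, uint32_t hw_rd_p)
{
	uint32_t todo;

	assert(q->tx_count > 0 && hw_rd_p < q->tx_count);

	if (hw_rd_p == q->qcp_rd_p)
		return 0;               /* nothing sent since last check */

	if (hw_rd_p > q->qcp_rd_p)
		todo = hw_rd_p - q->qcp_rd_p;
	else                            /* hardware pointer wrapped */
		todo = hw_rd_p + q->tx_count - q->qcp_rd_p;

	q->qcp_rd_p += todo;
	q->qcp_rd_p %= q->tx_count;
	q->rd_p += todo;

	return todo;
}
```

For a 256-entry ring with wr_p = 260, rd_p = 250 and a last-seen hardware
pointer of 250, toy_free_descs() gives 238. Once the hardware pointer wraps
to 4, toy_tx_free_bufs() reclaims 10 descriptors and toy_free_descs() gives
248.
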
diff --git a/drivers/net/nfp/nfp_net_ctrl.h b/drivers/net/nfp/nfp_net_ctrl.h
new file mode 100644
index 0000000..ae18327
--- /dev/null
+++ b/drivers/net/nfp/nfp_net_ctrl.h
@@ -0,0 +1,294 @@
+/*
+ * Copyright (c) 2014, 2015 Netronome Systems, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ * this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution
+ *
+ * 3. Neither the name of the copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived from this
+ * software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+/* vim:shiftwidth=8:noexpandtab
+ *
+ * Netronome network device driver: Control BAR layout
+ */
+#ifndef _NFP_NET_CTRL_H_
+#define _NFP_NET_CTRL_H_
+
+/**
+ * Configuration BAR size.
+ *
+ * The configuration BAR is 8K in size, but on the NFP6000, due to
+ * THB-350, 32k needs to be reserved.
+ */
+#ifdef __NFP_IS_6000
+#define NFP_NET_CFG_BAR_SZ (32 * 1024)
+#else
+#define NFP_NET_CFG_BAR_SZ (8 * 1024)
+#endif
+
+/**
+ * Offset in Freelist buffer where packet starts on RX
+ */
+#define NFP_NET_RX_OFFSET 32
+
+/**
+ * Hash type pre-pended when a RSS hash was computed
+ */
+#define NFP_NET_RSS_NONE 0
+#define NFP_NET_RSS_IPV4 1
+#define NFP_NET_RSS_IPV6 2
+#define NFP_NET_RSS_IPV6_EX 3
+#define NFP_NET_RSS_IPV4_TCP 4
+#define NFP_NET_RSS_IPV6_TCP 5
+#define NFP_NET_RSS_IPV6_EX_TCP 6
+#define NFP_NET_RSS_IPV4_UDP 7
+#define NFP_NET_RSS_IPV6_UDP 8
+#define NFP_NET_RSS_IPV6_EX_UDP 9
+
+/**
+ * @NFP_NET_TXR_MAX: Maximum number of TX rings
+ * @NFP_NET_TXR_MASK: Mask for TX rings
+ * @NFP_NET_RXR_MAX: Maximum number of RX rings
+ * @NFP_NET_RXR_MASK: Mask for RX rings
+ */
+#define NFP_NET_TXR_MAX 64
+#define NFP_NET_TXR_MASK (NFP_NET_TXR_MAX - 1)
+#define NFP_NET_RXR_MAX 64
+#define NFP_NET_RXR_MASK (NFP_NET_RXR_MAX - 1)
+
+/**
+ * Read/Write config words (0x0000 - 0x002c)
+ * @NFP_NET_CFG_CTRL: Global control
+ * @NFP_NET_CFG_UPDATE: Indicate which fields are updated
+ * @NFP_NET_CFG_TXRS_ENABLE: Bitmask of enabled TX rings
+ * @NFP_NET_CFG_RXRS_ENABLE: Bitmask of enabled RX rings
+ * @NFP_NET_CFG_MTU: Set MTU size
+ * @NFP_NET_CFG_FLBUFSZ: Set freelist buffer size (must be larger than MTU)
+ * @NFP_NET_CFG_EXN: MSI-X table entry for exceptions
+ * @NFP_NET_CFG_LSC: MSI-X table entry for link state changes
+ * @NFP_NET_CFG_MACADDR: MAC address
+ *
+ * TODO:
+ * - define Error details in UPDATE
+ */
+#define NFP_NET_CFG_CTRL 0x0000
+#define NFP_NET_CFG_CTRL_ENABLE (0x1 << 0) /* Global enable */
+#define NFP_NET_CFG_CTRL_PROMISC (0x1 << 1) /* Enable Promisc mode */
+#define NFP_NET_CFG_CTRL_L2BC (0x1 << 2) /* Allow L2 Broadcast */
+#define NFP_NET_CFG_CTRL_L2MC (0x1 << 3) /* Allow L2 Multicast */
+#define NFP_NET_CFG_CTRL_RXCSUM (0x1 << 4) /* Enable RX Checksum */
+#define NFP_NET_CFG_CTRL_TXCSUM (0x1 << 5) /* Enable TX Checksum */
+#define NFP_NET_CFG_CTRL_RXVLAN (0x1 << 6) /* Enable VLAN strip */
+#define NFP_NET_CFG_CTRL_TXVLAN (0x1 << 7) /* Enable VLAN insert */
+#define NFP_NET_CFG_CTRL_SCATTER (0x1 << 8) /* Scatter DMA */
+#define NFP_NET_CFG_CTRL_GATHER (0x1 << 9) /* Gather DMA */
+#define NFP_NET_CFG_CTRL_LSO (0x1 << 10) /* LSO/TSO */
+#define NFP_NET_CFG_CTRL_RINGCFG (0x1 << 16) /* Ring runtime changes */
+#define NFP_NET_CFG_CTRL_RSS (0x1 << 17) /* RSS */
+#define NFP_NET_CFG_CTRL_IRQMOD (0x1 << 18) /* Interrupt moderation */
+#define NFP_NET_CFG_CTRL_RINGPRIO (0x1 << 19) /* Ring priorities */
+#define NFP_NET_CFG_CTRL_MSIXAUTO (0x1 << 20) /* MSI-X auto-masking */
+#define NFP_NET_CFG_CTRL_TXRWB (0x1 << 21) /* Write-back of TX ring*/
+#define NFP_NET_CFG_CTRL_L2SWITCH (0x1 << 22) /* L2 Switch */
+#define NFP_NET_CFG_CTRL_L2SWITCH_LOCAL (0x1 << 23) /* Switch to local */
+#define NFP_NET_CFG_CTRL_VXLANO (0x1 << 24) /* Enable VXLAN */
+#define NFP_NET_CFG_CTRL_NVGREO (0x1 << 25) /* Enable NVGRE */
+#define NFP_NET_CFG_UPDATE 0x0004
+#define NFP_NET_CFG_UPDATE_GEN (0x1 << 0) /* General update */
+#define NFP_NET_CFG_UPDATE_RING (0x1 << 1) /* Ring config change */
+#define NFP_NET_CFG_UPDATE_RSS (0x1 << 2) /* RSS config change */
+#define NFP_NET_CFG_UPDATE_TXRPRIO (0x1 << 3) /* TX Ring prio change */
+#define NFP_NET_CFG_UPDATE_RXRPRIO (0x1 << 4) /* RX Ring prio change */
+#define NFP_NET_CFG_UPDATE_MSIX (0x1 << 5) /* MSI-X change */
+#define NFP_NET_CFG_UPDATE_L2SWITCH (0x1 << 6) /* Switch changes */
+#define NFP_NET_CFG_UPDATE_RESET (0x1 << 7) /* Update due to FLR */
+#define NFP_NET_CFG_UPDATE_IRQMOD (0x1 << 8) /* IRQ mod change */
+#define NFP_NET_CFG_UPDATE_ERR (0x1U << 31) /* An error occurred */
+#define NFP_NET_CFG_TXRS_ENABLE 0x0008
+#define NFP_NET_CFG_RXRS_ENABLE 0x0010
+#define NFP_NET_CFG_MTU 0x0018
+#define NFP_NET_CFG_FLBUFSZ 0x001c
+#define NFP_NET_CFG_EXN 0x001f
+#define NFP_NET_CFG_LSC 0x0020
+#define NFP_NET_CFG_MACADDR 0x0024
+
+/**
+ * Read-only words (0x0030 - 0x0050):
+ * @NFP_NET_CFG_VERSION: Firmware version number
+ * @NFP_NET_CFG_STS: Status
+ * @NFP_NET_CFG_CAP: Capabilities (same bits as @NFP_NET_CFG_CTRL)
+ * @NFP_NET_CFG_MAX_TXRINGS: Maximum number of TX rings
+ * @NFP_NET_CFG_MAX_RXRINGS: Maximum number of RX rings
+ * @NFP_NET_CFG_MAX_MTU: Maximum supported MTU
+ * @NFP_NET_CFG_START_TXQ: Start Queue Control Queue to use for TX (PF only)
+ * @NFP_NET_CFG_START_RXQ: Start Queue Control Queue to use for RX (PF only)
+ *
+ * TODO:
+ * - define more STS bits
+ */
+#define NFP_NET_CFG_VERSION 0x0030
+#define NFP_NET_CFG_STS 0x0034
+#define NFP_NET_CFG_STS_LINK (0x1 << 0) /* Link up or down */
+#define NFP_NET_CFG_CAP 0x0038
+#define NFP_NET_CFG_MAX_TXRINGS 0x003c
+#define NFP_NET_CFG_MAX_RXRINGS 0x0040
+#define NFP_NET_CFG_MAX_MTU 0x0044
+/* Next two words are being used by VFs for solving THB350 issue */
+#define NFP_NET_CFG_START_TXQ 0x0048
+#define NFP_NET_CFG_START_RXQ 0x004c
+
+/**
+ * NFP-3200 workaround (0x0050 - 0x0058)
+ * @NFP_NET_CFG_SPARE_ADDR: DMA address for ME code to use (e.g. YDS-155 fix)
+ */
+#define NFP_NET_CFG_SPARE_ADDR 0x0050
+
+/**
+ * 64B reserved for future use (0x0080 - 0x00c0)
+ */
+#define NFP_NET_CFG_RESERVED 0x0080
+#define NFP_NET_CFG_RESERVED_SZ 0x0040
+
+/**
+ * RSS configuration (0x0100 - 0x01ac):
+ * Used only when NFP_NET_CFG_CTRL_RSS is enabled
+ * @NFP_NET_CFG_RSS_CTRL: RSS configuration word
+ * @NFP_NET_CFG_RSS_KEY: RSS "secret" key
+ * @NFP_NET_CFG_RSS_ITBL: RSS indirection table
+ */
+#define NFP_NET_CFG_RSS_BASE 0x0100
+#define NFP_NET_CFG_RSS_CTRL NFP_NET_CFG_RSS_BASE
+#define NFP_NET_CFG_RSS_MASK (0x7f)
+#define NFP_NET_CFG_RSS_MASK_of(_x) ((_x) & 0x7f)
+#define NFP_NET_CFG_RSS_IPV4 (1 << 8) /* RSS for IPv4 */
+#define NFP_NET_CFG_RSS_IPV6 (1 << 9) /* RSS for IPv6 */
+#define NFP_NET_CFG_RSS_IPV4_TCP (1 << 10) /* RSS for IPv4/TCP */
+#define NFP_NET_CFG_RSS_IPV4_UDP (1 << 11) /* RSS for IPv4/UDP */
+#define NFP_NET_CFG_RSS_IPV6_TCP (1 << 12) /* RSS for IPv6/TCP */
+#define NFP_NET_CFG_RSS_IPV6_UDP (1 << 13) /* RSS for IPv6/UDP */
+#define NFP_NET_CFG_RSS_TOEPLITZ (1 << 24) /* Use Toeplitz hash */
+#define NFP_NET_CFG_RSS_KEY (NFP_NET_CFG_RSS_BASE + 0x4)
+#define NFP_NET_CFG_RSS_KEY_SZ 0x28
+#define NFP_NET_CFG_RSS_ITBL (NFP_NET_CFG_RSS_BASE + 0x4 + \
+ NFP_NET_CFG_RSS_KEY_SZ)
+#define NFP_NET_CFG_RSS_ITBL_SZ 0x80
+
+/**
+ * TX ring configuration (0x200 - 0x800)
+ * @NFP_NET_CFG_TXR_BASE: Base offset for TX ring configuration
+ * @NFP_NET_CFG_TXR_ADDR: Per TX ring DMA address (8B entries)
+ * @NFP_NET_CFG_TXR_WB_ADDR: Per TX ring write back DMA address (8B entries)
+ * @NFP_NET_CFG_TXR_SZ: Per TX ring ring size (1B entries)
+ * @NFP_NET_CFG_TXR_VEC: Per TX ring MSI-X table entry (1B entries)
+ * @NFP_NET_CFG_TXR_PRIO: Per TX ring priority (1B entries)
+ * @NFP_NET_CFG_TXR_IRQ_MOD: Per TX ring interrupt moderation (4B entries)
+ */
+#define NFP_NET_CFG_TXR_BASE 0x0200
+#define NFP_NET_CFG_TXR_ADDR(_x) (NFP_NET_CFG_TXR_BASE + ((_x) * 0x8))
+#define NFP_NET_CFG_TXR_WB_ADDR(_x) (NFP_NET_CFG_TXR_BASE + 0x200 + \
+ ((_x) * 0x8))
+#define NFP_NET_CFG_TXR_SZ(_x) (NFP_NET_CFG_TXR_BASE + 0x400 + (_x))
+#define NFP_NET_CFG_TXR_VEC(_x) (NFP_NET_CFG_TXR_BASE + 0x440 + (_x))
+#define NFP_NET_CFG_TXR_PRIO(_x) (NFP_NET_CFG_TXR_BASE + 0x480 + (_x))
+#define NFP_NET_CFG_TXR_IRQ_MOD(_x) (NFP_NET_CFG_TXR_BASE + 0x500 + \
+ ((_x) * 0x4))
+
+/**
+ * RX ring configuration (0x0800 - 0x0c00)
+ * @NFP_NET_CFG_RXR_BASE: Base offset for RX ring configuration
+ * @NFP_NET_CFG_RXR_ADDR: Per RX ring DMA address (8B entries)
+ * @NFP_NET_CFG_RXR_SZ: Per RX ring ring size (1B entries)
+ * @NFP_NET_CFG_RXR_VEC: Per RX ring MSI-X table entry (1B entries)
+ * @NFP_NET_CFG_RXR_PRIO: Per RX ring priority (1B entries)
+ * @NFP_NET_CFG_RXR_IRQ_MOD: Per RX ring interrupt moderation (4B entries)
+ */
+#define NFP_NET_CFG_RXR_BASE 0x0800
+#define NFP_NET_CFG_RXR_ADDR(_x) (NFP_NET_CFG_RXR_BASE + ((_x) * 0x8))
+#define NFP_NET_CFG_RXR_SZ(_x) (NFP_NET_CFG_RXR_BASE + 0x200 + (_x))
+#define NFP_NET_CFG_RXR_VEC(_x) (NFP_NET_CFG_RXR_BASE + 0x240 + (_x))
+#define NFP_NET_CFG_RXR_PRIO(_x) (NFP_NET_CFG_RXR_BASE + 0x280 + (_x))
+#define NFP_NET_CFG_RXR_IRQ_MOD(_x) (NFP_NET_CFG_RXR_BASE + 0x300 + \
+ ((_x) * 0x4))
+
+/**
+ * Interrupt Control/Cause registers (0x0c00 - 0x0d00)
+ * These registers are only used when MSI-X auto-masking is not
+ * enabled (@NFP_NET_CFG_CTRL_MSIXAUTO not set). The array is indexed
+ * by MSI-X entry and each entry is 1B in size. If an entry is zero, the
+ * corresponding entry is enabled. If the FW generates an interrupt,
+ * it writes a cause into the corresponding field. This also masks
+ * the MSI-X entry and the host driver must clear the register to
+ * re-enable the interrupt.
+ */
+#define NFP_NET_CFG_ICR_BASE 0x0c00
+#define NFP_NET_CFG_ICR(_x) (NFP_NET_CFG_ICR_BASE + (_x))
+#define NFP_NET_CFG_ICR_UNMASKED 0x0
+#define NFP_NET_CFG_ICR_RXTX 0x1
+#define NFP_NET_CFG_ICR_LSC 0x2
+
+/**
+ * General device stats (0x0d00 - 0x0d90)
+ * All counters are 64-bit.
+ */
+#define NFP_NET_CFG_STATS_BASE 0x0d00
+#define NFP_NET_CFG_STATS_RX_DISCARDS (NFP_NET_CFG_STATS_BASE + 0x00)
+#define NFP_NET_CFG_STATS_RX_ERRORS (NFP_NET_CFG_STATS_BASE + 0x08)
+#define NFP_NET_CFG_STATS_RX_OCTETS (NFP_NET_CFG_STATS_BASE + 0x10)
+#define NFP_NET_CFG_STATS_RX_UC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x18)
+#define NFP_NET_CFG_STATS_RX_MC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x20)
+#define NFP_NET_CFG_STATS_RX_BC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x28)
+#define NFP_NET_CFG_STATS_RX_FRAMES (NFP_NET_CFG_STATS_BASE + 0x30)
+#define NFP_NET_CFG_STATS_RX_MC_FRAMES (NFP_NET_CFG_STATS_BASE + 0x38)
+#define NFP_NET_CFG_STATS_RX_BC_FRAMES (NFP_NET_CFG_STATS_BASE + 0x40)
+
+#define NFP_NET_CFG_STATS_TX_DISCARDS (NFP_NET_CFG_STATS_BASE + 0x48)
+#define NFP_NET_CFG_STATS_TX_ERRORS (NFP_NET_CFG_STATS_BASE + 0x50)
+#define NFP_NET_CFG_STATS_TX_OCTETS (NFP_NET_CFG_STATS_BASE + 0x58)
+#define NFP_NET_CFG_STATS_TX_UC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x60)
+#define NFP_NET_CFG_STATS_TX_MC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x68)
+#define NFP_NET_CFG_STATS_TX_BC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x70)
+#define NFP_NET_CFG_STATS_TX_FRAMES (NFP_NET_CFG_STATS_BASE + 0x78)
+#define NFP_NET_CFG_STATS_TX_MC_FRAMES (NFP_NET_CFG_STATS_BASE + 0x80)
+#define NFP_NET_CFG_STATS_TX_BC_FRAMES (NFP_NET_CFG_STATS_BASE + 0x88)
+
+/**
+ * Per ring stats (0x1000 - 0x1800)
+ * options, 64bit per entry
+ * @NFP_NET_CFG_TXR_STATS: TX ring statistics (Packet and Byte count)
+ * @NFP_NET_CFG_RXR_STATS: RX ring statistics (Packet and Byte count)
+ */
+#define NFP_NET_CFG_TXR_STATS_BASE 0x1000
+#define NFP_NET_CFG_TXR_STATS(_x) (NFP_NET_CFG_TXR_STATS_BASE + \
+ ((_x) * 0x10))
+#define NFP_NET_CFG_RXR_STATS_BASE 0x1400
+#define NFP_NET_CFG_RXR_STATS(_x) (NFP_NET_CFG_RXR_STATS_BASE + \
+ ((_x) * 0x10))
+
+#endif /* _NFP_NET_CTRL_H_ */
+/*
+ * Local variables:
+ * c-file-style: "Linux"
+ * indent-tabs-mode: t
+ * End:
+ */
diff --git a/drivers/net/nfp/nfp_net_logs.h b/drivers/net/nfp/nfp_net_logs.h
new file mode 100644
index 0000000..6e27480
--- /dev/null
+++ b/drivers/net/nfp/nfp_net_logs.h
@@ -0,0 +1,76 @@
+/*
+ * Copyright (c) 2014, 2015 Netronome Systems, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ * this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution
+ *
+ * 3. Neither the name of the copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived from this
+ * software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _NFP_NET_LOGS_H_
+#define _NFP_NET_LOGS_H_
+
+#include <rte_log.h>
+
+#define RTE_LIBRTE_NFP_NET_DEBUG_INIT 1
+
+#ifdef RTE_LIBRTE_NFP_NET_DEBUG_INIT
+#define PMD_INIT_LOG(level, fmt, args...) \
+ RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args)
+#define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
+#else
+#define PMD_INIT_LOG(level, fmt, args...) do { } while(0)
+#define PMD_INIT_FUNC_TRACE() do { } while(0)
+#endif
+
+#ifdef RTE_LIBRTE_NFP_NET_DEBUG_RX
+#define PMD_RX_LOG(level, fmt, args...) \
+ RTE_LOG(level, PMD, "%s() rx: " fmt , __func__, ## args)
+#else
+#define PMD_RX_LOG(level, fmt, args...) do { } while(0)
+#endif
+
+#ifdef RTE_LIBRTE_NFP_NET_DEBUG_TX
+#define PMD_TX_LOG(level, fmt, args...) \
+ RTE_LOG(level, PMD, "%s() tx: " fmt , __func__, ## args)
+#else
+#define PMD_TX_LOG(level, fmt, args...) do { } while(0)
+#endif
+
+
+#ifdef RTE_LIBRTE_NFP_NET_DEBUG_DRIVER
+#define PMD_DRV_LOG(level, fmt, args...) \
+ RTE_LOG(level, PMD, "%s(): " fmt , __func__, ## args)
+#else
+#define PMD_DRV_LOG(level, fmt, args...) do { } while(0)
+#endif
+
+#ifdef RTE_LIBRTE_NFP_NET_DEBUG_INIT
+#define ASSERT(x) do { if (!(x)) rte_panic("NFP_NET: %s\n", #x); } while (0)
+#else
+#define ASSERT(x) do { } while (0)
+#endif
+
+#endif /* _NFP_NET_LOGS_H_ */
diff --git a/drivers/net/nfp/nfp_net_pmd.h b/drivers/net/nfp/nfp_net_pmd.h
new file mode 100644
index 0000000..be85b45
--- /dev/null
+++ b/drivers/net/nfp/nfp_net_pmd.h
@@ -0,0 +1,415 @@
+/*
+ * Copyright (c) 2014, 2015 Netronome Systems, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ * this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution
+ *
+ * 3. Neither the name of the copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived from this
+ * software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+/*
+ * vim:shiftwidth=8:noexpandtab
+ *
+ * @file dpdk/pmd/nfp_net_pmd.h
+ *
+ * Netronome NFP_NET PMD driver
+ */
+
+
+#ifndef _NFP_NET_H_
+#define _NFP_NET_H_
+
+#define NFP_NET_PMD_VERSION "0.1"
+#define PCI_VENDOR_ID_NETRONOME 0x19ee
+#define PCI_DEVICE_ID_NFP6000_PF_NIC 0x6000
+#define PCI_DEVICE_ID_NFP6000_VF_NIC 0x6003
+
+
+/* Forward declaration */
+struct nfp_net_adapter;
+
+/* The maximum number of descriptors is limited by design as
+ * DPDK uses uint16_t variables for these values */
+#define NFP_NET_MAX_TX_DESC (32 * 1024)
+#define NFP_NET_MIN_TX_DESC 64
+
+#define NFP_NET_MAX_RX_DESC (32 * 1024)
+#define NFP_NET_MIN_RX_DESC 64
+
+
+/* BAR allocation */
+#define NFP_NET_CRTL_BAR 0
+#define NFP_NET_TX_BAR 2
+#define NFP_NET_RX_BAR 2
+
+/* Macros for accessing the Queue Controller Peripheral 'CSRs' */
+#define NFP_QCP_QUEUE_OFF(_x) ((_x) * 0x800)
+#define NFP_QCP_QUEUE_ADD_RPTR 0x0000
+#define NFP_QCP_QUEUE_ADD_WPTR 0x0004
+#define NFP_QCP_QUEUE_STS_LO 0x0008
+#define NFP_QCP_QUEUE_STS_LO_READPTR_mask (0x3ffff)
+#define NFP_QCP_QUEUE_STS_HI 0x000c
+#define NFP_QCP_QUEUE_STS_HI_WRITEPTR_mask (0x3ffff)
+
+/* Interrupt definitions */
+#define NFP_NET_IRQ_LSC_IDX 0
+
+#define RTE_MBUF_DATA_DMA_ADDR(mb) \
+ ((uint64_t)((mb)->buf_physaddr + (mb)->data_off))
+
+/*
+ * Default values for RX/TX configuration
+ */
+#define DEFAULT_RX_FREE_THRESH 32
+#define DEFAULT_RX_PTHRESH 8
+#define DEFAULT_RX_HTHRESH 8
+#define DEFAULT_RX_WTHRESH 0
+
+#define DEFAULT_TX_RS_THRESH 32
+#define DEFAULT_TX_FREE_THRESH 32
+#define DEFAULT_TX_PTHRESH 32
+#define DEFAULT_TX_HTHRESH 0
+#define DEFAULT_TX_WTHRESH 0
+#define DEFAULT_TX_RSBIT_THRESH 32
+
+/* Alignment for dma zones */
+#define NFP_MEMZONE_ALIGN 128
+
+/* This is used by the reconfig protocol. It sets the maximum time waiting in
+ * milliseconds before a reconfig timeout happens.
+ */
+#define NFP_NET_POLL_TIMEOUT 5000
+
+#define NFP_QCP_QUEUE_ADDR_SZ (0x800)
+
+#define NFP_NET_LINK_DOWN_CHECK_TIMEOUT 4000 /* ms */
+#define NFP_NET_LINK_UP_CHECK_TIMEOUT 1000 /* ms */
+
+#include <linux/types.h>
+
+static inline uint8_t nn_readb(volatile void *addr)
+{
+ return *((volatile uint8_t *)(addr));
+}
+
+static inline void nn_writeb(__u8 val, volatile void *addr)
+{
+ *((volatile uint8_t *)(addr)) = val;
+}
+
+static inline uint32_t nn_readl(volatile const void *addr)
+{
+ return *((volatile const uint32_t *)(addr));
+}
+
+static inline void nn_writel(__u32 val, volatile void *addr)
+{
+ *((volatile uint32_t *)(addr)) = val;
+}
+
+static inline uint64_t nn_readq(volatile void *addr)
+{
+ const volatile __u32 *p = addr;
+ __u32 low, high;
+
+ high = nn_readl((volatile const void *)(p + 1));
+ low = nn_readl((volatile const void *)p);
+
+ return low + ((__u64)high << 32);
+}
+
+static inline void nn_writeq(__u64 val, volatile void *addr)
+{
+ nn_writel(val >> 32, (volatile char*) addr + 4);
+ nn_writel(val, addr);
+}
+
+/*
+ * TX descriptor format
+ */
+#define PCIE_DESC_TX_EOP (1 << 7)
+#define PCIE_DESC_TX_OFFSET_MASK (0x7f)
+
+/* Flags in the host TX descriptor */
+#define PCIE_DESC_TX_CSUM (1 << 7)
+#define PCIE_DESC_TX_IP4_CSUM (1 << 6)
+#define PCIE_DESC_TX_TCP_CSUM (1 << 5)
+#define PCIE_DESC_TX_UDP_CSUM (1 << 4)
+#define PCIE_DESC_TX_VLAN (1 << 3)
+#define PCIE_DESC_TX_LSO (1 << 2)
+#define PCIE_DESC_TX_ENCAP_NONE (0)
+#define PCIE_DESC_TX_ENCAP_VXLAN (1 << 1)
+#define PCIE_DESC_TX_ENCAP_GRE (1 << 0)
+
+struct nfp_net_tx_desc {
+ union {
+ struct {
+ __u8 dma_addr_hi; /* High bits of host buf address */
+ __le16 dma_len; /* Length to DMA for this desc */
+ __u8 offset_eop; /* Offset in buf where pkt starts +
+ * highest bit is eop flag.
+ */
+ __le32 dma_addr_lo; /* Low 32bit of host buf addr */
+
+ __le16 lso; /* MSS to be used for LSO */
+ __u8 l4_offset; /* LSO, where the L4 data starts */
+ __u8 flags; /* TX Flags, see @PCIE_DESC_TX_* */
+
+ __le16 vlan; /* VLAN tag to add if indicated */
+ __le16 data_len; /* Length of frame + meta data */
+ } __attribute__((__packed__));
+ __le32 vals[4];
+ };
+};
+
+struct nfp_net_txq {
+ struct nfp_net_hw *hw; /* Backpointer to nfp_net structure */
+
+ /* Queue information: @qidx is the queue index from Linux's
+ * perspective. @tx_qcidx is the index of the Queue
+ * Controller Peripheral queue relative to the TX queue BAR.
+ * @cnt is the size of the queue in number of
+ * descriptors. @qcp_q is a pointer to the base of the queue
+ * structure on the NFP */
+ __u8 *qcp_q;
+
+ /* Read and Write pointers. @wr_p and @rd_p are host side pointers;
+ * they are free running and have little relation to the QCP pointers.
+ * @qcp_rd_p is a local copy of the QCP read pointer */
+
+ __u32 wr_p;
+ __u32 rd_p;
+ __u32 qcp_rd_p;
+
+ __u32 tx_count;
+
+ __u32 tx_free_thresh;
+ __u32 tail;
+
+ /* For each descriptor keep a reference to the mbuf used until
+ * completion is signalled. */
+ struct {
+ struct rte_mbuf *mbuf;
+ } *txbufs;
+
+ /* Information about the host side queue location. @txds is
+ * the virtual address for the queue, @dma is the DMA address
+ * of the queue and @size is the size in bytes for the queue
+ * (needed for free) */
+ struct nfp_net_tx_desc *txds;
+
+ /* At this point 56 bytes have been used for all the fields in the
+ * TX critical path. We have room for 8 bytes and still all placed
+ * in a cache line. We are not using the threshold values below nor
+ * the txq_flags but if we need to, we can add the most used in the
+ * remaining bytes.
+ */
+ __u32 tx_rs_thresh; /* not used by now. Future? */
+ __u32 tx_pthresh; /* not used by now. Future? */
+ __u32 tx_hthresh; /* not used by now. Future? */
+ __u32 tx_wthresh; /* not used by now. Future? */
+ __u32 txq_flags; /* not used by now. Future? */
+ __u8 port_id;
+ int qidx;
+ int tx_qcidx;
+ __le64 dma;
+} __attribute__ ((__aligned__(64)));
+
+/*
+ * RX and freelist descriptor format
+ */
+#define PCIE_DESC_RX_DD (1 << 7)
+#define PCIE_DESC_RX_META_LEN_MASK (0x7f)
+
+/* Flags in the RX descriptor */
+#define PCIE_DESC_RX_RSS (1 << 15)
+#define PCIE_DESC_RX_I_IP4_CSUM (1 << 14)
+#define PCIE_DESC_RX_I_IP4_CSUM_OK (1 << 13)
+#define PCIE_DESC_RX_I_TCP_CSUM (1 << 12)
+#define PCIE_DESC_RX_I_TCP_CSUM_OK (1 << 11)
+#define PCIE_DESC_RX_I_UDP_CSUM (1 << 10)
+#define PCIE_DESC_RX_I_UDP_CSUM_OK (1 << 9)
+#define PCIE_DESC_RX_INGRESS_PORT (1 << 8)
+#define PCIE_DESC_RX_EOP (1 << 7)
+#define PCIE_DESC_RX_IP4_CSUM (1 << 6)
+#define PCIE_DESC_RX_IP4_CSUM_OK (1 << 5)
+#define PCIE_DESC_RX_TCP_CSUM (1 << 4)
+#define PCIE_DESC_RX_TCP_CSUM_OK (1 << 3)
+#define PCIE_DESC_RX_UDP_CSUM (1 << 2)
+#define PCIE_DESC_RX_UDP_CSUM_OK (1 << 1)
+#define PCIE_DESC_RX_VLAN (1 << 0)
+
+struct nfp_net_rx_desc {
+ union {
+ /* Freelist descriptor */
+ struct {
+ __u8 dma_addr_hi;
+ __le16 spare;
+ __u8 dd;
+
+ __le32 dma_addr_lo;
+ } __attribute__((__packed__)) fld;
+
+ /* RX descriptor */
+ struct {
+ __le16 data_len;
+ __u8 reserved;
+ __u8 meta_len_dd;
+
+ __le16 flags;
+ __le16 vlan;
+ } __attribute__((__packed__)) rxd;
+
+ __le32 vals[2];
+ };
+};
+
+struct nfp_net_rx_buff {
+ struct rte_mbuf *mbuf;
+};
+
+struct nfp_net_rxq {
+ struct nfp_net_hw *hw; /* Backpointer to nfp_net structure */
+
+ /* @qcp_fl and @qcp_rx are pointers to the base addresses of the
+ * freelist and RX queue controller peripheral queue structures on the
+ * NFP */
+ __u8 *qcp_fl;
+ __u8 *qcp_rx;
+
+ /* Read and Write pointers. @wr_p and @rd_p are host side
+ * pointers; they are free running and have little relation to
+ * the QCP pointers. @wr_p is where the driver adds new
+ * freelist descriptors and @rd_p is where the driver starts
+ * reading descriptors for newly arrived packets. */
+ __u32 wr_p;
+ __u32 rd_p;
+
+ /* For each buffer placed on the freelist, record the
+ * associated mbuf */
+ struct nfp_net_rx_buff *rxbufs;
+
+ /* Information about the host side queue location. @rxds is
+ * the virtual address for the queue */
+ struct nfp_net_rx_desc *rxds;
+
+ /* The mempool is created by the user, who specifies an mbuf size.
+ * We save a reference to the mempool, which is needed in the RX
+ * path, and the mbuf size, which is used to check that received
+ * packets can be safely copied to the mbuf using NFP_NET_RX_OFFSET */
+ struct rte_mempool *mem_pool;
+ uint16_t mbuf_size;
+
+ /* Next two fields are used for giving more free descriptors
+ * to the NFP */
+ uint16_t rx_free_thresh;
+ uint16_t nb_rx_hold;
+
+ /* the size of the queue in number of descriptors */
+ uint16_t rx_count;
+
+ /* Fields above this point fit in a single cache line and are all used
+ * in the RX critical path. Fields below this point are just used
+ * during queue configuration or not used at all (yet) */
+
+ /* referencing dev->data->port_id */
+ uint16_t port_id;
+
+ uint8_t crc_len; /* Not used by now */
+ uint8_t drop_en; /* Not used by now */
+
+ /* DMA address of the queue */
+ __le64 dma;
+
+ /* Queue information: @qidx is the queue index from Linux's
+ * perspective. @fl_qcidx is the index of the Queue
+ * Controller peripheral queue relative to the RX queue BAR
+ * used for the freelist and @rx_qcidx is the Queue Controller
+ * Peripheral index for the RX queue. */
+ int qidx;
+ int fl_qcidx;
+ int rx_qcidx;
+} __attribute__ ((__aligned__(64)));
+
+
+struct nfp_net_hw {
+ /* Info from the firmware */
+ uint32_t ver;
+ uint32_t cap;
+ uint32_t max_mtu;
+ uint32_t mtu;
+
+ /* Current values for control */
+ uint32_t ctrl;
+
+ uint8_t *ctrl_bar;
+ uint8_t *tx_bar;
+ uint8_t *rx_bar;
+
+ int stride_rx;
+ int stride_tx;
+
+ __u8 *qcp_cfg;
+
+ uint32_t max_tx_queues;
+ uint32_t max_rx_queues;
+ uint16_t flbufsz;
+ uint16_t device_id;
+ uint16_t vendor_id;
+ uint16_t subsystem_device_id;
+ uint16_t subsystem_vendor_id;
+#if defined(DSTQ_SELECTION)
+#if DSTQ_SELECTION
+ uint16_t device_function;
+#endif
+#endif
+
+ uint8_t mac_addr[ETHER_ADDR_LEN];
+
+ /* Records starting point for counters */
+ struct rte_eth_stats eth_stats_base;
+
+#ifdef NFP_NET_LIBNFP
+ struct nfp_cpp *cpp;
+ struct nfp_cpp_area *ctrl_area;
+ struct nfp_cpp_area *tx_area;
+ struct nfp_cpp_area *rx_area;
+ struct nfp_cpp_area *msix_area;
+#endif
+};
+
+struct nfp_net_adapter {
+ struct nfp_net_hw hw;
+};
+
+#define NFP_NET_DEV_PRIVATE_TO_HW(adapter)\
+ (&((struct nfp_net_adapter *)adapter)->hw)
+
+#endif /* _NFP_NET_H_ */
+/*
+ * Local variables:
+ * c-file-style: "Linux"
+ * indent-tabs-mode: t
+ * End:
+ */
diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile
index d9c5233..f36dc4b 100644
--- a/lib/librte_eal/linuxapp/Makefile
+++ b/lib/librte_eal/linuxapp/Makefile
@@ -34,6 +34,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
ifeq ($(CONFIG_RTE_EAL_IGB_UIO),y)
DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += igb_uio
endif
+ifeq ($(CONFIG_RTE_EAL_NFP_UIO),y)
+DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += nfp_uio
+endif
DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal
ifeq ($(CONFIG_RTE_KNI_KMOD),y)
DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kni
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 9e1909e..4a00f0f 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -142,6 +142,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += -lrte_pmd_af_packet
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null
+_LDLIBS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += -lrte_pmd_nfp
endif # ! $(CONFIG_RTE_BUILD_SHARED_LIB)
--
1.7.9.5
^ permalink raw reply [flat|nested] 5+ messages in thread
* [dpdk-dev] [PATCH 2/3] This patch adds a new UIO driver for Netronome NFP PCI cards.
2015-10-02 11:25 [dpdk-dev] [PATCH 0/3] Support for Netronome´s NFP-6xxx card Alejandro.Lucero
2015-10-02 11:25 ` [dpdk-dev] [PATCH 1/3] This patch adds a PMD driver for Netronome NFP PCI cards Alejandro.Lucero
@ 2015-10-02 11:25 ` Alejandro.Lucero
2015-10-02 11:25 ` [dpdk-dev] [PATCH 3/3] Modifying configuration scripts for Netronome's nfp_uio driver Alejandro.Lucero
2015-10-05 10:52 ` [dpdk-dev] [PATCH 0/3] Support for Netronome´s NFP-6xxx card Mcnamara, John
3 siblings, 0 replies; 5+ messages in thread
From: Alejandro.Lucero @ 2015-10-02 11:25 UTC (permalink / raw)
To: dev
From: "Alejandro.Lucero" <alejandro.lucero@netronome.com>
Netronome's PMD currently supports only Virtual Functions. Future Physical
Function support will require Netronome-specific code here.
Signed-off-by: Alejandro.Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rolf.Neugebauer <rolf.neugebauer@netronome.com>
---
lib/librte_eal/common/include/rte_pci.h | 1 +
lib/librte_eal/linuxapp/eal/eal_pci.c | 4 +
lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 2 +-
lib/librte_eal/linuxapp/nfp_uio/Makefile | 53 +++
lib/librte_eal/linuxapp/nfp_uio/nfp_uio.c | 497 +++++++++++++++++++++++++++++
lib/librte_ether/rte_ethdev.c | 1 +
6 files changed, 557 insertions(+), 1 deletion(-)
create mode 100644 lib/librte_eal/linuxapp/nfp_uio/Makefile
create mode 100644 lib/librte_eal/linuxapp/nfp_uio/nfp_uio.c
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 83e3c28..89baaf6 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -146,6 +146,7 @@ struct rte_devargs;
enum rte_kernel_driver {
RTE_KDRV_UNKNOWN = 0,
RTE_KDRV_IGB_UIO,
+ RTE_KDRV_NFP_UIO,
RTE_KDRV_VFIO,
RTE_KDRV_UIO_GENERIC,
RTE_KDRV_NIC_UIO,
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index bc5b5be..19a93fe 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -137,6 +137,7 @@ pci_map_device(struct rte_pci_device *dev)
#endif
break;
case RTE_KDRV_IGB_UIO:
+ case RTE_KDRV_NFP_UIO:
case RTE_KDRV_UIO_GENERIC:
/* map resources for devices that use uio */
ret = pci_uio_map_resource(dev);
@@ -161,6 +162,7 @@ pci_unmap_device(struct rte_pci_device *dev)
RTE_LOG(ERR, EAL, "Hotplug doesn't support vfio yet\n");
break;
case RTE_KDRV_IGB_UIO:
+ case RTE_KDRV_NFP_UIO:
case RTE_KDRV_UIO_GENERIC:
/* unmap resources for devices that use uio */
pci_uio_unmap_resource(dev);
@@ -357,6 +359,8 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t bus,
dev->kdrv = RTE_KDRV_VFIO;
else if (!strcmp(driver, "igb_uio"))
dev->kdrv = RTE_KDRV_IGB_UIO;
+ else if (!strcmp(driver, "nfp_uio"))
+ dev->kdrv = RTE_KDRV_NFP_UIO;
else if (!strcmp(driver, "uio_pci_generic"))
dev->kdrv = RTE_KDRV_UIO_GENERIC;
else
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index ac50e13..29ec9cb 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -270,7 +270,7 @@ pci_uio_alloc_resource(struct rte_pci_device *dev,
goto error;
}
- if (dev->kdrv == RTE_KDRV_IGB_UIO)
+ if (dev->kdrv == RTE_KDRV_IGB_UIO || dev->kdrv == RTE_KDRV_NFP_UIO)
dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
else {
dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
diff --git a/lib/librte_eal/linuxapp/nfp_uio/Makefile b/lib/librte_eal/linuxapp/nfp_uio/Makefile
new file mode 100644
index 0000000..b9e2f0a
--- /dev/null
+++ b/lib/librte_eal/linuxapp/nfp_uio/Makefile
@@ -0,0 +1,53 @@
+# BSD LICENSE
+#
+# Copyright(c) 2014-2015 Netronome. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# module name and path
+#
+MODULE = nfp_uio
+MODULE_PATH = drivers/net/nfp_uio
+
+#
+# CFLAGS
+#
+MODULE_CFLAGS += -I$(SRCDIR) --param max-inline-insns-single=100
+MODULE_CFLAGS += -I$(RTE_OUTPUT)/include
+MODULE_CFLAGS += -Winline -Wall -Werror
+MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-y := nfp_uio.c
+
+include $(RTE_SDK)/mk/rte.module.mk
diff --git a/lib/librte_eal/linuxapp/nfp_uio/nfp_uio.c b/lib/librte_eal/linuxapp/nfp_uio/nfp_uio.c
new file mode 100644
index 0000000..98192a5
--- /dev/null
+++ b/lib/librte_eal/linuxapp/nfp_uio/nfp_uio.c
@@ -0,0 +1,497 @@
+/*
+ * Copyright (c) 2014, 2015 Netronome Systems, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ * this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution
+ *
+ * 3. Neither the name of the copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived from this
+ * software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * Netronome DPDK uio kernel module
+ */
+
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/uio_driver.h>
+#include <linux/io.h>
+#include <linux/msi.h>
+#include <linux/version.h>
+
+#ifndef PCI_MSIX_ENTRY_SIZE
+#define PCI_MSIX_ENTRY_SIZE 16
+#define PCI_MSIX_ENTRY_LOWER_ADDR 0
+#define PCI_MSIX_ENTRY_UPPER_ADDR 4
+#define PCI_MSIX_ENTRY_DATA 8
+#define PCI_MSIX_ENTRY_VECTOR_CTRL 12
+#define PCI_MSIX_ENTRY_CTRL_MASKBIT 1
+#endif
+
+/* Ideally we should support two types of interrupts:
+ *
+ * - Link Status Change Interrupt
+ * - Exception Interrupt
+ *
+ * But the Linux UIO kernel interface supports only one interrupt per uio device.
+ */
+#define NFP_NUM_MSI_VECTORS 1
+
+/*
+ * A structure describing the private information for a uio device.
+ */
+struct nfp_uio_pci_dev {
+ struct uio_info info;
+ struct pci_dev *pdev;
+ /* spinlock for accessing PCI config space or msix
+ * data in multi tasks/isr
+ */
+ spinlock_t lock;
+
+ /* MSI-X entries, filled in when MSI-X is enabled */
+ struct msix_entry msix_entries[NFP_NUM_MSI_VECTORS];
+};
+
+#define PCI_VENDOR_ID_NETRONOME 0x19ee
+#define PCI_DEVICE_NFP6000_VF_NIC 0x6003
+
+#define RTE_PCI_DEV_ID_DECL_NETRO(vend, dev) {PCI_DEVICE(vend, dev)},
+
+/* PCI device id table */
+static struct pci_device_id nfp_uio_pci_ids[] = {
+RTE_PCI_DEV_ID_DECL_NETRO(PCI_VENDOR_ID_NETRONOME, PCI_DEVICE_NFP6000_VF_NIC)
+{ 0, },
+};
+
+MODULE_DEVICE_TABLE(pci, nfp_uio_pci_ids);
+
+static inline struct nfp_uio_pci_dev *
+nfp_uio_get_uio_pci_dev(struct uio_info *info)
+{
+ return container_of(info, struct nfp_uio_pci_dev, info);
+}
+
+static inline int
+pci_lock(struct pci_dev *pdev)
+{
+ /* Some function names changed between 3.2.0 and 3.3.0... */
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 3, 0)
+ pci_block_user_cfg_access(pdev);
+ return 1;
+#else
+ return pci_cfg_access_trylock(pdev);
+#endif
+}
+
+static inline void
+pci_unlock(struct pci_dev *pdev)
+{
+ /* Some function names changed between 3.2.0 and 3.3.0... */
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 3, 0)
+ pci_unblock_user_cfg_access(pdev);
+#else
+ pci_cfg_access_unlock(pdev);
+#endif
+}
+
+/*
+ * Masks (state == 0) or unmasks (state != 0) MSI-X message generation.
+ */
+static int
+nfp_uio_msix_mask_irq(struct msi_desc *desc, int32_t state)
+{
+ u32 mask_bits = desc->masked;
+ unsigned offset = desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE +
+ PCI_MSIX_ENTRY_VECTOR_CTRL;
+
+ if (state != 0)
+ mask_bits &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
+ else
+ mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
+
+ if (mask_bits != desc->masked) {
+ writel(mask_bits, desc->mask_base + offset);
+ readl(desc->mask_base);
+ desc->masked = mask_bits;
+ }
+
+ return 0;
+}
+
+/**
+ * This function sets/clears the masks for generating LSC interrupts.
+ *
+ * @param udev
+ * The pointer to struct nfp_uio_pci_dev.
+ * @param state
+ * Non-zero to unmask (enable) interrupts, zero to mask them.
+ * @return
+ * -On success, zero value.
+ * -On failure, a negative value.
+ */
+static int
+nfp_uio_set_interrupt_mask(struct nfp_uio_pci_dev *udev, int32_t state)
+{
+ struct pci_dev *pdev = udev->pdev;
+ struct msi_desc *desc;
+
+ /* TODO: Should we change this based on if the firmware advertises
+ NFP_NET_CFG_CTRL_MSIXAUTO? */
+
+ list_for_each_entry(desc, &pdev->msi_list, list) {
+ nfp_uio_msix_mask_irq(desc, state);
+ }
+ return 0;
+}
+
+/**
+ * This is the irqcontrol callback to be registered to uio_info.
+ * It can be used to disable/enable interrupt from user space processes.
+ *
+ * @param info
+ * pointer to uio_info.
+ * @param irq_state
+ * state value. 1 to enable interrupt, 0 to disable interrupt.
+ *
+ * @return
+ * - On success, 0.
+ * - On failure, a negative value.
+ */
+static int
+nfp_uio_pci_irqcontrol(struct uio_info *info, s32 irq_state)
+{
+ unsigned long flags;
+ struct nfp_uio_pci_dev *udev = nfp_uio_get_uio_pci_dev(info);
+ struct pci_dev *pdev = udev->pdev;
+
+ spin_lock_irqsave(&udev->lock, flags);
+ if (!pci_lock(pdev)) {
+ spin_unlock_irqrestore(&udev->lock, flags);
+ return -1;
+ }
+
+ nfp_uio_set_interrupt_mask(udev, irq_state);
+
+ pci_unlock(pdev);
+ spin_unlock_irqrestore(&udev->lock, flags);
+
+ return 0;
+}
+
+/**
+ * This is the interrupt handler, which checks whether the interrupt is for
+ * the right device. If so, it is disabled here and will be enabled later.
+ */
+static irqreturn_t
+nfp_uio_pci_irqhandler(int irq, struct uio_info *info)
+{
+ irqreturn_t ret = IRQ_NONE;
+ unsigned long flags;
+ struct nfp_uio_pci_dev *udev = nfp_uio_get_uio_pci_dev(info);
+ struct pci_dev *pdev = udev->pdev;
+
+ spin_lock_irqsave(&udev->lock, flags);
+ /* block userspace PCI config reads/writes */
+ if (!pci_lock(pdev))
+ goto spin_unlock;
+
+ ret = IRQ_HANDLED;
+
+ /* unblock userspace PCI config reads/writes */
+ pci_unlock(pdev);
+spin_unlock:
+ spin_unlock_irqrestore(&udev->lock, flags);
+ dev_info(&pdev->dev, "irq 0x%x %s\n", irq,
+ (ret == IRQ_HANDLED) ? "handled" : "not handled");
+
+ return ret;
+}
+
+/* Remap pci resources described by bar #pci_bar in uio resource n. */
+static int
+nfp_uio_pci_setup_iomem(struct pci_dev *dev, struct uio_info *info,
+ int n, int pci_bar, const char *name)
+{
+ unsigned long addr, len;
+ void *internal_addr;
+
+ if (ARRAY_SIZE(info->mem) <= n)
+ return -EINVAL;
+
+ addr = pci_resource_start(dev, pci_bar);
+ len = pci_resource_len(dev, pci_bar);
+ if (addr == 0 || len == 0)
+ return -1;
+ internal_addr = ioremap(addr, len);
+ if (!internal_addr)
+ return -1;
+ info->mem[n].name = name;
+ info->mem[n].addr = addr;
+ info->mem[n].internal_addr = internal_addr;
+ info->mem[n].size = len;
+ info->mem[n].memtype = UIO_MEM_PHYS;
+ return 0;
+}
+
+/* Get pci port io resources described by bar #pci_bar in uio resource n. */
+static int
+nfp_uio_pci_setup_ioport(struct pci_dev *dev, struct uio_info *info,
+ int n, int pci_bar, const char *name)
+{
+ unsigned long addr, len;
+
+ if (ARRAY_SIZE(info->port) <= n)
+ return -EINVAL;
+
+ addr = pci_resource_start(dev, pci_bar);
+ len = pci_resource_len(dev, pci_bar);
+ if (addr == 0 || len == 0)
+ return -1;
+
+ info->port[n].name = name;
+ info->port[n].start = addr;
+ info->port[n].size = len;
+ info->port[n].porttype = UIO_PORT_X86;
+
+ return 0;
+}
+
+/* Unmap previously ioremap'd resources */
+static void
+nfp_uio_pci_release_iomem(struct uio_info *info)
+{
+ int i;
+
+ for (i = 0; i < MAX_UIO_MAPS; i++) {
+ if (info->mem[i].internal_addr)
+ iounmap(info->mem[i].internal_addr);
+ }
+}
+
+static int
+nfp_uio_setup_bars(struct pci_dev *dev, struct uio_info *info)
+{
+ int i, iom, iop, ret;
+ unsigned long flags;
+ static const char *bar_names[PCI_STD_RESOURCE_END + 1] = {
+ "BAR0",
+ "BAR1",
+ "BAR2",
+ "BAR3",
+ "BAR4",
+ "BAR5",
+ };
+
+ iom = 0;
+ iop = 0;
+
+ for (i = 0; i != ARRAY_SIZE(bar_names); i++) {
+ if (pci_resource_len(dev, i) == 0 ||
+ pci_resource_start(dev, i) == 0)
+ continue;
+
+ flags = pci_resource_flags(dev, i);
+ if (flags & IORESOURCE_MEM) {
+ ret = nfp_uio_pci_setup_iomem(dev, info, iom, i,
+ bar_names[i]);
+ if (ret != 0)
+ return ret;
+ iom++;
+ } else if (flags & IORESOURCE_IO) {
+ ret = nfp_uio_pci_setup_ioport(dev, info, iop, i,
+ bar_names[i]);
+ if (ret != 0)
+ return ret;
+ iop++;
+ }
+ }
+
+ return (iom != 0) ? ret : -ENOENT;
+}
+
+/* Configure the interrupt. Only MSI-X is tried; there is no MSI fallback. */
+static void
+init_interrupt(struct nfp_uio_pci_dev *udev)
+{
+ int vector;
+
+ for (vector = 0; vector < NFP_NUM_MSI_VECTORS; vector++)
+ udev->msix_entries[vector].entry = vector;
+
+ if (pci_enable_msix(udev->pdev, udev->msix_entries,
+ NFP_NUM_MSI_VECTORS) == 0) {
+ udev->info.irq_flags = 0;
+ udev->info.irq = udev->msix_entries[0].vector;
+ dev_info(&udev->pdev->dev, "%s configured with MSI-X\n",
+ udev->info.name);
+ } else
+ dev_info(&udev->pdev->dev, "%s MSI-X initialization error\n",
+ udev->info.name);
+}
+
+static int
+nfp_uio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
+{
+ struct nfp_uio_pci_dev *udev;
+ void *map_addr;
+ dma_addr_t map_dma_addr;
+
+ udev = kzalloc(sizeof(*udev), GFP_KERNEL);
+ if (!udev)
+ return -ENOMEM;
+
+ /*
+ * enable device: ask low-level code to enable I/O and
+ * memory
+ */
+ if (pci_enable_device(dev)) {
+ dev_err(&dev->dev, "Cannot enable PCI device\n");
+ goto fail_free;
+ }
+
+ /*
+ * reserve device's PCI memory regions for use by this
+ * module
+ */
+ if (pci_request_regions(dev, "nfp_uio")) {
+ dev_err(&dev->dev, "Cannot request regions\n");
+ goto fail_disable;
+ }
+
+ /* enable bus mastering on the device */
+ pci_set_master(dev);
+
+ /* remap IO memory */
+ if (nfp_uio_setup_bars(dev, &udev->info))
+ goto fail_release_iomem;
+
+ /* set 40-bit DMA mask */
+ if (pci_set_dma_mask(dev, DMA_BIT_MASK(40))) {
+ dev_err(&dev->dev, "Cannot set DMA mask\n");
+ goto fail_release_iomem;
+ } else if (pci_set_consistent_dma_mask(dev, DMA_BIT_MASK(40))) {
+ dev_err(&dev->dev, "Cannot set consistent DMA mask\n");
+ goto fail_release_iomem;
+ }
+
+ /* fill uio infos */
+ udev->info.name = "Netronome NFP UIO";
+ udev->info.version = "0.1";
+ udev->info.handler = nfp_uio_pci_irqhandler;
+ udev->info.irqcontrol = nfp_uio_pci_irqcontrol;
+ udev->info.priv = udev;
+ udev->pdev = dev;
+ spin_lock_init(&udev->lock);
+
+ init_interrupt(udev);
+
+ pci_set_drvdata(dev, &udev->info);
+ nfp_uio_pci_irqcontrol(&udev->info, 0);
+
+ /* register uio driver */
+ if (uio_register_device(&dev->dev, &udev->info))
+ goto fail_release_iomem;
+
+ dev_info(&dev->dev, "uio device registered with irq %lx\n",
+ udev->info.irq);
+
+ /* When binding drivers to devices, some old kernels do not
+ * link devices to the iommu identity mapping if iommu=pt is used.
+ *
+ * This is not a problem if the driver later makes some call to
+ * the DMA API, because the mapping can be done then. But DPDK
+ * apps do not use the DMA API at all.
+ *
+ * Do a harmless DMA mapping to attach the device to the iommu
+ * identity mapping.
+ */
+
+ map_addr = dma_zalloc_coherent(&dev->dev, 1024,
+ &map_dma_addr, GFP_KERNEL);
+
+ pr_info("nfp_uio: mapping 1K dma=%#llx host=%p\n",
+ (unsigned long long)map_dma_addr, map_addr);
+
+ dma_free_coherent(&dev->dev, 1024, map_addr, map_dma_addr);
+
+ pr_info("nfp_uio: unmapping 1K dma=%#llx host=%p\n",
+ (unsigned long long)map_dma_addr, map_addr);
+
+ return 0;
+
+fail_release_iomem:
+ nfp_uio_pci_release_iomem(&udev->info);
+ pci_disable_msix(dev);
+ pci_release_regions(dev);
+fail_disable:
+ pci_disable_device(dev);
+fail_free:
+ kfree(udev);
+
+ return -ENODEV;
+}
+
+static void
+nfp_uio_pci_remove(struct pci_dev *dev)
+{
+ struct uio_info *info = pci_get_drvdata(dev);
+
+ BUG_ON(!info);
+ BUG_ON(!info->priv);
+
+ uio_unregister_device(info);
+ nfp_uio_pci_release_iomem(info);
+ pci_disable_msix(dev);
+ pci_release_regions(dev);
+ pci_disable_device(dev);
+ pci_set_drvdata(dev, NULL);
+ kfree(info->priv);
+}
+
+static struct pci_driver nfp_uio_pci_driver = {
+ .name = "nfp_uio",
+ .id_table = nfp_uio_pci_ids,
+ .probe = nfp_uio_pci_probe,
+ .remove = nfp_uio_pci_remove,
+};
+
+static int __init
+nfp_uio_pci_init_module(void)
+{
+ return pci_register_driver(&nfp_uio_pci_driver);
+}
+
+static void __exit
+nfp_uio_pci_exit_module(void)
+{
+ pci_unregister_driver(&nfp_uio_pci_driver);
+}
+
+module_init(nfp_uio_pci_init_module);
+module_exit(nfp_uio_pci_exit_module);
+
+MODULE_DESCRIPTION("UIO driver for Netronome NFP PCI cards");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Netronome Systems <support@netronome.com>");
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index f593f6e..a84bc63 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -513,6 +513,7 @@ rte_eth_dev_is_detachable(uint8_t port_id)
if (rte_eth_devices[port_id].dev_type == RTE_ETH_DEV_PCI) {
switch (rte_eth_devices[port_id].pci_dev->kdrv) {
case RTE_KDRV_IGB_UIO:
+ case RTE_KDRV_NFP_UIO:
case RTE_KDRV_UIO_GENERIC:
case RTE_KDRV_NIC_UIO:
break;
--
1.7.9.5
^ permalink raw reply [flat|nested] 5+ messages in thread
* [dpdk-dev] [PATCH 3/3] Modifying configuration scripts for Netronome's nfp_uio driver.
2015-10-02 11:25 [dpdk-dev] [PATCH 0/3] Support for Netronome´s NFP-6xxx card Alejandro.Lucero
2015-10-02 11:25 ` [dpdk-dev] [PATCH 1/3] This patch adds a PMD driver for Netronome NFP PCI cards Alejandro.Lucero
2015-10-02 11:25 ` [dpdk-dev] [PATCH 2/3] This patch adds a new UIO " Alejandro.Lucero
@ 2015-10-02 11:25 ` Alejandro.Lucero
2015-10-05 10:52 ` [dpdk-dev] [PATCH 0/3] Support for Netronome´s NFP-6xxx card Mcnamara, John
3 siblings, 0 replies; 5+ messages in thread
From: Alejandro.Lucero @ 2015-10-02 11:25 UTC (permalink / raw)
To: dev
From: "Alejandro.Lucero" <alejandro.lucero@netronome.com>
Signed-off-by: Alejandro.Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rolf.Neugebauer <rolf.neugebauer@netronome.com>
---
tools/dpdk_nic_bind.py | 8 ++--
tools/setup.sh | 122 ++++++++++++++++++++++++++++++++++++++----------
2 files changed, 101 insertions(+), 29 deletions(-)
diff --git a/tools/dpdk_nic_bind.py b/tools/dpdk_nic_bind.py
index b7bd877..f7f8a39 100755
--- a/tools/dpdk_nic_bind.py
+++ b/tools/dpdk_nic_bind.py
@@ -43,7 +43,7 @@ ETHERNET_CLASS = "0200"
# Each device within this is itself a dictionary of device properties
devices = {}
# list of supported DPDK drivers
-dpdk_drivers = [ "igb_uio", "vfio-pci", "uio_pci_generic" ]
+dpdk_drivers = [ "igb_uio", "vfio-pci", "uio_pci_generic", "nfp_uio" ]
# command-line arg flags
b_flag = None
@@ -153,7 +153,7 @@ def find_module(mod):
return path
def check_modules():
- '''Checks that igb_uio is loaded'''
+ '''Checks that at least one dpdk module is loaded'''
global dpdk_drivers
fd = file("/proc/modules")
@@ -261,7 +261,7 @@ def get_nic_details():
devices[d]["Active"] = "*Active*"
break;
- # add igb_uio to list of supporting modules if needed
+ # add module to list of supporting modules if needed
if "Module_str" in devices[d]:
for driver in dpdk_drivers:
if driver not in devices[d]["Module_str"]:
@@ -440,7 +440,7 @@ def display_devices(title, dev_list, extra_params = None):
def show_status():
'''Function called when the script is passed the "--status" option. Displays
- to the user what devices are bound to the igb_uio driver, the kernel driver
+ to the user what devices are bound to a dpdk driver, the kernel driver
or to no driver'''
global dpdk_drivers
kernel_drv = []
diff --git a/tools/setup.sh b/tools/setup.sh
index 5a8b2f3..e434ddb 100755
--- a/tools/setup.sh
+++ b/tools/setup.sh
@@ -236,6 +236,52 @@ load_vfio_module()
}
#
+# Unloads nfp_uio.ko.
+#
+remove_nfp_uio_module()
+{
+ echo "Unloading any existing DPDK UIO module"
+ /sbin/lsmod | grep -s nfp_uio > /dev/null
+ if [ $? -eq 0 ] ; then
+ sudo /sbin/rmmod nfp_uio
+ fi
+}
+
+#
+# Loads new nfp_uio.ko (and uio module if needed).
+#
+load_nfp_uio_module()
+{
+ echo "Using RTE_SDK=$RTE_SDK and RTE_TARGET=$RTE_TARGET"
+ if [ ! -f $RTE_SDK/$RTE_TARGET/kmod/nfp_uio.ko ];then
+ echo "## ERROR: Target does not have the DPDK UIO Kernel Module."
+ echo " To fix, please try to rebuild target."
+ return
+ fi
+
+ remove_nfp_uio_module
+
+ /sbin/lsmod | grep -s uio > /dev/null
+ if [ $? -ne 0 ] ; then
+ modinfo uio > /dev/null
+ if [ $? -eq 0 ]; then
+ echo "Loading uio module"
+ sudo /sbin/modprobe uio
+ fi
+ fi
+
+ # UIO may be compiled into kernel, so it may not be an error if it can't
+ # be loaded.
+
+ echo "Loading DPDK UIO module"
+ sudo /sbin/insmod $RTE_SDK/$RTE_TARGET/kmod/nfp_uio.ko
+ if [ $? -ne 0 ] ; then
+ echo "## ERROR: Could not load kmod/nfp_uio.ko."
+ quit
+ fi
+}
+
+#
# Unloads the rte_kni.ko module.
#
remove_kni_module()
@@ -427,10 +473,10 @@ grep_meminfo()
#
show_nics()
{
- if /sbin/lsmod | grep -q -e igb_uio -e vfio_pci; then
+ if /sbin/lsmod | grep -q -e igb_uio -e vfio_pci -e nfp_uio; then
${RTE_SDK}/tools/dpdk_nic_bind.py --status
else
- echo "# Please load the 'igb_uio' or 'vfio-pci' kernel module before "
+ echo "# Please load the 'igb_uio', 'vfio-pci' or 'nfp_uio' kernel module before "
echo "# querying or adjusting NIC device bindings"
fi
}
@@ -471,6 +517,23 @@ bind_nics_to_igb_uio()
}
#
+# Uses dpdk_nic_bind.py to move devices to work with nfp_uio
+#
+bind_nics_to_nfp_uio()
+{
+ if /sbin/lsmod | grep -q nfp_uio ; then
+ ${RTE_SDK}/tools/dpdk_nic_bind.py --status
+ echo ""
+ echo -n "Enter PCI address of device to bind to NFP UIO driver: "
+ read PCI_PATH
+ sudo ${RTE_SDK}/tools/dpdk_nic_bind.py -b nfp_uio $PCI_PATH && echo "OK"
+ else
+ echo "# Please load the 'nfp_uio' kernel module before querying or "
+ echo "# adjusting NIC device bindings"
+ fi
+}
+
+#
# Uses dpdk_nic_bind.py to move devices to work with kernel drivers again
#
unbind_nics()
@@ -513,29 +576,35 @@ step2_func()
TEXT[1]="Insert IGB UIO module"
FUNC[1]="load_igb_uio_module"
- TEXT[2]="Insert VFIO module"
- FUNC[2]="load_vfio_module"
+ TEXT[2]="Insert NFP UIO module"
+ FUNC[2]="load_nfp_uio_module"
- TEXT[3]="Insert KNI module"
- FUNC[3]="load_kni_module"
+ TEXT[3]="Insert VFIO module"
+ FUNC[3]="load_vfio_module"
- TEXT[4]="Setup hugepage mappings for non-NUMA systems"
- FUNC[4]="set_non_numa_pages"
+ TEXT[4]="Insert KNI module"
+ FUNC[4]="load_kni_module"
- TEXT[5]="Setup hugepage mappings for NUMA systems"
- FUNC[5]="set_numa_pages"
+ TEXT[5]="Setup hugepage mappings for non-NUMA systems"
+ FUNC[5]="set_non_numa_pages"
- TEXT[6]="Display current Ethernet device settings"
- FUNC[6]="show_nics"
+ TEXT[6]="Setup hugepage mappings for NUMA systems"
+ FUNC[6]="set_numa_pages"
- TEXT[7]="Bind Ethernet device to IGB UIO module"
- FUNC[7]="bind_nics_to_igb_uio"
+ TEXT[7]="Display current Ethernet device settings"
+ FUNC[7]="show_nics"
- TEXT[8]="Bind Ethernet device to VFIO module"
- FUNC[8]="bind_nics_to_vfio"
+ TEXT[8]="Bind Ethernet device to IGB UIO module"
+ FUNC[8]="bind_nics_to_igb_uio"
- TEXT[9]="Setup VFIO permissions"
- FUNC[9]="set_vfio_permissions"
+ TEXT[9]="Bind Ethernet device to NFP UIO module"
+ FUNC[9]="bind_nics_to_nfp_uio"
+
+ TEXT[10]="Bind Ethernet device to VFIO module"
+ FUNC[10]="bind_nics_to_vfio"
+
+ TEXT[11]="Setup VFIO permissions"
+ FUNC[11]="set_vfio_permissions"
}
#
@@ -574,20 +643,23 @@ step5_func()
TEXT[1]="Uninstall all targets"
FUNC[1]="uninstall_targets"
- TEXT[2]="Unbind NICs from IGB UIO or VFIO driver"
+ TEXT[2]="Unbind NICs from IGB UIO, VFIO or NFP UIO driver"
FUNC[2]="unbind_nics"
TEXT[3]="Remove IGB UIO module"
FUNC[3]="remove_igb_uio_module"
- TEXT[4]="Remove VFIO module"
- FUNC[4]="remove_vfio_module"
+ TEXT[4]="Remove NFP UIO module"
+ FUNC[4]="remove_nfp_uio_module"
+
+ TEXT[5]="Remove VFIO module"
+ FUNC[5]="remove_vfio_module"
- TEXT[5]="Remove KNI module"
- FUNC[5]="remove_kni_module"
+ TEXT[6]="Remove KNI module"
+ FUNC[6]="remove_kni_module"
- TEXT[6]="Remove hugepage mappings"
- FUNC[6]="clear_huge_pages"
+ TEXT[7]="Remove hugepage mappings"
+ FUNC[7]="clear_huge_pages"
}
STEPS[1]="step1_func"
--
1.7.9.5
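A note on the module checks used in tools/setup.sh above: "lsmod | grep uio" matches any module whose name contains "uio", so it cannot tell nfp_uio, igb_uio and uio apart. A minimal sketch of an exact-name check follows. The helper name module_loaded and its optional second argument are hypothetical and are not part of this patch.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: succeeds only when a module
# with exactly the given name is listed. The optional second argument
# (default /proc/modules) names the file to scan, which makes the
# function easy to try out against a saved module list.
module_loaded()
{
	awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }' \
		"${2:-/proc/modules}"
}

if module_loaded nfp_uio; then
	echo "nfp_uio is loaded"
else
	echo "nfp_uio is not loaded"
fi
```

The kernel lists module names with underscores, so vfio-pci would have to be looked up as vfio_pci.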
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dpdk-dev] [PATCH 0/3] Support for Netronome´s NFP-6xxx card
2015-10-02 11:25 [dpdk-dev] [PATCH 0/3] Support for Netronome´s NFP-6xxx card Alejandro.Lucero
` (2 preceding siblings ...)
2015-10-02 11:25 ` [dpdk-dev] [PATCH 3/3] Modifying configuration scripts for Netronome's nfp_uio driver Alejandro.Lucero
@ 2015-10-05 10:52 ` Mcnamara, John
3 siblings, 0 replies; 5+ messages in thread
From: Mcnamara, John @ 2015-10-05 10:52 UTC (permalink / raw)
To: Alejandro.Lucero, dev
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alejandro.Lucero
> Sent: Friday, October 2, 2015 12:26 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH 0/3] Support for Netronome´s NFP-6xxx card
>
> From: "Alejandro.Lucero" <alejandro.lucero@netronome.com>
>
> Alejandro.Lucero (3):
> This patch adds a PMD driver for Netronome NFP PCI cards.
> This patch adds a new UIO driver for Netronome NFP PCI cards.
> Modifying configuration scripts for Netronome's nfp_uio driver.
Hi,
Thank you for submitting this. It is a good addition to DPDK.
Some minor comments to help the patchset get accepted.
It looks like the DPDK coding standard has been applied but check the guidelines for any differences:
http://dpdk.org/doc/guides/contributing/coding_style.html
It is also worth checking the patchset with the Linux kernel checkpatch utility. Not all kernel warnings are relevant to DPDK but most are. The following is an example check:
linux/scripts/checkpatch.pl --ignore PREFER_KERNEL_TYPES,SPLIT_STRING,VOLATILE -q patches*.patch
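To see which patch in the set triggers a warning, the same check can be run one file at a time. The following is only a sketch: the CHECKPATCH variable and the check_patches function are hypothetical names, and the script path assumes a Linux kernel tree checked out in ./linux, as in the command above.

```shell
#!/bin/sh
# Sketch only: CHECKPATCH is a hypothetical variable pointing at the
# checkpatch script inside a local Linux kernel tree.
CHECKPATCH="${CHECKPATCH:-linux/scripts/checkpatch.pl}"
IGNORE="PREFER_KERNEL_TYPES,SPLIT_STRING,VOLATILE"

# Run checkpatch on each file given as an argument and report a summary.
# Returns non-zero if any file produced warnings or errors.
check_patches()
{
	failed=0
	for p in "$@"; do
		if "$CHECKPATCH" --ignore "$IGNORE" -q "$p"; then
			echo "clean:  $p"
		else
			echo "issues: $p"
			failed=$((failed + 1))
		fi
	done
	echo "$failed of $# patch file(s) need attention"
	[ "$failed" -eq 0 ]
}

# Usage: check_patches patches*.patch
```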
The first patch contains a doc patch. These are generally submitted in a separate patch as part of the patchset. Also it needs an addition to the index in doc/guides/nics/index.rst to actually include it in the documentation. Finally the RST has a few minor issues. See the doc guidelines for a how-to on installing and running the doc build system and the suggested formatting for the documentation.
http://dpdk.org/doc/guides/contributing/documentation.html
John.
--
^ permalink raw reply [flat|nested] 5+ messages in thread