From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from ubuntu (host217-39-174-19.in-addr.btopenworld.com [217.39.174.19]) by dpdk.org (Postfix) with SMTP id 0083B2FDD for ; Fri, 16 Oct 2015 12:45:11 +0200 (CEST) Received: by ubuntu (Postfix, from userid 5466) id 9E4D9EA763; Fri, 16 Oct 2015 11:45:24 +0100 (BST) From: "Alejandro.Lucero" To: dev@dpdk.org Date: Fri, 16 Oct 2015 11:45:21 +0100 Message-Id: <1444992324-5504-2-git-send-email-alejandro.lucero@netronome.com> X-Mailer: git-send-email 1.7.9.5 In-Reply-To: <1444992324-5504-1-git-send-email-alejandro.lucero@netronome.com> References: <1444992324-5504-1-git-send-email-alejandro.lucero@netronome.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: [dpdk-dev] =?utf-8?q?=5BPATCH_v3_1/4=5D_nfp=3A_new_poll_mode_driv?= =?utf-8?q?er_for_netronome_nfp6000_card?= X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Oct 2015 10:45:15 -0000 From: "Alejandro.Lucero" This patch adds a new PMD for using PCI Virtual Functions with Netronome nfp6000 card. Signed-off-by: Alejandro.Lucero Signed-off-by: Rolf.Neugebauer --- MAINTAINERS | 9 + config/common_linuxapp | 6 + doc/guides/rel_notes/release_2_2.rst | 8 + drivers/net/Makefile | 1 + drivers/net/nfp/Makefile | 88 ++ drivers/net/nfp/nfp_net.c | 2495 ++++++++++++++++++++++++++++= ++++++ drivers/net/nfp/nfp_net_ctrl.h | 290 ++++ drivers/net/nfp/nfp_net_logs.h | 75 + drivers/net/nfp/nfp_net_pmd.h | 434 ++++++ mk/rte.app.mk | 1 + 10 files changed, 3407 insertions(+) create mode 100644 drivers/net/nfp/Makefile create mode 100644 drivers/net/nfp/nfp_net.c create mode 100644 drivers/net/nfp/nfp_net_ctrl.h create mode 100644 drivers/net/nfp/nfp_net_logs.h create mode 100644 drivers/net/nfp/nfp_net_pmd.h diff --git a/MAINTAINERS b/MAINTAINERS index 080a8e8..1fb2ba6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -167,6 +167,10 @@ FreeBSD UIO M: Bruce Richardson F: lib/librte_eal/bsdapp/nic_uio/ =20 +NFP UIO +M: Alejandro Lucero +F: lib/librte_eal/linuxapp/nfp_uio/ + =20 Core Libraries -------------- @@ -255,6 +259,11 @@ M: Adrien Mazarguil F: drivers/net/mlx4/ F: doc/guides/nics/mlx4.rst =20 +Netronome NFP +M: Alejandro Lucero +F: drivers/net/nfp/ +F: doc/guides/nics/nfp.rst + RedHat virtio M: Huawei Xie M: Changchun Ouyang diff --git a/config/common_linuxapp b/config/common_linuxapp index 0de43d5..d8d6384 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -108,6 +108,7 @@ CONFIG_RTE_LIBEAL_USE_HPET=3Dn CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=3Dn CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=3Dn CONFIG_RTE_EAL_IGB_UIO=3Dy +CONFIG_RTE_EAL_NFP_UIO=3Dy CONFIG_RTE_EAL_VFIO=3Dy CONFIG_RTE_MALLOC_DEBUG=3Dn =20 @@ -238,6 +239,11 @@ CONFIG_RTE_LIBRTE_ENIC_PMD=3Dy CONFIG_RTE_LIBRTE_ENIC_DEBUG=3Dn =20 # +# Compile burst-oriented Netronome PMD driver +# +CONFIG_RTE_LIBRTE_NFP_PMD=3Dy + +# # Compile burst-oriented VIRTIO PMD driver # CONFIG_RTE_LIBRTE_VIRTIO_PMD=3Dy diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/= release_2_2.rst index 5687676..364cca3 100644 --- a/doc/guides/rel_notes/release_2_2.rst +++ b/doc/guides/rel_notes/release_2_2.rst @@ -16,6 +16,10 @@ EAL Fixed issue where the ``rte_epoll_wait()`` function didn't return when= the underlying call to ``epoll_wait()`` timed out. =20 +* **eal/linuxapp: New UIO driver for Netronome=C2=B4s NFP support** + + Netronome=C2=B4s NFP PMD requires some specific configuration. Current= implementation + supports just VFs. Future PF support will require major changes to thi= s driver. =20 Drivers ~~~~~~~ @@ -39,6 +43,10 @@ Drivers =20 Fixed issue with libvirt ``virsh destroy`` not killing the VM. =20 +* **drivers/net: New PMD for Netronome=C2=B4s NFP 6xxx cards** + + PMD supporting VFs with Netronome=C2=B4s NFP card. It requires specifi= c UIO + driver, nfp_uio, and previous configuration using Netronome=C2=B4s BSP= . =20 Libraries ~~~~~~~~~ diff --git a/drivers/net/Makefile b/drivers/net/Makefile index 5ebf963..bc08591 100644 --- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -48,6 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) +=3D ring DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) +=3D virtio DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) +=3D vmxnet3 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) +=3D xenvirt +DIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) +=3D nfp =20 include $(RTE_SDK)/mk/rte.sharelib.mk include $(RTE_SDK)/mk/rte.subdir.mk diff --git a/drivers/net/nfp/Makefile b/drivers/net/nfp/Makefile new file mode 100644 index 0000000..ef74e27 --- /dev/null +++ b/drivers/net/nfp/Makefile @@ -0,0 +1,88 @@ +# BSD LICENSE +# +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyrigh= t +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FO= R +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL= , +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE= , +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON AN= Y +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE US= E +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +include $(RTE_SDK)/mk/rte.vars.mk + +# +# library name +# +LIB =3D librte_pmd_nfp.a + +CFLAGS +=3D -O3 +CFLAGS +=3D $(WERROR_FLAGS) + +# +# Add extra flags for base driver files (also known as shared code) +# to disable warnings +# +ifeq ($(CC), icc) +CFLAGS_BASE_DRIVER =3D -wd593 +else ifeq ($(CC), clang) +CFLAGS_BASE_DRIVER +=3D -Wno-sign-compare +CFLAGS_BASE_DRIVER +=3D -Wno-unused-value +CFLAGS_BASE_DRIVER +=3D -Wno-unused-parameter +CFLAGS_BASE_DRIVER +=3D -Wno-strict-aliasing +CFLAGS_BASE_DRIVER +=3D -Wno-format +CFLAGS_BASE_DRIVER +=3D -Wno-missing-field-initializers +CFLAGS_BASE_DRIVER +=3D -Wno-pointer-to-int-cast +CFLAGS_BASE_DRIVER +=3D -Wno-format-nonliteral +else +CFLAGS_BASE_DRIVER =3D -Wno-sign-compare +CFLAGS_BASE_DRIVER +=3D -Wno-unused-value +CFLAGS_BASE_DRIVER +=3D -Wno-unused-parameter +CFLAGS_BASE_DRIVER +=3D -Wno-strict-aliasing +CFLAGS_BASE_DRIVER +=3D -Wno-format +CFLAGS_BASE_DRIVER +=3D -Wno-missing-field-initializers +CFLAGS_BASE_DRIVER +=3D -Wno-pointer-to-int-cast +CFLAGS_BASE_DRIVER +=3D -Wno-format-nonliteral +CFLAGS_BASE_DRIVER +=3D -Wno-format-security + +ifeq ($(shell test $(GCC_VERSION) -ge 44 && echo 1), 1) +CFLAGS_BASE_DRIVER +=3D -Wno-unused-but-set-variable +endif + +endif +OBJS_BASE_DRIVER=3D$(patsubst %.c,%.o,$(notdir $(wildcard $(RTE_SDK)/lib= /librte_pmd_nfp/*.c))) +$(foreach obj, $(OBJS_BASE_DRIVER), $(eval CFLAGS_$(obj)+=3D$(CFLAGS_BAS= E_DRIVER))) + +VPATH +=3D $(RTE_SDK)/drivers/net/nfp/ + +# +# all source are stored in SRCS-y +# +SRCS-$(CONFIG_RTE_LIBRTE_NFP_PMD) +=3D nfp_net.c + +# this lib depends upon: +DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) +=3D lib/librte_eal lib/librte_ethe= r +DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) +=3D lib/librte_mempool lib/librte_= mbuf +DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) +=3D lib/librte_net lib/librte_mall= oc + +include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c new file mode 100644 index 0000000..2f2d6fc --- /dev/null +++ b/drivers/net/nfp/nfp_net.c @@ -0,0 +1,2495 @@ +/* + * Copyright (c) 2014, 2015 Netronome Systems, Inc. + * All rights reserved. + * + * Small portions derived from code Copyright(c) 2010-2015 Intel Corpora= tion. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions ar= e met: + * + * 1. Redistributions of source code must retain the above copyright not= ice, + * this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution + * + * 3. Neither the name of the copyright holder nor the names of its + * contributors may be used to endorse or promote products derived from= this + * software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "= AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,= THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PU= RPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTOR= S BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSIN= ESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER = IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWIS= E) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED O= F THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +/* + * vim:shiftwidth=3D8:noexpandtab + * + * @file dpdk/pmd/nfp_net.c + * + * Netronome vNIC DPDK Poll-Mode Driver: Main entry point + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "nfp_net_pmd.h" +#include "nfp_net_logs.h" +#include "nfp_net_ctrl.h" + +/* Prototypes */ +static void nfp_net_close(struct rte_eth_dev *dev); +static int nfp_net_configure(struct rte_eth_dev *dev); +static void nfp_net_dev_interrupt_handler(struct rte_intr_handle *handle= , + void *param); +static void nfp_net_dev_interrupt_delayed_handler(void *param); +static int nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu); +static void nfp_net_infos_get(struct rte_eth_dev *dev, + struct rte_eth_dev_info *dev_info); +static int nfp_net_init(struct rte_eth_dev *eth_dev); +static int nfp_net_link_update(struct rte_eth_dev *dev, int wait_to_comp= lete); +static void nfp_net_promisc_enable(struct rte_eth_dev *dev); +static void nfp_net_promisc_disable(struct rte_eth_dev *dev); +static int nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq); +static uint32_t nfp_net_rx_queue_count(struct rte_eth_dev *dev, + uint16_t queue_idx); +static uint16_t nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_p= kts, + uint16_t nb_pkts); +static void nfp_net_rx_queue_release(void *rxq); +static int nfp_net_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queu= e_idx, + uint16_t nb_desc, unsigned int socket_id, + const struct rte_eth_rxconf *rx_conf, + struct rte_mempool *mp); +static int nfp_net_tx_free_bufs(struct nfp_net_txq *txq); +static void nfp_net_tx_queue_release(void *txq); +static int nfp_net_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queu= e_idx, + uint16_t nb_desc, unsigned int socket_id, + const struct rte_eth_txconf *tx_conf); +static int nfp_net_start(struct rte_eth_dev *dev); +static void nfp_net_stats_get(struct rte_eth_dev *dev, + struct rte_eth_stats *stats); +static void nfp_net_stats_reset(struct rte_eth_dev *dev); +static void nfp_net_stop(struct rte_eth_dev *dev); +static uint16_t nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_p= kts, + uint16_t nb_pkts); + +/* + * The offset of the queue controller queues in the PCIe Target. These + * happen to be at the same offset on the NFP6000 and the NFP3200 so + * we use a single macro here. + */ +#define NFP_PCIE_QUEUE(_q) (0x80000 + (0x800 * ((_q) & 0xff))) + +/* Maximum value which can be added to a queue with one transaction */ +#define NFP_QCP_MAX_ADD 0x7f + +#define RTE_MBUF_DMA_ADDR_DEFAULT(mb) \ + (uint64_t)((mb)->buf_physaddr + RTE_PKTMBUF_HEADROOM) + +/* nfp_qcp_ptr - Read or Write Pointer of a queue */ +enum nfp_qcp_ptr { + NFP_QCP_READ_PTR =3D 0, + NFP_QCP_WRITE_PTR +}; + +/* + * nfp_qcp_ptr_add - Add the value to the selected pointer of a queue + * @q: Base address for queue structure + * @ptr: Add to the Read or Write pointer + * @val: Value to add to the queue pointer + * + * If @val is greater than @NFP_QCP_MAX_ADD multiple writes are performe= d. + */ +static inline void +nfp_qcp_ptr_add(__u8 *q, enum nfp_qcp_ptr ptr, __u32 val) +{ + __u32 off; + + if (ptr =3D=3D NFP_QCP_READ_PTR) + off =3D NFP_QCP_QUEUE_ADD_RPTR; + else + off =3D NFP_QCP_QUEUE_ADD_WPTR; + + while (val > NFP_QCP_MAX_ADD) { + nn_writel(rte_cpu_to_le_32(NFP_QCP_MAX_ADD), q + off); + val -=3D NFP_QCP_MAX_ADD; + } + + nn_writel(rte_cpu_to_le_32(val), q + off); +} + +/* + * nfp_qcp_read - Read the current Read/Write pointer value for a queue + * @q: Base address for queue structure + * @ptr: Read or Write pointer + */ +static inline __u32 +nfp_qcp_read(__u8 *q, enum nfp_qcp_ptr ptr) +{ + __u32 off; + __u32 val; + + if (ptr =3D=3D NFP_QCP_READ_PTR) + off =3D NFP_QCP_QUEUE_STS_LO; + else + off =3D NFP_QCP_QUEUE_STS_HI; + + val =3D rte_cpu_to_le_32(nn_readl(q + off)); + + if (ptr =3D=3D NFP_QCP_READ_PTR) + return val & NFP_QCP_QUEUE_STS_LO_READPTR_mask; + else + return val & NFP_QCP_QUEUE_STS_HI_WRITEPTR_mask; +} + +/* + * Functions to read/write from/to Config BAR + * Performs any endian conversion necessary. + */ +static inline __u8 +nn_cfg_readb(struct nfp_net_hw *hw, int off) +{ + return nn_readb(hw->ctrl_bar + off); +} + +static inline void +nn_cfg_writeb(struct nfp_net_hw *hw, int off, __u8 val) +{ + nn_writeb(val, hw->ctrl_bar + off); +} + +static inline __u32 +nn_cfg_readl(struct nfp_net_hw *hw, int off) +{ + return rte_le_to_cpu_32(nn_readl(hw->ctrl_bar + off)); +} + +static inline void +nn_cfg_writel(struct nfp_net_hw *hw, int off, __u32 val) +{ + nn_writel(rte_cpu_to_le_32(val), hw->ctrl_bar + off); +} + +static inline __u64 +nn_cfg_readq(struct nfp_net_hw *hw, int off) +{ + return rte_le_to_cpu_64(nn_readq(hw->ctrl_bar + off)); +} + +static inline void +nn_cfg_writeq(struct nfp_net_hw *hw, int off, __u64 val) +{ + nn_writeq(rte_cpu_to_le_64(val), hw->ctrl_bar + off); +} + +/* Creating memzone for hardware rings. */ +static const struct rte_memzone * +ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name, + uint16_t queue_id, uint32_t ring_size, int socket_id) +{ + char z_name[RTE_MEMZONE_NAMESIZE]; + const struct rte_memzone *mz; + + snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d", + dev->driver->pci_drv.name, + ring_name, dev->data->port_id, queue_id); + + mz =3D rte_memzone_lookup(z_name); + if (mz) + return mz; + + return rte_memzone_reserve_aligned(z_name, ring_size, socket_id, 0, + NFP_MEMZONE_ALIGN); +} + +/* + * Atomically reads link status information from global structure rte_et= h_dev. + * + * @param dev + * - Pointer to the structure rte_eth_dev to read from. + * - Pointer to the buffer to be saved with the link status. + * + * @return + * - On success, zero. + * - On failure, negative value. + */ +static inline int +nfp_net_dev_atomic_read_link_status(struct rte_eth_dev *dev, + struct rte_eth_link *link) +{ + struct rte_eth_link *dst =3D link; + struct rte_eth_link *src =3D &dev->data->dev_link; + + if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst, + *(uint64_t *)src) =3D=3D 0) + return -1; + + return 0; +} + +/* + * Atomically writes the link status information into global + * structure rte_eth_dev. + * + * @param dev + * - Pointer to the structure rte_eth_dev to read from. + * - Pointer to the buffer to be saved with the link status. + * + * @return + * - On success, zero. + * - On failure, negative value. + */ +static inline int +nfp_net_dev_atomic_write_link_status(struct rte_eth_dev *dev, + struct rte_eth_link *link) +{ + struct rte_eth_link *dst =3D &dev->data->dev_link; + struct rte_eth_link *src =3D link; + + if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst, + *(uint64_t *)src) =3D=3D 0) + return -1; + + return 0; +} + +static void +nfp_net_rx_queue_release_mbufs(struct nfp_net_rxq *rxq) +{ + unsigned i; + + if (rxq->rxbufs =3D=3D NULL) + return; + + for (i =3D 0; i < rxq->rx_count; i++) { + if (rxq->rxbufs[i].mbuf) { + rte_pktmbuf_free_seg(rxq->rxbufs[i].mbuf); + rxq->rxbufs[i].mbuf =3D NULL; + } + } +} + +static void +nfp_net_rx_queue_release(void *rx_queue) +{ + struct nfp_net_rxq *rxq =3D rx_queue; + + if (rxq) { + nfp_net_rx_queue_release_mbufs(rxq); + rte_free(rxq->rxbufs); + rte_free(rxq); + } +} + +static void +nfp_net_reset_rx_queue(struct nfp_net_rxq *rxq) +{ + nfp_net_rx_queue_release_mbufs(rxq); + rxq->wr_p =3D 0; + rxq->rd_p =3D 0; + rxq->nb_rx_hold =3D 0; +} + +static void +nfp_net_tx_queue_release_mbufs(struct nfp_net_txq *txq) +{ + unsigned i; + + if (txq->txbufs =3D=3D NULL) + return; + + for (i =3D 0; i < txq->tx_count; i++) { + if (txq->txbufs[i].mbuf) { + rte_pktmbuf_free_seg(txq->txbufs[i].mbuf); + txq->txbufs[i].mbuf =3D NULL; + } + } +} + +static void +nfp_net_tx_queue_release(void *tx_queue) +{ + struct nfp_net_txq *txq =3D tx_queue; + + if (txq) { + nfp_net_tx_queue_release_mbufs(txq); + rte_free(txq->txbufs); + rte_free(txq); + } +} + +static void +nfp_net_reset_tx_queue(struct nfp_net_txq *txq) +{ + nfp_net_tx_queue_release_mbufs(txq); + txq->wr_p =3D 0; + txq->rd_p =3D 0; + txq->tail =3D 0; +} + +static int +__nfp_net_reconfig(struct nfp_net_hw *hw, __u32 update) +{ + int cnt; + __u32 new; + struct timespec wait; + + PMD_DRV_LOG(DEBUG, "Writing to the configuration queue (%p)...\n", + hw->qcp_cfg); + + if (hw->qcp_cfg =3D=3D NULL) + rte_panic("Bad configuration queue pointer\n"); + + nfp_qcp_ptr_add(hw->qcp_cfg, NFP_QCP_WRITE_PTR, 1); + + wait.tv_sec =3D 0; + wait.tv_nsec =3D 1000000; + + PMD_DRV_LOG(DEBUG, "Polling for update ack...\n"); + + /* Poll update field, waiting for NFP to ack the config */ + for (cnt =3D 0; ; cnt++) { + new =3D nn_cfg_readl(hw, NFP_NET_CFG_UPDATE); + if (new =3D=3D 0) + break; + if (new & NFP_NET_CFG_UPDATE_ERR) { + PMD_INIT_LOG(ERR, "Reconfig error: 0x%08x\n", new); + return -1; + } + if (cnt >=3D NFP_NET_POLL_TIMEOUT) { + PMD_INIT_LOG(ERR, "Reconfig timeout for 0x%08x after" + " %dms\n", update, cnt); + rte_panic("Exiting\n"); + } + nanosleep(&wait, 0); /* waiting for a 1ms */ + } + PMD_DRV_LOG(DEBUG, "Ack DONE\n"); + return 0; +} + +/* + * Reconfigure the NIC + * @nn: device to reconfigure + * @ctrl: The value for the ctrl field in the BAR config + * @update: The value for the update field in the BAR config + * + * Write the update word to the BAR and ping the reconfig queue. Then po= ll + * until the firmware has acknowledged the update by zeroing the update = word. + */ +static int +nfp_net_reconfig(struct nfp_net_hw *hw, __u32 ctrl, __u32 update) +{ + __u32 err; + + PMD_DRV_LOG(DEBUG, "nfp_net_reconfig: ctrl=3D%08x update=3D%08x\n", + ctrl, update); + + nn_cfg_writel(hw, NFP_NET_CFG_CTRL, ctrl); + nn_cfg_writel(hw, NFP_NET_CFG_UPDATE, update); + + rte_wmb(); + + err =3D __nfp_net_reconfig(hw, update); + + if (!err) + return 0; + + /* + * Reconfig errors imply situations where they can be handled. + * Otherwise, rte_panic is called inside __nfp_net_reconfig + */ + PMD_INIT_LOG(ERR, "Error nfp_net reconfig for ctrl: %x update: %x\n", + ctrl, update); + return -EIO; +} + +/* + * Configure an Ethernet device. This function must be invoked first + * before any other function in the Ethernet API. This function can + * also be re-invoked when a device is in the stopped state. + */ +static int +nfp_net_configure(struct rte_eth_dev *dev) +{ + struct rte_eth_conf *dev_conf; + struct rte_eth_rxmode *rxmode; + struct rte_eth_txmode *txmode; + uint32_t new_ctrl =3D 0; + uint32_t update =3D 0; + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + /* + * A DPDK app sends info about how many queues to use and how + * those queues need to be configured. This is used by the + * DPDK core and it makes sure no more queues than those + * advertised by the driver are requested. This function is + * called after that internal process + */ + + PMD_INIT_LOG(DEBUG, "Configure\n"); + + dev_conf =3D &dev->data->dev_conf; + rxmode =3D &dev_conf->rxmode; + txmode =3D &dev_conf->txmode; + + /* Checking TX mode */ + if (txmode->mq_mode) { + PMD_INIT_LOG(INFO, "TX mq_mode DCB and VMDq not supported\n"); + return -EINVAL; + } + + /* Checking RX mode */ + if (rxmode->mq_mode & ETH_MQ_RX_RSS) { + if (hw->cap & NFP_NET_CFG_CTRL_RSS) { + update =3D NFP_NET_CFG_UPDATE_RSS; + new_ctrl =3D NFP_NET_CFG_CTRL_RSS; + } else { + PMD_INIT_LOG(INFO, "RSS not supported\n"); + return -EINVAL; + } + } + + if (rxmode->split_hdr_size) { + PMD_INIT_LOG(INFO, "rxmode does not support split header\n"); + return -EINVAL; + } + + if (rxmode->hw_ip_checksum) { + if (hw->cap & NFP_NET_CFG_CTRL_RXCSUM) { + new_ctrl |=3D NFP_NET_CFG_CTRL_RXCSUM; + } else { + PMD_INIT_LOG(INFO, "RXCSUM not supported\n"); + return -EINVAL; + } + } + + if (rxmode->hw_vlan_filter) { + PMD_INIT_LOG(INFO, "VLAN filter not supported\n"); + return -EINVAL; + } + + if (rxmode->hw_vlan_strip) { + if (hw->cap & NFP_NET_CFG_CTRL_RXVLAN) { + new_ctrl |=3D NFP_NET_CFG_CTRL_RXVLAN; + } else { + PMD_INIT_LOG(INFO, "hw vlan strip not supported\n"); + return -EINVAL; + } + } + + if (rxmode->hw_vlan_extend) { + PMD_INIT_LOG(INFO, "VLAN extended not supported\n"); + return -EINVAL; + } + + /* Supporting VLAN insertion by default */ + if (hw->cap & NFP_NET_CFG_CTRL_TXVLAN) + new_ctrl |=3D NFP_NET_CFG_CTRL_TXVLAN; + + if (rxmode->jumbo_frame) + /* this is handled in rte_eth_dev_configure */ + + if (rxmode->hw_strip_crc) { + PMD_INIT_LOG(INFO, "strip CRC not supported\n"); + return -EINVAL; + } + + if (rxmode->enable_scatter) { + PMD_INIT_LOG(INFO, "Scatter not supported\n"); + return -EINVAL; + } + + if (!new_ctrl) + return 0; + + update |=3D NFP_NET_CFG_UPDATE_GEN; + + nn_cfg_writel(hw, NFP_NET_CFG_CTRL, new_ctrl); + if (nfp_net_reconfig(hw, new_ctrl, update) < 0) + return -EIO; + + hw->ctrl =3D new_ctrl; + + return 0; +} + +static void +nfp_net_enable_queues(struct rte_eth_dev *dev) +{ + struct nfp_net_hw *hw; + uint64_t enabled_queues =3D 0; + int i; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + /* Enabling the required TX queues in the device */ + for (i =3D 0; i < dev->data->nb_tx_queues; i++) + enabled_queues |=3D (1 << i); + + nn_cfg_writeq(hw, NFP_NET_CFG_TXRS_ENABLE, enabled_queues); + + enabled_queues =3D 0; + + /* Enabling the required RX queues in the device */ + for (i =3D 0; i < dev->data->nb_rx_queues; i++) + enabled_queues |=3D (1 << i); + + nn_cfg_writeq(hw, NFP_NET_CFG_RXRS_ENABLE, enabled_queues); +} + +static void +nfp_net_disable_queues(struct rte_eth_dev *dev) +{ + struct nfp_net_hw *hw; + uint32_t new_ctrl, update =3D 0; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + nn_cfg_writeq(hw, NFP_NET_CFG_TXRS_ENABLE, 0); + nn_cfg_writeq(hw, NFP_NET_CFG_RXRS_ENABLE, 0); + + new_ctrl =3D hw->ctrl & ~NFP_NET_CFG_CTRL_ENABLE; + update =3D NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING | + NFP_NET_CFG_UPDATE_MSIX; + + if (hw->cap & NFP_NET_CFG_CTRL_RINGCFG) + new_ctrl &=3D ~NFP_NET_CFG_CTRL_RINGCFG; + + /* If an error when reconfig we avoid to change hw state */ + if (nfp_net_reconfig(hw, new_ctrl, update) < 0) + return; + + hw->ctrl =3D new_ctrl; +} + +static int +nfp_net_rx_freelist_setup(struct rte_eth_dev *dev) +{ + int i; + + for (i =3D 0; i < dev->data->nb_rx_queues; i++) { + if (nfp_net_rx_fill_freelist(dev->data->rx_queues[i]) < 0) + return -1; + } + return 0; +} + +static void +nfp_net_params_setup(struct nfp_net_hw *hw) +{ + uint32_t *mac_address; + + nn_cfg_writel(hw, NFP_NET_CFG_MTU, hw->mtu); + nn_cfg_writel(hw, NFP_NET_CFG_FLBUFSZ, hw->flbufsz); + + /* A MAC address is 8 bytes long */ + mac_address =3D (uint32_t *)(hw->mac_addr); + + nn_cfg_writel(hw, NFP_NET_CFG_MACADDR, + rte_cpu_to_be_32(*mac_address)); + nn_cfg_writel(hw, NFP_NET_CFG_MACADDR + 4, + rte_cpu_to_be_32(*(mac_address + 4))); +} + +static void +nfp_net_cfg_queue_setup(struct nfp_net_hw *hw) +{ + hw->qcp_cfg =3D hw->tx_bar + NFP_QCP_QUEUE_ADDR_SZ; +} + +static int +nfp_net_start(struct rte_eth_dev *dev) +{ + uint32_t new_ctrl, update =3D 0; + struct nfp_net_hw *hw; + int ret; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + PMD_INIT_LOG(DEBUG, "Start\n"); + + /* Disabling queues just in case... */ + nfp_net_disable_queues(dev); + + /* Writing configuration parameters in the device */ + nfp_net_params_setup(hw); + + /* Enabling the required queues in the device */ + nfp_net_enable_queues(dev); + + /* Enable device */ + new_ctrl =3D hw->ctrl | NFP_NET_CFG_CTRL_ENABLE | NFP_NET_CFG_UPDATE_MS= IX; + update =3D NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING; + + if (hw->cap & NFP_NET_CFG_CTRL_RINGCFG) + new_ctrl |=3D NFP_NET_CFG_CTRL_RINGCFG; + + nn_cfg_writel(hw, NFP_NET_CFG_CTRL, new_ctrl); + if (nfp_net_reconfig(hw, new_ctrl, update) < 0) + return -EIO; + + /* + * Allocating rte mbuffs for configured rx queues. + * This requires queues being enabled before + */ + if (nfp_net_rx_freelist_setup(dev) < 0) { + ret =3D -ENOMEM; + goto error; + } + + hw->ctrl =3D new_ctrl; + + return 0; + +error: + /* + * An error returned by this function should mean the app + * exiting and then the system releasing all the memory + * allocated even memory coming from hugepages. + * + * The device could be enabled at this point with some queues + * ready for getting packets. This is true if the call to + * nfp_net_rx_freelist_setup() succeeds for some queues but + * fails for subsequent queues. + * + * This should make the app exiting but better if we tell the + * device first. + */ + nfp_net_disable_queues(dev); + + return ret; +} + +/* Stop device: disable rx and tx functions to allow for reconfiguring. = */ +static void +nfp_net_stop(struct rte_eth_dev *dev) +{ + int i; + + PMD_INIT_LOG(DEBUG, "Stop\n"); + + nfp_net_disable_queues(dev); + + /* Clear queues */ + for (i =3D 0; i < dev->data->nb_tx_queues; i++) { + nfp_net_reset_tx_queue( + (struct nfp_net_txq *)dev->data->tx_queues[i]); + } + + for (i =3D 0; i < dev->data->nb_rx_queues; i++) { + nfp_net_reset_rx_queue( + (struct nfp_net_rxq *)dev->data->rx_queues[i]); + } +} + +/* Reset and stop device. The device can not be restarted. */ +static void +nfp_net_close(struct rte_eth_dev *dev) +{ + struct nfp_net_hw *hw; + + PMD_INIT_LOG(DEBUG, "Close\n"); + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + /* + * We assume that the DPDK application is stopping all the + * threads/queues before calling the device close function. + */ + + nfp_net_stop(dev); + + rte_intr_disable(&dev->pci_dev->intr_handle); + nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff); + + /* + * The ixgbe PMD driver disables the pcie master on the + * device. The i40e does not... + */ +} + +static void +nfp_net_promisc_enable(struct rte_eth_dev *dev) +{ + uint32_t new_ctrl, update =3D 0; + struct nfp_net_hw *hw; + + PMD_DRV_LOG(DEBUG, "Promiscuous mode enable\n"); + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + if (!(hw->cap & NFP_NET_CFG_CTRL_PROMISC)) { + PMD_INIT_LOG(INFO, "Promiscuous mode not supported\n"); + return; + } + + if (hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) { + PMD_DRV_LOG(INFO, "Promiscuous mode already enabled\n"); + return; + } + + new_ctrl =3D hw->ctrl | NFP_NET_CFG_CTRL_PROMISC; + update =3D NFP_NET_CFG_UPDATE_GEN; + + /* + * DPDK sets promiscuous mode on just after this call assuming + * it can not fail ... + */ + if (nfp_net_reconfig(hw, new_ctrl, update) < 0) + return; + + hw->ctrl =3D new_ctrl; +} + +static void +nfp_net_promisc_disable(struct rte_eth_dev *dev) +{ + uint32_t new_ctrl, update =3D 0; + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + if ((hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) =3D=3D 0) { + PMD_DRV_LOG(INFO, "Promiscuous mode already disabled\n"); + return; + } + + new_ctrl =3D hw->ctrl & ~NFP_NET_CFG_CTRL_PROMISC; + update =3D NFP_NET_CFG_UPDATE_GEN; + + /* + * DPDK sets promiscuous mode off just before this call + * assuming it can not fail ... + */ + if (nfp_net_reconfig(hw, new_ctrl, update) < 0) + return; + + hw->ctrl =3D new_ctrl; +} + +/* + * return 0 means link status changed, -1 means not changed + * + * Wait to complete is needed as it can take up to 9 seconds to get the = Link + * status. + */ +static int +nfp_net_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_co= mplete) +{ + struct nfp_net_hw *hw; + struct rte_eth_link link, old; + uint32_t nn_link_status; + + PMD_DRV_LOG(DEBUG, "Link update\n"); + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + memset(&old, 0, sizeof(old)); + nfp_net_dev_atomic_read_link_status(dev, &old); + + nn_link_status =3D nn_cfg_readl(hw, NFP_NET_CFG_STS); + + memset(&link, 0, sizeof(struct rte_eth_link)); + + if (nn_link_status & NFP_NET_CFG_STS_LINK) + link.link_status =3D 1; + + link.link_duplex =3D ETH_LINK_FULL_DUPLEX; + /* Other cards can limit the tx and rx rate per VF */ + link.link_speed =3D ETH_LINK_SPEED_40G; + + if (old.link_status !=3D link.link_status) { + nfp_net_dev_atomic_write_link_status(dev, &link); + if (link.link_status) + PMD_DRV_LOG(INFO, "NIC Link is Up\n"); + else + PMD_DRV_LOG(INFO, "NIC Link is Down\n"); + return 0; + } + + return -1; +} + +static void +nfp_net_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) +{ + int i; + struct nfp_net_hw *hw; + struct rte_eth_stats nfp_dev_stats; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + /* RTE_ETHDEV_QUEUE_STAT_CNTRS default value is 16 */ + + /* reading per RX ring stats */ + for (i =3D 0; i < dev->data->nb_rx_queues; i++) { + if (i =3D=3D RTE_ETHDEV_QUEUE_STAT_CNTRS) + break; + + nfp_dev_stats.q_ipackets[i] =3D + nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i)); + + nfp_dev_stats.q_ipackets[i] -=3D + hw->eth_stats_base.q_ipackets[i]; + + nfp_dev_stats.q_ibytes[i] =3D + nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i) + 0x8); + + nfp_dev_stats.q_ibytes[i] -=3D + hw->eth_stats_base.q_ibytes[i]; + } + + /* reading per TX ring stats */ + for (i =3D 0; i < dev->data->nb_tx_queues; i++) { + if (i =3D=3D RTE_ETHDEV_QUEUE_STAT_CNTRS) + break; + + nfp_dev_stats.q_opackets[i] =3D + nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i)); + + nfp_dev_stats.q_opackets[i] -=3D + hw->eth_stats_base.q_opackets[i]; + + nfp_dev_stats.q_obytes[i] =3D + nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i) + 0x8); + + nfp_dev_stats.q_obytes[i] -=3D + hw->eth_stats_base.q_obytes[i]; + } + + nfp_dev_stats.ipackets =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_FRAMES); + + nfp_dev_stats.ipackets -=3D hw->eth_stats_base.ipackets; + + nfp_dev_stats.ibytes =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_OCTETS); + + nfp_dev_stats.ibytes -=3D hw->eth_stats_base.ibytes; + + nfp_dev_stats.opackets =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_FRAMES); + + nfp_dev_stats.opackets -=3D hw->eth_stats_base.opackets; + + nfp_dev_stats.obytes =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_OCTETS); + + nfp_dev_stats.obytes -=3D hw->eth_stats_base.obytes; + + nfp_dev_stats.imcasts =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES); + + nfp_dev_stats.imcasts -=3D hw->eth_stats_base.imcasts; + + /* reading general device stats */ + nfp_dev_stats.ierrors =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_ERRORS); + + nfp_dev_stats.ierrors -=3D hw->eth_stats_base.ierrors; + + nfp_dev_stats.oerrors =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_ERRORS); + + nfp_dev_stats.oerrors -=3D hw->eth_stats_base.oerrors; + + /* Multicast frames received */ + nfp_dev_stats.imcasts =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES); + + nfp_dev_stats.imcasts -=3D hw->eth_stats_base.imcasts; + + /* RX ring mbuf allocation failures */ + nfp_dev_stats.rx_nombuf =3D dev->data->rx_mbuf_alloc_failed; + + nfp_dev_stats.imissed =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_DISCARDS); + + nfp_dev_stats.imissed -=3D hw->eth_stats_base.imissed; + + if (stats) + memcpy(stats, &nfp_dev_stats, sizeof(*stats)); +} + +static void +nfp_net_stats_reset(struct rte_eth_dev *dev) +{ + int i; + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + /* + * hw->eth_stats_base records the per counter starting point. + * Lets update it now + */ + + /* reading per RX ring stats */ + for (i =3D 0; i < dev->data->nb_rx_queues; i++) { + if (i =3D=3D RTE_ETHDEV_QUEUE_STAT_CNTRS) + break; + + hw->eth_stats_base.q_ipackets[i] =3D + nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i)); + + hw->eth_stats_base.q_ibytes[i] =3D + nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i) + 0x8); + } + + /* reading per TX ring stats */ + for (i =3D 0; i < dev->data->nb_tx_queues; i++) { + if (i =3D=3D RTE_ETHDEV_QUEUE_STAT_CNTRS) + break; + + hw->eth_stats_base.q_opackets[i] =3D + nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i)); + + hw->eth_stats_base.q_obytes[i] =3D + nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i) + 0x8); + } + + hw->eth_stats_base.ipackets =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_FRAMES); + + hw->eth_stats_base.ibytes =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_OCTETS); + + hw->eth_stats_base.opackets =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_FRAMES); + + hw->eth_stats_base.obytes =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_OCTETS); + + hw->eth_stats_base.imcasts =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES); + + /* reading general device stats */ + hw->eth_stats_base.ierrors =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_ERRORS); + + hw->eth_stats_base.oerrors =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_ERRORS); + + /* Multicast frames received */ + hw->eth_stats_base.imcasts =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES); + + /* RX ring mbuf allocation failures */ + dev->data->rx_mbuf_alloc_failed =3D 0; + + hw->eth_stats_base.imissed =3D + nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_DISCARDS); +} + +static void +nfp_net_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_= info) +{ + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + dev_info->driver_name =3D dev->driver->pci_drv.name; + dev_info->max_rx_queues =3D (uint16_t)hw->max_rx_queues; + dev_info->max_tx_queues =3D (uint16_t)hw->max_tx_queues; + dev_info->min_rx_bufsize =3D ETHER_MIN_MTU; + dev_info->max_rx_pktlen =3D hw->max_mtu; + /* Next should change when PF support is implemented */ + dev_info->max_mac_addrs =3D 1; + + if (hw->cap & NFP_NET_CFG_CTRL_RXVLAN) + dev_info->rx_offload_capa =3D DEV_RX_OFFLOAD_VLAN_STRIP; + + if (hw->cap & NFP_NET_CFG_CTRL_RXCSUM) + dev_info->rx_offload_capa |=3D DEV_RX_OFFLOAD_IPV4_CKSUM | + DEV_RX_OFFLOAD_UDP_CKSUM | + DEV_RX_OFFLOAD_TCP_CKSUM; + + if (hw->cap & NFP_NET_CFG_CTRL_TXVLAN) + dev_info->tx_offload_capa =3D DEV_TX_OFFLOAD_VLAN_INSERT; + + if (hw->cap & NFP_NET_CFG_CTRL_TXCSUM) + dev_info->tx_offload_capa |=3D DEV_TX_OFFLOAD_IPV4_CKSUM | + DEV_RX_OFFLOAD_UDP_CKSUM | + DEV_RX_OFFLOAD_TCP_CKSUM; + + dev_info->default_rxconf =3D (struct rte_eth_rxconf) { + .rx_thresh =3D { + .pthresh =3D DEFAULT_RX_PTHRESH, + .hthresh =3D DEFAULT_RX_HTHRESH, + .wthresh =3D DEFAULT_RX_WTHRESH, + }, + .rx_free_thresh =3D DEFAULT_RX_FREE_THRESH, + .rx_drop_en =3D 0, + }; + + dev_info->default_txconf =3D (struct rte_eth_txconf) { + .tx_thresh =3D { + .pthresh =3D DEFAULT_TX_PTHRESH, + .hthresh =3D DEFAULT_TX_HTHRESH, + .wthresh =3D DEFAULT_TX_WTHRESH, + }, + .tx_free_thresh =3D DEFAULT_TX_FREE_THRESH, + .tx_rs_thresh =3D DEFAULT_TX_RSBIT_THRESH, + .txq_flags =3D ETH_TXQ_FLAGS_NOMULTSEGS | + ETH_TXQ_FLAGS_NOOFFLOADS, + }; + + dev_info->reta_size =3D NFP_NET_CFG_RSS_ITBL_SZ; +#if RTE_VER_MAJOR =3D=3D 2 && RTE_VER_MINOR >=3D 1 + dev_info->hash_key_size =3D NFP_NET_CFG_RSS_KEY_SZ; +#endif +} + +static uint32_t +nfp_net_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_idx) +{ + struct nfp_net_rxq *rxq; + struct nfp_net_rx_desc *rxds; + __u32 idx; + __u32 count; + + rxq =3D (struct nfp_net_rxq *)dev->data->rx_queues[queue_idx]; + + if (rxq =3D=3D NULL) { + PMD_INIT_LOG(ERR, "Bad queue: %u\n", queue_idx); + return 0; + } + + idx =3D rxq->rd_p % rxq->rx_count; + rxds =3D &rxq->rxds[idx]; + + count =3D 0; + + /* + * Other PMDs are just checking the DD bit in intervals of 4 + * descriptors and counting all four if the first has the DD + * bit on. Of course, this is not accurate but can be good for + * perfomance. But ideally that should be done in descriptors + * chunks belonging to the same cache line + */ + + while (count < rxq->rx_count) { + rxds =3D &rxq->rxds[idx]; + if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) =3D=3D 0) + break; + + count++; + idx++; + + /* Wrapping? */ + if ((idx) =3D=3D rxq->rx_count) + idx =3D 0; + } + + return count; +} + +static void +nfp_net_dev_link_status_print(struct rte_eth_dev *dev) +{ + struct rte_eth_link link; + + memset(&link, 0, sizeof(link)); + nfp_net_dev_atomic_read_link_status(dev, &link); + if (link.link_status) + RTE_LOG(INFO, PMD, "Port %d: Link Up - speed %u Mbps - %s\n", + (int)(dev->data->port_id), (unsigned)link.link_speed, + link.link_duplex =3D=3D ETH_LINK_FULL_DUPLEX + ? "full-duplex" : "half-duplex"); + else + RTE_LOG(INFO, PMD, " Port %d: Link Down\n", + (int)(dev->data->port_id)); + + RTE_LOG(INFO, PMD, "PCI Address: %04d:%02d:%02d:%d\n", + dev->pci_dev->addr.domain, dev->pci_dev->addr.bus, + dev->pci_dev->addr.devid, dev->pci_dev->addr.function); +} + +/* Interrupt configuration and handling */ + +/* + * nfp_net_irq_unmask - Unmask an interrupt + * + * If MSI-X auto-masking is enabled clear the mask bit, otherwise + * clear the ICR for the entry. + */ +static void +nfp_net_irq_unmask(struct rte_eth_dev *dev) +{ + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + if (hw->ctrl & NFP_NET_CFG_CTRL_MSIXAUTO) { + /* If MSI-X auto-masking is used, clear the entry */ + rte_wmb(); + rte_intr_enable(&dev->pci_dev->intr_handle); + } else { + /* Make sure all updates are written before un-masking */ + rte_wmb(); + nn_cfg_writeb(hw, NFP_NET_CFG_ICR(NFP_NET_IRQ_LSC_IDX), + NFP_NET_CFG_ICR_UNMASKED); + } +} + +static void +nfp_net_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handl= e, + void *param) +{ + int64_t timeout; + struct rte_eth_link link; + struct rte_eth_dev *dev =3D (struct rte_eth_dev *)param; + + PMD_DRV_LOG(DEBUG, "We got a LSC interrupt!!!\n"); + + /* get the link status */ + memset(&link, 0, sizeof(link)); + nfp_net_dev_atomic_read_link_status(dev, &link); + + nfp_net_link_update(dev, 0); + + /* likely to up */ + if (!link.link_status) { + /* handle it 1 sec later, wait it being stable */ + timeout =3D NFP_NET_LINK_UP_CHECK_TIMEOUT; + /* likely to down */ + } else { + /* handle it 4 sec later, wait it being stable */ + timeout =3D NFP_NET_LINK_DOWN_CHECK_TIMEOUT; + } + + if (rte_eal_alarm_set(timeout * 1000, + nfp_net_dev_interrupt_delayed_handler, + (void *)dev) < 0) { + RTE_LOG(ERR, PMD, "Error setting alarm"); + /* Unmasking */ + nfp_net_irq_unmask(dev); + } +} + +/* + * Interrupt handler which shall be registered for alarm callback for de= layed + * handling specific interrupt to wait for the stable nic state. As the = NIC + * interrupt state is not stable for nfp after link is just down, it nee= ds + * to wait 4 seconds to get the stable status. + * + * @param handle Pointer to interrupt handle. + * @param param The address of parameter (struct rte_eth_dev *) + * + * @return void + */ +static void +nfp_net_dev_interrupt_delayed_handler(void *param) +{ + struct rte_eth_dev *dev =3D (struct rte_eth_dev *)param; + + nfp_net_link_update(dev, 0); + _rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_LSC); + + nfp_net_dev_link_status_print(dev); + + /* Unmasking */ + nfp_net_irq_unmask(dev); +} + +static int +nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu) +{ + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + /* check that mtu is within the allowed range */ + if ((mtu < ETHER_MIN_MTU) || ((uint32_t)mtu > hw->max_mtu)) + return -EINVAL; + + /* switch to jumbo mode if needed */ + if ((uint32_t)mtu > ETHER_MAX_LEN) + dev->data->dev_conf.rxmode.jumbo_frame =3D 1; + else + dev->data->dev_conf.rxmode.jumbo_frame =3D 0; + + /* update max frame size */ + dev->data->dev_conf.rxmode.max_rx_pkt_len =3D (uint32_t)mtu; + + /* writing to configuration space */ + nn_cfg_writel(hw, NFP_NET_CFG_MTU, (uint32_t)mtu); + + hw->mtu =3D mtu; + + return 0; +} + +static int +nfp_net_rx_queue_setup(struct rte_eth_dev *dev, + uint16_t queue_idx, uint16_t nb_desc, + unsigned int socket_id, + const struct rte_eth_rxconf *rx_conf, + struct rte_mempool *mp) +{ + const struct rte_memzone *tz; + struct nfp_net_rxq *rxq; + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + PMD_INIT_FUNC_TRACE(); + + /* Validating number of descriptors */ + if (((nb_desc * sizeof(struct nfp_net_rx_desc)) % 128) !=3D 0 || + (nb_desc > NFP_NET_MAX_RX_DESC) || + (nb_desc < NFP_NET_MIN_RX_DESC)) { + RTE_LOG(ERR, PMD, "Wrong nb_desc value\n"); + return (-EINVAL); + } + + /* + * Free memory prior to re-allocation if needed. This is the case after + * calling nfp_net_stop + */ + if (dev->data->rx_queues[queue_idx]) { + nfp_net_rx_queue_release(dev->data->rx_queues[queue_idx]); + dev->data->rx_queues[queue_idx] =3D NULL; + } + + /* Allocating rx queue data structure */ + rxq =3D rte_zmalloc_socket("ethdev RX queue", sizeof(struct nfp_net_rxq= ), + RTE_CACHE_LINE_SIZE, socket_id); + if (rxq =3D=3D NULL) + return (-ENOMEM); + + /* Hw queues mapping based on firmware confifguration */ + rxq->qidx =3D queue_idx; + rxq->fl_qcidx =3D queue_idx * hw->stride_rx; + rxq->rx_qcidx =3D rxq->fl_qcidx + (hw->stride_rx - 1); + rxq->qcp_fl =3D hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->fl_qcidx); + rxq->qcp_rx =3D hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->rx_qcidx); + + /* + * Tracking mbuf size for detecting a potential mbuf overflow due to + * NFP_NET_RX_OFFSET + */ + rxq->mem_pool =3D mp; + rxq->mbuf_size =3D rxq->mem_pool->elt_size; + rxq->mbuf_size -=3D (sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM); + hw->flbufsz =3D rxq->mbuf_size; + + rxq->rx_count =3D nb_desc; + rxq->port_id =3D dev->data->port_id; + rxq->rx_free_thresh =3D rx_conf->rx_free_thresh; + rxq->drop_en =3D rx_conf->rx_drop_en; + + /* + * Allocate RX ring hardware descriptors. A memzone large enough to + * handle the maximum ring size is allocated in order to allow for + * resizing in later calls to the queue setup function. + */ + tz =3D ring_dma_zone_reserve(dev, "rx_ring", queue_idx, + sizeof(struct nfp_net_rx_desc) * + NFP_NET_MAX_RX_DESC, socket_id); + + if (tz =3D=3D NULL) { + RTE_LOG(ERR, PMD, "Error allocatig rx dma\n"); + nfp_net_rx_queue_release(rxq); + return (-ENOMEM); + } + + /* Saving physical and virtual addresses for the RX ring */ + rxq->dma =3D (uint64_t)tz->phys_addr; + rxq->rxds =3D (struct nfp_net_rx_desc *)tz->addr; + + /* mbuf pointers array for referencing mbufs linked to RX descriptors *= / + rxq->rxbufs =3D rte_zmalloc_socket("rxq->rxbufs", + sizeof(*rxq->rxbufs) * nb_desc, + RTE_CACHE_LINE_SIZE, socket_id); + if (rxq->rxbufs =3D=3D NULL) { + nfp_net_rx_queue_release(rxq); + return (-ENOMEM); + } + + PMD_RX_LOG(DEBUG, "rxbufs=3D%p hw_ring=3D%p dma_addr=3D0x%" PRIx64 "\n"= , + rxq->rxbufs, rxq->rxds, (unsigned long int)rxq->dma); + + nfp_net_reset_rx_queue(rxq); + + dev->data->rx_queues[queue_idx] =3D rxq; + rxq->hw =3D hw; + + /* + * Telling the HW about the physical address of the RX ring and number + * of descriptors in log2 format + */ + nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(queue_idx), rxq->dma); + nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(queue_idx), log2(nb_desc)); + + return 0; +} + +static int +nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq) +{ + struct nfp_net_rx_buff *rxe =3D rxq->rxbufs; + uint64_t dma_addr; + unsigned i; + + PMD_RX_LOG(DEBUG, "nfp_net_rx_fill_freelist for %u descriptors\n", + rxq->rx_count); + + for (i =3D 0; i < rxq->rx_count; i++) { + struct nfp_net_rx_desc *rxd; + struct rte_mbuf *mbuf =3D rte_pktmbuf_alloc(rxq->mem_pool); + + if (mbuf =3D=3D NULL) { + RTE_LOG(ERR, PMD, "RX mbuf alloc failed queue_id=3D%u\n", + (unsigned)rxq->qidx); + return (-ENOMEM); + } + + dma_addr =3D rte_cpu_to_le_64(RTE_MBUF_DMA_ADDR_DEFAULT(mbuf)); + + rxd =3D &rxq->rxds[i]; + rxd->fld.dd =3D 0; + rxd->fld.dma_addr_hi =3D (dma_addr >> 32) & 0xff; + rxd->fld.dma_addr_lo =3D dma_addr & 0xffffffff; + rxe[i].mbuf =3D mbuf; + PMD_RX_LOG(DEBUG, "[%d]: %" PRIx64 "\n", i, dma_addr); + + rxq->wr_p++; + } + + /* Make sure all writes are flushed before telling the hardware */ + rte_wmb(); + + /* Not advertising the whole ring as the firmware gets confused if so *= / + PMD_RX_LOG(DEBUG, "Increment FL write pointer in %u\n", + rxq->rx_count - 1); + + nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, rxq->rx_count - 1); + + return 0; +} + +static int +nfp_net_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx, + uint16_t nb_desc, unsigned int socket_id, + const struct rte_eth_txconf *tx_conf) +{ + const struct rte_memzone *tz; + struct nfp_net_txq *txq; + uint16_t tx_free_thresh; + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + PMD_INIT_FUNC_TRACE(); + + /* Validating number of descriptors */ + if (((nb_desc * sizeof(struct nfp_net_tx_desc)) % 128) !=3D 0 || + (nb_desc > NFP_NET_MAX_TX_DESC) || + (nb_desc < NFP_NET_MIN_TX_DESC)) { + RTE_LOG(ERR, PMD, "Wrong nb_desc value\n"); + return -EINVAL; + } + + tx_free_thresh =3D (uint16_t)((tx_conf->tx_free_thresh) ? + tx_conf->tx_free_thresh : + DEFAULT_TX_FREE_THRESH); + + if (tx_free_thresh > (nb_desc)) { + RTE_LOG(ERR, PMD, + "tx_free_thresh must be less than the number of TX " + "descriptors. (tx_free_thresh=3D%u port=3D%d " + "queue=3D%d)\n", (unsigned int)tx_free_thresh, + (int)dev->data->port_id, (int)queue_idx); + return -(EINVAL); + } + + /* + * Free memory prior to re-allocation if needed. This is the case after + * calling nfp_net_stop + */ + if (dev->data->tx_queues[queue_idx]) { + PMD_TX_LOG(DEBUG, "Freeing memory prior to re-allocation %d\n", + queue_idx); + nfp_net_tx_queue_release(dev->data->tx_queues[queue_idx]); + dev->data->tx_queues[queue_idx] =3D NULL; + } + + /* Allocating tx queue data structure */ + txq =3D rte_zmalloc_socket("ethdev TX queue", sizeof(struct nfp_net_txq= ), + RTE_CACHE_LINE_SIZE, socket_id); + if (txq =3D=3D NULL) { + RTE_LOG(ERR, PMD, "Error allocating tx dma\n"); + return (-ENOMEM); + } + + /* + * Allocate TX ring hardware descriptors. A memzone large enough to + * handle the maximum ring size is allocated in order to allow for + * resizing in later calls to the queue setup function. + */ + tz =3D ring_dma_zone_reserve(dev, "tx_ring", queue_idx, + sizeof(struct nfp_net_tx_desc) * + NFP_NET_MAX_TX_DESC, socket_id); + if (tz =3D=3D NULL) { + RTE_LOG(ERR, PMD, "Error allocating tx dma\n"); + nfp_net_tx_queue_release(txq); + return (-ENOMEM); + } + + txq->tx_count =3D nb_desc; + txq->tx_free_thresh =3D tx_free_thresh; + txq->tx_pthresh =3D tx_conf->tx_thresh.pthresh; + txq->tx_hthresh =3D tx_conf->tx_thresh.hthresh; + txq->tx_wthresh =3D tx_conf->tx_thresh.wthresh; + + /* queue mapping based on firmware configuration */ + txq->qidx =3D queue_idx; + txq->tx_qcidx =3D queue_idx * hw->stride_tx; + txq->qcp_q =3D hw->tx_bar + NFP_QCP_QUEUE_OFF(txq->tx_qcidx); + + txq->port_id =3D dev->data->port_id; + txq->txq_flags =3D tx_conf->txq_flags; + + /* Saving physical and virtual addresses for the TX ring */ + txq->dma =3D (uint64_t)tz->phys_addr; + txq->txds =3D (struct nfp_net_tx_desc *)tz->addr; + + /* mbuf pointers array for referencing mbufs linked to TX descriptors *= / + txq->txbufs =3D rte_zmalloc_socket("txq->txbufs", + sizeof(*txq->txbufs) * nb_desc, + RTE_CACHE_LINE_SIZE, socket_id); + if (txq->txbufs =3D=3D NULL) { + nfp_net_tx_queue_release(txq); + return (-ENOMEM); + } + PMD_TX_LOG(DEBUG, "txbufs=3D%p hw_ring=3D%p dma_addr=3D0x%" PRIx64 "\n"= , + txq->txbufs, txq->txds, (unsigned long int)txq->dma); + + nfp_net_reset_tx_queue(txq); + + dev->data->tx_queues[queue_idx] =3D txq; + txq->hw =3D hw; + + /* + * Telling the HW about the physical address of the TX ring and number + * of descriptors in log2 format + */ + nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(queue_idx), txq->dma); + nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(queue_idx), log2(nb_desc)); + + return 0; +} + +/* nfp_net_tx_cksum - Set TX CSUM offload flags in TX descriptor */ +static inline void +nfp_net_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_tx_desc *txd, + struct rte_mbuf *mb) +{ + uint16_t ol_flags; + struct nfp_net_hw *hw =3D txq->hw; + + if (!(hw->cap & NFP_NET_CFG_CTRL_TXCSUM)) + return; + + ol_flags =3D mb->ol_flags; + + /* IPv6 does not need checksum */ + if (ol_flags & PKT_TX_IP_CKSUM) + txd->flags |=3D PCIE_DESC_TX_IP4_CSUM; + + switch (ol_flags & PKT_TX_L4_MASK) { + case PKT_TX_UDP_CKSUM: + txd->flags |=3D PCIE_DESC_TX_UDP_CSUM; + break; + case PKT_TX_TCP_CKSUM: + txd->flags |=3D PCIE_DESC_TX_TCP_CSUM; + break; + } + + txd->flags |=3D PCIE_DESC_TX_CSUM; +} + +/* nfp_net_rx_cksum - set mbuf checksum flags based on RX descriptor fla= gs */ +static inline void +nfp_net_rx_cksum(struct nfp_net_rxq *rxq, struct nfp_net_rx_desc *rxd, + struct rte_mbuf *mb) +{ + struct nfp_net_hw *hw =3D rxq->hw; + + if (!(hw->ctrl & NFP_NET_CFG_CTRL_RXCSUM)) + return; + + /* If IPv4 and IP checksum error, fail */ + if ((rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM) && + !(rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM_OK)) + mb->ol_flags |=3D PKT_RX_IP_CKSUM_BAD; + + /* If neither UDP nor TCP return */ + if (!(rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM) && + !(rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM)) + return; + + if ((rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM) && + !(rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM_OK)) + mb->ol_flags |=3D PKT_RX_L4_CKSUM_BAD; + + if ((rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM) && + !(rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM_OK)) + mb->ol_flags |=3D PKT_RX_L4_CKSUM_BAD; +} + +#define NFP_HASH_OFFSET ((uint8_t *)mbuf->buf_addr + mbuf->data_off= - 4) +#define NFP_HASH_TYPE_OFFSET ((uint8_t *)mbuf->buf_addr + mbuf->data_off= - 8) + +/* + * nfp_net_set_hash - Set mbuf hash data + * + * The RSS hash and hash-type are pre-pended to the packet data. + * Extract and decode it and set the mbuf fields. + */ +static inline void +nfp_net_set_hash(struct nfp_net_rxq *rxq, struct nfp_net_rx_desc *rxd, + struct rte_mbuf *mbuf) +{ + uint32_t hash; + uint32_t hash_type; + struct nfp_net_hw *hw =3D rxq->hw; + + if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS)) + return; + + if (!(rxd->rxd.flags & PCIE_DESC_RX_RSS)) + return; + + hash =3D rte_be_to_cpu_32(*(uint32_t *)NFP_HASH_OFFSET); + hash_type =3D rte_be_to_cpu_32(*(uint32_t *)NFP_HASH_TYPE_OFFSET); + + /* + * hash type is sharing the same word with input port info + * 31-8: input port + * 7:0: hash type + */ + hash_type &=3D 0xff; + mbuf->hash.rss =3D hash; + mbuf->ol_flags |=3D PKT_RX_RSS_HASH; + + switch (hash_type) { + case NFP_NET_RSS_IPV4: + mbuf->packet_type |=3D RTE_PTYPE_INNER_L3_IPV4; + break; + case NFP_NET_RSS_IPV6: + mbuf->packet_type |=3D RTE_PTYPE_INNER_L3_IPV6; + break; + case NFP_NET_RSS_IPV6_EX: + mbuf->packet_type |=3D RTE_PTYPE_INNER_L3_IPV6_EXT; + break; + default: + mbuf->packet_type |=3D RTE_PTYPE_INNER_L4_MASK; + } +} + +/* nfp_net_check_port - Set mbuf in_port field */ +static void +nfp_net_check_port(struct nfp_net_rx_desc *rxd, struct rte_mbuf *mbuf) +{ + uint32_t port; + + if (!(rxd->rxd.flags & PCIE_DESC_RX_INGRESS_PORT)) { + mbuf->port =3D 0; + return; + } + + port =3D rte_be_to_cpu_32(*(uint32_t *)((uint8_t *)mbuf->buf_addr + + mbuf->data_off - 8)); + + /* + * hash type is sharing the same word with input port info + * 31-8: input port + * 7:0: hash type + */ + port =3D (uint8_t)(port >> 8); + mbuf->port =3D port; +} + +static inline void +nfp_net_mbuf_alloc_failed(struct nfp_net_rxq *rxq) +{ + rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed++; +} + +#define NFP_DESC_META_LEN(d) (d->rxd.meta_len_dd & PCIE_DESC_RX_META_LEN= _MASK) + +/* + * RX path design: + * + * There are some decissions to take: + * 1) How to check DD RX descriptors bit + * 2) How and when to allocate new mbufs + * + * Current implementation checks just one single DD bit each loop. As ea= ch + * descriptor is 8 bytes, it is likely a good idea to check descriptors = in + * a single cache line instead. Tests with this change have not shown an= y + * performance improvement but it requires further investigation. For ex= ample, + * depending on which descriptor is next, the number of descriptors coul= d be + * less than 8 for just checking those in the same cache line. This impl= ies + * extra work which could be counterproductive by itself. Indeed, last f= irmware + * changes are just doing this: writing several descriptors with the DD = bit + * for saving PCIe bandwidth and DMA operations from the NFP. + * + * Mbuf allocation is done when a new packet is received. Then the descr= iptor + * is automatically linked with the new mbuf and the old one is given to= the + * user. The main drawback with this design is mbuf allocation is heavie= r than + * using bulk allocations allowed by DPDK with rte_mempool_get_bulk. Fro= m the + * cache point of view it does not seem allocating the mbuf early on as = we are + * doing now have any benefit at all. Again, tests with this change have= not + * shown any improvement. Also, rte_mempool_get_bulk returns all or noth= ing + * so looking at the implications of this type of allocation should be s= tudied + * deeply + */ + +static uint16_t +nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb= _pkts) +{ + struct nfp_net_rxq *rxq; + struct nfp_net_rx_desc *rxds; + struct nfp_net_rx_buff *rxb; + struct nfp_net_hw *hw; + struct rte_mbuf *mb; + struct rte_mbuf *new_mb; + int idx; + uint16_t nb_hold; + uint64_t dma_addr; + int avail; + + rxq =3D rx_queue; + if (unlikely(rxq =3D=3D NULL)) { + /* + * DPDK just checks the queue is lower than max queues + * enabled. But the queue needs to be configured + */ + RTE_LOG(ERR, PMD, "RX Bad queue\n"); + return -EINVAL; + } + + hw =3D rxq->hw; + avail =3D 0; + nb_hold =3D 0; + + while (avail < nb_pkts) { + idx =3D rxq->rd_p % rxq->rx_count; + + rxb =3D &rxq->rxbufs[idx]; + if (unlikely(rxb =3D=3D NULL)) { + RTE_LOG(ERR, PMD, "rxb does not exist!\n"); + break; + } + + /* + * Memory barrier to ensure that we won't do other + * reads before the DD bit. + */ + rte_rmb(); + + rxds =3D &rxq->rxds[idx]; + if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) =3D=3D 0) + break; + + /* + * We got a packet. Let's alloc a new mbuff for refilling the + * free descriptor ring as soon as possible + */ + new_mb =3D rte_pktmbuf_alloc(rxq->mem_pool); + if (unlikely(new_mb =3D=3D NULL)) { + RTE_LOG(DEBUG, PMD, "RX mbuf alloc failed port_id=3D%u " + "queue_id=3D%u\n", (unsigned)rxq->port_id, + (unsigned)rxq->qidx); + nfp_net_mbuf_alloc_failed(rxq); + break; + } + + nb_hold++; + + /* + * Grab the mbuff and refill the descriptor with the + * previously allocated mbuff + */ + mb =3D rxb->mbuf; + rxb->mbuf =3D new_mb; + + PMD_RX_LOG(DEBUG, "Packet len: %u, mbuf_size: %u\n", + rxds->rxd.data_len, rxq->mbuf_size); + + /* Size of this segment */ + mb->data_len =3D rxds->rxd.data_len - NFP_DESC_META_LEN(rxds); + /* Size of the whole packet. We just support 1 segment */ + mb->pkt_len =3D rxds->rxd.data_len - NFP_DESC_META_LEN(rxds); + + if (unlikely((mb->data_len + NFP_NET_RX_OFFSET) > + rxq->mbuf_size)) { + /* + * This should not happen and the user has the + * responsibility of avoiding it. But we have + * to give some info about the error + */ + RTE_LOG(ERR, PMD, + "mbuf overflow likely due to NFP_NET_RX_OFFSET\n" + "\t\tYour mbuf size should have extra space for" + " NFP_NET_RX_OFFSET=3D%u bytes.\n" + "\t\tCurrently you just have %u bytes available" + " but the received packet is %u bytes long", + NFP_NET_RX_OFFSET, + rxq->mbuf_size - NFP_NET_RX_OFFSET, + mb->data_len); + return -EINVAL; + } + + /* Filling the received mbuff with packet info */ + mb->data_off =3D RTE_PKTMBUF_HEADROOM + NFP_NET_RX_OFFSET; + + /* No scatter mode supported */ + mb->nb_segs =3D 1; + mb->next =3D NULL; + + /* Checking the RSS flag */ + nfp_net_set_hash(rxq, rxds, mb); + + /* Checking the checksum flag */ + nfp_net_rx_cksum(rxq, rxds, mb); + + /* Checking the port flag */ + nfp_net_check_port(rxds, mb); + + if ((rxds->rxd.flags & PCIE_DESC_RX_VLAN) && + (hw->ctrl & NFP_NET_CFG_CTRL_RXVLAN)) { + mb->vlan_tci =3D rte_cpu_to_le_32(rxds->rxd.vlan); + mb->ol_flags |=3D PKT_RX_VLAN_PKT; + } + + /* Adding the mbuff to the mbuff array passed by the app */ + rx_pkts[avail++] =3D mb; + + /* Now resetting and updating the descriptor */ + rxds->vals[0] =3D 0; + rxds->vals[1] =3D 0; + dma_addr =3D rte_cpu_to_le_64(RTE_MBUF_DMA_ADDR_DEFAULT(new_mb)); + rxds->fld.dd =3D 0; + rxds->fld.dma_addr_hi =3D (dma_addr >> 32) & 0xff; + rxds->fld.dma_addr_lo =3D dma_addr & 0xffffffff; + + rxq->rd_p++; + } + + if (nb_hold =3D=3D 0) + return nb_hold; + + PMD_RX_LOG(DEBUG, "RX port_id=3D%u queue_id=3D%u, %d packets received\= n", + (unsigned)rxq->port_id, (unsigned)rxq->qidx, nb_hold); + + nb_hold +=3D rxq->nb_rx_hold; + + /* + * FL descriptors needs to be written before incrementing the + * FL queue WR pointer + */ + rte_wmb(); + if (nb_hold > rxq->rx_free_thresh) { + PMD_RX_LOG(DEBUG, "port=3D%u queue=3D%u nb_hold=3D%u avail=3D%u\n", + (unsigned)rxq->port_id, (unsigned)rxq->qidx, + (unsigned)nb_hold, (unsigned)avail); + nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, nb_hold); + nb_hold =3D 0; + } + rxq->nb_rx_hold =3D nb_hold; + + return avail; +} + +/* + * nfp_net_tx_free_bufs - Check for descriptors with a complete + * status + * @txq: TX queue to work with + * Returns number of descriptors freed + */ +int +nfp_net_tx_free_bufs(struct nfp_net_txq *txq) +{ + __u32 qcp_rd_p; + int todo; + + PMD_TX_LOG(DEBUG, "queue %u. Check for descriptor with a complete" + " status\n", txq->qidx); + + /* Work out how many packets have been sent */ + qcp_rd_p =3D nfp_qcp_read(txq->qcp_q, NFP_QCP_READ_PTR); + + if (qcp_rd_p =3D=3D txq->qcp_rd_p) { + PMD_TX_LOG(DEBUG, "queue %u: It seems harrier is not sending " + "packets (%u, %u)\n", txq->qidx, + qcp_rd_p, txq->qcp_rd_p); + return 0; + } + + if (qcp_rd_p > txq->qcp_rd_p) + todo =3D qcp_rd_p - txq->qcp_rd_p; + else + todo =3D qcp_rd_p + txq->tx_count - txq->qcp_rd_p; + + PMD_TX_LOG(DEBUG, "qcp_rd_p %u, txq->qcp_rd_p: %u, qcp->rd_p: %u\n", + qcp_rd_p, txq->qcp_rd_p, txq->rd_p); + + if (todo =3D=3D 0) + return todo; + + txq->qcp_rd_p +=3D todo; + txq->qcp_rd_p %=3D txq->tx_count; + txq->rd_p +=3D todo; + + return todo; +} + +/* Leaving always free descriptors for avoiding wrapping confusion */ +#define NFP_FREE_TX_DESC(t) (t->tx_count - (t->wr_p - t->rd_p) - 8) + +/* + * nfp_net_txq_full - Check if the TX queue free descriptors + * is below tx_free_threshold + * + * @txq: TX queue to check + * + * This function uses the host copy* of read/write pointers + */ +static inline +int nfp_net_txq_full(struct nfp_net_txq *txq) +{ + return NFP_FREE_TX_DESC(txq) < txq->tx_free_thresh; +} + +static uint16_t +nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb= _pkts) +{ + struct nfp_net_txq *txq; + struct nfp_net_hw *hw; + struct nfp_net_tx_desc *txds; + struct rte_mbuf *pkt; + uint64_t dma_addr; + int pkt_size, dma_size; + uint16_t free_descs, issued_descs; + struct rte_mbuf **lmbuf; + int i; + + txq =3D tx_queue; + hw =3D txq->hw; + txds =3D &txq->txds[txq->tail]; + + PMD_TX_LOG(DEBUG, "working for queue %u at pos %d and %u packets\n", + txq->qidx, wr_idx, nb_pkts); + + if ((NFP_FREE_TX_DESC(txq) < nb_pkts) || (nfp_net_txq_full(txq))) + nfp_net_tx_free_bufs(txq); + + free_descs =3D (uint16_t)NFP_FREE_TX_DESC(txq); + if (unlikely(free_descs =3D=3D 0)) + return 0; + + pkt =3D *tx_pkts; + + i =3D 0; + issued_descs =3D 0; + PMD_TX_LOG(DEBUG, "queue: %u. Sending %u packets\n", + txq->qidx, nb_pkts); + /* Sending packets */ + while ((i < nb_pkts) && free_descs) { + /* Grabbing the mbuf linked to the current descriptor */ + lmbuf =3D &txq->txbufs[txq->tail].mbuf; + /* Warming the cache for releasing the mbuf later on */ + RTE_MBUF_PREFETCH_TO_FREE(*lmbuf); + + pkt =3D *(tx_pkts + i); + + if (unlikely((pkt->nb_segs > 1) && + !(hw->cap & NFP_NET_CFG_CTRL_GATHER))) { + PMD_INIT_LOG(INFO, "NFP_NET_CFG_CTRL_GATHER not set\n"); + rte_panic("Multisegment packet unsupported\n"); + } + + /* Checking if we have enough descriptors */ + if (unlikely(pkt->nb_segs > free_descs)) + goto xmit_end; + + /* + * Checksum and VLAN flags just in the first descriptor for a + * multisegment packet + */ + nfp_net_tx_cksum(txq, txds, pkt); + + if ((pkt->ol_flags & PKT_TX_VLAN_PKT) && + (hw->cap & NFP_NET_CFG_CTRL_TXVLAN)) { + txds->flags |=3D PCIE_DESC_TX_VLAN; + txds->vlan =3D pkt->vlan_tci; + } + + if (pkt->ol_flags & PKT_TX_TCP_SEG) + rte_panic("TSO is not supported\n"); + + /* + * mbuf data_len is the data in one segment and pkt_len data + * in the whole packet. When the packet is just one segment, + * then data_len =3D pkt_len + */ + pkt_size =3D pkt->pkt_len; + + while (pkt_size) { + /* Releasing mbuf which was prefetched above */ + if (*lmbuf) + rte_pktmbuf_free_seg(*lmbuf); + + dma_size =3D pkt->data_len; + dma_addr =3D RTE_MBUF_DATA_DMA_ADDR(pkt); + PMD_TX_LOG(DEBUG, "Working with mbuf at dma address:" + "%" PRIx64 "\n", dma_addr); + + /* Filling descriptors fields */ + txds->dma_len =3D dma_size; + txds->data_len =3D pkt->pkt_len; + txds->dma_addr_hi =3D (dma_addr >> 32) & 0xff; + txds->dma_addr_lo =3D (dma_addr & 0xffffffff); + ASSERT(free_descs > 0); + free_descs--; + + /* + * Linking mbuf with descriptor for being released + * next time descriptor is used + */ + *lmbuf =3D pkt; + + txq->wr_p++; + txq->tail++; + if (unlikely(txq->tail =3D=3D txq->tx_count)) /* wrapping?*/ + txq->tail =3D 0; + + pkt_size -=3D dma_size; + if (!pkt_size) { + /* End of packet */ + txds->offset_eop |=3D PCIE_DESC_TX_EOP; + } else { + txds->offset_eop &=3D PCIE_DESC_TX_OFFSET_MASK; + pkt =3D pkt->next; + } + /* Referencing next free TX descriptor */ + txds =3D &txq->txds[txq->tail]; + issued_descs++; + } + i++; + } + +xmit_end: + /* Increment write pointers. Force memory write before we let HW know *= / + rte_wmb(); + nfp_qcp_ptr_add(txq->qcp_q, NFP_QCP_WRITE_PTR, issued_descs); + + return i; +} + +static void +nfp_net_vlan_offload_set(struct rte_eth_dev *dev, int mask) +{ + uint32_t new_ctrl, update; + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + new_ctrl =3D 0; + + if ((mask & ETH_VLAN_FILTER_OFFLOAD) || + (mask & ETH_VLAN_FILTER_OFFLOAD)) + RTE_LOG(INFO, PMD, "Not support for ETH_VLAN_FILTER_OFFLOAD or" + " ETH_VLAN_FILTER_EXTEND"); + + /* Enable vlan strip if it is not configured yet */ + if ((mask & ETH_VLAN_STRIP_OFFLOAD) && + !(hw->ctrl & NFP_NET_CFG_CTRL_RXVLAN)) + new_ctrl =3D hw->ctrl | NFP_NET_CFG_CTRL_RXVLAN; + + /* Disable vlan strip just if it is configured */ + if (!(mask & ETH_VLAN_STRIP_OFFLOAD) && + (hw->ctrl & NFP_NET_CFG_CTRL_RXVLAN)) + new_ctrl =3D hw->ctrl & ~NFP_NET_CFG_CTRL_RXVLAN; + + if (new_ctrl =3D=3D 0) + return; + + update =3D NFP_NET_CFG_UPDATE_GEN; + + if (nfp_net_reconfig(hw, new_ctrl, update) < 0) + return; + + hw->ctrl =3D new_ctrl; +} + +/* Update Redirection Table(RETA) of Receive Side Scaling of Ethernet de= vice */ +static int +nfp_net_reta_update(struct rte_eth_dev *dev, + struct rte_eth_rss_reta_entry64 *reta_conf, + uint16_t reta_size) +{ + uint32_t reta, mask; + int i, j; + int idx, shift; + uint32_t update; + struct nfp_net_hw *hw =3D + NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS)) + return -EINVAL; + + if (reta_size !=3D NFP_NET_CFG_RSS_ITBL_SZ) { + RTE_LOG(ERR, PMD, "The size of hash lookup table configured " + "(%d) doesn't match the number hardware can supported " + "(%d)\n", reta_size, NFP_NET_CFG_RSS_ITBL_SZ); + return -EINVAL; + } + + /* + * Update Redirection Table. There are 128 8bit-entries which can be + * manage as 32 32bit-entries + */ + for (i =3D 0; i < reta_size; i +=3D 4) { + /* Handling 4 RSS entries per loop */ + idx =3D i / RTE_RETA_GROUP_SIZE; + shift =3D i % RTE_RETA_GROUP_SIZE; + mask =3D (uint8_t)((reta_conf[idx].mask >> shift) & 0xF); + + if (!mask) + continue; + + reta =3D 0; + /* If all 4 entries were set, don't need read RETA register */ + if (mask !=3D 0xF) + reta =3D nn_cfg_readl(hw, NFP_NET_CFG_RSS_ITBL + i); + + for (j =3D 0; j < 4; j++) { + if (!(mask & (0x1 << j))) + continue; + if (mask !=3D 0xF) + /* Clearing the entry bits */ + reta &=3D ~(0xFF << (8 * j)); + reta |=3D reta_conf[idx].reta[shift + j] << (8 * j); + } + nn_cfg_writel(hw, NFP_NET_CFG_RSS_ITBL + shift, reta); + } + + update =3D NFP_NET_CFG_UPDATE_RSS; + + if (nfp_net_reconfig(hw, hw->ctrl, update) < 0) + return -EIO; + + return 0; +} + + /* Query Redirection Table(RETA) of Receive Side Scaling of Ethernet de= vice. */ +static int +nfp_net_reta_query(struct rte_eth_dev *dev, + struct rte_eth_rss_reta_entry64 *reta_conf, + uint16_t reta_size) +{ + uint8_t i, j, mask; + int idx, shift; + uint32_t reta; + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS)) + return -EINVAL; + + if (reta_size !=3D NFP_NET_CFG_RSS_ITBL_SZ) { + RTE_LOG(ERR, PMD, "The size of hash lookup table configured " + "(%d) doesn't match the number hardware can supported " + "(%d)\n", reta_size, NFP_NET_CFG_RSS_ITBL_SZ); + return -EINVAL; + } + + /* + * Reading Redirection Table. There are 128 8bit-entries which can be + * manage as 32 32bit-entries + */ + for (i =3D 0; i < reta_size; i +=3D 4) { + /* Handling 4 RSS entries per loop */ + idx =3D i / RTE_RETA_GROUP_SIZE; + shift =3D i % RTE_RETA_GROUP_SIZE; + mask =3D (uint8_t)((reta_conf[idx].mask >> shift) & 0xF); + + if (!mask) + continue; + + reta =3D nn_cfg_readl(hw, NFP_NET_CFG_RSS_ITBL + shift); + for (j =3D 0; j < 4; j++) { + if (!(mask & (0x1 << j))) + continue; + reta_conf->reta[shift + j] =3D + (uint8_t)((reta >> (8 * j)) & 0xF); + } + } + return 0; +} + +static int +nfp_net_rss_hash_update(struct rte_eth_dev *dev, + struct rte_eth_rss_conf *rss_conf) +{ + uint32_t update; + uint32_t cfg_rss_ctrl =3D 0; + uint8_t key; + uint64_t rss_hf; + int i; + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + rss_hf =3D rss_conf->rss_hf; + + /* Checking if RSS is enabled */ + if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS)) { + if (rss_hf !=3D 0) { /* Enable RSS? */ + RTE_LOG(ERR, PMD, "RSS unsupported\n"); + return -EINVAL; + } + return 0; /* Nothing to do */ + } + + if (rss_conf->rss_key_len > NFP_NET_CFG_RSS_KEY_SZ) { + RTE_LOG(ERR, PMD, "hash key too long\n"); + return -EINVAL; + } + + if (rss_hf & ETH_RSS_IPV4) + cfg_rss_ctrl |=3D NFP_NET_CFG_RSS_IPV4 | + NFP_NET_CFG_RSS_IPV4_TCP | + NFP_NET_CFG_RSS_IPV4_UDP; + + if (rss_hf & ETH_RSS_IPV6) + cfg_rss_ctrl |=3D NFP_NET_CFG_RSS_IPV6 | + NFP_NET_CFG_RSS_IPV6_TCP | + NFP_NET_CFG_RSS_IPV6_UDP; + + /* configuring where to apply the RSS hash */ + nn_cfg_writel(hw, NFP_NET_CFG_RSS_CTRL, cfg_rss_ctrl); + + /* Writing the key byte a byte */ + for (i =3D 0; i < rss_conf->rss_key_len; i++) { + memcpy(&key, &rss_conf->rss_key[i], 1); + nn_cfg_writeb(hw, NFP_NET_CFG_RSS_KEY + i, key); + } + + /* Writing the key size */ + nn_cfg_writeb(hw, NFP_NET_CFG_RSS_KEY_SZ, rss_conf->rss_key_len); + + update =3D NFP_NET_CFG_UPDATE_RSS; + + if (nfp_net_reconfig(hw, hw->ctrl, update) < 0) + return -EIO; + + return 0; +} + +static int +nfp_net_rss_hash_conf_get(struct rte_eth_dev *dev, + struct rte_eth_rss_conf *rss_conf) +{ + uint64_t rss_hf; + uint32_t cfg_rss_ctrl; + uint8_t key; + int i; + struct nfp_net_hw *hw; + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS)) + return -EINVAL; + + rss_hf =3D rss_conf->rss_hf; + cfg_rss_ctrl =3D nn_cfg_readl(hw, NFP_NET_CFG_RSS_CTRL); + + if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV4) + rss_hf |=3D ETH_RSS_NONFRAG_IPV4_TCP | ETH_RSS_NONFRAG_IPV4_UDP; + + if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV4_TCP) + rss_hf |=3D ETH_RSS_NONFRAG_IPV4_TCP; + + if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV6_TCP) + rss_hf |=3D ETH_RSS_NONFRAG_IPV6_TCP; + + if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV4_UDP) + rss_hf |=3D ETH_RSS_NONFRAG_IPV4_UDP; + + if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV6_UDP) + rss_hf |=3D ETH_RSS_NONFRAG_IPV6_UDP; + + if (cfg_rss_ctrl & NFP_NET_CFG_RSS_IPV6) + rss_hf |=3D ETH_RSS_NONFRAG_IPV4_UDP | ETH_RSS_NONFRAG_IPV6_UDP; + + /* Reading the key size */ + rss_conf->rss_key_len =3D nn_cfg_readl(hw, NFP_NET_CFG_RSS_KEY_SZ); + + /* Reading the key byte a byte */ + for (i =3D 0; i < rss_conf->rss_key_len; i++) { + key =3D nn_cfg_readb(hw, NFP_NET_CFG_RSS_KEY + i); + memcpy(&rss_conf->rss_key[i], &key, 1); + } + + return 0; +} + +/* Initialise and register driver with DPDK Application */ +static struct eth_dev_ops nfp_net_eth_dev_ops =3D { + .dev_configure =3D nfp_net_configure, + .dev_start =3D nfp_net_start, + .dev_stop =3D nfp_net_stop, + .dev_close =3D nfp_net_close, + .promiscuous_enable =3D nfp_net_promisc_enable, + .promiscuous_disable =3D nfp_net_promisc_disable, + .link_update =3D nfp_net_link_update, + .stats_get =3D nfp_net_stats_get, + .stats_reset =3D nfp_net_stats_reset, + .dev_infos_get =3D nfp_net_infos_get, + .mtu_set =3D nfp_net_dev_mtu_set, + .vlan_offload_set =3D nfp_net_vlan_offload_set, + .reta_update =3D nfp_net_reta_update, + .reta_query =3D nfp_net_reta_query, + .rss_hash_update =3D nfp_net_rss_hash_update, + .rss_hash_conf_get =3D nfp_net_rss_hash_conf_get, + .rx_queue_setup =3D nfp_net_rx_queue_setup, + .rx_queue_release =3D nfp_net_rx_queue_release, + .rx_queue_count =3D nfp_net_rx_queue_count, + .tx_queue_setup =3D nfp_net_tx_queue_setup, + .tx_queue_release =3D nfp_net_tx_queue_release, + .mac_addr_add =3D NULL, + .mac_addr_remove =3D NULL, +}; + +static int +__nfp_net_init(struct rte_eth_dev *eth_dev) +{ + struct rte_pci_device *pci_dev; + struct nfp_net_hw *hw; + + uint32_t tx_bar_off, rx_bar_off; + uint32_t start_q; + int stride =3D 4; + + PMD_INIT_FUNC_TRACE(); + + hw =3D NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private); + + eth_dev->dev_ops =3D &nfp_net_eth_dev_ops; + eth_dev->rx_pkt_burst =3D &nfp_net_recv_pkts; + eth_dev->tx_pkt_burst =3D &nfp_net_xmit_pkts; + + /* For secondary processes, the primary has done all the work */ + if (rte_eal_process_type() !=3D RTE_PROC_PRIMARY) + return 0; + + pci_dev =3D eth_dev->pci_dev; + hw->device_id =3D pci_dev->id.device_id; + hw->vendor_id =3D pci_dev->id.vendor_id; + hw->subsystem_device_id =3D pci_dev->id.subsystem_device_id; + hw->subsystem_vendor_id =3D pci_dev->id.subsystem_vendor_id; + + PMD_INIT_LOG(DEBUG, "nfp_net: device (%u:%u) %u:%u:%u:%u\n", + pci_dev->id.vendor_id, pci_dev->id.device_id, + pci_dev->addr.domain, pci_dev->addr.bus, + pci_dev->addr.devid, pci_dev->addr.function); + + hw->ctrl_bar =3D (uint8_t *)pci_dev->mem_resource[0].addr; + if (hw->ctrl_bar =3D=3D NULL) { + RTE_LOG(ERR, PMD, + "hw->ctrl_bar is NULL. BAR0 not configured\n"); + return -ENODEV; + } + hw->max_rx_queues =3D nn_cfg_readl(hw, NFP_NET_CFG_MAX_RXRINGS); + hw->max_tx_queues =3D nn_cfg_readl(hw, NFP_NET_CFG_MAX_TXRINGS); + + /* Work out where in the BAR the queues start. */ + switch (pci_dev->id.device_id) { + case PCI_DEVICE_ID_NFP6000_VF_NIC: + start_q =3D nn_cfg_readl(hw, NFP_NET_CFG_START_TXQ); + tx_bar_off =3D NFP_PCIE_QUEUE(start_q); + start_q =3D nn_cfg_readl(hw, NFP_NET_CFG_START_RXQ); + rx_bar_off =3D NFP_PCIE_QUEUE(start_q); + break; + default: + RTE_LOG(ERR, PMD, "nfp_net: no device ID matching\n"); + return -ENODEV; + } + + PMD_INIT_LOG(DEBUG, "tx_bar_off: 0x%08x\n", tx_bar_off); + PMD_INIT_LOG(DEBUG, "rx_bar_off: 0x%08x\n", rx_bar_off); + + hw->tx_bar =3D (uint8_t *)pci_dev->mem_resource[2].addr + tx_bar_off; + hw->rx_bar =3D (uint8_t *)pci_dev->mem_resource[2].addr + rx_bar_off; + + PMD_INIT_LOG(DEBUG, "ctrl_bar: %p, tx_bar: %p, rx_bar: %p\n", + hw->ctrl_bar, hw->tx_bar, hw->rx_bar); + + nfp_net_cfg_queue_setup(hw); + + /* Get some of the read-only fields from the config BAR */ + hw->ver =3D nn_cfg_readl(hw, NFP_NET_CFG_VERSION); + hw->cap =3D nn_cfg_readl(hw, NFP_NET_CFG_CAP); + hw->max_mtu =3D nn_cfg_readl(hw, NFP_NET_CFG_MAX_MTU); + hw->mtu =3D hw->max_mtu; + + PMD_INIT_LOG(INFO, "VER: %#x, Maximum supported MTU: %d\n", + hw->ver, hw->max_mtu); + PMD_INIT_LOG(INFO, "CAP: %#x, %s%s%s%s%s%s%s%s%s\n", hw->cap, + hw->cap & NFP_NET_CFG_CTRL_PROMISC ? "PROMISC " : "", + hw->cap & NFP_NET_CFG_CTRL_RXCSUM ? "RXCSUM " : "", + hw->cap & NFP_NET_CFG_CTRL_TXCSUM ? "TXCSUM " : "", + hw->cap & NFP_NET_CFG_CTRL_RXVLAN ? "RXVLAN " : "", + hw->cap & NFP_NET_CFG_CTRL_TXVLAN ? "TXVLAN " : "", + hw->cap & NFP_NET_CFG_CTRL_SCATTER ? "SCATTER " : "", + hw->cap & NFP_NET_CFG_CTRL_GATHER ? "GATHER " : "", + hw->cap & NFP_NET_CFG_CTRL_LSO ? "TSO " : "", + hw->cap & NFP_NET_CFG_CTRL_RSS ? "RSS " : ""); + + pci_dev =3D eth_dev->pci_dev; + hw->ctrl =3D 0; + + hw->stride_rx =3D stride; + hw->stride_tx =3D stride; + + PMD_INIT_LOG(INFO, "max_rx_queues: %u, max_tx_queues: %u\n", + hw->max_rx_queues, hw->max_tx_queues); + + /* Allocating memory for mac addr */ + eth_dev->data->mac_addrs =3D rte_zmalloc("mac_addr", ETHER_ADDR_LEN, 0)= ; + if (eth_dev->data->mac_addrs =3D=3D NULL) { + PMD_INIT_LOG(ERR, "Failed to space for MAC address"); + return -ENOMEM; + } + + /* Using random mac addresses for VFs */ + eth_random_addr(&hw->mac_addr[0]); + + /* Copying mac address to DPDK eth_dev struct */ + ether_addr_copy(ð_dev->data->mac_addrs[0], + (struct ether_addr *)hw->mac_addr); + + PMD_INIT_LOG(INFO, "port %d VendorID=3D0x%x DeviceID=3D0x%x " + "mac=3D%02x:%02x:%02x:%02x:%02x:%02x", + eth_dev->data->port_id, pci_dev->id.vendor_id, + pci_dev->id.device_id, + hw->mac_addr[0], hw->mac_addr[1], hw->mac_addr[2], + hw->mac_addr[3], hw->mac_addr[4], hw->mac_addr[5]); + + /* Registering LSC interrupt handler */ + rte_intr_callback_register(&pci_dev->intr_handle, + nfp_net_dev_interrupt_handler, + (void *)eth_dev); + + /* enable uio intr after callback register */ + rte_intr_enable(&pci_dev->intr_handle); + + /* Telling the firmware about the LSC interrupt entry */ + nn_cfg_writeb(hw, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX); + + /* Recording current stats counters values */ + nfp_net_stats_reset(eth_dev); + + return 0; +} + +static int +nfp_net_init(struct rte_eth_dev *eth_dev) +{ + return __nfp_net_init(eth_dev); +} + +static struct rte_pci_id pci_id_nfp_net_map[] =3D { + { + .vendor_id =3D PCI_VENDOR_ID_NETRONOME, + .device_id =3D PCI_DEVICE_ID_NFP6000_PF_NIC, + .subsystem_vendor_id =3D PCI_ANY_ID, + .subsystem_device_id =3D PCI_ANY_ID, + }, + { + .vendor_id =3D PCI_VENDOR_ID_NETRONOME, + .device_id =3D PCI_DEVICE_ID_NFP6000_VF_NIC, + .subsystem_vendor_id =3D PCI_ANY_ID, + .subsystem_device_id =3D PCI_ANY_ID, + }, + { + .vendor_id =3D 0, + }, +}; + +static struct eth_driver rte_nfp_net_pmd =3D { + { + .name =3D "rte_nfp_net_pmd", + .id_table =3D pci_id_nfp_net_map, + .drv_flags =3D RTE_PCI_DRV_NEED_MAPPING, + }, + .eth_dev_init =3D nfp_net_init, + .dev_private_size =3D sizeof(struct nfp_net_adapter), +}; + +static int +nfp_net_pmd_init(const char *name __rte_unused, + const char *params __rte_unused) +{ + PMD_INIT_FUNC_TRACE(); + PMD_INIT_LOG(INFO, "librte_pmd_nfp_net version %s\n", + NFP_NET_PMD_VERSION); + + rte_eth_driver_register(&rte_nfp_net_pmd); + return 0; +} + +static struct rte_driver rte_nfp_net_driver =3D { + .type =3D PMD_PDEV, + .init =3D nfp_net_pmd_init, +}; + +PMD_REGISTER_DRIVER(rte_nfp_net_driver); + +/* + * Local variables: + * c-file-style: "Linux" + * indent-tabs-mode: t + * End: + */ diff --git a/drivers/net/nfp/nfp_net_ctrl.h b/drivers/net/nfp/nfp_net_ctr= l.h new file mode 100644 index 0000000..8342500 --- /dev/null +++ b/drivers/net/nfp/nfp_net_ctrl.h @@ -0,0 +1,290 @@ +/* + * Copyright (c) 2014, 2015 Netronome Systems, Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions ar= e met: + * + * 1. Redistributions of source code must retain the above copyright not= ice, + * this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution + * + * 3. Neither the name of the copyright holder nor the names of its + * contributors may be used to endorse or promote products derived from= this + * software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "= AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,= THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PU= RPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTOR= S BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSIN= ESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER = IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWIS= E) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED O= F THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +/* + * vim:shiftwidth=3D8:noexpandtab + * + * Netronome network device driver: Control BAR layout + */ +#ifndef _NFP_NET_CTRL_H_ +#define _NFP_NET_CTRL_H_ + +/* + * Configuration BAR size. + * + * The configuration BAR is 8K in size, but on the NFP6000, due to + * THB-350, 32k needs to be reserved. + */ +#ifdef __NFP_IS_6000 +#define NFP_NET_CFG_BAR_SZ (32 * 1024) +#else +#define NFP_NET_CFG_BAR_SZ (8 * 1024) +#endif + +/* Offset in Freelist buffer where packet starts on RX */ +#define NFP_NET_RX_OFFSET 32 + +/* Hash type pre-pended when a RSS hash was computed */ +#define NFP_NET_RSS_NONE 0 +#define NFP_NET_RSS_IPV4 1 +#define NFP_NET_RSS_IPV6 2 +#define NFP_NET_RSS_IPV6_EX 3 +#define NFP_NET_RSS_IPV4_TCP 4 +#define NFP_NET_RSS_IPV6_TCP 5 +#define NFP_NET_RSS_IPV6_EX_TCP 6 +#define NFP_NET_RSS_IPV4_UDP 7 +#define NFP_NET_RSS_IPV6_UDP 8 +#define NFP_NET_RSS_IPV6_EX_UDP 9 + +/* + * @NFP_NET_TXR_MAX: Maximum number of TX rings + * @NFP_NET_TXR_MASK: Mask for TX rings + * @NFP_NET_RXR_MAX: Maximum number of RX rings + * @NFP_NET_RXR_MASK: Mask for RX rings + */ +#define NFP_NET_TXR_MAX 64 +#define NFP_NET_TXR_MASK (NFP_NET_TXR_MAX - 1) +#define NFP_NET_RXR_MAX 64 +#define NFP_NET_RXR_MASK (NFP_NET_RXR_MAX - 1) + +/* + * Read/Write config words (0x0000 - 0x002c) + * @NFP_NET_CFG_CTRL: Global control + * @NFP_NET_CFG_UPDATE: Indicate which fields are updated + * @NFP_NET_CFG_TXRS_ENABLE: Bitmask of enabled TX rings + * @NFP_NET_CFG_RXRS_ENABLE: Bitmask of enabled RX rings + * @NFP_NET_CFG_MTU: Set MTU size + * @NFP_NET_CFG_FLBUFSZ: Set freelist buffer size (must be larger th= an MTU) + * @NFP_NET_CFG_EXN: MSI-X table entry for exceptions + * @NFP_NET_CFG_LSC: MSI-X table entry for link state changes + * @NFP_NET_CFG_MACADDR: MAC address + * + * TODO: + * - define Error details in UPDATE + */ +#define NFP_NET_CFG_CTRL 0x0000 +#define NFP_NET_CFG_CTRL_ENABLE (0x1 << 0) /* Global enable *= / +#define NFP_NET_CFG_CTRL_PROMISC (0x1 << 1) /* Enable Promisc = mode */ +#define NFP_NET_CFG_CTRL_L2BC (0x1 << 2) /* Allow L2 Broadc= ast */ +#define NFP_NET_CFG_CTRL_L2MC (0x1 << 3) /* Allow L2 Multic= ast */ +#define NFP_NET_CFG_CTRL_RXCSUM (0x1 << 4) /* Enable RX Check= sum */ +#define NFP_NET_CFG_CTRL_TXCSUM (0x1 << 5) /* Enable TX Check= sum */ +#define NFP_NET_CFG_CTRL_RXVLAN (0x1 << 6) /* Enable VLAN str= ip */ +#define NFP_NET_CFG_CTRL_TXVLAN (0x1 << 7) /* Enable VLAN ins= ert */ +#define NFP_NET_CFG_CTRL_SCATTER (0x1 << 8) /* Scatter DMA */ +#define NFP_NET_CFG_CTRL_GATHER (0x1 << 9) /* Gather DMA */ +#define NFP_NET_CFG_CTRL_LSO (0x1 << 10) /* LSO/TSO */ +#define NFP_NET_CFG_CTRL_RINGCFG (0x1 << 16) /* Ring runtime ch= anges */ +#define NFP_NET_CFG_CTRL_RSS (0x1 << 17) /* RSS */ +#define NFP_NET_CFG_CTRL_IRQMOD (0x1 << 18) /* Interrupt moder= ation */ +#define NFP_NET_CFG_CTRL_RINGPRIO (0x1 << 19) /* Ring priorities= */ +#define NFP_NET_CFG_CTRL_MSIXAUTO (0x1 << 20) /* MSI-X auto-mask= ing */ +#define NFP_NET_CFG_CTRL_TXRWB (0x1 << 21) /* Write-back of T= X ring*/ +#define NFP_NET_CFG_CTRL_L2SWITCH (0x1 << 22) /* L2 Switch */ +#define NFP_NET_CFG_CTRL_L2SWITCH_LOCAL (0x1 << 23) /* Switch to local= */ +#define NFP_NET_CFG_CTRL_VXLANO (0x1 << 24) /* Enable VXLAN */ +#define NFP_NET_CFG_CTRL_NVGREO (0x1 << 25) /* Enable NVGRE */ +#define NFP_NET_CFG_UPDATE 0x0004 +#define NFP_NET_CFG_UPDATE_GEN (0x1 << 0) /* General update = */ +#define NFP_NET_CFG_UPDATE_RING (0x1 << 1) /* Ring config cha= nge */ +#define NFP_NET_CFG_UPDATE_RSS (0x1 << 2) /* RSS config chan= ge */ +#define NFP_NET_CFG_UPDATE_TXRPRIO (0x1 << 3) /* TX Ring prio ch= ange */ +#define NFP_NET_CFG_UPDATE_RXRPRIO (0x1 << 4) /* RX Ring prio ch= ange */ +#define NFP_NET_CFG_UPDATE_MSIX (0x1 << 5) /* MSI-X change */ +#define NFP_NET_CFG_UPDATE_L2SWITCH (0x1 << 6) /* Switch changes = */ +#define NFP_NET_CFG_UPDATE_RESET (0x1 << 7) /* Update due to F= LR */ +#define NFP_NET_CFG_UPDATE_IRQMOD (0x1 << 8) /* IRQ mod change = */ +#define NFP_NET_CFG_UPDATE_ERR (0x1 << 31) /* A error occurre= d */ +#define NFP_NET_CFG_TXRS_ENABLE 0x0008 +#define NFP_NET_CFG_RXRS_ENABLE 0x0010 +#define NFP_NET_CFG_MTU 0x0018 +#define NFP_NET_CFG_FLBUFSZ 0x001c +#define NFP_NET_CFG_EXN 0x001f +#define NFP_NET_CFG_LSC 0x0020 +#define NFP_NET_CFG_MACADDR 0x0024 + +/* + * Read-only words (0x0030 - 0x0050): + * @NFP_NET_CFG_VERSION: Firmware version number + * @NFP_NET_CFG_STS: Status + * @NFP_NET_CFG_CAP: Capabilities (same bits as @NFP_NET_CFG_CTR= L) + * @NFP_NET_MAX_TXRINGS: Maximum number of TX rings + * @NFP_NET_MAX_RXRINGS: Maximum number of RX rings + * @NFP_NET_MAX_MTU: Maximum support MTU + * @NFP_NET_CFG_START_TXQ: Start Queue Control Queue to use for TX (PF= only) + * @NFP_NET_CFG_START_RXQ: Start Queue Control Queue to use for RX (PF= only) + * + * TODO: + * - define more STS bits + */ +#define NFP_NET_CFG_VERSION 0x0030 +#define NFP_NET_CFG_STS 0x0034 +#define NFP_NET_CFG_STS_LINK (0x1 << 0) /* Link up or down = */ +#define NFP_NET_CFG_CAP 0x0038 +#define NFP_NET_CFG_MAX_TXRINGS 0x003c +#define NFP_NET_CFG_MAX_RXRINGS 0x0040 +#define NFP_NET_CFG_MAX_MTU 0x0044 +/* Next two words are being used by VFs for solving THB350 issue */ +#define NFP_NET_CFG_START_TXQ 0x0048 +#define NFP_NET_CFG_START_RXQ 0x004c + +/* + * NFP-3200 workaround (0x0050 - 0x0058) + * @NFP_NET_CFG_SPARE_ADDR: DMA address for ME code to use (e.g. YDS-15= 5 fix) + */ +#define NFP_NET_CFG_SPARE_ADDR 0x0050 + +/* 64B reserved for future use (0x0080 - 0x00c0) */ +#define NFP_NET_CFG_RESERVED 0x0080 +#define NFP_NET_CFG_RESERVED_SZ 0x0040 + +/* + * RSS configuration (0x0100 - 0x01ac): + * Used only when NFP_NET_CFG_CTRL_RSS is enabled + * @NFP_NET_CFG_RSS_CFG: RSS configuration word + * @NFP_NET_CFG_RSS_KEY: RSS "secret" key + * @NFP_NET_CFG_RSS_ITBL: RSS indirection table + */ +#define NFP_NET_CFG_RSS_BASE 0x0100 +#define NFP_NET_CFG_RSS_CTRL NFP_NET_CFG_RSS_BASE +#define NFP_NET_CFG_RSS_MASK (0x7f) +#define NFP_NET_CFG_RSS_MASK_of(_x) ((_x) & 0x7f) +#define NFP_NET_CFG_RSS_IPV4 (1 << 8) /* RSS for IPv4 */ +#define NFP_NET_CFG_RSS_IPV6 (1 << 9) /* RSS for IPv6 */ +#define NFP_NET_CFG_RSS_IPV4_TCP (1 << 10) /* RSS for IPv4/TCP = */ +#define NFP_NET_CFG_RSS_IPV4_UDP (1 << 11) /* RSS for IPv4/UDP = */ +#define NFP_NET_CFG_RSS_IPV6_TCP (1 << 12) /* RSS for IPv6/TCP = */ +#define NFP_NET_CFG_RSS_IPV6_UDP (1 << 13) /* RSS for IPv6/UDP = */ +#define NFP_NET_CFG_RSS_TOEPLITZ (1 << 24) /* Use Toeplitz hash= */ +#define NFP_NET_CFG_RSS_KEY (NFP_NET_CFG_RSS_BASE + 0x4) +#define NFP_NET_CFG_RSS_KEY_SZ 0x28 +#define NFP_NET_CFG_RSS_ITBL (NFP_NET_CFG_RSS_BASE + 0x4 + \ + NFP_NET_CFG_RSS_KEY_SZ) +#define NFP_NET_CFG_RSS_ITBL_SZ 0x80 + +/* + * TX ring configuration (0x200 - 0x800) + * @NFP_NET_CFG_TXR_BASE: Base offset for TX ring configuration + * @NFP_NET_CFG_TXR_ADDR: Per TX ring DMA address (8B entries) + * @NFP_NET_CFG_TXR_WB_ADDR: Per TX ring write back DMA address (8B entr= ies) + * @NFP_NET_CFG_TXR_SZ: Per TX ring ring size (1B entries) + * @NFP_NET_CFG_TXR_VEC: Per TX ring MSI-X table entry (1B entries) + * @NFP_NET_CFG_TXR_PRIO: Per TX ring priority (1B entries) + * @NFP_NET_CFG_TXR_IRQ_MOD: Per TX ring interrupt moderation (4B entrie= s) + */ +#define NFP_NET_CFG_TXR_BASE 0x0200 +#define NFP_NET_CFG_TXR_ADDR(_x) (NFP_NET_CFG_TXR_BASE + ((_x) * = 0x8)) +#define NFP_NET_CFG_TXR_WB_ADDR(_x) (NFP_NET_CFG_TXR_BASE + 0x200 + = \ + ((_x) * 0x8)) +#define NFP_NET_CFG_TXR_SZ(_x) (NFP_NET_CFG_TXR_BASE + 0x400 + = (_x)) +#define NFP_NET_CFG_TXR_VEC(_x) (NFP_NET_CFG_TXR_BASE + 0x440 + = (_x)) +#define NFP_NET_CFG_TXR_PRIO(_x) (NFP_NET_CFG_TXR_BASE + 0x480 + = (_x)) +#define NFP_NET_CFG_TXR_IRQ_MOD(_x) (NFP_NET_CFG_TXR_BASE + 0x500 + = \ + ((_x) * 0x4)) + +/* + * RX ring configuration (0x0800 - 0x0c00) + * @NFP_NET_CFG_RXR_BASE: Base offset for RX ring configuration + * @NFP_NET_CFG_RXR_ADDR: Per TX ring DMA address (8B entries) + * @NFP_NET_CFG_RXR_SZ: Per TX ring ring size (1B entries) + * @NFP_NET_CFG_RXR_VEC: Per TX ring MSI-X table entry (1B entries) + * @NFP_NET_CFG_RXR_PRIO: Per TX ring priority (1B entries) + * @NFP_NET_CFG_RXR_IRQ_MOD: Per TX ring interrupt moderation (4B entrie= s) + */ +#define NFP_NET_CFG_RXR_BASE 0x0800 +#define NFP_NET_CFG_RXR_ADDR(_x) (NFP_NET_CFG_RXR_BASE + ((_x) * = 0x8)) +#define NFP_NET_CFG_RXR_SZ(_x) (NFP_NET_CFG_RXR_BASE + 0x200 + = (_x)) +#define NFP_NET_CFG_RXR_VEC(_x) (NFP_NET_CFG_RXR_BASE + 0x240 + = (_x)) +#define NFP_NET_CFG_RXR_PRIO(_x) (NFP_NET_CFG_RXR_BASE + 0x280 + = (_x)) +#define NFP_NET_CFG_RXR_IRQ_MOD(_x) (NFP_NET_CFG_RXR_BASE + 0x300 + = \ + ((_x) * 0x4)) + +/* + * Interrupt Control/Cause registers (0x0c00 - 0x0d00) + * These registers are only used when MSI-X auto-masking is not + * enabled (@NFP_NET_CFG_CTRL_MSIXAUTO not set). The array is index + * by MSI-X entry and are 1B in size. If an entry is zero, the + * corresponding entry is enabled. If the FW generates an interrupt, + * it writes a cause into the corresponding field. This also masks + * the MSI-X entry and the host driver must clear the register to + * re-enable the interrupt. + */ +#define NFP_NET_CFG_ICR_BASE 0x0c00 +#define NFP_NET_CFG_ICR(_x) (NFP_NET_CFG_ICR_BASE + (_x)) +#define NFP_NET_CFG_ICR_UNMASKED 0x0 +#define NFP_NET_CFG_ICR_RXTX 0x1 +#define NFP_NET_CFG_ICR_LSC 0x2 + +/* + * General device stats (0x0d00 - 0x0d90) + * all counters are 64bit. + */ +#define NFP_NET_CFG_STATS_BASE 0x0d00 +#define NFP_NET_CFG_STATS_RX_DISCARDS (NFP_NET_CFG_STATS_BASE + 0x00) +#define NFP_NET_CFG_STATS_RX_ERRORS (NFP_NET_CFG_STATS_BASE + 0x08) +#define NFP_NET_CFG_STATS_RX_OCTETS (NFP_NET_CFG_STATS_BASE + 0x10) +#define NFP_NET_CFG_STATS_RX_UC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x18) +#define NFP_NET_CFG_STATS_RX_MC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x20) +#define NFP_NET_CFG_STATS_RX_BC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x28) +#define NFP_NET_CFG_STATS_RX_FRAMES (NFP_NET_CFG_STATS_BASE + 0x30) +#define NFP_NET_CFG_STATS_RX_MC_FRAMES (NFP_NET_CFG_STATS_BASE + 0x38) +#define NFP_NET_CFG_STATS_RX_BC_FRAMES (NFP_NET_CFG_STATS_BASE + 0x40) + +#define NFP_NET_CFG_STATS_TX_DISCARDS (NFP_NET_CFG_STATS_BASE + 0x48) +#define NFP_NET_CFG_STATS_TX_ERRORS (NFP_NET_CFG_STATS_BASE + 0x50) +#define NFP_NET_CFG_STATS_TX_OCTETS (NFP_NET_CFG_STATS_BASE + 0x58) +#define NFP_NET_CFG_STATS_TX_UC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x60) +#define NFP_NET_CFG_STATS_TX_MC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x68) +#define NFP_NET_CFG_STATS_TX_BC_OCTETS (NFP_NET_CFG_STATS_BASE + 0x70) +#define NFP_NET_CFG_STATS_TX_FRAMES (NFP_NET_CFG_STATS_BASE + 0x78) +#define NFP_NET_CFG_STATS_TX_MC_FRAMES (NFP_NET_CFG_STATS_BASE + 0x80) +#define NFP_NET_CFG_STATS_TX_BC_FRAMES (NFP_NET_CFG_STATS_BASE + 0x88) + +/* + * Per ring stats (0x1000 - 0x1800) + * options, 64bit per entry + * @NFP_NET_CFG_TXR_STATS: TX ring statistics (Packet and Byte count) + * @NFP_NET_CFG_RXR_STATS: RX ring statistics (Packet and Byte count) + */ +#define NFP_NET_CFG_TXR_STATS_BASE 0x1000 +#define NFP_NET_CFG_TXR_STATS(_x) (NFP_NET_CFG_TXR_STATS_BASE + \ + ((_x) * 0x10)) +#define NFP_NET_CFG_RXR_STATS_BASE 0x1400 +#define NFP_NET_CFG_RXR_STATS(_x) (NFP_NET_CFG_RXR_STATS_BASE + \ + ((_x) * 0x10)) + +#endif /* _NFP_NET_CTRL_H_ */ +/* + * Local variables: + * c-file-style: "Linux" + * indent-tabs-mode: t + * End: + */ diff --git a/drivers/net/nfp/nfp_net_logs.h b/drivers/net/nfp/nfp_net_log= s.h new file mode 100644 index 0000000..0b966e4 --- /dev/null +++ b/drivers/net/nfp/nfp_net_logs.h @@ -0,0 +1,75 @@ +/* + * Copyright (c) 2014, 2015 Netronome Systems, Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions ar= e met: + * + * 1. Redistributions of source code must retain the above copyright not= ice, + * this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution + * + * 3. Neither the name of the copyright holder nor the names of its + * contributors may be used to endorse or promote products derived from= this + * software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "= AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,= THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PU= RPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTOR= S BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSIN= ESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER = IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWIS= E) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED O= F THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _NFP_NET_LOGS_H_ +#define _NFP_NET_LOGS_H_ + +#include + +#define RTE_LIBRTE_NFP_NET_DEBUG_INIT 1 + +#ifdef RTE_LIBRTE_NFP_NET_DEBUG_INIT +#define PMD_INIT_LOG(level, fmt, args...) \ + RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args) +#define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>") +#else +#define PMD_INIT_LOG(level, fmt, args...) do { } while (0) +#define PMD_INIT_FUNC_TRACE() do { } while (0) +#endif + +#ifdef RTE_LIBRTE_NFP_NET_DEBUG_RX +#define PMD_RX_LOG(level, fmt, args...) \ + RTE_LOG(level, PMD, "%s() rx: " fmt, __func__, ## args) +#else +#define PMD_RX_LOG(level, fmt, args...) do { } while (0) +#endif + +#ifdef RTE_LIBRTE_NFP_NET_DEBUG_TX +#define PMD_TX_LOG(level, fmt, args...) \ + RTE_LOG(level, PMD, "%s() tx: " fmt, __func__, ## args) +#else +#define PMD_TX_LOG(level, fmt, args...) do { } while (0) +#endif + +#ifdef RTE_LIBRTE_NFP_NET_DEBUG_DRIVER +#define PMD_DRV_LOG(level, fmt, args...) \ + RTE_LOG(level, PMD, "%s(): " fmt, __func__, ## args) +#else +#define PMD_DRV_LOG(level, fmt, args...) do { } while (0) +#endif + +#ifdef RTE_LIBRTE_NFP_NET_DEBUG_INIT +#define ASSERT(x) if (!(x)) rte_panic("NFP_NET: x") +#else +#define ASSERT(x) do { } while (0) +#endif + +#endif /* _NFP_NET_LOGS_H_ */ diff --git a/drivers/net/nfp/nfp_net_pmd.h b/drivers/net/nfp/nfp_net_pmd.= h new file mode 100644 index 0000000..3f9fef2 --- /dev/null +++ b/drivers/net/nfp/nfp_net_pmd.h @@ -0,0 +1,434 @@ +/* + * Copyright (c) 2014, 2015 Netronome Systems, Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions ar= e met: + * + * 1. Redistributions of source code must retain the above copyright not= ice, + * this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution + * + * 3. Neither the name of the copyright holder nor the names of its + * contributors may be used to endorse or promote products derived from= this + * software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "= AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,= THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PU= RPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTOR= S BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSIN= ESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER = IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWIS= E) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED O= F THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +/* + * vim:shiftwidth=3D8:noexpandtab + * + * @file dpdk/pmd/nfp_net_pmd.h + * + * Netronome NFP_NET PDM driver + */ + +#ifndef _NFP_NET_PMD_H_ +#define _NFP_NET_PMD_H_ + +#define NFP_NET_PMD_VERSION "0.1" +#define PCI_VENDOR_ID_NETRONOME 0x19ee +#define PCI_DEVICE_ID_NFP6000_PF_NIC 0x6000 +#define PCI_DEVICE_ID_NFP6000_VF_NIC 0x6003 + +/* Forward declaration */ +struct nfp_net_adapter; + +/* + * The maximum number of descriptors is limited by design as + * DPDK uses uint16_t variables for these values + */ +#define NFP_NET_MAX_TX_DESC (32 * 1024) +#define NFP_NET_MIN_TX_DESC 64 + +#define NFP_NET_MAX_RX_DESC (32 * 1024) +#define NFP_NET_MIN_RX_DESC 64 + +/* Bar allocation */ +#define NFP_NET_CRTL_BAR 0 +#define NFP_NET_TX_BAR 2 +#define NFP_NET_RX_BAR 2 + +/* Macros for accessing the Queue Controller Peripheral 'CSRs' */ +#define NFP_QCP_QUEUE_OFF(_x) ((_x) * 0x800) +#define NFP_QCP_QUEUE_ADD_RPTR 0x0000 +#define NFP_QCP_QUEUE_ADD_WPTR 0x0004 +#define NFP_QCP_QUEUE_STS_LO 0x0008 +#define NFP_QCP_QUEUE_STS_LO_READPTR_mask (0x3ffff) +#define NFP_QCP_QUEUE_STS_HI 0x000c +#define NFP_QCP_QUEUE_STS_HI_WRITEPTR_mask (0x3ffff) + +/* Interrupt definitions */ +#define NFP_NET_IRQ_LSC_IDX 0 + +#define RTE_MBUF_DATA_DMA_ADDR(mb) \ + ((uint64_t)((mb)->buf_physaddr + (mb)->data_off)) + +/* Default values for RX/TX configuration */ +#define DEFAULT_RX_FREE_THRESH 32 +#define DEFAULT_RX_PTHRESH 8 +#define DEFAULT_RX_HTHRESH 8 +#define DEFAULT_RX_WTHRESH 0 + +#define DEFAULT_TX_RS_THRESH 32 +#define DEFAULT_TX_FREE_THRESH 32 +#define DEFAULT_TX_PTHRESH 32 +#define DEFAULT_TX_HTHRESH 0 +#define DEFAULT_TX_WTHRESH 0 +#define DEFAULT_TX_RSBIT_THRESH 32 + +/* Alignment for dma zones */ +#define NFP_MEMZONE_ALIGN 128 + +/* + * This is used by the reconfig protocol. It sets the maximum time waiti= ng in + * milliseconds before a reconfig timeout happens. + */ +#define NFP_NET_POLL_TIMEOUT 5000 + +#define NFP_QCP_QUEUE_ADDR_SZ (0x800) + +#define NFP_NET_LINK_DOWN_CHECK_TIMEOUT 4000 /* ms */ +#define NFP_NET_LINK_UP_CHECK_TIMEOUT 1000 /* ms */ + +#include + +static inline uint8_t nn_readb(volatile void *addr) +{ + return *((volatile uint8_t *)(addr)); +} + +static inline void nn_writeb(__u8 val, volatile void *addr) +{ + *((volatile uint8_t *)(addr)) =3D val; +} + +static inline uint32_t nn_readl(volatile const void *addr) +{ + return *((volatile const uint32_t *)(addr)); +} + +static inline void nn_writel(__u32 val, volatile void *addr) +{ + *((volatile uint32_t *)(addr)) =3D val; +} + +static inline uint64_t nn_readq(volatile void *addr) +{ + const volatile __u32 *p =3D addr; + __u32 low, high; + + high =3D nn_readl((volatile const void *)(p + 1)); + low =3D nn_readl((volatile const void *)p); + + return low + ((__u64)high << 32); +} + +static inline void nn_writeq(__u64 val, volatile void *addr) +{ + nn_writel(val >> 32, (volatile char *)addr + 4); + nn_writel(val, addr); +} + +/* TX descriptor format */ +#define PCIE_DESC_TX_EOP (1 << 7) +#define PCIE_DESC_TX_OFFSET_MASK (0x7f) + +/* Flags in the host TX descriptor */ +#define PCIE_DESC_TX_CSUM (1 << 7) +#define PCIE_DESC_TX_IP4_CSUM (1 << 6) +#define PCIE_DESC_TX_TCP_CSUM (1 << 5) +#define PCIE_DESC_TX_UDP_CSUM (1 << 4) +#define PCIE_DESC_TX_VLAN (1 << 3) +#define PCIE_DESC_TX_LSO (1 << 2) +#define PCIE_DESC_TX_ENCAP_NONE (0) +#define PCIE_DESC_TX_ENCAP_VXLAN (1 << 1) +#define PCIE_DESC_TX_ENCAP_GRE (1 << 0) + +struct nfp_net_tx_desc { + union { + struct { + __u8 dma_addr_hi; /* High bits of host buf address */ + __le16 dma_len; /* Length to DMA for this desc */ + __u8 offset_eop; /* Offset in buf where pkt starts + + * highest bit is eop flag. + */ + __le32 dma_addr_lo; /* Low 32bit of host buf addr */ + + __le16 lso; /* MSS to be used for LSO */ + __u8 l4_offset; /* LSO, where the L4 data starts */ + __u8 flags; /* TX Flags, see @PCIE_DESC_TX_* */ + + __le16 vlan; /* VLAN tag to add if indicated */ + __le16 data_len; /* Length of frame + meta data */ + } __attribute__((__packed__)); + __le32 vals[4]; + }; +}; + +struct nfp_net_txq { + struct nfp_net_hw *hw; /* Backpointer to nfp_net structure */ + + /* + * Queue information: @qidx is the queue index from Linux's + * perspective. @tx_qcidx is the index of the Queue + * Controller Peripheral queue relative to the TX queue BAR. + * @cnt is the size of the queue in number of + * descriptors. @qcp_q is a pointer to the base of the queue + * structure on the NFP + */ + __u8 *qcp_q; + + /* + * Read and Write pointers. @wr_p and @rd_p are host side pointer, + * they are free running and have little relation to the QCP pointers * + * @qcp_rd_p is a local copy queue controller peripheral read pointer + */ + + __u32 wr_p; + __u32 rd_p; + __u32 qcp_rd_p; + + __u32 tx_count; + + __u32 tx_free_thresh; + __u32 tail; + + /* + * For each descriptor keep a reference to the mbuff and + * DMA address used until completion is signalled. + */ + struct { + struct rte_mbuf *mbuf; + } *txbufs; + + /* + * Information about the host side queue location. @txds is + * the virtual address for the queue, @dma is the DMA address + * of the queue and @size is the size in bytes for the queue + * (needed for free) + */ + struct nfp_net_tx_desc *txds; + + /* + * At this point 56 bytes have been used for all the fields in the + * TX critical path. We have room for 8 bytes and still all placed + * in a cache line. We are not using the threshold values below nor + * the txq_flags but if we need to, we can add the most used in the + * remaining bytes. + */ + __u32 tx_rs_thresh; /* not used by now. Future? */ + __u32 tx_pthresh; /* not used by now. Future? */ + __u32 tx_hthresh; /* not used by now. Future? */ + __u32 tx_wthresh; /* not used by now. Future? */ + __u32 txq_flags; /* not used by now. Future? */ + __u8 port_id; + int qidx; + int tx_qcidx; + __le64 dma; +} __attribute__ ((__aligned__(64))); + +/* RX and freelist descriptor format */ +#define PCIE_DESC_RX_DD (1 << 7) +#define PCIE_DESC_RX_META_LEN_MASK (0x7f) + +/* Flags in the RX descriptor */ +#define PCIE_DESC_RX_RSS (1 << 15) +#define PCIE_DESC_RX_I_IP4_CSUM (1 << 14) +#define PCIE_DESC_RX_I_IP4_CSUM_OK (1 << 13) +#define PCIE_DESC_RX_I_TCP_CSUM (1 << 12) +#define PCIE_DESC_RX_I_TCP_CSUM_OK (1 << 11) +#define PCIE_DESC_RX_I_UDP_CSUM (1 << 10) +#define PCIE_DESC_RX_I_UDP_CSUM_OK (1 << 9) +#define PCIE_DESC_RX_INGRESS_PORT (1 << 8) +#define PCIE_DESC_RX_EOP (1 << 7) +#define PCIE_DESC_RX_IP4_CSUM (1 << 6) +#define PCIE_DESC_RX_IP4_CSUM_OK (1 << 5) +#define PCIE_DESC_RX_TCP_CSUM (1 << 4) +#define PCIE_DESC_RX_TCP_CSUM_OK (1 << 3) +#define PCIE_DESC_RX_UDP_CSUM (1 << 2) +#define PCIE_DESC_RX_UDP_CSUM_OK (1 << 1) +#define PCIE_DESC_RX_VLAN (1 << 0) + +struct nfp_net_rx_desc { + union { + /* Freelist descriptor */ + struct { + __u8 dma_addr_hi; + __le16 spare; + __u8 dd; + + __le32 dma_addr_lo; + } __attribute__((__packed__)) fld; + + /* RX descriptor */ + struct { + __le16 data_len; + __u8 reserved; + __u8 meta_len_dd; + + __le16 flags; + __le16 vlan; + } __attribute__((__packed__)) rxd; + + __le32 vals[2]; + }; +}; + +struct nfp_net_rx_buff { + struct rte_mbuf *mbuf; +}; + +struct nfp_net_rxq { + struct nfp_net_hw *hw; /* Backpointer to nfp_net structure */ + + /* + * @qcp_fl and @qcp_rx are pointers to the base addresses of the + * freelist and RX queue controller peripheral queue structures on the + * NFP + */ + __u8 *qcp_fl; + __u8 *qcp_rx; + + /* + * Read and Write pointers. @wr_p and @rd_p are host side + * pointer, they are free running and have little relation to + * the QCP pointers. @wr_p is where the driver adds new + * freelist descriptors and @rd_p is where the driver start + * reading descriptors for newly arrive packets from. + */ + __u32 wr_p; + __u32 rd_p; + + /* + * For each buffer placed on the freelist, record the + * associated SKB + */ + struct nfp_net_rx_buff *rxbufs; + + /* + * Information about the host side queue location. @rxds is + * the virtual address for the queue + */ + struct nfp_net_rx_desc *rxds; + + /* + * The mempool is created by the user specifying a mbuf size. + * We save here the reference of the mempool needed in the RX + * path and the mbuf size for checking received packets can be + * safely copied to the mbuf using the NFP_NET_RX_OFFSET + */ + struct rte_mempool *mem_pool; + uint16_t mbuf_size; + + /* + * Next two fields are used for giving more free descriptors + * to the NFP + */ + uint16_t rx_free_thresh; + uint16_t nb_rx_hold; + + /* the size of the queue in number of descriptors */ + uint16_t rx_count; + + /* + * Fields above this point fit in a single cache line and are all used + * in the RX critical path. Fields below this point are just used + * during queue configuration or not used at all (yet) + */ + + /* referencing dev->data->port_id */ + uint16_t port_id; + + uint8_t crc_len; /* Not used by now */ + uint8_t drop_en; /* Not used by now */ + + /* DMA address of the queue */ + __le64 dma; + + /* + * Queue information: @qidx is the queue index from Linux's + * perspective. @fl_qcidx is the index of the Queue + * Controller peripheral queue relative to the RX queue BAR + * used for the freelist and @rx_qcidx is the Queue Controller + * Peripheral index for the RX queue. + */ + int qidx; + int fl_qcidx; + int rx_qcidx; +} __attribute__ ((__aligned__(64))); + +struct nfp_net_hw { + /* Info from the firmware */ + uint32_t ver; + uint32_t cap; + uint32_t max_mtu; + uint32_t mtu; + + /* Current values for control */ + uint32_t ctrl; + + uint8_t *ctrl_bar; + uint8_t *tx_bar; + uint8_t *rx_bar; + + int stride_rx; + int stride_tx; + + __u8 *qcp_cfg; + + uint32_t max_tx_queues; + uint32_t max_rx_queues; + uint16_t flbufsz; + uint16_t device_id; + uint16_t vendor_id; + uint16_t subsystem_device_id; + uint16_t subsystem_vendor_id; +#if defined(DSTQ_SELECTION) +#if DSTQ_SELECTION + uint16_t device_function; +#endif +#endif + + uint8_t mac_addr[ETHER_ADDR_LEN]; + + /* Records starting point for counters */ + struct rte_eth_stats eth_stats_base; + +#ifdef NFP_NET_LIBNFP + struct nfp_cpp *cpp; + struct nfp_cpp_area *ctrl_area; + struct nfp_cpp_area *tx_area; + struct nfp_cpp_area *rx_area; + struct nfp_cpp_area *msix_area; +#endif +}; + +struct nfp_net_adapter { + struct nfp_net_hw hw; +}; + +#define NFP_NET_DEV_PRIVATE_TO_HW(adapter)\ + (&((struct nfp_net_adapter *)adapter)->hw) + +#endif /* _NFP_NET_PMD_H_ */ +/* + * Local variables: + * c-file-style: "Linux" + * indent-tabs-mode: t + * End: + */ diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 9e1909e..4a00f0f 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -142,6 +142,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) +=3D -lrt= e_pmd_ring _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) +=3D -lrte_pmd_pcap _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) +=3D -lrte_pmd_af_packet _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) +=3D -lrte_pmd_null +_LDLIBS-$(CONFIG_RTE_LIBRTE_NFP_PMD) +=3D -lrte_pmd_nfp =20 endif # ! $(CONFIG_RTE_BUILD_SHARED_LIB) =20 --=20 1.7.9.5