From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr0-f170.google.com (mail-wr0-f170.google.com [209.85.128.170]) by dpdk.org (Postfix) with ESMTP id 71CBF1B1BA for ; Tue, 26 Sep 2017 11:40:33 +0200 (CEST) Received: by mail-wr0-f170.google.com with SMTP id m18so12388350wrm.2 for ; Tue, 26 Sep 2017 02:40:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=semihalf-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=DeONZp1Br64DGbUXaW8eLR7Xuco2CtxELPOeYLSEkDI=; b=VMmSTzwG2IVesy4NzYryWZXCvkrCagL0hflrSt0hdInzXHQDrWxKZrtE5Cg2fO5XwD ysvvehoRENoD7ADQIaMbTEAsUVCON43iLl1fU6lmrf8HucSBA6QzhWpPMMB+6MqOyD5g rzcSXFL2/2usBz+BDGkmg2pe+8GvfOADP+1ozoabreGQb2tn73acWN0uIReHoODrUw6j YTDgffWp97wKfaWyoFQCVA2gqzXIrru+ypwYwpDf0jMNMhMFueK8woDbWw8/jbP3Bn+P OAy0SSET/z4qLLOgLb8SY31/7SmYsFBFOPTNn3YLTdEhPqa7J6KE14fIJTQh34G3FbFU VTUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=DeONZp1Br64DGbUXaW8eLR7Xuco2CtxELPOeYLSEkDI=; b=kMFnvxJk0IAoNdV1Pydf+/HgJdT1KxmaSdM4GIOTGZxspO/GwZgJEeIjZDsGx5kMal VHwbT6CmANmUruzfWfpTpN8WmRvv1IGVGq9TENJy/xq3i6SmV06M7TITyvENAnWsblcQ +1SIO47c9tv3mbOsvKlxzoxKCLNxPEHjS9fM7syRtCU5vGlEq66No/TOJJiJObaRvfRb ERHjg92T9LYU9KMrK8gfDeXmPqZSg2EEGjysL0iNcZCvRqBFBIUC72yWRctWKpkcxK1y zVKrBYDIyGO1uMAJnUpLwAM0ajAcvaiRfHvYmXlcXhedB+6LOIsetn9G4bmdAZD7NpUx x0Cw== X-Gm-Message-State: AHPjjUj+73DnTJJaGpIBsdc5MAQzFeYI5Vytb7vMx6hQtfLuaACXxMRr ABKqKc49P2lhOymrvW3TIchGk8X70Rs= X-Google-Smtp-Source: AOwi7QBP+IniGTU2pQc1F1ZFXa/Gy/Zsw8LtLuuti45Lwm+xBaYkH+NofmYRzT6UNHl63L4ZJk97rQ== X-Received: by 10.46.25.18 with SMTP id p18mr3792954lje.151.1506418830016; Tue, 26 Sep 2017 02:40:30 -0700 (PDT) Received: from tdu.semihalf.local (31-172-191-173.noc.fibertech.net.pl. [31.172.191.173]) by smtp.gmail.com with ESMTPSA id g12sm1273593lfd.64.2017.09.26.02.40.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 26 Sep 2017 02:40:29 -0700 (PDT) From: Tomasz Duszynski To: dev@dpdk.org Cc: mw@semihalf.com, dima@marvell.com, nsamsono@marvell.com, Tomasz Duszynski , Jacek Siuda Date: Tue, 26 Sep 2017 11:39:59 +0200 Message-Id: <1506418805-12117-3-git-send-email-tdu@semihalf.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1506418805-12117-1-git-send-email-tdu@semihalf.com> References: <1506418805-12117-1-git-send-email-tdu@semihalf.com> Subject: [dpdk-dev] [PATCH 2/8] net/mrvl: add mrvl net pmd driver X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Sep 2017 09:40:33 -0000 Add support for the Marvell PPv2 (Packet Processor v2) 1/10 Gbps adapter. Driver is based on external, publicly available, light-weight Marvell MUSDK library that provides access to network packet processor. Driver comes with support for the following features: * Speed capabilities * Link status * Queue start/stop * MTU update * Jumbo frame * Promiscuous mode * Allmulticast mode * Unicast MAC filter * Multicast MAC filter * RSS hash * VLAN filter * CRC offload * L3 checksum offload * L4 checksum offload * Packet type parsing * Basic stats * Stats per queue Driver was engineered cooperatively by Semihalf and Marvell teams. Semihalf: Jacek Siuda Tomasz Duszynski Marvell: Dmitri Epshtein Natalie Samsonov Signed-off-by: Jacek Siuda Signed-off-by: Tomasz Duszynski --- config/common_base | 7 + drivers/net/Makefile | 2 + drivers/net/mrvl/Makefile | 69 + drivers/net/mrvl/mrvl_ethdev.c | 2277 +++++++++++++++++++++++++++++ drivers/net/mrvl/mrvl_ethdev.h | 115 ++ drivers/net/mrvl/mrvl_qos.c | 627 ++++++++ drivers/net/mrvl/mrvl_qos.h | 112 ++ drivers/net/mrvl/rte_pmd_mrvl_version.map | 3 + mk/rte.app.mk | 1 + 9 files changed, 3213 insertions(+) create mode 100644 drivers/net/mrvl/Makefile create mode 100644 drivers/net/mrvl/mrvl_ethdev.c create mode 100644 drivers/net/mrvl/mrvl_ethdev.h create mode 100644 drivers/net/mrvl/mrvl_qos.c create mode 100644 drivers/net/mrvl/mrvl_qos.h create mode 100644 drivers/net/mrvl/rte_pmd_mrvl_version.map diff --git a/config/common_base b/config/common_base index 5e97a08..d05a60c 100644 --- a/config/common_base +++ b/config/common_base @@ -262,6 +262,13 @@ CONFIG_RTE_LIBRTE_NFP_PMD=n CONFIG_RTE_LIBRTE_NFP_DEBUG=n # +# Compile Marvell PMD driver +# +CONFIG_RTE_LIBRTE_MRVL_PMD=n +CONFIG_RTE_LIBRTE_MRVL_DEBUG=n +CONFIG_RTE_MRVL_MUSDK_DMA_MEMSIZE=41943040 + +# # Compile burst-oriented Broadcom BNXT PMD driver # CONFIG_RTE_LIBRTE_BNXT_PMD=y diff --git a/drivers/net/Makefile b/drivers/net/Makefile index d33c959..4a3a205 100644 --- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -73,6 +73,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4 DEPDIRS-mlx4 = $(core-libs) DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5 DEPDIRS-mlx5 = $(core-libs) +DIRS-$(CONFIG_RTE_LIBRTE_MRVL_PMD) += mrvl +DEPDIRS-mrvl = $(core-libs) DIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += nfp DEPDIRS-nfp = $(core-libs) DIRS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt diff --git a/drivers/net/mrvl/Makefile b/drivers/net/mrvl/Makefile new file mode 100644 index 0000000..ab53f49 --- /dev/null +++ b/drivers/net/mrvl/Makefile @@ -0,0 +1,69 @@ +# BSD LICENSE +# +# Copyright(c) 2017 Semihalf. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Semihalf nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +include $(RTE_SDK)/mk/rte.vars.mk + +ifneq ($(MAKECMDGOALS),clean) +ifneq ($(MAKECMDGOALS),config) +ifeq ($(LIBMUSDK_PATH),) +$(error "Please define LIBMUSDK_PATH environment variable") +endif +ifeq ($(CONFIG_RTE_LIBRTE_CFGFILE),n) +$(error "RTE_LIBRTE_CFGFILE must be enabled in configuration!") +endif +endif +endif + +# library name +LIB = librte_pmd_mrvl.a + +# library version +LIBABIVER := 1 + +# versioning export map +EXPORT_MAP := rte_pmd_mrvl_version.map + +# external library dependencies +CFLAGS += -I$(LIBMUSDK_PATH)/include +CFLAGS += -DMVCONF_ARCH_DMA_ADDR_T_64BIT +CFLAGS += -DCONF_PP2_BPOOL_COOKIE_SIZE=32 +CFLAGS += $(WERROR_FLAGS) +CFLAGS += -O3 +LDLIBS += -L$(LIBMUSDK_PATH)/lib +LDLIBS += -lmusdk + +# library source files +SRCS-$(CONFIG_RTE_LIBRTE_MRVL_PMD) += mrvl_ethdev.c +SRCS-$(CONFIG_RTE_LIBRTE_MRVL_PMD) += mrvl_qos.c + +# library dependencies +DEPDIRS-$(CONFIG_RTE_LIBRTE_MRVL_PMD) += lib/librte_cfgfile + +include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/net/mrvl/mrvl_ethdev.c b/drivers/net/mrvl/mrvl_ethdev.c new file mode 100644 index 0000000..6ef9a32 --- /dev/null +++ b/drivers/net/mrvl/mrvl_ethdev.c @@ -0,0 +1,2277 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2017 Semihalf. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Semihalf nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include +#include +#include +#include +#include +#include + +/* Unluckily, container_of is defined by both DPDK and MUSDK, + * we'll declare only one version. + * + * Note that it is not used in this PMD anyway. + */ +#ifdef container_of +#undef container_of +#endif + +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "mrvl_ethdev.h" +#include "mrvl_qos.h" + +/* bitmask with reserved hifs */ +#define MRVL_MUSDK_HIFS_RESERVED 0x0F +/* bitmask with reserved bpools */ +#define MRVL_MUSDK_BPOOLS_RESERVED 0x07 +/* bitmask with reserved kernel RSS tables */ +#define MRVL_MUSDK_RSS_RESERVED 0x01 +/* maximum number of available hifs */ +#define MRVL_MUSDK_HIFS_MAX 9 + +/* prefetch shift */ +#define MRVL_MUSDK_PREFETCH_SHIFT 2 + +/* TCAM has 25 entries reserved for uc/mc filter entries */ +#define MRVL_MAC_ADDRS_MAX 25 +#define MRVL_MATCH_LEN 16 +#define MRVL_PKT_EFFEC_OFFS (MRVL_PKT_OFFS + MV_MH_SIZE) +/* Maximum allowable packet size */ +#define MRVL_PKT_SIZE_MAX (10240 - MV_MH_SIZE) + +#define MRVL_IFACE_NAME_ARG "iface" +#define MRVL_CFG_ARG "cfg" + +#define MRVL_BURST_SIZE 64 + +#define MRVL_ARP_LENGTH 28 + +#define MRVL_COOKIE_ADDR_INVALID ~0ULL + +#define MRVL_COOKIE_HIGH_ADDR_SHIFT (sizeof(pp2_cookie_t) * 8) +#define MRVL_COOKIE_HIGH_ADDR_MASK (~0ULL << MRVL_COOKIE_HIGH_ADDR_SHIFT) + +static const char * const valid_args[] = { + MRVL_IFACE_NAME_ARG, + MRVL_CFG_ARG, + NULL +}; + +static int used_hifs = MRVL_MUSDK_HIFS_RESERVED; +static struct pp2_hif *hifs[RTE_MAX_LCORE]; +static int used_bpools[PP2_NUM_PKT_PROC] = { + MRVL_MUSDK_BPOOLS_RESERVED, + MRVL_MUSDK_BPOOLS_RESERVED +}; + +struct pp2_bpool *mrvl_port_to_bpool_lookup[RTE_MAX_ETHPORTS]; +int mrvl_port_bpool_size[PP2_NUM_PKT_PROC][PP2_BPOOL_NUM_POOLS][RTE_MAX_LCORE]; +uint64_t cookie_addr_high = MRVL_COOKIE_ADDR_INVALID; + +/* + * To use buffer harvesting based on loopback port shadow queue structure + * was introduced for buffers information bookkeeping. + * + * Before sending the packet, related buffer information (pp2_buff_inf) is + * stored in shadow queue. After packet is transmitted no longer used + * packet buffer is released back to it's original hardware pool, + * on condition it originated from interface. + * In case it was generated by application itself i.e: mbuf->port field is + * 0xff then its released to software mempool. + */ +struct mrvl_shadow_txq { + int head; /* write index - used when sending buffers */ + int tail; /* read index - used when releasing buffers */ + u16 size; /* queue occupied size */ + u16 num_to_release; /* number of buffers sent, that can be released */ + struct buff_release_entry ent[MRVL_PP2_TX_SHADOWQ_SIZE]; /* q entries */ +}; + +struct mrvl_rxq; +struct mrvl_txq; + +struct mrvl_rxq { + struct mrvl_priv *priv; + struct rte_mempool *mp; + int queue_id; + int port_id; + int cksum_enabled; + uint64_t bytes_recv; + uint64_t drop_mac; +}; + +struct mrvl_txq { + struct mrvl_priv *priv; + int queue_id; + int port_id; + uint64_t bytes_sent; +}; + +/* + * Every tx queue should have dedicated shadow tx queue. + * + * Ports assigned by DPDK might not start at zero or be continuous so + * as a workaround define shadow queues for each possible port so that + * we eventually fit somewhere. + */ +struct mrvl_shadow_txq shadow_txqs[RTE_MAX_ETHPORTS][RTE_MAX_LCORE]; + +/** Number of ports configured. */ +int mrvl_ports_nb; +static int mrvl_lcore_first; +static int mrvl_lcore_last; + +static inline int +mrvl_get_bpool_size(int pp2_id, int pool_id) +{ + int i; + int size = 0; + + for (i = mrvl_lcore_first; i <= mrvl_lcore_last; i++) + size += mrvl_port_bpool_size[pp2_id][pool_id][i]; + + return size; +} + + +static inline int +mrvl_reserve_bit(int *bitmap, int max) +{ + int n = sizeof(*bitmap) * 8 - __builtin_clz(*bitmap); + if (n >= max) + return -1; + + *bitmap |= 1 << n; + + return n; +} + +/** + * Configure rss based on dpdk rss configuration. + * + * @param priv + * Pointer to private structure. + * @param rss_conf + * Pointer to RSS configuration. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_configure_rss(struct mrvl_priv *priv, struct rte_eth_rss_conf *rss_conf) +{ + if (rss_conf->rss_key) + RTE_LOG(WARNING, PMD, "Changing hash key is not supported\n"); + + if (rss_conf->rss_hf == 0) { + priv->ppio_params.inqs_params.hash_type = PP2_PPIO_HASH_T_NONE; + } else if (rss_conf->rss_hf & ETH_RSS_IPV4) { + priv->ppio_params.inqs_params.hash_type = + PP2_PPIO_HASH_T_2_TUPLE; + } else if (rss_conf->rss_hf & ETH_RSS_NONFRAG_IPV4_TCP) { + priv->ppio_params.inqs_params.hash_type = + PP2_PPIO_HASH_T_5_TUPLE; + priv->rss_hf_tcp = 1; + } else if (rss_conf->rss_hf & ETH_RSS_NONFRAG_IPV4_UDP) { + priv->ppio_params.inqs_params.hash_type = + PP2_PPIO_HASH_T_5_TUPLE; + priv->rss_hf_tcp = 0; + } else { + return -EINVAL; + } + + return 0; +} + +/** + * Ethernet device configuration. + * + * Prepare the driver for a given number of TX and RX queues and + * configure RSS. + * + * @param dev + * Pointer to Ethernet device structure. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_dev_configure(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + int ret; + + if (dev->data->dev_conf.rxmode.mq_mode != ETH_MQ_RX_NONE && + dev->data->dev_conf.rxmode.mq_mode != ETH_MQ_RX_RSS) { + RTE_LOG(INFO, PMD, "Unsupported rx multi queue mode %d\n", + dev->data->dev_conf.rxmode.mq_mode); + return -EINVAL; + } + + if (!dev->data->dev_conf.rxmode.hw_strip_crc) { + RTE_LOG(INFO, PMD, + "L2 CRC stripping is always enabled in hw\n"); + dev->data->dev_conf.rxmode.hw_strip_crc = 1; + } + + if (dev->data->dev_conf.rxmode.hw_vlan_strip) { + RTE_LOG(INFO, PMD, "VLAN stripping not supported\n"); + return -EINVAL; + } + + if (dev->data->dev_conf.rxmode.split_hdr_size) { + RTE_LOG(INFO, PMD, "Split headers not supported\n"); + return -EINVAL; + } + + if (dev->data->dev_conf.rxmode.enable_scatter) { + RTE_LOG(INFO, PMD, "RX Scatter/Gather not supported\n"); + return -EINVAL; + } + + if (dev->data->dev_conf.rxmode.enable_lro) { + RTE_LOG(INFO, PMD, "LRO not supported\n"); + return -EINVAL; + } + + if (dev->data->dev_conf.rxmode.jumbo_frame) + dev->data->mtu = dev->data->dev_conf.rxmode.max_rx_pkt_len - + ETHER_HDR_LEN - ETHER_CRC_LEN; + + ret = mrvl_configure_rxqs(priv, dev->data->port_id, + dev->data->nb_rx_queues); + if (ret < 0) + return ret; + + + priv->ppio_params.outqs_params.num_outqs = dev->data->nb_tx_queues; + priv->ppio_params.maintain_stats = 1; + priv->nb_rx_queues = dev->data->nb_rx_queues; + + if (dev->data->nb_rx_queues == 1 && + dev->data->dev_conf.rxmode.mq_mode == ETH_MQ_RX_RSS) { + RTE_LOG(WARNING, PMD, "Disabling hash for 1 rx queue\n"); + priv->ppio_params.inqs_params.hash_type = PP2_PPIO_HASH_T_NONE; + + return 0; + } + + return mrvl_configure_rss(priv, + &dev->data->dev_conf.rx_adv_conf.rss_conf); +} + +/** + * DPDK callback to change the MTU. + * + * Setting the MTU affects hardware MRU (packets larger than the MRU + * will be dropped). + * + * @param dev + * Pointer to Ethernet device structure. + * @param mtu + * New MTU. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_mtu_set(struct rte_eth_dev *dev, uint16_t mtu) +{ + struct mrvl_priv *priv = dev->data->dev_private; + /* extra MV_MH_SIZE bytes are required for Marvell tag */ + uint16_t mru = mtu + MV_MH_SIZE + ETHER_HDR_LEN + ETHER_CRC_LEN; + int ret; + + if ((mtu < ETHER_MIN_MTU) || (mru > MRVL_PKT_SIZE_MAX)) + return -EINVAL; + + ret = pp2_ppio_set_mru(priv->ppio, mru); + if (ret) + return ret; + + return pp2_ppio_set_mtu(priv->ppio, mtu); +} + +/** + * DPDK callback to bring the link up. + * + * @param dev + * Pointer to Ethernet device structure. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_dev_set_link_up(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + int ret; + + ret = pp2_ppio_enable(priv->ppio); + if (ret) + return ret; + + /* + * mtu/mru can be updated if pp2_ppio_enable() was called at least once + * as pp2_ppio_enable() changes port->t_mode from default 0 to + * PP2_TRAFFIC_INGRESS_EGRESS. + * + * Set mtu to default DPDK value here. + */ + ret = mrvl_mtu_set(dev, dev->data->mtu); + if (ret) + pp2_ppio_disable(priv->ppio); + + dev->data->dev_link.link_status = ETH_LINK_UP; + + return ret; +} + +/** + * DPDK callback to bring the link down. + * + * @param dev + * Pointer to Ethernet device structure. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_dev_set_link_down(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + int ret; + + ret = pp2_ppio_disable(priv->ppio); + if (ret) + return ret; + + dev->data->dev_link.link_status = ETH_LINK_DOWN; + + return ret; +} + +/** + * DPDK callback to start the device. + * + * @param dev + * Pointer to Ethernet device structure. + * + * @return + * 0 on success, negative errno value on failure. + */ +static int +mrvl_dev_start(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + char match[MRVL_MATCH_LEN]; + int ret; + + snprintf(match, sizeof(match), "ppio-%d:%d", + priv->pp_id, priv->ppio_id); + priv->ppio_params.match = match; + + /* + * Calculate the maximum bpool size for refill feature to 1.5 of the + * configured size. In case the bpool size will exceed this value, + * superfluous buffers will be removed + */ + priv->bpool_max_size = priv->bpool_init_size + + (priv->bpool_init_size >> 1); + /* + * Calculate the minimum bpool size for refill feature as follows: + * 2 default burst sizes multiply by number of rx queues. + * If the bpool size will be below this value, new buffers will + * be added to the pool. + */ + priv->bpool_min_size = priv->nb_rx_queues * MRVL_BURST_SIZE * 2; + + ret = pp2_ppio_init(&priv->ppio_params, &priv->ppio); + if (ret) + return ret; + + /* + * In case there are some some stale uc/mc mac addresses flush them + * here. It cannot be done during mrvl_dev_close() as port information + * is already gone at that point (due to pp2_ppio_deinit() in + * mrvl_dev_stop()). + */ + if (!priv->uc_mc_flushed) { + ret = pp2_ppio_flush_mac_addrs(priv->ppio, 1, 1); + if (ret) { + RTE_LOG(ERR, PMD, + "Failed to flush uc/mc filter list\n"); + goto out; + } + priv->uc_mc_flushed = 1; + } + + if (!priv->vlan_flushed) { + ret = pp2_ppio_flush_vlan(priv->ppio); + if (ret) { + RTE_LOG(ERR, PMD, "Failed to flush vlan list\n"); + /* + * TODO + * once pp2_ppio_flush_vlan() is supported jump to out + * goto out; + */ + } + priv->vlan_flushed = 1; + } + + /* For default QoS config, don't start classifier. */ + if (mrvl_qos_cfg != NULL) { + ret = mrvl_start_qos_mapping(priv); + if (ret) { + pp2_ppio_deinit(priv->ppio); + return ret; + } + } + + ret = mrvl_dev_set_link_up(dev); + if (ret) + goto out; + + return 0; +out: + pp2_ppio_deinit(priv->ppio); + return ret; +} + +/** + * Flush receive queues. + * + * @param dev + * Pointer to Ethernet device structure. + */ +static void +mrvl_flush_rx_queues(struct rte_eth_dev *dev) +{ + int i; + + RTE_LOG(INFO, PMD, "Flushing rx queues\n"); + for (i = 0; i < dev->data->nb_rx_queues; i++) { + int ret, num; + + do { + struct mrvl_rxq *q = dev->data->rx_queues[i]; + struct pp2_ppio_desc descs[MRVL_PP2_RXD_MAX]; + + num = MRVL_PP2_RXD_MAX; + ret = pp2_ppio_recv(q->priv->ppio, + q->priv->rxq_map[q->queue_id].tc, + q->priv->rxq_map[q->queue_id].inq, + descs, (uint16_t *)&num); + } while (ret == 0 && num); + } +} + +/** + * Flush transmit shadow queues. + * + * @param dev + * Pointer to Ethernet device structure. + */ +static void +mrvl_flush_tx_shadow_queues(struct rte_eth_dev *dev) +{ + int i; + + RTE_LOG(INFO, PMD, "Flushing tx shadow queues\n"); + for (i = 0; i < RTE_MAX_LCORE; i++) { + struct mrvl_shadow_txq *sq = + &shadow_txqs[dev->data->port_id][i]; + + while (sq->tail != sq->head) { + uint64_t addr = cookie_addr_high | + sq->ent[sq->tail].buff.cookie; + rte_pktmbuf_free((struct rte_mbuf *)addr); + sq->tail = (sq->tail + 1) & MRVL_PP2_TX_SHADOWQ_MASK; + } + + memset(sq, 0, sizeof(*sq)); + } +} + +/** + * Flush hardware bpool (buffer-pool). + * + * @param dev + * Pointer to Ethernet device structure. + */ +static void +mrvl_flush_bpool(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + uint32_t num; + int ret; + + ret = pp2_bpool_get_num_buffs(priv->bpool, &num); + if (ret) { + RTE_LOG(ERR, PMD, "Failed to get bpool buffers number\n"); + return; + } + + while (num--) { + struct pp2_buff_inf inf; + uint64_t addr; + + ret = pp2_bpool_get_buff(hifs[rte_lcore_id()], priv->bpool, + &inf); + if (ret) + break; + + addr = cookie_addr_high | inf.cookie; + rte_pktmbuf_free((struct rte_mbuf *)addr); + } +} + +/** + * DPDK callback to stop the device. + * + * @param dev + * Pointer to Ethernet device structure. + */ +static void +mrvl_dev_stop(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + + mrvl_dev_set_link_down(dev); + mrvl_flush_rx_queues(dev); + mrvl_flush_tx_shadow_queues(dev); + if (priv->qos_tbl) + pp2_cls_qos_tbl_deinit(priv->qos_tbl); + pp2_ppio_deinit(priv->ppio); + priv->ppio = NULL; +} + +/** + * DPDK callback to close the device. + * + * @param dev + * Pointer to Ethernet device structure. + */ +static void +mrvl_dev_close(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + size_t i; + + for (i = 0; i < priv->ppio_params.inqs_params.num_tcs; ++i) + if (priv->ppio_params.inqs_params.tcs_params[i].inqs_params) { + rte_free(priv->ppio_params.inqs_params. + tcs_params[i].inqs_params); + priv->ppio_params.inqs_params. + tcs_params[i].inqs_params = NULL; + } + + mrvl_flush_bpool(dev); +} + +/** + * DPDK callback to retrieve physical link information. + * + * @param dev + * Pointer to Ethernet device structure. + * @param wait_to_complete + * Wait for request completion (ignored). + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused) +{ + /* + * TODO + * once MUSDK provides necessary API use it here + */ + struct ethtool_cmd edata; + struct ifreq req; + int ret, fd; + + edata.cmd = ETHTOOL_GSET; + + strcpy(req.ifr_name, dev->data->name); + req.ifr_data = (void *)&edata; + + fd = socket(AF_INET, SOCK_DGRAM, 0); + if (fd == -1) + return -EFAULT; + + ret = ioctl(fd, SIOCETHTOOL, &req); + if (ret == -1) { + close(fd); + return -EFAULT; + } + + close(fd); + + switch (ethtool_cmd_speed(&edata)) { + case SPEED_10: + dev->data->dev_link.link_speed = ETH_SPEED_NUM_10M; + break; + case SPEED_100: + dev->data->dev_link.link_speed = ETH_SPEED_NUM_100M; + break; + case SPEED_1000: + dev->data->dev_link.link_speed = ETH_SPEED_NUM_1G; + break; + case SPEED_10000: + dev->data->dev_link.link_speed = ETH_SPEED_NUM_10G; + break; + default: + dev->data->dev_link.link_speed = ETH_SPEED_NUM_NONE; + } + + dev->data->dev_link.link_duplex = edata.duplex ? ETH_LINK_FULL_DUPLEX : + ETH_LINK_HALF_DUPLEX; + dev->data->dev_link.link_autoneg = edata.autoneg ? ETH_LINK_AUTONEG : + ETH_LINK_FIXED; + + return 0; +} + +/** + * DPDK callback to enable promiscuous mode. + * + * @param dev + * Pointer to Ethernet device structure. + */ +static void +mrvl_promiscuous_enable(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + int ret; + + ret = pp2_ppio_set_uc_promisc(priv->ppio, 1); + if (ret) + RTE_LOG(ERR, PMD, "Failed to enable promiscuous mode\n"); +} + +/** + * DPDK callback to enable allmulti mode. + * + * @param dev + * Pointer to Ethernet device structure. + */ +static void +mrvl_allmulticast_enable(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + int ret; + + ret = pp2_ppio_set_mc_promisc(priv->ppio, 1); + if (ret) + RTE_LOG(ERR, PMD, "Failed enable all-multicast mode\n"); +} + +/** + * DPDK callback to disable promiscuous mode. + * + * @param dev + * Pointer to Ethernet device structure. + */ +static void +mrvl_promiscuous_disable(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + int ret; + + ret = pp2_ppio_set_uc_promisc(priv->ppio, 0); + if (ret) + RTE_LOG(ERR, PMD, "Failed to disable promiscuous mode\n"); +} + +/** + * DPDK callback to disable allmulticast mode. + * + * @param dev + * Pointer to Ethernet device structure. + */ +static void +mrvl_allmulticast_disable(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + int ret; + + ret = pp2_ppio_set_mc_promisc(priv->ppio, 0); + if (ret) + RTE_LOG(ERR, PMD, "Failed to disable all-multicast mode\n"); +} + +/** + * DPDK callback to remove a MAC address. + * + * @param dev + * Pointer to Ethernet device structure. + * @param index + * MAC address index. + */ +static void +mrvl_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index) +{ + struct mrvl_priv *priv = dev->data->dev_private; + char buf[ETHER_ADDR_FMT_SIZE]; + int ret; + + ret = pp2_ppio_remove_mac_addr(priv->ppio, + dev->data->mac_addrs[index].addr_bytes); + if (ret) { + ether_format_addr(buf, sizeof(buf), + &dev->data->mac_addrs[index]); + RTE_LOG(ERR, PMD, "Failed to remove mac %s\n", buf); + } +} + +/** + * DPDK callback to add a MAC address. + * + * @param dev + * Pointer to Ethernet device structure. + * @param mac_addr + * MAC address to register. + * @param index + * MAC address index. + * @param vmdq + * VMDq pool index to associate address with (unused). + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr, + uint32_t index, uint32_t vmdq __rte_unused) +{ + struct mrvl_priv *priv = dev->data->dev_private; + char buf[ETHER_ADDR_FMT_SIZE]; + int ret; + + if (index == 0) + /* For setting index 0, mrvl_mac_addr_set() should be used.*/ + return -1; + + /* + * Maximum number of uc addresses can be tuned via kernel module mvpp2x + * parameter uc_filter_max. Maximum number of mc addresses is then + * MRVL_MAC_ADDRS_MAX - uc_filter_max. Currently it defaults to 4 and + * 21 respectively. + * + * If more than uc_filter_max uc addresses were added to filter list + * then NIC will switch to promiscuous mode automatically. + * + * If more than MRVL_MAC_ADDRS_MAX - uc_filter_max number mc addresses + * were added to filter list then NIC will switch to all-multicast mode + * automatically. + */ + ret = pp2_ppio_add_mac_addr(priv->ppio, mac_addr->addr_bytes); + if (ret) { + ether_format_addr(buf, sizeof(buf), mac_addr); + RTE_LOG(ERR, PMD, "Failed to add mac %s\n", buf); + return -1; + } + + return 0; +} + +/** + * DPDK callback to set the primary MAC address. + * + * @param dev + * Pointer to Ethernet device structure. + * @param mac_addr + * MAC address to register. + */ +static void +mrvl_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr) +{ + struct mrvl_priv *priv = dev->data->dev_private; + + pp2_ppio_set_mac_addr(priv->ppio, mac_addr->addr_bytes); + /* + * TODO + * Port stops sending packets if pp2_ppio_set_mac_addr() + * was called after pp2_ppio_enable(). As a quick fix issue + * enable port once again. + */ + pp2_ppio_enable(priv->ppio); +} + +/** + * DPDK callback to get device statistics. + * + * @param dev + * Pointer to Ethernet device structure. + * @param stats + * Stats structure output buffer. + */ +static void +mrvl_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) +{ + struct mrvl_priv *priv = dev->data->dev_private; + struct pp2_ppio_statistics ppio_stats; + uint64_t drop_mac = 0; + unsigned int i, idx, ret; + + for (i = 0; i < dev->data->nb_rx_queues; i++) { + struct mrvl_rxq *rxq = dev->data->rx_queues[i]; + struct pp2_ppio_inq_statistics rx_stats; + + if (!rxq) + continue; + + idx = rxq->queue_id; + if (unlikely(idx >= RTE_ETHDEV_QUEUE_STAT_CNTRS)) { + RTE_LOG(ERR, PMD, + "rx queue %d stats out of range (0 - %d)\n", + idx, RTE_ETHDEV_QUEUE_STAT_CNTRS - 1); + continue; + } + + ret = pp2_ppio_inq_get_statistics(priv->ppio, + priv->rxq_map[idx].tc, priv->rxq_map[idx].inq, + &rx_stats, 0); + if (unlikely(ret)) { + RTE_LOG(ERR, PMD, + "Failed to update rx queue %d stats\n", idx); + break; + } + + stats->q_ibytes[idx] = rxq->bytes_recv; + stats->q_ipackets[idx] = rx_stats.enq_desc - rxq->drop_mac; + stats->q_errors[idx] = rx_stats.drop_early + + rx_stats.drop_fullq + + rx_stats.drop_bm + + rxq->drop_mac; + stats->ibytes += rxq->bytes_recv; + drop_mac += rxq->drop_mac; + } + + for (i = 0; i < dev->data->nb_tx_queues; i++) { + struct mrvl_txq *txq = dev->data->tx_queues[i]; + struct pp2_ppio_outq_statistics tx_stats; + + if (!txq) + continue; + + idx = txq->queue_id; + if (unlikely(idx >= RTE_ETHDEV_QUEUE_STAT_CNTRS)) { + RTE_LOG(ERR, PMD, + "tx queue %d stats out of range (0 - %d)\n", + idx, RTE_ETHDEV_QUEUE_STAT_CNTRS - 1); + } + + ret = pp2_ppio_outq_get_statistics(priv->ppio, idx, + &tx_stats, 0); + if (unlikely(ret)) { + RTE_LOG(ERR, PMD, + "Failed to update tx queue %d stats\n", idx); + break; + } + + stats->q_opackets[idx] = tx_stats.deq_desc; + stats->q_obytes[idx] = txq->bytes_sent; + stats->obytes += txq->bytes_sent; + } + + ret = pp2_ppio_get_statistics(priv->ppio, &ppio_stats, 0); + if (unlikely(ret)) { + RTE_LOG(ERR, PMD, "Failed to update port statistics\n"); + return; + } + + stats->ipackets += ppio_stats.rx_packets - drop_mac; + stats->opackets += ppio_stats.tx_packets; + stats->imissed += ppio_stats.rx_fullq_dropped + + ppio_stats.rx_bm_dropped + + ppio_stats.rx_early_dropped + + ppio_stats.rx_fifo_dropped + + ppio_stats.rx_cls_dropped; + stats->ierrors = drop_mac; +} + +/** + * DPDK callback to clear device statistics. + * + * @param dev + * Pointer to Ethernet device structure. + */ +static void +mrvl_stats_reset(struct rte_eth_dev *dev) +{ + struct mrvl_priv *priv = dev->data->dev_private; + int i; + + for (i = 0; i < dev->data->nb_rx_queues; i++) { + struct mrvl_rxq *rxq = dev->data->rx_queues[i]; + + pp2_ppio_inq_get_statistics(priv->ppio, + priv->rxq_map[i].tc, priv->rxq_map[i].inq, NULL, 1); + rxq->bytes_recv = 0; + rxq->drop_mac = 0; + } + + for (i = 0; i < dev->data->nb_tx_queues; i++) { + struct mrvl_txq *txq = dev->data->tx_queues[i]; + + pp2_ppio_outq_get_statistics(priv->ppio, i, NULL, 1); + txq->bytes_sent = 0; + } + + pp2_ppio_get_statistics(priv->ppio, NULL, 1); +} + +/** + * DPDK callback to get information about the device. + * + * @param dev + * Pointer to Ethernet device structure (unused). + * @param info + * Info structure output buffer. + */ +static void +mrvl_dev_infos_get(struct rte_eth_dev *dev __rte_unused, + struct rte_eth_dev_info *info) +{ + info->max_rx_queues = MRVL_PP2_RXQ_MAX; + info->max_tx_queues = MRVL_PP2_TXQ_MAX; + info->max_mac_addrs = MRVL_MAC_ADDRS_MAX; + + info->rx_desc_lim.nb_max = MRVL_PP2_RXD_MAX; + info->rx_desc_lim.nb_min = MRVL_PP2_RXD_MIN; + info->rx_desc_lim.nb_align = MRVL_PP2_RXD_ALIGN; + + info->tx_desc_lim.nb_max = MRVL_PP2_TXD_MAX; + info->tx_desc_lim.nb_min = MRVL_PP2_TXD_MIN; + info->tx_desc_lim.nb_align = MRVL_PP2_TXD_ALIGN; + + info->rx_offload_capa = DEV_RX_OFFLOAD_IPV4_CKSUM | + DEV_RX_OFFLOAD_UDP_CKSUM | + DEV_RX_OFFLOAD_TCP_CKSUM; + + info->tx_offload_capa = DEV_TX_OFFLOAD_IPV4_CKSUM | + DEV_TX_OFFLOAD_UDP_CKSUM | + DEV_TX_OFFLOAD_TCP_CKSUM; + + info->flow_type_rss_offloads = ETH_RSS_IPV4 | + ETH_RSS_NONFRAG_IPV4_TCP | + ETH_RSS_NONFRAG_IPV4_UDP; + + /* By default packets are dropped if no descriptors are available */ + info->default_rxconf.rx_drop_en = 1; + + info->max_rx_pktlen = MRVL_PKT_SIZE_MAX; +} + +/** + * Return supported packet types. + * + * @param dev + * Pointer to Ethernet device structure (unused). + * + * @return + * Const pointer to the table with supported packet types. + */ +static const uint32_t * +mrvl_dev_supported_ptypes_get(struct rte_eth_dev *dev __rte_unused) +{ + static const uint32_t ptypes[] = { + RTE_PTYPE_L2_ETHER, + RTE_PTYPE_L3_IPV4, + RTE_PTYPE_L3_IPV4_EXT, + RTE_PTYPE_L3_IPV4_EXT_UNKNOWN, + RTE_PTYPE_L3_IPV6, + RTE_PTYPE_L3_IPV6_EXT, + RTE_PTYPE_L2_ETHER_ARP, + RTE_PTYPE_L4_TCP, + RTE_PTYPE_L4_UDP + }; + + return ptypes; +} + +/** + * DPDK callback to get information about specific receive queue. + * + * @param dev + * Pointer to Ethernet device structure. + * @param rx_queue_id + * Receive queue index. + * @param qinfo + * Receive queue information structure. + */ +static void mrvl_rxq_info_get(struct rte_eth_dev *dev, uint16_t rx_queue_id, + struct rte_eth_rxq_info *qinfo) +{ + struct mrvl_rxq *q = dev->data->rx_queues[rx_queue_id]; + struct mrvl_priv *priv = dev->data->dev_private; + int inq = priv->rxq_map[rx_queue_id].inq; + int tc = priv->rxq_map[rx_queue_id].tc; + + qinfo->mp = q->mp; + qinfo->nb_desc = priv->ppio_params.inqs_params. + tcs_params[tc].inqs_params[inq].size; +} + +/** + * DPDK callback to get information about specific transmit queue. + * + * @param dev + * Pointer to Ethernet device structure. + * @param tx_queue_id + * Transmit queue index. + * @param qinfo + * Transmit queue information structure. + */ +static void mrvl_txq_info_get(struct rte_eth_dev *dev, uint16_t tx_queue_id, + struct rte_eth_txq_info *qinfo) +{ + struct mrvl_priv *priv = dev->data->dev_private; + + qinfo->nb_desc = + priv->ppio_params.outqs_params.outqs_params[tx_queue_id].size; +} + +/** + * DPDK callback to Configure a VLAN filter. + * + * @param dev + * Pointer to Ethernet device structure. + * @param vlan_id + * VLAN ID to filter. + * @param on + * Toggle filter. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on) +{ + struct mrvl_priv *priv = dev->data->dev_private; + + return on ? pp2_ppio_add_vlan(priv->ppio, vlan_id) : + pp2_ppio_remove_vlan(priv->ppio, vlan_id); +} + +/** + * Release buffers to hardware bpool (buffer-pool) + * + * @param rxq + * Receive queue pointer. + * @param num + * Number of buffers to release to bpool. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_fill_bpool(struct mrvl_rxq *rxq, int num) +{ + struct buff_release_entry entries[MRVL_PP2_TXD_MAX]; + struct rte_mbuf *mbufs[MRVL_PP2_TXD_MAX]; + int i, ret; + unsigned int core_id = rte_lcore_id(); + struct pp2_hif *hif = hifs[core_id]; + struct pp2_bpool *bpool = rxq->priv->bpool; + + ret = rte_pktmbuf_alloc_bulk(rxq->mp, mbufs, num); + if (ret) + return ret; + + if (cookie_addr_high == MRVL_COOKIE_ADDR_INVALID) + cookie_addr_high = + (uint64_t)mbufs[0] & MRVL_COOKIE_HIGH_ADDR_MASK; + + for (i = 0; i < num; i++) { + if (((uint64_t)mbufs[i] & MRVL_COOKIE_HIGH_ADDR_MASK) + != cookie_addr_high) { + RTE_LOG(ERR, PMD, + "mbuf virtual addr high 0x%lx out of range\n", + (uint64_t)mbufs[i] >> 32); + goto out; + } + + entries[i].buff.addr = + rte_mbuf_data_dma_addr_default(mbufs[i]); + entries[i].buff.cookie = (pp2_cookie_t)(uint64_t)mbufs[i]; + entries[i].bpool = bpool; + } + + pp2_bpool_put_buffs(hif, entries, (uint16_t *)&i); + mrvl_port_bpool_size[bpool->pp2_id][bpool->id][core_id] += i; + + if (i != num) + goto out; + + return 0; +out: + for (; i < num; i++) + rte_pktmbuf_free(mbufs[i]); + + return -1; +} + +/** + * DPDK callback to configure the receive queue. + * + * @param dev + * Pointer to Ethernet device structure. + * @param idx + * RX queue index. + * @param desc + * Number of descriptors to configure in queue. + * @param socket + * NUMA socket on which memory must be allocated. + * @param conf + * Thresholds parameters (unused_). + * @param mp + * Memory pool for buffer allocations. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc, + unsigned int socket, + const struct rte_eth_rxconf *conf __rte_unused, + struct rte_mempool *mp) +{ + struct mrvl_priv *priv = dev->data->dev_private; + struct mrvl_rxq *rxq; + uint32_t min_size, + max_rx_pkt_len = dev->data->dev_conf.rxmode.max_rx_pkt_len; + int ret; + + if (priv->rxq_map[idx].tc == MRVL_UNKNOWN_TC) { + /* + * Unknown TC mapping, mapping will not have a correct queue. + */ + RTE_LOG(ERR, PMD, "Unknown TC mapping for queue %hu eth%hhu\n", + idx, priv->ppio_id); + return -EFAULT; + } + + min_size = rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM - + MRVL_PKT_EFFEC_OFFS; + if (min_size < max_rx_pkt_len) { + RTE_LOG(ERR, PMD, "Mbuf size must be increased to %u bytes" + " to hold up to %u bytes of data.\n", + max_rx_pkt_len + RTE_PKTMBUF_HEADROOM + + MRVL_PKT_EFFEC_OFFS, + max_rx_pkt_len); + return -EINVAL; + } + + if (dev->data->rx_queues[idx]) { + rte_free(dev->data->rx_queues[idx]); + dev->data->rx_queues[idx] = NULL; + } + + rxq = rte_zmalloc_socket("rxq", sizeof(*rxq), 0, socket); + if (!rxq) + return -ENOMEM; + + rxq->priv = priv; + rxq->mp = mp; + rxq->cksum_enabled = dev->data->dev_conf.rxmode.hw_ip_checksum; + rxq->queue_id = idx; + rxq->port_id = dev->data->port_id; + mrvl_port_to_bpool_lookup[rxq->port_id] = priv->bpool; + + priv->ppio_params.inqs_params. + tcs_params[priv->rxq_map[rxq->queue_id].tc]. + inqs_params[priv->rxq_map[rxq->queue_id].inq].size = desc; + + ret = mrvl_fill_bpool(rxq, desc); + if (ret) { + rte_free(rxq); + return ret; + } + + priv->bpool_init_size += desc; + + dev->data->rx_queues[idx] = rxq; + + return 0; +} + +/** + * DPDK callback to release the receive queue. + * + * @param rxq + * Generic receive queue pointer. + */ +static void +mrvl_rx_queue_release(void *rxq) +{ + struct mrvl_rxq *q = rxq; + int i, num; + + if (!q) + return; + + num = q->priv->ppio_params.inqs_params. + tcs_params[q->priv->rxq_map[q->queue_id].tc]. + inqs_params[q->priv->rxq_map[q->queue_id].inq].size; + for (i = 0; i < num; i++) { + struct pp2_buff_inf inf; + uint64_t addr; + + pp2_bpool_get_buff(hifs[rte_lcore_id()], q->priv->bpool, &inf); + addr = cookie_addr_high | inf.cookie; + rte_pktmbuf_free((struct rte_mbuf *)addr); + } + + rte_free(q); +} + +/** + * DPDK callback to configure the transmit queue. + * + * @param dev + * Pointer to Ethernet device structure. + * @param idx + * Transmit queue index. + * @param desc + * Number of descriptors to configure in the queue. + * @param socket + * NUMA socket on which memory must be allocated. + * @param conf + * Thresholds parameters (unused). + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc, + unsigned int socket, + const struct rte_eth_txconf *conf __rte_unused) +{ + struct mrvl_priv *priv = dev->data->dev_private; + struct mrvl_txq *txq; + + if (dev->data->tx_queues[idx]) { + rte_free(dev->data->tx_queues[idx]); + dev->data->tx_queues[idx] = NULL; + } + + txq = rte_zmalloc_socket("txq", sizeof(*txq), 0, socket); + if (!txq) + return -ENOMEM; + + txq->priv = priv; + txq->queue_id = idx; + txq->port_id = dev->data->port_id; + dev->data->tx_queues[idx] = txq; + + priv->ppio_params.outqs_params.outqs_params[idx].size = desc; + priv->ppio_params.outqs_params.outqs_params[idx].weight = 1; + + return 0; +} + +/** + * DPDK callback to release the transmit queue. + * + * @param txq + * Generic transmit queue pointer. + */ +static void +mrvl_tx_queue_release(void *txq) +{ + struct mrvl_txq *q = txq; + + if (!q) + return; + + rte_free(q); +} + +/** + * Update RSS hash configuration + * + * @param dev + * Pointer to Ethernet device structure. + * @param rss_conf + * Pointer to RSS configuration. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_rss_hash_update(struct rte_eth_dev *dev, + struct rte_eth_rss_conf *rss_conf) +{ + struct mrvl_priv *priv = dev->data->dev_private; + + return mrvl_configure_rss(priv, rss_conf); +} + +/** + * DPDK callback to get RSS hash configuration. + * + * @param dev + * Pointer to Ethernet device structure. + * @rss_conf + * Pointer to RSS configuration. + * + * @return + * Always 0. + */ +static int +mrvl_rss_hash_conf_get(struct rte_eth_dev *dev, + struct rte_eth_rss_conf *rss_conf) +{ + struct mrvl_priv *priv = dev->data->dev_private; + enum pp2_ppio_hash_type hash_type = + priv->ppio_params.inqs_params.hash_type; + + rss_conf->rss_key = NULL; + + if (hash_type == PP2_PPIO_HASH_T_NONE) + rss_conf->rss_hf = 0; + else if (hash_type == PP2_PPIO_HASH_T_2_TUPLE) + rss_conf->rss_hf = ETH_RSS_IPV4; + else if (hash_type == PP2_PPIO_HASH_T_5_TUPLE && priv->rss_hf_tcp) + rss_conf->rss_hf = ETH_RSS_NONFRAG_IPV4_TCP; + else if (hash_type == PP2_PPIO_HASH_T_5_TUPLE && !priv->rss_hf_tcp) + rss_conf->rss_hf = ETH_RSS_NONFRAG_IPV4_UDP; + + return 0; +} + +static const struct eth_dev_ops mrvl_ops = { + .dev_configure = mrvl_dev_configure, + .dev_start = mrvl_dev_start, + .dev_stop = mrvl_dev_stop, + .dev_set_link_up = mrvl_dev_set_link_up, + .dev_set_link_down = mrvl_dev_set_link_down, + .dev_close = mrvl_dev_close, + .link_update = mrvl_link_update, + .promiscuous_enable = mrvl_promiscuous_enable, + .allmulticast_enable = mrvl_allmulticast_enable, + .promiscuous_disable = mrvl_promiscuous_disable, + .allmulticast_disable = mrvl_allmulticast_disable, + .mac_addr_remove = mrvl_mac_addr_remove, + .mac_addr_add = mrvl_mac_addr_add, + .mac_addr_set = mrvl_mac_addr_set, + .mtu_set = mrvl_mtu_set, + .stats_get = mrvl_stats_get, + .stats_reset = mrvl_stats_reset, + .dev_infos_get = mrvl_dev_infos_get, + .dev_supported_ptypes_get = mrvl_dev_supported_ptypes_get, + .rxq_info_get = mrvl_rxq_info_get, + .txq_info_get = mrvl_txq_info_get, + .vlan_filter_set = mrvl_vlan_filter_set, + .rx_queue_setup = mrvl_rx_queue_setup, + .rx_queue_release = mrvl_rx_queue_release, + .tx_queue_setup = mrvl_tx_queue_setup, + .tx_queue_release = mrvl_tx_queue_release, + .rss_hash_update = mrvl_rss_hash_update, + .rss_hash_conf_get = mrvl_rss_hash_conf_get, +}; + +/** + * Return packet type information and l3/l4 offsets. + * + * @param desc + * Pointer to the received packet descriptor. + * @param l3_offset + * l3 packet offset. + * @param l4_offset + * l4 packet offset. + * + * @return + * Packet type information. + */ +static inline uint32_t +mrvl_desc_to_packet_type_and_offset(struct pp2_ppio_desc *desc, + uint8_t *l3_offset, uint8_t *l4_offset) +{ + enum pp2_inq_l3_type l3_type; + enum pp2_inq_l4_type l4_type; + uint64_t packet_type; + + pp2_ppio_inq_desc_get_l3_info(desc, &l3_type, l3_offset); + pp2_ppio_inq_desc_get_l4_info(desc, &l4_type, l4_offset); + + packet_type = RTE_PTYPE_L2_ETHER; + + switch (l3_type) { + case PP2_INQ_L3_TYPE_IPV4_NO_OPTS: + packet_type |= RTE_PTYPE_L3_IPV4; + break; + case PP2_INQ_L3_TYPE_IPV4_OK: + packet_type |= RTE_PTYPE_L3_IPV4_EXT; + break; + case PP2_INQ_L3_TYPE_IPV4_TTL_ZERO: + packet_type |= RTE_PTYPE_L3_IPV4_EXT_UNKNOWN; + break; + case PP2_INQ_L3_TYPE_IPV6_NO_EXT: + packet_type |= RTE_PTYPE_L3_IPV6; + break; + case PP2_INQ_L3_TYPE_IPV6_EXT: + packet_type |= RTE_PTYPE_L3_IPV6_EXT; + break; + case PP2_INQ_L3_TYPE_ARP: + packet_type |= RTE_PTYPE_L2_ETHER_ARP; + /* + * In case of ARP l4_offset is set to wrong value. + * Set it to proper one so that later on mbuf->l3_len can be + * calculated subtracting l4_offset and l3_offset. + */ + *l4_offset = *l3_offset + MRVL_ARP_LENGTH; + break; + default: + RTE_LOG(DEBUG, PMD, "Failed to recognise l3 packet type\n"); + break; + } + + switch (l4_type) { + case PP2_INQ_L4_TYPE_TCP: + packet_type |= RTE_PTYPE_L4_TCP; + break; + case PP2_INQ_L4_TYPE_UDP: + packet_type |= RTE_PTYPE_L4_UDP; + break; + default: + RTE_LOG(DEBUG, PMD, "Failed to recognise l4 packet type\n"); + break; + } + + return packet_type; +} + +/** + * Get offload information from the received packet descriptor. + * + * @param desc + * Pointer to the received packet descriptor. + * + * @return + * Mbuf offload flags. + */ +static inline uint64_t +mrvl_desc_to_ol_flags(struct pp2_ppio_desc *desc) +{ + uint64_t flags; + enum pp2_inq_desc_status status; + + status = pp2_ppio_inq_desc_get_l3_pkt_error(desc); + if (unlikely(status != PP2_DESC_ERR_OK)) + flags = PKT_RX_IP_CKSUM_BAD; + else + flags = PKT_RX_IP_CKSUM_GOOD; + + status = pp2_ppio_inq_desc_get_l4_pkt_error(desc); + if (unlikely(status != PP2_DESC_ERR_OK)) + flags |= PKT_RX_L4_CKSUM_BAD; + else + flags |= PKT_RX_L4_CKSUM_GOOD; + + return flags; +} + +/** + * DPDK callback for receive. + * + * @param rxq + * Generic pointer to the receive queue. + * @param rx_pkts + * Array to store received packets. + * @param nb_pkts + * Maximum number of packets in array. + * + * @return + * Number of packets successfully received. + */ +static uint16_t +mrvl_rx_pkt_burst(void *rxq, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) +{ + struct mrvl_rxq *q = rxq; + struct pp2_ppio_desc descs[nb_pkts]; + struct pp2_bpool *bpool; + int i, ret, rx_done = 0; + int num; + unsigned int core_id = rte_lcore_id(); + + if (unlikely(!q->priv->ppio)) + return 0; + + bpool = q->priv->bpool; + + ret = pp2_ppio_recv(q->priv->ppio, + q->priv->rxq_map[q->queue_id].tc, + q->priv->rxq_map[q->queue_id].inq, + descs, + &nb_pkts); + if (unlikely(ret < 0)) { + RTE_LOG(ERR, PMD, "Failed to receive packets\n"); + return 0; + } + mrvl_port_bpool_size[bpool->pp2_id][bpool->id][core_id] -= nb_pkts; + + for (i = 0; i < nb_pkts; i++) { + struct rte_mbuf *mbuf; + uint8_t l3_offset, l4_offset; + enum pp2_inq_desc_status status; + uint64_t addr; + + if (likely((nb_pkts - i) > MRVL_MUSDK_PREFETCH_SHIFT)) { + struct pp2_ppio_desc *pref_desc; + u64 pref_addr; + + pref_desc = &descs[i + MRVL_MUSDK_PREFETCH_SHIFT]; + pref_addr = cookie_addr_high | + pp2_ppio_inq_desc_get_cookie(pref_desc); + rte_mbuf_prefetch_part1((struct rte_mbuf *)(pref_addr)); + rte_mbuf_prefetch_part2((struct rte_mbuf *)(pref_addr)); + } + + addr = cookie_addr_high | + pp2_ppio_inq_desc_get_cookie(&descs[i]); + mbuf = (struct rte_mbuf *)addr; + rte_pktmbuf_reset(mbuf); + + /* drop packet in case of mac, overrun or resource error */ + status = pp2_ppio_inq_desc_get_l2_pkt_error(&descs[i]); + if (unlikely(status != PP2_DESC_ERR_OK)) { + struct pp2_buff_inf binf = { + .addr = rte_mbuf_data_dma_addr_default(mbuf), + .cookie = (pp2_cookie_t)(uint64_t)mbuf, + }; + + pp2_bpool_put_buff(hifs[core_id], bpool, &binf); + mrvl_port_bpool_size + [bpool->pp2_id][bpool->id][core_id]++; + q->drop_mac++; + continue; + } + + mbuf->data_off += MRVL_PKT_EFFEC_OFFS; + mbuf->pkt_len = pp2_ppio_inq_desc_get_pkt_len(&descs[i]); + mbuf->data_len = mbuf->pkt_len; + mbuf->port = q->port_id; + mbuf->packet_type = + mrvl_desc_to_packet_type_and_offset(&descs[i], + &l3_offset, + &l4_offset); + mbuf->l2_len = l3_offset; + mbuf->l3_len = l4_offset - l3_offset; + + if (likely(q->cksum_enabled)) + mbuf->ol_flags = mrvl_desc_to_ol_flags(&descs[i]); + + rx_pkts[rx_done++] = mbuf; + q->bytes_recv += mbuf->pkt_len; + } + + if (rte_spinlock_trylock(&q->priv->lock) == 1) { + num = mrvl_get_bpool_size(bpool->pp2_id, bpool->id); + + if (unlikely((num <= q->priv->bpool_min_size) || + (!rx_done && (num < q->priv->bpool_init_size)))) { + ret = mrvl_fill_bpool(q, MRVL_BURST_SIZE); + if (ret) + RTE_LOG(ERR, PMD, "Failed to fill bpool\n"); + } else if (unlikely(num > q->priv->bpool_max_size)) { + int i; + int pkt_to_remove = num - q->priv->bpool_init_size; + struct rte_mbuf *mbuf; + struct pp2_buff_inf buff; + + RTE_LOG(DEBUG, PMD, "\nport-%d:%d: bpool %d oversize -" + " remove %d buffers (pool size: %d -> %d)\n", + bpool->pp2_id, q->priv->ppio->port_id, + bpool->id, pkt_to_remove, num, + q->priv->bpool_init_size); + + for (i = 0; i < pkt_to_remove; i++) { + pp2_bpool_get_buff(hifs[core_id], bpool, &buff); + mbuf = (struct rte_mbuf *) + (cookie_addr_high | buff.cookie); + rte_pktmbuf_free(mbuf); + } + mrvl_port_bpool_size + [bpool->pp2_id][bpool->id][core_id] -= + pkt_to_remove; + } + rte_spinlock_unlock(&q->priv->lock); + } + + return rx_done; +} + +/** + * Prepare offload information. + * + * @param ol_flags + * Offload flags. + * @param packet_type + * Packet type bitfield. + * @param l3_type + * Pointer to the pp2_ouq_l3_type structure. + * @param l4_type + * Pointer to the pp2_outq_l4_type structure. + * @param gen_l3_cksum + * Will be set to 1 in case l3 checksum is computed. + * @param l4_cksum + * Will be set to 1 in case l4 checksum is computed. + * + * @return + * 0 on success, negative error value otherwise. + */ +static inline int +mrvl_prepare_proto_info(uint64_t ol_flags, uint32_t packet_type, + enum pp2_outq_l3_type *l3_type, + enum pp2_outq_l4_type *l4_type, + int *gen_l3_cksum, + int *gen_l4_cksum) +{ + /* + * Based on ol_flags prepare information + * for pp2_ppio_outq_desc_set_proto_info() which setups descriptor + * for offloading. + */ + if (ol_flags & PKT_TX_IPV4) { + *l3_type = PP2_OUTQ_L3_TYPE_IPV4; + *gen_l3_cksum = ol_flags & PKT_TX_IP_CKSUM ? 1 : 0; + } else if (ol_flags & PKT_TX_IPV6) { + *l3_type = PP2_OUTQ_L3_TYPE_IPV6; + /* no checksum for ipv6 header */ + *gen_l3_cksum = 0; + } else { + /* if something different then stop processing */ + return -1; + } + + ol_flags &= PKT_TX_L4_MASK; + if ((packet_type & RTE_PTYPE_L4_TCP) && + (ol_flags == PKT_TX_TCP_CKSUM)) { + *l4_type = PP2_OUTQ_L4_TYPE_TCP; + *gen_l4_cksum = 1; + } else if ((packet_type & RTE_PTYPE_L4_UDP) && + (ol_flags == PKT_TX_UDP_CKSUM)) { + *l4_type = PP2_OUTQ_L4_TYPE_UDP; + *gen_l4_cksum = 1; + } else { + *l4_type = PP2_OUTQ_L4_TYPE_OTHER; + /* no checksum for other type */ + *gen_l4_cksum = 0; + } + + return 0; +} + +/** + * Release already sent buffers to bpool (buffer-pool). + * + * @param ppio + * Pointer to the port structure. + * @param hif + * Pointer to the MUSDK hardware interface. + * @param sq + * Pointer to the shadow queue. + * @param qid + * Queue id number. + * @param force + * Force releasing packets. + */ +static inline void +mrvl_free_sent_buffers(struct pp2_ppio *ppio, struct pp2_hif *hif, + struct mrvl_shadow_txq *sq, int qid, int force) +{ + struct buff_release_entry *entry; + uint16_t nb_done = 0, num = 0, skip_bufs = 0; + int i, core_id = rte_lcore_id(); + + pp2_ppio_get_num_outq_done(ppio, hif, qid, &nb_done); + + sq->num_to_release += nb_done; + + if (likely(!force && + (sq->num_to_release < MRVL_PP2_BUF_RELEASE_BURST_SIZE))) + return; + + nb_done = sq->num_to_release; + sq->num_to_release = 0; + + for (i = 0; i < nb_done; i++) { + entry = &sq->ent[sq->tail + num]; + if (unlikely(!entry->buff.addr)) { + RTE_LOG(ERR, PMD, + "Shadow memory @%d: cookie(%lx), pa(%lx)!\n", + sq->tail, (u64)entry->buff.cookie, + (u64)entry->buff.addr); + skip_bufs = 1; + goto skip; + } + + if (unlikely(!entry->bpool)) { + struct rte_mbuf *mbuf; + + mbuf = (struct rte_mbuf *) + (cookie_addr_high | entry->buff.cookie); + rte_pktmbuf_free(mbuf); + skip_bufs = 1; + goto skip; + } + + mrvl_port_bpool_size + [entry->bpool->pp2_id][entry->bpool->id][core_id]++; + num++; + if (unlikely(sq->tail + num == MRVL_PP2_TX_SHADOWQ_SIZE)) + goto skip; + continue; +skip: + if (likely(num)) + pp2_bpool_put_buffs(hif, &sq->ent[sq->tail], &num); + num += skip_bufs; + sq->tail = (sq->tail + num) & MRVL_PP2_TX_SHADOWQ_MASK; + sq->size -= num; + num = 0; + } + + if (likely(num)) { + pp2_bpool_put_buffs(hif, &sq->ent[sq->tail], &num); + sq->tail = (sq->tail + num) & MRVL_PP2_TX_SHADOWQ_MASK; + sq->size -= num; + } +} + +/** + * DPDK callback for transmit. + * + * @param txq + * Generic pointer transmit queue. + * @param tx_pkts + * Packets to transmit. + * @param nb_pkts + * Number of packets in array. + * + * @return + * Number of packets successfully transmitted. + */ +static uint16_t +mrvl_tx_pkt_burst(void *txq, struct rte_mbuf **tx_pkts, uint16_t nb_pkts) +{ + struct mrvl_txq *q = txq; + struct mrvl_shadow_txq *sq = &shadow_txqs[q->port_id][rte_lcore_id()]; + struct pp2_hif *hif = hifs[rte_lcore_id()]; + struct pp2_ppio_desc descs[nb_pkts]; + int i, ret, bytes_sent = 0; + uint16_t num, sq_free_size; + uint64_t addr; + + if (unlikely(!q->priv->ppio)) + return 0; + + if (sq->size) + mrvl_free_sent_buffers(q->priv->ppio, hif, sq, q->queue_id, 0); + + sq_free_size = MRVL_PP2_TX_SHADOWQ_SIZE - sq->size - 1; + if (unlikely(nb_pkts > sq_free_size)) { + RTE_LOG(DEBUG, PMD, + "No room in shadow queue for %d packets!!!" + "%d packets will be sent.\n", + nb_pkts, sq_free_size); + nb_pkts = sq_free_size; + } + + for (i = 0; i < nb_pkts; i++) { + struct rte_mbuf *mbuf = tx_pkts[i]; + int gen_l3_cksum, gen_l4_cksum; + enum pp2_outq_l3_type l3_type; + enum pp2_outq_l4_type l4_type; + + if (likely((nb_pkts - i) > MRVL_MUSDK_PREFETCH_SHIFT)) { + struct rte_mbuf *pref_pkt_hdr; + + pref_pkt_hdr = tx_pkts[i + MRVL_MUSDK_PREFETCH_SHIFT]; + rte_mbuf_prefetch_part1(pref_pkt_hdr); + rte_mbuf_prefetch_part2(pref_pkt_hdr); + } + + sq->ent[sq->head].buff.cookie = (pp2_cookie_t)(uint64_t)mbuf; + sq->ent[sq->head].buff.addr = + rte_mbuf_data_dma_addr_default(mbuf); + sq->ent[sq->head].bpool = + (unlikely(mbuf->port == 0xff || mbuf->refcnt > 1)) ? + NULL : mrvl_port_to_bpool_lookup[mbuf->port]; + sq->head = (sq->head + 1) & MRVL_PP2_TX_SHADOWQ_MASK; + sq->size++; + + pp2_ppio_outq_desc_reset(&descs[i]); + pp2_ppio_outq_desc_set_phys_addr(&descs[i], + rte_pktmbuf_mtophys(mbuf)); + pp2_ppio_outq_desc_set_pkt_offset(&descs[i], 0); + pp2_ppio_outq_desc_set_pkt_len(&descs[i], + rte_pktmbuf_pkt_len(mbuf)); + + bytes_sent += rte_pktmbuf_pkt_len(mbuf); + /* + * in case unsupported ol_flags were passed + * do not update descriptor offload information + */ + ret = mrvl_prepare_proto_info(mbuf->ol_flags, mbuf->packet_type, + &l3_type, &l4_type, &gen_l3_cksum, + &gen_l4_cksum); + if (unlikely(ret)) + continue; + + pp2_ppio_outq_desc_set_proto_info(&descs[i], l3_type, l4_type, + mbuf->l2_len, + mbuf->l2_len + mbuf->l3_len, + gen_l3_cksum, gen_l4_cksum); + } + + num = nb_pkts; + pp2_ppio_send(q->priv->ppio, hif, q->queue_id, + descs, &nb_pkts); + /* number of packets that were not sent */ + if (unlikely(num > nb_pkts)) { + for (i = nb_pkts; i < num; i++) { + sq->head = (MRVL_PP2_TX_SHADOWQ_SIZE + sq->head - 1) & + MRVL_PP2_TX_SHADOWQ_MASK; + addr = cookie_addr_high | + sq->ent[sq->head].buff.cookie; + bytes_sent -= + rte_pktmbuf_pkt_len((struct rte_mbuf *)addr); + } + sq->size -= num - nb_pkts; + } + + q->bytes_sent += bytes_sent; + + return nb_pkts; +} + +/** + * Initialize packet processor. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_init_pp2(void) +{ + struct pp2_init_params init_params; + + memset(&init_params, 0, sizeof(init_params)); + init_params.hif_reserved_map = MRVL_MUSDK_HIFS_RESERVED; + init_params.bm_pool_reserved_map = MRVL_MUSDK_BPOOLS_RESERVED; + init_params.rss_tbl_reserved_map = MRVL_MUSDK_RSS_RESERVED; + + return pp2_init(&init_params); +} + +/** + * Deinitialize packet processor. + * + * @return + * 0 on success, negative error value otherwise. + */ +static void +mrvl_deinit_pp2(void) +{ + pp2_deinit(); +} + +/** + * Create private device structure. + * + * @param dev_name + * Pointer to the port name passed in the initialization parameters. + * + * @return + * Pointer to the newly allocated private device structure. + */ +static struct mrvl_priv * +mrvl_priv_create(const char *dev_name) +{ + struct pp2_bpool_params bpool_params; + char match[MRVL_MATCH_LEN]; + struct mrvl_priv *priv; + int ret, bpool_bit; + + priv = rte_zmalloc_socket(dev_name, sizeof(*priv), 0, rte_socket_id()); + if (!priv) + return NULL; + + ret = pp2_netdev_get_ppio_info((char *)(uintptr_t)dev_name, + &priv->pp_id, &priv->ppio_id); + if (ret) + goto out_free_priv; + + bpool_bit = mrvl_reserve_bit(&used_bpools[priv->pp_id], + PP2_BPOOL_NUM_POOLS); + if (bpool_bit < 0) + goto out_free_priv; + priv->bpool_bit = bpool_bit; + + snprintf(match, sizeof(match), "pool-%d:%d", priv->pp_id, + priv->bpool_bit); + memset(&bpool_params, 0, sizeof(bpool_params)); + bpool_params.match = match; + bpool_params.buff_len = MRVL_PKT_SIZE_MAX + MRVL_PKT_EFFEC_OFFS; + ret = pp2_bpool_init(&bpool_params, &priv->bpool); + if (ret) + goto out_clear_bpool_bit; + + priv->ppio_params.type = PP2_PPIO_T_NIC; + rte_spinlock_init(&priv->lock); + + return priv; +out_clear_bpool_bit: + used_bpools[priv->pp_id] &= ~(1 << priv->bpool_bit); +out_free_priv: + rte_free(priv); + return NULL; +} + +/** + * Create device representing Ethernet port. + * + * @param name + * Pointer to the port's name. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_eth_dev_create(struct rte_vdev_device *vdev, const char *name) +{ + int ret, fd = socket(AF_INET, SOCK_DGRAM, 0); + struct rte_eth_dev *eth_dev; + struct mrvl_priv *priv; + struct ifreq req; + + eth_dev = rte_eth_dev_allocate(name); + if (!eth_dev) + return -ENOMEM; + + priv = mrvl_priv_create(name); + if (!priv) { + ret = -ENOMEM; + goto out_free_dev; + } + + eth_dev->data->mac_addrs = rte_zmalloc("mac_addrs", + ETHER_ADDR_LEN * MRVL_MAC_ADDRS_MAX, 0); + if (!eth_dev->data->mac_addrs) { + RTE_LOG(ERR, PMD, "Failed to allocate space for eth addrs\n"); + ret = -ENOMEM; + goto out_free_priv; + } + + memset(&req, 0, sizeof(req)); + strcpy(req.ifr_name, name); + ret = ioctl(fd, SIOCGIFHWADDR, &req); + if (ret) + goto out_free_mac; + + memcpy(eth_dev->data->mac_addrs[0].addr_bytes, + req.ifr_addr.sa_data, ETHER_ADDR_LEN); + + + eth_dev->rx_pkt_burst = mrvl_rx_pkt_burst; + eth_dev->tx_pkt_burst = mrvl_tx_pkt_burst; + eth_dev->data->dev_private = priv; + eth_dev->device = &vdev->device; + eth_dev->dev_ops = &mrvl_ops; + + return 0; +out_free_mac: + rte_free(eth_dev->data->mac_addrs); +out_free_dev: + rte_eth_dev_release_port(eth_dev); +out_free_priv: + rte_free(priv); + + return ret; +} + +/** + * Cleanup previously created device representing Ethernet port. + * + * @param name + * Pointer to the port name. + */ +static void +mrvl_eth_dev_destroy(const char *name) +{ + struct rte_eth_dev *eth_dev; + struct mrvl_priv *priv; + + eth_dev = rte_eth_dev_allocated(name); + if (!eth_dev) + return; + + priv = eth_dev->data->dev_private; + pp2_bpool_deinit(priv->bpool); + rte_free(priv); + rte_free(eth_dev->data->mac_addrs); + rte_eth_dev_release_port(eth_dev); +} + +/** + * Callback used by rte_kvargs_process() during argument parsing. + * + * @param key + * Pointer to the parsed key (unused). + * @param value + * Pointer to the parsed value. + * @param extra_args + * Pointer to the extra arguments which contains address of the + * table of pointers to parsed interface names. + * + * @return + * Always 0. + */ +static int +mrvl_get_ifnames(const char *key __rte_unused, const char *value, + void *extra_args) +{ + const char **ifnames = extra_args; + + ifnames[mrvl_ports_nb++] = value; + + return 0; +} + +/** + * Initialize per-lcore MUSDK hardware interfaces (hifs). + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +mrvl_init_hifs(void) +{ + struct pp2_hif_params params; + char match[MRVL_MATCH_LEN]; + int i, ret; + + RTE_LCORE_FOREACH(i) { + ret = mrvl_reserve_bit(&used_hifs, MRVL_MUSDK_HIFS_MAX); + if (ret < 0) + return ret; + + snprintf(match, sizeof(match), "hif-%d", ret); + memset(¶ms, 0, sizeof(params)); + params.match = match; + params.out_size = MRVL_PP2_AGGR_TXQD_MAX; + ret = pp2_hif_init(¶ms, &hifs[i]); + if (ret) { + RTE_LOG(ERR, PMD, "Failed to initialize hif %d\n", i); + return ret; + } + } + + return 0; +} + +/** + * Deinitialize per-lcore MUSDK hardware interfaces (hifs). + */ +static void +mrvl_deinit_hifs(void) +{ + int i; + + RTE_LCORE_FOREACH(i) { + if (hifs[i]) + pp2_hif_deinit(hifs[i]); + } +} + +static void mrvl_set_first_last_cores(int core_id) +{ + if (core_id < mrvl_lcore_first) + mrvl_lcore_first = core_id; + + if (core_id > mrvl_lcore_last) + mrvl_lcore_last = core_id; +} + +/** + * DPDK callback to register the virtual device. + * + * @param vdev + * Pointer to the virtual device. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +rte_pmd_mrvl_probe(struct rte_vdev_device *vdev) +{ + struct rte_kvargs *kvlist; + const char *ifnames[PP2_NUM_ETH_PPIO * PP2_NUM_PKT_PROC]; + int ret = -EINVAL; + uint32_t i, ifnum, cfgnum, core_id; + const char *params; + + params = rte_vdev_device_args(vdev); + if (!params) + return -EINVAL; + + kvlist = rte_kvargs_parse(params, valid_args); + if (!kvlist) + return -EINVAL; + + ifnum = rte_kvargs_count(kvlist, MRVL_IFACE_NAME_ARG); + if (ifnum > RTE_DIM(ifnames)) + goto out_free_kvlist; + + rte_kvargs_process(kvlist, MRVL_IFACE_NAME_ARG, + mrvl_get_ifnames, &ifnames); + + cfgnum = rte_kvargs_count(kvlist, MRVL_CFG_ARG); + if (cfgnum > 1) { + RTE_LOG(ERR, PMD, "Cannot handle more than one config file!\n"); + goto out_free_kvlist; + } else if (cfgnum == 1) { + rte_kvargs_process(kvlist, MRVL_CFG_ARG, + mrvl_get_qoscfg, &mrvl_qos_cfg); + } + + /* + * ret == -EEXIST is correct, it means DMA + * has been already initialized (by another PMD). + */ + ret = mv_sys_dma_mem_init(RTE_MRVL_MUSDK_DMA_MEMSIZE); + if ((ret < 0) && (ret != -EEXIST)) + goto out_free_kvlist; + + ret = mrvl_init_pp2(); + if (ret) { + RTE_LOG(ERR, PMD, "Failed to init PP!\n"); + goto out_deinit_dma; + } + + ret = mrvl_init_hifs(); + if (ret) + goto out_deinit_hifs; + + for (i = 0; i < ifnum; i++) { + RTE_LOG(INFO, PMD, "Creating %s\n", ifnames[i]); + ret = mrvl_eth_dev_create(vdev, ifnames[i]); + if (ret) + goto out_cleanup; + } + + rte_kvargs_free(kvlist); + + memset(mrvl_port_bpool_size, 0, sizeof(mrvl_port_bpool_size)); + + mrvl_lcore_first = RTE_MAX_LCORE; + mrvl_lcore_last = 0; + + RTE_LCORE_FOREACH(core_id) { + mrvl_set_first_last_cores(core_id); + } + + return 0; +out_cleanup: + for (; i > 0; i--) + mrvl_eth_dev_destroy(ifnames[i]); +out_deinit_hifs: + mrvl_deinit_hifs(); + mrvl_deinit_pp2(); +out_deinit_dma: + mv_sys_dma_mem_destroy(); +out_free_kvlist: + rte_kvargs_free(kvlist); + + return ret; +} + +/** + * DPDK callback to remove virtual device. + * + * @param vdev + * Pointer to the removed virtual device. + * + * @return + * 0 on success, negative error value otherwise. + */ +static int +rte_pmd_mrvl_remove(struct rte_vdev_device *vdev) +{ + int i; + const char *name; + + name = rte_vdev_device_name(vdev); + if (!name) + return -EINVAL; + + RTE_LOG(INFO, PMD, "Removing %s\n", name); + + for (i = 0; i < rte_eth_dev_count(); i++) { + char ifname[RTE_ETH_NAME_MAX_LEN]; + + rte_eth_dev_get_name_by_port(i, ifname); + mrvl_eth_dev_destroy(ifname); + } + + mrvl_deinit_hifs(); + mrvl_deinit_pp2(); + mv_sys_dma_mem_destroy(); + + return 0; +} + +static struct rte_vdev_driver pmd_mrvl_drv = { + .probe = rte_pmd_mrvl_probe, + .remove = rte_pmd_mrvl_remove, +}; + +RTE_PMD_REGISTER_VDEV(net_mrvl, pmd_mrvl_drv); +RTE_PMD_REGISTER_ALIAS(net_mrvl, eth_mrvl); diff --git a/drivers/net/mrvl/mrvl_ethdev.h b/drivers/net/mrvl/mrvl_ethdev.h new file mode 100644 index 0000000..9fc9624 --- /dev/null +++ b/drivers/net/mrvl/mrvl_ethdev.h @@ -0,0 +1,115 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2017 Semihalf. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Semihalf nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _MRVL_ETHDEV_H_ +#define _MRVL_ETHDEV_H_ + +#include +#include +#include + + +/** Maximum number of rx queues per port */ +#define MRVL_PP2_RXQ_MAX 32 + +/** Maximum number of tx queues per port */ +#define MRVL_PP2_TXQ_MAX 8 + +/** Minimum number of descriptors in tx queue */ +#define MRVL_PP2_TXD_MIN 16 + +/** Maximum number of descriptors in tx queue */ +#define MRVL_PP2_TXD_MAX 2048 + +/** Tx queue descriptors alignment */ +#define MRVL_PP2_TXD_ALIGN 16 + +/** Minimum number of descriptors in rx queue */ +#define MRVL_PP2_RXD_MIN 16 + +/** Maximum number of descriptors in rx queue */ +#define MRVL_PP2_RXD_MAX 2048 + +/** Rx queue descriptors alignment */ +#define MRVL_PP2_RXD_ALIGN 16 + +/** Maximum number of descriptors in tx aggregated queue */ +#define MRVL_PP2_AGGR_TXQD_MAX 2048 + +/** Maximum number of Traffic Classes. */ +#define MRVL_PP2_TC_MAX 8 + +/** Packet offset inside RX buffer. */ +#define MRVL_PKT_OFFS 64 + +/** Maximum number of descriptors in shadow queue. Must be power of 2 */ +#define MRVL_PP2_TX_SHADOWQ_SIZE MRVL_PP2_TXD_MAX + +/** Shadow queue size mask (since shadow queue size is power of 2) */ +#define MRVL_PP2_TX_SHADOWQ_MASK (MRVL_PP2_TX_SHADOWQ_SIZE - 1) + +/** Minimum number of sent buffers to release from shadow queue to BM */ +#define MRVL_PP2_BUF_RELEASE_BURST_SIZE 64 + +struct mrvl_priv { + /* Hot fields, used in fast path. */ + struct pp2_bpool *bpool; /**< BPool pointer */ + struct pp2_ppio *ppio; /**< Port handler pointer */ + rte_spinlock_t lock; /**< Spinlock for checking bpool status */ + uint16_t bpool_max_size; /**< BPool maximum size */ + uint16_t bpool_min_size; /**< BPool minimum size */ + uint16_t bpool_init_size; /**< Configured BPool size */ + + /** Mapping for DPDK rx queue->(TC, MRVL relative inq) */ + struct { + uint8_t tc; /**< Traffic Class */ + uint8_t inq; /**< Relative in-queue number */ + } rxq_map[MRVL_PP2_RXQ_MAX] __rte_cache_aligned; + + /* Configuration data, used sporadically. */ + uint8_t pp_id; + uint8_t ppio_id; + uint8_t bpool_bit; + uint8_t rss_hf_tcp; + uint8_t uc_mc_flushed; + uint8_t vlan_flushed; + + struct pp2_ppio_params ppio_params; + struct pp2_cls_qos_tbl_params qos_tbl_params; + struct pp2_cls_tbl *qos_tbl; + uint16_t nb_rx_queues; +}; + +/** Number of ports configured. */ +extern int mrvl_ports_nb; + +#endif /* _MRVL_ETHDEV_H_ */ diff --git a/drivers/net/mrvl/mrvl_qos.c b/drivers/net/mrvl/mrvl_qos.c new file mode 100644 index 0000000..049ef97 --- /dev/null +++ b/drivers/net/mrvl/mrvl_qos.c @@ -0,0 +1,627 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2017 Semihalf. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Semihalf nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +/* Unluckily, container_of is defined by both DPDK and MUSDK, + * we'll declare only one version. + * + * Note that it is not used in this PMD anyway. + */ +#ifdef container_of +#undef container_of +#endif + +#include "mrvl_qos.h" + +/* Parsing tokens. Defined conveniently, so that any correction is easy. */ +#define MRVL_TOK_DEFAULT "default" +#define MRVL_TOK_DEFAULT_TC "default_tc" +#define MRVL_TOK_DSCP "dscp" +#define MRVL_TOK_MAPPING_PRIORITY "mapping_priority" +#define MRVL_TOK_IP "ip" +#define MRVL_TOK_IP_VLAN "ip/vlan" +#define MRVL_TOK_PCP "pcp" +#define MRVL_TOK_PORT "port" +#define MRVL_TOK_RXQ "rxq" +#define MRVL_TOK_SP "SP" +#define MRVL_TOK_TC "tc" +#define MRVL_TOK_TXQ "txq" +#define MRVL_TOK_VLAN "vlan" +#define MRVL_TOK_VLAN_IP "vlan/ip" +#define MRVL_TOK_WEIGHT "weight" + +/** Number of tokens in range a-b = 2. */ +#define MAX_RNG_TOKENS 2 + +/** Maximum possible value of PCP. */ +#define MAX_PCP 7 + +/** Maximum possible value of DSCP. */ +#define MAX_DSCP 63 + +/** Global QoS configuration. */ +struct mrvl_qos_cfg *mrvl_qos_cfg; + +/** + * Convert string to uint32_t with extra checks for result correctness. + * + * @param string String to convert. + * @param val Conversion result. + * @returns 0 in case of success, negative value otherwise. + */ +static int +get_val_securely(const char *string, uint32_t *val) +{ + char *endptr; + size_t len = strlen(string); + + if (len == 0) + return -1; + + *val = strtoul(string, &endptr, 0); + if ((errno != 0) || (RTE_PTR_DIFF(endptr, string) != len)) + return -2; + + return 0; +} + +/** + * Read out-queue configuration from file. + * + * @param file Path to the configuration file. + * @param port Port number. + * @param outq Out queue number. + * @param cfg Pointer to the Marvell QoS configuration structure. + * @returns 0 in case of success, negative value otherwise. + */ +static int +get_outq_cfg(struct rte_cfgfile *file, int port, int outq, + struct mrvl_qos_cfg *cfg) +{ + char sec_name[32]; + const char *entry; + uint32_t val; + + snprintf(sec_name, sizeof(sec_name), "%s %d %s %d", + MRVL_TOK_PORT, port, MRVL_TOK_TXQ, outq); + + /* Skip non-existing */ + if (rte_cfgfile_num_sections(file, sec_name, strlen(sec_name)) <= 0) + return 0; + + entry = rte_cfgfile_get_entry(file, sec_name, + MRVL_TOK_WEIGHT); + if (entry) { + if (get_val_securely(entry, &val) < 0) + return -1; + cfg->port[port].outq[outq].weight = (uint8_t)val; + } + + return 0; +} + +/** + * Gets multiple-entry values and places them in table. + * + * Entry can be anything, e.g. "1 2-3 5 6 7-9". This needs to be converted to + * table entries, respectively: {1, 2, 3, 5, 6, 7, 8, 9}. + * As all result table's elements are always 1-byte long, we + * won't overcomplicate the function, but we'll keep API generic, + * check if someone hasn't changed element size and make it simple + * to extend to other sizes. + * + * This function is purely utilitary, it does not print any error, only returns + * different error numbers. + * + * @param entry[in] Values string to parse. + * @param tab[out] Results table. + * @param elem_sz[in] Element size (in bytes). + * @param max_elems[in] Number of results table elements available. + * @param max val[in] Maximum value allowed. + * @returns Number of correctly parsed elements in case of success. + * @retval -1 Wrong element size. + * @retval -2 More tokens than result table allows. + * @retval -3 Wrong range syntax. + * @retval -4 Wrong range values. + * @retval -5 Maximum value exceeded. + */ +static int +get_entry_values(const char *entry, uint8_t *tab, + size_t elem_sz, uint8_t max_elems, uint8_t max_val) +{ + /* There should not be more tokens than max elements. + * Add 1 for error trap. + */ + char *tokens[max_elems + 1]; + + /* Begin, End + error trap = 3. */ + char *rng_tokens[MAX_RNG_TOKENS + 1]; + long beg, end; + uint32_t token_val; + int nb_tokens, nb_rng_tokens; + int i; + int values = 0; + char val; + char entry_cpy[CFG_VALUE_LEN]; + + if (elem_sz != 1) + return -1; + + /* Copy the entry to safely use rte_strsplit(). */ + snprintf(entry_cpy, RTE_DIM(entry_cpy), "%s", entry); + + /* + * If there are more tokens than array size, rte_strsplit will + * not return error, just array size. + */ + nb_tokens = rte_strsplit(entry_cpy, strlen(entry_cpy), + tokens, max_elems + 1, ' '); + + /* Quick check, will be refined later. */ + if (nb_tokens > max_elems) + return -2; + + for (i = 0; i < nb_tokens; ++i) { + if (strchr(tokens[i], '-') != NULL) { + /* + * Split to begin and end tokens. + * We want to catch error cases too, thus we leave + * option for number of tokens to be more than 2. + */ + nb_rng_tokens = rte_strsplit(tokens[i], + strlen(tokens[i]), rng_tokens, + RTE_DIM(rng_tokens), '-'); + if (nb_rng_tokens != 2) + return -3; + + /* Range and sanity checks. */ + if (get_val_securely(rng_tokens[0], &token_val) < 0) + return -4; + beg = (char)token_val; + if (get_val_securely(rng_tokens[1], &token_val) < 0) + return -4; + end = (char)token_val; + if ((beg < 0) || (beg > UCHAR_MAX) || + (end < 0) || (end > UCHAR_MAX) || (end < beg)) + return -4; + + for (val = beg; val <= end; ++val) { + if (val > max_val) + return -5; + + *tab = val; + tab = RTE_PTR_ADD(tab, elem_sz); + ++values; + if (values >= max_elems) + return -2; + } + } else { + /* Single values. */ + if (get_val_securely(tokens[i], &token_val) < 0) + return -5; + val = (char)token_val; + if (val > max_val) + return -5; + + *tab = val; + tab = RTE_PTR_ADD(tab, elem_sz); + ++values; + if (values >= max_elems) + return -2; + } + } + + return values; +} + +/** + * Parse Traffic Class'es mapping configuration. + * + * @param file Config file handle. + * @param port Which port to look for. + * @param tc Which Traffic Class to look for. + * @param cfg[out] Parsing results. + * @returns 0 in case of success, negative value otherwise. + */ +static int +parse_tc_cfg(struct rte_cfgfile *file, int port, int tc, + struct mrvl_qos_cfg *cfg) +{ + char sec_name[32]; + const char *entry; + int n; + + snprintf(sec_name, sizeof(sec_name), "%s %d %s %d", + MRVL_TOK_PORT, port, MRVL_TOK_TC, tc); + + /* Skip non-existing */ + if (rte_cfgfile_num_sections(file, sec_name, strlen(sec_name)) <= 0) + return 0; + + entry = rte_cfgfile_get_entry(file, sec_name, MRVL_TOK_RXQ); + if (entry) { + n = get_entry_values(entry, + cfg->port[port].tc[tc].inq, + sizeof(cfg->port[port].tc[tc].inq[0]), + RTE_DIM(cfg->port[port].tc[tc].inq), + MRVL_PP2_RXQ_MAX); + if (n < 0) { + RTE_LOG(ERR, PMD, "Error %d while parsing: %s\n", + n, entry); + return n; + } + cfg->port[port].tc[tc].inqs = n; + } + + entry = rte_cfgfile_get_entry(file, sec_name, MRVL_TOK_PCP); + if (entry) { + n = get_entry_values(entry, + cfg->port[port].tc[tc].pcp, + sizeof(cfg->port[port].tc[tc].pcp[0]), + RTE_DIM(cfg->port[port].tc[tc].pcp), + MAX_PCP); + if (n < 0) { + RTE_LOG(ERR, PMD, "Error %d while parsing: %s\n", + n, entry); + return n; + } + cfg->port[port].tc[tc].pcps = n; + } + + entry = rte_cfgfile_get_entry(file, sec_name, MRVL_TOK_DSCP); + if (entry) { + n = get_entry_values(entry, + cfg->port[port].tc[tc].dscp, + sizeof(cfg->port[port].tc[tc].dscp[0]), + RTE_DIM(cfg->port[port].tc[tc].dscp), + MAX_DSCP); + if (n < 0) { + RTE_LOG(ERR, PMD, "Error %d while parsing: %s\n", + n, entry); + return n; + } + cfg->port[port].tc[tc].dscps = n; + } + return 0; +} + +/** + * Parse QoS configuration - rte_kvargs_process handler. + * + * Opens configuration file and parses its content. + * + * @param key Unused. + * @param path Path to config file. + * @param extra_args Pointer to configuration structure. + * @returns 0 in case of success, exits otherwise. + */ +int +mrvl_get_qoscfg(const char *key __rte_unused, const char *path, + void *extra_args) +{ + struct mrvl_qos_cfg **cfg = extra_args; + struct rte_cfgfile *file = rte_cfgfile_load(path, 0); + uint32_t val; + int n, i, ret; + const char *entry; + char sec_name[32]; + + if (file == NULL) + rte_exit(EXIT_FAILURE, "Cannot load configuration %s\n", path); + + /* Create configuration. This is never accessed on the fast path, + * so we can ignore socket. + */ + *cfg = rte_zmalloc("mrvl_qos_cfg", sizeof(struct mrvl_qos_cfg), 0); + if (*cfg == NULL) + rte_exit(EXIT_FAILURE, "Cannot allocate configuration %s\n", + path); + + n = rte_cfgfile_num_sections(file, MRVL_TOK_PORT, + sizeof(MRVL_TOK_PORT) - 1); + + if (n == 0) { + /* This is weird, but not bad. */ + RTE_LOG(WARNING, PMD, "Empty configuration file?\n"); + return 0; + } + + /* Use the number of ports given as vdev parameters. */ + for (n = 0; n < mrvl_ports_nb; ++n) { + snprintf(sec_name, sizeof(sec_name), "%s %d %s", + MRVL_TOK_PORT, n, MRVL_TOK_DEFAULT); + + /* Skip ports non-existing in configuration. */ + if (rte_cfgfile_num_sections(file, sec_name, + strlen(sec_name)) <= 0) { + (*cfg)->port[n].use_global_defaults = 1; + (*cfg)->port[n].mapping_priority = + PP2_CLS_QOS_TBL_VLAN_IP_PRI; + continue; + } + + entry = rte_cfgfile_get_entry(file, sec_name, + MRVL_TOK_DEFAULT_TC); + if (entry) { + if ((get_val_securely(entry, &val) < 0) || + (val > USHRT_MAX)) + return -1; + (*cfg)->port[n].default_tc = (uint8_t)val; + } else { + RTE_LOG(ERR, PMD, + "Default Traffic Class required in custom configuration!\n"); + return -1; + } + + entry = rte_cfgfile_get_entry(file, sec_name, + MRVL_TOK_MAPPING_PRIORITY); + if (entry) { + if (!strncmp(entry, MRVL_TOK_VLAN_IP, + sizeof(MRVL_TOK_VLAN_IP))) + (*cfg)->port[n].mapping_priority = + PP2_CLS_QOS_TBL_VLAN_IP_PRI; + else if (!strncmp(entry, MRVL_TOK_IP_VLAN, + sizeof(MRVL_TOK_IP_VLAN))) + (*cfg)->port[n].mapping_priority = + PP2_CLS_QOS_TBL_IP_VLAN_PRI; + else if (!strncmp(entry, MRVL_TOK_IP, + sizeof(MRVL_TOK_IP))) + (*cfg)->port[n].mapping_priority = + PP2_CLS_QOS_TBL_IP_PRI; + else if (!strncmp(entry, MRVL_TOK_VLAN, + sizeof(MRVL_TOK_VLAN))) + (*cfg)->port[n].mapping_priority = + PP2_CLS_QOS_TBL_VLAN_PRI; + else + rte_exit(EXIT_FAILURE, + "Error in parsing %s value (%s)!\n", + MRVL_TOK_MAPPING_PRIORITY, entry); + } else { + (*cfg)->port[n].mapping_priority = + PP2_CLS_QOS_TBL_VLAN_IP_PRI; + } + + for (i = 0; i < MRVL_PP2_RXQ_MAX; ++i) { + ret = get_outq_cfg(file, n, i, *cfg); + if (ret < 0) + rte_exit(EXIT_FAILURE, + "Error %d parsing port %d outq %d!\n", + ret, n, i); + } + + for (i = 0; i < MRVL_PP2_TC_MAX; ++i) { + ret = parse_tc_cfg(file, n, i, *cfg); + if (ret < 0) + rte_exit(EXIT_FAILURE, + "Error %d parsing port %d tc %d!\n", + ret, n, i); + } + } + + return 0; +} + +/** + * Setup Traffic Class. + * + * Fill in TC parameters in single MUSDK TC config entry. + * @param param TC parameters entry. + * @param inqs Number of MUSDK in-queues in this TC. + * @param bpool Bpool for this TC. + * @returns 0 in case of success, exits otherwise. + */ +static int +setup_tc(struct pp2_ppio_tc_params *param, uint8_t inqs, + struct pp2_bpool *bpool) +{ + struct pp2_ppio_inq_params *inq_params; + + param->pkt_offset = MRVL_PKT_OFFS; + param->pools[0] = bpool; + + inq_params = rte_zmalloc_socket("inq_params", + inqs * sizeof(*inq_params), + 0, rte_socket_id()); + if (!inq_params) + return -ENOMEM; + + param->num_in_qs = inqs; + + /* Release old config if necessary. */ + if (param->inqs_params) + rte_free(param->inqs_params); + + param->inqs_params = inq_params; + + return 0; +} + +/** + * Configure RX Queues in a given port. + * + * Sets up RX queues, their Traffic Classes and DPDK rxq->(TC,inq) mapping. + * + * @param priv Port's private data + * @param portid DPDK port ID + * @param max_queues Maximum number of queues to configure. + * @returns 0 in case of success, negative value otherwise. + */ +int +mrvl_configure_rxqs(struct mrvl_priv *priv, uint8_t portid, + uint16_t max_queues) +{ + size_t i, tc; + + if ((mrvl_qos_cfg == NULL) || + (mrvl_qos_cfg->port[portid].use_global_defaults)) { + /* No port configuration, use default: 1 TC, no QoS. */ + priv->ppio_params.inqs_params.num_tcs = 1; + setup_tc(&priv->ppio_params.inqs_params.tcs_params[0], + max_queues, priv->bpool); + + /* Direct mapping of queues i.e. 0->0, 1->1 etc. */ + for (i = 0; i < max_queues; ++i) { + priv->rxq_map[i].tc = 0; + priv->rxq_map[i].inq = i; + } + return 0; + } + + /* We need only a subset of configuration. */ + struct port_cfg *port_cfg = &mrvl_qos_cfg->port[portid]; + + priv->qos_tbl_params.type = port_cfg->mapping_priority; + + /* + * We need to reverse mapping, from tc->pcp (better from usability + * point of view) to pcp->tc (configurable in MUSDK). + * First, set all map elements to "default". + */ + for (i = 0; i < RTE_DIM(priv->qos_tbl_params.pcp_cos_map); ++i) + priv->qos_tbl_params.pcp_cos_map[i].tc = port_cfg->default_tc; + + /* Then, fill in all known values. */ + for (tc = 0; tc < RTE_DIM(port_cfg->tc); ++tc) { + if (port_cfg->tc[tc].pcps > RTE_DIM(port_cfg->tc[0].pcp)) { + /* Better safe than sorry. */ + RTE_LOG(ERR, PMD, + "Too many PCPs configured in TC %zu!\n", tc); + return -1; + } + for (i = 0; i < port_cfg->tc[tc].pcps; ++i) { + priv->qos_tbl_params.pcp_cos_map[ + port_cfg->tc[tc].pcp[i]].tc = tc; + } + } + + /* + * The same logic goes with DSCP. + * First, set all map elements to "default". + */ + for (i = 0; i < RTE_DIM(priv->qos_tbl_params.dscp_cos_map); ++i) + priv->qos_tbl_params.dscp_cos_map[i].tc = + port_cfg->default_tc; + + /* Fill in all known values. */ + for (tc = 0; tc < RTE_DIM(port_cfg->tc); ++tc) { + if (port_cfg->tc[tc].dscps > RTE_DIM(port_cfg->tc[0].dscp)) { + /* Better safe than sorry. */ + RTE_LOG(ERR, PMD, + "Too many DSCPs configured in TC %zu!\n", tc); + return -1; + } + for (i = 0; i < port_cfg->tc[tc].dscps; ++i) { + priv->qos_tbl_params.dscp_cos_map[ + port_cfg->tc[tc].dscp[i]].tc = tc; + } + } + + /* + * Surprisingly, similar logic goes with queue mapping. + * We need only to store qid->tc mapping, + * to know TC when queue is read. + */ + for (i = 0; i < RTE_DIM(priv->rxq_map); ++i) + priv->rxq_map[i].tc = MRVL_UNKNOWN_TC; + + /* Set up DPDKq->(TC,inq) mapping. */ + for (tc = 0; tc < RTE_DIM(port_cfg->tc); ++tc) { + if (port_cfg->tc[tc].inqs > RTE_DIM(port_cfg->tc[0].inq)) { + /* Overflow. */ + RTE_LOG(ERR, PMD, + "Too many RX queues configured per TC %zu!\n", + tc); + return -1; + } + for (i = 0; i < port_cfg->tc[tc].inqs; ++i) { + uint8_t idx = port_cfg->tc[tc].inq[i]; + priv->rxq_map[idx].tc = tc; + priv->rxq_map[idx].inq = i; + } + } + + /* + * Set up TC configuration. TCs need to be sequenced: 0, 1, 2 + * with no gaps. Empty TC means end of processing. + */ + for (i = 0; i < MRVL_PP2_TC_MAX; ++i) { + if (port_cfg->tc[i].inqs == 0) + break; + setup_tc(&priv->ppio_params.inqs_params.tcs_params[i], + port_cfg->tc[i].inqs, + priv->bpool); + } + + priv->ppio_params.inqs_params.num_tcs = i; + + return 0; +} + +/** + * Start QoS mapping. + * + * Finalize QoS table configuration and initialize it in SDK. It can be done + * only after port is started, so we have a valid ppio reference. + * + * @param priv Port's private (configuration) data. + * @returns 0 in case of success, exits otherwise. + */ +int +mrvl_start_qos_mapping(struct mrvl_priv *priv) +{ + size_t i; + + if (priv->ppio == NULL) { + RTE_LOG(ERR, PMD, "ppio must not be NULL here!\n"); + return -1; + } + + for (i = 0; i < RTE_DIM(priv->qos_tbl_params.pcp_cos_map); ++i) + priv->qos_tbl_params.pcp_cos_map[i].ppio = priv->ppio; + + for (i = 0; i < RTE_DIM(priv->qos_tbl_params.dscp_cos_map); ++i) + priv->qos_tbl_params.dscp_cos_map[i].ppio = priv->ppio; + + /* Initialize Classifier QoS table. */ + + return pp2_cls_qos_tbl_init(&priv->qos_tbl_params, &priv->qos_tbl); +} diff --git a/drivers/net/mrvl/mrvl_qos.h b/drivers/net/mrvl/mrvl_qos.h new file mode 100644 index 0000000..8542db6 --- /dev/null +++ b/drivers/net/mrvl/mrvl_qos.h @@ -0,0 +1,112 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2017 Semihalf. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Semihalf nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _MRVL_QOS_H_ +#define _MRVL_QOS_H_ + +#include +#include + +#include "mrvl_ethdev.h" + +/** Code Points per Traffic Class. Equals max(DSCP, PCP). */ +#define MRVL_CP_PER_TC (64) + +/** Value used as "unknown". */ +#define MRVL_UNKNOWN_TC (0xFF) + +/* QoS config. */ +struct mrvl_qos_cfg { + struct port_cfg { + struct { + uint8_t inq[MRVL_PP2_RXQ_MAX]; + uint8_t dscp[MRVL_CP_PER_TC]; + uint8_t pcp[MRVL_CP_PER_TC]; + uint8_t inqs; + uint8_t dscps; + uint8_t pcps; + } tc[MRVL_PP2_TC_MAX]; + struct { + uint8_t weight; + } outq[MRVL_PP2_RXQ_MAX]; + enum pp2_cls_qos_tbl_type mapping_priority; + uint16_t inqs; + uint16_t outqs; + uint8_t default_tc; + uint8_t use_global_defaults; + } port[RTE_MAX_ETHPORTS]; +}; + +/** Global QoS configuration. */ +extern struct mrvl_qos_cfg *mrvl_qos_cfg; + +/** + * Parse QoS configuration - rte_kvargs_process handler. + * + * Opens configuration file and parses its content. + * + * @param key Unused. + * @param path Path to config file. + * @param extra_args Pointer to configuration structure. + * @returns 0 in case of success, exits otherwise. + */ +int +mrvl_get_qoscfg(const char *key __rte_unused, const char *path, + void *extra_args); + +/** + * Configure RX Queues in a given port. + * + * Sets up RX queues, their Traffic Classes and DPDK rxq->(TC,inq) mapping. + * + * @param priv Port's private data + * @param portid DPDK port ID + * @param max_queues Maximum number of queues to configure. + * @returns 0 in case of success, negative value otherwise. + */ +int +mrvl_configure_rxqs(struct mrvl_priv *priv, uint8_t portid, + uint16_t max_queues); + +/** + * Start QoS mapping. + * + * Finalize QoS table configuration and initialize it in SDK. It can be done + * only after port is started, so we have a valid ppio reference. + * + * @param priv Port's private (configuration) data. + * @returns 0 in case of success, exits otherwise. + */ +int +mrvl_start_qos_mapping(struct mrvl_priv *priv); + +#endif /* _MRVL_QOS_H_ */ diff --git a/drivers/net/mrvl/rte_pmd_mrvl_version.map b/drivers/net/mrvl/rte_pmd_mrvl_version.map new file mode 100644 index 0000000..a753031 --- /dev/null +++ b/drivers/net/mrvl/rte_pmd_mrvl_version.map @@ -0,0 +1,3 @@ +DPDK_17.11 { + local: *; +}; diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 94568a8..8df74bb 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -130,6 +130,7 @@ endif _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += -lrte_pmd_lio _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -lrte_pmd_mlx4 -libverbs _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -lrte_pmd_mlx5 -libverbs +_LDLIBS-$(CONFIG_RTE_LIBRTE_MRVL_PMD) += -lrte_pmd_mrvl -L$(LIBMUSDK_PATH)/lib -lmusdk _LDLIBS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += -lrte_pmd_nfp _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap -lpcap -- 2.7.4