From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f52.google.com (mail-wm0-f52.google.com [74.125.82.52]) by dpdk.org (Postfix) with ESMTP id 879103DC for ; Wed, 8 Mar 2017 16:16:10 +0100 (CET) Received: by mail-wm0-f52.google.com with SMTP id v203so16620703wmg.0 for ; Wed, 08 Mar 2017 07:16:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=from:to:subject:date:message-id:in-reply-to:references:in-reply-to :references; bh=7yPqLm/d5+vk0npsDKitV5S/YYOVL8wHX5jWvvcUPnY=; b=ISdLb0EZv/YArF0t8w7/MmyNP1EHOS5IhTMRoo30W1NbJq30LIK6txokCPBrpLIxsI Gs3Hb5oKIw4pjua0evMeoEYKo0HQIb8/TGu5FmTVsz+xdd5JO1XpvjkFtXFXo4XQaTge Y4TIjM/8o04iTr4GHQPkCHkBYgX2Vq2wFP0+5Gf4tY9EhDznkgUfDSXaSNWYH+npTOHS 2qWifyDjWQH9I0AojUxS0/J7OtpequZOKlGxgWWxi1Ux5E3Ov+OOsCrIKjg0yynSJHH+ 0JEaN4RKvzBYUDo6ZQ0KKuDbAUEfLNnJUamZjkAi4Ds3IIxCn+lMMFJt9NmM2ibUVRbM BmDA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:in-reply-to:references; bh=7yPqLm/d5+vk0npsDKitV5S/YYOVL8wHX5jWvvcUPnY=; b=GeUg9Qmd3LKDuCX+ZeC91leOtJLU62cBSzqeJACX0pUswOq7J5qu9MZWLWWXRqa0gZ sey+qMHmyS+m1pYDFxgGHh5oQOIbp+fTVcMuY20U3JZQelzWRj+58Z3okP1yaoNpIPrD aEc+lIfrobETTqgIkDzZt2bzMP8XKXWIsujhT18Tm7nYL+RYjHPOQaglMFhckGVXuY8W Hfe3qJ4OG1j8AGxOWOYFBeVQGU7nRB0G1FYYZMl6jYa8a2sTP5mi6FPyNttuQlsWM3XN 1TjUj4upXIRzW8LnL7+rc7yKA/PL/aay+jAKv8MvQaPwzTVrKBJyXh1undc9PuFG63JR 71Og== X-Gm-Message-State: AMke39l19UV1fjI2T0A4XuSVE0Tr34F75bl3OcEXewyBNtanWrHaQ160oxxVanwhAXTE9ArT X-Received: by 10.28.169.87 with SMTP id s84mr22686900wme.77.1488986168386; Wed, 08 Mar 2017 07:16:08 -0800 (PST) Received: from bidouze.dev.6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id t103sm4553592wrc.43.2017.03.08.07.16.07 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 08 Mar 2017 07:16:07 -0800 (PST) From: Gaetan Rivet To: dev@dpdk.org Date: Wed, 8 Mar 2017 16:15:39 +0100 Message-Id: <1d50a6f160bd12c3f24fc9b6a260edb6b03ac394.1488985489.git.gaetan.rivet@6wind.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: References: In-Reply-To: References: Subject: [dpdk-dev] [PATCH v2 06/13] net/failsafe: add fail-safe PMD X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Mar 2017 15:16:10 -0000 Introduce the fail-safe poll mode driver initialization and enable its build infrastructure. This PMD allows for applications to benefits from true hot-plugging support without having to implement it. It intercepts and manages Ethernet device removal events issued by slave PMDs and re-initializes them transparently when brought back. It also allows defining a contingency to the removal of a device, by designating a fail-over device that will take on transmitting operations if the preferred device is removed. Applications only see a fail-safe instance, without caring for underlying activity ensuring their continued operations. Signed-off-by: Gaetan Rivet Acked-by: Olga Shern --- MAINTAINERS | 5 + config/common_base | 5 + doc/guides/nics/fail_safe.rst | 133 +++++++ doc/guides/nics/features/failsafe.ini | 24 ++ doc/guides/nics/index.rst | 1 + drivers/net/Makefile | 1 + drivers/net/failsafe/Makefile | 72 ++++ drivers/net/failsafe/failsafe.c | 229 +++++++++++ drivers/net/failsafe/failsafe_args.c | 347 ++++++++++++++++ drivers/net/failsafe/failsafe_eal.c | 318 +++++++++++++++ drivers/net/failsafe/failsafe_ops.c | 677 ++++++++++++++++++++++++++++++++ drivers/net/failsafe/failsafe_private.h | 232 +++++++++++ drivers/net/failsafe/failsafe_rxtx.c | 107 +++++ mk/rte.app.mk | 1 + 14 files changed, 2152 insertions(+) create mode 100644 doc/guides/nics/fail_safe.rst create mode 100644 doc/guides/nics/features/failsafe.ini create mode 100644 drivers/net/failsafe/Makefile create mode 100644 drivers/net/failsafe/failsafe.c create mode 100644 drivers/net/failsafe/failsafe_args.c create mode 100644 drivers/net/failsafe/failsafe_eal.c create mode 100644 drivers/net/failsafe/failsafe_ops.c create mode 100644 drivers/net/failsafe/failsafe_private.h create mode 100644 drivers/net/failsafe/failsafe_rxtx.c diff --git a/MAINTAINERS b/MAINTAINERS index cc3bf98..ab9ed0c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -315,6 +315,11 @@ M: Matej Vido F: drivers/net/szedata2/ F: doc/guides/nics/szedata2.rst +Fail-safe PMD +M: Gaetan Rivet +F: drivers/net/failsafe/ +F: doc/guides/nics/fail_safe.rst + Intel e1000 M: Wenzhuo Lu F: drivers/net/e1000/ diff --git a/config/common_base b/config/common_base index 71a4fcb..ae64a5b 100644 --- a/config/common_base +++ b/config/common_base @@ -364,6 +364,11 @@ CONFIG_RTE_LIBRTE_PMD_XENVIRT=n CONFIG_RTE_LIBRTE_PMD_NULL=y # +# Compile fail-safe PMD +# +CONFIG_RTE_LIBRTE_PMD_FAILSAFE=y + +# # Do prefetch of packet data within PMD driver receive function # CONFIG_RTE_PMD_PACKET_PREFETCH=y diff --git a/doc/guides/nics/fail_safe.rst b/doc/guides/nics/fail_safe.rst new file mode 100644 index 0000000..056f85f --- /dev/null +++ b/doc/guides/nics/fail_safe.rst @@ -0,0 +1,133 @@ +.. BSD LICENSE + Copyright 2017 6WIND S.A. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of 6WIND S.A. nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Fail-safe poll mode driver library +================================== + +The Fail-safe poll mode driver library (**librte_pmd_failsafe**) is a virtual +device that allows using any device supporting hotplug (sudden device removal +and plugging on its bus), without modifying other components relying on such +device (application, other PMDs). + +Additionally to the Seamless Hotplug feature, the Fail-safe PMD offers the +ability to redirect operations to secondary devices when the primary has been +removed from the system. + +.. note:: + + The library is enabled by default. You can enable it or disable it manually + by setting the ``CONFIG_RTE_LIBRTE_PMD_FAILSAFE`` configuration option. + +Features +-------- + +The Fail-safe PMD only supports a limited set of features. If you plan to use a +device underneath the Fail-safe PMD with a specific feature, this feature must +be supported by the Fail-safe PMD to avoid throwing any error. + +Check the feature matrix for the complete set of supported features. + +Compilation options +------------------- + +These options can be modified in the ``$RTE_TARGET/build/.config`` file. + +- ``CONFIG_RTE_LIBRTE_PMD_FAILSAFE`` (default **y**) + + Toggle compiling librte_pmd_failsafe itself. + +- ``CONFIG_RTE_LIBRTE_PMD_FAILSAFE_DEBUG`` (default **n**) + + Toggle debugging code. + +Using the Fail-safe PMD from the EAL command line +------------------------------------------------- + +The Fail-safe PMD can be used like most other DPDK virtual devices, by passing a +``--vdev`` parameter to the EAL when starting the application. The device name +must start with the *net_failsafe* prefix, followed by numbers or letters. This +name must be unique for each device. Each fail-safe instance must have at least one +sub-device, up to ``RTE_MAX_ETHPORTS-1``. + +A sub-device can be any legal DPDK device, including possibly another fail-safe +instance. + +Fail-safe command line parameters +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- **dev()** parameter + + This parameter allows the user to define a sub-device. The ```` part of + this parameter must be a valid device definition. It could be the argument + provided to a ``-w`` PCI device specification or the argument that would be + given to a ``--vdev`` parameter (including a fail-safe). + Enclosing the device definition within parenthesis here allows using + additional sub-device parameters if need be. They will be passed on to the + sub-device. + +- **mac** parameter [MAC address] + + This parameter allows the user to set a default MAC address to the fail-safe + and all of its sub-devices. + If no default mac address is provided, the fail-safe PMD will read the MAC + address of the first of its sub-device to be successfully probed and use it as + its default MAC address, trying to set it to all of its other sub-devices. + If no sub-device was successfully probed at initialization, then a random MAC + address is generated, that will be subsequently applied to all sub-device once + they are probed. + +Usage example +~~~~~~~~~~~~~ + +This section shows some example of using **testpmd** with a fail-safe PMD. + +#. Request huge pages: + + .. code-block:: console + + echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages + +#. Start testpmd + + .. code-block:: console + + $RTE_TARGET/build/app/testpmd -c 0xff -n 4 --no-pci \ + --vdev='net_failsafe0,mac=de:ad:be:ef:01:02,dev(84:00.0),dev(net_ring0,nodeaction=r1:0:CREATE)' -- \ + -i + +Using the Fail-safe PMD from an application +------------------------------------------- + +This driver strives to be as seamless as possible to existing applications, in +order to propose the hotplug functionality in the easiest way possible. + +Care must be taken, however, to respect the **ether** API concerning device +access, and in particular, using the ``RTE_ETH_FOREACH_DEV`` macro to iterate +over ethernet devices, instead of directly accessing them or by writing one's +own device iterator. diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini new file mode 100644 index 0000000..3c52823 --- /dev/null +++ b/doc/guides/nics/features/failsafe.ini @@ -0,0 +1,24 @@ +; +; Supported features of the 'fail-safe' poll mode driver. +; +; Refer to default.ini for the full list of available PMD features. +; +[Features] +Link status = Y +Queue start/stop = Y +MTU update = Y +Jumbo frame = Y +Promiscuous mode = Y +Allmulticast mode = Y +Unicast MAC filter = Y +Multicast MAC filter = Y +VLAN filter = Y +Packet type parsing = Y +Basic stats = Y +Stats per queue = Y +ARMv7 = Y +ARMv8 = Y +Power8 = Y +x86-32 = Y +x86-64 = Y +Usage doc = Y diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst index 87f9334..0ba52e1 100644 --- a/doc/guides/nics/index.rst +++ b/doc/guides/nics/index.rst @@ -58,6 +58,7 @@ Network Interface Controller Drivers vhost vmxnet3 pcap_ring + fail_safe **Figures** diff --git a/drivers/net/Makefile b/drivers/net/Makefile index 40fc333..3c658b4 100644 --- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -38,6 +38,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe DIRS-$(CONFIG_RTE_LIBRTE_E1000_PMD) += e1000 DIRS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += ena DIRS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic +DIRS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe DIRS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile new file mode 100644 index 0000000..06199ad --- /dev/null +++ b/drivers/net/failsafe/Makefile @@ -0,0 +1,72 @@ +# BSD LICENSE +# +# Copyright 2017 6WIND S.A. +# Copyright 2017 Mellanox. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of 6WIND S.A. nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +include $(RTE_SDK)/mk/rte.vars.mk + +# Library name +LIB = librte_pmd_failsafe.a + +# Sources are stored in SRCS-y +SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe.c +SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_args.c +SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_eal.c +SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c +SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c + +# No exported include files + +# This lib depends upon: +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += lib/librte_eal +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += lib/librte_ether +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += lib/librte_kvargs +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += lib/librte_mbuf + +ifneq ($(DEBUG),) +CONFIG_RTE_LIBRTE_PMD_FAILSAFE_DEBUG := y +endif + +# Basic CFLAGS: +CFLAGS += -std=gnu99 -Wall -Wextra +CFLAGS += -I. +CFLAGS += -D_BSD_SOURCE +CFLAGS += -D_XOPEN_SOURCE=700 +CFLAGS += $(WERROR_FLAGS) +CFLAGS += -Wno-strict-prototypes +CFLAGS += -pedantic -DPEDANTIC + +ifeq ($(CONFIG_RTE_LIBRTE_PMD_FAILSAFE_DEBUG),y) +CFLAGS += -g -UNDEBUG +else +CFLAGS += -O3 +CFLAGS += -DNDEBUG +endif + +include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c new file mode 100644 index 0000000..cd60193 --- /dev/null +++ b/drivers/net/failsafe/failsafe.c @@ -0,0 +1,229 @@ +/*- + * BSD LICENSE + * + * Copyright 2017 6WIND S.A. + * Copyright 2017 Mellanox. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of 6WIND S.A. nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ +#include +#include +#include +#include +#include +#include + +#include "failsafe_private.h" + +const char pmd_failsafe_driver_name[] = FAILSAFE_DRIVER_NAME; +static const struct rte_eth_link eth_link = { + .link_speed = ETH_SPEED_NUM_10G, + .link_duplex = ETH_LINK_FULL_DUPLEX, + .link_status = ETH_LINK_UP, + .link_autoneg = ETH_LINK_SPEED_AUTONEG, +}; + +static int +sub_device_create(struct rte_eth_dev *dev, + const char *params) +{ + uint8_t nb_subs; + int ret; + + ret = failsafe_args_count_subdevice(dev, params); + if (ret) + return ret; + if (PRIV(dev)->subs_tail > FAILSAFE_MAX_ETHPORTS) { + ERROR("Cannot allocate more than %d ports", + FAILSAFE_MAX_ETHPORTS); + return -ENOSPC; + } + nb_subs = PRIV(dev)->subs_tail; + PRIV(dev)->subs = rte_zmalloc(NULL, + sizeof(struct sub_device) * nb_subs, + RTE_CACHE_LINE_SIZE); + if (PRIV(dev)->subs == NULL) { + ERROR("Could not allocate sub_devices"); + return -ENOMEM; + } + return 0; +} + +static void +sub_device_free(struct rte_eth_dev *dev) +{ + rte_free(PRIV(dev)->subs); +} + +static int +eth_dev_create(const char *name, + const unsigned socket_id, + const char *params) +{ + struct rte_eth_dev *dev; + struct ether_addr *mac; + struct fs_priv *priv; + int ret; + + dev = NULL; + priv = NULL; + if (name == NULL) + return -EINVAL; + INFO("Creating fail-safe device on NUMA socket %u", + socket_id); + dev = rte_eth_dev_allocate(name); + if (dev == NULL) { + ERROR("Unable to allocate rte_eth_dev"); + return -1; + } + priv = rte_zmalloc_socket(name, sizeof(*priv), 0, socket_id); + if (priv == NULL) { + ERROR("Unable to allocate private data"); + goto free_dev; + } + dev->data->dev_private = priv; + PRIV(dev)->dev = dev; + dev->dev_ops = &failsafe_ops; + TAILQ_INIT(&dev->link_intr_cbs); + dev->data->dev_flags = 0x0; + dev->driver = NULL; + dev->data->kdrv = RTE_KDRV_NONE; + dev->data->drv_name = pmd_failsafe_driver_name; + dev->data->numa_node = socket_id; + dev->data->mac_addrs = &PRIV(dev)->mac_addrs[0]; + dev->data->dev_link = eth_link; + PRIV(dev)->nb_mac_addr = 1; + dev->rx_pkt_burst = (eth_rx_burst_t)&failsafe_rx_burst; + dev->tx_pkt_burst = (eth_tx_burst_t)&failsafe_tx_burst; + if (params == NULL) { + ERROR("This PMD requires sub-devices, none provided"); + goto free_priv; + } + ret = sub_device_create(dev, params); + if (ret) { + ERROR("Could not allocate sub_devices"); + goto free_priv; + } + ret = failsafe_args_parse(dev, params); + if (ret) + goto free_subs; + ret = failsafe_eal_init(dev); + if (ret) + goto free_args; + mac = &dev->data->mac_addrs[0]; + if (!mac_from_arg) { + struct sub_device *sdev; + uint8_t i; + + /* + * Use the ether_addr from first probed + * device, either preferred or fallback. + */ + FOREACH_SUBDEV(sdev, i, dev) + if (sdev->state >= DEV_PROBED) { + ether_addr_copy(Ð(sdev)->data->mac_addrs[0], + mac); + break; + } + /* + * If no device has been probed and no ether_addr + * has been provided on the command line, use a random + * valid one. + */ + if (i == priv->subs_tail) { + eth_random_addr( + &mac->addr_bytes[0]); + ret = rte_eth_dev_default_mac_addr_set( + dev->data->port_id, mac); + if (ret) { + ERROR("Failed to set default MAC address"); + goto free_subs; + } + } + } + INFO("MAC address is %02x:%02x:%02x:%02x:%02x:%02x", + mac->addr_bytes[0], mac->addr_bytes[1], + mac->addr_bytes[2], mac->addr_bytes[3], + mac->addr_bytes[4], mac->addr_bytes[5]); + return 0; +free_args: + failsafe_args_free(dev); +free_subs: + sub_device_free(dev); +free_priv: + rte_free(priv); +free_dev: + rte_eth_dev_release_port(dev); + return -1; +} + +static int +rte_eth_free(const char *name) +{ + struct rte_eth_dev *dev; + int ret; + + dev = rte_eth_dev_allocated(name); + if (dev == NULL) + return -ENODEV; + ret = failsafe_eal_uninit(dev); + if (ret) + ERROR("Error while uninitializing sub-EAL"); + failsafe_args_free(dev); + sub_device_free(dev); + rte_free(PRIV(dev)); + rte_eth_dev_release_port(dev); + return ret; +} + +static int +rte_pmd_failsafe_probe(const char *name, const char *params) +{ + if (name == NULL) + return -EINVAL; + INFO("Initializing " FAILSAFE_DRIVER_NAME " for %s", + name); + return eth_dev_create(name, rte_socket_id(), params); +} + +static int +rte_pmd_failsafe_remove(const char *name) +{ + if (name == NULL) + return -EINVAL; + INFO("Uninitializing " FAILSAFE_DRIVER_NAME " for %s", name); + return rte_eth_free(name); +} + +static struct rte_vdev_driver failsafe_drv = { + .probe = rte_pmd_failsafe_probe, + .remove = rte_pmd_failsafe_remove, +}; + +RTE_PMD_REGISTER_VDEV(net_failsafe, failsafe_drv); +RTE_PMD_REGISTER_ALIAS(net_failsafe, eth_failsafe); +RTE_PMD_REGISTER_PARAM_STRING(net_failsafe, PMD_FAILSAFE_PARAM_STRING); diff --git a/drivers/net/failsafe/failsafe_args.c b/drivers/net/failsafe/failsafe_args.c new file mode 100644 index 0000000..faa48f4 --- /dev/null +++ b/drivers/net/failsafe/failsafe_args.c @@ -0,0 +1,347 @@ +/*- + * BSD LICENSE + * + * Copyright 2017 6WIND S.A. + * Copyright 2017 Mellanox. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of 6WIND S.A. nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ +#include +#include + +#include +#include +#include + +#include "failsafe_private.h" + +#define DEVARGS_MAXLEN 4096 + +/* Callback used when a new device is found in devargs */ +typedef int (parse_cb)(struct rte_eth_dev *dev, const char *params, + uint8_t head); + +int mac_from_arg; + +const char *pmd_failsafe_init_parameters[] = { + PMD_FAILSAFE_MAC_KVARG, + NULL, +}; + +/* + * input: text. + * output: 0: if text[0] != '(', + * 0: if there are no corresponding ')' + * n: distance to corresponding ')' otherwise + */ +static size_t +closing_paren(const char *text) +{ + int nb_open = 0; + size_t i = 0; + + while (text[i] != '\0') { + if (text[i] == '(') + nb_open++; + if (text[i] == ')') + nb_open--; + if (nb_open == 0) + return i; + i++; + } + return 0; +} + +static int +parse_device(struct sub_device *sdev, char *args) +{ + struct rte_devargs *d; + size_t b; + + b = 0; + d = &sdev->devargs; + DEBUG("%s", args); + while (args[b] != ',' && + args[b] != '\0') + b++; + if (args[b] != '\0') { + d->args = strdup(&args[b + 1]); + if (d->args == NULL) { + ERROR("Not enough memory to store sub-device args"); + return -ENOMEM; + } + args[b] = '\0'; + } else { + d->args = strdup(""); + } + if (eal_parse_pci_BDF(args, &d->pci.addr) == 0 || + eal_parse_pci_DomBDF(args, &d->pci.addr) == 0) + d->type = RTE_DEVTYPE_WHITELISTED_PCI; + else { + d->type = RTE_DEVTYPE_VIRTUAL; + snprintf(d->virt.drv_name, + sizeof(d->virt.drv_name), "%s", args); + } + sdev->state = DEV_PARSED; + return 0; +} + +static int +parse_device_param(struct rte_eth_dev *dev, const char *param, + uint8_t head) +{ + struct fs_priv *priv; + struct sub_device *sdev; + char *args = NULL; + size_t a, b; + int ret; + + priv = PRIV(dev); + a = 0; + b = 0; + ret = 0; + while (param[b] != '(' && + param[b] != '\0') + b++; + a = b; + b += closing_paren(¶m[b]); + if (a == b) { + ERROR("Dangling parenthesis"); + return -EINVAL; + } + a += 1; + args = strndup(¶m[a], b - a); + if (args == NULL) { + ERROR("Not enough memory for parameter parsing"); + return -ENOMEM; + } + sdev = &priv->subs[head]; + if (strncmp(param, "dev", 3) == 0) { + ret = parse_device(sdev, args); + if (ret) + goto free_args; + } else { + ERROR("Unrecognized device type: %.*s", (int)b, param); + return -EINVAL; + } +free_args: + free(args); + return ret; +} + +static int +parse_sub_devices(parse_cb *cb, + struct rte_eth_dev *dev, const char *params) +{ + size_t a, b; + uint8_t head; + int ret; + + a = 0; + head = 0; + ret = 0; + while (params[a] != '\0') { + b = a; + while (params[b] != '(' && + params[b] != ',' && + params[b] != '\0') + b++; + if (b == a) { + ERROR("Invalid parameter"); + return -EINVAL; + } + if (params[b] == ',') { + a = b + 1; + continue; + } + if (params[b] == '(') { + size_t start = b; + + b += closing_paren(¶ms[b]); + if (b == start) { + ERROR("Dangling parenthesis"); + return -EINVAL; + } + ret = (*cb)(dev, ¶ms[a], head); + if (ret) + return ret; + head += 1; + b += 1; + if (params[b] == '\0') + return 0; + } + a = b + 1; + } + return 0; +} + +static int +remove_sub_devices_definition(char params[DEVARGS_MAXLEN]) +{ + char buffer[DEVARGS_MAXLEN] = {0}; + size_t a, b; + int i; + + a = 0; + i = 0; + while (params[a] != '\0') { + b = a; + while (params[b] != '(' && + params[b] != ',' && + params[b] != '\0') + b++; + if (b == a) { + ERROR("Invalid parameter"); + return -EINVAL; + } + if (params[b] == ',' || params[b] == '\0') + i += snprintf(&buffer[i], b - a + 1, "%s", ¶ms[a]); + if (params[b] == '(') { + size_t start = b; + b += closing_paren(¶ms[b]); + if (b == start) + return -EINVAL; + b += 1; + if (params[b] == '\0') + goto out; + } + a = b + 1; + } +out: + snprintf(params, DEVARGS_MAXLEN, "%s", buffer); + return 0; +} + +static int +get_mac_addr_arg(const char *key __rte_unused, + const char *value, void *out) +{ + struct ether_addr *ea = out; + int ret; + + if ((value == NULL) || (out == NULL)) + return -EINVAL; + ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx", + &ea->addr_bytes[0], &ea->addr_bytes[1], + &ea->addr_bytes[2], &ea->addr_bytes[3], + &ea->addr_bytes[4], &ea->addr_bytes[5]); + return ret != ETHER_ADDR_LEN; +} + +int +failsafe_args_parse(struct rte_eth_dev *dev, const char *params) +{ + struct fs_priv *priv; + char mut_params[DEVARGS_MAXLEN] = ""; + struct rte_kvargs *kvlist = NULL; + unsigned arg_count; + size_t n; + int ret; + + if (dev == NULL || params == NULL) + return -EINVAL; + priv = PRIV(dev); + ret = 0; + priv->subs_tx = FAILSAFE_MAX_ETHPORTS; + /* default parameters */ + mac_from_arg = 0; + n = snprintf(mut_params, sizeof(mut_params), "%s", params); + if (n >= sizeof(mut_params)) { + ERROR("Parameter string too long (>=%zu)", + sizeof(mut_params)); + return -ENOMEM; + } + ret = parse_sub_devices(parse_device_param, + dev, params); + if (ret < 0) + return ret; + ret = remove_sub_devices_definition(mut_params); + if (ret < 0) + return ret; + if (strnlen(mut_params, sizeof(mut_params)) > 0) { + kvlist = rte_kvargs_parse(mut_params, + pmd_failsafe_init_parameters); + if (kvlist == NULL) { + ERROR("Error parsing parameters, usage:\n" + PMD_FAILSAFE_PARAM_STRING); + return -1; + } + /* MAC addr */ + arg_count = rte_kvargs_count(kvlist, + PMD_FAILSAFE_MAC_KVARG); + if (arg_count == 1) { + ret = rte_kvargs_process(kvlist, + PMD_FAILSAFE_MAC_KVARG, + &get_mac_addr_arg, + &dev->data->mac_addrs[0]); + if (ret < 0) + goto free_kvlist; + mac_from_arg = 1; + } + } +free_kvlist: + rte_kvargs_free(kvlist); + return ret; +} + +void +failsafe_args_free(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + + FOREACH_SUBDEV(sdev, i, dev) { + free(sdev->devargs.args); + sdev->devargs.args = NULL; + } +} + +static int +count_device(struct rte_eth_dev *dev, const char *param, + uint8_t head __rte_unused) +{ + size_t b = 0; + + while (param[b] != '(' && + param[b] != '\0') + b++; + if (strncmp(param, "dev", b) && + strncmp(param, "exec", b)) { + ERROR("Unrecognized device type: %.*s", (int)b, param); + return -EINVAL; + } + PRIV(dev)->subs_tail += 1; + return 0; +} + +int +failsafe_args_count_subdevice(struct rte_eth_dev *dev, + const char *params) +{ + return parse_sub_devices(count_device, + dev, params); +} diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c new file mode 100644 index 0000000..fcee500 --- /dev/null +++ b/drivers/net/failsafe/failsafe_eal.c @@ -0,0 +1,318 @@ +/*- + * BSD LICENSE + * + * Copyright 2017 6WIND S.A. + * Copyright 2017 Mellanox. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of 6WIND S.A. nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include + +#include "failsafe_private.h" + +static struct rte_eth_dev * +pci_addr_to_eth_dev(struct rte_pci_addr *addr) +{ + uint8_t pid; + + if (addr == NULL) + return NULL; + for (pid = 0; pid < RTE_MAX_ETHPORTS; pid++) { + struct rte_pci_addr *addr2; + struct rte_eth_dev *edev; + + edev = &rte_eth_devices[pid]; + if (edev->device == NULL || + edev->device->devargs == NULL) + continue; + addr2 = &edev->device->devargs->pci.addr; + if (rte_eal_compare_pci_addr(addr, addr2) == 0) + return edev; + } + return NULL; +} + +static int +pci_scan_one(struct sub_device *sdev) +{ + struct rte_devargs *da; + char dirname[PATH_MAX]; + + da = &sdev->devargs; + snprintf(dirname, sizeof(dirname), + "%s/" PCI_PRI_FMT, + pci_get_sysfs_path(), + da->pci.addr.domain, + da->pci.addr.bus, + da->pci.addr.devid, + da->pci.addr.function); + errno = 0; + if (rte_eal_pci_parse_sysfs_entry(&sdev->pci_device, + dirname, &da->pci.addr) < 0) { + if (errno == ENOENT) { + DEBUG("Could not scan requested device " PCI_PRI_FMT, + da->pci.addr.domain, + da->pci.addr.bus, + da->pci.addr.devid, + da->pci.addr.function); + } else { + ERROR("Error while scanning sysfs entry %s", + dirname); + return -1; + } + } else { + sdev->state = DEV_SCANNED; + } + return 0; +} + +static int +pci_scan(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + struct rte_devargs *da; + uint8_t i; + int ret; + + FOREACH_SUBDEV(sdev, i, dev) { + if (sdev->state != DEV_PARSED) + continue; + da = &sdev->devargs; + if (da->type == RTE_DEVTYPE_WHITELISTED_PCI) { + sdev->device.devargs = da; + ret = pci_scan_one(sdev); + if (ret) + return ret; + } + } + return 0; +} + +static int +dev_init(struct rte_eth_dev *dev) +{ + struct sub_device *sdev = NULL; + struct rte_devargs *da; + uint8_t i; + int ret; + + FOREACH_SUBDEV(sdev, i, dev) { + if (sdev->state != DEV_PARSED) + continue; + da = &sdev->devargs; + if (da->type == RTE_DEVTYPE_VIRTUAL) { + sdev->device.devargs = da; + PRIV(dev)->current_probed = i; + ret = rte_eal_vdev_init(da->virt.drv_name, + da->args); + if (ret) + return ret; + sdev->eth_dev = + rte_eth_dev_allocated(da->virt.drv_name); + if (ETH(sdev) == NULL) { + ERROR("sub_device %d init went wrong", i); + continue; + } + ETH(sdev)->state = RTE_ETH_DEV_DEFERRED; + sdev->state = DEV_PROBED; + } + } + return 0; +} + +static int +pci_probe(struct rte_eth_dev *dev) +{ + struct sub_device *sdev = NULL; + struct rte_pci_device *pdev = NULL; + struct rte_devargs *da = NULL; + uint8_t i; + int ret; + + FOREACH_SUBDEV(sdev, i, dev) { + if (sdev->state != DEV_SCANNED) + continue; + da = &sdev->devargs; + if (da->type == RTE_DEVTYPE_WHITELISTED_PCI) { + pdev = &sdev->pci_device; + pdev->device.devargs = da; + ret = rte_eal_pci_probe_all_drivers(pdev); + /* probing failed */ + if (ret < 0) { + ERROR("Failed to probe requested device " + PCI_PRI_FMT, + da->pci.addr.domain, + da->pci.addr.bus, + da->pci.addr.devid, + da->pci.addr.function); + return -1; + } + /* no driver found */ + if (ret > 0) { + ERROR("Requested device " PCI_PRI_FMT + " has no suitable driver", + da->pci.addr.domain, + da->pci.addr.bus, + da->pci.addr.devid, + da->pci.addr.function); + return 1; + } + sdev->eth_dev = + pci_addr_to_eth_dev(&pdev->addr); + if (ETH(sdev) == NULL) { + ERROR("sub_device %d init went wrong", i); + continue; + } + ETH(sdev)->state = RTE_ETH_DEV_DEFERRED; + sdev->state = DEV_PROBED; + } + } + return 0; +} + +int +failsafe_eal_init(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + int ret; + + ret = pci_scan(dev); + if (ret) + return ret; + ret = pci_probe(dev); + if (ret) + return ret; + ret = dev_init(dev); + if (ret) + return ret; + /* + * We only update TX_SUBDEV if we are not started. + * If a sub_device is emitting, we will switch the TX_SUBDEV to the + * preferred port only upon starting it, so that the switch is smoother. + */ + if (PREFERRED_SUBDEV(dev)->state >= DEV_PROBED) { + if (TX_SUBDEV(dev) != PREFERRED_SUBDEV(dev) && + (TX_SUBDEV(dev) == NULL || + (TX_SUBDEV(dev) && TX_SUBDEV(dev)->state < DEV_STARTED))) { + DEBUG("Switching tx_dev to preferred sub_device"); + PRIV(dev)->subs_tx = 0; + } + } else { + if ((TX_SUBDEV(dev) && TX_SUBDEV(dev)->state < DEV_PROBED) || + TX_SUBDEV(dev) == NULL) { + /* Using first probed device */ + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_PROBED) { + DEBUG("Switching tx_dev to sub_device %d", + i); + PRIV(dev)->subs_tx = i; + break; + } + } + } + return 0; +} + +static int +dev_uninit(struct rte_eth_dev *dev) +{ + struct sub_device *sdev = NULL; + struct rte_devargs *da; + uint8_t i; + int ret; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_PROBED) { + da = &sdev->devargs; + if (da->type == RTE_DEVTYPE_VIRTUAL) { + sdev->device.devargs = da; + /* + * We are lucky here that no virtual device + * currently creates multiple eth_dev. + */ + ret = rte_eal_vdev_uninit(da->virt.drv_name); + if (ret) + return ret; + sdev->state = DEV_PROBED - 1; + } + } + return 0; +} + +static int +pci_remove(struct rte_eth_dev *dev) +{ + struct sub_device *sdev = NULL; + struct rte_pci_device *pdev = NULL; + struct rte_devargs *da = NULL; + uint8_t i; + int ret; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_PROBED) { + da = &sdev->devargs; + if (da->type == RTE_DEVTYPE_WHITELISTED_PCI) { + pdev = &sdev->pci_device; + ret = rte_eal_pci_detach_all_drivers(pdev); + /* probing failed */ + if (ret < 0) { + ERROR("Failed to remove requested device " + PCI_PRI_FMT, + da->pci.addr.domain, + da->pci.addr.bus, + da->pci.addr.devid, + da->pci.addr.function); + return -1; + } + /* no driver found */ + if (ret > 0) { + ERROR("Requested device " PCI_PRI_FMT + " has no suitable driver", + da->pci.addr.domain, + da->pci.addr.bus, + da->pci.addr.devid, + da->pci.addr.function); + return 1; + } + sdev->state = DEV_PROBED - 1; + } + } + return 0; +} + +int +failsafe_eal_uninit(struct rte_eth_dev *dev) +{ + int ret; + + ret = pci_remove(dev); + if (ret) + return ret; + ret = dev_uninit(dev); + if (ret) + return ret; + return 0; +} diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c new file mode 100644 index 0000000..470fea4 --- /dev/null +++ b/drivers/net/failsafe/failsafe_ops.c @@ -0,0 +1,677 @@ +/*- + * BSD LICENSE + * + * Copyright 2017 6WIND S.A. + * Copyright 2017 Mellanox. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of 6WIND S.A. nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include +#include +#include +#include + +#include "failsafe_private.h" + +static struct rte_eth_dev_info default_infos = { + .driver_name = pmd_failsafe_driver_name, + /* Max possible number of elements */ + .max_rx_pktlen = UINT32_MAX, + .max_rx_queues = RTE_MAX_QUEUES_PER_PORT, + .max_tx_queues = RTE_MAX_QUEUES_PER_PORT, + .max_mac_addrs = FAILSAFE_MAX_ETHADDR, + .max_hash_mac_addrs = UINT32_MAX, + .max_vfs = UINT16_MAX, + .max_vmdq_pools = UINT16_MAX, + .rx_desc_lim = { + .nb_max = UINT16_MAX, + .nb_min = 0, + .nb_align = 1, + .nb_seg_max = UINT16_MAX, + .nb_mtu_seg_max = UINT16_MAX, + }, + .tx_desc_lim = { + .nb_max = UINT16_MAX, + .nb_min = 0, + .nb_align = 1, + .nb_seg_max = UINT16_MAX, + .nb_mtu_seg_max = UINT16_MAX, + }, + /* Set of understood capabilities */ + .rx_offload_capa = 0x0, + .tx_offload_capa = 0x0, + .flow_type_rss_offloads = 0x0, +}; + +static int +fs_dev_configure(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + int ret; + + FOREACH_SUBDEV(sdev, i, dev) { + if (sdev->state != DEV_PROBED) + continue; + DEBUG("Configuring sub-device %d", i); + ret = rte_eth_dev_configure(PORT_ID(sdev), + dev->data->nb_rx_queues, + dev->data->nb_tx_queues, + &dev->data->dev_conf); + if (ret) { + ERROR("Could not configure sub_device %d", i); + return ret; + } + sdev->state = DEV_ACTIVE; + } + return 0; +} + +static int +fs_dev_start(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + int ret; + + FOREACH_SUBDEV(sdev, i, dev) { + if (sdev->state != DEV_ACTIVE) + continue; + DEBUG("Starting sub_device %d", i); + ret = rte_eth_dev_start(PORT_ID(sdev)); + if (ret) + return ret; + sdev->state = DEV_STARTED; + } + if (PREFERRED_SUBDEV(dev)->state == DEV_STARTED) { + if (TX_SUBDEV(dev) != PREFERRED_SUBDEV(dev)) { + DEBUG("Switching tx_dev to preferred sub_device"); + PRIV(dev)->subs_tx = 0; + } + } else { + if ((TX_SUBDEV(dev) && TX_SUBDEV(dev)->state < DEV_STARTED) || + TX_SUBDEV(dev) == NULL) { + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_STARTED) { + DEBUG("Switching tx_dev to sub_device %d", i); + PRIV(dev)->subs_tx = i; + break; + } + } + } + return 0; +} + +static void +fs_dev_stop(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_STARTED) { + rte_eth_dev_stop(PORT_ID(sdev)); + sdev->state = DEV_STARTED - 1; + } +} + +static int +fs_dev_set_link_up(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + int ret; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) { + DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i); + ret = rte_eth_dev_set_link_up(PORT_ID(sdev)); + if (ret) { + ERROR("Operation rte_eth_dev_set_link_up failed for sub_device %d" + " with error %d", i, ret); + return ret; + } + } + return 0; +} + +static int +fs_dev_set_link_down(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + int ret; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) { + DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i); + ret = rte_eth_dev_set_link_down(PORT_ID(sdev)); + if (ret) { + ERROR("Operation rte_eth_dev_set_link_down failed for sub_device %d" + " with error %d", i, ret); + return ret; + } + } + return 0; +} + +static void dev_free_queues(struct rte_eth_dev *dev); +static void +fs_dev_close(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) { + DEBUG("Closing sub_device %d", i); + rte_eth_dev_close(PORT_ID(sdev)); + sdev->state = DEV_ACTIVE - 1; + } + dev_free_queues(dev); +} + +static void +fs_rx_queue_release(void *queue) +{ + struct rte_eth_dev *dev; + struct sub_device *sdev; + uint8_t i; + struct rxq *rxq; + + if (queue == NULL) + return; + rxq = queue; + dev = rxq->priv->dev; + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) + SUBOPS(sdev, rx_queue_release) + (Ð(sdev)->data->rx_queues[rxq->qid]); + rte_free(rxq); +} + +static int +fs_rx_queue_setup(struct rte_eth_dev *dev, + uint16_t rx_queue_id, + uint16_t nb_rx_desc, + unsigned int socket_id, + const struct rte_eth_rxconf *rx_conf, + struct rte_mempool *mb_pool) +{ + struct sub_device *sdev; + struct rxq *rxq; + uint8_t i; + int ret; + + rxq = dev->data->rx_queues[rx_queue_id]; + if (rxq != NULL) { + fs_rx_queue_release(rxq); + dev->data->rx_queues[rx_queue_id] = NULL; + } + rxq = rte_zmalloc(NULL, sizeof(*rxq), + RTE_CACHE_LINE_SIZE); + if (rxq == NULL) + return -ENOMEM; + rxq->qid = rx_queue_id; + rxq->socket_id = socket_id; + rxq->info.mp = mb_pool; + rxq->info.conf = *rx_conf; + rxq->info.nb_desc = nb_rx_desc; + rxq->priv = PRIV(dev); + dev->data->rx_queues[rx_queue_id] = rxq; + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) { + ret = rte_eth_rx_queue_setup(PORT_ID(sdev), + rx_queue_id, + nb_rx_desc, socket_id, + rx_conf, mb_pool); + if (ret) { + ERROR("RX queue setup failed for sub_device %d", i); + goto free_rxq; + } + } + return 0; +free_rxq: + fs_rx_queue_release(rxq); + return ret; +} + +static void +fs_tx_queue_release(void *queue) +{ + struct rte_eth_dev *dev; + struct sub_device *sdev; + uint8_t i; + struct txq *txq; + + if (queue == NULL) + return; + txq = queue; + dev = txq->priv->dev; + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) + SUBOPS(sdev, tx_queue_release) + (Ð(sdev)->data->tx_queues[txq->qid]); + rte_free(txq); +} + +static int +fs_tx_queue_setup(struct rte_eth_dev *dev, + uint16_t tx_queue_id, + uint16_t nb_tx_desc, + unsigned int socket_id, + const struct rte_eth_txconf *tx_conf) +{ + struct sub_device *sdev; + struct txq *txq; + uint8_t i; + int ret; + + txq = dev->data->tx_queues[tx_queue_id]; + if (txq != NULL) { + fs_tx_queue_release(txq); + dev->data->tx_queues[tx_queue_id] = NULL; + } + txq = rte_zmalloc("ethdev TX queue", sizeof(*txq), + RTE_CACHE_LINE_SIZE); + if (txq == NULL) + return -ENOMEM; + txq->qid = tx_queue_id; + txq->socket_id = socket_id; + txq->info.conf = *tx_conf; + txq->info.nb_desc = nb_tx_desc; + txq->priv = PRIV(dev); + dev->data->tx_queues[tx_queue_id] = txq; + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) { + ret = rte_eth_tx_queue_setup(PORT_ID(sdev), + tx_queue_id, + nb_tx_desc, socket_id, + tx_conf); + if (ret) { + ERROR("TX queue setup failed for sub_device %d", i); + goto free_txq; + } + } + return 0; +free_txq: + fs_tx_queue_release(txq); + return ret; +} + +static void +dev_free_queues(struct rte_eth_dev *dev) +{ + uint16_t i; + + for (i = 0; i < dev->data->nb_rx_queues; i++) { + fs_rx_queue_release(dev->data->rx_queues[i]); + dev->data->rx_queues[i] = NULL; + } + dev->data->nb_rx_queues = 0; + for (i = 0; i < dev->data->nb_tx_queues; i++) { + fs_tx_queue_release(dev->data->tx_queues[i]); + dev->data->tx_queues[i] = NULL; + } + dev->data->nb_tx_queues = 0; +} + +static void +fs_promiscuous_enable(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) + rte_eth_promiscuous_enable(PORT_ID(sdev)); +} + +static void +fs_promiscuous_disable(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) + rte_eth_promiscuous_disable(PORT_ID(sdev)); +} + +static void +fs_allmulticast_enable(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) + rte_eth_allmulticast_enable(PORT_ID(sdev)); +} + +static void +fs_allmulticast_disable(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) + rte_eth_allmulticast_disable(PORT_ID(sdev)); +} + +static int +fs_link_update(struct rte_eth_dev *dev, + int wait_to_complete) +{ + struct sub_device *sdev; + uint8_t i; + int ret; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) { + DEBUG("Calling link_update on sub_device %d", i); + ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete); + if (ret && ret != -1) { + ERROR("Link update failed for sub_device %d with error %d", + i, ret); + return ret; + } + } + if (TX_SUBDEV(dev)) { + struct rte_eth_link *l1; + struct rte_eth_link *l2; + + l1 = &dev->data->dev_link; + l2 = Ð(TX_SUBDEV(dev))->data->dev_link; + if (memcmp(l1, l2, sizeof(*l1))) { + *l1 = *l2; + return 0; + } + } + return -1; +} + +static void +fs_stats_get(struct rte_eth_dev *dev, + struct rte_eth_stats *stats) +{ + struct sub_device *sdev; + struct rte_eth_stats loc_stats; + uint8_t j; + size_t i; + + memset(stats, 0, sizeof(*stats)); + memset(&loc_stats, 0, sizeof(loc_stats)); + FOREACH_SUBDEV_ST(sdev, j, dev, DEV_PROBED) { + rte_eth_stats_get(PORT_ID(sdev), &loc_stats); + stats->ipackets += loc_stats.ipackets; + stats->opackets += loc_stats.opackets; + stats->ibytes += loc_stats.ibytes; + stats->obytes += loc_stats.obytes; + stats->imissed += loc_stats.imissed; + stats->ierrors += loc_stats.ierrors; + stats->oerrors += loc_stats.oerrors; + stats->rx_nombuf += loc_stats.rx_nombuf; + for (i = 0; i < RTE_DIM(stats->q_ipackets); i++) + stats->q_ipackets[i] += loc_stats.q_ipackets[i]; + for (i = 0; i < RTE_DIM(stats->q_opackets); i++) + stats->q_opackets[i] += loc_stats.q_opackets[i]; + for (i = 0; i < RTE_DIM(stats->q_ibytes); i++) + stats->q_ibytes[i] += loc_stats.q_ibytes[i]; + for (i = 0; i < RTE_DIM(stats->q_obytes); i++) + stats->q_obytes[i] += loc_stats.q_obytes[i]; + for (i = 0; i < RTE_DIM(stats->q_errors); i++) + stats->q_errors[i] += loc_stats.q_errors[i]; + } +} + +static void +fs_stats_reset(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + uint8_t i; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) + rte_eth_stats_reset(PORT_ID(sdev)); +} + +/** + * Fail-safe dev_infos_get rules: + * + * No sub_device: + * Numerables: + * Use the maximum possible values for any field, so as not + * to impede any further configuration effort. + * Capabilities: + * Limits capabilities to those that are understood by the + * fail-safe PMD. This understanding stems from the fail-safe + * being capable of verifying that the related capability is + * expressed within the device configuration (struct rte_eth_conf). + * + * At least one probed sub_device: + * Numerables: + * Uses values from the active probed sub_device + * The rationale here is that if any sub_device is less capable + * (for example concerning the number of queues) than the active + * sub_device, then its subsequent configuration will fail. + * It is impossible to foresee this failure when the failing sub_device + * is supposed to be plugged-in later on, so the configuration process + * is the single point of failure and error reporting. + * Capabilities: + * Uses a logical AND of RX capabilities among + * all sub_devices and the default capabilities. + * Uses a logical AND of TX capabilities among + * the active probed sub_device and the default capabilities. + * + */ +static void +fs_dev_infos_get(struct rte_eth_dev *dev, + struct rte_eth_dev_info *infos) +{ + struct sub_device *sdev; + uint8_t i; + + sdev = TX_SUBDEV(dev); + if (sdev == NULL) { + DEBUG("No probed device, using default infos"); + rte_memcpy(&PRIV(dev)->infos, &default_infos, + sizeof(default_infos)); + } else { + uint32_t rx_offload_capa; + + rx_offload_capa = default_infos.rx_offload_capa; + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_PROBED) { + rte_eth_dev_info_get(PORT_ID(sdev), + &PRIV(dev)->infos); + rx_offload_capa &= PRIV(dev)->infos.rx_offload_capa; + } + sdev = TX_SUBDEV(dev); + rte_eth_dev_info_get(PORT_ID(sdev), &PRIV(dev)->infos); + PRIV(dev)->infos.rx_offload_capa = rx_offload_capa; + PRIV(dev)->infos.tx_offload_capa &= + default_infos.tx_offload_capa; + PRIV(dev)->infos.flow_type_rss_offloads &= + default_infos.flow_type_rss_offloads; + } + rte_memcpy(infos, &PRIV(dev)->infos, sizeof(*infos)); +} + +static const uint32_t * +fs_dev_supported_ptypes_get(struct rte_eth_dev *dev) +{ + struct sub_device *sdev; + struct rte_eth_dev *edev; + + sdev = TX_SUBDEV(dev); + if (sdev == NULL) + return NULL; + edev = ETH(sdev); + /* ENOTSUP: counts as no supported ptypes */ + if (SUBOPS(sdev, dev_supported_ptypes_get) == NULL) + return NULL; + /* + * The API does not permit to do a clean AND of all ptypes, + * It is also incomplete by design and we do not really care + * to have a best possible value in this context. + * We just return the ptypes of the device of highest + * priority, usually the PREFERRED device. + */ + return SUBOPS(sdev, dev_supported_ptypes_get)(edev); +} + +static int +fs_mtu_set(struct rte_eth_dev *dev, uint16_t mtu) +{ + struct sub_device *sdev; + uint8_t i; + int ret; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) { + DEBUG("Calling rte_eth_dev_set_mtu on sub_device %d", i); + ret = rte_eth_dev_set_mtu(PORT_ID(sdev), mtu); + if (ret) { + ERROR("Operation rte_eth_dev_set_mtu failed for sub_device %d" + " with error %d", i, ret); + return ret; + } + } + return 0; +} + +static int +fs_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on) +{ + struct sub_device *sdev; + uint8_t i; + int ret; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) { + DEBUG("Calling rte_eth_dev_vlan_filter on sub_device %d", i); + ret = rte_eth_dev_vlan_filter(PORT_ID(sdev), vlan_id, on); + if (ret) { + ERROR("Operation rte_eth_dev_vlan_filter failed for sub_device %d" + " with error %d", i, ret); + return ret; + } + } + return 0; +} + +static int +fs_flow_ctrl_get(struct rte_eth_dev *dev, + struct rte_eth_fc_conf *fc_conf) +{ + struct sub_device *sdev; + + sdev = TX_SUBDEV(dev); + if (sdev == NULL) + return 0; + if (SUBOPS(sdev, flow_ctrl_get) == NULL) + return -ENOTSUP; + return SUBOPS(sdev, flow_ctrl_get)(ETH(sdev), fc_conf); +} + +static int +fs_flow_ctrl_set(struct rte_eth_dev *dev, + struct rte_eth_fc_conf *fc_conf) +{ + struct sub_device *sdev; + uint8_t i; + int ret; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) { + DEBUG("Calling rte_eth_dev_flow_ctrl_set on sub_device %d", i); + ret = rte_eth_dev_flow_ctrl_set(PORT_ID(sdev), fc_conf); + if (ret) { + ERROR("Operation rte_eth_dev_flow_ctrl_set failed for sub_device %d" + " with error %d", i, ret); + return ret; + } + } + return 0; +} + +static void +fs_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index) +{ + struct sub_device *sdev; + uint8_t i; + + /* No check: already done within the rte_eth_dev_mac_addr_remove + * call for the fail-safe device. + */ + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) + rte_eth_dev_mac_addr_remove(PORT_ID(sdev), + &dev->data->mac_addrs[index]); + PRIV(dev)->mac_addr_pool[index] = 0; +} + +static void +fs_mac_addr_add(struct rte_eth_dev *dev, + struct ether_addr *mac_addr, + uint32_t index, + uint32_t vmdq) +{ + struct sub_device *sdev; + uint8_t i; + + assert(index < FAILSAFE_MAX_ETHADDR); + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) + rte_eth_dev_mac_addr_add(PORT_ID(sdev), mac_addr, vmdq); + if (index >= PRIV(dev)->nb_mac_addr) { + DEBUG("Growing mac_addrs array"); + PRIV(dev)->nb_mac_addr = index; + } + PRIV(dev)->mac_addr_pool[index] = vmdq; +} + +static void +fs_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr) +{ + struct sub_device *sdev; + uint8_t i; + + FOREACH_SUBDEV_ST(sdev, i, dev, DEV_ACTIVE) + rte_eth_dev_default_mac_addr_set(PORT_ID(sdev), mac_addr); +} + +const struct eth_dev_ops failsafe_ops = { + .dev_configure = fs_dev_configure, + .dev_start = fs_dev_start, + .dev_stop = fs_dev_stop, + .dev_set_link_down = fs_dev_set_link_down, + .dev_set_link_up = fs_dev_set_link_up, + .dev_close = fs_dev_close, + .promiscuous_enable = fs_promiscuous_enable, + .promiscuous_disable = fs_promiscuous_disable, + .allmulticast_enable = fs_allmulticast_enable, + .allmulticast_disable = fs_allmulticast_disable, + .link_update = fs_link_update, + .stats_get = fs_stats_get, + .stats_reset = fs_stats_reset, + .dev_infos_get = fs_dev_infos_get, + .dev_supported_ptypes_get = fs_dev_supported_ptypes_get, + .mtu_set = fs_mtu_set, + .vlan_filter_set = fs_vlan_filter_set, + .rx_queue_setup = fs_rx_queue_setup, + .tx_queue_setup = fs_tx_queue_setup, + .rx_queue_release = fs_rx_queue_release, + .tx_queue_release = fs_tx_queue_release, + .flow_ctrl_get = fs_flow_ctrl_get, + .flow_ctrl_set = fs_flow_ctrl_set, + .mac_addr_remove = fs_mac_addr_remove, + .mac_addr_add = fs_mac_addr_add, + .mac_addr_set = fs_mac_addr_set, +}; diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h new file mode 100644 index 0000000..84a4d3a --- /dev/null +++ b/drivers/net/failsafe/failsafe_private.h @@ -0,0 +1,232 @@ +/*- + * BSD LICENSE + * + * Copyright 2017 6WIND S.A. + * Copyright 2017 Mellanox. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of 6WIND S.A. nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_ETH_FAILSAFE_PRIVATE_H_ +#define _RTE_ETH_FAILSAFE_PRIVATE_H_ + +#include +#include +#include +#include + +#define FAILSAFE_DRIVER_NAME "Fail-safe PMD" + +#define PMD_FAILSAFE_MAC_KVARG "mac" +#define PMD_FAILSAFE_PARAM_STRING \ + "dev()," \ + "mac=mac_addr" \ + "" + +#define FAILSAFE_PLUGIN_DEFAULT_TIMEOUT_MS 2000 + +#define FAILSAFE_MAX_ETHPORTS (RTE_MAX_ETHPORTS - 1) +#define FAILSAFE_MAX_ETHADDR 128 + +/* TYPES */ + +struct rxq { + struct fs_priv *priv; + uint16_t qid; + /* id of last sub_device polled */ + uint8_t last_polled; + unsigned int socket_id; + struct rte_eth_rxq_info info; +}; + +struct txq { + struct fs_priv *priv; + uint16_t qid; + unsigned int socket_id; + struct rte_eth_txq_info info; +}; + +enum dev_state { + DEV_UNDEFINED = 0, + DEV_PARSED, + DEV_SCANNED, + DEV_PROBED, + DEV_ACTIVE, + DEV_STARTED, +}; + +struct sub_device { + /* Exhaustive DPDK device description */ + struct rte_devargs devargs; + RTE_STD_C11 + union { + struct rte_device device; + struct rte_pci_device pci_device; + }; + struct rte_eth_dev *eth_dev; + /* Device state machine */ + enum dev_state state; +}; + +struct fs_priv { + struct rte_eth_dev *dev; + /* + * Set of sub_devices. + * subs[0] is the preferred device + * any other is just another slave + */ + uint8_t subs_head; /* if head == tail, no subs */ + uint8_t subs_tail; /* first invalid */ + uint8_t subs_tx; + struct sub_device *subs; + uint8_t current_probed; + /* current number of mac_addr slots allocated. */ + uint32_t nb_mac_addr; + struct ether_addr mac_addrs[FAILSAFE_MAX_ETHADDR]; + uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR]; + /* current capabilities */ + struct rte_eth_dev_info infos; +}; + +/* RX / TX */ + +uint16_t failsafe_rx_burst(void *rxq, + struct rte_mbuf **rx_pkts, uint16_t nb_pkts); +uint16_t failsafe_tx_burst(void *txq, + struct rte_mbuf **tx_pkts, uint16_t nb_pkts); + +/* ARGS */ + +int failsafe_args_parse(struct rte_eth_dev *dev, const char *params); +void failsafe_args_free(struct rte_eth_dev *dev); +int failsafe_args_count_subdevice(struct rte_eth_dev *dev, const char *params); + +/* EAL */ + +int failsafe_eal_init(struct rte_eth_dev *dev); +int failsafe_eal_uninit(struct rte_eth_dev *dev); + +/* GLOBALS */ + +extern const char pmd_failsafe_driver_name[]; +extern const struct eth_dev_ops failsafe_ops; +extern int mac_from_arg; + +/* HELPERS */ + +/* dev: (struct rte_eth_dev *) fail-safe device */ +#define PRIV(dev) \ + ((struct fs_priv *)(dev)->data->dev_private) + +/* sdev: (struct sub_device *) */ +#define ETH(sdev) \ + ((sdev)->eth_dev) + +/* sdev: (struct sub_device *) */ +#define PORT_ID(sdev) \ + (ETH(sdev)->data->port_id) + +/** + * Stateful iterator construct over fail-safe sub-devices: + * s: (struct sub_device *), iterator + * i: (uint8_t), increment + * dev: (struct rte_eth_dev *), fail-safe ethdev + * state: (enum dev_state), minimum acceptable device state + */ +#define FOREACH_SUBDEV_ST(s, i, dev, state) \ + for (i = fs_find_next((dev), 0, state); \ + i < PRIV(dev)->subs_tail && (s = &PRIV(dev)->subs[i]); \ + i = fs_find_next((dev), i + 1, state)) + +/** + * Iterator construct over fail-safe sub-devices: + * s: (struct sub_device *), iterator + * i: (uint8_t), increment + * dev: (struct rte_eth_dev *), fail-safe ethdev + */ +#define FOREACH_SUBDEV(s, i, dev) \ + FOREACH_SUBDEV_ST(s, i, dev, DEV_UNDEFINED) + +/* dev: (struct rte_eth_dev *) fail-safe device */ +#define PREFERRED_SUBDEV(dev) \ + (&PRIV(dev)->subs[0]) + +/* dev: (struct rte_eth_dev *) fail-safe device */ +#define TX_SUBDEV(dev) \ + (PRIV(dev)->subs_tx >= PRIV(dev)->subs_tail ? NULL \ + : (PRIV(dev)->subs[PRIV(dev)->subs_tx].state < DEV_PROBED ? NULL \ + : &PRIV(dev)->subs[PRIV(dev)->subs_tx])) + +/** + * s: (struct sub_device *) + * ops: (struct eth_dev_ops) member + */ +#define SUBOPS(s, ops) \ + (ETH(s)->dev_ops->ops) + +#ifndef NDEBUG +#include +#define DEBUG__(m, ...) \ + (fprintf(stderr, "%s:%d: %s(): " m "%c", \ + __FILE__, __LINE__, __func__, __VA_ARGS__), \ + (void)0) +#define DEBUG_(...) \ + (errno = ((int []){ \ + *(volatile int *)&errno, \ + (DEBUG__(__VA_ARGS__), 0) \ + })[0]) +#define DEBUG(...) DEBUG_(__VA_ARGS__, '\n') +#define INFO(...) DEBUG(__VA_ARGS__) +#define WARN(...) DEBUG(__VA_ARGS__) +#define ERROR(...) DEBUG(__VA_ARGS__) +#else +#define DEBUG(...) ((void)0) +#define LOG__(level, m, ...) \ + RTE_LOG(level, PMD, "net_failsafe: " m "%c", __VA_ARGS__) +#define LOG_(level, ...) LOG__(level, __VA_ARGS__, '\n') +#define INFO(...) LOG_(INFO, __VA_ARGS__) +#define WARN(...) LOG_(WARNING, "WARNING: " __VA_ARGS__) +#define ERROR(...) LOG_(ERR, "ERROR: " __VA_ARGS__) +#endif + +/* inlined functions */ + +static inline uint8_t +fs_find_next(struct rte_eth_dev *dev, uint8_t sid, + enum dev_state min_state) +{ + while (sid < PRIV(dev)->subs_tail) { + if (PRIV(dev)->subs[sid].state >= min_state) + break; + sid++; + } + if (sid >= PRIV(dev)->subs_tail) + return PRIV(dev)->subs_tail; + return sid; +} + +#endif /* _RTE_ETH_FAILSAFE_PRIVATE_H_ */ diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c new file mode 100644 index 0000000..a45b4e5 --- /dev/null +++ b/drivers/net/failsafe/failsafe_rxtx.c @@ -0,0 +1,107 @@ +/*- + * BSD LICENSE + * + * Copyright 2017 6WIND S.A. + * Copyright 2017 Mellanox. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of 6WIND S.A. nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include +#include + +#include "failsafe_private.h" + +/* + * TODO: write fast version, + * without additional checks, to be activated once + * everything has been verified to comply. + */ +uint16_t +failsafe_rx_burst(void *queue, + struct rte_mbuf **rx_pkts, + uint16_t nb_pkts) +{ + struct fs_priv *priv; + struct sub_device *sdev; + struct rxq *rxq; + void *sub_rxq; + uint16_t nb_rx; + uint8_t nb_polled, nb_subs; + uint8_t i; + + rxq = queue; + priv = rxq->priv; + nb_subs = priv->subs_tail - priv->subs_head; + nb_polled = 0; + for (i = rxq->last_polled; nb_polled < nb_subs; nb_polled++) { + i++; + if (i == priv->subs_tail) + i = priv->subs_head; + sdev = &priv->subs[i]; + if (unlikely(ETH(sdev) == NULL)) + continue; + if (unlikely(ETH(sdev)->rx_pkt_burst == NULL)) + continue; + if (unlikely(sdev->state != DEV_STARTED)) + continue; + sub_rxq = ETH(sdev)->data->rx_queues[rxq->qid]; + nb_rx = ETH(sdev)-> + rx_pkt_burst(sub_rxq, rx_pkts, nb_pkts); + if (nb_rx) { + rxq->last_polled = i; + return nb_rx; + } + } + return 0; +} + +/* + * TODO: write fast version, + * without additional checks, to be activated once + * everything has been verified to comply. + */ +uint16_t +failsafe_tx_burst(void *queue, + struct rte_mbuf **tx_pkts, + uint16_t nb_pkts) +{ + struct sub_device *sdev; + struct txq *txq; + void *sub_txq; + + txq = queue; + sdev = TX_SUBDEV(txq->priv->dev); + if (unlikely(sdev == NULL)) + return 0; + if (unlikely(ETH(sdev) == NULL)) + return 0; + if (unlikely(ETH(sdev)->tx_pkt_burst == NULL)) + return 0; + sub_txq = ETH(sdev)->data->tx_queues[txq->qid]; + return ETH(sdev)->tx_pkt_burst(sub_txq, tx_pkts, nb_pkts); +} diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 92f3635..2032625 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -112,6 +112,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_E1000_PMD) += -lrte_pmd_e1000 _LDLIBS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += -lrte_pmd_ena _LDLIBS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += -lrte_pmd_enic _LDLIBS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += -lrte_pmd_fm10k +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += -lrte_pmd_failsafe _LDLIBS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += -lrte_pmd_i40e _LDLIBS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += -lrte_pmd_ixgbe _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -lrte_pmd_mlx4 -libverbs -- 2.1.4