From: Harman Kalra <hkalra@marvell.com>
To: Andrzej Ostruszka <aostruszka@marvell.com>
Cc: <dev@dpdk.org>, Thomas Monjalon <thomas@monjalon.net>
Subject: Re: [dpdk-dev] [PATCH 1/4] lib: introduce IF Proxy library
Date: Tue, 31 Mar 2020 18:06:29 +0530 [thread overview]
Message-ID: <20200331123627.GA10566@outlook.office365.com> (raw)
In-Reply-To: <20200306164104.15528-2-aostruszka@marvell.com>
On Fri, Mar 06, 2020 at 05:41:01PM +0100, Andrzej Ostruszka wrote:
> This library allows to designate ports visible to the system (such as
> Tun/Tap or KNI) as port representors serving as proxies for other DPDK
> ports. When such a proxy is configured this library initially queries
> network configuration from the system and later monitors its changes.
>
> The information gathered is passed to the application either via a set
> of user registered callbacks or as an event added to the configured
> notification queue (or a combination of these two mechanisms). This way
> user can use normal network utilities (like those from the iproute2
> suite) to configure DPDK ports.
>
> Signed-off-by: Andrzej Ostruszka <aostruszka@marvell.com>
> ---
> MAINTAINERS | 3 +
> config/common_base | 5 +
> config/common_linux | 1 +
> lib/Makefile | 2 +
> .../common/include/rte_eal_interrupts.h | 2 +
> lib/librte_eal/linux/eal/eal_interrupts.c | 14 +-
> lib/librte_if_proxy/Makefile | 29 +
> lib/librte_if_proxy/if_proxy_common.c | 494 +++++++++++++++
> lib/librte_if_proxy/if_proxy_priv.h | 97 +++
> lib/librte_if_proxy/linux/Makefile | 4 +
> lib/librte_if_proxy/linux/if_proxy.c | 552 +++++++++++++++++
> lib/librte_if_proxy/meson.build | 19 +
> lib/librte_if_proxy/rte_if_proxy.h | 561 ++++++++++++++++++
> lib/librte_if_proxy/rte_if_proxy_version.map | 19 +
> lib/meson.build | 2 +-
> 15 files changed, 1799 insertions(+), 5 deletions(-)
> create mode 100644 lib/librte_if_proxy/Makefile
> create mode 100644 lib/librte_if_proxy/if_proxy_common.c
> create mode 100644 lib/librte_if_proxy/if_proxy_priv.h
> create mode 100644 lib/librte_if_proxy/linux/Makefile
> create mode 100644 lib/librte_if_proxy/linux/if_proxy.c
> create mode 100644 lib/librte_if_proxy/meson.build
> create mode 100644 lib/librte_if_proxy/rte_if_proxy.h
> create mode 100644 lib/librte_if_proxy/rte_if_proxy_version.map
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f4e0ed8e0..aec7326ca 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1469,6 +1469,9 @@ F: examples/bpf/
> F: app/test/test_bpf.c
> F: doc/guides/prog_guide/bpf_lib.rst
>
> +IF Proxy - EXPERIMENTAL
> +M: Andrzej Ostruszka <aostruszka@marvell.com>
> +F: lib/librte_if_proxy/
>
> Test Applications
> -----------------
> diff --git a/config/common_base b/config/common_base
> index 7ca2f28b1..dcc0a0650 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -1075,6 +1075,11 @@ CONFIG_RTE_LIBRTE_BPF_ELF=n
> #
> CONFIG_RTE_LIBRTE_IPSEC=y
>
> +#
> +# Compile librte_if_proxy
> +#
> +CONFIG_RTE_LIBRTE_IF_PROXY=n
> +
> #
> # Compile the test application
> #
> diff --git a/config/common_linux b/config/common_linux
> index 816810671..1244eb0ae 100644
> --- a/config/common_linux
> +++ b/config/common_linux
> @@ -16,6 +16,7 @@ CONFIG_RTE_LIBRTE_VHOST_NUMA=y
> CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
> CONFIG_RTE_LIBRTE_PMD_VHOST=y
> CONFIG_RTE_LIBRTE_IFC_PMD=y
> +CONFIG_RTE_LIBRTE_IF_PROXY=y
> CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
> CONFIG_RTE_LIBRTE_PMD_MEMIF=y
> CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
> diff --git a/lib/Makefile b/lib/Makefile
> index 46b91ae1a..6a20806f1 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -118,6 +118,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_TELEMETRY) += librte_telemetry
> DEPDIRS-librte_telemetry := librte_eal librte_metrics librte_ethdev
> DIRS-$(CONFIG_RTE_LIBRTE_RCU) += librte_rcu
> DEPDIRS-librte_rcu := librte_eal
> +DIRS-$(CONFIG_RTE_LIBRTE_IF_PROXY) += librte_if_proxy
> +DEPDIRS-librte_if_proxy := librte_eal librte_ethdev
>
> ifeq ($(CONFIG_RTE_EXEC_ENV_LINUX),y)
> DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> diff --git a/lib/librte_eal/common/include/rte_eal_interrupts.h b/lib/librte_eal/common/include/rte_eal_interrupts.h
> index 773a34a42..296a3853d 100644
> --- a/lib/librte_eal/common/include/rte_eal_interrupts.h
> +++ b/lib/librte_eal/common/include/rte_eal_interrupts.h
> @@ -36,6 +36,8 @@ enum rte_intr_handle_type {
> RTE_INTR_HANDLE_VDEV, /**< virtual device */
> RTE_INTR_HANDLE_DEV_EVENT, /**< device event handle */
> RTE_INTR_HANDLE_VFIO_REQ, /**< VFIO request handle */
> + RTE_INTR_HANDLE_NETLINK, /**< netlink notification handle */
> +
> RTE_INTR_HANDLE_MAX /**< count of elements */
> };
>
> diff --git a/lib/librte_eal/linux/eal/eal_interrupts.c b/lib/librte_eal/linux/eal/eal_interrupts.c
> index cb8e10709..16236a8c4 100644
> --- a/lib/librte_eal/linux/eal/eal_interrupts.c
> +++ b/lib/librte_eal/linux/eal/eal_interrupts.c
> @@ -680,6 +680,9 @@ rte_intr_enable(const struct rte_intr_handle *intr_handle)
> break;
> /* not used at this moment */
> case RTE_INTR_HANDLE_ALARM:
> +#if RTE_LIBRTE_IF_PROXY
> + case RTE_INTR_HANDLE_NETLINK:
> +#endif
> return -1;
> #ifdef VFIO_PRESENT
> case RTE_INTR_HANDLE_VFIO_MSIX:
> @@ -796,6 +799,9 @@ rte_intr_disable(const struct rte_intr_handle *intr_handle)
> break;
> /* not used at this moment */
> case RTE_INTR_HANDLE_ALARM:
> +#if RTE_LIBRTE_IF_PROXY
> + case RTE_INTR_HANDLE_NETLINK:
> +#endif
> return -1;
> #ifdef VFIO_PRESENT
> case RTE_INTR_HANDLE_VFIO_MSIX:
> @@ -889,12 +895,12 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
> break;
> #endif
> #endif
> - case RTE_INTR_HANDLE_VDEV:
> case RTE_INTR_HANDLE_EXT:
> - bytes_read = 0;
> - call = true;
> - break;
> + case RTE_INTR_HANDLE_VDEV:
> case RTE_INTR_HANDLE_DEV_EVENT:
> +#if RTE_LIBRTE_IF_PROXY
> + case RTE_INTR_HANDLE_NETLINK:
> +#endif
> bytes_read = 0;
> call = true;
> break;
> diff --git a/lib/librte_if_proxy/Makefile b/lib/librte_if_proxy/Makefile
> new file mode 100644
> index 000000000..43cb702a2
> --- /dev/null
> +++ b/lib/librte_if_proxy/Makefile
> @@ -0,0 +1,29 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2020 Marvell International Ltd.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_if_proxy.a
> +
> +CFLAGS += -DALLOW_EXPERIMENTAL_API
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> +LDLIBS += -lrte_eal -lrte_ethdev
> +
> +EXPORT_MAP := rte_if_proxy_version.map
> +
> +LIBABIVER := 1
> +
> +# all source are stored in SRCS-y
> +SRCS-$(CONFIG_RTE_LIBRTE_IF_PROXY) := if_proxy_common.c
> +
> +SYSDIR := $(patsubst "%app",%,$(CONFIG_RTE_EXEC_ENV))
> +include $(SRCDIR)/$(SYSDIR)/Makefile
> +
> +SRCS-$(CONFIG_RTE_LIBRTE_IF_PROXY) += $(addprefix $(SYSDIR)/,$(SRCS))
> +
> +# install this header file
> +SYMLINK-$(CONFIG_RTE_LIBRTE_IF_PROXY)-include := rte_if_proxy.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_if_proxy/if_proxy_common.c b/lib/librte_if_proxy/if_proxy_common.c
> new file mode 100644
> index 000000000..230727d0c
> --- /dev/null
> +++ b/lib/librte_if_proxy/if_proxy_common.c
> @@ -0,0 +1,494 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2020 Marvell International Ltd.
> + */
> +
> +#include <if_proxy_priv.h>
> +#include <rte_string_fns.h>
> +
> +
> +/* Definitions of data mentioned in if_proxy_priv.h and local ones. */
> +int ifpx_log_type;
> +
> +uint16_t ifpx_ports[RTE_MAX_ETHPORTS];
> +
> +rte_spinlock_t ifpx_lock = RTE_SPINLOCK_INITIALIZER;
> +
> +struct ifpx_proxies_head ifpx_proxies = TAILQ_HEAD_INITIALIZER(ifpx_proxies);
> +
> +struct ifpx_queue_node {
> + TAILQ_ENTRY(ifpx_queue_node) elem;
> + uint16_t state;
> + struct rte_ring *r;
> +};
> +static
> +TAILQ_HEAD(ifpx_queues_head, ifpx_queue_node) ifpx_queues =
> + TAILQ_HEAD_INITIALIZER(ifpx_queues);
> +
> +/* All function pointers have the same size - so use this one to typecast
> + * different callbacks in rte_ifpx_callbacks and test their presence in a
> + * generic way.
> + */
> +union cb_ptr_t {
> + int (*f_ptr)(void*); /* type for normal event notification */
> + int (*cfg_done)(void); /* lib notification for finished config */
> +};
> +union {
> + struct rte_ifpx_callbacks cbs;
> + union cb_ptr_t funcs[RTE_IFPX_NUM_EVENTS];
> +} ifpx_callbacks;
> +
> +uint64_t rte_ifpx_events_available(void)
> +{
> + /* All events are supported on Linux. */
> + return (1ULL << RTE_IFPX_NUM_EVENTS) - 1;
> +}
> +
> +uint16_t rte_ifpx_proxy_create(enum rte_ifpx_proxy_type type)
> +{
> + char devargs[16] = { '\0' };
> + int dev_cnt = 0, nlen;
> + uint16_t port_id;
> +
> + switch (type) {
> + case RTE_IFPX_DEFAULT:
> + case RTE_IFPX_TAP:
> + nlen = strlcpy(devargs, "net_tap", sizeof(devargs));
> + break;
> + case RTE_IFPX_KNI:
> + nlen = strlcpy(devargs, "net_kni", sizeof(devargs));
> + break;
> + default:
> + IFPX_LOG(ERR, "Unknown proxy type: %d", type);
> + return RTE_MAX_ETHPORTS;
> + }
> +
> + RTE_ETH_FOREACH_DEV(port_id) {
> + if (strcmp(rte_eth_devices[port_id].device->driver->name,
> + devargs) == 0)
> + ++dev_cnt;
> + }
> + snprintf(devargs+nlen, sizeof(devargs)-nlen, "%d", dev_cnt);
> +
> + return rte_ifpx_proxy_create_by_devarg(devargs);
> +}
> +
> +uint16_t rte_ifpx_proxy_create_by_devarg(const char *devarg)
> +{
> + uint16_t port_id = RTE_MAX_ETHPORTS;
> + struct rte_dev_iterator iter;
> +
> + if (rte_dev_probe(devarg) < 0) {
> + IFPX_LOG(ERR, "Failed to create proxy port %s\n", devarg);
> + return RTE_MAX_ETHPORTS;
> + }
> +
> + if (rte_eth_iterator_init(&iter, devarg) == 0) {
> + port_id = rte_eth_iterator_next(&iter);
> + if (port_id != RTE_MAX_ETHPORTS)
> + rte_eth_iterator_cleanup(&iter);
> + }
> +
> + return port_id;
> +}
> +
> +int ifpx_proxy_destroy(struct ifpx_proxy_node *px)
> +{
> + unsigned int i;
> + uint16_t proxy_id = px->proxy_id;
> +
> + TAILQ_REMOVE(&ifpx_proxies, px, elem);
> + free(px);
> +
> + /* Clear any bindings for this proxy. */
> + for (i = 0; i < RTE_DIM(ifpx_ports); ++i) {
> + if (ifpx_ports[i] == proxy_id) {
> + if (i == proxy_id) /* this entry is for proxy itself */
> + ifpx_ports[i] = RTE_MAX_ETHPORTS;
> + else
> + rte_ifpx_port_unbind(i);
> + }
> + }
> +
> + return rte_dev_remove(rte_eth_devices[proxy_id].device);
> +}
> +
> +int rte_ifpx_proxy_destroy(uint16_t proxy_id)
> +{
> + struct ifpx_proxy_node *px;
> + int ec = 0;
> +
> + rte_spinlock_lock(&ifpx_lock);
> + TAILQ_FOREACH(px, &ifpx_proxies, elem) {
> + if (px->proxy_id != proxy_id)
> + continue;
> + }
> + if (!px) {
> + ec = -EINVAL;
> + goto exit;
> + }
> + if (px->state & IN_USE)
> + px->state |= DEL_PENDING;
> + else
> + ec = ifpx_proxy_destroy(px);
> +exit:
> + rte_spinlock_unlock(&ifpx_lock);
> + return ec;
> +}
> +
> +int rte_ifpx_queue_add(struct rte_ring *r)
> +{
> + struct ifpx_queue_node *node;
> + int ec = 0;
> +
> + if (!r)
> + return -EINVAL;
> +
> + rte_spinlock_lock(&ifpx_lock);
> + TAILQ_FOREACH(node, &ifpx_queues, elem) {
> + if (node->r == r) {
> + ec = -EEXIST;
> + goto exit;
> + }
> + }
> +
> + node = malloc(sizeof(*node));
> + if (!node) {
> + ec = -ENOMEM;
> + goto exit;
> + }
> +
> + node->r = r;
> + TAILQ_INSERT_TAIL(&ifpx_queues, node, elem);
> +exit:
> + rte_spinlock_unlock(&ifpx_lock);
> +
> + return ec;
> +}
> +
> +int rte_ifpx_queue_remove(struct rte_ring *r)
> +{
> + struct ifpx_queue_node *node, *next;
> + int ec = -EINVAL;
> +
> + if (!r)
> + return ec;
> +
> + rte_spinlock_lock(&ifpx_lock);
> + for (node = TAILQ_FIRST(&ifpx_queues); node; node = next) {
> + next = TAILQ_NEXT(node, elem);
> + if (node->r != r)
> + continue;
> + TAILQ_REMOVE(&ifpx_queues, node, elem);
> + free(node);
> + ec = 0;
> + break;
> + }
> + rte_spinlock_unlock(&ifpx_lock);
> +
> + return ec;
> +}
> +
> +int rte_ifpx_port_bind(uint16_t port_id, uint16_t proxy_id)
> +{
> + struct rte_eth_dev_info proxy_eth_info;
> + struct ifpx_proxy_node *px;
> + int ec;
> +
> + if (port_id >= RTE_MAX_ETHPORTS || proxy_id >= RTE_MAX_ETHPORTS ||
> + /* port is a proxy */
> + ifpx_ports[port_id] == port_id) {
> + IFPX_LOG(ERR, "Invalid port_id: %d", port_id);
> + return -EINVAL;
> + }
> +
> + /* Do automatic rebinding but issue a warning since this is not
> + * considered to be a valid behaviour.
> + */
> + if (ifpx_ports[port_id] != RTE_MAX_ETHPORTS) {
> + IFPX_LOG(WARNING, "Port already bound: %d -> %d", port_id,
> + ifpx_ports[port_id]);
> + }
> +
> + /* Search for existing proxy - if not found add one to the list. */
> + rte_spinlock_lock(&ifpx_lock);
> + TAILQ_FOREACH(px, &ifpx_proxies, elem) {
> + if (px->proxy_id == proxy_id)
> + break;
> + }
> + if (!px) {
> + ec = rte_eth_dev_info_get(proxy_id, &proxy_eth_info);
> + if (ec < 0 || proxy_eth_info.if_index == 0) {
> + IFPX_LOG(ERR, "Invalid proxy: %d", proxy_id);
> + rte_spinlock_unlock(&ifpx_lock);
> + return ec < 0 ? ec : -EINVAL;
> + }
> + px = malloc(sizeof(*px));
> + if (!px) {
> + rte_spinlock_unlock(&ifpx_lock);
> + return -ENOMEM;
> + }
> + px->proxy_id = proxy_id;
> + px->info.if_index = proxy_eth_info.if_index;
> + rte_eth_dev_get_mtu(proxy_id, &px->info.mtu);
> + rte_eth_macaddr_get(proxy_id, &px->info.mac);
> + memset(px->info.if_name, 0, sizeof(px->info.if_name));
> + TAILQ_INSERT_TAIL(&ifpx_proxies, px, elem);
> + ifpx_ports[proxy_id] = proxy_id;
> + }
> + rte_spinlock_unlock(&ifpx_lock);
> + ifpx_ports[port_id] = proxy_id;
> +
> + /* Add proxy MAC to the port - since port will often just forward
> + * packets from the proxy/system they will be sent with proxy MAC as
> + * src. In order to pass communication in other direction we should be
> + * accepting packets with proxy MAC as dst.
> + */
> + rte_eth_dev_mac_addr_add(port_id, &px->info.mac, 0);
> +
> + if (ifpx_platform.get_info)
> + ifpx_platform.get_info(px->info.if_index);
> +
> + return 0;
> +}
> +
> +int rte_ifpx_port_unbind(uint16_t port_id)
> +{
> + if (port_id >= RTE_MAX_ETHPORTS ||
> + ifpx_ports[port_id] == RTE_MAX_ETHPORTS ||
> + /* port is a proxy */
> + ifpx_ports[port_id] == port_id)
> + return -EINVAL;
> +
> + ifpx_ports[port_id] = RTE_MAX_ETHPORTS;
> + /* Proxy without any port bound is OK - that is the state of the proxy
> + * that has just been created, and it can still report routing
> + * information. So we do not even check if this is the case.
> + */
> +
> + return 0;
> +}
> +
> +int rte_ifpx_callbacks_register(const struct rte_ifpx_callbacks *cbs)
> +{
> + if (!cbs)
> + return -EINVAL;
> +
> + rte_spinlock_lock(&ifpx_lock);
> + ifpx_callbacks.cbs = *cbs;
> + rte_spinlock_unlock(&ifpx_lock);
> +
> + return 0;
> +}
> +
> +void rte_ifpx_callbacks_unregister(void)
> +{
> + rte_spinlock_lock(&ifpx_lock);
> + memset(&ifpx_callbacks.cbs, 0, sizeof(ifpx_callbacks.cbs));
> + rte_spinlock_unlock(&ifpx_lock);
> +}
> +
> +uint16_t rte_ifpx_proxy_get(uint16_t port_id)
> +{
> + if (port_id >= RTE_MAX_ETHPORTS)
> + return RTE_MAX_ETHPORTS;
> +
> + return ifpx_ports[port_id];
> +}
> +
> +unsigned int rte_ifpx_port_get(uint16_t proxy_id,
> + uint16_t *ports, unsigned int num)
> +{
> + unsigned int p, cnt = 0;
> +
> + for (p = 0; p < RTE_DIM(ifpx_ports); ++p) {
> + if (ifpx_ports[p] == proxy_id && ifpx_ports[p] != p) {
> + ++cnt;
> + if (ports && num > 0) {
> + *ports++ = p;
> + --num;
> + }
> + }
> + }
> + return cnt;
> +}
> +
> +const struct rte_ifpx_info *rte_ifpx_info_get(uint16_t port_id)
> +{
> + struct ifpx_proxy_node *px;
> +
> + if (port_id >= RTE_MAX_ETHPORTS ||
> + ifpx_ports[port_id] == RTE_MAX_ETHPORTS)
> + return NULL;
> +
> + rte_spinlock_lock(&ifpx_lock);
> + TAILQ_FOREACH(px, &ifpx_proxies, elem) {
> + if (px->proxy_id == ifpx_ports[port_id])
> + break;
> + }
> + rte_spinlock_unlock(&ifpx_lock);
> + RTE_ASSERT(px && "Internal IF Proxy library error");
> +
> + return &px->info;
> +}
> +
> +static
> +void queue_event(const struct rte_ifpx_event *ev, struct rte_ring *r)
> +{
> + struct rte_ifpx_event *e = malloc(sizeof(*ev));
> +
> + if (!e) {
> + IFPX_LOG(ERR, "Failed to allocate event!");
> + return;
> + }
> + RTE_ASSERT(r);
> +
> + *e = *ev;
> + rte_ring_sp_enqueue(r, e);
> +}
> +
> +void ifpx_notify_event(struct rte_ifpx_event *ev, struct ifpx_proxy_node *px)
> +{
> + struct ifpx_queue_node *q;
> + int done = 0;
> + uint16_t p, proxy_id;
> +
> + if (px) {
> + if (px->state & DEL_PENDING)
> + return;
> + proxy_id = px->proxy_id;
> + RTE_ASSERT(proxy_id != RTE_MAX_ETHPORTS);
> + px->state |= IN_USE;
> + } else
> + proxy_id = RTE_MAX_ETHPORTS;
> +
> + RTE_ASSERT(ev);
> + /* This function is expected to be called with a lock held. */
> + RTE_ASSERT(rte_spinlock_trylock(&ifpx_lock) == 0);
> +
> + if (ifpx_callbacks.funcs[ev->type].f_ptr) {
> + union cb_ptr_t cb = ifpx_callbacks.funcs[ev->type];
> +
> + /* Drop the lock for the time of callback call. */
> + rte_spinlock_unlock(&ifpx_lock);
> + if (px) {
> + for (p = 0; p < RTE_DIM(ifpx_ports); ++p) {
> + if (ifpx_ports[p] != proxy_id ||
> + ifpx_ports[p] == p)
> + continue;
> + ev->data.port_id = p;
> + done = cb.f_ptr(&ev->data) || done;
Since callback are handled as DPDK interrupts, hope there is no event
which gets lost. Cannot afford to loose a route change event as kernel
might not send it again.
> + }
> + } else {
> + RTE_ASSERT(ev->type == RTE_IFPX_CFG_DONE);
> + done = cb.cfg_done();
> + }
> + rte_spinlock_lock(&ifpx_lock);
> + }
> + if (done)
> + goto exit;
> +
> + /* Event not "consumed" yet so try to notify via queues. */
> + TAILQ_FOREACH(q, &ifpx_queues, elem) {
> + if (px) {
> + for (p = 0; p < RTE_DIM(ifpx_ports); ++p) {
> + if (ifpx_ports[p] != proxy_id ||
> + ifpx_ports[p] == p)
> + continue;
> + /* Set the port_id - the remaining params should
> + * be filled before calling this function.
> + */
> + ev->data.port_id = p;
> + queue_event(ev, q->r);
> + }
> + } else
> + queue_event(ev, q->r);
> + }
> +exit:
> + if (px)
> + px->state &= ~IN_USE;
> +}
> +
> +void ifpx_cleanup_proxies(void)
> +{
> + struct ifpx_proxy_node *px, *next;
> + for (px = TAILQ_FIRST(&ifpx_proxies); px; px = next) {
> + next = TAILQ_NEXT(px, elem);
> + if (px->state & DEL_PENDING)
> + ifpx_proxy_destroy(px);
> + }
> +}
> +
> +int rte_ifpx_listen(void)
> +{
> + int ec;
> +
> + if (!ifpx_platform.listen)
> + return -ENOTSUP;
> +
> + ec = ifpx_platform.listen();
> + if (ec == 0 && ifpx_platform.get_info)
> + ifpx_platform.get_info(0);
nlink_get_info calls request_info with a if_index, passing 0 might
be good in current scenario but valid index should be passed to
get_info.
> +
> + return ec;
> +}
> +
> +int rte_ifpx_close(void)
> +{
> + struct ifpx_proxy_node *px;
> + struct ifpx_queue_node *q;
> + unsigned int p;
> + int ec = 0;
> +
> + if (ifpx_platform.close) {
> + ec = ifpx_platform.close();
> + if (ec != 0)
> + IFPX_LOG(ERR, "Platform 'close' calback failed.");
> + }
> +
> + rte_spinlock_lock(&ifpx_lock);
> + /* Remove queues. */
> + while (!TAILQ_EMPTY(&ifpx_queues)) {
> + q = TAILQ_FIRST(&ifpx_queues);
> + TAILQ_REMOVE(&ifpx_queues, q, elem);
> + free(q);
> + }
> +
> + /* Clear callbacks. */
> + memset(&ifpx_callbacks.cbs, 0, sizeof(ifpx_callbacks.cbs));
> +
> + /* Unbind ports. */
> + for (p = 0; p < RTE_DIM(ifpx_ports); ++p) {
> + if (ifpx_ports[p] == RTE_MAX_ETHPORTS)
> + continue;
> + if (ifpx_ports[p] == p)
> + /* port is a proxy - just clear entry */
> + ifpx_ports[p] = RTE_MAX_ETHPORTS;
> + else
> + rte_ifpx_port_unbind(p);
> + }
> +
> + /* Clear proxies. */
> + while (!TAILQ_EMPTY(&ifpx_proxies)) {
> + px = TAILQ_FIRST(&ifpx_proxies);
> + TAILQ_REMOVE(&ifpx_proxies, px, elem);
> + free(px);
> + }
> +
> + rte_spinlock_unlock(&ifpx_lock);
> +
> + return ec;
> +}
> +
> +RTE_INIT(if_proxy_init)
> +{
> + unsigned int i;
> + for (i = 0; i < RTE_DIM(ifpx_ports); ++i)
> + ifpx_ports[i] = RTE_MAX_ETHPORTS;
> +
> + ifpx_log_type = rte_log_register("lib.if_proxy");
> + if (ifpx_log_type >= 0)
> + rte_log_set_level(ifpx_log_type, RTE_LOG_WARNING);
> +
> + if (ifpx_platform.init)
> + ifpx_platform.init();
> +}
> diff --git a/lib/librte_if_proxy/if_proxy_priv.h b/lib/librte_if_proxy/if_proxy_priv.h
> new file mode 100644
> index 000000000..2fbf9127a
> --- /dev/null
> +++ b/lib/librte_if_proxy/if_proxy_priv.h
> @@ -0,0 +1,97 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2020 Marvell International Ltd.
> + */
> +#ifndef _IF_PROXY_PRIV_H_
> +#define _IF_PROXY_PRIV_H_
> +
> +#include <rte_if_proxy.h>
> +#include <rte_spinlock.h>
> +
> +extern int ifpx_log_type;
> +#define IFPX_LOG(level, fmt, args...) \
> + rte_log(RTE_LOG_ ## level, ifpx_log_type, "%s(): " fmt "\n", \
> + __func__, ##args)
> +
> +/* Table keeping mapping between port and their proxies. */
> +extern
> +uint16_t ifpx_ports[RTE_MAX_ETHPORTS];
> +
> +/* Callbacks and proxies are kept in linked lists. Since this library is really
> + * a slow/config path we guard them with a lock - and only one for all of them
> + * should be enough. We don't expect a need to protect other data structures -
> + * e.g. data for given port is expected be accessed/modified from single thread.
> + */
> +extern rte_spinlock_t ifpx_lock;
> +
> +enum ifpx_node_status {
> + IN_USE = 1U << 0,
> + DEL_PENDING = 1U << 1,
> +};
> +
> +/* List of configured proxies */
> +struct ifpx_proxy_node {
> + TAILQ_ENTRY(ifpx_proxy_node) elem;
> + uint16_t proxy_id;
> + uint16_t state;
> + struct rte_ifpx_info info;
> +};
> +extern
> +TAILQ_HEAD(ifpx_proxies_head, ifpx_proxy_node) ifpx_proxies;
> +
> +/* This function should be called by the implementation whenever it notices
> + * change in the network configuration. The arguments are:
> + * - ev : pointer to filled event data structure (all fields are expected to be
> + * filled, with the exception of 'port_id' for all proxy/port related
> + * events: this function clones the event notification for each bound port
> + * and fills 'port_id' appropriately).
> + * - px : proxy node when given event is proxy/port related, otherwise pass NULL
> + */
> +void ifpx_notify_event(struct rte_ifpx_event *ev, struct ifpx_proxy_node *px);
> +
> +/* This function should be called by the implementation whenever it is done with
> + * notification about network configuration change. It is only really needed
> + * for the case of callback based API - from the callback user might to attempt
> + * to remove callbacks/proxies. Removing of callbacks is handled by the
> + * ifpx_notify_event() function above, however only implementation really knows
> + * when notification for given proxy is finished so it is a duty of it to call
> + * this function to cleanup all proxies that has been marked for deletion.
> + */
> +void ifpx_cleanup_proxies(void);
> +
> +/* This is the internal function removing the proxy from the list. It is
> + * related to the notification function above and intended to be used by the
> + * platform implementation for the case of callback based API.
> + * During notification via callback the internal lock is released so that
> + * operation would not deadlock on an attempt to take a lock. However
> + * modification (destruction) is not really performed - instead the
> + * callbacks/proxies are marked as "to be deleted".
> + * Handling of callbacks that are "to be deleted" is done by the
> + * ifpx_notify_event() function itself however it cannot delete the proxies (in
> + * particular the proxy passed as an argument) since they might still be refered
> + * by the calling function. So it is a responsibility of the platform
> + * implementation to check after calling notification function if there are any
> + * proxies to be removed and use ifpx_proxy_destroy() to actually release them.
> + */
> +int ifpx_proxy_destroy(struct ifpx_proxy_node *px);
> +
> +/* Every implementation should provide definition of this structure:
> + * - init : called during library initialization (NULL when not needed)
> + * - listen : this function should start service listening to the network
> + * configuration events/changes,
> + * - close : this function should close the service started by listen()
> + * - get_info : this function should query system for current configuration of
> + * interface with index 'if_index'. After successful initialization of
> + * listening service this function is calle with 0 as an argument. In that
> + * case configuration of all ports should be obtained - and when this
> + * procedure completes a RTE_IFPX_CFG_DONE event should be signaled via
> + * ifpx_notify_event().
> + */
> +extern
> +struct ifpx_platform_callbacks {
> + void (*init)(void);
> + int (*listen)(void);
> + int (*close)(void);
> + void (*get_info)(int if_index);
> +} ifpx_platform;
> +
> +#endif /* _IF_PROXY_PRIV_H_ */
> diff --git a/lib/librte_if_proxy/linux/Makefile b/lib/librte_if_proxy/linux/Makefile
> new file mode 100644
> index 000000000..275b7e1e3
> --- /dev/null
> +++ b/lib/librte_if_proxy/linux/Makefile
> @@ -0,0 +1,4 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2020 Marvell International Ltd.
> +
> +SRCS += if_proxy.c
> diff --git a/lib/librte_if_proxy/linux/if_proxy.c b/lib/librte_if_proxy/linux/if_proxy.c
> new file mode 100644
> index 000000000..bf851c096
> --- /dev/null
> +++ b/lib/librte_if_proxy/linux/if_proxy.c
> @@ -0,0 +1,552 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2020 Marvell International Ltd.
> + */
> +#include <if_proxy_priv.h>
> +#include <rte_interrupts.h>
> +#include <rte_string_fns.h>
> +
> +#include <stdbool.h>
> +#include <unistd.h>
> +#include <errno.h>
> +#include <sys/socket.h>
> +#include <linux/rtnetlink.h>
> +#include <linux/if.h>
> +
> +static
> +struct rte_intr_handle ifpx_irq = {
> + .type = RTE_INTR_HANDLE_NETLINK,
> + .fd = -1,
> +};
> +
> +static
> +unsigned int ifpx_pid;
> +
> +static
> +int request_info(int type, int index)
> +{
> + static rte_spinlock_t send_lock = RTE_SPINLOCK_INITIALIZER;
> + struct info_get {
> + struct nlmsghdr h;
> + union {
> + struct ifinfomsg ifm;
> + struct ifaddrmsg ifa;
> + struct rtmsg rtm;
> + struct ndmsg ndm;
> + } __rte_aligned(NLMSG_ALIGNTO);
> + } info_req;
> + int ret;
> +
> + memset(&info_req, 0, sizeof(info_req));
> + /* First byte of these messages is family, so just make sure that this
> + * memset is enough to get all families.
> + */
> + RTE_ASSERT(AF_UNSPEC == 0);
> +
> + info_req.h.nlmsg_pid = ifpx_pid;
> + info_req.h.nlmsg_type = type;
> + info_req.h.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
> + info_req.h.nlmsg_len = offsetof(struct info_get, ifm);
> +
> + switch (type) {
> + case RTM_GETLINK:
> + info_req.h.nlmsg_len += sizeof(info_req.ifm);
> + info_req.ifm.ifi_index = index;
> + break;
> + case RTM_GETADDR:
> + info_req.h.nlmsg_len += sizeof(info_req.ifa);
> + info_req.ifa.ifa_index = index;
> + break;
> + case RTM_GETROUTE:
> + info_req.h.nlmsg_len += sizeof(info_req.rtm);
> + break;
> + case RTM_GETNEIGH:
> + info_req.h.nlmsg_len += sizeof(info_req.ndm);
> + break;
> + default:
> + IFPX_LOG(WARNING, "Unhandled message type: %d", type);
> + return -EINVAL;
> + }
> + /* Store request type (and if it is global or link specific) in 'seq'.
> + * Later it is used during handling of reply to continue requesting of
> + * information dump from system - if needed.
> + */
> + info_req.h.nlmsg_seq = index << 8 | type;
> +
> + IFPX_LOG(DEBUG, "\tRequesting msg %d for: %u", type, index);
> +
> + rte_spinlock_lock(&send_lock);
> + ret = send(ifpx_irq.fd, &info_req, info_req.h.nlmsg_len, 0);
> + if (ret < 0) {
> + IFPX_LOG(ERR, "Failed to send netlink msg: %d", errno);
> + rte_errno = errno;
> + }
> + rte_spinlock_unlock(&send_lock);
> +
> + return ret;
> +}
> +
> +static
> +void handle_link(const struct nlmsghdr *h)
> +{
> + const struct ifinfomsg *ifi = NLMSG_DATA(h);
> + int alen = h->nlmsg_len - NLMSG_LENGTH(sizeof(*ifi));
> + const struct rtattr *attrs[IFLA_MAX+1] = { NULL };
> + const struct rtattr *attr;
> + struct ifpx_proxy_node *px;
> + struct rte_ifpx_event ev;
> +
> + IFPX_LOG(DEBUG, "\tLink action (%u): %u, 0x%x/0x%x (flags/changed)",
> + ifi->ifi_index, h->nlmsg_type, ifi->ifi_flags,
> + ifi->ifi_change);
> +
> + rte_spinlock_lock(&ifpx_lock);
> + TAILQ_FOREACH(px, &ifpx_proxies, elem) {
> + if (px->info.if_index == (unsigned int)ifi->ifi_index)
> + break;
> + }
> +
> + /* Drop messages that are not associated with any proxy */
> + if (!px)
> + goto exit;
> + /* When message is a reply to request for specific interface then keep
> + * it only when it contains info for this interface.
> + */
> + if (h->nlmsg_pid == ifpx_pid && h->nlmsg_seq >> 8 &&
> + (h->nlmsg_seq >> 8) != (unsigned)ifi->ifi_index)
> + goto exit;
> +
> + for (attr = IFLA_RTA(ifi); RTA_OK(attr, alen);
> + attr = RTA_NEXT(attr, alen)) {
> + if (attr->rta_type > IFLA_MAX)
> + continue;
> + attrs[attr->rta_type] = attr;
> + }
> +
> + if (ifi->ifi_change & IFF_UP) {
> + ev.type = RTE_IFPX_LINK_CHANGE;
> + ev.link_change.is_up = ifi->ifi_flags & IFF_UP;
> + ifpx_notify_event(&ev, px);
> + }
> + if (attrs[IFLA_MTU]) {
> + uint16_t mtu = *(const int *)RTA_DATA(attrs[IFLA_MTU]);
> + if (mtu != px->info.mtu) {
> + px->info.mtu = mtu;
> + ev.type = RTE_IFPX_MTU_CHANGE;
> + ev.mtu_change.mtu = mtu;
> + ifpx_notify_event(&ev, px);
> + }
> + }
> + if (attrs[IFLA_ADDRESS]) {
> + const struct rte_ether_addr *mac =
> + RTA_DATA(attrs[IFLA_ADDRESS]);
> +
> + RTE_ASSERT(RTA_PAYLOAD(attrs[IFLA_ADDRESS]) ==
> + RTE_ETHER_ADDR_LEN);
> + if (memcmp(mac, &px->info.mac, RTE_ETHER_ADDR_LEN) != 0) {
> + rte_ether_addr_copy(mac, &px->info.mac);
> + ev.type = RTE_IFPX_MAC_CHANGE;
> + rte_ether_addr_copy(mac, &ev.mac_change.mac);
> + ifpx_notify_event(&ev, px);
> + }
> + }
> + if (h->nlmsg_pid == ifpx_pid) {
> + RTE_ASSERT((h->nlmsg_seq & 0xFF) == RTM_GETLINK);
> + /* If this is reply for specific link request (not initial
> + * global dump) then follow up with address request, otherwise
> + * just store the interface name.
> + */
> + if (h->nlmsg_seq >> 8)
> + request_info(RTM_GETADDR, ifi->ifi_index);
> + else if (!px->info.if_name[0] && attrs[IFLA_IFNAME])
> + strlcpy(px->info.if_name, RTA_DATA(attrs[IFLA_IFNAME]),
> + sizeof(px->info.if_name));
> + }
> +
> + ifpx_cleanup_proxies();
> +exit:
> + rte_spinlock_unlock(&ifpx_lock);
> +}
> +
> +static
> +void handle_addr(const struct nlmsghdr *h, bool needs_del)
> +{
> + const struct ifaddrmsg *ifa = NLMSG_DATA(h);
> + int alen = h->nlmsg_len - NLMSG_LENGTH(sizeof(*ifa));
> + const struct rtattr *attrs[IFA_MAX+1] = { NULL };
> + const struct rtattr *attr;
> + struct ifpx_proxy_node *px;
> + struct rte_ifpx_event ev;
> + const uint8_t *ip;
> +
> + IFPX_LOG(DEBUG, "\tAddr action (%u): %u, family: %u",
> + ifa->ifa_index, h->nlmsg_type, ifa->ifa_family);
> +
> + rte_spinlock_lock(&ifpx_lock);
> + TAILQ_FOREACH(px, &ifpx_proxies, elem) {
> + if (px->info.if_index == ifa->ifa_index)
> + break;
> + }
> +
> + /* Drop messages that are not associated with any proxy */
> + if (!px)
> + goto exit;
> + /* When message is a reply to request for specific interface then keep
> + * it only when it contains info for this interface.
> + */
> + if (h->nlmsg_pid == ifpx_pid && h->nlmsg_seq >> 8 &&
> + (h->nlmsg_seq >> 8) != ifa->ifa_index)
> + goto exit;
> +
> + for (attr = IFA_RTA(ifa); RTA_OK(attr, alen);
> + attr = RTA_NEXT(attr, alen)) {
> + if (attr->rta_type > IFA_MAX)
> + continue;
> + attrs[attr->rta_type] = attr;
> + }
> +
> + if (attrs[IFA_ADDRESS]) {
> + ip = RTA_DATA(attrs[IFA_ADDRESS]);
> + if (ifa->ifa_family == AF_INET) {
> + ev.type = needs_del ? RTE_IFPX_ADDR_DEL
> + : RTE_IFPX_ADDR_ADD;
> + ev.addr_change.ip =
> + RTE_IPV4(ip[0], ip[1], ip[2], ip[3]);
> + } else {
> + ev.type = needs_del ? RTE_IFPX_ADDR6_DEL
> + : RTE_IFPX_ADDR6_ADD;
> + memcpy(ev.addr6_change.ip, ip, 16);
> + }
> + ifpx_notify_event(&ev, px);
> + ifpx_cleanup_proxies();
> + }
> +exit:
> + rte_spinlock_unlock(&ifpx_lock);
> +}
> +
> +static
> +void handle_route(const struct nlmsghdr *h, bool needs_del)
> +{
> + const struct rtmsg *r = NLMSG_DATA(h);
> + int alen = h->nlmsg_len - NLMSG_LENGTH(sizeof(*r));
> + const struct rtattr *attrs[RTA_MAX+1] = { NULL };
> + const struct rtattr *attr;
> + struct rte_ifpx_event ev;
> + struct ifpx_proxy_node *px = NULL;
> + const uint8_t *ip;
> +
> + IFPX_LOG(DEBUG, "\tRoute action: %u, family: %u",
> + h->nlmsg_type, r->rtm_family);
> +
> + for (attr = RTM_RTA(r); RTA_OK(attr, alen);
> + attr = RTA_NEXT(attr, alen)) {
> + if (attr->rta_type > RTA_MAX)
> + continue;
> + attrs[attr->rta_type] = attr;
> + }
> +
> + memset(&ev, 0, sizeof(ev));
> + ev.type = RTE_IFPX_NUM_EVENTS;
> +
> + rte_spinlock_lock(&ifpx_lock);
> + if (attrs[RTA_OIF]) {
> + int if_index = *((int32_t*)RTA_DATA(attrs[RTA_OIF]));
> +
> + if (if_index > 0) {
> + TAILQ_FOREACH(px, &ifpx_proxies, elem) {
> + if (px->info.if_index == (uint32_t)if_index)
> + break;
> + }
> + }
> + }
> + /* We are only interested in routes related to the proxy interfaces and
> + * we need to have dst - otherwise skip the message.
> + */
> + if (!px || !attrs[RTA_DST])
> + goto exit;
> +
> + ip = RTA_DATA(attrs[RTA_DST]);
> + /* This is common to both IPv4/6. */
> + ev.route_change.depth = r->rtm_dst_len;
> + if (r->rtm_family == AF_INET) {
> + ev.type = needs_del ? RTE_IFPX_ROUTE_DEL
> + : RTE_IFPX_ROUTE_ADD;
> + ev.route_change.ip =
> + RTE_IPV4(ip[0], ip[1], ip[2], ip[3]);
> + } else {
> + ev.type = needs_del ? RTE_IFPX_ROUTE6_DEL
> + : RTE_IFPX_ROUTE6_ADD;
> + memcpy(ev.route6_change.ip, ip, 16);
> + }
> + if (attrs[RTA_GATEWAY]) {
> + ip = RTA_DATA(attrs[RTA_GATEWAY]);
> + if (r->rtm_family == AF_INET)
> + ev.route_change.gateway =
> + RTE_IPV4(ip[0], ip[1], ip[2], ip[3]);
> + else
> + memcpy(ev.route6_change.gateway, ip, 16);
> + }
> +
> + ifpx_notify_event(&ev, px);
> + /* Let's check for proxies to remove here too - just in case somebody
> + * removed the non-proxy related callback.
> + */
> + ifpx_cleanup_proxies();
> +exit:
> + rte_spinlock_unlock(&ifpx_lock);
> +}
> +
> +/* Link, addr and route related messages seem to have this macro defined but not
> + * neighbour one. Define one if it is missing - const qualifiers added just to
> + * silence compiler - for some reason it is not needed in equivalent macros for
> + * other messages and here compiler is complaining about (char*) cast on pointer
> + * to const.
> + */
> +#ifndef NDA_RTA
> +#define NDA_RTA(r) ((const struct rtattr*)(((const char*)(r)) + \
> + NLMSG_ALIGN(sizeof(struct ndmsg))))
> +#endif
> +
> +static
> +void handle_neigh(const struct nlmsghdr *h, bool needs_del)
> +{
> + const struct ndmsg *n = NLMSG_DATA(h);
> + int alen = h->nlmsg_len - NLMSG_LENGTH(sizeof(*n));
> + const struct rtattr *attrs[NDA_MAX+1] = { NULL };
> + const struct rtattr *attr;
> + struct ifpx_proxy_node *px;
> + struct rte_ifpx_event ev;
> + const uint8_t *ip;
> +
> + IFPX_LOG(DEBUG, "\tNeighbour action: %u, family: %u, state: %u, if: %d",
> + h->nlmsg_type, n->ndm_family, n->ndm_state, n->ndm_ifindex);
> +
> + for (attr = NDA_RTA(n); RTA_OK(attr, alen);
> + attr = RTA_NEXT(attr, alen)) {
> + if (attr->rta_type > NDA_MAX)
> + continue;
> + attrs[attr->rta_type] = attr;
> + }
> +
> + memset(&ev, 0, sizeof(ev));
> + ev.type = RTE_IFPX_NUM_EVENTS;
> +
> + rte_spinlock_lock(&ifpx_lock);
> + TAILQ_FOREACH(px, &ifpx_proxies, elem) {
> + if (px->info.if_index == (unsigned)n->ndm_ifindex)
> + break;
> + }
> + /* We need only subset of neighbourhood related to proxy interfaces.
> + * lladdr seems to be needed only for adding new entry - modifications
> + * (also reported via RTM_NEWLINK) and deletion include only dst.
> + */
> + if (!px || !attrs[NDA_DST] || (!needs_del && !attrs[NDA_LLADDR]))
> + goto exit;
> +
> + ip = RTA_DATA(attrs[NDA_DST]);
> + if (n->ndm_family == AF_INET) {
> + ev.type = needs_del ? RTE_IFPX_NEIGH_DEL
> + : RTE_IFPX_NEIGH_ADD;
> + ev.neigh_change.ip =
> + RTE_IPV4(ip[0], ip[1], ip[2], ip[3]);
> + } else {
> + ev.type = needs_del ? RTE_IFPX_NEIGH6_DEL
> + : RTE_IFPX_NEIGH6_ADD;
> + memcpy(ev.neigh6_change.ip, ip, 16);
> + }
> + if (attrs[NDA_LLADDR])
> + rte_ether_addr_copy(RTA_DATA(attrs[NDA_LLADDR]),
> + &ev.neigh_change.mac);
> +
> + ifpx_notify_event(&ev, px);
> + /* Let's check for proxies to remove here too - just in case somebody
> + * removed the non-proxy related callback.
> + */
> + ifpx_cleanup_proxies();
> +exit:
> + rte_spinlock_unlock(&ifpx_lock);
> +}
> +
> +static
> +void if_proxy_intr_callback(void *arg __rte_unused)
> +{
> + struct nlmsghdr *h;
> + struct sockaddr_nl addr;
> + socklen_t addr_len;
> + char buf[8192];
> + ssize_t len;
> +
> +restart:
> + len = recvfrom(ifpx_irq.fd, buf, sizeof(buf), 0,
> + (struct sockaddr *)&addr, &addr_len);
> + if (len < 0) {
> + if (errno == EINTR) {
> + IFPX_LOG(DEBUG, "recvmsg() interrupted");
> + goto restart;
> + }
> + IFPX_LOG(ERR, "Failed to read netlink msg: %ld (errno %d)",
> + len, errno);
> + return;
> + }
> + if (addr_len != sizeof(addr)) {
> + IFPX_LOG(ERR, "Invalid netlink addr size: %d", addr_len);
> + return;
> + }
> + IFPX_LOG(DEBUG, "Read %lu bytes (buf %lu) from %u/%u", len,
> + sizeof(buf), addr.nl_pid, addr.nl_groups);
> +
> + for (h = (struct nlmsghdr *)buf; NLMSG_OK(h, len);
> + h = NLMSG_NEXT(h, len)) {
> + IFPX_LOG(DEBUG, "Recv msg: %u (%u/%u/%u seq/flags/pid)",
> + h->nlmsg_type, h->nlmsg_seq, h->nlmsg_flags,
> + h->nlmsg_pid);
> +
> + switch (h->nlmsg_type) {
> + case RTM_NEWLINK:
> + case RTM_DELLINK:
> + handle_link(h);
> + break;
> + case RTM_NEWADDR:
> + case RTM_DELADDR:
> + handle_addr(h, h->nlmsg_type == RTM_DELADDR);
> + break;
> + case RTM_NEWROUTE:
> + case RTM_DELROUTE:
> + handle_route(h, h->nlmsg_type == RTM_DELROUTE);
> + break;
> + case RTM_NEWNEIGH:
> + case RTM_DELNEIGH:
> + handle_neigh(h, h->nlmsg_type == RTM_DELNEIGH);
> + break;
> + }
> +
> + /* If this is a reply for global request then follow up with
> + * additional requests and notify about finish.
> + */
> + if (h->nlmsg_pid == ifpx_pid && (h->nlmsg_seq >> 8) == 0 &&
> + h->nlmsg_type == NLMSG_DONE) {
Sorry, but in what scenario will the flow reach here.
> + if ((h->nlmsg_seq & 0xFF) == RTM_GETLINK)
> + request_info(RTM_GETADDR, 0);
> + else if ((h->nlmsg_seq & 0xFF) == RTM_GETADDR)
> + request_info(RTM_GETROUTE, 0);
> + else if ((h->nlmsg_seq & 0xFF) == RTM_GETROUTE)
> + request_info(RTM_GETNEIGH, 0);
> + else {
> + struct rte_ifpx_event ev = {
> + .type = RTE_IFPX_CFG_DONE
> + };
> +
> + RTE_ASSERT((h->nlmsg_seq & 0xFF) ==
> + RTM_GETNEIGH);
> + rte_spinlock_lock(&ifpx_lock);
> + ifpx_notify_event(&ev, NULL);
> + rte_spinlock_unlock(&ifpx_lock);
> + }
> + }
> + }
> + IFPX_LOG(DEBUG, "Finished msg loop: %ld bytes left", len);
> +}
> +
> +static
> +int nlink_listen(void)
> +{
> + struct sockaddr_nl addr = {
> + .nl_family = AF_NETLINK,
> + .nl_pid = 0,
> + };
> + socklen_t addr_len = sizeof(addr);
> + int ret;
> +
> + if (ifpx_irq.fd != -1) {
> + rte_errno = EBUSY;
> + return -1;
> + }
> +
> + addr.nl_groups = 1 << (RTNLGRP_LINK-1)
> + | 1 << (RTNLGRP_NEIGH-1)
> + | 1 << (RTNLGRP_IPV4_IFADDR-1)
> + | 1 << (RTNLGRP_IPV6_IFADDR-1)
> + | 1 << (RTNLGRP_IPV4_ROUTE-1)
> + | 1 << (RTNLGRP_IPV6_ROUTE-1);
> +
> + ifpx_irq.fd = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC,
> + NETLINK_ROUTE);
> + if (ifpx_irq.fd == -1) {
> + IFPX_LOG(ERR, "Failed to create netlink socket: %d", errno);
> + goto error;
> + }
> + /* Starting with kernel 4.19 you can request dump for a specific
> + * interface and kernel will filter out and send only relevant info.
> + * Otherwise NLM_F_DUMP will generate info for all interfaces and you
> + * need to filter them yourself.
> + */
> +#ifdef NETLINK_DUMP_STRICT_CHK
> + ret = 1; /* use this var also as an input param */
> + ret = setsockopt(ifpx_irq.fd, SOL_SOCKET, NETLINK_DUMP_STRICT_CHK,
> + &ret, sizeof(ret));
> + if (ret < 0) {
> + IFPX_LOG(ERR, "Failed to set socket option: %d", errno);
> + goto error;
> + }
> +#endif
> +
> + ret = bind(ifpx_irq.fd, (struct sockaddr *)&addr, addr_len);
> + if (ret < 0) {
> + IFPX_LOG(ERR, "Failed to bind socket: %d", errno);
> + goto error;
> + }
> + ret = getsockname(ifpx_irq.fd, (struct sockaddr *)&addr, &addr_len);
> + if (ret < 0) {
> + IFPX_LOG(ERR, "Failed to get socket addr: %d", errno);
> + goto error;
> + } else {
> + ifpx_pid = addr.nl_pid;
> + IFPX_LOG(DEBUG, "Assigned port ID: %u", addr.nl_pid);
> + }
> +
> + ret = rte_intr_callback_register(&ifpx_irq, if_proxy_intr_callback,
> + NULL);
> + if (ret == 0)
> + return 0;
> +
> +error:
> + rte_errno = errno;
> + if (ifpx_irq.fd != -1) {
> + close(ifpx_irq.fd);
> + ifpx_irq.fd = -1;
> + }
> + return -1;
> +}
> +
> +static
> +int nlink_close(void)
> +{
> + int ec;
> +
> + if (ifpx_irq.fd < 0)
> + return -EBADFD;
> +
> + do
> + ec = rte_intr_callback_unregister(&ifpx_irq,
> + if_proxy_intr_callback, NULL);
> + while (ec == -EAGAIN); /* unlikely but possible - at least I think so */
> +
> + close(ifpx_irq.fd);
> + ifpx_irq.fd = -1;
> + ifpx_pid = 0;
> +
> + return 0;
> +}
> +
> +static
> +void nlink_get_info(int if_index)
> +{
> + if (ifpx_irq.fd != -1)
> + request_info(RTM_GETLINK, if_index);
> +}
> +
> +struct ifpx_platform_callbacks ifpx_platform = {
> + .init = NULL,
> + .listen = nlink_listen,
> + .close = nlink_close,
> + .get_info = nlink_get_info,
> +};
> diff --git a/lib/librte_if_proxy/meson.build b/lib/librte_if_proxy/meson.build
> new file mode 100644
> index 000000000..f0c1a6e15
> --- /dev/null
> +++ b/lib/librte_if_proxy/meson.build
> @@ -0,0 +1,19 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2020 Marvell International Ltd.
> +
> +# Currently only implemented on Linux
> +if not is_linux
> + build = false
> + reason = 'only supported on linux'
> +endif
> +
> +version = 1
> +allow_experimental_apis = true
> +
> +deps += ['ethdev']
> +sources = files('if_proxy_common.c')
> +headers = files('rte_if_proxy.h')
> +
> +if is_linux
> + sources += files('linux/if_proxy.c')
> +endif
> diff --git a/lib/librte_if_proxy/rte_if_proxy.h b/lib/librte_if_proxy/rte_if_proxy.h
> new file mode 100644
> index 000000000..e620319b3
> --- /dev/null
> +++ b/lib/librte_if_proxy/rte_if_proxy.h
> @@ -0,0 +1,561 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2020 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_IF_PROXY_H_
> +#define _RTE_IF_PROXY_H_
> +
> +/**
> + * @file
> + * RTE IF Proxy library
> + *
> + * The IF Proxy library allows for monitoring of system network configuration
> + * and configuration of DPDK ports by using usual system utilities (like the
> + * ones from iproute2 package).
> + *
> + * It is based on the notion of "proxy interface" which actually can be any DPDK
> + * port which is also visible to the system - that is it has non-zero 'if_index'
> + * field in 'rte_eth_dev_info' structure.
> + *
> + * If application doesn't have any such port (or doesn't want to use it for
> + * proxy) it can create one by calling:
> + *
> + * proxy_id = rte_ifpx_create(RTE_IFPX_DEFAULT);
> + *
> + * This function is just a wrapper that constructs valid 'devargs' string based
> + * on the proxy type chosen (currently Tap or KNI) and creates the interface by
> + * calling rte_ifpx_dev_create().
> + *
> + * Once one has DPDK port capable of being proxy one can bind target DPDK port
> + * to it by calling.
> + *
> + * rte_ifpx_port_bind(port_id, proxy_id);
> + *
> + * This binding is a logical one - there is no automatic packet forwarding
> + * between port and it's proxy since the library doesn't know the structure of
> + * application's packet processing. It remains application responsibility to
> + * forward the packets from/to proxy port (by calling the usual DPDK RX/TX burst
> + * API). However when the library notes some change to the proxy interface it
> + * will simply call appropriate callback with 'port_id' of the DPDK port that is
> + * bound to this proxy interface. The binding can be 1 to many - that is many
> + * ports can point to one proxy - in that case registered callbacks will be
> + * called for every bound port.
> + *
> + * The callbacks that are used for notifications are described by the
> + * 'rte_ifpx_callbacks' structure and they are registered by calling:
> + *
> + * rte_ifpx_callbacks_register(&cbs);
> + *
> + * Finally the application should call:
> + *
> + * rte_ifpx_listen();
> + *
> + * which will query system for present network configuration and start listening
> + * to its changes.
> + */
> +
> +#include <rte_eal.h>
> +#include <rte_ethdev.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * Enum naming the type of proxy to create.
> + *
> + * @see rte_ifpx_create()
> + */
> +enum rte_ifpx_proxy_type {
> + RTE_IFPX_DEFAULT, /**< Use default proxy type for given arch. */
> + RTE_IFPX_TAP, /**< Use Tap based port for proxy. */
> + RTE_IFPX_KNI /**< Use KNI based port for proxy. */
> +};
> +
> +/**
> + * Create DPDK port that can serve as an interface proxy.
> + *
> + * This function is just a wrapper around rte_ifpx_create_by_devarg() that
> + * constructs its 'devarg' argument based on type of proxy requested.
> + *
> + * @param type
> + * A type of proxy to create.
> + *
> + * @return
> + * DPDK port id on success, RTE_MAX_ETHPORTS otherwise.
> + *
> + * @see enum rte_ifpx_type
> + * @see rte_ifpx_create_by_devarg()
> + */
> +__rte_experimental
> +uint16_t rte_ifpx_proxy_create(enum rte_ifpx_proxy_type type);
> +
> +/**
> + * Create DPDK port that can serve as an interface proxy.
> + *
> + * @param devarg
> + * A string passed to rte_dev_probe() to create proxy port.
> + *
> + * @return
> + * DPDK port id on success, RTE_MAX_ETHPORTS otherwise.
> + */
> +__rte_experimental
> +uint16_t rte_ifpx_proxy_create_by_devarg(const char *devarg);
> +
> +/**
> + * Remove DPDK proxy port.
> + *
> + * In addition to removing the proxy port the bindings (if any) are cleared.
> + *
> + * @param proxy_id
> + * Port id of the proxy that should be removed.
> + *
> + * @return
> + * 0 on success, negative on error.
> + */
> +__rte_experimental
> +int rte_ifpx_proxy_destroy(uint16_t proxy_id);
> +
> +/**
> + * The rte_ifpx_event_type enum lists all possible event types that can be
> + * signaled by this library. To learn what events are supported on your
> + * platform call rte_ifpx_events_available().
> + *
> + * NOTE - do not reorder these enums freely, their values need to correspond to
> + * the order of the callbacks in struct rte_ifpx_callbacks.
> + */
> +enum rte_ifpx_event_type {
> + RTE_IFPX_MAC_CHANGE, /**< @see struct rte_ifpx_mac_change */
> + RTE_IFPX_MTU_CHANGE, /**< @see struct rte_ifpx_mtu_change */
> + RTE_IFPX_LINK_CHANGE, /**< @see struct rte_ifpx_link_change */
> + RTE_IFPX_ADDR_ADD, /**< @see struct rte_ifpx_addr_change */
> + RTE_IFPX_ADDR_DEL, /**< @see struct rte_ifpx_addr_change */
> + RTE_IFPX_ADDR6_ADD, /**< @see struct rte_ifpx_addr6_change */
> + RTE_IFPX_ADDR6_DEL, /**< @see struct rte_ifpx_addr6_change */
> + RTE_IFPX_ROUTE_ADD, /**< @see struct rte_ifpx_route_change */
> + RTE_IFPX_ROUTE_DEL, /**< @see struct rte_ifpx_route_change */
> + RTE_IFPX_ROUTE6_ADD, /**< @see struct rte_ifpx_route6_change */
> + RTE_IFPX_ROUTE6_DEL, /**< @see struct rte_ifpx_route6_change */
> + RTE_IFPX_NEIGH_ADD, /**< @see struct rte_ifpx_neigh_change */
> + RTE_IFPX_NEIGH_DEL, /**< @see struct rte_ifpx_neigh_change */
> + RTE_IFPX_NEIGH6_ADD, /**< @see struct rte_ifpx_neigh6_change */
> + RTE_IFPX_NEIGH6_DEL, /**< @see struct rte_ifpx_neigh6_change */
> + RTE_IFPX_CFG_DONE, /**< This event is a lib specific event - it is
> + * signaled when initial network configuration
> + * query is finished and has no event data.
> + */
> + RTE_IFPX_NUM_EVENTS,
> +};
> +
> +/**
> + * Get the bit mask of implemented events/callbacks for this platform.
> + *
> + * @return
> + * Bit mask of events/callbacks implemented: each event type can be tested by
> + * checking bit (1 << ev) where 'ev' is one of the rte_ifpx_event_type enum
> + * values.
> + * @see enum rte_ifpx_event_type
> + */
> +__rte_experimental
> +uint64_t rte_ifpx_events_available(void);
> +
> +/**
> + * The rte_ifpx_event defines structure used to pass notification event to
> + * application. Each event type has its own dedicated inner structure - these
> + * structures are also used when using callbacks notifications.
> + */
> +struct rte_ifpx_event {
> + enum rte_ifpx_event_type type;
> + union {
> + /** Structure used to pass notification about MAC change of the
> + * proxy interface.
> + * @see RTE_IFPX_MAC_CHANGE
> + */
> + struct rte_ifpx_mac_change {
> + uint16_t port_id;
> + struct rte_ether_addr mac;
> + } mac_change;
> + /** Structure used to pass notification about MTU change.
> + * @see RTE_IFPX_MTU_CHANGE
> + */
> + struct rte_ifpx_mtu_change {
> + uint16_t port_id;
> + uint16_t mtu;
> + } mtu_change;
> + /** Structure used to pass notification about link going
> + * up/down.
> + * @see RTE_IFPX_LINK_CHANGE
> + */
> + struct rte_ifpx_link_change {
> + uint16_t port_id;
> + int is_up;
> + } link_change;
> + /** Structure used to pass notification about IPv4 address being
> + * added/removed. All IPv4 addresses reported by this library
> + * are in host order.
> + * @see RTE_IFPX_ADDR_ADD
> + * @see RTE_IFPX_ADDR_DEL
> + */
> + struct rte_ifpx_addr_change {
> + uint16_t port_id;
> + uint32_t ip;
> + } addr_change;
> + /** Structure used to pass notification about IPv6 address being
> + * added/removed.
> + * @see RTE_IFPX_ADDR6_ADD
> + * @see RTE_IFPX_ADDR6_DEL
> + */
> + struct rte_ifpx_addr6_change {
> + uint16_t port_id;
> + uint8_t ip[16];
> + } addr6_change;
> + /** Structure used to pass notification about IPv4 route being
> + * added/removed.
> + * @see RTE_IFPX_ROUTE_ADD
> + * @see RTE_IFPX_ROUTE_DEL
> + */
> + struct rte_ifpx_route_change {
> + uint16_t port_id;
> + uint8_t depth;
> + uint32_t ip;
> + uint32_t gateway;
> + } route_change;
> + /** Structure used to pass notification about IPv6 route being
> + * added/removed.
> + * @see RTE_IFPX_ROUTE6_ADD
> + * @see RTE_IFPX_ROUTE6_DEL
> + */
> + struct rte_ifpx_route6_change {
> + uint16_t port_id;
> + uint8_t depth;
> + uint8_t ip[16];
> + uint8_t gateway[16];
> + } route6_change;
> + /** Structure used to pass notification about IPv4 neighbour
> + * info changes.
> + * @see RTE_IFPX_NEIGH_ADD
> + * @see RTE_IFPX_NEIGH_DEL
> + */
> + struct rte_ifpx_neigh_change {
> + uint16_t port_id;
> + struct rte_ether_addr mac;
> + uint32_t ip;
> + } neigh_change;
> + /** Structure used to pass notification about IPv6 neighbour
> + * info changes.
> + * @see RTE_IFPX_NEIGH6_ADD
> + * @see RTE_IFPX_NEIGH6_DEL
> + */
> + struct rte_ifpx_neigh6_change {
> + uint16_t port_id;
> + struct rte_ether_addr mac;
> + uint8_t ip[16];
> + } neigh6_change;
> + /* This structure is used internally - to abstract common parts
> + * of proxy/port related events and to be able to refer to this
> + * union without giving it a name.
> + */
> + struct {
> + uint16_t port_id;
> + } data;
> + };
> +};
> +
> +/**
> + * This library can deliver notification about network configuration changes
> + * either by the use of registered callbacks and/or by queueing change events to
> + * configured notification queues. The logic used is:
> + * 1. If there is callback registered for given event type it is called. In
> + * case of many ports to one proxy binding, this callback is called for every
> + * port bound.
> + * 2. If this callback returns non-zero value (for any of ports in case of
> + * many-1 bindings) the handling of an event is considered as complete.
> + * 3. Otherwise the event is added to each configured event queue. The event is
> + * allocated with malloc() so after dequeueing and handling the application
> + * should deallocate it with free().
> + *
> + * This dual notification mechanism is meant to provide some flexibility to
> + * application writer. For example, if you store your data in a single writer/
> + * many readers coherent data structure you could just update this structure
> + * from the callback. If you keep separate copy per lcore/port you could make
> + * some common preparations (if applicable) in the callback, return 0 and use
> + * notification queues to pick up the change and update data structures. Or you
> + * could skip the callbacks altogether and just use notification queues - and
> + * configure them at the level appropriate for your application design (one
> + * global / one per lcore / one per port ...).
> + */
> +
> +/**
> + * Add notification queue to the list of queues.
> + *
> + * @param r
> + * Ring used for queueing of notification events - application can assume that
> + * there is only one producer.
> + * @return
> + * 0 on success, negative otherwise.
> + */
> +int rte_ifpx_queue_add(struct rte_ring *r);
> +
> +/**
> + * Remove notification queue from the list of queues.
> + *
> + * @param r
> + * Notification ring used for queueing of notification events (previously
> + * added via rte_ifpx_queue_add()).
> + * @return
> + * 0 on success, negative otherwise.
> + */
> +int rte_ifpx_queue_remove(struct rte_ring *r);
> +
> +/**
> + * This structure groups the callbacks that might be called as a notification
> + * events for changing network configuration. Not every platform might
> + * implement all of them and you can query the availability with
> + * rte_ifpx_callbacks_available() function.
> + * @see rte_ifpx_events_available()
> + * @see rte_ifpx_callbacks_register()
> + */
> +struct rte_ifpx_callbacks {
> + int (*mac_change)(const struct rte_ifpx_mac_change *event);
> + /**< Callback for notification about MAC change of the proxy interface.
> + * This callback (as all other port related callbacks) is called for
> + * each port (with its port_id as a first argument) bound to the proxy
> + * interface for which change has been observed.
> + * @see struct rte_ifpx_mac_change
> + * @return non-zero if event handling is finished
> + */
> + int (*mtu_change)(const struct rte_ifpx_mtu_change *event);
> + /**< Callback for notification about MTU change.
> + * @see struct rte_ifpx_mtu_change
> + * @return non-zero if event handling is finished
> + */
> + int (*link_change)(const struct rte_ifpx_link_change *event);
> + /**< Callback for notification about link going up/down.
> + * @see struct rte_ifpx_link_change
> + * @return non-zero if event handling is finished
> + */
> + int (*addr_add)(const struct rte_ifpx_addr_change *event);
> + /**< Callback for notification about IPv4 address being added.
> + * @see struct rte_ifpx_addr_change
> + * @return non-zero if event handling is finished
> + */
> + int (*addr_del)(const struct rte_ifpx_addr_change *event);
> + /**< Callback for notification about IPv4 address removal.
> + * @see struct rte_ifpx_addr_change
> + * @return non-zero if event handling is finished
> + */
> + int (*addr6_add)(const struct rte_ifpx_addr6_change *event);
> + /**< Callback for notification about IPv6 address being added.
> + * @see struct rte_ifpx_addr6_change
> + */
> + int (*addr6_del)(const struct rte_ifpx_addr6_change *event);
> + /**< Callback for notification about IPv4 address removal.
> + * @see struct rte_ifpx_addr6_change
> + * @return non-zero if event handling is finished
> + */
> + /* Please note that "route" callbacks might be also called when user
> + * adds address to the interface (that is in addition to address related
> + * callbacks).
> + */
> + int (*route_add)(const struct rte_ifpx_route_change *event);
> + /**< Callback for notification about IPv4 route being added.
> + * @see struct rte_ifpx_route_change
> + * @return non-zero if event handling is finished
> + */
> + int (*route_del)(const struct rte_ifpx_route_change *event);
> + /**< Callback for notification about IPv4 route removal.
> + * @see struct rte_ifpx_route_change
> + * @return non-zero if event handling is finished
> + */
> + int (*route6_add)(const struct rte_ifpx_route6_change *event);
> + /**< Callback for notification about IPv6 route being added.
> + * @see struct rte_ifpx_route6_change
> + * @return non-zero if event handling is finished
> + */
> + int (*route6_del)(const struct rte_ifpx_route6_change *event);
> + /**< Callback for notification about IPv6 route removal.
> + * @see struct rte_ifpx_route6_change
> + * @return non-zero if event handling is finished
> + */
> + int (*neigh_add)(const struct rte_ifpx_neigh_change *event);
> + /**< Callback for notification about IPv4 neighbour being added.
> + * @see struct rte_ifpx_neigh_change
> + * @return non-zero if event handling is finished
> + */
> + int (*neigh_del)(const struct rte_ifpx_neigh_change *event);
> + /**< Callback for notification about IPv4 neighbour removal.
> + * @see struct rte_ifpx_neigh_change
> + * @return non-zero if event handling is finished
> + */
> + int (*neigh6_add)(const struct rte_ifpx_neigh6_change *event);
> + /**< Callback for notification about IPv6 neighbour being added.
> + * @see struct rte_ifpx_neigh_change
> + */
> + int (*neigh6_del)(const struct rte_ifpx_neigh6_change *event);
> + /**< Callback for notification about IPv6 neighbour removal.
> + * @see struct rte_ifpx_neigh_change
> + * @return non-zero if event handling is finished
> + */
> + int (*cfg_done)(void);
> + /**< Lib specific callback - called when initial network configuration
> + * query is finished.
> + * @return non-zero if event handling is finished
> + */
> +};
> +
> +/**
> + * Register proxy callbacks.
> + *
> + * This function registers callbacks to be called upon appropriate network
> + * event notification.
> + *
> + * @param cbs
> + * Set of callbacks that will be called. The library does not take any
> + * ownership of the pointer passed - the callbacks are stored internally.
> + *
> + * @return
> + * 0 on success, negative otherwise.
> + */
> +__rte_experimental
> +int rte_ifpx_callbacks_register(const struct rte_ifpx_callbacks *cbs);
> +
> +/**
> + * Unregister proxy callbacks.
> + *
> + * This function unregisters callbacks previously registered with
> + * rte_ifpx_callbacks_register().
> + *
> + * @param cbs
> + * Handle/pointer returned on previous callback registration.
> + *
> + * @return
> + * 0 on success, negative otherwise.
> + */
> +__rte_experimental
> +void rte_ifpx_callbacks_unregister(void);
> +
> +/**
> + * Bind the port to its proxy.
> + *
> + * After calling this function all network configuration of the proxy (and it's
> + * changes) will be passed to given port by calling registered callbacks with
> + * 'port_id' as an argument.
> + *
> + * Note: since both arguments are of the same type in order to not mix them and
> + * ease remembering the order the first one is kept the same for bind/unbind.
> + *
> + * @param port_id
> + * Id of the port to be bound.
> + * @param proxy_id
> + * Id of the proxy the port needs to be bound to.
> + * @return
> + * 0 on success, negative on error.
> + */
> +__rte_experimental
> +int rte_ifpx_port_bind(uint16_t port_id, uint16_t proxy_id);
> +
> +/**
> + * Unbind the port from its proxy.
> + *
> + * After calling this function registered callbacks will no longer be called for
> + * this port (but they might be called for other ports in one to many binding
> + * scenario).
> + *
> + * @param port_id
> + * Id of the port to unbind.
> + * @return
> + * 0 on success, negative on error.
> + */
> +__rte_experimental
> +int rte_ifpx_port_unbind(uint16_t port_id);
> +
> +/**
> + * Get the system network configuration and start listening to its changes.
> + *
> + * @return
> + * 0 on success, negative otherwise.
> + */
> +__rte_experimental
> +int rte_ifpx_listen(void);
> +
> +/**
> + * Remove all bindings/callbacks and stop listening to network configuration.
> + *
> + * @return
> + * 0 on success, negative otherwise.
> + */
> +__rte_experimental
> +int rte_ifpx_close(void);
> +
> +/**
> + * Get the id of the proxy the port is bound to.
> + *
> + * @param port_id
> + * Id of the port for which to get proxy.
> + * @return
> + * Port id of the proxy on success, RTE_MAX_ETHPORTS on error.
> + */
> +__rte_experimental
> +uint16_t rte_ifpx_proxy_get(uint16_t port_id);
> +
> +/**
> + * Test for port acting as a proxy.
> + *
> + * @param port_id
> + * Id of the port.
> + * @return
> + * 1 if port acts as a proxy, 0 otherwise.
> + */
> +static inline
> +int rte_ifpx_is_proxy(uint16_t port_id)
> +{
> + return rte_ifpx_proxy_get(port_id) == port_id;
> +}
> +
> +/**
> + * Get the ids of the ports bound to the proxy.
> + *
> + * @param proxy_id
> + * Id of the proxy for which to get ports.
> + * @param ports
> + * Array where to store the port ids.
> + * @param num
> + * Size of the 'ports' array.
> + * @return
> + * The number of ports bound to given proxy. Note that bound ports are filled
> + * in 'ports' array up to its size but the return value is always the total
> + * number of ports bound - so you can make call first with NULL/0 to query for
> + * the size of the buffer to create or call it with the buffer you have and
> + * later check if it was large enough.
> + */
> +__rte_experimental
> +unsigned int rte_ifpx_port_get(uint16_t proxy_id,
> + uint16_t *ports, unsigned int num);
> +
> +/**
> + * The structure containing some properties of the proxy interface.
> + */
> +struct rte_ifpx_info {
> + unsigned int if_index; /* entry valid iff if_index != 0 */
> + uint16_t mtu;
> + struct rte_ether_addr mac;
> + char if_name[RTE_ETH_NAME_MAX_LEN];
> +};
> +
> +/**
> + * Get the properties of the proxy interface. Argument can be either id of the
> + * proxy or an id of a port that is bound to it.
> + *
> + * @param port_id
> + * Id of the port (or proxy) for which to get proxy properties.
> + * @return
> + * Pointer to the proxy information structure.
> + */
> +__rte_experimental
> +const struct rte_ifpx_info *rte_ifpx_info_get(uint16_t port_id);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_IF_PROXY_H_ */
> diff --git a/lib/librte_if_proxy/rte_if_proxy_version.map b/lib/librte_if_proxy/rte_if_proxy_version.map
> new file mode 100644
> index 000000000..e2093137d
> --- /dev/null
> +++ b/lib/librte_if_proxy/rte_if_proxy_version.map
> @@ -0,0 +1,19 @@
> +EXPERIMENTAL {
> + global:
> +
> + rte_ifpx_proxy_create;
> + rte_ifpx_proxy_create_by_devarg;
> + rte_ifpx_proxy_destroy;
> + rte_ifpx_events_available;
> + rte_ifpx_callbacks_register;
> + rte_ifpx_callbacks_unregister;
> + rte_ifpx_port_bind;
> + rte_ifpx_port_unbind;
> + rte_ifpx_listen;
> + rte_ifpx_close;
> + rte_ifpx_proxy_get;
> + rte_ifpx_port_get;
> + rte_ifpx_info_get;
> +
> + local: *;
> +};
> diff --git a/lib/meson.build b/lib/meson.build
> index 0af3efab2..c913b33dd 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -19,7 +19,7 @@ libraries = [
> 'acl', 'bbdev', 'bitratestats', 'cfgfile',
> 'compressdev', 'cryptodev',
> 'distributor', 'efd', 'eventdev',
> - 'gro', 'gso', 'ip_frag', 'jobstats',
> + 'gro', 'gso', 'if_proxy', 'ip_frag', 'jobstats',
> 'kni', 'latencystats', 'lpm', 'member',
> 'power', 'pdump', 'rawdev',
> 'rcu', 'rib', 'reorder', 'sched', 'security', 'stack', 'vhost',
> --
> 2.17.1
>
next prev parent reply other threads:[~2020-03-31 12:36 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-06 16:41 [dpdk-dev] [PATCH 0/4] Introduce IF proxy library Andrzej Ostruszka
2020-03-06 16:41 ` [dpdk-dev] [PATCH 1/4] lib: introduce IF Proxy library Andrzej Ostruszka
2020-03-31 12:36 ` Harman Kalra [this message]
2020-03-31 15:37 ` Andrzej Ostruszka [C]
2020-04-01 5:29 ` Varghese, Vipin
2020-04-01 20:08 ` Andrzej Ostruszka [C]
2020-04-08 3:04 ` Varghese, Vipin
2020-04-08 18:13 ` Andrzej Ostruszka [C]
2020-03-06 16:41 ` [dpdk-dev] [PATCH 2/4] if_proxy: add library documentation Andrzej Ostruszka
2020-03-06 16:41 ` [dpdk-dev] [PATCH 3/4] if_proxy: add simple functionality test Andrzej Ostruszka
2020-03-06 16:41 ` [dpdk-dev] [PATCH 4/4] if_proxy: add example application Andrzej Ostruszka
2020-03-06 17:17 ` [dpdk-dev] [PATCH 0/4] Introduce IF proxy library Andrzej Ostruszka
2020-03-10 11:10 ` [dpdk-dev] [PATCH v2 " Andrzej Ostruszka
2020-03-10 11:10 ` [dpdk-dev] [PATCH v2 1/4] lib: introduce IF Proxy library Andrzej Ostruszka
2020-07-02 0:34 ` Stephen Hemminger
2020-07-07 20:13 ` Andrzej Ostruszka [C]
2020-07-08 16:07 ` Morten Brørup
2020-07-09 8:43 ` Andrzej Ostruszka [C]
2020-07-22 0:40 ` Thomas Monjalon
2020-07-22 8:45 ` Jerin Jacob
2020-07-22 8:56 ` Thomas Monjalon
2020-07-22 9:09 ` Jerin Jacob
2020-07-22 9:27 ` Thomas Monjalon
2020-07-22 9:54 ` Jerin Jacob
2020-07-23 14:09 ` [dpdk-dev] [EXT] " Andrzej Ostruszka [C]
2020-03-10 11:10 ` [dpdk-dev] [PATCH v2 2/4] if_proxy: add library documentation Andrzej Ostruszka
2020-03-10 11:10 ` [dpdk-dev] [PATCH v2 3/4] if_proxy: add simple functionality test Andrzej Ostruszka
2020-03-10 11:10 ` [dpdk-dev] [PATCH v2 4/4] if_proxy: add example application Andrzej Ostruszka
2020-03-25 8:08 ` [dpdk-dev] [PATCH v2 0/4] Introduce IF proxy library David Marchand
2020-03-25 11:11 ` Morten Brørup
2020-03-26 17:42 ` Andrzej Ostruszka
2020-04-02 13:48 ` Andrzej Ostruszka [C]
2020-04-03 17:19 ` Thomas Monjalon
2020-04-03 19:09 ` Jerin Jacob
2020-04-03 21:18 ` Morten Brørup
2020-04-03 21:57 ` Thomas Monjalon
2020-04-04 10:18 ` Jerin Jacob
2020-04-10 10:41 ` Morten Brørup
2020-04-04 18:30 ` Andrzej Ostruszka [C]
2020-04-04 19:58 ` Thomas Monjalon
2020-04-10 10:03 ` Morten Brørup
2020-04-10 12:28 ` Jerin Jacob
2020-03-26 12:41 ` Andrzej Ostruszka
2020-03-30 19:23 ` Andrzej Ostruszka
2020-04-03 21:42 ` Thomas Monjalon
2020-04-04 18:07 ` Andrzej Ostruszka [C]
2020-04-04 19:51 ` Thomas Monjalon
2020-04-16 16:11 ` [dpdk-dev] [PATCH " Stephen Hemminger
2020-04-16 16:49 ` Jerin Jacob
2020-04-16 17:04 ` Stephen Hemminger
2020-04-16 17:26 ` Andrzej Ostruszka [C]
2020-04-16 17:27 ` Jerin Jacob
2020-04-16 17:12 ` Andrzej Ostruszka [C]
2020-04-16 17:19 ` Stephen Hemminger
2020-05-04 8:53 ` [dpdk-dev] [PATCH v3 " Andrzej Ostruszka
2020-05-04 8:53 ` [dpdk-dev] [PATCH v3 1/4] lib: introduce IF Proxy library Andrzej Ostruszka
2020-05-04 8:53 ` [dpdk-dev] [PATCH v3 2/4] if_proxy: add library documentation Andrzej Ostruszka
2020-05-04 8:53 ` [dpdk-dev] [PATCH v3 3/4] if_proxy: add simple functionality test Andrzej Ostruszka
2020-05-04 8:53 ` [dpdk-dev] [PATCH v3 4/4] if_proxy: add example application Andrzej Ostruszka
2020-06-22 9:21 ` [dpdk-dev] [PATCH v4 0/4] Introduce IF proxy library Andrzej Ostruszka
2020-06-22 9:21 ` [dpdk-dev] [PATCH v4 1/4] lib: introduce IF Proxy library Andrzej Ostruszka
2020-06-22 9:21 ` [dpdk-dev] [PATCH v4 2/4] if_proxy: add library documentation Andrzej Ostruszka
2020-06-22 9:21 ` [dpdk-dev] [PATCH v4 3/4] if_proxy: add simple functionality test Andrzej Ostruszka
2020-06-22 9:21 ` [dpdk-dev] [PATCH v4 4/4] if_proxy: add example application Andrzej Ostruszka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200331123627.GA10566@outlook.office365.com \
--to=hkalra@marvell.com \
--cc=aostruszka@marvell.com \
--cc=dev@dpdk.org \
--cc=thomas@monjalon.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).