* [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode
@ 2018-03-13 12:50 Nelio Laranjeiro
2018-03-13 12:50 ` [dpdk-dev] [PATCH 1/5] net/mlx5: add VF information in configuration Nelio Laranjeiro
` (12 more replies)
0 siblings, 13 replies; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-13 12:50 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
When MLX5 behaves in VF mode and the hypervisor have **trusted** this VF, to
be able to receive specific traffic some requests must be done to configure
the NIC. There is no API currently available to do it though Verbs, but there
is in Linux side using Netlink. This patches uses the wrapper [1].
The specific cases are:
- Enable/disable promiscuous mode.
- Enable/disable allmulti mode.
- Add/remove mac addresses.
This series applies on top of [1][2] and on top of [3]:
[1] https://dpdk.org/ml/archives/dev/2018-March/092621.html
[2] https://dpdk.org/dev/patchwork/patch/36052/
[3] https://dpdk.org/dev/patchwork/patch/35981/
Nelio Laranjeiro (5):
net/mlx5: add VF information in configuration
net/mlx5: retrieve device index from Netlink
net/mlx5: use Netlink to add/remove MAC addresses
net/mlx5: use Netlink to enable promisc/allmulti
net/mlx5: add a parameter for Netlink support in VF
doc/guides/nics/mlx5.rst | 13 +
drivers/net/mlx5/Makefile | 3 +-
drivers/net/mlx5/mlx5.c | 24 ++
drivers/net/mlx5/mlx5.h | 10 +
drivers/net/mlx5/mlx5_mac.c | 25 +-
drivers/net/mlx5/mlx5_trigger.c | 27 +-
drivers/net/mlx5/mlx5_vf.c | 600 ++++++++++++++++++++++++++++++++++++++++
7 files changed, 698 insertions(+), 4 deletions(-)
create mode 100644 drivers/net/mlx5/mlx5_vf.c
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH 1/5] net/mlx5: add VF information in configuration
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
@ 2018-03-13 12:50 ` Nelio Laranjeiro
2018-03-14 17:10 ` Adrien Mazarguil
2018-03-13 12:50 ` [dpdk-dev] [PATCH 2/5] net/mlx5: retrieve device index from Netlink Nelio Laranjeiro
` (11 subsequent siblings)
12 siblings, 1 reply; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-13 12:50 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
drivers/net/mlx5/mlx5.c | 10 ++++++++++
drivers/net/mlx5/mlx5.h | 1 +
2 files changed, 11 insertions(+)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 95dd49420..d5cc85d19 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -597,6 +597,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
int err = 0;
struct ibv_context *attr_ctx = NULL;
struct ibv_device_attr_ex device_attr;
+ unsigned int vf;
unsigned int mps;
unsigned int cqe_comp;
unsigned int tunnel_en = 0;
@@ -646,6 +647,14 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
continue;
DRV_LOG(INFO, "PCI information matches, using device \"%s\"",
list[i]->name);
+ vf = ((pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX4VF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX4LXVF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX5VF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX5EXVF));
attr_ctx = mlx5_glue->open_device(list[i]);
rte_errno = errno;
err = rte_errno;
@@ -871,6 +880,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
DRV_LOG(DEBUG,
"hardware Rx end alignment padding is %ssupported",
(config.hw_padding ? "" : "not "));
+ config.vf = vf;
config.tso = ((device_attr_ex.tso_caps.max_tso > 0) &&
(device_attr_ex.tso_caps.supported_qpts &
(1 << IBV_QPT_RAW_PACKET)));
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index faacfd9d6..9191b2949 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -78,6 +78,7 @@ struct mlx5_dev_config {
unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
unsigned int hw_padding:1; /* End alignment padding is supported. */
+ unsigned int vf:1; /* This is a VF. */
unsigned int mps:2; /* Multi-packet send supported mode. */
unsigned int tunnel_en:1;
/* Whether tunnel stateless offloads are supported. */
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH 2/5] net/mlx5: retrieve device index from Netlink
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
2018-03-13 12:50 ` [dpdk-dev] [PATCH 1/5] net/mlx5: add VF information in configuration Nelio Laranjeiro
@ 2018-03-13 12:50 ` Nelio Laranjeiro
2018-03-14 17:10 ` Adrien Mazarguil
2018-03-13 12:50 ` [dpdk-dev] [PATCH 3/5] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
` (10 subsequent siblings)
12 siblings, 1 reply; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-13 12:50 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
This patch new file is not compiled yet, it starts a series necessary to
fix some issues with VF devices.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
drivers/net/mlx5/mlx5_vf.c | 134 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 134 insertions(+)
create mode 100644 drivers/net/mlx5/mlx5_vf.c
diff --git a/drivers/net/mlx5/mlx5_vf.c b/drivers/net/mlx5/mlx5_vf.c
new file mode 100644
index 000000000..357407f56
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_vf.c
@@ -0,0 +1,134 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 6WIND S.A.
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+#include <linux/rtnetlink.h>
+#include <linux/netlink.h>
+
+#include <rte_netlink.h>
+
+#include "mlx5.h"
+#include "mlx5_utils.h"
+
+/* Data exchanged between Netlink library and PMD. */
+struct mlx5_vf_iface {
+ struct rte_eth_dev *dev; /**< Pointer to Ethernet device. */
+ int iface_idx; /**< Network Interface index. */
+};
+
+/**
+ * Parse Netlink message to retrieve the interface index.
+ *
+ * @param nh
+ * Pointer to Netlink Message Header.
+ * @param arg
+ * PMD data register with this callback.
+ *
+ * @return
+ * 0 on success, -1 otherwise.
+ */
+static int
+mlx5_vf_iface_idx_cb(struct nlmsghdr *nh, void *arg)
+{
+ struct mlx5_vf_iface *data = arg;
+ struct rte_eth_dev *dev = data->dev;
+ const struct ether_addr *mac = &dev->data->mac_addrs[0];
+ struct ifinfomsg *iface;
+ struct rtattr *attribute;
+ int len;
+
+ /**
+ * Leave right away if the index does not match its initialised value.
+ * Interface index has already been found.
+ */
+ if (data->iface_idx != -1)
+ return 0;
+ iface = NLMSG_DATA(nh);
+ len = nh->nlmsg_len - NLMSG_LENGTH(sizeof(*iface));
+ for (attribute = IFLA_RTA(iface);
+ RTA_OK(attribute, len);
+ attribute = RTA_NEXT(attribute, len)) {
+ if (attribute->rta_type == IFLA_ADDRESS &&
+ !memcmp(RTA_DATA(attribute), mac, ETHER_ADDR_LEN)) {
+#ifndef NDEBUG
+ struct ether_addr m;
+
+ memcpy(&m, RTA_DATA(attribute), ETHER_ADDR_LEN);
+ DRV_LOG(DEBUG,
+ "port %u interace %d MAC address"
+ " %02X:%02X:%02X:%02X:%02X:%02X",
+ dev->data->port_id,
+ iface->ifi_index,
+ m.addr_bytes[0], m.addr_bytes[1],
+ m.addr_bytes[2], m.addr_bytes[3],
+ m.addr_bytes[4], m.addr_bytes[5]);
+#endif
+ data->iface_idx = iface->ifi_index;
+ return 0;
+ }
+ }
+ data->iface_idx = -1;
+ return 0;
+}
+
+/**
+ * Retrieve interface Netlink interface index.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ *
+ * @return
+ * Interface index, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_vf_iface_idx(struct rte_eth_dev *dev)
+{
+ struct nl_req {
+ struct nlmsghdr hdr;
+ struct rtgenmsg gen;
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct rtgenmsg)),
+ .nlmsg_type = RTM_GETLINK,
+ .nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP | NLM_F_ROOT,
+ },
+ .gen = {
+ .rtgen_family = AF_UNSPEC,
+ },
+ };
+ struct mlx5_vf_iface data = {
+ .dev = dev,
+ .iface_idx = -1,
+ };
+ int fd;
+ int ret;
+
+ fd = rte_nl_init(RTMGRP_LINK);
+ if (fd < 0) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = rte_nl_send(fd, &req.hdr);
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = rte_nl_recv(fd, mlx5_vf_iface_idx_cb, &data);
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ rte_nl_final(fd);
+ if (data.iface_idx == -1) {
+ rte_errno = EAGAIN;
+ goto error;
+ }
+ return data.iface_idx;
+error:
+ if (fd >= 0)
+ rte_nl_final(fd);
+ DRV_LOG(DEBUG, "port %u cannot retrieve Netlink Interface index %s",
+ dev->data->port_id, strerror(rte_errno));
+ return -rte_errno;
+}
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH 3/5] net/mlx5: use Netlink to add/remove MAC addresses
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
2018-03-13 12:50 ` [dpdk-dev] [PATCH 1/5] net/mlx5: add VF information in configuration Nelio Laranjeiro
2018-03-13 12:50 ` [dpdk-dev] [PATCH 2/5] net/mlx5: retrieve device index from Netlink Nelio Laranjeiro
@ 2018-03-13 12:50 ` Nelio Laranjeiro
2018-03-14 17:10 ` Adrien Mazarguil
2018-03-13 12:50 ` [dpdk-dev] [PATCH 4/5] net/mlx5: use Netlink to enable promisc/allmulti Nelio Laranjeiro
` (9 subsequent siblings)
12 siblings, 1 reply; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-13 12:50 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
VF devices are not able to receive traffic unless it fully requests it
though Netlink. This will cause the request to be processed by the PF
which will add/remove the MAC address to the VF table if the VF is trusted.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
drivers/net/mlx5/Makefile | 3 +-
drivers/net/mlx5/mlx5.c | 7 +
drivers/net/mlx5/mlx5.h | 6 +
drivers/net/mlx5/mlx5_mac.c | 25 +++-
drivers/net/mlx5/mlx5_vf.c | 315 ++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 353 insertions(+), 3 deletions(-)
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index afda4118f..bbda0d1ff 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_vf.c
ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y)
INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE)
@@ -84,7 +85,7 @@ LDLIBS += -libverbs -lmlx5
endif
LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
-LDLIBS += -lrte_bus_pci
+LDLIBS += -lrte_bus_pci -lrte_netlink
# A few warnings cannot be avoided in external headers.
CFLAGS += -Wno-error=cast-qual
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index d5cc85d19..e966884bd 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -205,6 +205,13 @@ mlx5_dev_close(struct rte_eth_dev *dev)
rte_free(priv->reta_idx);
if (priv->primary_socket)
mlx5_socket_uninit(dev);
+ if (priv->config.vf) {
+ ret = mlx5_vf_mac_addr_flush(dev);
+ if (ret)
+ DRV_LOG(WARNING,
+ "port %u some MAC address remains configured",
+ dev->data->port_id);
+ }
ret = mlx5_hrxq_ibv_verify(dev);
if (ret)
DRV_LOG(WARNING, "port %u some hash Rx queue still remain",
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 9191b2949..a4333e6a5 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -298,4 +298,10 @@ struct mlx5_mr *mlx5_mr_get(struct rte_eth_dev *dev, struct rte_mempool *mp);
int mlx5_mr_release(struct mlx5_mr *mr);
int mlx5_mr_verify(struct rte_eth_dev *dev);
+/* mlx5_vf.c */
+
+int mlx5_vf_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac);
+int mlx5_vf_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac);
+int mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev);
+
#endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index 01c7ba17a..5137ad11d 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -67,11 +67,24 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[ETHER_ADDR_LEN])
void
mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
{
+ struct priv *priv = dev->data->dev_private;
+ const int vf = priv->config.vf;
+ int ret;
+
assert(index < MLX5_MAX_MAC_ADDRESSES);
+ if (index && vf) {
+ ret = mlx5_vf_mac_addr_remove(dev,
+ &dev->data->mac_addrs[index]);
+ if (ret) {
+ DRV_LOG(WARNING,
+ "port %u cannot remove mac address at index %d",
+ dev->data->port_id, index);
+ return;
+ }
+ }
memset(&dev->data->mac_addrs[index], 0, sizeof(struct ether_addr));
if (!dev->data->promiscuous) {
- int ret = mlx5_traffic_restart(dev);
-
+ ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot remove mac address: %s",
dev->data->port_id, strerror(rte_errno));
@@ -97,6 +110,8 @@ int
mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
uint32_t index, uint32_t vmdq __rte_unused)
{
+ struct priv *priv = dev->data->dev_private;
+ const int vf = priv->config.vf;
unsigned int i;
assert(index < MLX5_MAX_MAC_ADDRESSES);
@@ -111,6 +126,12 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
rte_errno = EADDRINUSE;
return -rte_errno;
}
+ if (index && vf) {
+ int ret = mlx5_vf_mac_addr_add(dev, mac);
+
+ if (ret)
+ return ret;
+ }
dev->data->mac_addrs[index] = *mac;
if (!dev->data->promiscuous)
return mlx5_traffic_restart(dev);
diff --git a/drivers/net/mlx5/mlx5_vf.c b/drivers/net/mlx5/mlx5_vf.c
index 357407f56..3db8ee0eb 100644
--- a/drivers/net/mlx5/mlx5_vf.c
+++ b/drivers/net/mlx5/mlx5_vf.c
@@ -11,12 +11,29 @@
#include "mlx5.h"
#include "mlx5_utils.h"
+/*
+ * Define NDA_RTA as defined in iproute2 sources.
+ *
+ * see in iproute2 sources file include/libnetlink.h
+ */
+#ifndef NDA_RTA
+#define NDA_RTA(r) \
+ ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct ndmsg))))
+#endif
+
/* Data exchanged between Netlink library and PMD. */
struct mlx5_vf_iface {
struct rte_eth_dev *dev; /**< Pointer to Ethernet device. */
int iface_idx; /**< Network Interface index. */
};
+/* Add/Remove mac address through Metlink */
+struct mlx5_vf_mac_addr {
+ struct ether_addr (*mac)[];
+ /**< MAC Address handled by the device. */
+ int mac_n; /**< Number of address in the array. */
+};
+
/**
* Parse Netlink message to retrieve the interface index.
*
@@ -132,3 +149,301 @@ mlx5_vf_iface_idx(struct rte_eth_dev *dev)
dev->data->port_id, strerror(rte_errno));
return -rte_errno;
}
+
+/**
+ * Parse Netlink message to retrieve the bridge MAC address.
+ *
+ * @param nh
+ * Pointer to Netlink Message Header.
+ * @param arg
+ * PMD data register with this callback.
+ *
+ * @return
+ * 0 on success, -1 otherwise.
+ */
+static int
+mlx5_vf_mac_addr_cb(struct nlmsghdr *nh, void *arg)
+{
+ struct mlx5_vf_mac_addr *data = arg;
+ struct ndmsg *r = NLMSG_DATA(nh);
+ struct rtattr *attribute;
+ int len;
+
+ len = nh->nlmsg_len - NLMSG_LENGTH(sizeof(*r));
+ for (attribute = NDA_RTA(r);
+ RTA_OK(attribute, len);
+ attribute = RTA_NEXT(attribute, len)) {
+ if (attribute->rta_type == NDA_LLADDR) {
+ if (data->mac_n == MLX5_MAX_MAC_ADDRESSES) {
+ DRV_LOG(WARNING,
+ "not enough room to finalise the"
+ " request");
+ return -1;
+ }
+#ifndef NDEBUG
+ struct ether_addr m;
+
+ memcpy(&m, RTA_DATA(attribute), ETHER_ADDR_LEN);
+ DRV_LOG(DEBUG,
+ "brige MAC address"
+ " %02X:%02X:%02X:%02X:%02X:%02X",
+ m.addr_bytes[0], m.addr_bytes[1],
+ m.addr_bytes[2], m.addr_bytes[3],
+ m.addr_bytes[4], m.addr_bytes[5]);
+#endif
+ memcpy(&(*data->mac)[data->mac_n++],
+ RTA_DATA(attribute), ETHER_ADDR_LEN);
+ }
+ }
+ return 0;
+}
+
+/**
+ * Get bridge MAC addresses.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac[out]
+ * Pointer to the array table of MAC addresses to fill.
+ * Its size should be of MLX5_MAX_MAC_ADDRESSES.
+ * @param mac_n[out]
+ * Number of entries filled in MAC array.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_vf_mac_addr_list(struct rte_eth_dev *dev, struct ether_addr (*mac)[],
+ int *mac_n)
+{
+ int iface_idx = mlx5_vf_iface_idx(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ifinfomsg ifm;
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
+ .nlmsg_type = RTM_GETNEIGH,
+ .nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
+ },
+ .ifm = {
+ .ifi_family = PF_BRIDGE,
+ .ifi_index = iface_idx,
+ },
+ };
+ struct mlx5_vf_mac_addr data = {
+ .mac = mac,
+ .mac_n = 0,
+ };
+ int fd;
+ int ret;
+
+ fd = rte_nl_init(RTMGRP_LINK);
+ if (fd < 0) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = rte_nl_request(fd, &req.hdr, &req.ifm, sizeof(struct ifinfomsg));
+ if (ret < 0) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = rte_nl_recv(fd, mlx5_vf_mac_addr_cb, &data);
+ if (ret < 0) {
+ rte_errno = errno;
+ goto error;
+ }
+ rte_nl_final(fd);
+ *mac_n = data.mac_n;
+ return 0;
+error:
+ if (fd >= 0)
+ rte_nl_final(fd);
+ DRV_LOG(DEBUG, "port %u cannot retrieve MAC address list %s",
+ dev->data->port_id, strerror(rte_errno));
+ return -rte_errno;
+}
+
+/**
+ * Modify the mac address neighbour table with Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac
+ * MAC address to consider.
+ * @param add
+ * 1 to add the MAC address, 0 to remove the MAC address.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_vf_mac_addr_modify(struct rte_eth_dev *dev, struct ether_addr *mac,
+ int add)
+{
+ int iface_idx = mlx5_vf_iface_idx(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ndmsg ndm;
+ struct rtattr rta;
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ndmsg)),
+ .nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE |
+ NLM_F_EXCL | NLM_F_ACK,
+ .nlmsg_type = add ? RTM_NEWNEIGH : RTM_DELNEIGH,
+ },
+ .ndm = {
+ .ndm_family = PF_BRIDGE,
+ .ndm_state = NUD_NOARP | NUD_PERMANENT,
+ .ndm_ifindex = iface_idx,
+ .ndm_flags = NTF_SELF,
+ },
+ .rta = {
+ .rta_type = NDA_LLADDR,
+ .rta_len = RTA_LENGTH(ETHER_ADDR_LEN),
+ },
+ };
+ int fd;
+ int ret;
+
+ memcpy(RTA_DATA(&req.rta), mac, ETHER_ADDR_LEN);
+ req.hdr.nlmsg_len = NLMSG_ALIGN(req.hdr.nlmsg_len) +
+ RTA_ALIGN(req.rta.rta_len);
+ fd = rte_nl_init(RTMGRP_LINK);
+ if (fd < 0) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = rte_nl_send(fd, &req.hdr);
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = rte_nl_recv_ack(fd);
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ rte_nl_final(fd);
+ return 0;
+error:
+ if (fd >= 0)
+ rte_nl_final(fd);
+ DRV_LOG(DEBUG,
+ "port %u cannot %s MAC address %02X:%02X:%02X:%02X:%02X:%02X"
+ " %s",
+ dev->data->port_id,
+ add ? "add" : "remove",
+ mac->addr_bytes[0], mac->addr_bytes[1],
+ mac->addr_bytes[2], mac->addr_bytes[3],
+ mac->addr_bytes[4], mac->addr_bytes[5],
+ strerror(rte_errno));
+ return -rte_errno;
+}
+
+/**
+ * add a MAC address.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac_addr
+ * MAC address to register.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_vf_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac)
+{
+ struct ether_addr macs[MLX5_MAX_MAC_ADDRESSES];
+ int macs_n = 0;
+ int i;
+ int mac_index = MLX5_MAX_MAC_ADDRESSES;
+ int ret;
+
+ ret = mlx5_vf_mac_addr_list(dev, &macs, &macs_n);
+ if (ret)
+ return ret;
+ for (i = 0; i != macs_n; ++i) {
+ if (!memcmp(&macs[i], mac, ETHER_ADDR_LEN)) {
+ mac_index = i;
+ break;
+ }
+ }
+ if (mac_index != MLX5_MAX_MAC_ADDRESSES) {
+ DRV_LOG(INFO, "port %u MAC address already added",
+ dev->data->port_id);
+ rte_errno = EADDRINUSE;
+ return -rte_errno;
+ }
+ return mlx5_vf_mac_addr_modify(dev, mac, 1);
+}
+
+/**
+ * Remove a MAC address.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param index
+ * MAC address to remove.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_vf_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac)
+{
+ struct ether_addr macs[MLX5_MAX_MAC_ADDRESSES];
+ int macs_n = 0;
+ int i;
+ int mac_index = MLX5_MAX_MAC_ADDRESSES;
+ int ret;
+
+ ret = mlx5_vf_mac_addr_list(dev, &macs, &macs_n);
+ if (ret)
+ return ret;
+ for (i = 0; i != macs_n; ++i) {
+ if (!memcmp(&macs[i], mac, ETHER_ADDR_LEN)) {
+ mac_index = i;
+ break;
+ }
+ }
+ if (mac_index == MLX5_MAX_MAC_ADDRESSES) {
+ DRV_LOG(INFO, "port %u MAC address not found",
+ dev->data->port_id);
+ rte_errno = EINVAL;
+ return -rte_errno;
+ }
+ return mlx5_vf_mac_addr_modify(dev, mac, 0);
+}
+
+/**
+ * Flush all added MAC addresses.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev)
+{
+ int i;
+ const struct ether_addr mac_null = { .addr_bytes = { 0 }, };
+
+ /* Skip the default mac at index 0. */
+ for (i = 1; i != MLX5_MAX_MAC_ADDRESSES; ++i) {
+ struct ether_addr *m = &dev->data->mac_addrs[i];
+
+ if (memcmp(&mac_null, m, ETHER_ADDR_LEN)) {
+ int ret;
+
+ ret = mlx5_vf_mac_addr_remove(dev, m);
+ if (ret)
+ return ret;
+ }
+ }
+ return 0;
+}
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH 4/5] net/mlx5: use Netlink to enable promisc/allmulti
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (2 preceding siblings ...)
2018-03-13 12:50 ` [dpdk-dev] [PATCH 3/5] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
@ 2018-03-13 12:50 ` Nelio Laranjeiro
2018-03-14 17:11 ` Adrien Mazarguil
2018-03-13 12:50 ` [dpdk-dev] [PATCH 5/5] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
` (8 subsequent siblings)
12 siblings, 1 reply; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-13 12:50 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
VF devices are not able to receive promisc or allmulti traffic unless it
fully requests it though Netlink. This will cause the request to be
processed by the PF which will handle the request and enable it.
This requires the VF to be trusted by the PF.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
drivers/net/mlx5/mlx5.h | 2 +
drivers/net/mlx5/mlx5_trigger.c | 27 ++++++++++-
drivers/net/mlx5/mlx5_vf.c | 102 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 130 insertions(+), 1 deletion(-)
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index a4333e6a5..245235641 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -303,5 +303,7 @@ int mlx5_mr_verify(struct rte_eth_dev *dev);
int mlx5_vf_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac);
int mlx5_vf_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac);
int mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev);
+int mlx5_vf_promisc(struct rte_eth_dev *dev, int enable);
+int mlx5_vf_allmulti(struct rte_eth_dev *dev, int enable);
#endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 6bb4ffb14..3f21797ff 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -278,8 +278,23 @@ mlx5_traffic_enable(struct rte_eth_dev *dev)
unsigned int j;
int ret;
- if (priv->isolated)
+ if (priv->isolated) {
+ if (priv->config.vf) {
+ if (dev->data->promiscuous) {
+ ret = mlx5_vf_promisc(dev, 1);
+ rte_errno = ret;
+ if (ret)
+ goto error;
+ }
+ if (dev->data->all_multicast) {
+ ret = mlx5_vf_allmulti(dev, 1);
+ rte_errno = ret;
+ if (ret)
+ goto error;
+ }
+ }
return 0;
+ }
if (dev->data->promiscuous) {
struct rte_flow_item_eth promisc = {
.dst.addr_bytes = "\x00\x00\x00\x00\x00\x00",
@@ -287,6 +302,11 @@ mlx5_traffic_enable(struct rte_eth_dev *dev)
.type = 0,
};
+ if (priv->config.vf) {
+ ret = mlx5_vf_promisc(dev, 1);
+ if (ret)
+ goto error;
+ }
ret = mlx5_ctrl_flow(dev, &promisc, &promisc);
if (ret)
goto error;
@@ -298,6 +318,11 @@ mlx5_traffic_enable(struct rte_eth_dev *dev)
.type = 0,
};
+ if (priv->config.vf) {
+ ret = mlx5_vf_allmulti(dev, 1);
+ if (ret)
+ goto error;
+ }
ret = mlx5_ctrl_flow(dev, &multicast, &multicast);
if (ret)
goto error;
diff --git a/drivers/net/mlx5/mlx5_vf.c b/drivers/net/mlx5/mlx5_vf.c
index 3db8ee0eb..cf71e79d9 100644
--- a/drivers/net/mlx5/mlx5_vf.c
+++ b/drivers/net/mlx5/mlx5_vf.c
@@ -447,3 +447,105 @@ mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev)
}
return 0;
}
+
+/**
+ * Enable promisc/allmulti though Netlink
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param flags
+ * IFF_PROMISC for promiscuous, IFF_ALLMULTI for allmulti.
+ * @param en
+ * 1 to enable, 0 to disable.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_vf_device_flags(struct rte_eth_dev *dev, uint32_t flags, int enable)
+{
+ int iface_idx = mlx5_vf_iface_idx(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ifinfomsg ifi;
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
+ .nlmsg_type = RTM_NEWLINK,
+ .nlmsg_flags = NLM_F_REQUEST,
+ },
+ .ifi = {
+ .ifi_flags = enable ? flags : 0,
+ .ifi_change = flags,
+ .ifi_index = iface_idx,
+ },
+ };
+ int fd;
+ int ret;
+
+ assert(!(flags & ~(IFF_PROMISC | IFF_ALLMULTI)));
+ fd = rte_nl_init(NETLINK_ROUTE);
+ if (fd < 0) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = rte_nl_send(fd, &req.hdr);
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ rte_nl_final(fd);
+ return 0;
+error:
+ if (fd >= 0)
+ rte_nl_final(fd);
+ return -rte_errno;
+}
+
+/**
+ * Enable promisc though Netlink
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param en
+ * 1 to enable promisc, 0 to disable it.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_vf_promisc(struct rte_eth_dev *dev, int enable)
+{
+ int ret = mlx5_vf_device_flags(dev, IFF_PROMISC, enable);
+
+ if (ret)
+ DRV_LOG(DEBUG,
+ "port %u cannot %s promisc mode: Netlink error %s",
+ dev->data->port_id, enable ? "enable" : "disable",
+ strerror(rte_errno));
+ return ret;
+}
+
+/**
+ * Enable allmulti though Netlink
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param en
+ * 1 to enable promisc, 0 to disable it.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_vf_allmulti(struct rte_eth_dev *dev, int enable)
+{
+ int ret = mlx5_vf_device_flags(dev, IFF_ALLMULTI, enable);
+
+ if (ret)
+ DRV_LOG(DEBUG,
+ "port %u cannot %s allmulti mode: Netlink error %s",
+ dev->data->port_id, enable ? "enable" : "disable",
+ strerror(rte_errno));
+ return ret;
+}
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH 5/5] net/mlx5: add a parameter for Netlink support in VF
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (3 preceding siblings ...)
2018-03-13 12:50 ` [dpdk-dev] [PATCH 4/5] net/mlx5: use Netlink to enable promisc/allmulti Nelio Laranjeiro
@ 2018-03-13 12:50 ` Nelio Laranjeiro
2018-03-14 17:11 ` Adrien Mazarguil
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 0/3] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (7 subsequent siblings)
12 siblings, 1 reply; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-13 12:50 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
All Netlink request the PMD will do can also be done by a iproute2 command
line interface, letting the operator configure the VF behavior without
having to modify the application nor reaching PMD limits (e.g. MAC
address number limit).
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
doc/guides/nics/mlx5.rst | 13 ++++++++++++
drivers/net/mlx5/mlx5.c | 7 ++++++
drivers/net/mlx5/mlx5.h | 1 +
drivers/net/mlx5/mlx5_vf.c | 53 ++++++++++++++++++++++++++++++++++++++++++++--
4 files changed, 72 insertions(+), 2 deletions(-)
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 46d26e4c8..54e6f327b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -135,6 +135,11 @@ Limitations
- Flows with a VXLAN Network Identifier equal (or ends to be equal)
to 0 are not supported.
- VXLAN TSO and checksum offloads are not supported on VM.
+- VF: Flow rules matching a specific MAC address are only triggers if the MAC
+ has been previously added by the application.
+ Adding more than MLX5_MAX_MAC_ADDRESSES MAC addresses is not supported in VF
+ mode. **Note** needs a Linux kernel v4.16 or higher or Mellanox OFED
+ installed.
Statistics
----------
@@ -335,6 +340,14 @@ Run-time configuration
Enabled by default.
+- ``vf_nl_en`` parameter [int]
+
+ A nonzero value enables Netlink requests from the VF to add/remove MAC
+ addresses or/and enable/disable promiscuous/allmulticast on the Netdevice.
+ Otherwise the according configuration must be run with Linux iproute2 tools.
+
+ Enabled by default on VF.
+
Prerequisites
-------------
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index e966884bd..d1ca0837c 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -68,6 +68,9 @@
/* Device parameter to enable hardware Rx vector. */
#define MLX5_RX_VEC_EN "rx_vec_en"
+/* Activate Netlink support in VF mode. */
+#define MLX5_VF_NL_EN "vf_nl_en"
+
#ifndef HAVE_IBV_MLX5_MOD_MPW
#define MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED (1 << 2)
#define MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW (1 << 3)
@@ -414,6 +417,8 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
config->tx_vec_en = !!tmp;
} else if (strcmp(MLX5_RX_VEC_EN, key) == 0) {
config->rx_vec_en = !!tmp;
+ } else if (strcmp(MLX5_VF_NL_EN, key) == 0) {
+ config->vf_nl_en = !!tmp;
} else {
DRV_LOG(WARNING, "%s: unknown parameter", key);
rte_errno = EINVAL;
@@ -445,6 +450,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
MLX5_TXQ_MAX_INLINE_LEN,
MLX5_TX_VEC_EN,
MLX5_RX_VEC_EN,
+ MLX5_VF_NL_EN,
NULL,
};
struct rte_kvargs *kvlist;
@@ -750,6 +756,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
.txq_inline = MLX5_ARG_UNSET,
.txqs_inline = MLX5_ARG_UNSET,
.inline_max_packet_sz = MLX5_ARG_UNSET,
+ .vf_nl_en = 1,
};
len = snprintf(name, sizeof(name), PCI_PRI_FMT,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 245235641..8c3e925af 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -88,6 +88,7 @@ struct mlx5_dev_config {
unsigned int tx_vec_en:1; /* Tx vector is enabled. */
unsigned int rx_vec_en:1; /* Rx vector is enabled. */
unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
+ unsigned int vf_nl_en:1; /* Enable Netlink request in VF mode. */
unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
unsigned int ind_table_max_size; /* Maximum indirection table size. */
int txq_inline; /* Maximum packet size for inlining. */
diff --git a/drivers/net/mlx5/mlx5_vf.c b/drivers/net/mlx5/mlx5_vf.c
index cf71e79d9..18b02365d 100644
--- a/drivers/net/mlx5/mlx5_vf.c
+++ b/drivers/net/mlx5/mlx5_vf.c
@@ -361,7 +361,16 @@ mlx5_vf_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac)
int i;
int mac_index = MLX5_MAX_MAC_ADDRESSES;
int ret;
+ struct priv *priv = dev->data->dev_private;
+ if (!priv->config.vf_nl_en) {
+ DRV_LOG(WARNING,
+ "port %u Netlink requests are disabled, add the MAC"
+ " though bridge command: bridge fdb add <mac> dev"
+ " <device>",
+ dev->data->port_id);
+ return 0;
+ }
ret = mlx5_vf_mac_addr_list(dev, &macs, &macs_n);
if (ret)
return ret;
@@ -399,7 +408,16 @@ mlx5_vf_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac)
int i;
int mac_index = MLX5_MAX_MAC_ADDRESSES;
int ret;
+ struct priv *priv = dev->data->dev_private;
+ if (!priv->config.vf_nl_en) {
+ DRV_LOG(WARNING,
+ "port %u Netlink requests are disabled, delete the MAC"
+ " though bridge command: bridge fdb del <mac> dev"
+ " <device>",
+ dev->data->port_id);
+ return 0;
+ }
ret = mlx5_vf_mac_addr_list(dev, &macs, &macs_n);
if (ret)
return ret;
@@ -432,7 +450,16 @@ mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev)
{
int i;
const struct ether_addr mac_null = { .addr_bytes = { 0 }, };
+ struct priv *priv = dev->data->dev_private;
+ if (!priv->config.vf_nl_en) {
+ DRV_LOG(WARNING,
+ "port %u Netlink requests are disabled, flush added"
+ " MAC though bridge command: bridge fdb del <mac> dev"
+ " <device>",
+ dev->data->port_id);
+ return 0;
+ }
/* Skip the default mac at index 0. */
for (i = 1; i != MLX5_MAX_MAC_ADDRESSES; ++i) {
struct ether_addr *m = &dev->data->mac_addrs[i];
@@ -516,8 +543,19 @@ mlx5_vf_device_flags(struct rte_eth_dev *dev, uint32_t flags, int enable)
int
mlx5_vf_promisc(struct rte_eth_dev *dev, int enable)
{
- int ret = mlx5_vf_device_flags(dev, IFF_PROMISC, enable);
+ int ret;
+ struct priv *priv = dev->data->dev_private;
+ if (!priv->config.vf_nl_en) {
+ DRV_LOG(WARNING,
+ "port %u Netlink requests are disabled, %s promisc"
+ " though ip link command: ip link set <device> promisc"
+ " %s",
+ dev->data->port_id, enable ? "enable" : "disable",
+ enable ? "on" : "off");
+ return 0;
+ }
+ ret = mlx5_vf_device_flags(dev, IFF_PROMISC, enable);
if (ret)
DRV_LOG(DEBUG,
"port %u cannot %s promisc mode: Netlink error %s",
@@ -540,8 +578,19 @@ mlx5_vf_promisc(struct rte_eth_dev *dev, int enable)
int
mlx5_vf_allmulti(struct rte_eth_dev *dev, int enable)
{
- int ret = mlx5_vf_device_flags(dev, IFF_ALLMULTI, enable);
+ int ret;
+ struct priv *priv = dev->data->dev_private;
+ if (!priv->config.vf_nl_en) {
+ DRV_LOG(WARNING,
+ "port %u Netlink requests are disabled, %s allmulti"
+ " though ip link command: ip link set <device>"
+ " allmulticast %s",
+ dev->data->port_id, enable ? "enable" : "disable",
+ enable ? "on" : "off");
+ return 0;
+ }
+ ret = mlx5_vf_device_flags(dev, IFF_ALLMULTI, enable);
if (ret)
DRV_LOG(DEBUG,
"port %u cannot %s allmulti mode: Netlink error %s",
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH 1/5] net/mlx5: add VF information in configuration
2018-03-13 12:50 ` [dpdk-dev] [PATCH 1/5] net/mlx5: add VF information in configuration Nelio Laranjeiro
@ 2018-03-14 17:10 ` Adrien Mazarguil
0 siblings, 0 replies; 32+ messages in thread
From: Adrien Mazarguil @ 2018-03-14 17:10 UTC (permalink / raw)
To: Nelio Laranjeiro; +Cc: Yongseok Koh, dev
On Tue, Mar 13, 2018 at 01:50:35PM +0100, Nelio Laranjeiro wrote:
> Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Perhaps with a line that states the reason given similar code was dropped by
prior patch "net/mlx5: fix sriov flag", for instance:
Subsequent patches will need a convenient method to determine if the
underlying device is a VF. Make this information part of the configuration
structure.
Besides that:
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
--
Adrien Mazarguil
6WIND
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH 2/5] net/mlx5: retrieve device index from Netlink
2018-03-13 12:50 ` [dpdk-dev] [PATCH 2/5] net/mlx5: retrieve device index from Netlink Nelio Laranjeiro
@ 2018-03-14 17:10 ` Adrien Mazarguil
0 siblings, 0 replies; 32+ messages in thread
From: Adrien Mazarguil @ 2018-03-14 17:10 UTC (permalink / raw)
To: Nelio Laranjeiro; +Cc: Yongseok Koh, dev
On Tue, Mar 13, 2018 at 01:50:36PM +0100, Nelio Laranjeiro wrote:
> This patch new file is not compiled yet, it starts a series necessary to
> fix some issues with VF devices.
>
> Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The fact this new file is not compiled in at this point won't be obvious
when running git bisect and makes validation more difficult. I suggest to
simply merge it into the next patch.
A few more comments on this code below.
> ---
> drivers/net/mlx5/mlx5_vf.c | 134 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 134 insertions(+)
> create mode 100644 drivers/net/mlx5/mlx5_vf.c
>
> diff --git a/drivers/net/mlx5/mlx5_vf.c b/drivers/net/mlx5/mlx5_vf.c
Regarding the name of this file, it looks to me like all the enclosed
functions could work equally well with non-VF devices. While this series is
targeted at VFs, internal functions are actually more versatile.
I suggest a more generic name, e.g. "mlx5_nl.c".
> new file mode 100644
> index 000000000..357407f56
> --- /dev/null
> +++ b/drivers/net/mlx5/mlx5_vf.c
> @@ -0,0 +1,134 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2018 6WIND S.A.
> + * Copyright 2018 Mellanox Technologies, Ltd.
> + */
> +
> +#include <linux/rtnetlink.h>
> +#include <linux/netlink.h>
> +
> +#include <rte_netlink.h>
> +
> +#include "mlx5.h"
> +#include "mlx5_utils.h"
> +
> +/* Data exchanged between Netlink library and PMD. */
> +struct mlx5_vf_iface {
> + struct rte_eth_dev *dev; /**< Pointer to Ethernet device. */
> + int iface_idx; /**< Network Interface index. */
> +};
Minor documentation nits here and elsewhere, you should remove the random
capitalization of "Interface", "Message", "Header" and so on in the middle
of sentences.
> +
> +/**
> + * Parse Netlink message to retrieve the interface index.
> + *
> + * @param nh
> + * Pointer to Netlink Message Header.
> + * @param arg
> + * PMD data register with this callback.
> + *
> + * @return
> + * 0 on success, -1 otherwise.
Looks like it only returns 0, never -1.
You should probably describe that ((struct mlx5_vf_iface)arg)->iface_idx is
left to -1 when its index can't be found.
> + */
> +static int
> +mlx5_vf_iface_idx_cb(struct nlmsghdr *nh, void *arg)
> +{
> + struct mlx5_vf_iface *data = arg;
> + struct rte_eth_dev *dev = data->dev;
> + const struct ether_addr *mac = &dev->data->mac_addrs[0];
> + struct ifinfomsg *iface;
> + struct rtattr *attribute;
> + int len;
> +
> + /**
> + * Leave right away if the index does not match its initialised value.
> + * Interface index has already been found.
> + */
> + if (data->iface_idx != -1)
> + return 0;
> + iface = NLMSG_DATA(nh);
> + len = nh->nlmsg_len - NLMSG_LENGTH(sizeof(*iface));
> + for (attribute = IFLA_RTA(iface);
> + RTA_OK(attribute, len);
> + attribute = RTA_NEXT(attribute, len)) {
> + if (attribute->rta_type == IFLA_ADDRESS &&
> + !memcmp(RTA_DATA(attribute), mac, ETHER_ADDR_LEN)) {
Hmm, I'm not sure a matching MAC address is safe enough to determine it's
the right netdevice. Could be a bridge (br0) interface or any other virtual
thing. Although not great, you should probably rely on /sys as in
mlx5_get_ifname().
> +#ifndef NDEBUG
> + struct ether_addr m;
> +
> + memcpy(&m, RTA_DATA(attribute), ETHER_ADDR_LEN);
> + DRV_LOG(DEBUG,
> + "port %u interace %d MAC address"
interace => interface
> + " %02X:%02X:%02X:%02X:%02X:%02X",
> + dev->data->port_id,
> + iface->ifi_index,
> + m.addr_bytes[0], m.addr_bytes[1],
> + m.addr_bytes[2], m.addr_bytes[3],
> + m.addr_bytes[4], m.addr_bytes[5]);
> +#endif
> + data->iface_idx = iface->ifi_index;
> + return 0;
> + }
> + }
> + data->iface_idx = -1;
> + return 0;
> +}
> +
> +/**
> + * Retrieve interface Netlink interface index.
> + *
> + * @param dev
> + * Pointer to Ethernet device.
> + *
> + * @return
> + * Interface index, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mlx5_vf_iface_idx(struct rte_eth_dev *dev)
> +{
> + struct nl_req {
> + struct nlmsghdr hdr;
> + struct rtgenmsg gen;
> + } req = {
> + .hdr = {
> + .nlmsg_len = NLMSG_LENGTH(sizeof(struct rtgenmsg)),
> + .nlmsg_type = RTM_GETLINK,
> + .nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP | NLM_F_ROOT,
> + },
> + .gen = {
> + .rtgen_family = AF_UNSPEC,
Extra space before AF_UNSPEC.
> + },
> + };
> + struct mlx5_vf_iface data = {
> + .dev = dev,
> + .iface_idx = -1,
> + };
> + int fd;
> + int ret;
> +
> + fd = rte_nl_init(RTMGRP_LINK);
> + if (fd < 0) {
> + rte_errno = errno;
> + goto error;
> + }
> + ret = rte_nl_send(fd, &req.hdr);
> + if (ret == -1) {
> + rte_errno = errno;
> + goto error;
> + }
> + ret = rte_nl_recv(fd, mlx5_vf_iface_idx_cb, &data);
> + if (ret == -1) {
> + rte_errno = errno;
> + goto error;
> + }
> + rte_nl_final(fd);
> + if (data.iface_idx == -1) {
> + rte_errno = EAGAIN;
> + goto error;
> + }
> + return data.iface_idx;
> +error:
> + if (fd >= 0)
> + rte_nl_final(fd);
> + DRV_LOG(DEBUG, "port %u cannot retrieve Netlink Interface index %s",
> + dev->data->port_id, strerror(rte_errno));
> + return -rte_errno;
> +}
> --
> 2.11.0
>
Due to possible MAC addresses collisions, I suggest a simpler approach:
replacing all this code with a combination of mlx5_get_ifname() and
if_nametoindex(). Thoughts?
--
Adrien Mazarguil
6WIND
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH 3/5] net/mlx5: use Netlink to add/remove MAC addresses
2018-03-13 12:50 ` [dpdk-dev] [PATCH 3/5] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
@ 2018-03-14 17:10 ` Adrien Mazarguil
0 siblings, 0 replies; 32+ messages in thread
From: Adrien Mazarguil @ 2018-03-14 17:10 UTC (permalink / raw)
To: Nelio Laranjeiro; +Cc: Yongseok Koh, dev
On Tue, Mar 13, 2018 at 01:50:37PM +0100, Nelio Laranjeiro wrote:
> VF devices are not able to receive traffic unless it fully requests it
> though Netlink. This will cause the request to be processed by the PF
> which will add/remove the MAC address to the VF table if the VF is trusted.
>
> Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Same basic comments as on the previous patch:
- Check random capitalization in documentation and comments.
- "vf" => "nl" for functions, objects and files that aren't really
VF-specific but involve Netlink.
More below.
> ---
> drivers/net/mlx5/Makefile | 3 +-
> drivers/net/mlx5/mlx5.c | 7 +
> drivers/net/mlx5/mlx5.h | 6 +
> drivers/net/mlx5/mlx5_mac.c | 25 +++-
> drivers/net/mlx5/mlx5_vf.c | 315 ++++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 353 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
> index afda4118f..bbda0d1ff 100644
> --- a/drivers/net/mlx5/Makefile
> +++ b/drivers/net/mlx5/Makefile
> @@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
> SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c
> SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
> SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
> +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_vf.c
>
> ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y)
> INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE)
> @@ -84,7 +85,7 @@ LDLIBS += -libverbs -lmlx5
> endif
> LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
> LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
> -LDLIBS += -lrte_bus_pci
> +LDLIBS += -lrte_bus_pci -lrte_netlink
>
> # A few warnings cannot be avoided in external headers.
> CFLAGS += -Wno-error=cast-qual
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index d5cc85d19..e966884bd 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -205,6 +205,13 @@ mlx5_dev_close(struct rte_eth_dev *dev)
> rte_free(priv->reta_idx);
> if (priv->primary_socket)
> mlx5_socket_uninit(dev);
> + if (priv->config.vf) {
> + ret = mlx5_vf_mac_addr_flush(dev);
> + if (ret)
> + DRV_LOG(WARNING,
> + "port %u some MAC address remains configured",
> + dev->data->port_id);
> + }
> ret = mlx5_hrxq_ibv_verify(dev);
> if (ret)
> DRV_LOG(WARNING, "port %u some hash Rx queue still remain",
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index 9191b2949..a4333e6a5 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -298,4 +298,10 @@ struct mlx5_mr *mlx5_mr_get(struct rte_eth_dev *dev, struct rte_mempool *mp);
> int mlx5_mr_release(struct mlx5_mr *mr);
> int mlx5_mr_verify(struct rte_eth_dev *dev);
>
> +/* mlx5_vf.c */
> +
> +int mlx5_vf_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac);
> +int mlx5_vf_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac);
> +int mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev);
> +
> #endif /* RTE_PMD_MLX5_H_ */
> diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
> index 01c7ba17a..5137ad11d 100644
> --- a/drivers/net/mlx5/mlx5_mac.c
> +++ b/drivers/net/mlx5/mlx5_mac.c
> @@ -67,11 +67,24 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[ETHER_ADDR_LEN])
> void
> mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
> {
> + struct priv *priv = dev->data->dev_private;
> + const int vf = priv->config.vf;
> + int ret;
> +
> assert(index < MLX5_MAX_MAC_ADDRESSES);
> + if (index && vf) {
I don't think checking index is necessary here. mlx5_vf_mac_addr_remove()
will fail if the MAC in question happens to be the last one configured on
the netdevice, but depending on external configuration, it's not necessarily
the same as mac_addrs[0].
Checking "vf" only seems more reliable.
> + ret = mlx5_vf_mac_addr_remove(dev,
> + &dev->data->mac_addrs[index]);
> + if (ret) {
> + DRV_LOG(WARNING,
> + "port %u cannot remove mac address at index %d",
> + dev->data->port_id, index);
> + return;
> + }
> + }
> memset(&dev->data->mac_addrs[index], 0, sizeof(struct ether_addr));
> if (!dev->data->promiscuous) {
> - int ret = mlx5_traffic_restart(dev);
> -
> + ret = mlx5_traffic_restart(dev);
> if (ret)
> DRV_LOG(ERR, "port %u cannot remove mac address: %s",
> dev->data->port_id, strerror(rte_errno));
> @@ -97,6 +110,8 @@ int
> mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
> uint32_t index, uint32_t vmdq __rte_unused)
> {
> + struct priv *priv = dev->data->dev_private;
> + const int vf = priv->config.vf;
> unsigned int i;
>
> assert(index < MLX5_MAX_MAC_ADDRESSES);
> @@ -111,6 +126,12 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
> rte_errno = EADDRINUSE;
> return -rte_errno;
> }
> + if (index && vf) {
Same comment as mlx5_mac_addr_remove() regarding index.
> + int ret = mlx5_vf_mac_addr_add(dev, mac);
> +
> + if (ret)
> + return ret;
> + }
> dev->data->mac_addrs[index] = *mac;
> if (!dev->data->promiscuous)
> return mlx5_traffic_restart(dev);
> diff --git a/drivers/net/mlx5/mlx5_vf.c b/drivers/net/mlx5/mlx5_vf.c
> index 357407f56..3db8ee0eb 100644
> --- a/drivers/net/mlx5/mlx5_vf.c
> +++ b/drivers/net/mlx5/mlx5_vf.c
> @@ -11,12 +11,29 @@
> #include "mlx5.h"
> #include "mlx5_utils.h"
>
> +/*
> + * Define NDA_RTA as defined in iproute2 sources.
> + *
> + * see in iproute2 sources file include/libnetlink.h
> + */
> +#ifndef NDA_RTA
> +#define NDA_RTA(r) \
> + ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct ndmsg))))
> +#endif
> +
I suggest an internal MLX5_* macro instead, just in case.
> /* Data exchanged between Netlink library and PMD. */
> struct mlx5_vf_iface {
> struct rte_eth_dev *dev; /**< Pointer to Ethernet device. */
> int iface_idx; /**< Network Interface index. */
> };
>
> +/* Add/Remove mac address through Metlink */
Remove => remove
Metlink => Netlink
> +struct mlx5_vf_mac_addr {
> + struct ether_addr (*mac)[];
> + /**< MAC Address handled by the device. */
Address => address
> + int mac_n; /**< Number of address in the array. */
address => addresses
(Please check the remaining typos and extra caps.)
> +};
> +
> /**
> * Parse Netlink message to retrieve the interface index.
> *
> @@ -132,3 +149,301 @@ mlx5_vf_iface_idx(struct rte_eth_dev *dev)
> dev->data->port_id, strerror(rte_errno));
> return -rte_errno;
> }
> +
> +/**
> + * Parse Netlink message to retrieve the bridge MAC address.
> + *
> + * @param nh
> + * Pointer to Netlink Message Header.
> + * @param arg
> + * PMD data register with this callback.
> + *
> + * @return
> + * 0 on success, -1 otherwise.
How about a negative errno value in case of error for consistency?
> + */
> +static int
> +mlx5_vf_mac_addr_cb(struct nlmsghdr *nh, void *arg)
> +{
> + struct mlx5_vf_mac_addr *data = arg;
> + struct ndmsg *r = NLMSG_DATA(nh);
> + struct rtattr *attribute;
> + int len;
> +
> + len = nh->nlmsg_len - NLMSG_LENGTH(sizeof(*r));
> + for (attribute = NDA_RTA(r);
> + RTA_OK(attribute, len);
> + attribute = RTA_NEXT(attribute, len)) {
> + if (attribute->rta_type == NDA_LLADDR) {
> + if (data->mac_n == MLX5_MAX_MAC_ADDRESSES) {
> + DRV_LOG(WARNING,
> + "not enough room to finalise the"
> + " request");
> + return -1;
How about -ENOMEM?
> + }
> +#ifndef NDEBUG
> + struct ether_addr m;
> +
> + memcpy(&m, RTA_DATA(attribute), ETHER_ADDR_LEN);
> + DRV_LOG(DEBUG,
> + "brige MAC address"
brige => bridge
> + " %02X:%02X:%02X:%02X:%02X:%02X",
> + m.addr_bytes[0], m.addr_bytes[1],
> + m.addr_bytes[2], m.addr_bytes[3],
> + m.addr_bytes[4], m.addr_bytes[5]);
> +#endif
> + memcpy(&(*data->mac)[data->mac_n++],
> + RTA_DATA(attribute), ETHER_ADDR_LEN);
> + }
> + }
> + return 0;
> +}
> +
> +/**
> + * Get bridge MAC addresses.
> + *
> + * @param dev
> + * Pointer to Ethernet device.
> + * @param mac[out]
> + * Pointer to the array table of MAC addresses to fill.
> + * Its size should be of MLX5_MAX_MAC_ADDRESSES.
> + * @param mac_n[out]
> + * Number of entries filled in MAC array.
> + *
> + * @return
> + * 0 on success, a negative errno value otherwise and rte_errno is set.
> + */
Since the mac_n is only an output value and the size of mac is fixed to
MLX5_MAX_MAC_ADDRESSES, I suggest to return the value of mac_n (positive in
case of success) and remove it from the parameter list.
> +static int
> +mlx5_vf_mac_addr_list(struct rte_eth_dev *dev, struct ether_addr (*mac)[],
Another suggestion for clarity: (*mac)[MLX5_MAX_MAC_ADDRESSES]
> + int *mac_n)
> +{
> + int iface_idx = mlx5_vf_iface_idx(dev);
> + struct {
> + struct nlmsghdr hdr;
> + struct ifinfomsg ifm;
> + } req = {
> + .hdr = {
> + .nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
> + .nlmsg_type = RTM_GETNEIGH,
> + .nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
> + },
> + .ifm = {
> + .ifi_family = PF_BRIDGE,
> + .ifi_index = iface_idx,
> + },
> + };
> + struct mlx5_vf_mac_addr data = {
> + .mac = mac,
> + .mac_n = 0,
> + };
> + int fd;
> + int ret;
> +
> + fd = rte_nl_init(RTMGRP_LINK);
> + if (fd < 0) {
> + rte_errno = errno;
> + goto error;
> + }
> + ret = rte_nl_request(fd, &req.hdr, &req.ifm, sizeof(struct ifinfomsg));
> + if (ret < 0) {
> + rte_errno = errno;
> + goto error;
> + }
> + ret = rte_nl_recv(fd, mlx5_vf_mac_addr_cb, &data);
> + if (ret < 0) {
> + rte_errno = errno;
> + goto error;
> + }
> + rte_nl_final(fd);
> + *mac_n = data.mac_n;
> + return 0;
> +error:
> + if (fd >= 0)
> + rte_nl_final(fd);
> + DRV_LOG(DEBUG, "port %u cannot retrieve MAC address list %s",
> + dev->data->port_id, strerror(rte_errno));
> + return -rte_errno;
> +}
> +
> +/**
> + * Modify the mac address neighbour table with Netlink.
mac => MAC
Metlink => Netlink
> + *
> + * @param dev
> + * Pointer to Ethernet device.
> + * @param mac
> + * MAC address to consider.
> + * @param add
> + * 1 to add the MAC address, 0 to remove the MAC address.
> + *
> + * @return
> + * 0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mlx5_vf_mac_addr_modify(struct rte_eth_dev *dev, struct ether_addr *mac,
> + int add)
> +{
> + int iface_idx = mlx5_vf_iface_idx(dev);
> + struct {
> + struct nlmsghdr hdr;
> + struct ndmsg ndm;
> + struct rtattr rta;
> + } req = {
> + .hdr = {
> + .nlmsg_len = NLMSG_LENGTH(sizeof(struct ndmsg)),
> + .nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE |
> + NLM_F_EXCL | NLM_F_ACK,
> + .nlmsg_type = add ? RTM_NEWNEIGH : RTM_DELNEIGH,
> + },
> + .ndm = {
> + .ndm_family = PF_BRIDGE,
> + .ndm_state = NUD_NOARP | NUD_PERMANENT,
> + .ndm_ifindex = iface_idx,
> + .ndm_flags = NTF_SELF,
> + },
> + .rta = {
> + .rta_type = NDA_LLADDR,
> + .rta_len = RTA_LENGTH(ETHER_ADDR_LEN),
> + },
> + };
> + int fd;
> + int ret;
> +
> + memcpy(RTA_DATA(&req.rta), mac, ETHER_ADDR_LEN);
> + req.hdr.nlmsg_len = NLMSG_ALIGN(req.hdr.nlmsg_len) +
> + RTA_ALIGN(req.rta.rta_len);
> + fd = rte_nl_init(RTMGRP_LINK);
> + if (fd < 0) {
> + rte_errno = errno;
> + goto error;
> + }
> + ret = rte_nl_send(fd, &req.hdr);
> + if (ret == -1) {
> + rte_errno = errno;
> + goto error;
> + }
> + ret = rte_nl_recv_ack(fd);
> + if (ret == -1) {
> + rte_errno = errno;
> + goto error;
> + }
> + rte_nl_final(fd);
> + return 0;
> +error:
> + if (fd >= 0)
> + rte_nl_final(fd);
> + DRV_LOG(DEBUG,
> + "port %u cannot %s MAC address %02X:%02X:%02X:%02X:%02X:%02X"
> + " %s",
> + dev->data->port_id,
> + add ? "add" : "remove",
> + mac->addr_bytes[0], mac->addr_bytes[1],
> + mac->addr_bytes[2], mac->addr_bytes[3],
> + mac->addr_bytes[4], mac->addr_bytes[5],
> + strerror(rte_errno));
> + return -rte_errno;
> +}
> +
> +/**
> + * add a MAC address.
add => Add (can't stop!)
> + *
> + * @param dev
> + * Pointer to Ethernet device.
> + * @param mac_addr
> + * MAC address to register.
> + *
> + * @return
> + * 0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +mlx5_vf_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac)
> +{
> + struct ether_addr macs[MLX5_MAX_MAC_ADDRESSES];
> + int macs_n = 0;
> + int i;
> + int mac_index = MLX5_MAX_MAC_ADDRESSES;
> + int ret;
> +
> + ret = mlx5_vf_mac_addr_list(dev, &macs, &macs_n);
> + if (ret)
> + return ret;
> + for (i = 0; i != macs_n; ++i) {
> + if (!memcmp(&macs[i], mac, ETHER_ADDR_LEN)) {
> + mac_index = i;
> + break;
> + }
> + }
> + if (mac_index != MLX5_MAX_MAC_ADDRESSES) {
> + DRV_LOG(INFO, "port %u MAC address already added",
added => present
> + dev->data->port_id);
> + rte_errno = EADDRINUSE;
> + return -rte_errno;
> + }
> + return mlx5_vf_mac_addr_modify(dev, mac, 1);
> +}
> +
> +/**
> + * Remove a MAC address.
> + *
> + * @param dev
> + * Pointer to Ethernet device.
> + * @param index
> + * MAC address to remove.
> + *
> + * @return
> + * 0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +mlx5_vf_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac)
> +{
> + struct ether_addr macs[MLX5_MAX_MAC_ADDRESSES];
> + int macs_n = 0;
> + int i;
> + int mac_index = MLX5_MAX_MAC_ADDRESSES;
> + int ret;
> +
> + ret = mlx5_vf_mac_addr_list(dev, &macs, &macs_n);
> + if (ret)
> + return ret;
> + for (i = 0; i != macs_n; ++i) {
> + if (!memcmp(&macs[i], mac, ETHER_ADDR_LEN)) {
> + mac_index = i;
> + break;
> + }
> + }
> + if (mac_index == MLX5_MAX_MAC_ADDRESSES) {
> + DRV_LOG(INFO, "port %u MAC address not found",
> + dev->data->port_id);
> + rte_errno = EINVAL;
How about EADDRNOTAVAIL (a better counterpart for EADDRINUSE)?
> + return -rte_errno;
> + }
> + return mlx5_vf_mac_addr_modify(dev, mac, 0);
> +}
> +
> +/**
> + * Flush all added MAC addresses.
> + *
> + * @param dev
> + * Pointer to Ethernet device.
> + *
> + * @return
> + * 0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev)
> +{
> + int i;
> + const struct ether_addr mac_null = { .addr_bytes = { 0 }, };
> +
> + /* Skip the default mac at index 0. */
I'm not sure, what if index 0 happens to be yet another secondary MAC
address of the netdevice at this point? I think the PMD should attempt to
flush them all starting from the end of the MAC array and not care about
errors; this function should return void.
> + for (i = 1; i != MLX5_MAX_MAC_ADDRESSES; ++i) {
> + struct ether_addr *m = &dev->data->mac_addrs[i];
> +
> + if (memcmp(&mac_null, m, ETHER_ADDR_LEN)) {
> + int ret;
> +
> + ret = mlx5_vf_mac_addr_remove(dev, m);
> + if (ret)
> + return ret;
> + }
> + }
> + return 0;
> +}
--
Adrien Mazarguil
6WIND
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH 4/5] net/mlx5: use Netlink to enable promisc/allmulti
2018-03-13 12:50 ` [dpdk-dev] [PATCH 4/5] net/mlx5: use Netlink to enable promisc/allmulti Nelio Laranjeiro
@ 2018-03-14 17:11 ` Adrien Mazarguil
0 siblings, 0 replies; 32+ messages in thread
From: Adrien Mazarguil @ 2018-03-14 17:11 UTC (permalink / raw)
To: Nelio Laranjeiro; +Cc: Yongseok Koh, dev
On Tue, Mar 13, 2018 at 01:50:38PM +0100, Nelio Laranjeiro wrote:
> VF devices are not able to receive promisc or allmulti traffic unless it
> fully requests it though Netlink. This will cause the request to be
> processed by the PF which will handle the request and enable it.
>
> This requires the VF to be trusted by the PF.
>
> Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Usual comments regarding "vf" => "nl", caps and typos. More nits below.
> ---
> drivers/net/mlx5/mlx5.h | 2 +
> drivers/net/mlx5/mlx5_trigger.c | 27 ++++++++++-
> drivers/net/mlx5/mlx5_vf.c | 102 ++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 130 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index a4333e6a5..245235641 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -303,5 +303,7 @@ int mlx5_mr_verify(struct rte_eth_dev *dev);
> int mlx5_vf_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac);
> int mlx5_vf_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac);
> int mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev);
> +int mlx5_vf_promisc(struct rte_eth_dev *dev, int enable);
> +int mlx5_vf_allmulti(struct rte_eth_dev *dev, int enable);
>
> #endif /* RTE_PMD_MLX5_H_ */
> diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
> index 6bb4ffb14..3f21797ff 100644
> --- a/drivers/net/mlx5/mlx5_trigger.c
> +++ b/drivers/net/mlx5/mlx5_trigger.c
> @@ -278,8 +278,23 @@ mlx5_traffic_enable(struct rte_eth_dev *dev)
> unsigned int j;
> int ret;
>
> - if (priv->isolated)
> + if (priv->isolated) {
> + if (priv->config.vf) {
> + if (dev->data->promiscuous) {
> + ret = mlx5_vf_promisc(dev, 1);
> + rte_errno = ret;
> + if (ret)
> + goto error;
> + }
> + if (dev->data->all_multicast) {
> + ret = mlx5_vf_allmulti(dev, 1);
> + rte_errno = ret;
> + if (ret)
> + goto error;
> + }
> + }
Looks like this block should be run no matter what. You should put it before
the check on priv->isolated and drop the two next chunks.
Note there seems to be missing rollback code in case of error.
> return 0;
> + }
> if (dev->data->promiscuous) {
> struct rte_flow_item_eth promisc = {
> .dst.addr_bytes = "\x00\x00\x00\x00\x00\x00",
> @@ -287,6 +302,11 @@ mlx5_traffic_enable(struct rte_eth_dev *dev)
> .type = 0,
> };
>
> + if (priv->config.vf) {
> + ret = mlx5_vf_promisc(dev, 1);
> + if (ret)
> + goto error;
> + }
> ret = mlx5_ctrl_flow(dev, &promisc, &promisc);
> if (ret)
> goto error;
> @@ -298,6 +318,11 @@ mlx5_traffic_enable(struct rte_eth_dev *dev)
> .type = 0,
> };
>
> + if (priv->config.vf) {
> + ret = mlx5_vf_allmulti(dev, 1);
> + if (ret)
> + goto error;
> + }
> ret = mlx5_ctrl_flow(dev, &multicast, &multicast);
> if (ret)
> goto error;
> diff --git a/drivers/net/mlx5/mlx5_vf.c b/drivers/net/mlx5/mlx5_vf.c
> index 3db8ee0eb..cf71e79d9 100644
> --- a/drivers/net/mlx5/mlx5_vf.c
> +++ b/drivers/net/mlx5/mlx5_vf.c
> @@ -447,3 +447,105 @@ mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev)
> }
> return 0;
> }
> +
> +/**
> + * Enable promisc/allmulti though Netlink
though => through, also missing period.
I suggest to write promisc and allmulti in full in documentation
(as "promiscuous" and "all multicast" modes).
> + *
> + * @param dev
> + * Pointer to Ethernet device structure.
> + * @param flags
> + * IFF_PROMISC for promiscuous, IFF_ALLMULTI for allmulti.
> + * @param en
en => enable
> + * 1 to enable, 0 to disable.
=> Nonzero to enable, disable otherwise.
> + *
> + * @return
> + * 0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mlx5_vf_device_flags(struct rte_eth_dev *dev, uint32_t flags, int enable)
> +{
> + int iface_idx = mlx5_vf_iface_idx(dev);
> + struct {
> + struct nlmsghdr hdr;
> + struct ifinfomsg ifi;
> + } req = {
> + .hdr = {
> + .nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
> + .nlmsg_type = RTM_NEWLINK,
> + .nlmsg_flags = NLM_F_REQUEST,
> + },
> + .ifi = {
> + .ifi_flags = enable ? flags : 0,
> + .ifi_change = flags,
> + .ifi_index = iface_idx,
> + },
> + };
> + int fd;
> + int ret;
> +
> + assert(!(flags & ~(IFF_PROMISC | IFF_ALLMULTI)));
> + fd = rte_nl_init(NETLINK_ROUTE);
> + if (fd < 0) {
> + rte_errno = errno;
> + goto error;
> + }
> + ret = rte_nl_send(fd, &req.hdr);
> + if (ret == -1) {
> + rte_errno = errno;
> + goto error;
> + }
> + rte_nl_final(fd);
> + return 0;
> +error:
> + if (fd >= 0)
> + rte_nl_final(fd);
> + return -rte_errno;
> +}
> +
> +/**
> + * Enable promisc though Netlink
Missing period, another suggestion: promisc => promiscuous mode.
> + *
> + * @param dev
> + * Pointer to Ethernet device structure.
> + * @param en
> + * 1 to enable promisc, 0 to disable it.
=> Nonzero to enable, disable otherwise.
> + *
> + * @return
> + * 0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +mlx5_vf_promisc(struct rte_eth_dev *dev, int enable)
> +{
> + int ret = mlx5_vf_device_flags(dev, IFF_PROMISC, enable);
> +
> + if (ret)
> + DRV_LOG(DEBUG,
> + "port %u cannot %s promisc mode: Netlink error %s",
> + dev->data->port_id, enable ? "enable" : "disable",
> + strerror(rte_errno));
> + return ret;
> +}
> +
> +/**
> + * Enable allmulti though Netlink
though => through
> + *
> + * @param dev
> + * Pointer to Ethernet device structure.
> + * @param en
> + * 1 to enable promisc, 0 to disable it.
> + *
> + * @return
> + * 0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +mlx5_vf_allmulti(struct rte_eth_dev *dev, int enable)
> +{
> + int ret = mlx5_vf_device_flags(dev, IFF_ALLMULTI, enable);
> +
> + if (ret)
> + DRV_LOG(DEBUG,
> + "port %u cannot %s allmulti mode: Netlink error %s",
> + dev->data->port_id, enable ? "enable" : "disable",
> + strerror(rte_errno));
> + return ret;
> +}
> --
> 2.11.0
>
--
Adrien Mazarguil
6WIND
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH 5/5] net/mlx5: add a parameter for Netlink support in VF
2018-03-13 12:50 ` [dpdk-dev] [PATCH 5/5] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
@ 2018-03-14 17:11 ` Adrien Mazarguil
0 siblings, 0 replies; 32+ messages in thread
From: Adrien Mazarguil @ 2018-03-14 17:11 UTC (permalink / raw)
To: Nelio Laranjeiro; +Cc: Yongseok Koh, dev
On Tue, Mar 13, 2018 at 01:50:39PM +0100, Nelio Laranjeiro wrote:
> All Netlink request the PMD will do can also be done by a iproute2 command
> line interface, letting the operator configure the VF behavior without
> having to modify the application nor reaching PMD limits (e.g. MAC
> address number limit).
>
> Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Some nits, please see below.
> ---
> doc/guides/nics/mlx5.rst | 13 ++++++++++++
> drivers/net/mlx5/mlx5.c | 7 ++++++
> drivers/net/mlx5/mlx5.h | 1 +
> drivers/net/mlx5/mlx5_vf.c | 53 ++++++++++++++++++++++++++++++++++++++++++++--
> 4 files changed, 72 insertions(+), 2 deletions(-)
>
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index 46d26e4c8..54e6f327b 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -135,6 +135,11 @@ Limitations
> - Flows with a VXLAN Network Identifier equal (or ends to be equal)
> to 0 are not supported.
> - VXLAN TSO and checksum offloads are not supported on VM.
> +- VF: Flow rules matching a specific MAC address are only triggers if the MAC
> + has been previously added by the application.
Rewording suggestion:
Flow rules created on VF devices can only match traffic targeted at the
configured MAC addresses (see ``rte_eth_dev_mac_addr_add()``).
> + Adding more than MLX5_MAX_MAC_ADDRESSES MAC addresses is not supported in VF
> + mode. **Note** needs a Linux kernel v4.16 or higher or Mellanox OFED
> + installed.
I think this last bit is kind of redundant since applications already can't
add more MAC addresses than what a PMD allows. It's not a new limitation.
>
> Statistics
> ----------
> @@ -335,6 +340,14 @@ Run-time configuration
>
> Enabled by default.
>
> +- ``vf_nl_en`` parameter [int]
> +
> + A nonzero value enables Netlink requests from the VF to add/remove MAC
> + addresses or/and enable/disable promiscuous/allmulticast on the Netdevice.
> + Otherwise the according configuration must be run with Linux iproute2 tools.
You might add that it's a necessary prerequisite to receive the traffic in
question.
> +
> + Enabled by default on VF.
=> VF devices.
> +
> Prerequisites
> -------------
>
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index e966884bd..d1ca0837c 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -68,6 +68,9 @@
> /* Device parameter to enable hardware Rx vector. */
> #define MLX5_RX_VEC_EN "rx_vec_en"
>
> +/* Activate Netlink support in VF mode. */
> +#define MLX5_VF_NL_EN "vf_nl_en"
> +
> #ifndef HAVE_IBV_MLX5_MOD_MPW
> #define MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED (1 << 2)
> #define MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW (1 << 3)
> @@ -414,6 +417,8 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
> config->tx_vec_en = !!tmp;
> } else if (strcmp(MLX5_RX_VEC_EN, key) == 0) {
> config->rx_vec_en = !!tmp;
> + } else if (strcmp(MLX5_VF_NL_EN, key) == 0) {
> + config->vf_nl_en = !!tmp;
> } else {
> DRV_LOG(WARNING, "%s: unknown parameter", key);
> rte_errno = EINVAL;
> @@ -445,6 +450,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
> MLX5_TXQ_MAX_INLINE_LEN,
> MLX5_TX_VEC_EN,
> MLX5_RX_VEC_EN,
> + MLX5_VF_NL_EN,
> NULL,
> };
> struct rte_kvargs *kvlist;
> @@ -750,6 +756,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> .txq_inline = MLX5_ARG_UNSET,
> .txqs_inline = MLX5_ARG_UNSET,
> .inline_max_packet_sz = MLX5_ARG_UNSET,
> + .vf_nl_en = 1,
> };
>
> len = snprintf(name, sizeof(name), PCI_PRI_FMT,
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index 245235641..8c3e925af 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -88,6 +88,7 @@ struct mlx5_dev_config {
> unsigned int tx_vec_en:1; /* Tx vector is enabled. */
> unsigned int rx_vec_en:1; /* Rx vector is enabled. */
> unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
> + unsigned int vf_nl_en:1; /* Enable Netlink request in VF mode. */
request => requests
> unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
> unsigned int ind_table_max_size; /* Maximum indirection table size. */
> int txq_inline; /* Maximum packet size for inlining. */
> diff --git a/drivers/net/mlx5/mlx5_vf.c b/drivers/net/mlx5/mlx5_vf.c
> index cf71e79d9..18b02365d 100644
> --- a/drivers/net/mlx5/mlx5_vf.c
> +++ b/drivers/net/mlx5/mlx5_vf.c
> @@ -361,7 +361,16 @@ mlx5_vf_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac)
> int i;
> int mac_index = MLX5_MAX_MAC_ADDRESSES;
> int ret;
> + struct priv *priv = dev->data->dev_private;
>
> + if (!priv->config.vf_nl_en) {
> + DRV_LOG(WARNING,
> + "port %u Netlink requests are disabled, add the MAC"
> + " though bridge command: bridge fdb add <mac> dev"
> + " <device>",
> + dev->data->port_id);
Log level should be INFO (even "PROTIP" if such a thing existed :)
Also I'm not sure it should be done for each call, printing it once during
startup should be enough.
> + return 0;
> + }
> ret = mlx5_vf_mac_addr_list(dev, &macs, &macs_n);
> if (ret)
> return ret;
> @@ -399,7 +408,16 @@ mlx5_vf_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac)
> int i;
> int mac_index = MLX5_MAX_MAC_ADDRESSES;
> int ret;
> + struct priv *priv = dev->data->dev_private;
>
> + if (!priv->config.vf_nl_en) {
> + DRV_LOG(WARNING,
> + "port %u Netlink requests are disabled, delete the MAC"
> + " though bridge command: bridge fdb del <mac> dev"
> + " <device>",
> + dev->data->port_id);
Same here.
> + return 0;
> + }
> ret = mlx5_vf_mac_addr_list(dev, &macs, &macs_n);
> if (ret)
> return ret;
> @@ -432,7 +450,16 @@ mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev)
> {
> int i;
> const struct ether_addr mac_null = { .addr_bytes = { 0 }, };
> + struct priv *priv = dev->data->dev_private;
>
> + if (!priv->config.vf_nl_en) {
> + DRV_LOG(WARNING,
> + "port %u Netlink requests are disabled, flush added"
> + " MAC though bridge command: bridge fdb del <mac> dev"
> + " <device>",
> + dev->data->port_id);
Ditto.
> + return 0;
> + }
> /* Skip the default mac at index 0. */
> for (i = 1; i != MLX5_MAX_MAC_ADDRESSES; ++i) {
> struct ether_addr *m = &dev->data->mac_addrs[i];
> @@ -516,8 +543,19 @@ mlx5_vf_device_flags(struct rte_eth_dev *dev, uint32_t flags, int enable)
> int
> mlx5_vf_promisc(struct rte_eth_dev *dev, int enable)
> {
> - int ret = mlx5_vf_device_flags(dev, IFF_PROMISC, enable);
> + int ret;
> + struct priv *priv = dev->data->dev_private;
>
> + if (!priv->config.vf_nl_en) {
> + DRV_LOG(WARNING,
> + "port %u Netlink requests are disabled, %s promisc"
> + " though ip link command: ip link set <device> promisc"
> + " %s",
> + dev->data->port_id, enable ? "enable" : "disable",
> + enable ? "on" : "off");
Ditto.
> + return 0;
> + }
> + ret = mlx5_vf_device_flags(dev, IFF_PROMISC, enable);
> if (ret)
> DRV_LOG(DEBUG,
> "port %u cannot %s promisc mode: Netlink error %s",
> @@ -540,8 +578,19 @@ mlx5_vf_promisc(struct rte_eth_dev *dev, int enable)
> int
> mlx5_vf_allmulti(struct rte_eth_dev *dev, int enable)
> {
> - int ret = mlx5_vf_device_flags(dev, IFF_ALLMULTI, enable);
> + int ret;
> + struct priv *priv = dev->data->dev_private;
>
> + if (!priv->config.vf_nl_en) {
> + DRV_LOG(WARNING,
> + "port %u Netlink requests are disabled, %s allmulti"
> + " though ip link command: ip link set <device>"
> + " allmulticast %s",
> + dev->data->port_id, enable ? "enable" : "disable",
> + enable ? "on" : "off");
Ditto.
> + return 0;
> + }
> + ret = mlx5_vf_device_flags(dev, IFF_ALLMULTI, enable);
> if (ret)
> DRV_LOG(DEBUG,
> "port %u cannot %s allmulti mode: Netlink error %s",
> --
> 2.11.0
>
--
Adrien Mazarguil
6WIND
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v2 0/3] net/mlx5: use Netlink in VF mode
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (4 preceding siblings ...)
2018-03-13 12:50 ` [dpdk-dev] [PATCH 5/5] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
@ 2018-03-19 15:20 ` Nelio Laranjeiro
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 1/3] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
` (6 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-19 15:20 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
When MLX5 behaves in VF mode and the hypervisor have **trusted** this VF, to
be able to receive specific traffic some requests must be done to configure
the NIC. There is no API currently available to do it though Verbs, but there
is in Linux side using Netlink.
The specific cases are:
- Enable/disable promiscuous mode.
- Enable/disable allmulti mode.
- Add/remove mac addresses.
Changes in v2:
Embed Netlink socket communication inside the PMD.
Nelio Laranjeiro (3):
net/mlx5: use Netlink to add/remove MAC addresses
net/mlx5: use Netlink to enable promisc / all multicast mode
net/mlx5: add a parameter for Netlink support in VF
doc/guides/nics/mlx5.rst | 11 +
drivers/net/mlx5/Makefile | 1 +
drivers/net/mlx5/mlx5.c | 29 ++
drivers/net/mlx5/mlx5.h | 14 +
drivers/net/mlx5/mlx5_ethdev.c | 27 ++
drivers/net/mlx5/mlx5_mac.c | 25 +-
drivers/net/mlx5/mlx5_nl.c | 623 +++++++++++++++++++++++++++++++++++++++++
drivers/net/mlx5/mlx5_rxmode.c | 8 +
8 files changed, 736 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/mlx5/mlx5_nl.c
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v2 1/3] net/mlx5: use Netlink to add/remove MAC addresses
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (5 preceding siblings ...)
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 0/3] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
@ 2018-03-19 15:20 ` Nelio Laranjeiro
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 2/3] net/mlx5: use Netlink to enable promisc / all multicast mode Nelio Laranjeiro
` (5 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-19 15:20 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
VF devices are not able to receive traffic unless it fully requests it
though Netlink. This will cause the request to be processed by the PF
which will add/remove the MAC address to the VF table if the VF is trusted.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
drivers/net/mlx5/Makefile | 1 +
drivers/net/mlx5/mlx5.c | 22 ++
drivers/net/mlx5/mlx5.h | 13 +
drivers/net/mlx5/mlx5_ethdev.c | 27 +++
drivers/net/mlx5/mlx5_mac.c | 25 +-
drivers/net/mlx5/mlx5_nl.c | 529 +++++++++++++++++++++++++++++++++++++++++
6 files changed, 615 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/mlx5/mlx5_nl.c
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index afda4118f..5cca2ba53 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y)
INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 95dd49420..5ec152ab1 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -13,6 +13,7 @@
#include <errno.h>
#include <net/if.h>
#include <sys/mman.h>
+#include <linux/rtnetlink.h>
/* Verbs header. */
/* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
@@ -205,6 +206,10 @@ mlx5_dev_close(struct rte_eth_dev *dev)
rte_free(priv->reta_idx);
if (priv->primary_socket)
mlx5_socket_uninit(dev);
+ if (priv->config.vf)
+ mlx5_nl_mac_addr_flush(dev);
+ if (priv->nl_socket >= 0)
+ close(priv->nl_socket);
ret = mlx5_hrxq_ibv_verify(dev);
if (ret)
DRV_LOG(WARNING, "port %u some hash Rx queue still remain",
@@ -597,6 +602,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
int err = 0;
struct ibv_context *attr_ctx = NULL;
struct ibv_device_attr_ex device_attr;
+ unsigned int vf;
unsigned int mps;
unsigned int cqe_comp;
unsigned int tunnel_en = 0;
@@ -646,6 +652,14 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
continue;
DRV_LOG(INFO, "PCI information matches, using device \"%s\"",
list[i]->name);
+ vf = ((pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX4VF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX4LXVF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX5VF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX5EXVF));
attr_ctx = mlx5_glue->open_device(list[i]);
rte_errno = errno;
err = rte_errno;
@@ -871,6 +885,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
DRV_LOG(DEBUG,
"hardware Rx end alignment padding is %ssupported",
(config.hw_padding ? "" : "not "));
+ config.vf = vf;
config.tso = ((device_attr_ex.tso_caps.max_tso > 0) &&
(device_attr_ex.tso_caps.supported_qpts &
(1 << IBV_QPT_RAW_PACKET)));
@@ -947,6 +962,13 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
eth_dev->tx_pkt_burst = removed_tx_burst;
eth_dev->dev_ops = &mlx5_dev_ops;
/* Register MAC address. */
+ priv->nl_socket = -1;
+ priv->nl_sn = 0;
+ if (vf) {
+ priv->nl_socket = mlx5_nl_init(RTMGRP_LINK);
+ if (priv->nl_socket < 0)
+ priv->nl_socket = -1;
+ }
claim_zero(mlx5_mac_addr_add(eth_dev, &mac, 0, 0));
TAILQ_INIT(&priv->flows);
TAILQ_INIT(&priv->ctrl_flows);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index faacfd9d6..6a7e9f310 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -78,6 +78,7 @@ struct mlx5_dev_config {
unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
unsigned int hw_padding:1; /* End alignment padding is supported. */
+ unsigned int vf:1; /* This is a VF. */
unsigned int mps:2; /* Multi-packet send supported mode. */
unsigned int tunnel_en:1;
/* Whether tunnel stateless offloads are supported. */
@@ -154,6 +155,8 @@ struct priv {
struct mlx5_dev_config config; /* Device configuration. */
struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
/* Context for Verbs allocator. */
+ int nl_socket; /* Netlink socket. */
+ uint32_t nl_sn; /* Netlink message sequence number. */
};
/* mlx5.c */
@@ -163,6 +166,7 @@ int mlx5_getenv_int(const char *);
/* mlx5_ethdev.c */
int mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[IF_NAMESIZE]);
+int mlx5_ifindex(const struct rte_eth_dev *dev);
int mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr);
int mlx5_get_mtu(struct rte_eth_dev *dev, uint16_t *mtu);
int mlx5_set_flags(struct rte_eth_dev *dev, unsigned int keep,
@@ -297,4 +301,13 @@ struct mlx5_mr *mlx5_mr_get(struct rte_eth_dev *dev, struct rte_mempool *mp);
int mlx5_mr_release(struct mlx5_mr *mr);
int mlx5_mr_verify(struct rte_eth_dev *dev);
+/* mlx5_nl.c */
+
+int mlx5_nl_init(uint32_t nlgroups);
+int mlx5_nl_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac);
+int mlx5_nl_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac);
+void mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev);
+int mlx5_nl_promisc(struct rte_eth_dev *dev, int enable);
+int mlx5_nl_allmulti(struct rte_eth_dev *dev, int enable);
+
#endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index f219f2ccd..911f366c3 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -177,6 +177,33 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[IF_NAMESIZE])
}
/**
+ * Get the interface index from the sysfs ifindex file.
+ *
+ * @param[in] dev
+ * Pointer to Ethernet device.
+ *
+ * @return
+ * Interface index on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+int
+mlx5_ifindex(const struct rte_eth_dev *dev)
+{
+ char ifname[IF_NAMESIZE];
+ int ret;
+
+ ret = mlx5_get_ifname(dev, &ifname);
+ if (ret)
+ return ret;
+ ret = if_nametoindex(ifname);
+ if (ret == -1) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ return ret;
+}
+
+/**
* Perform ifreq ioctl() on associated Ethernet device.
*
* @param[in] dev
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index 01c7ba17a..61b42ba95 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -67,11 +67,24 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[ETHER_ADDR_LEN])
void
mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
{
+ struct priv *priv = dev->data->dev_private;
+ const int vf = priv->config.vf;
+ int ret;
+
assert(index < MLX5_MAX_MAC_ADDRESSES);
+ if (vf) {
+ ret = mlx5_nl_mac_addr_remove(dev,
+ &dev->data->mac_addrs[index]);
+ if (ret) {
+ DRV_LOG(WARNING,
+ "port %u cannot remove mac address at index %d",
+ dev->data->port_id, index);
+ return;
+ }
+ }
memset(&dev->data->mac_addrs[index], 0, sizeof(struct ether_addr));
if (!dev->data->promiscuous) {
- int ret = mlx5_traffic_restart(dev);
-
+ ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot remove mac address: %s",
dev->data->port_id, strerror(rte_errno));
@@ -97,6 +110,8 @@ int
mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
uint32_t index, uint32_t vmdq __rte_unused)
{
+ struct priv *priv = dev->data->dev_private;
+ const int vf = priv->config.vf;
unsigned int i;
assert(index < MLX5_MAX_MAC_ADDRESSES);
@@ -111,6 +126,12 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
rte_errno = EADDRINUSE;
return -rte_errno;
}
+ if (vf) {
+ int ret = mlx5_nl_mac_addr_add(dev, mac);
+
+ if (ret)
+ return ret;
+ }
dev->data->mac_addrs[index] = *mac;
if (!dev->data->promiscuous)
return mlx5_traffic_restart(dev);
diff --git a/drivers/net/mlx5/mlx5_nl.c b/drivers/net/mlx5/mlx5_nl.c
new file mode 100644
index 000000000..21edefc0f
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_nl.c
@@ -0,0 +1,529 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 6WIND S.A.
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+#include <unistd.h>
+#include <linux/rtnetlink.h>
+#include <linux/netlink.h>
+
+#include "mlx5.h"
+#include "mlx5_utils.h"
+
+/* Size of the buffer to receive kernel messages */
+#define MLX5_NL_BUF_SIZE (32 * 1024)
+/* Send buffer size for the netlink socket */
+#define MLX5_SEND_BUF_SIZE 32768
+/* Receive buffer size for the netlink socket */
+#define MLX5_RECV_BUF_SIZE 32768
+
+/*
+ * Define NDA_RTA as defined in iproute2 sources.
+ *
+ * see in iproute2 sources file include/libnetlink.h
+ */
+#ifndef MLX5_NDA_RTA
+#define MLX5_NDA_RTA(r) \
+ ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct ndmsg))))
+#endif
+
+/* Add/remove MAC address through Netlink */
+struct mlx5_nl_mac_addr {
+ struct ether_addr (*mac)[];
+ /**< MAC address handled by the device. */
+ int mac_n; /**< Number of addresses in the array. */
+};
+
+/**
+ * Opens a Netlink socket.
+ *
+ * @param nl_groups
+ * Netlink group value (e.g. RTMGRP_LINK).
+ *
+ * @return
+ * A file descriptor on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+int
+mlx5_nl_init(uint32_t nl_groups)
+{
+ int fd;
+ int sndbuf_size = MLX5_SEND_BUF_SIZE;
+ int rcvbuf_size = MLX5_RECV_BUF_SIZE;
+ struct sockaddr_nl local = {
+ .nl_family = AF_NETLINK,
+ .nl_groups = nl_groups,
+ };
+ int ret;
+
+ fd = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
+ if (fd == -1) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ ret = setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf_size, sizeof(int));
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf_size, sizeof(int));
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = bind(fd, (struct sockaddr *)&local, sizeof(local));
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ return fd;
+error:
+ close(fd);
+ return -rte_errno;
+}
+
+/**
+ * Send a request message to the kernel on the Netlink socket.
+ *
+ * @param[in] nlsk_fd
+ * Netlink socket file descriptor.
+ * @param[in] nh
+ * The netlink message send to the kernel.
+ * @param[in] ssn
+ * Sequence number.
+ * @param[in] req
+ * Pointer to the request structure.
+ * @param[in] len
+ * Length of the request in bytes.
+ *
+ * @return
+ * The number of sent bytes on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+static int
+mlx5_nl_request(int nlsk_fd, struct nlmsghdr *nh, uint32_t sn, void *req,
+ int len)
+{
+ /* man 7 netlink EXAMPLE */
+ struct sockaddr_nl sa = {
+ .nl_family = AF_NETLINK,
+ };
+ struct iovec iov[2] = {
+ { .iov_base = nh, .iov_len = sizeof(*nh), },
+ { .iov_base = req, .iov_len = len, },
+ };
+ struct msghdr msg = {
+ .msg_name = &sa,
+ .msg_namelen = sizeof(sa),
+ .msg_iov = iov,
+ .msg_iovlen = 2,
+ };
+ int send_bytes;
+
+ nh->nlmsg_pid = 0; /* communication with the kernel uses pid 0 */
+ nh->nlmsg_seq = sn;
+ send_bytes = sendmsg(nlsk_fd, &msg, 0);
+ if (send_bytes < 0) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ return send_bytes;
+}
+
+/**
+ * Send a message to the kernel on the Netlink socket.
+ *
+ * @param[in] nlsk_fd
+ * The netlink socket file descriptor used for communication.
+ * @param[in] nh
+ * The Netlink message send to the kernel.
+ * @param[in] sn
+ * Sequence number.
+ *
+ * @return
+ * the number of sent bytes on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+static int
+mlx5_nl_send(int nlsk_fd, struct nlmsghdr *nh, uint32_t sn)
+{
+ /* man 7 netlink EXAMPLE */
+ struct sockaddr_nl sa = {
+ .nl_family = AF_NETLINK,
+ };
+ struct iovec iov = {
+ .iov_base = nh,
+ .iov_len = nh->nlmsg_len,
+ };
+ struct msghdr msg = {
+ .msg_name = &sa,
+ .msg_namelen = sizeof(sa),
+ .msg_iov = &iov,
+ .msg_iovlen = 1,
+ };
+ int send_bytes;
+
+ nh->nlmsg_pid = 0; /* communication with the kernel uses pid 0 */
+ nh->nlmsg_seq = sn;
+ send_bytes = sendmsg(nlsk_fd, &msg, 0);
+ if (send_bytes < 0) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ return send_bytes;
+}
+
+/**
+ * Receive a message from the kernel on the Netlink socket, following
+ * mlx5_nl_send().
+ *
+ * @param[in] nlsk_fd
+ * The Netlink socket file descriptor used for communication.
+ * @param[in] sn
+ * Sequence number.
+ * @param[in] cb
+ * The callback function to call for each Netlink message received.
+ * @param[in, out] arg
+ * Custom arguments for the callback.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_recv(int nlsk_fd, uint32_t sn, int (*cb)(struct nlmsghdr *, void *arg),
+ void *arg)
+{
+ /* man 7 netlink EXAMPLE */
+ struct sockaddr_nl sa;
+ char buf[MLX5_RECV_BUF_SIZE];
+ struct iovec iov = {
+ .iov_base = buf,
+ .iov_len = sizeof(buf),
+ };
+ struct msghdr msg = {
+ .msg_name = &sa,
+ .msg_namelen = sizeof(sa),
+ .msg_iov = &iov,
+ /* One message at a time */
+ .msg_iovlen = 1,
+ };
+ int multipart = 0;
+ int ret = 0;
+
+ do {
+ struct nlmsghdr *nh;
+ int recv_bytes = 0;
+
+ do {
+ recv_bytes = recvmsg(nlsk_fd, &msg, 0);
+ if (recv_bytes == -1) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ nh = (struct nlmsghdr *)buf;
+ } while (nh->nlmsg_seq != sn);
+ for (;
+ NLMSG_OK(nh, (unsigned int)recv_bytes);
+ nh = NLMSG_NEXT(nh, recv_bytes)) {
+ if (nh->nlmsg_type == NLMSG_ERROR) {
+ struct nlmsgerr *err_data = NLMSG_DATA(nh);
+
+ if (err_data->error < 0) {
+ rte_errno = -err_data->error;
+ return -rte_errno;
+ }
+ /* Ack message. */
+ return 0;
+ }
+ /* Multi-part msgs and their trailing DONE message. */
+ if (nh->nlmsg_flags & NLM_F_MULTI) {
+ if (nh->nlmsg_type == NLMSG_DONE)
+ return 0;
+ multipart = 1;
+ }
+ if (cb) {
+ ret = cb(nh, arg);
+ if (ret < 0)
+ return ret;
+ }
+ }
+ } while (multipart);
+ return ret;
+}
+
+/**
+ * Parse Netlink message to retrieve the bridge MAC address.
+ *
+ * @param nh
+ * Pointer to Netlink Message Header.
+ * @param arg
+ * PMD data register with this callback.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_mac_addr_cb(struct nlmsghdr *nh, void *arg)
+{
+ struct mlx5_nl_mac_addr *data = arg;
+ struct ndmsg *r = NLMSG_DATA(nh);
+ struct rtattr *attribute;
+ int len;
+
+ len = nh->nlmsg_len - NLMSG_LENGTH(sizeof(*r));
+ for (attribute = MLX5_NDA_RTA(r);
+ RTA_OK(attribute, len);
+ attribute = RTA_NEXT(attribute, len)) {
+ if (attribute->rta_type == NDA_LLADDR) {
+ if (data->mac_n == MLX5_MAX_MAC_ADDRESSES) {
+ DRV_LOG(WARNING,
+ "not enough room to finalise the"
+ " request");
+ rte_errno = ENOMEM;
+ return -rte_errno;
+ }
+#ifndef NDEBUG
+ struct ether_addr m;
+
+ memcpy(&m, RTA_DATA(attribute), ETHER_ADDR_LEN);
+ DRV_LOG(DEBUG,
+ "brige MAC address"
+ " %02X:%02X:%02X:%02X:%02X:%02X",
+ m.addr_bytes[0], m.addr_bytes[1],
+ m.addr_bytes[2], m.addr_bytes[3],
+ m.addr_bytes[4], m.addr_bytes[5]);
+#endif
+ memcpy(&(*data->mac)[data->mac_n++],
+ RTA_DATA(attribute), ETHER_ADDR_LEN);
+ }
+ }
+ return 0;
+}
+
+/**
+ * Get bridge MAC addresses.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac[out]
+ * Pointer to the array table of MAC addresses to fill.
+ * Its size should be of MLX5_MAX_MAC_ADDRESSES.
+ * @param mac_n[out]
+ * Number of entries filled in MAC array.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_mac_addr_list(struct rte_eth_dev *dev, struct ether_addr (*mac)[],
+ int *mac_n)
+{
+ struct priv *priv = dev->data->dev_private;
+ int iface_idx = mlx5_ifindex(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ifinfomsg ifm;
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
+ .nlmsg_type = RTM_GETNEIGH,
+ .nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
+ },
+ .ifm = {
+ .ifi_family = PF_BRIDGE,
+ .ifi_index = iface_idx,
+ },
+ };
+ struct mlx5_nl_mac_addr data = {
+ .mac = mac,
+ .mac_n = 0,
+ };
+ int fd;
+ int ret;
+ uint32_t sn = priv->nl_sn++;
+
+ if (priv->nl_socket == -1)
+ return 0;
+ fd = priv->nl_socket;
+ ret = mlx5_nl_request(fd, &req.hdr, sn, &req.ifm,
+ sizeof(struct ifinfomsg));
+ if (ret < 0)
+ goto error;
+ ret = mlx5_nl_recv(fd, sn, mlx5_nl_mac_addr_cb, &data);
+ if (ret < 0)
+ goto error;
+ *mac_n = data.mac_n;
+ return 0;
+error:
+ DRV_LOG(DEBUG, "port %u cannot retrieve MAC address list %s",
+ dev->data->port_id, strerror(rte_errno));
+ return -rte_errno;
+}
+
+/**
+ * Modify the MAC address neighbour table with Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac
+ * MAC address to consider.
+ * @param add
+ * 1 to add the MAC address, 0 to remove the MAC address.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_mac_addr_modify(struct rte_eth_dev *dev, struct ether_addr *mac,
+ int add)
+{
+ struct priv *priv = dev->data->dev_private;
+ int iface_idx = mlx5_ifindex(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ndmsg ndm;
+ struct rtattr rta;
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ndmsg)),
+ .nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE |
+ NLM_F_EXCL | NLM_F_ACK,
+ .nlmsg_type = add ? RTM_NEWNEIGH : RTM_DELNEIGH,
+ },
+ .ndm = {
+ .ndm_family = PF_BRIDGE,
+ .ndm_state = NUD_NOARP | NUD_PERMANENT,
+ .ndm_ifindex = iface_idx,
+ .ndm_flags = NTF_SELF,
+ },
+ .rta = {
+ .rta_type = NDA_LLADDR,
+ .rta_len = RTA_LENGTH(ETHER_ADDR_LEN),
+ },
+ };
+ int fd;
+ int ret;
+ uint32_t sn = priv->nl_sn++;
+
+ if (priv->nl_socket == -1)
+ return 0;
+ fd = priv->nl_socket;
+ memcpy(RTA_DATA(&req.rta), mac, ETHER_ADDR_LEN);
+ req.hdr.nlmsg_len = NLMSG_ALIGN(req.hdr.nlmsg_len) +
+ RTA_ALIGN(req.rta.rta_len);
+ ret = mlx5_nl_send(fd, &req.hdr, sn);
+ if (ret < 0)
+ goto error;
+ ret = mlx5_nl_recv(fd, sn, NULL, NULL);
+ if (ret < 0)
+ goto error;
+ return 0;
+error:
+ DRV_LOG(DEBUG,
+ "port %u cannot %s MAC address %02X:%02X:%02X:%02X:%02X:%02X"
+ " %s",
+ dev->data->port_id,
+ add ? "add" : "remove",
+ mac->addr_bytes[0], mac->addr_bytes[1],
+ mac->addr_bytes[2], mac->addr_bytes[3],
+ mac->addr_bytes[4], mac->addr_bytes[5],
+ strerror(rte_errno));
+ return -rte_errno;
+}
+
+/**
+ * Add a MAC address.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac_addr
+ * MAC address to register.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac)
+{
+ struct ether_addr macs[MLX5_MAX_MAC_ADDRESSES];
+ int macs_n = 0;
+ int i;
+ int mac_index = MLX5_MAX_MAC_ADDRESSES;
+ int ret;
+
+ ret = mlx5_nl_mac_addr_list(dev, &macs, &macs_n);
+ if (ret)
+ return ret;
+ assert(macs_n);
+ for (i = 0; i != macs_n; ++i) {
+ if (!memcmp(&macs[i], mac, ETHER_ADDR_LEN)) {
+ mac_index = i;
+ break;
+ }
+ }
+ if (mac_index != MLX5_MAX_MAC_ADDRESSES) {
+ DRV_LOG(INFO, "port %u MAC address already added",
+ dev->data->port_id);
+ rte_errno = EADDRINUSE;
+ return -rte_errno;
+ }
+ return mlx5_nl_mac_addr_modify(dev, mac, 1);
+}
+
+/**
+ * Remove a MAC address.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param index
+ * MAC address to remove.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac)
+{
+ struct ether_addr macs[MLX5_MAX_MAC_ADDRESSES];
+ int macs_n = 0;
+ int i;
+ int mac_index = MLX5_MAX_MAC_ADDRESSES;
+ int ret;
+
+ ret = mlx5_nl_mac_addr_list(dev, &macs, &macs_n);
+ if (ret)
+ return ret;
+ for (i = 0; i != macs_n; ++i) {
+ if (!memcmp(&macs[i], mac, ETHER_ADDR_LEN)) {
+ mac_index = i;
+ break;
+ }
+ }
+ if (mac_index == MLX5_MAX_MAC_ADDRESSES) {
+ DRV_LOG(INFO, "port %u MAC address not found",
+ dev->data->port_id);
+ rte_errno = EINVAL;
+ return -rte_errno;
+ }
+ return mlx5_nl_mac_addr_modify(dev, mac, 0);
+}
+
+/**
+ * Flush all added MAC addresses.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ */
+void
+mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev)
+{
+ int i;
+ const struct ether_addr mac_null = { .addr_bytes = { 0 }, };
+
+ for (i = MLX5_MAX_MAC_ADDRESSES - 1; i >= 0; --i) {
+ struct ether_addr *m = &dev->data->mac_addrs[i];
+
+ if (memcmp(&mac_null, m, ETHER_ADDR_LEN))
+ mlx5_nl_mac_addr_remove(dev, m);
+ }
+}
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v2 2/3] net/mlx5: use Netlink to enable promisc / all multicast mode
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (6 preceding siblings ...)
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 1/3] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
@ 2018-03-19 15:20 ` Nelio Laranjeiro
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 3/3] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
` (4 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-19 15:20 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
VF devices are not able to receive promisc or allmulti traffic unless it
fully requests it though Netlink. This will cause the request to be
processed by the PF which will handle the request and enable it.
This requires the VF to be trusted by the PF.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
drivers/net/mlx5/mlx5_nl.c | 94 ++++++++++++++++++++++++++++++++++++++++++
drivers/net/mlx5/mlx5_rxmode.c | 8 ++++
2 files changed, 102 insertions(+)
diff --git a/drivers/net/mlx5/mlx5_nl.c b/drivers/net/mlx5/mlx5_nl.c
index 21edefc0f..562d219fa 100644
--- a/drivers/net/mlx5/mlx5_nl.c
+++ b/drivers/net/mlx5/mlx5_nl.c
@@ -527,3 +527,97 @@ mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev)
mlx5_nl_mac_addr_remove(dev, m);
}
}
+
+/**
+ * Enable promiscuous / all multicast mode through Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param flags
+ * IFF_PROMISC for promiscuous, IFF_ALLMULTI for allmulti.
+ * @param enable
+ * Nonzero to enable, disable otherwise.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_device_flags(struct rte_eth_dev *dev, uint32_t flags, int enable)
+{
+ struct priv *priv = dev->data->dev_private;
+ int iface_idx = mlx5_ifindex(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ifinfomsg ifi;
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
+ .nlmsg_type = RTM_NEWLINK,
+ .nlmsg_flags = NLM_F_REQUEST,
+ },
+ .ifi = {
+ .ifi_flags = enable ? flags : 0,
+ .ifi_change = flags,
+ .ifi_index = iface_idx,
+ },
+ };
+ int fd;
+ int ret;
+
+ assert(!(flags & ~(IFF_PROMISC | IFF_ALLMULTI)));
+ if (priv->nl_socket < 0)
+ return 0;
+ fd = priv->nl_socket;
+ ret = mlx5_nl_send(fd, &req.hdr, priv->nl_sn++);
+ if (ret < 0)
+ return ret;
+ return 0;
+}
+
+/**
+ * Enable promiscuous mode through Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param enable
+ * Nonzero to enable, disable otherwise.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_promisc(struct rte_eth_dev *dev, int enable)
+{
+ int ret = mlx5_nl_device_flags(dev, IFF_PROMISC, enable);
+
+ if (ret)
+ DRV_LOG(DEBUG,
+ "port %u cannot %s promisc mode: Netlink error %s",
+ dev->data->port_id, enable ? "enable" : "disable",
+ strerror(rte_errno));
+ return ret;
+}
+
+/**
+ * Enable all multicast mode through Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param enable
+ * Nonzero to enable, disable otherwise.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_allmulti(struct rte_eth_dev *dev, int enable)
+{
+ int ret = mlx5_nl_device_flags(dev, IFF_ALLMULTI, enable);
+
+ if (ret)
+ DRV_LOG(DEBUG,
+ "port %u cannot %s allmulti mode: Netlink error %s",
+ dev->data->port_id, enable ? "enable" : "disable",
+ strerror(rte_errno));
+ return ret;
+}
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index e43a4b030..c1c0f21c7 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -35,6 +35,8 @@ mlx5_promiscuous_enable(struct rte_eth_dev *dev)
int ret;
dev->data->promiscuous = 1;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_promisc(dev, 1);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot enable promiscuous mode: %s",
@@ -53,6 +55,8 @@ mlx5_promiscuous_disable(struct rte_eth_dev *dev)
int ret;
dev->data->promiscuous = 0;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_promisc(dev, 0);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot disable promiscuous mode: %s",
@@ -71,6 +75,8 @@ mlx5_allmulticast_enable(struct rte_eth_dev *dev)
int ret;
dev->data->all_multicast = 1;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_allmulti(dev, 1);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot enable allmulicast mode: %s",
@@ -89,6 +95,8 @@ mlx5_allmulticast_disable(struct rte_eth_dev *dev)
int ret;
dev->data->all_multicast = 0;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_allmulti(dev, 0);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot disable allmulicast mode: %s",
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v2 3/3] net/mlx5: add a parameter for Netlink support in VF
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (7 preceding siblings ...)
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 2/3] net/mlx5: use Netlink to enable promisc / all multicast mode Nelio Laranjeiro
@ 2018-03-19 15:20 ` Nelio Laranjeiro
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 0/3] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (3 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-19 15:20 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
All Netlink request the PMD will do can also be done by a iproute2 command
line interface, letting the operator configure the VF behavior without
having to modify the application nor reaching PMD limits (e.g. MAC
address number limit).
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
doc/guides/nics/mlx5.rst | 11 +++++++++++
drivers/net/mlx5/mlx5.c | 9 ++++++++-
drivers/net/mlx5/mlx5.h | 1 +
3 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 46d26e4c8..4c7bf1c56 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -135,6 +135,8 @@ Limitations
- Flows with a VXLAN Network Identifier equal (or ends to be equal)
to 0 are not supported.
- VXLAN TSO and checksum offloads are not supported on VM.
+- VF: flow rules created on VF devices can only match traffic targeted at the
+ configured MAC addresses (see ``rte_eth_dev_mac_addr_add()``).
Statistics
----------
@@ -335,6 +337,15 @@ Run-time configuration
Enabled by default.
+- ``vf_nl_en`` parameter [int]
+
+ A nonzero value enables Netlink requests from the VF to add/remove MAC
+ addresses or/and enable/disable promiscuous/all multicast on the Netdevice.
+ Otherwise the relevant configuration must be run with Linux iproute2 tools.
+ This is a prerequisite to receive this kind of traffic.
+
+ Enabled by default, valid only on VF devices ignored otherwise.
+
Prerequisites
-------------
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 5ec152ab1..89bada409 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -69,6 +69,9 @@
/* Device parameter to enable hardware Rx vector. */
#define MLX5_RX_VEC_EN "rx_vec_en"
+/* Activate Netlink support in VF mode. */
+#define MLX5_VF_NL_EN "vf_nl_en"
+
#ifndef HAVE_IBV_MLX5_MOD_MPW
#define MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED (1 << 2)
#define MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW (1 << 3)
@@ -412,6 +415,8 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
config->tx_vec_en = !!tmp;
} else if (strcmp(MLX5_RX_VEC_EN, key) == 0) {
config->rx_vec_en = !!tmp;
+ } else if (strcmp(MLX5_VF_NL_EN, key) == 0) {
+ config->vf_nl_en = !!tmp;
} else {
DRV_LOG(WARNING, "%s: unknown parameter", key);
rte_errno = EINVAL;
@@ -443,6 +448,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
MLX5_TXQ_MAX_INLINE_LEN,
MLX5_TX_VEC_EN,
MLX5_RX_VEC_EN,
+ MLX5_VF_NL_EN,
NULL,
};
struct rte_kvargs *kvlist;
@@ -748,6 +754,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
.txq_inline = MLX5_ARG_UNSET,
.txqs_inline = MLX5_ARG_UNSET,
.inline_max_packet_sz = MLX5_ARG_UNSET,
+ .vf_nl_en = 1,
};
len = snprintf(name, sizeof(name), PCI_PRI_FMT,
@@ -964,7 +971,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
/* Register MAC address. */
priv->nl_socket = -1;
priv->nl_sn = 0;
- if (vf) {
+ if (vf && config.vf_nl_en) {
priv->nl_socket = mlx5_nl_init(RTMGRP_LINK);
if (priv->nl_socket < 0)
priv->nl_socket = -1;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 6a7e9f310..7cabf43cc 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -88,6 +88,7 @@ struct mlx5_dev_config {
unsigned int tx_vec_en:1; /* Tx vector is enabled. */
unsigned int rx_vec_en:1; /* Rx vector is enabled. */
unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
+ unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
unsigned int ind_table_max_size; /* Maximum indirection table size. */
int txq_inline; /* Maximum packet size for inlining. */
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v3 0/3] net/mlx5: use Netlink in VF mode
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (8 preceding siblings ...)
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 3/3] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
@ 2018-03-21 13:40 ` Nelio Laranjeiro
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 " Nelio Laranjeiro
` (3 more replies)
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
` (2 subsequent siblings)
12 siblings, 4 replies; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-21 13:40 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
When MLX5 behaves in VF mode and the hypervisor have **trusted** this VF, to
be able to receive specific traffic some requests must be done to configure
the NIC. There is no API currently available to do it though Verbs, but there
is in Linux side using Netlink.
The specific cases are:
- Enable/disable promiscuous mode.
- Enable/disable allmulti mode.
- Add/remove mac addresses.
Changes in v3:
Add missing room to add the MAC address in the Netlink message.
Changes in v2:
Embed Netlink socket communication inside the PMD.
Nelio Laranjeiro (3):
net/mlx5: use Netlink to add/remove MAC addresses
net/mlx5: use Netlink to enable promisc / all multicast mode
net/mlx5: add a parameter for Netlink support in VF
doc/guides/nics/mlx5.rst | 11 +
drivers/net/mlx5/Makefile | 1 +
drivers/net/mlx5/mlx5.c | 29 ++
drivers/net/mlx5/mlx5.h | 14 +
drivers/net/mlx5/mlx5_ethdev.c | 27 ++
drivers/net/mlx5/mlx5_mac.c | 25 +-
drivers/net/mlx5/mlx5_nl.c | 624 +++++++++++++++++++++++++++++++++++++++++
drivers/net/mlx5/mlx5_rxmode.c | 8 +
8 files changed, 737 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/mlx5/mlx5_nl.c
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (9 preceding siblings ...)
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 0/3] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
@ 2018-03-21 13:40 ` Nelio Laranjeiro
2018-03-22 7:34 ` Shahaf Shuler
2018-03-22 7:44 ` Shahaf Shuler
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 2/3] net/mlx5: use Netlink to enable promisc / all multicast mode Nelio Laranjeiro
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 3/3] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
12 siblings, 2 replies; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-21 13:40 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
VF devices are not able to receive traffic unless it fully requests it
though Netlink. This will cause the request to be processed by the PF
which will add/remove the MAC address to the VF table if the VF is trusted.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
drivers/net/mlx5/Makefile | 1 +
drivers/net/mlx5/mlx5.c | 22 ++
drivers/net/mlx5/mlx5.h | 13 +
drivers/net/mlx5/mlx5_ethdev.c | 27 +++
drivers/net/mlx5/mlx5_mac.c | 25 +-
drivers/net/mlx5/mlx5_nl.c | 530 +++++++++++++++++++++++++++++++++++++++++
6 files changed, 616 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/mlx5/mlx5_nl.c
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index afda4118f..5cca2ba53 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y)
INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 95dd49420..5ec152ab1 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -13,6 +13,7 @@
#include <errno.h>
#include <net/if.h>
#include <sys/mman.h>
+#include <linux/rtnetlink.h>
/* Verbs header. */
/* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
@@ -205,6 +206,10 @@ mlx5_dev_close(struct rte_eth_dev *dev)
rte_free(priv->reta_idx);
if (priv->primary_socket)
mlx5_socket_uninit(dev);
+ if (priv->config.vf)
+ mlx5_nl_mac_addr_flush(dev);
+ if (priv->nl_socket >= 0)
+ close(priv->nl_socket);
ret = mlx5_hrxq_ibv_verify(dev);
if (ret)
DRV_LOG(WARNING, "port %u some hash Rx queue still remain",
@@ -597,6 +602,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
int err = 0;
struct ibv_context *attr_ctx = NULL;
struct ibv_device_attr_ex device_attr;
+ unsigned int vf;
unsigned int mps;
unsigned int cqe_comp;
unsigned int tunnel_en = 0;
@@ -646,6 +652,14 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
continue;
DRV_LOG(INFO, "PCI information matches, using device \"%s\"",
list[i]->name);
+ vf = ((pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX4VF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX4LXVF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX5VF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX5EXVF));
attr_ctx = mlx5_glue->open_device(list[i]);
rte_errno = errno;
err = rte_errno;
@@ -871,6 +885,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
DRV_LOG(DEBUG,
"hardware Rx end alignment padding is %ssupported",
(config.hw_padding ? "" : "not "));
+ config.vf = vf;
config.tso = ((device_attr_ex.tso_caps.max_tso > 0) &&
(device_attr_ex.tso_caps.supported_qpts &
(1 << IBV_QPT_RAW_PACKET)));
@@ -947,6 +962,13 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
eth_dev->tx_pkt_burst = removed_tx_burst;
eth_dev->dev_ops = &mlx5_dev_ops;
/* Register MAC address. */
+ priv->nl_socket = -1;
+ priv->nl_sn = 0;
+ if (vf) {
+ priv->nl_socket = mlx5_nl_init(RTMGRP_LINK);
+ if (priv->nl_socket < 0)
+ priv->nl_socket = -1;
+ }
claim_zero(mlx5_mac_addr_add(eth_dev, &mac, 0, 0));
TAILQ_INIT(&priv->flows);
TAILQ_INIT(&priv->ctrl_flows);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index faacfd9d6..6a7e9f310 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -78,6 +78,7 @@ struct mlx5_dev_config {
unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
unsigned int hw_padding:1; /* End alignment padding is supported. */
+ unsigned int vf:1; /* This is a VF. */
unsigned int mps:2; /* Multi-packet send supported mode. */
unsigned int tunnel_en:1;
/* Whether tunnel stateless offloads are supported. */
@@ -154,6 +155,8 @@ struct priv {
struct mlx5_dev_config config; /* Device configuration. */
struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
/* Context for Verbs allocator. */
+ int nl_socket; /* Netlink socket. */
+ uint32_t nl_sn; /* Netlink message sequence number. */
};
/* mlx5.c */
@@ -163,6 +166,7 @@ int mlx5_getenv_int(const char *);
/* mlx5_ethdev.c */
int mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[IF_NAMESIZE]);
+int mlx5_ifindex(const struct rte_eth_dev *dev);
int mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr);
int mlx5_get_mtu(struct rte_eth_dev *dev, uint16_t *mtu);
int mlx5_set_flags(struct rte_eth_dev *dev, unsigned int keep,
@@ -297,4 +301,13 @@ struct mlx5_mr *mlx5_mr_get(struct rte_eth_dev *dev, struct rte_mempool *mp);
int mlx5_mr_release(struct mlx5_mr *mr);
int mlx5_mr_verify(struct rte_eth_dev *dev);
+/* mlx5_nl.c */
+
+int mlx5_nl_init(uint32_t nlgroups);
+int mlx5_nl_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac);
+int mlx5_nl_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac);
+void mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev);
+int mlx5_nl_promisc(struct rte_eth_dev *dev, int enable);
+int mlx5_nl_allmulti(struct rte_eth_dev *dev, int enable);
+
#endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index f5511ce70..672589f71 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -177,6 +177,33 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[IF_NAMESIZE])
}
/**
+ * Get the interface index from the sysfs ifindex file.
+ *
+ * @param[in] dev
+ * Pointer to Ethernet device.
+ *
+ * @return
+ * Interface index on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+int
+mlx5_ifindex(const struct rte_eth_dev *dev)
+{
+ char ifname[IF_NAMESIZE];
+ int ret;
+
+ ret = mlx5_get_ifname(dev, &ifname);
+ if (ret)
+ return ret;
+ ret = if_nametoindex(ifname);
+ if (ret == -1) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ return ret;
+}
+
+/**
* Perform ifreq ioctl() on associated Ethernet device.
*
* @param[in] dev
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index 01c7ba17a..61b42ba95 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -67,11 +67,24 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[ETHER_ADDR_LEN])
void
mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
{
+ struct priv *priv = dev->data->dev_private;
+ const int vf = priv->config.vf;
+ int ret;
+
assert(index < MLX5_MAX_MAC_ADDRESSES);
+ if (vf) {
+ ret = mlx5_nl_mac_addr_remove(dev,
+ &dev->data->mac_addrs[index]);
+ if (ret) {
+ DRV_LOG(WARNING,
+ "port %u cannot remove mac address at index %d",
+ dev->data->port_id, index);
+ return;
+ }
+ }
memset(&dev->data->mac_addrs[index], 0, sizeof(struct ether_addr));
if (!dev->data->promiscuous) {
- int ret = mlx5_traffic_restart(dev);
-
+ ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot remove mac address: %s",
dev->data->port_id, strerror(rte_errno));
@@ -97,6 +110,8 @@ int
mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
uint32_t index, uint32_t vmdq __rte_unused)
{
+ struct priv *priv = dev->data->dev_private;
+ const int vf = priv->config.vf;
unsigned int i;
assert(index < MLX5_MAX_MAC_ADDRESSES);
@@ -111,6 +126,12 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
rte_errno = EADDRINUSE;
return -rte_errno;
}
+ if (vf) {
+ int ret = mlx5_nl_mac_addr_add(dev, mac);
+
+ if (ret)
+ return ret;
+ }
dev->data->mac_addrs[index] = *mac;
if (!dev->data->promiscuous)
return mlx5_traffic_restart(dev);
diff --git a/drivers/net/mlx5/mlx5_nl.c b/drivers/net/mlx5/mlx5_nl.c
new file mode 100644
index 000000000..cf0df0c2f
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_nl.c
@@ -0,0 +1,530 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 6WIND S.A.
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+#include <unistd.h>
+#include <linux/rtnetlink.h>
+#include <linux/netlink.h>
+
+#include "mlx5.h"
+#include "mlx5_utils.h"
+
+/* Size of the buffer to receive kernel messages */
+#define MLX5_NL_BUF_SIZE (32 * 1024)
+/* Send buffer size for the netlink socket */
+#define MLX5_SEND_BUF_SIZE 32768
+/* Receive buffer size for the netlink socket */
+#define MLX5_RECV_BUF_SIZE 32768
+
+/*
+ * Define NDA_RTA as defined in iproute2 sources.
+ *
+ * see in iproute2 sources file include/libnetlink.h
+ */
+#ifndef MLX5_NDA_RTA
+#define MLX5_NDA_RTA(r) \
+ ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct ndmsg))))
+#endif
+
+/* Add/remove MAC address through Netlink */
+struct mlx5_nl_mac_addr {
+ struct ether_addr (*mac)[];
+ /**< MAC address handled by the device. */
+ int mac_n; /**< Number of addresses in the array. */
+};
+
+/**
+ * Opens a Netlink socket.
+ *
+ * @param nl_groups
+ * Netlink group value (e.g. RTMGRP_LINK).
+ *
+ * @return
+ * A file descriptor on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+int
+mlx5_nl_init(uint32_t nl_groups)
+{
+ int fd;
+ int sndbuf_size = MLX5_SEND_BUF_SIZE;
+ int rcvbuf_size = MLX5_RECV_BUF_SIZE;
+ struct sockaddr_nl local = {
+ .nl_family = AF_NETLINK,
+ .nl_groups = nl_groups,
+ };
+ int ret;
+
+ fd = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
+ if (fd == -1) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ ret = setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf_size, sizeof(int));
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf_size, sizeof(int));
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = bind(fd, (struct sockaddr *)&local, sizeof(local));
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ return fd;
+error:
+ close(fd);
+ return -rte_errno;
+}
+
+/**
+ * Send a request message to the kernel on the Netlink socket.
+ *
+ * @param[in] nlsk_fd
+ * Netlink socket file descriptor.
+ * @param[in] nh
+ * The netlink message send to the kernel.
+ * @param[in] ssn
+ * Sequence number.
+ * @param[in] req
+ * Pointer to the request structure.
+ * @param[in] len
+ * Length of the request in bytes.
+ *
+ * @return
+ * The number of sent bytes on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+static int
+mlx5_nl_request(int nlsk_fd, struct nlmsghdr *nh, uint32_t sn, void *req,
+ int len)
+{
+ /* man 7 netlink EXAMPLE */
+ struct sockaddr_nl sa = {
+ .nl_family = AF_NETLINK,
+ };
+ struct iovec iov[2] = {
+ { .iov_base = nh, .iov_len = sizeof(*nh), },
+ { .iov_base = req, .iov_len = len, },
+ };
+ struct msghdr msg = {
+ .msg_name = &sa,
+ .msg_namelen = sizeof(sa),
+ .msg_iov = iov,
+ .msg_iovlen = 2,
+ };
+ int send_bytes;
+
+ nh->nlmsg_pid = 0; /* communication with the kernel uses pid 0 */
+ nh->nlmsg_seq = sn;
+ send_bytes = sendmsg(nlsk_fd, &msg, 0);
+ if (send_bytes < 0) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ return send_bytes;
+}
+
+/**
+ * Send a message to the kernel on the Netlink socket.
+ *
+ * @param[in] nlsk_fd
+ * The netlink socket file descriptor used for communication.
+ * @param[in] nh
+ * The Netlink message send to the kernel.
+ * @param[in] sn
+ * Sequence number.
+ *
+ * @return
+ * the number of sent bytes on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+static int
+mlx5_nl_send(int nlsk_fd, struct nlmsghdr *nh, uint32_t sn)
+{
+ /* man 7 netlink EXAMPLE */
+ struct sockaddr_nl sa = {
+ .nl_family = AF_NETLINK,
+ };
+ struct iovec iov = {
+ .iov_base = nh,
+ .iov_len = nh->nlmsg_len,
+ };
+ struct msghdr msg = {
+ .msg_name = &sa,
+ .msg_namelen = sizeof(sa),
+ .msg_iov = &iov,
+ .msg_iovlen = 1,
+ };
+ int send_bytes;
+
+ nh->nlmsg_pid = 0; /* communication with the kernel uses pid 0 */
+ nh->nlmsg_seq = sn;
+ send_bytes = sendmsg(nlsk_fd, &msg, 0);
+ if (send_bytes < 0) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ return send_bytes;
+}
+
+/**
+ * Receive a message from the kernel on the Netlink socket, following
+ * mlx5_nl_send().
+ *
+ * @param[in] nlsk_fd
+ * The Netlink socket file descriptor used for communication.
+ * @param[in] sn
+ * Sequence number.
+ * @param[in] cb
+ * The callback function to call for each Netlink message received.
+ * @param[in, out] arg
+ * Custom arguments for the callback.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_recv(int nlsk_fd, uint32_t sn, int (*cb)(struct nlmsghdr *, void *arg),
+ void *arg)
+{
+ /* man 7 netlink EXAMPLE */
+ struct sockaddr_nl sa;
+ char buf[MLX5_RECV_BUF_SIZE];
+ struct iovec iov = {
+ .iov_base = buf,
+ .iov_len = sizeof(buf),
+ };
+ struct msghdr msg = {
+ .msg_name = &sa,
+ .msg_namelen = sizeof(sa),
+ .msg_iov = &iov,
+ /* One message at a time */
+ .msg_iovlen = 1,
+ };
+ int multipart = 0;
+ int ret = 0;
+
+ do {
+ struct nlmsghdr *nh;
+ int recv_bytes = 0;
+
+ do {
+ recv_bytes = recvmsg(nlsk_fd, &msg, 0);
+ if (recv_bytes == -1) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ nh = (struct nlmsghdr *)buf;
+ } while (nh->nlmsg_seq != sn);
+ for (;
+ NLMSG_OK(nh, (unsigned int)recv_bytes);
+ nh = NLMSG_NEXT(nh, recv_bytes)) {
+ if (nh->nlmsg_type == NLMSG_ERROR) {
+ struct nlmsgerr *err_data = NLMSG_DATA(nh);
+
+ if (err_data->error < 0) {
+ rte_errno = -err_data->error;
+ return -rte_errno;
+ }
+ /* Ack message. */
+ return 0;
+ }
+ /* Multi-part msgs and their trailing DONE message. */
+ if (nh->nlmsg_flags & NLM_F_MULTI) {
+ if (nh->nlmsg_type == NLMSG_DONE)
+ return 0;
+ multipart = 1;
+ }
+ if (cb) {
+ ret = cb(nh, arg);
+ if (ret < 0)
+ return ret;
+ }
+ }
+ } while (multipart);
+ return ret;
+}
+
+/**
+ * Parse Netlink message to retrieve the bridge MAC address.
+ *
+ * @param nh
+ * Pointer to Netlink Message Header.
+ * @param arg
+ * PMD data register with this callback.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_mac_addr_cb(struct nlmsghdr *nh, void *arg)
+{
+ struct mlx5_nl_mac_addr *data = arg;
+ struct ndmsg *r = NLMSG_DATA(nh);
+ struct rtattr *attribute;
+ int len;
+
+ len = nh->nlmsg_len - NLMSG_LENGTH(sizeof(*r));
+ for (attribute = MLX5_NDA_RTA(r);
+ RTA_OK(attribute, len);
+ attribute = RTA_NEXT(attribute, len)) {
+ if (attribute->rta_type == NDA_LLADDR) {
+ if (data->mac_n == MLX5_MAX_MAC_ADDRESSES) {
+ DRV_LOG(WARNING,
+ "not enough room to finalise the"
+ " request");
+ rte_errno = ENOMEM;
+ return -rte_errno;
+ }
+#ifndef NDEBUG
+ struct ether_addr m;
+
+ memcpy(&m, RTA_DATA(attribute), ETHER_ADDR_LEN);
+ DRV_LOG(DEBUG,
+ "brige MAC address"
+ " %02X:%02X:%02X:%02X:%02X:%02X",
+ m.addr_bytes[0], m.addr_bytes[1],
+ m.addr_bytes[2], m.addr_bytes[3],
+ m.addr_bytes[4], m.addr_bytes[5]);
+#endif
+ memcpy(&(*data->mac)[data->mac_n++],
+ RTA_DATA(attribute), ETHER_ADDR_LEN);
+ }
+ }
+ return 0;
+}
+
+/**
+ * Get bridge MAC addresses.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac[out]
+ * Pointer to the array table of MAC addresses to fill.
+ * Its size should be of MLX5_MAX_MAC_ADDRESSES.
+ * @param mac_n[out]
+ * Number of entries filled in MAC array.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_mac_addr_list(struct rte_eth_dev *dev, struct ether_addr (*mac)[],
+ int *mac_n)
+{
+ struct priv *priv = dev->data->dev_private;
+ int iface_idx = mlx5_ifindex(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ifinfomsg ifm;
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
+ .nlmsg_type = RTM_GETNEIGH,
+ .nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
+ },
+ .ifm = {
+ .ifi_family = PF_BRIDGE,
+ .ifi_index = iface_idx,
+ },
+ };
+ struct mlx5_nl_mac_addr data = {
+ .mac = mac,
+ .mac_n = 0,
+ };
+ int fd;
+ int ret;
+ uint32_t sn = priv->nl_sn++;
+
+ if (priv->nl_socket == -1)
+ return 0;
+ fd = priv->nl_socket;
+ ret = mlx5_nl_request(fd, &req.hdr, sn, &req.ifm,
+ sizeof(struct ifinfomsg));
+ if (ret < 0)
+ goto error;
+ ret = mlx5_nl_recv(fd, sn, mlx5_nl_mac_addr_cb, &data);
+ if (ret < 0)
+ goto error;
+ *mac_n = data.mac_n;
+ return 0;
+error:
+ DRV_LOG(DEBUG, "port %u cannot retrieve MAC address list %s",
+ dev->data->port_id, strerror(rte_errno));
+ return -rte_errno;
+}
+
+/**
+ * Modify the MAC address neighbour table with Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac
+ * MAC address to consider.
+ * @param add
+ * 1 to add the MAC address, 0 to remove the MAC address.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_mac_addr_modify(struct rte_eth_dev *dev, struct ether_addr *mac,
+ int add)
+{
+ struct priv *priv = dev->data->dev_private;
+ int iface_idx = mlx5_ifindex(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ndmsg ndm;
+ struct rtattr rta;
+ uint8_t buffer[ETHER_ADDR_LEN];
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ndmsg)),
+ .nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE |
+ NLM_F_EXCL | NLM_F_ACK,
+ .nlmsg_type = add ? RTM_NEWNEIGH : RTM_DELNEIGH,
+ },
+ .ndm = {
+ .ndm_family = PF_BRIDGE,
+ .ndm_state = NUD_NOARP | NUD_PERMANENT,
+ .ndm_ifindex = iface_idx,
+ .ndm_flags = NTF_SELF,
+ },
+ .rta = {
+ .rta_type = NDA_LLADDR,
+ .rta_len = RTA_LENGTH(ETHER_ADDR_LEN),
+ },
+ };
+ int fd;
+ int ret;
+ uint32_t sn = priv->nl_sn++;
+
+ if (priv->nl_socket == -1)
+ return 0;
+ fd = priv->nl_socket;
+ memcpy(RTA_DATA(&req.rta), mac, ETHER_ADDR_LEN);
+ req.hdr.nlmsg_len = NLMSG_ALIGN(req.hdr.nlmsg_len) +
+ RTA_ALIGN(req.rta.rta_len);
+ ret = mlx5_nl_send(fd, &req.hdr, sn);
+ if (ret < 0)
+ goto error;
+ ret = mlx5_nl_recv(fd, sn, NULL, NULL);
+ if (ret < 0)
+ goto error;
+ return 0;
+error:
+ DRV_LOG(DEBUG,
+ "port %u cannot %s MAC address %02X:%02X:%02X:%02X:%02X:%02X"
+ " %s",
+ dev->data->port_id,
+ add ? "add" : "remove",
+ mac->addr_bytes[0], mac->addr_bytes[1],
+ mac->addr_bytes[2], mac->addr_bytes[3],
+ mac->addr_bytes[4], mac->addr_bytes[5],
+ strerror(rte_errno));
+ return -rte_errno;
+}
+
+/**
+ * Add a MAC address.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac_addr
+ * MAC address to register.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac)
+{
+ struct ether_addr macs[MLX5_MAX_MAC_ADDRESSES];
+ int macs_n = 0;
+ int i;
+ int mac_index = MLX5_MAX_MAC_ADDRESSES;
+ int ret;
+
+ ret = mlx5_nl_mac_addr_list(dev, &macs, &macs_n);
+ if (ret)
+ return ret;
+ assert(macs_n);
+ for (i = 0; i != macs_n; ++i) {
+ if (!memcmp(&macs[i], mac, ETHER_ADDR_LEN)) {
+ mac_index = i;
+ break;
+ }
+ }
+ if (mac_index != MLX5_MAX_MAC_ADDRESSES) {
+ DRV_LOG(INFO, "port %u MAC address already added",
+ dev->data->port_id);
+ rte_errno = EADDRINUSE;
+ return -rte_errno;
+ }
+ return mlx5_nl_mac_addr_modify(dev, mac, 1);
+}
+
+/**
+ * Remove a MAC address.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param index
+ * MAC address to remove.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac)
+{
+ struct ether_addr macs[MLX5_MAX_MAC_ADDRESSES];
+ int macs_n = 0;
+ int i;
+ int mac_index = MLX5_MAX_MAC_ADDRESSES;
+ int ret;
+
+ ret = mlx5_nl_mac_addr_list(dev, &macs, &macs_n);
+ if (ret)
+ return ret;
+ for (i = 0; i != macs_n; ++i) {
+ if (!memcmp(&macs[i], mac, ETHER_ADDR_LEN)) {
+ mac_index = i;
+ break;
+ }
+ }
+ if (mac_index == MLX5_MAX_MAC_ADDRESSES) {
+ DRV_LOG(INFO, "port %u MAC address not found",
+ dev->data->port_id);
+ rte_errno = EINVAL;
+ return -rte_errno;
+ }
+ return mlx5_nl_mac_addr_modify(dev, mac, 0);
+}
+
+/**
+ * Flush all added MAC addresses.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ */
+void
+mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev)
+{
+ int i;
+ const struct ether_addr mac_null = { .addr_bytes = { 0 }, };
+
+ for (i = MLX5_MAX_MAC_ADDRESSES - 1; i >= 0; --i) {
+ struct ether_addr *m = &dev->data->mac_addrs[i];
+
+ if (memcmp(&mac_null, m, ETHER_ADDR_LEN))
+ mlx5_nl_mac_addr_remove(dev, m);
+ }
+}
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v3 2/3] net/mlx5: use Netlink to enable promisc / all multicast mode
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (10 preceding siblings ...)
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
@ 2018-03-21 13:40 ` Nelio Laranjeiro
2018-03-22 7:36 ` Shahaf Shuler
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 3/3] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
12 siblings, 1 reply; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-21 13:40 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
VF devices are not able to receive promisc or allmulti traffic unless it
fully requests it though Netlink. This will cause the request to be
processed by the PF which will handle the request and enable it.
This requires the VF to be trusted by the PF.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
drivers/net/mlx5/mlx5_nl.c | 94 ++++++++++++++++++++++++++++++++++++++++++
drivers/net/mlx5/mlx5_rxmode.c | 8 ++++
2 files changed, 102 insertions(+)
diff --git a/drivers/net/mlx5/mlx5_nl.c b/drivers/net/mlx5/mlx5_nl.c
index cf0df0c2f..7d97e1139 100644
--- a/drivers/net/mlx5/mlx5_nl.c
+++ b/drivers/net/mlx5/mlx5_nl.c
@@ -528,3 +528,97 @@ mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev)
mlx5_nl_mac_addr_remove(dev, m);
}
}
+
+/**
+ * Enable promiscuous / all multicast mode through Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param flags
+ * IFF_PROMISC for promiscuous, IFF_ALLMULTI for allmulti.
+ * @param enable
+ * Nonzero to enable, disable otherwise.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_device_flags(struct rte_eth_dev *dev, uint32_t flags, int enable)
+{
+ struct priv *priv = dev->data->dev_private;
+ int iface_idx = mlx5_ifindex(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ifinfomsg ifi;
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
+ .nlmsg_type = RTM_NEWLINK,
+ .nlmsg_flags = NLM_F_REQUEST,
+ },
+ .ifi = {
+ .ifi_flags = enable ? flags : 0,
+ .ifi_change = flags,
+ .ifi_index = iface_idx,
+ },
+ };
+ int fd;
+ int ret;
+
+ assert(!(flags & ~(IFF_PROMISC | IFF_ALLMULTI)));
+ if (priv->nl_socket < 0)
+ return 0;
+ fd = priv->nl_socket;
+ ret = mlx5_nl_send(fd, &req.hdr, priv->nl_sn++);
+ if (ret < 0)
+ return ret;
+ return 0;
+}
+
+/**
+ * Enable promiscuous mode through Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param enable
+ * Nonzero to enable, disable otherwise.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_promisc(struct rte_eth_dev *dev, int enable)
+{
+ int ret = mlx5_nl_device_flags(dev, IFF_PROMISC, enable);
+
+ if (ret)
+ DRV_LOG(DEBUG,
+ "port %u cannot %s promisc mode: Netlink error %s",
+ dev->data->port_id, enable ? "enable" : "disable",
+ strerror(rte_errno));
+ return ret;
+}
+
+/**
+ * Enable all multicast mode through Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param enable
+ * Nonzero to enable, disable otherwise.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_allmulti(struct rte_eth_dev *dev, int enable)
+{
+ int ret = mlx5_nl_device_flags(dev, IFF_ALLMULTI, enable);
+
+ if (ret)
+ DRV_LOG(DEBUG,
+ "port %u cannot %s allmulti mode: Netlink error %s",
+ dev->data->port_id, enable ? "enable" : "disable",
+ strerror(rte_errno));
+ return ret;
+}
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index e43a4b030..c1c0f21c7 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -35,6 +35,8 @@ mlx5_promiscuous_enable(struct rte_eth_dev *dev)
int ret;
dev->data->promiscuous = 1;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_promisc(dev, 1);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot enable promiscuous mode: %s",
@@ -53,6 +55,8 @@ mlx5_promiscuous_disable(struct rte_eth_dev *dev)
int ret;
dev->data->promiscuous = 0;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_promisc(dev, 0);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot disable promiscuous mode: %s",
@@ -71,6 +75,8 @@ mlx5_allmulticast_enable(struct rte_eth_dev *dev)
int ret;
dev->data->all_multicast = 1;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_allmulti(dev, 1);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot enable allmulicast mode: %s",
@@ -89,6 +95,8 @@ mlx5_allmulticast_disable(struct rte_eth_dev *dev)
int ret;
dev->data->all_multicast = 0;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_allmulti(dev, 0);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot disable allmulicast mode: %s",
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v3 3/3] net/mlx5: add a parameter for Netlink support in VF
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (11 preceding siblings ...)
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 2/3] net/mlx5: use Netlink to enable promisc / all multicast mode Nelio Laranjeiro
@ 2018-03-21 13:40 ` Nelio Laranjeiro
2018-03-22 7:38 ` Shahaf Shuler
12 siblings, 1 reply; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-03-21 13:40 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
All Netlink request the PMD will do can also be done by a iproute2 command
line interface, letting the operator configure the VF behavior without
having to modify the application nor reaching PMD limits (e.g. MAC
address number limit).
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
doc/guides/nics/mlx5.rst | 11 +++++++++++
drivers/net/mlx5/mlx5.c | 9 ++++++++-
drivers/net/mlx5/mlx5.h | 1 +
3 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 46d26e4c8..4c7bf1c56 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -135,6 +135,8 @@ Limitations
- Flows with a VXLAN Network Identifier equal (or ends to be equal)
to 0 are not supported.
- VXLAN TSO and checksum offloads are not supported on VM.
+- VF: flow rules created on VF devices can only match traffic targeted at the
+ configured MAC addresses (see ``rte_eth_dev_mac_addr_add()``).
Statistics
----------
@@ -335,6 +337,15 @@ Run-time configuration
Enabled by default.
+- ``vf_nl_en`` parameter [int]
+
+ A nonzero value enables Netlink requests from the VF to add/remove MAC
+ addresses or/and enable/disable promiscuous/all multicast on the Netdevice.
+ Otherwise the relevant configuration must be run with Linux iproute2 tools.
+ This is a prerequisite to receive this kind of traffic.
+
+ Enabled by default, valid only on VF devices ignored otherwise.
+
Prerequisites
-------------
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 5ec152ab1..89bada409 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -69,6 +69,9 @@
/* Device parameter to enable hardware Rx vector. */
#define MLX5_RX_VEC_EN "rx_vec_en"
+/* Activate Netlink support in VF mode. */
+#define MLX5_VF_NL_EN "vf_nl_en"
+
#ifndef HAVE_IBV_MLX5_MOD_MPW
#define MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED (1 << 2)
#define MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW (1 << 3)
@@ -412,6 +415,8 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
config->tx_vec_en = !!tmp;
} else if (strcmp(MLX5_RX_VEC_EN, key) == 0) {
config->rx_vec_en = !!tmp;
+ } else if (strcmp(MLX5_VF_NL_EN, key) == 0) {
+ config->vf_nl_en = !!tmp;
} else {
DRV_LOG(WARNING, "%s: unknown parameter", key);
rte_errno = EINVAL;
@@ -443,6 +448,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
MLX5_TXQ_MAX_INLINE_LEN,
MLX5_TX_VEC_EN,
MLX5_RX_VEC_EN,
+ MLX5_VF_NL_EN,
NULL,
};
struct rte_kvargs *kvlist;
@@ -748,6 +754,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
.txq_inline = MLX5_ARG_UNSET,
.txqs_inline = MLX5_ARG_UNSET,
.inline_max_packet_sz = MLX5_ARG_UNSET,
+ .vf_nl_en = 1,
};
len = snprintf(name, sizeof(name), PCI_PRI_FMT,
@@ -964,7 +971,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
/* Register MAC address. */
priv->nl_socket = -1;
priv->nl_sn = 0;
- if (vf) {
+ if (vf && config.vf_nl_en) {
priv->nl_socket = mlx5_nl_init(RTMGRP_LINK);
if (priv->nl_socket < 0)
priv->nl_socket = -1;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 6a7e9f310..7cabf43cc 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -88,6 +88,7 @@ struct mlx5_dev_config {
unsigned int tx_vec_en:1; /* Tx vector is enabled. */
unsigned int rx_vec_en:1; /* Rx vector is enabled. */
unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
+ unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
unsigned int ind_table_max_size; /* Maximum indirection table size. */
int txq_inline; /* Maximum packet size for inlining. */
--
2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
@ 2018-03-22 7:34 ` Shahaf Shuler
2018-03-22 9:04 ` Nélio Laranjeiro
2018-03-22 7:44 ` Shahaf Shuler
1 sibling, 1 reply; 32+ messages in thread
From: Shahaf Shuler @ 2018-03-22 7:34 UTC (permalink / raw)
To: Nélio Laranjeiro, Adrien Mazarguil, Yongseok Koh; +Cc: dev
Hi Nelio,
Wednesday, March 21, 2018 3:40 PM, Nelio Laranjeiro:
> VF devices are not able to receive traffic unless it fully requests it though
> Netlink. This will cause the request to be processed by the PF which will
> add/remove the MAC address to the VF table if the VF is trusted.
>
> Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---
> drivers/net/mlx5/Makefile | 1 +
> drivers/net/mlx5/mlx5.c | 22 ++
> drivers/net/mlx5/mlx5.h | 13 +
> drivers/net/mlx5/mlx5_ethdev.c | 27 +++
> drivers/net/mlx5/mlx5_mac.c | 25 +-
> drivers/net/mlx5/mlx5_nl.c | 530
> +++++++++++++++++++++++++++++++++++++++++
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> faacfd9d6..6a7e9f310 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -78,6 +78,7 @@ struct mlx5_dev_config {
> unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
> unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
> unsigned int hw_padding:1; /* End alignment padding is supported.
> */
> + unsigned int vf:1; /* This is a VF. */
> unsigned int mps:2; /* Multi-packet send supported mode. */
> unsigned int tunnel_en:1;
> /* Whether tunnel stateless offloads are supported. */ @@ -154,6
> +155,8 @@ struct priv {
> struct mlx5_dev_config config; /* Device configuration. */
> struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
> /* Context for Verbs allocator. */
> + int nl_socket; /* Netlink socket. */
> + uint32_t nl_sn; /* Netlink message sequence number. */
> };
>
> /* mlx5.c */
> @@ -163,6 +166,7 @@ int mlx5_getenv_int(const char *);
> /* mlx5_ethdev.c */
>
> int mlx5_get_ifname(const struct rte_eth_dev *dev, char
> (*ifname)[IF_NAMESIZE]);
> +int mlx5_ifindex(const struct rte_eth_dev *dev);
> int mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr); int
> mlx5_get_mtu(struct rte_eth_dev *dev, uint16_t *mtu); int
> mlx5_set_flags(struct rte_eth_dev *dev, unsigned int keep, @@ -297,4
> +301,13 @@ struct mlx5_mr *mlx5_mr_get(struct rte_eth_dev *dev, struct
> rte_mempool *mp); int mlx5_mr_release(struct mlx5_mr *mr); int
> mlx5_mr_verify(struct rte_eth_dev *dev);
>
> +/* mlx5_nl.c */
> +
> +int mlx5_nl_init(uint32_t nlgroups);
> +int mlx5_nl_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr
> +*mac); int mlx5_nl_mac_addr_remove(struct rte_eth_dev *dev, struct
> +ether_addr *mac); void mlx5_nl_mac_addr_flush(struct rte_eth_dev
> *dev);
I think the two below should be introduced only on the next patch of the series.
> +int mlx5_nl_promisc(struct rte_eth_dev *dev, int enable); int
> +mlx5_nl_allmulti(struct rte_eth_dev *dev, int enable);
> +
> #endif /* RTE_PMD_MLX5_H_ */
[...]
> +/**
> + * Flush all added MAC addresses.
> + *
> + * @param dev
> + * Pointer to Ethernet device.
> + */
> +void
> +mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev) {
> + int i;
> + const struct ether_addr mac_null = { .addr_bytes = { 0 }, };
> +
> + for (i = MLX5_MAX_MAC_ADDRESSES - 1; i >= 0; --i) {
> + struct ether_addr *m = &dev->data->mac_addrs[i];
> +
> + if (memcmp(&mac_null, m, ETHER_ADDR_LEN))
> + mlx5_nl_mac_addr_remove(dev, m);
> + }
> +}
What if the DPDK process is terminated ungracefully? I think the MAC table will remain with all the MACs which were added.
The next run of the process may have un-expected results.
Should we flush the neighbor mac table also on probe to make sure only the VF mac exists?
> --
> 2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH v3 2/3] net/mlx5: use Netlink to enable promisc / all multicast mode
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 2/3] net/mlx5: use Netlink to enable promisc / all multicast mode Nelio Laranjeiro
@ 2018-03-22 7:36 ` Shahaf Shuler
0 siblings, 0 replies; 32+ messages in thread
From: Shahaf Shuler @ 2018-03-22 7:36 UTC (permalink / raw)
To: Nélio Laranjeiro, Adrien Mazarguil, Yongseok Koh; +Cc: dev
Wednesday, March 21, 2018 3:40 PM, Nelio Laranjeiro:
> VF devices are not able to receive promisc or allmulti traffic unless it fully
> requests it though Netlink. This will cause the request to be processed by
> the PF which will handle the request and enable it.
>
> This requires the VF to be trusted by the PF.
>
> Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---
> drivers/net/mlx5/mlx5_nl.c | 94
> ++++++++++++++++++++++++++++++++++++++++++
> drivers/net/mlx5/mlx5_rxmode.c | 8 ++++
> 2 files changed, 102 insertions(+)
[...]
> diff --git a/drivers/net/mlx5/mlx5_rxmode.c
> b/drivers/net/mlx5/mlx5_rxmode.c index e43a4b030..c1c0f21c7 100644
> --- a/drivers/net/mlx5/mlx5_rxmode.c
> +++ b/drivers/net/mlx5/mlx5_rxmode.c
> @@ -35,6 +35,8 @@ mlx5_promiscuous_enable(struct rte_eth_dev *dev)
> int ret;
>
> dev->data->promiscuous = 1;
> + if (((struct priv *)dev->data->dev_private)->config.vf)
> + mlx5_nl_promisc(dev, 1);
> ret = mlx5_traffic_restart(dev);
> if (ret)
> DRV_LOG(ERR, "port %u cannot enable promiscuous mode:
> %s", @@ -53,6 +55,8 @@ mlx5_promiscuous_disable(struct rte_eth_dev
> *dev)
> int ret;
>
> dev->data->promiscuous = 0;
> + if (((struct priv *)dev->data->dev_private)->config.vf)
> + mlx5_nl_promisc(dev, 0);
Same comment: What if the DPDK process is terminated ungracefully?
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH v3 3/3] net/mlx5: add a parameter for Netlink support in VF
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 3/3] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
@ 2018-03-22 7:38 ` Shahaf Shuler
0 siblings, 0 replies; 32+ messages in thread
From: Shahaf Shuler @ 2018-03-22 7:38 UTC (permalink / raw)
To: Nélio Laranjeiro, Adrien Mazarguil, Yongseok Koh; +Cc: dev
Wednesday, March 21, 2018 3:40 PM, Nelio Laranjeiro:
> All Netlink request the PMD will do can also be done by a iproute2 command
> line interface, letting the operator configure the VF behavior without having
> to modify the application nor reaching PMD limits (e.g. MAC address number
> limit).
>
> Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---
> doc/guides/nics/mlx5.rst | 11 +++++++++++ drivers/net/mlx5/mlx5.c | 9
> ++++++++- drivers/net/mlx5/mlx5.h | 1 +
> 3 files changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index
> 46d26e4c8..4c7bf1c56 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -135,6 +135,8 @@ Limitations
> - Flows with a VXLAN Network Identifier equal (or ends to be equal)
> to 0 are not supported.
> - VXLAN TSO and checksum offloads are not supported on VM.
> +- VF: flow rules created on VF devices can only match traffic targeted
> +at the
> + configured MAC addresses (see ``rte_eth_dev_mac_addr_add()``).
Should this comment be in the first patch of the series?
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
2018-03-22 7:34 ` Shahaf Shuler
@ 2018-03-22 7:44 ` Shahaf Shuler
1 sibling, 0 replies; 32+ messages in thread
From: Shahaf Shuler @ 2018-03-22 7:44 UTC (permalink / raw)
To: Nélio Laranjeiro, Adrien Mazarguil, Yongseok Koh; +Cc: dev
Thursday, March 22, 2018 9:35 AM, Shahaf Shuler:
> Hi Nelio,
>
> Wednesday, March 21, 2018 3:40 PM, Nelio Laranjeiro:
> > VF devices are not able to receive traffic unless it fully requests it
> > though Netlink. This will cause the request to be processed by the PF
> > which will add/remove the MAC address to the VF table if the VF is trusted.
> >
> > Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > ---
> > drivers/net/mlx5/Makefile | 1 +
> > drivers/net/mlx5/mlx5.c | 22 ++
> > drivers/net/mlx5/mlx5.h | 13 +
> > drivers/net/mlx5/mlx5_ethdev.c | 27 +++
> > drivers/net/mlx5/mlx5_mac.c | 25 +-
> > drivers/net/mlx5/mlx5_nl.c | 530
> > +++++++++++++++++++++++++++++++++++++++++
> > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> > faacfd9d6..6a7e9f310 100644
> > --- a/drivers/net/mlx5/mlx5.h
> > +++ b/drivers/net/mlx5/mlx5.h
Sorry sent to early.
Another general question - is there required netlink version exists on all the relevant distros? What is the minimal kernel version required?
> > @@ -78,6 +78,7 @@ struct mlx5_dev_config {
> > unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
> > unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
> > unsigned int hw_padding:1; /* End alignment padding is supported.
> > */
> > + unsigned int vf:1; /* This is a VF. */
> > unsigned int mps:2; /* Multi-packet send supported mode. */
> > unsigned int tunnel_en:1;
> > /* Whether tunnel stateless offloads are supported. */ @@ -154,6
> > +155,8 @@ struct priv {
> > struct mlx5_dev_config config; /* Device configuration. */
> > struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
> > /* Context for Verbs allocator. */
> > + int nl_socket; /* Netlink socket. */
> > + uint32_t nl_sn; /* Netlink message sequence number. */
> > };
> >
> > /* mlx5.c */
> > @@ -163,6 +166,7 @@ int mlx5_getenv_int(const char *);
> > /* mlx5_ethdev.c */
> >
> > int mlx5_get_ifname(const struct rte_eth_dev *dev, char
> > (*ifname)[IF_NAMESIZE]);
> > +int mlx5_ifindex(const struct rte_eth_dev *dev);
> > int mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq
> > *ifr); int mlx5_get_mtu(struct rte_eth_dev *dev, uint16_t *mtu); int
> > mlx5_set_flags(struct rte_eth_dev *dev, unsigned int keep, @@ -297,4
> > +301,13 @@ struct mlx5_mr *mlx5_mr_get(struct rte_eth_dev *dev,
> struct
> > rte_mempool *mp); int mlx5_mr_release(struct mlx5_mr *mr); int
> > mlx5_mr_verify(struct rte_eth_dev *dev);
> >
> > +/* mlx5_nl.c */
> > +
> > +int mlx5_nl_init(uint32_t nlgroups);
> > +int mlx5_nl_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr
> > +*mac); int mlx5_nl_mac_addr_remove(struct rte_eth_dev *dev, struct
> > +ether_addr *mac); void mlx5_nl_mac_addr_flush(struct rte_eth_dev
> > *dev);
>
> I think the two below should be introduced only on the next patch of the
> series.
>
> > +int mlx5_nl_promisc(struct rte_eth_dev *dev, int enable); int
> > +mlx5_nl_allmulti(struct rte_eth_dev *dev, int enable);
> > +
> > #endif /* RTE_PMD_MLX5_H_ */
>
> [...]
>
> > +/**
> > + * Flush all added MAC addresses.
> > + *
> > + * @param dev
> > + * Pointer to Ethernet device.
> > + */
> > +void
> > +mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev) {
> > + int i;
> > + const struct ether_addr mac_null = { .addr_bytes = { 0 }, };
> > +
> > + for (i = MLX5_MAX_MAC_ADDRESSES - 1; i >= 0; --i) {
> > + struct ether_addr *m = &dev->data->mac_addrs[i];
> > +
> > + if (memcmp(&mac_null, m, ETHER_ADDR_LEN))
> > + mlx5_nl_mac_addr_remove(dev, m);
> > + }
> > +}
>
> What if the DPDK process is terminated ungracefully? I think the MAC table
> will remain with all the MACs which were added.
> The next run of the process may have un-expected results.
>
> Should we flush the neighbor mac table also on probe to make sure only the
> VF mac exists?
>
> > --
> > 2.11.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses
2018-03-22 7:34 ` Shahaf Shuler
@ 2018-03-22 9:04 ` Nélio Laranjeiro
2018-03-22 9:45 ` Shahaf Shuler
0 siblings, 1 reply; 32+ messages in thread
From: Nélio Laranjeiro @ 2018-03-22 9:04 UTC (permalink / raw)
To: Shahaf Shuler; +Cc: Adrien Mazarguil, Yongseok Koh, dev
On Thu, Mar 22, 2018 at 07:34:50AM +0000, Shahaf Shuler wrote:
> Hi Nelio,
>
> Wednesday, March 21, 2018 3:40 PM, Nelio Laranjeiro:
> > VF devices are not able to receive traffic unless it fully requests it though
> > Netlink. This will cause the request to be processed by the PF which will
> > add/remove the MAC address to the VF table if the VF is trusted.
> >
> > Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > ---
> > drivers/net/mlx5/Makefile | 1 +
> > drivers/net/mlx5/mlx5.c | 22 ++
> > drivers/net/mlx5/mlx5.h | 13 +
> > drivers/net/mlx5/mlx5_ethdev.c | 27 +++
> > drivers/net/mlx5/mlx5_mac.c | 25 +-
> > drivers/net/mlx5/mlx5_nl.c | 530
> > +++++++++++++++++++++++++++++++++++++++++
> > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> > faacfd9d6..6a7e9f310 100644
> > --- a/drivers/net/mlx5/mlx5.h
> > +++ b/drivers/net/mlx5/mlx5.h
> > @@ -78,6 +78,7 @@ struct mlx5_dev_config {
> > unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
> > unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
> > unsigned int hw_padding:1; /* End alignment padding is supported.
> > */
> > + unsigned int vf:1; /* This is a VF. */
> > unsigned int mps:2; /* Multi-packet send supported mode. */
> > unsigned int tunnel_en:1;
> > /* Whether tunnel stateless offloads are supported. */ @@ -154,6
> > +155,8 @@ struct priv {
> > struct mlx5_dev_config config; /* Device configuration. */
> > struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
> > /* Context for Verbs allocator. */
> > + int nl_socket; /* Netlink socket. */
> > + uint32_t nl_sn; /* Netlink message sequence number. */
> > };
> >
> > /* mlx5.c */
> > @@ -163,6 +166,7 @@ int mlx5_getenv_int(const char *);
> > /* mlx5_ethdev.c */
> >
> > int mlx5_get_ifname(const struct rte_eth_dev *dev, char
> > (*ifname)[IF_NAMESIZE]);
> > +int mlx5_ifindex(const struct rte_eth_dev *dev);
> > int mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr); int
> > mlx5_get_mtu(struct rte_eth_dev *dev, uint16_t *mtu); int
> > mlx5_set_flags(struct rte_eth_dev *dev, unsigned int keep, @@ -297,4
> > +301,13 @@ struct mlx5_mr *mlx5_mr_get(struct rte_eth_dev *dev, struct
> > rte_mempool *mp); int mlx5_mr_release(struct mlx5_mr *mr); int
> > mlx5_mr_verify(struct rte_eth_dev *dev);
> >
> > +/* mlx5_nl.c */
> > +
> > +int mlx5_nl_init(uint32_t nlgroups);
> > +int mlx5_nl_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr
> > +*mac); int mlx5_nl_mac_addr_remove(struct rte_eth_dev *dev, struct
> > +ether_addr *mac); void mlx5_nl_mac_addr_flush(struct rte_eth_dev
> > *dev);
>
> I think the two below should be introduced only on the next patch of the series.
Right,
>
> > +int mlx5_nl_promisc(struct rte_eth_dev *dev, int enable); int
> > +mlx5_nl_allmulti(struct rte_eth_dev *dev, int enable);
> > +
> > #endif /* RTE_PMD_MLX5_H_ */
>
> [...]
>
> > +/**
> > + * Flush all added MAC addresses.
> > + *
> > + * @param dev
> > + * Pointer to Ethernet device.
> > + */
> > +void
> > +mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev) {
> > + int i;
> > + const struct ether_addr mac_null = { .addr_bytes = { 0 }, };
> > +
> > + for (i = MLX5_MAX_MAC_ADDRESSES - 1; i >= 0; --i) {
> > + struct ether_addr *m = &dev->data->mac_addrs[i];
> > +
> > + if (memcmp(&mac_null, m, ETHER_ADDR_LEN))
> > + mlx5_nl_mac_addr_remove(dev, m);
> > + }
> > +}
>
> What if the DPDK process is terminated ungracefully? I think the MAC
> table will remain with all the MACs which were added.
> The next run of the process may have un-expected results.
>
> Should we flush the neighbor mac table also on probe to make sure only
> the VF mac exists?
In such situation the sysadmin should make the clean up, the DPDK
application cannot consider it is the only one using the device as it is
not the case, Linux still owns the device.
We have no guarantee the admin did not use another MAC address for a
service outside of the DPDK application (even if in such case he should
disable this feature to fully control what happens on the neighbor mac
table).
Thanks,
--
Nélio Laranjeiro
6WIND
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses
2018-03-22 9:04 ` Nélio Laranjeiro
@ 2018-03-22 9:45 ` Shahaf Shuler
2018-03-22 10:28 ` Nélio Laranjeiro
0 siblings, 1 reply; 32+ messages in thread
From: Shahaf Shuler @ 2018-03-22 9:45 UTC (permalink / raw)
To: Nélio Laranjeiro; +Cc: Adrien Mazarguil, Yongseok Koh, dev
Thursday, March 22, 2018 11:05 AM, Nélio Laranjeiro:
> > What if the DPDK process is terminated ungracefully? I think the MAC
> > table will remain with all the MACs which were added.
> > The next run of the process may have un-expected results.
> >
> > Should we flush the neighbor mac table also on probe to make sure only
> > the VF mac exists?
>
> In such situation the sysadmin should make the clean up, the DPDK
> application cannot consider it is the only one using the device as it is not the
> case, Linux still owns the device.
> We have no guarantee the admin did not use another MAC address for a
> service outside of the DPDK application (even if in such case he should
> disable this feature to fully control what happens on the neighbor mac table).
There should be only one owner for the VF mac tables - either the DPDK or the Linux.
If the DPDK is the owner, it should do the cleanup. If the sysadmin wants to be the owner then it should set the proper flag when running dpdk or alternatively not set the VF in trusted mode.
>
> Thanks,
>
> --
> Nélio Laranjeiro
> 6WIND
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses
2018-03-22 9:45 ` Shahaf Shuler
@ 2018-03-22 10:28 ` Nélio Laranjeiro
2018-03-28 5:56 ` Shahaf Shuler
0 siblings, 1 reply; 32+ messages in thread
From: Nélio Laranjeiro @ 2018-03-22 10:28 UTC (permalink / raw)
To: Shahaf Shuler; +Cc: Adrien Mazarguil, Yongseok Koh, dev
On Thu, Mar 22, 2018 at 09:45:16AM +0000, Shahaf Shuler wrote:
> Thursday, March 22, 2018 11:05 AM, Nélio Laranjeiro:
> > > What if the DPDK process is terminated ungracefully? I think the MAC
> > > table will remain with all the MACs which were added.
> > > The next run of the process may have un-expected results.
> > >
> > > Should we flush the neighbor mac table also on probe to make sure only
> > > the VF mac exists?
> >
> > In such situation the sysadmin should make the clean up, the DPDK
> > application cannot consider it is the only one using the device as it is not the
> > case, Linux still owns the device.
> > We have no guarantee the admin did not use another MAC address for a
> > service outside of the DPDK application (even if in such case he should
> > disable this feature to fully control what happens on the neighbor mac table).
>
> There should be only one owner for the VF mac tables - either the DPDK or the Linux.
>
> If the DPDK is the owner, it should do the cleanup. If the sysadmin
> wants to be the owner then it should set the proper flag when running
> dpdk or alternatively not set the VF in trusted mode.
It is a little more complex than what you think, according to the Linux
configuration the tables will have entries already present or not e.g.
broadcast v6 is present only if the Linux supports the v6 (by
compilation or configuration).
For such reason the PMD cannot consider it owns the bridge tables when
acting as a VF, otherwise it needs to trigger any Linux configuration
modification and enable the according functionalities.
To answer this:
> There should be only one owner for the VF mac tables - either the DPDK or the Linux.
That's right, DPDK application or Linux should own the device, the PMD
is none of the two. It is just a driver.
Thanks,
--
Nélio Laranjeiro
6WIND
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses
2018-03-22 10:28 ` Nélio Laranjeiro
@ 2018-03-28 5:56 ` Shahaf Shuler
0 siblings, 0 replies; 32+ messages in thread
From: Shahaf Shuler @ 2018-03-28 5:56 UTC (permalink / raw)
To: Nélio Laranjeiro; +Cc: Adrien Mazarguil, Yongseok Koh, dev
Thursday, March 22, 2018 12:29 PM, Nélio Laranjeiro:
> On Thu, Mar 22, 2018 at 09:45:16AM +0000, Shahaf Shuler wrote:
> > Thursday, March 22, 2018 11:05 AM, Nélio Laranjeiro:
> > > > What if the DPDK process is terminated ungracefully? I think the
> > > > MAC table will remain with all the MACs which were added.
> > > > The next run of the process may have un-expected results.
> > > >
> > > > Should we flush the neighbor mac table also on probe to make sure
> > > > only the VF mac exists?
> > >
> > > In such situation the sysadmin should make the clean up, the DPDK
> > > application cannot consider it is the only one using the device as
> > > it is not the case, Linux still owns the device.
> > > We have no guarantee the admin did not use another MAC address for a
> > > service outside of the DPDK application (even if in such case he
> > > should disable this feature to fully control what happens on the neighbor
> mac table).
> >
> > There should be only one owner for the VF mac tables - either the DPDK or
> the Linux.
> >
> > If the DPDK is the owner, it should do the cleanup. If the sysadmin
> > wants to be the owner then it should set the proper flag when running
> > dpdk or alternatively not set the VF in trusted mode.
>
> It is a little more complex than what you think, according to the Linux
> configuration the tables will have entries already present or not e.g.
> broadcast v6 is present only if the Linux supports the v6 (by compilation or
> configuration).
>
> For such reason the PMD cannot consider it owns the bridge tables when
> acting as a VF, otherwise it needs to trigger any Linux configuration
> modification and enable the according functionalities.
OK.
Then I think a proper documentation is needed as part of this patch in mlx guide.
To explain the application that a cleanup of the neighbor table is needed in case the PMD is terminated ungracefully.
>
> To answer this:
> > There should be only one owner for the VF mac tables - either the DPDK or
> the Linux.
>
> That's right, DPDK application or Linux should own the device, the PMD is
> none of the two. It is just a driver.
>
> Thanks,
>
> --
> Nélio Laranjeiro
> 6WIND
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v4 0/3] net/mlx5: use Netlink in VF mode
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 0/3] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
@ 2018-04-05 15:07 ` Nelio Laranjeiro
2018-04-08 8:16 ` Shahaf Shuler
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 1/3] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
` (2 subsequent siblings)
3 siblings, 1 reply; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-04-05 15:07 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
When MLX5 behaves in VF mode and the hypervisor have **trusted** this VF, to
be able to receive specific traffic some requests must be done to configure
the NIC. There is no API currently available to do it though Verbs, but there
is in Linux side using Netlink.
The specific cases are:
- Enable/disable promiscuous mode.
- Enable/disable allmulti mode.
- Add/remove mac addresses.
Changes in v4:
Track MAC addresses added by the PMD and not present in the bridge table to
remove them when leaving.
Changes in v3:
Add missing room to add the MAC address in the Netlink message.
Changes in v2:
Embed Netlink socket communication inside the PMD.
Nelio Laranjeiro (3):
net/mlx5: use Netlink to add/remove MAC addresses
net/mlx5: use Netlink to enable promisc / allmulti mode
net/mlx5: add a parameter for Netlink support in VF
doc/guides/nics/mlx5.rst | 18 +
drivers/net/mlx5/Makefile | 1 +
drivers/net/mlx5/mlx5.c | 30 ++
drivers/net/mlx5/mlx5.h | 19 +
drivers/net/mlx5/mlx5_ethdev.c | 27 ++
drivers/net/mlx5/mlx5_mac.c | 20 +-
drivers/net/mlx5/mlx5_nl.c | 627 +++++++++++++++++++++++++++++++++
drivers/net/mlx5/mlx5_rxmode.c | 8 +
8 files changed, 747 insertions(+), 3 deletions(-)
create mode 100644 drivers/net/mlx5/mlx5_nl.c
--
2.17.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v4 1/3] net/mlx5: use Netlink to add/remove MAC addresses
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 0/3] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 " Nelio Laranjeiro
@ 2018-04-05 15:07 ` Nelio Laranjeiro
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 2/3] net/mlx5: use Netlink to enable promisc / allmulti mode Nelio Laranjeiro
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 3/3] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
3 siblings, 0 replies; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-04-05 15:07 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
VF devices are not able to receive traffic unless it fully requests it
though Netlink. This will cause the request to be processed by the PF
which will add/remove the MAC address to the VF table if the VF is trusted.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
doc/guides/nics/mlx5.rst | 9 +
drivers/net/mlx5/Makefile | 1 +
drivers/net/mlx5/mlx5.c | 23 ++
drivers/net/mlx5/mlx5.h | 16 +
drivers/net/mlx5/mlx5_ethdev.c | 27 ++
drivers/net/mlx5/mlx5_mac.c | 20 +-
drivers/net/mlx5/mlx5_nl.c | 533 +++++++++++++++++++++++++++++++++
7 files changed, 626 insertions(+), 3 deletions(-)
create mode 100644 drivers/net/mlx5/mlx5_nl.c
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 46d26e4c8..c812f0b4f 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -135,6 +135,15 @@ Limitations
- Flows with a VXLAN Network Identifier equal (or ends to be equal)
to 0 are not supported.
- VXLAN TSO and checksum offloads are not supported on VM.
+- VF: flow rules created on VF devices can only match traffic targeted at the
+ configured MAC addresses (see ``rte_eth_dev_mac_addr_add()``).
+
+.. note::
+
+ MAC addresses not already present in the bridge table of the associated
+ kernel network device will be added and cleaned up by the PMD when closing
+ the device. In case of ungraceful program termination, some entries may
+ remain present and should be removed manually by other means.
Statistics
----------
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 201f6f06a..ae118ad33 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y)
INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 7d58d66bb..e52c60fb3 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -13,6 +13,7 @@
#include <errno.h>
#include <net/if.h>
#include <sys/mman.h>
+#include <linux/rtnetlink.h>
/* Verbs header. */
/* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
@@ -205,6 +206,10 @@ mlx5_dev_close(struct rte_eth_dev *dev)
rte_free(priv->reta_idx);
if (priv->primary_socket)
mlx5_socket_uninit(dev);
+ if (priv->config.vf)
+ mlx5_nl_mac_addr_flush(dev);
+ if (priv->nl_socket >= 0)
+ close(priv->nl_socket);
ret = mlx5_hrxq_ibv_verify(dev);
if (ret)
DRV_LOG(WARNING, "port %u some hash Rx queue still remain",
@@ -597,6 +602,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
int err = 0;
struct ibv_context *attr_ctx = NULL;
struct ibv_device_attr_ex device_attr;
+ unsigned int vf;
unsigned int mps;
unsigned int cqe_comp;
unsigned int tunnel_en = 0;
@@ -646,6 +652,14 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
continue;
DRV_LOG(INFO, "PCI information matches, using device \"%s\"",
list[i]->name);
+ vf = ((pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX4VF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX4LXVF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX5VF) ||
+ (pci_dev->id.device_id ==
+ PCI_DEVICE_ID_MELLANOX_CONNECTX5EXVF));
attr_ctx = mlx5_glue->open_device(list[i]);
rte_errno = errno;
err = rte_errno;
@@ -869,6 +883,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
DRV_LOG(DEBUG,
"hardware Rx end alignment padding is %ssupported",
(config.hw_padding ? "" : "not "));
+ config.vf = vf;
config.tso = ((device_attr_ex.tso_caps.max_tso > 0) &&
(device_attr_ex.tso_caps.supported_qpts &
(1 << IBV_QPT_RAW_PACKET)));
@@ -946,6 +961,14 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
eth_dev->dev_ops = &mlx5_dev_ops;
/* Register MAC address. */
claim_zero(mlx5_mac_addr_add(eth_dev, &mac, 0, 0));
+ priv->nl_socket = -1;
+ priv->nl_sn = 0;
+ if (vf) {
+ priv->nl_socket = mlx5_nl_init(RTMGRP_LINK);
+ if (priv->nl_socket < 0)
+ priv->nl_socket = -1;
+ mlx5_nl_mac_addr_sync(eth_dev);
+ }
TAILQ_INIT(&priv->flows);
TAILQ_INIT(&priv->ctrl_flows);
/* Hint libmlx5 to use PMD allocator for data plane resources */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index faacfd9d6..683026b0f 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -78,6 +78,7 @@ struct mlx5_dev_config {
unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
unsigned int hw_padding:1; /* End alignment padding is supported. */
+ unsigned int vf:1; /* This is a VF. */
unsigned int mps:2; /* Multi-packet send supported mode. */
unsigned int tunnel_en:1;
/* Whether tunnel stateless offloads are supported. */
@@ -119,6 +120,8 @@ struct priv {
struct ibv_pd *pd; /* Protection Domain. */
char ibdev_path[IBV_SYSFS_PATH_MAX]; /* IB device path for secondary */
struct ether_addr mac[MLX5_MAX_MAC_ADDRESSES]; /* MAC addresses. */
+ BITFIELD_DECLARE(mac_own, uint64_t, MLX5_MAX_MAC_ADDRESSES);
+ /* Bit-field of MAC addresses owned by the PMD. */
uint16_t vlan_filter[MLX5_MAX_VLAN_IDS]; /* VLAN filters table. */
unsigned int vlan_filter_n; /* Number of configured VLAN filters. */
/* Device properties. */
@@ -154,6 +157,8 @@ struct priv {
struct mlx5_dev_config config; /* Device configuration. */
struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
/* Context for Verbs allocator. */
+ int nl_socket; /* Netlink socket. */
+ uint32_t nl_sn; /* Netlink message sequence number. */
};
/* mlx5.c */
@@ -163,6 +168,7 @@ int mlx5_getenv_int(const char *);
/* mlx5_ethdev.c */
int mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[IF_NAMESIZE]);
+int mlx5_ifindex(const struct rte_eth_dev *dev);
int mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr);
int mlx5_get_mtu(struct rte_eth_dev *dev, uint16_t *mtu);
int mlx5_set_flags(struct rte_eth_dev *dev, unsigned int keep,
@@ -297,4 +303,14 @@ struct mlx5_mr *mlx5_mr_get(struct rte_eth_dev *dev, struct rte_mempool *mp);
int mlx5_mr_release(struct mlx5_mr *mr);
int mlx5_mr_verify(struct rte_eth_dev *dev);
+/* mlx5_nl.c */
+
+int mlx5_nl_init(uint32_t nlgroups);
+int mlx5_nl_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
+ uint32_t index);
+int mlx5_nl_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac,
+ uint32_t index);
+void mlx5_nl_mac_addr_sync(struct rte_eth_dev *dev);
+void mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev);
+
#endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index b6f5101cf..bdd03c3d7 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -176,6 +176,33 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[IF_NAMESIZE])
return 0;
}
+/**
+ * Get the interface index from device name.
+ *
+ * @param[in] dev
+ * Pointer to Ethernet device.
+ *
+ * @return
+ * Interface index on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+int
+mlx5_ifindex(const struct rte_eth_dev *dev)
+{
+ char ifname[IF_NAMESIZE];
+ int ret;
+
+ ret = mlx5_get_ifname(dev, &ifname);
+ if (ret)
+ return ret;
+ ret = if_nametoindex(ifname);
+ if (ret == -1) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ return ret;
+}
+
/**
* Perform ifreq ioctl() on associated Ethernet device.
*
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index 01c7ba17a..e859fca6a 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -67,13 +67,19 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[ETHER_ADDR_LEN])
void
mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
{
+ struct priv *priv = dev->data->dev_private;
+ const int vf = priv->config.vf;
+ int ret;
+
assert(index < MLX5_MAX_MAC_ADDRESSES);
+ if (vf)
+ mlx5_nl_mac_addr_remove(dev, &dev->data->mac_addrs[index],
+ index);
memset(&dev->data->mac_addrs[index], 0, sizeof(struct ether_addr));
if (!dev->data->promiscuous) {
- int ret = mlx5_traffic_restart(dev);
-
+ ret = mlx5_traffic_restart(dev);
if (ret)
- DRV_LOG(ERR, "port %u cannot remove mac address: %s",
+ DRV_LOG(ERR, "port %u cannot restart traffic: %s",
dev->data->port_id, strerror(rte_errno));
}
}
@@ -97,6 +103,8 @@ int
mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
uint32_t index, uint32_t vmdq __rte_unused)
{
+ struct priv *priv = dev->data->dev_private;
+ const int vf = priv->config.vf;
unsigned int i;
assert(index < MLX5_MAX_MAC_ADDRESSES);
@@ -111,6 +119,12 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
rte_errno = EADDRINUSE;
return -rte_errno;
}
+ if (vf) {
+ int ret = mlx5_nl_mac_addr_add(dev, mac, index);
+
+ if (ret)
+ return ret;
+ }
dev->data->mac_addrs[index] = *mac;
if (!dev->data->promiscuous)
return mlx5_traffic_restart(dev);
diff --git a/drivers/net/mlx5/mlx5_nl.c b/drivers/net/mlx5/mlx5_nl.c
new file mode 100644
index 000000000..2f238beb3
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_nl.c
@@ -0,0 +1,533 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 6WIND S.A.
+ * Copyright 2018 Mellanox Technologies, Ltd
+ */
+
+#include <linux/netlink.h>
+#include <linux/rtnetlink.h>
+#include <unistd.h>
+
+#include "mlx5.h"
+#include "mlx5_utils.h"
+
+/* Size of the buffer to receive kernel messages */
+#define MLX5_NL_BUF_SIZE (32 * 1024)
+/* Send buffer size for the Netlink socket */
+#define MLX5_SEND_BUF_SIZE 32768
+/* Receive buffer size for the Netlink socket */
+#define MLX5_RECV_BUF_SIZE 32768
+
+/*
+ * Define NDA_RTA as defined in iproute2 sources.
+ *
+ * see in iproute2 sources file include/libnetlink.h
+ */
+#ifndef MLX5_NDA_RTA
+#define MLX5_NDA_RTA(r) \
+ ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct ndmsg))))
+#endif
+
+/* Add/remove MAC address through Netlink */
+struct mlx5_nl_mac_addr {
+ struct ether_addr (*mac)[];
+ /**< MAC address handled by the device. */
+ int mac_n; /**< Number of addresses in the array. */
+};
+
+/**
+ * Opens a Netlink socket.
+ *
+ * @param nl_groups
+ * Netlink group value (e.g. RTMGRP_LINK).
+ *
+ * @return
+ * A file descriptor on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+int
+mlx5_nl_init(uint32_t nl_groups)
+{
+ int fd;
+ int sndbuf_size = MLX5_SEND_BUF_SIZE;
+ int rcvbuf_size = MLX5_RECV_BUF_SIZE;
+ struct sockaddr_nl local = {
+ .nl_family = AF_NETLINK,
+ .nl_groups = nl_groups,
+ };
+ int ret;
+
+ fd = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
+ if (fd == -1) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ ret = setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf_size, sizeof(int));
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf_size, sizeof(int));
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ ret = bind(fd, (struct sockaddr *)&local, sizeof(local));
+ if (ret == -1) {
+ rte_errno = errno;
+ goto error;
+ }
+ return fd;
+error:
+ close(fd);
+ return -rte_errno;
+}
+
+/**
+ * Send a request message to the kernel on the Netlink socket.
+ *
+ * @param[in] nlsk_fd
+ * Netlink socket file descriptor.
+ * @param[in] nh
+ * The Netlink message send to the kernel.
+ * @param[in] ssn
+ * Sequence number.
+ * @param[in] req
+ * Pointer to the request structure.
+ * @param[in] len
+ * Length of the request in bytes.
+ *
+ * @return
+ * The number of sent bytes on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+static int
+mlx5_nl_request(int nlsk_fd, struct nlmsghdr *nh, uint32_t sn, void *req,
+ int len)
+{
+ struct sockaddr_nl sa = {
+ .nl_family = AF_NETLINK,
+ };
+ struct iovec iov[2] = {
+ { .iov_base = nh, .iov_len = sizeof(*nh), },
+ { .iov_base = req, .iov_len = len, },
+ };
+ struct msghdr msg = {
+ .msg_name = &sa,
+ .msg_namelen = sizeof(sa),
+ .msg_iov = iov,
+ .msg_iovlen = 2,
+ };
+ int send_bytes;
+
+ nh->nlmsg_pid = 0; /* communication with the kernel uses pid 0 */
+ nh->nlmsg_seq = sn;
+ send_bytes = sendmsg(nlsk_fd, &msg, 0);
+ if (send_bytes < 0) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ return send_bytes;
+}
+
+/**
+ * Send a message to the kernel on the Netlink socket.
+ *
+ * @param[in] nlsk_fd
+ * The Netlink socket file descriptor used for communication.
+ * @param[in] nh
+ * The Netlink message send to the kernel.
+ * @param[in] sn
+ * Sequence number.
+ *
+ * @return
+ * The number of sent bytes on success, a negative errno value otherwise and
+ * rte_errno is set.
+ */
+static int
+mlx5_nl_send(int nlsk_fd, struct nlmsghdr *nh, uint32_t sn)
+{
+ struct sockaddr_nl sa = {
+ .nl_family = AF_NETLINK,
+ };
+ struct iovec iov = {
+ .iov_base = nh,
+ .iov_len = nh->nlmsg_len,
+ };
+ struct msghdr msg = {
+ .msg_name = &sa,
+ .msg_namelen = sizeof(sa),
+ .msg_iov = &iov,
+ .msg_iovlen = 1,
+ };
+ int send_bytes;
+
+ nh->nlmsg_pid = 0; /* communication with the kernel uses pid 0 */
+ nh->nlmsg_seq = sn;
+ send_bytes = sendmsg(nlsk_fd, &msg, 0);
+ if (send_bytes < 0) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ return send_bytes;
+}
+
+/**
+ * Receive a message from the kernel on the Netlink socket, following
+ * mlx5_nl_send().
+ *
+ * @param[in] nlsk_fd
+ * The Netlink socket file descriptor used for communication.
+ * @param[in] sn
+ * Sequence number.
+ * @param[in] cb
+ * The callback function to call for each Netlink message received.
+ * @param[in, out] arg
+ * Custom arguments for the callback.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_recv(int nlsk_fd, uint32_t sn, int (*cb)(struct nlmsghdr *, void *arg),
+ void *arg)
+{
+ struct sockaddr_nl sa;
+ char buf[MLX5_RECV_BUF_SIZE];
+ struct iovec iov = {
+ .iov_base = buf,
+ .iov_len = sizeof(buf),
+ };
+ struct msghdr msg = {
+ .msg_name = &sa,
+ .msg_namelen = sizeof(sa),
+ .msg_iov = &iov,
+ /* One message at a time */
+ .msg_iovlen = 1,
+ };
+ int multipart = 0;
+ int ret = 0;
+
+ do {
+ struct nlmsghdr *nh;
+ int recv_bytes = 0;
+
+ do {
+ recv_bytes = recvmsg(nlsk_fd, &msg, 0);
+ if (recv_bytes == -1) {
+ rte_errno = errno;
+ return -rte_errno;
+ }
+ nh = (struct nlmsghdr *)buf;
+ } while (nh->nlmsg_seq != sn);
+ for (;
+ NLMSG_OK(nh, (unsigned int)recv_bytes);
+ nh = NLMSG_NEXT(nh, recv_bytes)) {
+ if (nh->nlmsg_type == NLMSG_ERROR) {
+ struct nlmsgerr *err_data = NLMSG_DATA(nh);
+
+ if (err_data->error < 0) {
+ rte_errno = -err_data->error;
+ return -rte_errno;
+ }
+ /* Ack message. */
+ return 0;
+ }
+ /* Multi-part msgs and their trailing DONE message. */
+ if (nh->nlmsg_flags & NLM_F_MULTI) {
+ if (nh->nlmsg_type == NLMSG_DONE)
+ return 0;
+ multipart = 1;
+ }
+ if (cb) {
+ ret = cb(nh, arg);
+ if (ret < 0)
+ return ret;
+ }
+ }
+ } while (multipart);
+ return ret;
+}
+
+/**
+ * Parse Netlink message to retrieve the bridge MAC address.
+ *
+ * @param nh
+ * Pointer to Netlink Message Header.
+ * @param arg
+ * PMD data register with this callback.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_mac_addr_cb(struct nlmsghdr *nh, void *arg)
+{
+ struct mlx5_nl_mac_addr *data = arg;
+ struct ndmsg *r = NLMSG_DATA(nh);
+ struct rtattr *attribute;
+ int len;
+
+ len = nh->nlmsg_len - NLMSG_LENGTH(sizeof(*r));
+ for (attribute = MLX5_NDA_RTA(r);
+ RTA_OK(attribute, len);
+ attribute = RTA_NEXT(attribute, len)) {
+ if (attribute->rta_type == NDA_LLADDR) {
+ if (data->mac_n == MLX5_MAX_MAC_ADDRESSES) {
+ DRV_LOG(WARNING,
+ "not enough room to finalise the"
+ " request");
+ rte_errno = ENOMEM;
+ return -rte_errno;
+ }
+#ifndef NDEBUG
+ char m[18];
+
+ ether_format_addr(m, 18, RTA_DATA(attribute));
+ DRV_LOG(DEBUG, "brige MAC address %s", m);
+#endif
+ memcpy(&(*data->mac)[data->mac_n++],
+ RTA_DATA(attribute), ETHER_ADDR_LEN);
+ }
+ }
+ return 0;
+}
+
+/**
+ * Get bridge MAC addresses.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac[out]
+ * Pointer to the array table of MAC addresses to fill.
+ * Its size should be of MLX5_MAX_MAC_ADDRESSES.
+ * @param mac_n[out]
+ * Number of entries filled in MAC array.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_mac_addr_list(struct rte_eth_dev *dev, struct ether_addr (*mac)[],
+ int *mac_n)
+{
+ struct priv *priv = dev->data->dev_private;
+ int iface_idx = mlx5_ifindex(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ifinfomsg ifm;
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
+ .nlmsg_type = RTM_GETNEIGH,
+ .nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
+ },
+ .ifm = {
+ .ifi_family = PF_BRIDGE,
+ .ifi_index = iface_idx,
+ },
+ };
+ struct mlx5_nl_mac_addr data = {
+ .mac = mac,
+ .mac_n = 0,
+ };
+ int fd;
+ int ret;
+ uint32_t sn = priv->nl_sn++;
+
+ if (priv->nl_socket == -1)
+ return 0;
+ fd = priv->nl_socket;
+ ret = mlx5_nl_request(fd, &req.hdr, sn, &req.ifm,
+ sizeof(struct ifinfomsg));
+ if (ret < 0)
+ goto error;
+ ret = mlx5_nl_recv(fd, sn, mlx5_nl_mac_addr_cb, &data);
+ if (ret < 0)
+ goto error;
+ *mac_n = data.mac_n;
+ return 0;
+error:
+ DRV_LOG(DEBUG, "port %u cannot retrieve MAC address list %s",
+ dev->data->port_id, strerror(rte_errno));
+ return -rte_errno;
+}
+
+/**
+ * Modify the MAC address neighbour table with Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac
+ * MAC address to consider.
+ * @param add
+ * 1 to add the MAC address, 0 to remove the MAC address.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_mac_addr_modify(struct rte_eth_dev *dev, struct ether_addr *mac,
+ int add)
+{
+ struct priv *priv = dev->data->dev_private;
+ int iface_idx = mlx5_ifindex(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ndmsg ndm;
+ struct rtattr rta;
+ uint8_t buffer[ETHER_ADDR_LEN];
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ndmsg)),
+ .nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE |
+ NLM_F_EXCL | NLM_F_ACK,
+ .nlmsg_type = add ? RTM_NEWNEIGH : RTM_DELNEIGH,
+ },
+ .ndm = {
+ .ndm_family = PF_BRIDGE,
+ .ndm_state = NUD_NOARP | NUD_PERMANENT,
+ .ndm_ifindex = iface_idx,
+ .ndm_flags = NTF_SELF,
+ },
+ .rta = {
+ .rta_type = NDA_LLADDR,
+ .rta_len = RTA_LENGTH(ETHER_ADDR_LEN),
+ },
+ };
+ int fd;
+ int ret;
+ uint32_t sn = priv->nl_sn++;
+
+ if (priv->nl_socket == -1)
+ return 0;
+ fd = priv->nl_socket;
+ memcpy(RTA_DATA(&req.rta), mac, ETHER_ADDR_LEN);
+ req.hdr.nlmsg_len = NLMSG_ALIGN(req.hdr.nlmsg_len) +
+ RTA_ALIGN(req.rta.rta_len);
+ ret = mlx5_nl_send(fd, &req.hdr, sn);
+ if (ret < 0)
+ goto error;
+ ret = mlx5_nl_recv(fd, sn, NULL, NULL);
+ if (ret < 0)
+ goto error;
+ return 0;
+error:
+ DRV_LOG(DEBUG,
+ "port %u cannot %s MAC address %02X:%02X:%02X:%02X:%02X:%02X"
+ " %s",
+ dev->data->port_id,
+ add ? "add" : "remove",
+ mac->addr_bytes[0], mac->addr_bytes[1],
+ mac->addr_bytes[2], mac->addr_bytes[3],
+ mac->addr_bytes[4], mac->addr_bytes[5],
+ strerror(rte_errno));
+ return -rte_errno;
+}
+
+/**
+ * Add a MAC address.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac
+ * MAC address to register.
+ * @param index
+ * MAC address index.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac,
+ uint32_t index)
+{
+ struct priv *priv = dev->data->dev_private;
+ int ret;
+
+ ret = mlx5_nl_mac_addr_modify(dev, mac, 1);
+ if (!ret)
+ BITFIELD_SET(priv->mac_own, index);
+ if (ret == -EEXIST)
+ return 0;
+ return ret;
+}
+
+/**
+ * Remove a MAC address.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ * @param mac
+ * MAC address to remove.
+ * @param index
+ * MAC address index.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac,
+ uint32_t index)
+{
+ struct priv *priv = dev->data->dev_private;
+
+ BITFIELD_RESET(priv->mac_own, index);
+ return mlx5_nl_mac_addr_modify(dev, mac, 0);
+}
+
+/**
+ * Synchronise Netlink bridge table to the internal table.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ */
+void
+mlx5_nl_mac_addr_sync(struct rte_eth_dev *dev)
+{
+ struct ether_addr macs[MLX5_MAX_MAC_ADDRESSES];
+ int macs_n = 0;
+ int i;
+ int ret;
+
+ ret = mlx5_nl_mac_addr_list(dev, &macs, &macs_n);
+ if (ret)
+ return;
+ for (i = 0; i != macs_n; ++i) {
+ int j;
+
+ /* Verify the address is not in the array yet. */
+ for (j = 0; j != MLX5_MAX_MAC_ADDRESSES; ++j)
+ if (is_same_ether_addr(&macs[i],
+ &dev->data->mac_addrs[j]))
+ break;
+ if (j != MLX5_MAX_MAC_ADDRESSES)
+ continue;
+ /* Find the first entry available. */
+ for (j = 0; j != MLX5_MAX_MAC_ADDRESSES; ++j) {
+ if (is_zero_ether_addr(&dev->data->mac_addrs[j])) {
+ dev->data->mac_addrs[j] = macs[i];
+ break;
+ }
+ }
+ }
+}
+
+/**
+ * Flush all added MAC addresses.
+ *
+ * @param dev
+ * Pointer to Ethernet device.
+ */
+void
+mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev)
+{
+ struct priv *priv = dev->data->dev_private;
+ int i;
+
+ for (i = MLX5_MAX_MAC_ADDRESSES - 1; i >= 0; --i) {
+ struct ether_addr *m = &dev->data->mac_addrs[i];
+
+ if (BITFIELD_ISSET(priv->mac_own, i))
+ mlx5_nl_mac_addr_remove(dev, m, i);
+ }
+}
--
2.17.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v4 2/3] net/mlx5: use Netlink to enable promisc / allmulti mode
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 0/3] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 " Nelio Laranjeiro
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 1/3] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
@ 2018-04-05 15:07 ` Nelio Laranjeiro
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 3/3] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
3 siblings, 0 replies; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-04-05 15:07 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
VF devices are not able to receive promisc or allmulti traffic unless it
fully requests it though Netlink. This will cause the request to be
processed by the PF which will handle the request and enable it.
This requires the VF to be trusted by the PF.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
drivers/net/mlx5/mlx5.h | 2 +
drivers/net/mlx5/mlx5_nl.c | 94 ++++++++++++++++++++++++++++++++++
drivers/net/mlx5/mlx5_rxmode.c | 8 +++
3 files changed, 104 insertions(+)
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 683026b0f..50cc09946 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -312,5 +312,7 @@ int mlx5_nl_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac,
uint32_t index);
void mlx5_nl_mac_addr_sync(struct rte_eth_dev *dev);
void mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev);
+int mlx5_nl_promisc(struct rte_eth_dev *dev, int enable);
+int mlx5_nl_allmulti(struct rte_eth_dev *dev, int enable);
#endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_nl.c b/drivers/net/mlx5/mlx5_nl.c
index 2f238beb3..fe0e81710 100644
--- a/drivers/net/mlx5/mlx5_nl.c
+++ b/drivers/net/mlx5/mlx5_nl.c
@@ -531,3 +531,97 @@ mlx5_nl_mac_addr_flush(struct rte_eth_dev *dev)
mlx5_nl_mac_addr_remove(dev, m, i);
}
}
+
+/**
+ * Enable promiscuous / all multicast mode through Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param flags
+ * IFF_PROMISC for promiscuous, IFF_ALLMULTI for allmulti.
+ * @param enable
+ * Nonzero to enable, disable otherwise.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_nl_device_flags(struct rte_eth_dev *dev, uint32_t flags, int enable)
+{
+ struct priv *priv = dev->data->dev_private;
+ int iface_idx = mlx5_ifindex(dev);
+ struct {
+ struct nlmsghdr hdr;
+ struct ifinfomsg ifi;
+ } req = {
+ .hdr = {
+ .nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
+ .nlmsg_type = RTM_NEWLINK,
+ .nlmsg_flags = NLM_F_REQUEST,
+ },
+ .ifi = {
+ .ifi_flags = enable ? flags : 0,
+ .ifi_change = flags,
+ .ifi_index = iface_idx,
+ },
+ };
+ int fd;
+ int ret;
+
+ assert(!(flags & ~(IFF_PROMISC | IFF_ALLMULTI)));
+ if (priv->nl_socket < 0)
+ return 0;
+ fd = priv->nl_socket;
+ ret = mlx5_nl_send(fd, &req.hdr, priv->nl_sn++);
+ if (ret < 0)
+ return ret;
+ return 0;
+}
+
+/**
+ * Enable promiscuous mode through Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param enable
+ * Nonzero to enable, disable otherwise.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_promisc(struct rte_eth_dev *dev, int enable)
+{
+ int ret = mlx5_nl_device_flags(dev, IFF_PROMISC, enable);
+
+ if (ret)
+ DRV_LOG(DEBUG,
+ "port %u cannot %s promisc mode: Netlink error %s",
+ dev->data->port_id, enable ? "enable" : "disable",
+ strerror(rte_errno));
+ return ret;
+}
+
+/**
+ * Enable all multicast mode through Netlink.
+ *
+ * @param dev
+ * Pointer to Ethernet device structure.
+ * @param enable
+ * Nonzero to enable, disable otherwise.
+ *
+ * @return
+ * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_nl_allmulti(struct rte_eth_dev *dev, int enable)
+{
+ int ret = mlx5_nl_device_flags(dev, IFF_ALLMULTI, enable);
+
+ if (ret)
+ DRV_LOG(DEBUG,
+ "port %u cannot %s allmulti mode: Netlink error %s",
+ dev->data->port_id, enable ? "enable" : "disable",
+ strerror(rte_errno));
+ return ret;
+}
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index e43a4b030..c1c0f21c7 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -35,6 +35,8 @@ mlx5_promiscuous_enable(struct rte_eth_dev *dev)
int ret;
dev->data->promiscuous = 1;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_promisc(dev, 1);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot enable promiscuous mode: %s",
@@ -53,6 +55,8 @@ mlx5_promiscuous_disable(struct rte_eth_dev *dev)
int ret;
dev->data->promiscuous = 0;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_promisc(dev, 0);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot disable promiscuous mode: %s",
@@ -71,6 +75,8 @@ mlx5_allmulticast_enable(struct rte_eth_dev *dev)
int ret;
dev->data->all_multicast = 1;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_allmulti(dev, 1);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot enable allmulicast mode: %s",
@@ -89,6 +95,8 @@ mlx5_allmulticast_disable(struct rte_eth_dev *dev)
int ret;
dev->data->all_multicast = 0;
+ if (((struct priv *)dev->data->dev_private)->config.vf)
+ mlx5_nl_allmulti(dev, 0);
ret = mlx5_traffic_restart(dev);
if (ret)
DRV_LOG(ERR, "port %u cannot disable allmulicast mode: %s",
--
2.17.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* [dpdk-dev] [PATCH v4 3/3] net/mlx5: add a parameter for Netlink support in VF
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 0/3] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
` (2 preceding siblings ...)
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 2/3] net/mlx5: use Netlink to enable promisc / allmulti mode Nelio Laranjeiro
@ 2018-04-05 15:07 ` Nelio Laranjeiro
3 siblings, 0 replies; 32+ messages in thread
From: Nelio Laranjeiro @ 2018-04-05 15:07 UTC (permalink / raw)
To: Adrien Mazarguil, Yongseok Koh; +Cc: dev
All Netlink request the PMD will do can also be done by a iproute2 command
line interface, enabling VF behavior configuration without having to modify
the application nor reaching PMD limits (e.g. MAC address number limit).
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
doc/guides/nics/mlx5.rst | 9 +++++++++
drivers/net/mlx5/mlx5.c | 9 ++++++++-
drivers/net/mlx5/mlx5.h | 1 +
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index c812f0b4f..b1bab2ce2 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -344,6 +344,15 @@ Run-time configuration
Enabled by default.
+- ``vf_nl_en`` parameter [int]
+
+ A nonzero value enables Netlink requests from the VF to add/remove MAC
+ addresses or/and enable/disable promiscuous/all multicast on the Netdevice.
+ Otherwise the relevant configuration must be run with Linux iproute2 tools.
+ This is a prerequisite to receive this kind of traffic.
+
+ Enabled by default, valid only on VF devices ignored otherwise.
+
Prerequisites
-------------
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index e52c60fb3..cfab55897 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -69,6 +69,9 @@
/* Device parameter to enable hardware Rx vector. */
#define MLX5_RX_VEC_EN "rx_vec_en"
+/* Activate Netlink support in VF mode. */
+#define MLX5_VF_NL_EN "vf_nl_en"
+
#ifndef HAVE_IBV_MLX5_MOD_MPW
#define MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED (1 << 2)
#define MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW (1 << 3)
@@ -412,6 +415,8 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
config->tx_vec_en = !!tmp;
} else if (strcmp(MLX5_RX_VEC_EN, key) == 0) {
config->rx_vec_en = !!tmp;
+ } else if (strcmp(MLX5_VF_NL_EN, key) == 0) {
+ config->vf_nl_en = !!tmp;
} else {
DRV_LOG(WARNING, "%s: unknown parameter", key);
rte_errno = EINVAL;
@@ -443,6 +448,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
MLX5_TXQ_MAX_INLINE_LEN,
MLX5_TX_VEC_EN,
MLX5_RX_VEC_EN,
+ MLX5_VF_NL_EN,
NULL,
};
struct rte_kvargs *kvlist;
@@ -747,6 +753,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
.txq_inline = MLX5_ARG_UNSET,
.txqs_inline = MLX5_ARG_UNSET,
.inline_max_packet_sz = MLX5_ARG_UNSET,
+ .vf_nl_en = 1,
};
len = snprintf(name, sizeof(name), PCI_PRI_FMT,
@@ -963,7 +970,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
claim_zero(mlx5_mac_addr_add(eth_dev, &mac, 0, 0));
priv->nl_socket = -1;
priv->nl_sn = 0;
- if (vf) {
+ if (vf && config.vf_nl_en) {
priv->nl_socket = mlx5_nl_init(RTMGRP_LINK);
if (priv->nl_socket < 0)
priv->nl_socket = -1;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 50cc09946..63b24e6bb 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -88,6 +88,7 @@ struct mlx5_dev_config {
unsigned int tx_vec_en:1; /* Tx vector is enabled. */
unsigned int rx_vec_en:1; /* Rx vector is enabled. */
unsigned int mpw_hdr_dseg:1; /* Enable DSEGs in the title WQEBB. */
+ unsigned int vf_nl_en:1; /* Enable Netlink requests in VF mode. */
unsigned int tso_max_payload_sz; /* Maximum TCP payload for TSO. */
unsigned int ind_table_max_size; /* Maximum indirection table size. */
int txq_inline; /* Maximum packet size for inlining. */
--
2.17.0
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/3] net/mlx5: use Netlink in VF mode
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 " Nelio Laranjeiro
@ 2018-04-08 8:16 ` Shahaf Shuler
0 siblings, 0 replies; 32+ messages in thread
From: Shahaf Shuler @ 2018-04-08 8:16 UTC (permalink / raw)
To: Nélio Laranjeiro, Adrien Mazarguil, Yongseok Koh; +Cc: dev
Thursday, April 5, 2018 6:07 PM, Nelio Laranjeiro:
> Subject: [dpdk-dev] [PATCH v4 0/3] net/mlx5: use Netlink in VF mode
>
> When MLX5 behaves in VF mode and the hypervisor have **trusted** this
> VF, to be able to receive specific traffic some requests must be done to
> configure the NIC. There is no API currently available to do it though Verbs,
> but there is in Linux side using Netlink.
>
> The specific cases are:
> - Enable/disable promiscuous mode.
> - Enable/disable allmulti mode.
> - Add/remove mac addresses.
Series applied to next-net-mlx, thanks.
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2018-04-08 8:16 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-13 12:50 [dpdk-dev] [PATCH 0/5] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
2018-03-13 12:50 ` [dpdk-dev] [PATCH 1/5] net/mlx5: add VF information in configuration Nelio Laranjeiro
2018-03-14 17:10 ` Adrien Mazarguil
2018-03-13 12:50 ` [dpdk-dev] [PATCH 2/5] net/mlx5: retrieve device index from Netlink Nelio Laranjeiro
2018-03-14 17:10 ` Adrien Mazarguil
2018-03-13 12:50 ` [dpdk-dev] [PATCH 3/5] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
2018-03-14 17:10 ` Adrien Mazarguil
2018-03-13 12:50 ` [dpdk-dev] [PATCH 4/5] net/mlx5: use Netlink to enable promisc/allmulti Nelio Laranjeiro
2018-03-14 17:11 ` Adrien Mazarguil
2018-03-13 12:50 ` [dpdk-dev] [PATCH 5/5] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
2018-03-14 17:11 ` Adrien Mazarguil
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 0/3] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 1/3] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 2/3] net/mlx5: use Netlink to enable promisc / all multicast mode Nelio Laranjeiro
2018-03-19 15:20 ` [dpdk-dev] [PATCH v2 3/3] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 0/3] net/mlx5: use Netlink in VF mode Nelio Laranjeiro
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 " Nelio Laranjeiro
2018-04-08 8:16 ` Shahaf Shuler
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 1/3] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 2/3] net/mlx5: use Netlink to enable promisc / allmulti mode Nelio Laranjeiro
2018-04-05 15:07 ` [dpdk-dev] [PATCH v4 3/3] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 1/3] net/mlx5: use Netlink to add/remove MAC addresses Nelio Laranjeiro
2018-03-22 7:34 ` Shahaf Shuler
2018-03-22 9:04 ` Nélio Laranjeiro
2018-03-22 9:45 ` Shahaf Shuler
2018-03-22 10:28 ` Nélio Laranjeiro
2018-03-28 5:56 ` Shahaf Shuler
2018-03-22 7:44 ` Shahaf Shuler
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 2/3] net/mlx5: use Netlink to enable promisc / all multicast mode Nelio Laranjeiro
2018-03-22 7:36 ` Shahaf Shuler
2018-03-21 13:40 ` [dpdk-dev] [PATCH v3 3/3] net/mlx5: add a parameter for Netlink support in VF Nelio Laranjeiro
2018-03-22 7:38 ` Shahaf Shuler
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).