From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f42.google.com (mail-wm0-f42.google.com [74.125.82.42]) by dpdk.org (Postfix) with ESMTP id 2E282A495 for ; Wed, 14 Mar 2018 18:11:04 +0100 (CET) Received: by mail-wm0-f42.google.com with SMTP id t3so5481234wmc.2 for ; Wed, 14 Mar 2018 10:11:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=XRX1UCfmItnvPlW9VZ4j07KXKxxDHt8EkJHrHxDytZg=; b=gd7+iWHwVJa625IDlIWecUBykiDn8Wvrj3M7D/ejaa9I/z6CSuHSG+lWUST1Fb7DCv O6cNEuKjOk4M8Ni1RV/i3C8qVoR4AlDq2S8xeylp5/tcXf+gkJLl7dEPEZBt7sdowL6X YChOD2SMRzHwB7Cq0n9SyuUzE0Yrj53WOV3AEvcvkL3R12oBfuzsdfp6xXpRI3DmGLAz i4rV1m2G8AncS57OAJv1wzZxOkxettHqVPnROhFEvIH+0DY81Y2KwZnU8+mOyifgTmPO qrJVfeA8LZxac6EmhzcpnnDPfvCjBR53iOubLwyIU0EF+ThRKgq571jgtbyHTXNejgow a7KQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=XRX1UCfmItnvPlW9VZ4j07KXKxxDHt8EkJHrHxDytZg=; b=flRZNTXprGfYhLwbClyrAIPX941HVhtgAwbtaB/vwGL135noISzKLvhBOK6VmqoEhj XrXS4Jx1tJQgb7tw4VtOUTdhZoQshgv6Qxo59Izq3wYH2oxqQNNYxAJVSwzy4g5Q+jp5 /NtB/k0f+PBinfiTbuMIo6F9L/2UB4P0yXnTlXW9ZQYdCcUQLWbIj5dA7ZUaePy6PVVK gUXMMCG/KNSQqSUYqKUV9j3eUKt/Fnt/7eqK4M1jDBzMJYLhVmhdb/RjFqwj52ziqUDO J/nJSWoOUZLIW6DXeHYJK4OAGQPtXZXj3LIT0kycKZ/JVUpIPijNEBJ8714bHgiGbFSd 1Rgg== X-Gm-Message-State: AElRT7HzZdXv3HaPe4vAgQuFUqljH+yz35ihUW1/lmgD/51zz9pBfkmN +Nvt0k8VbKNh3JdAoesz8rkRDOYB X-Google-Smtp-Source: AG47ELvE7JaU73g8o3d9JCH4njDtduOkAvAxblN+Pbutzi1dIlKf4AMzHvPvuEuh0Lb1HtTaFUWTGQ== X-Received: by 10.28.23.74 with SMTP id 71mr2259655wmx.23.1521047463033; Wed, 14 Mar 2018 10:11:03 -0700 (PDT) Received: from 6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id 69sm1946903wmp.36.2018.03.14.10.11.01 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 14 Mar 2018 10:11:02 -0700 (PDT) Date: Wed, 14 Mar 2018 18:10:48 +0100 From: Adrien Mazarguil To: Nelio Laranjeiro Cc: Yongseok Koh , dev@dpdk.org Message-ID: <20180314171048.GN3994@6wind.com> References: <1680083ce6c1d1536b38cce0ecbeeb8ca4882cdb.1520944256.git.nelio.laranjeiro@6wind.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1680083ce6c1d1536b38cce0ecbeeb8ca4882cdb.1520944256.git.nelio.laranjeiro@6wind.com> Subject: Re: [dpdk-dev] [PATCH 3/5] net/mlx5: use Netlink to add/remove MAC addresses X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Mar 2018 17:11:04 -0000 On Tue, Mar 13, 2018 at 01:50:37PM +0100, Nelio Laranjeiro wrote: > VF devices are not able to receive traffic unless it fully requests it > though Netlink. This will cause the request to be processed by the PF > which will add/remove the MAC address to the VF table if the VF is trusted. > > Signed-off-by: Nelio Laranjeiro Same basic comments as on the previous patch: - Check random capitalization in documentation and comments. - "vf" => "nl" for functions, objects and files that aren't really VF-specific but involve Netlink. More below. > --- > drivers/net/mlx5/Makefile | 3 +- > drivers/net/mlx5/mlx5.c | 7 + > drivers/net/mlx5/mlx5.h | 6 + > drivers/net/mlx5/mlx5_mac.c | 25 +++- > drivers/net/mlx5/mlx5_vf.c | 315 ++++++++++++++++++++++++++++++++++++++++++++ > 5 files changed, 353 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile > index afda4118f..bbda0d1ff 100644 > --- a/drivers/net/mlx5/Makefile > +++ b/drivers/net/mlx5/Makefile > @@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c > SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c > SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c > SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c > +SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_vf.c > > ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y) > INSTALL-$(CONFIG_RTE_LIBRTE_MLX5_PMD)-lib += $(LIB_GLUE) > @@ -84,7 +85,7 @@ LDLIBS += -libverbs -lmlx5 > endif > LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring > LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs > -LDLIBS += -lrte_bus_pci > +LDLIBS += -lrte_bus_pci -lrte_netlink > > # A few warnings cannot be avoided in external headers. > CFLAGS += -Wno-error=cast-qual > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c > index d5cc85d19..e966884bd 100644 > --- a/drivers/net/mlx5/mlx5.c > +++ b/drivers/net/mlx5/mlx5.c > @@ -205,6 +205,13 @@ mlx5_dev_close(struct rte_eth_dev *dev) > rte_free(priv->reta_idx); > if (priv->primary_socket) > mlx5_socket_uninit(dev); > + if (priv->config.vf) { > + ret = mlx5_vf_mac_addr_flush(dev); > + if (ret) > + DRV_LOG(WARNING, > + "port %u some MAC address remains configured", > + dev->data->port_id); > + } > ret = mlx5_hrxq_ibv_verify(dev); > if (ret) > DRV_LOG(WARNING, "port %u some hash Rx queue still remain", > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h > index 9191b2949..a4333e6a5 100644 > --- a/drivers/net/mlx5/mlx5.h > +++ b/drivers/net/mlx5/mlx5.h > @@ -298,4 +298,10 @@ struct mlx5_mr *mlx5_mr_get(struct rte_eth_dev *dev, struct rte_mempool *mp); > int mlx5_mr_release(struct mlx5_mr *mr); > int mlx5_mr_verify(struct rte_eth_dev *dev); > > +/* mlx5_vf.c */ > + > +int mlx5_vf_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac); > +int mlx5_vf_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac); > +int mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev); > + > #endif /* RTE_PMD_MLX5_H_ */ > diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c > index 01c7ba17a..5137ad11d 100644 > --- a/drivers/net/mlx5/mlx5_mac.c > +++ b/drivers/net/mlx5/mlx5_mac.c > @@ -67,11 +67,24 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[ETHER_ADDR_LEN]) > void > mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index) > { > + struct priv *priv = dev->data->dev_private; > + const int vf = priv->config.vf; > + int ret; > + > assert(index < MLX5_MAX_MAC_ADDRESSES); > + if (index && vf) { I don't think checking index is necessary here. mlx5_vf_mac_addr_remove() will fail if the MAC in question happens to be the last one configured on the netdevice, but depending on external configuration, it's not necessarily the same as mac_addrs[0]. Checking "vf" only seems more reliable. > + ret = mlx5_vf_mac_addr_remove(dev, > + &dev->data->mac_addrs[index]); > + if (ret) { > + DRV_LOG(WARNING, > + "port %u cannot remove mac address at index %d", > + dev->data->port_id, index); > + return; > + } > + } > memset(&dev->data->mac_addrs[index], 0, sizeof(struct ether_addr)); > if (!dev->data->promiscuous) { > - int ret = mlx5_traffic_restart(dev); > - > + ret = mlx5_traffic_restart(dev); > if (ret) > DRV_LOG(ERR, "port %u cannot remove mac address: %s", > dev->data->port_id, strerror(rte_errno)); > @@ -97,6 +110,8 @@ int > mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac, > uint32_t index, uint32_t vmdq __rte_unused) > { > + struct priv *priv = dev->data->dev_private; > + const int vf = priv->config.vf; > unsigned int i; > > assert(index < MLX5_MAX_MAC_ADDRESSES); > @@ -111,6 +126,12 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac, > rte_errno = EADDRINUSE; > return -rte_errno; > } > + if (index && vf) { Same comment as mlx5_mac_addr_remove() regarding index. > + int ret = mlx5_vf_mac_addr_add(dev, mac); > + > + if (ret) > + return ret; > + } > dev->data->mac_addrs[index] = *mac; > if (!dev->data->promiscuous) > return mlx5_traffic_restart(dev); > diff --git a/drivers/net/mlx5/mlx5_vf.c b/drivers/net/mlx5/mlx5_vf.c > index 357407f56..3db8ee0eb 100644 > --- a/drivers/net/mlx5/mlx5_vf.c > +++ b/drivers/net/mlx5/mlx5_vf.c > @@ -11,12 +11,29 @@ > #include "mlx5.h" > #include "mlx5_utils.h" > > +/* > + * Define NDA_RTA as defined in iproute2 sources. > + * > + * see in iproute2 sources file include/libnetlink.h > + */ > +#ifndef NDA_RTA > +#define NDA_RTA(r) \ > + ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct ndmsg)))) > +#endif > + I suggest an internal MLX5_* macro instead, just in case. > /* Data exchanged between Netlink library and PMD. */ > struct mlx5_vf_iface { > struct rte_eth_dev *dev; /**< Pointer to Ethernet device. */ > int iface_idx; /**< Network Interface index. */ > }; > > +/* Add/Remove mac address through Metlink */ Remove => remove Metlink => Netlink > +struct mlx5_vf_mac_addr { > + struct ether_addr (*mac)[]; > + /**< MAC Address handled by the device. */ Address => address > + int mac_n; /**< Number of address in the array. */ address => addresses (Please check the remaining typos and extra caps.) > +}; > + > /** > * Parse Netlink message to retrieve the interface index. > * > @@ -132,3 +149,301 @@ mlx5_vf_iface_idx(struct rte_eth_dev *dev) > dev->data->port_id, strerror(rte_errno)); > return -rte_errno; > } > + > +/** > + * Parse Netlink message to retrieve the bridge MAC address. > + * > + * @param nh > + * Pointer to Netlink Message Header. > + * @param arg > + * PMD data register with this callback. > + * > + * @return > + * 0 on success, -1 otherwise. How about a negative errno value in case of error for consistency? > + */ > +static int > +mlx5_vf_mac_addr_cb(struct nlmsghdr *nh, void *arg) > +{ > + struct mlx5_vf_mac_addr *data = arg; > + struct ndmsg *r = NLMSG_DATA(nh); > + struct rtattr *attribute; > + int len; > + > + len = nh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)); > + for (attribute = NDA_RTA(r); > + RTA_OK(attribute, len); > + attribute = RTA_NEXT(attribute, len)) { > + if (attribute->rta_type == NDA_LLADDR) { > + if (data->mac_n == MLX5_MAX_MAC_ADDRESSES) { > + DRV_LOG(WARNING, > + "not enough room to finalise the" > + " request"); > + return -1; How about -ENOMEM? > + } > +#ifndef NDEBUG > + struct ether_addr m; > + > + memcpy(&m, RTA_DATA(attribute), ETHER_ADDR_LEN); > + DRV_LOG(DEBUG, > + "brige MAC address" brige => bridge > + " %02X:%02X:%02X:%02X:%02X:%02X", > + m.addr_bytes[0], m.addr_bytes[1], > + m.addr_bytes[2], m.addr_bytes[3], > + m.addr_bytes[4], m.addr_bytes[5]); > +#endif > + memcpy(&(*data->mac)[data->mac_n++], > + RTA_DATA(attribute), ETHER_ADDR_LEN); > + } > + } > + return 0; > +} > + > +/** > + * Get bridge MAC addresses. > + * > + * @param dev > + * Pointer to Ethernet device. > + * @param mac[out] > + * Pointer to the array table of MAC addresses to fill. > + * Its size should be of MLX5_MAX_MAC_ADDRESSES. > + * @param mac_n[out] > + * Number of entries filled in MAC array. > + * > + * @return > + * 0 on success, a negative errno value otherwise and rte_errno is set. > + */ Since the mac_n is only an output value and the size of mac is fixed to MLX5_MAX_MAC_ADDRESSES, I suggest to return the value of mac_n (positive in case of success) and remove it from the parameter list. > +static int > +mlx5_vf_mac_addr_list(struct rte_eth_dev *dev, struct ether_addr (*mac)[], Another suggestion for clarity: (*mac)[MLX5_MAX_MAC_ADDRESSES] > + int *mac_n) > +{ > + int iface_idx = mlx5_vf_iface_idx(dev); > + struct { > + struct nlmsghdr hdr; > + struct ifinfomsg ifm; > + } req = { > + .hdr = { > + .nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)), > + .nlmsg_type = RTM_GETNEIGH, > + .nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST, > + }, > + .ifm = { > + .ifi_family = PF_BRIDGE, > + .ifi_index = iface_idx, > + }, > + }; > + struct mlx5_vf_mac_addr data = { > + .mac = mac, > + .mac_n = 0, > + }; > + int fd; > + int ret; > + > + fd = rte_nl_init(RTMGRP_LINK); > + if (fd < 0) { > + rte_errno = errno; > + goto error; > + } > + ret = rte_nl_request(fd, &req.hdr, &req.ifm, sizeof(struct ifinfomsg)); > + if (ret < 0) { > + rte_errno = errno; > + goto error; > + } > + ret = rte_nl_recv(fd, mlx5_vf_mac_addr_cb, &data); > + if (ret < 0) { > + rte_errno = errno; > + goto error; > + } > + rte_nl_final(fd); > + *mac_n = data.mac_n; > + return 0; > +error: > + if (fd >= 0) > + rte_nl_final(fd); > + DRV_LOG(DEBUG, "port %u cannot retrieve MAC address list %s", > + dev->data->port_id, strerror(rte_errno)); > + return -rte_errno; > +} > + > +/** > + * Modify the mac address neighbour table with Netlink. mac => MAC Metlink => Netlink > + * > + * @param dev > + * Pointer to Ethernet device. > + * @param mac > + * MAC address to consider. > + * @param add > + * 1 to add the MAC address, 0 to remove the MAC address. > + * > + * @return > + * 0 on success, a negative errno value otherwise and rte_errno is set. > + */ > +static int > +mlx5_vf_mac_addr_modify(struct rte_eth_dev *dev, struct ether_addr *mac, > + int add) > +{ > + int iface_idx = mlx5_vf_iface_idx(dev); > + struct { > + struct nlmsghdr hdr; > + struct ndmsg ndm; > + struct rtattr rta; > + } req = { > + .hdr = { > + .nlmsg_len = NLMSG_LENGTH(sizeof(struct ndmsg)), > + .nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | > + NLM_F_EXCL | NLM_F_ACK, > + .nlmsg_type = add ? RTM_NEWNEIGH : RTM_DELNEIGH, > + }, > + .ndm = { > + .ndm_family = PF_BRIDGE, > + .ndm_state = NUD_NOARP | NUD_PERMANENT, > + .ndm_ifindex = iface_idx, > + .ndm_flags = NTF_SELF, > + }, > + .rta = { > + .rta_type = NDA_LLADDR, > + .rta_len = RTA_LENGTH(ETHER_ADDR_LEN), > + }, > + }; > + int fd; > + int ret; > + > + memcpy(RTA_DATA(&req.rta), mac, ETHER_ADDR_LEN); > + req.hdr.nlmsg_len = NLMSG_ALIGN(req.hdr.nlmsg_len) + > + RTA_ALIGN(req.rta.rta_len); > + fd = rte_nl_init(RTMGRP_LINK); > + if (fd < 0) { > + rte_errno = errno; > + goto error; > + } > + ret = rte_nl_send(fd, &req.hdr); > + if (ret == -1) { > + rte_errno = errno; > + goto error; > + } > + ret = rte_nl_recv_ack(fd); > + if (ret == -1) { > + rte_errno = errno; > + goto error; > + } > + rte_nl_final(fd); > + return 0; > +error: > + if (fd >= 0) > + rte_nl_final(fd); > + DRV_LOG(DEBUG, > + "port %u cannot %s MAC address %02X:%02X:%02X:%02X:%02X:%02X" > + " %s", > + dev->data->port_id, > + add ? "add" : "remove", > + mac->addr_bytes[0], mac->addr_bytes[1], > + mac->addr_bytes[2], mac->addr_bytes[3], > + mac->addr_bytes[4], mac->addr_bytes[5], > + strerror(rte_errno)); > + return -rte_errno; > +} > + > +/** > + * add a MAC address. add => Add (can't stop!) > + * > + * @param dev > + * Pointer to Ethernet device. > + * @param mac_addr > + * MAC address to register. > + * > + * @return > + * 0 on success, a negative errno value otherwise and rte_errno is set. > + */ > +int > +mlx5_vf_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac) > +{ > + struct ether_addr macs[MLX5_MAX_MAC_ADDRESSES]; > + int macs_n = 0; > + int i; > + int mac_index = MLX5_MAX_MAC_ADDRESSES; > + int ret; > + > + ret = mlx5_vf_mac_addr_list(dev, &macs, &macs_n); > + if (ret) > + return ret; > + for (i = 0; i != macs_n; ++i) { > + if (!memcmp(&macs[i], mac, ETHER_ADDR_LEN)) { > + mac_index = i; > + break; > + } > + } > + if (mac_index != MLX5_MAX_MAC_ADDRESSES) { > + DRV_LOG(INFO, "port %u MAC address already added", added => present > + dev->data->port_id); > + rte_errno = EADDRINUSE; > + return -rte_errno; > + } > + return mlx5_vf_mac_addr_modify(dev, mac, 1); > +} > + > +/** > + * Remove a MAC address. > + * > + * @param dev > + * Pointer to Ethernet device. > + * @param index > + * MAC address to remove. > + * > + * @return > + * 0 on success, a negative errno value otherwise and rte_errno is set. > + */ > +int > +mlx5_vf_mac_addr_remove(struct rte_eth_dev *dev, struct ether_addr *mac) > +{ > + struct ether_addr macs[MLX5_MAX_MAC_ADDRESSES]; > + int macs_n = 0; > + int i; > + int mac_index = MLX5_MAX_MAC_ADDRESSES; > + int ret; > + > + ret = mlx5_vf_mac_addr_list(dev, &macs, &macs_n); > + if (ret) > + return ret; > + for (i = 0; i != macs_n; ++i) { > + if (!memcmp(&macs[i], mac, ETHER_ADDR_LEN)) { > + mac_index = i; > + break; > + } > + } > + if (mac_index == MLX5_MAX_MAC_ADDRESSES) { > + DRV_LOG(INFO, "port %u MAC address not found", > + dev->data->port_id); > + rte_errno = EINVAL; How about EADDRNOTAVAIL (a better counterpart for EADDRINUSE)? > + return -rte_errno; > + } > + return mlx5_vf_mac_addr_modify(dev, mac, 0); > +} > + > +/** > + * Flush all added MAC addresses. > + * > + * @param dev > + * Pointer to Ethernet device. > + * > + * @return > + * 0 on success, a negative errno value otherwise and rte_errno is set. > + */ > +int > +mlx5_vf_mac_addr_flush(struct rte_eth_dev *dev) > +{ > + int i; > + const struct ether_addr mac_null = { .addr_bytes = { 0 }, }; > + > + /* Skip the default mac at index 0. */ I'm not sure, what if index 0 happens to be yet another secondary MAC address of the netdevice at this point? I think the PMD should attempt to flush them all starting from the end of the MAC array and not care about errors; this function should return void. > + for (i = 1; i != MLX5_MAX_MAC_ADDRESSES; ++i) { > + struct ether_addr *m = &dev->data->mac_addrs[i]; > + > + if (memcmp(&mac_null, m, ETHER_ADDR_LEN)) { > + int ret; > + > + ret = mlx5_vf_mac_addr_remove(dev, m); > + if (ret) > + return ret; > + } > + } > + return 0; > +} -- Adrien Mazarguil 6WIND