From: Ferruh Yigit <ferruh.yigit@xilinx.com>
To: <longli@microsoft.com>
Cc: <dev@dpdk.org>, Ajay Sharma <sharmaajay@microsoft.com>,
Stephen Hemminger <sthemmin@microsoft.com>
Subject: Re: [Patch v4 01/17] net/mana: add basic driver, build environment and doc
Date: Mon, 22 Aug 2022 16:03:47 +0100
Message-ID: <859e95d9-2483-b017-6daa-0852317b4a72@xilinx.com>
In-Reply-To: <1657324171-31369-2-git-send-email-longli@linuxonhyperv.com>
On 7/9/2022 12:49 AM, longli@linuxonhyperv.com wrote:
> From: Long Li <longli@microsoft.com>
>
> MANA is a PCI device. It uses IB verbs to access hardware through the
> kernel RDMA layer. This patch introduces build environment and basic
> device probe functions.
>
> Signed-off-by: Long Li <longli@microsoft.com>
> ---
> Change log:
> v2:
> Fix typos.
> Make the driver build only on x86-64 and Linux.
> Remove unused header files.
> Change port definition to uint16_t or uint8_t (for IB).
> Use getline() in place of fgets() to read and truncate a line.
> v3:
> Add meson build check for required functions from RDMA direct verb header file
> v4:
> Remove extra "\n" in logging code.
> Use "r" in place of "rb" in fopen() to read text files.
>
<...>
> --- /dev/null
> +++ b/doc/guides/nics/mana.rst
> @@ -0,0 +1,66 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> + Copyright 2022 Microsoft Corporation
> +
> +MANA poll mode driver library
> +=============================
> +
> +The MANA poll mode driver library (**librte_net_mana**) implements support
> +for Microsoft Azure Network Adapter VF in SR-IOV context.
> +
Can you please provide a link to an official product description, as a
reference point for anybody interested in more product details?
<..>
> +
> +Netvsc PMD arguments
> +--------------------
'Netvsc'? Do you mean 'MANA'?
> +
> +The user can specify below argument in devargs.
> +
> +#. ``mac``:
> +
> + Specify the MAC address for this device. If it is set, the driver
> + probes and loads the NIC with a matching mac address. If it is not
> + set, the driver probes on all the NICs on the PCI device. The default
> + value is not set, meaning all the NICs will be probed and loaded.
The code accepts up to 8 MAC values, should this be documented?
Also, why is this devarg needed?
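To make the limit concrete for the documentation, here is a minimal, self-contained sketch of the behaviour as I read it from the patch: the callback runs once per "mac=<addr>" pair, stores each address in turn, and rejects the ninth. The names struct mac_addr, parse_mac() and add_mac() are hypothetical stand-ins for illustration; they are not DPDK API and not the driver's code.

```c
#include <stdio.h>
#include <stdint.h>

#define MAX_NUM_ADDRESS 8 /* same cap as in the patch */

/* Hypothetical stand-in for struct rte_ether_addr. */
struct mac_addr {
	uint8_t bytes[6];
};

struct mac_conf {
	struct mac_addr mac_array[MAX_NUM_ADDRESS];
	unsigned int index;
};

/* Parse "aa:bb:cc:dd:ee:ff"; return 0 on success, -1 on malformed input. */
static int
parse_mac(const char *s, struct mac_addr *out)
{
	unsigned int b[6];
	char tail;
	int i;

	/* The trailing %c catches garbage after the sixth byte. */
	if (sscanf(s, "%2x:%2x:%2x:%2x:%2x:%2x%c",
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &tail) != 6)
		return -1;

	for (i = 0; i < 6; i++)
		out->bytes[i] = (uint8_t)b[i];

	return 0;
}

/* Called once per "mac=<addr>" pair; refuses more than MAX_NUM_ADDRESS. */
static int
add_mac(struct mac_conf *conf, const char *val)
{
	if (conf->index >= MAX_NUM_ADDRESS)
		return 1;

	if (parse_mac(val, &conf->mac_array[conf->index]) != 0)
		return -1;

	conf->index++;
	return 0;
}
```

If this matches the intent, the doc could state that the 'mac' argument may be repeated, up to 8 times.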
> diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
> new file mode 100644
> index 0000000000..cb59eb6882
> --- /dev/null
> +++ b/drivers/net/mana/mana.c
> @@ -0,0 +1,704 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2022 Microsoft Corporation
> + */
> +
> +#include <unistd.h>
> +#include <dirent.h>
> +#include <fcntl.h>
> +#include <sys/mman.h>
> +
> +#include <ethdev_driver.h>
> +#include <ethdev_pci.h>
> +#include <rte_kvargs.h>
> +#include <rte_eal_paging.h>
> +
> +#include <infiniband/verbs.h>
> +#include <infiniband/manadv.h>
> +
> +#include <assert.h>
> +
> +#include "mana.h"
> +
> +/* Shared memory between primary/secondary processes, per driver */
> +struct mana_shared_data *mana_shared_data;
> +const struct rte_memzone *mana_shared_mz;
If these global variables are not used by other compilation units,
please try to make them static as much as possible.
> +static const char *MZ_MANA_SHARED_DATA = "mana_shared_data";
> +
> +struct mana_shared_data mana_local_data;
> +
Can you add a comment describing this global variable?
> +/* Spinlock for mana_shared_data */
> +static rte_spinlock_t mana_shared_data_lock = RTE_SPINLOCK_INITIALIZER;
> +
> +/* Allocate a buffer on the stack and fill it with a printf format string. */
> +#define MKSTR(name, ...) \
> + int mkstr_size_##name = snprintf(NULL, 0, "" __VA_ARGS__); \
> + char name[mkstr_size_##name + 1]; \
> + \
> + memset(name, 0, mkstr_size_##name + 1); \
> + snprintf(name, sizeof(name), "" __VA_ARGS__)
> +
> +int mana_logtype_driver;
> +int mana_logtype_init;
> +
> +const struct eth_dev_ops mana_dev_ops = {
> +};
> +
> +const struct eth_dev_ops mana_dev_sec_ops = {
> +};
It may be better to expand 'sec' to 'secondary', so that it is not
confused with 'security' etc...
> +
> +uint16_t
> +mana_rx_burst_removed(void *dpdk_rxq __rte_unused,
> + struct rte_mbuf **pkts __rte_unused,
> + uint16_t pkts_n __rte_unused)
> +{
> + rte_mb();
> + return 0;
> +}
> +
> +uint16_t
> +mana_tx_burst_removed(void *dpdk_rxq __rte_unused,
> + struct rte_mbuf **pkts __rte_unused,
> + uint16_t pkts_n __rte_unused)
> +{
> + rte_mb();
> + return 0;
> +}
> +
> +static const char *mana_init_args[] = {
> + "mac",
> + NULL,
> +};
> +
> +/* Support of parsing up to 8 mac address from EAL command line */
> +#define MAX_NUM_ADDRESS 8
> +struct mana_conf {
> + struct rte_ether_addr mac_array[MAX_NUM_ADDRESS];
> + unsigned int index;
> +};
> +
> +static int mana_arg_parse_callback(const char *key, const char *val,
> + void *private)
Since this is a new driver, it is better to follow the coding convention:
https://doc.dpdk.org/guides/contributing/coding_style.html
Please put the return type on its own line:
static int
mana_arg_parse_callback(const char *key, const char *val, void *private)
> +{
> + struct mana_conf *conf = (struct mana_conf *)private;
> + int ret;
> +
> + DRV_LOG(INFO, "key=%s value=%s index=%d", key, val, conf->index);
> +
> + if (conf->index >= MAX_NUM_ADDRESS) {
> + DRV_LOG(ERR, "Exceeding max MAC address");
> + return 1;
> + }
> +
> + ret = rte_ether_unformat_addr(val, &conf->mac_array[conf->index]);
> + if (ret) {
> + DRV_LOG(ERR, "Invalid MAC address %s", val);
> + return ret;
> + }
> +
> + conf->index++;
> +
> + return 0;
> +}
> +
<...>
> +static int get_port_mac(struct ibv_device *device, unsigned int port,
> + struct rte_ether_addr *addr)
> +{
> + FILE *file;
> + int ret = 0;
> + DIR *dir;
> + struct dirent *dent;
> + unsigned int dev_port;
> + char mac[20];
> +
> + MKSTR(path, "%s/device/net", device->ibdev_path);
> +
> + dir = opendir(path);
> + if (!dir)
> + return -ENOENT;
> +
> + while ((dent = readdir(dir))) {
> + char *name = dent->d_name;
> +
> + MKSTR(filepath, "%s/%s/dev_port", path, name);
> +
> + /* Ignore . and .. */
> + if ((name[0] == '.') &&
> + ((name[1] == '\0') ||
> + ((name[1] == '.') && (name[2] == '\0'))))
> + continue;
> +
> + file = fopen(filepath, "r");
> + if (!file)
> + continue;
> +
> + ret = fscanf(file, "%u", &dev_port);
> + fclose(file);
> +
> + if (ret != 1)
> + continue;
> +
> + /* Ethernet ports start at 0, IB port start at 1 */
> + if (dev_port == port - 1) {
> + MKSTR(filepath, "%s/%s/address", path, name);
The 'MKSTR' macro defines two variables derived from its first argument,
and 'filepath' is already used above. Yes, there is a new scope here, but
it is better not to shadow existing variables; can you select a new name?
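To illustrate, a small self-contained sketch: the macro below is copied from the patch, and each use declares both a buffer ('name') and a size variable ('mkstr_size_name'), so giving the second path its own name avoids shadowing. The helper build_paths() and the names 'port_path' and 'addr_path' are only suggestions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Copied from the patch: declares 'name' and 'mkstr_size_<name>'. */
#define MKSTR(name, ...) \
	int mkstr_size_##name = snprintf(NULL, 0, "" __VA_ARGS__); \
	char name[mkstr_size_##name + 1]; \
	\
	memset(name, 0, mkstr_size_##name + 1); \
	snprintf(name, sizeof(name), "" __VA_ARGS__)

/*
 * Hypothetical helper: builds both sysfs paths using distinct names,
 * then joins them into 'out' so the caller can inspect the result.
 */
static int
build_paths(const char *base, const char *ifname, char *out, size_t out_len)
{
	MKSTR(port_path, "%s/%s/dev_port", base, ifname);
	MKSTR(addr_path, "%s/%s/address", base, ifname);

	/* Both buffers and both size variables coexist without shadowing. */
	return snprintf(out, out_len, "%s %s", port_path, addr_path);
}
```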
<...>
> +
> +static int mana_pci_probe_mac(struct rte_pci_driver *pci_drv __rte_unused,
This is a static function; if you don't use 'pci_drv', why not drop it
from the argument list?
> + struct rte_pci_device *pci_dev,
> + struct rte_ether_addr *mac_addr)
> +{
> + struct ibv_device **ibv_list;
> + int ibv_idx;
> + struct ibv_context *ctx;
> + struct ibv_device_attr_ex dev_attr;
> + int num_devices;
> + int ret = 0;
> + uint8_t port;
> + struct mana_priv *priv = NULL;
> + struct rte_eth_dev *eth_dev = NULL;
> + bool found_port;
> +
> + ibv_list = ibv_get_device_list(&num_devices);
> + for (ibv_idx = 0; ibv_idx < num_devices; ibv_idx++) {
> + struct ibv_device *ibdev = ibv_list[ibv_idx];
> + struct rte_pci_addr pci_addr;
> +
> + DRV_LOG(INFO, "Probe device name %s dev_name %s ibdev_path %s",
> + ibdev->name, ibdev->dev_name, ibdev->ibdev_path);
> +
> + if (mana_ibv_device_to_pci_addr(ibdev, &pci_addr))
> + continue;
> +
> + /* Ignore if this IB device is not this PCI device */
> + if (pci_dev->addr.domain != pci_addr.domain ||
> + pci_dev->addr.bus != pci_addr.bus ||
> + pci_dev->addr.devid != pci_addr.devid ||
> + pci_dev->addr.function != pci_addr.function)
> + continue;
> +
As far as I understand, the intention of this loop is to find the 'ibdev'
matching this device, and the code goes through the whole "ibv device
list" to do so. I wonder if there is an easier way of doing this, like a
sysfs entry that provides this information?
And how do mlx4/5 do this?
> + ctx = ibv_open_device(ibdev);
> + if (!ctx) {
> + DRV_LOG(ERR, "Failed to open IB device %s",
> + ibdev->name);
> + continue;
> + }
> +
> + ret = ibv_query_device_ex(ctx, NULL, &dev_attr);
> + DRV_LOG(INFO, "dev_attr.orig_attr.phys_port_cnt %u",
> + dev_attr.orig_attr.phys_port_cnt);
> + found_port = false;
> +
> + for (port = 1; port <= dev_attr.orig_attr.phys_port_cnt;
> + port++) {
> + struct ibv_parent_domain_init_attr attr = {};
"= { 0 };" for portability.
<...>
> +static int mana_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> + struct rte_pci_device *pci_dev)
> +{
> + struct rte_devargs *args = pci_dev->device.devargs;
> + struct mana_conf conf = {};
afaik, an empty initializer is not part of the C spec yet, why not
initialize as " = {0}"?
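For reference, a minimal sketch of the suggested form. An empty "= {}" initializer is a compiler extension in C (it was only standardized in C23), while "= {0}" is valid in every C standard and zero-initializes all remaining members. 'struct ether_addr_stub' and conf_is_zeroed() are hypothetical stand-ins for illustration, not DPDK code.

```c
#include <stdint.h>

#define MAX_NUM_ADDRESS 8

/* Hypothetical stand-in for struct rte_ether_addr. */
struct ether_addr_stub {
	uint8_t addr_bytes[6];
};

struct mana_conf {
	struct ether_addr_stub mac_array[MAX_NUM_ADDRESS];
	unsigned int index;
};

/* Return 1 if every member of a {0}-initialized conf reads back as zero. */
static int
conf_is_zeroed(void)
{
	struct mana_conf conf = {0}; /* portable zero-initialization */
	unsigned int i, j;

	if (conf.index != 0)
		return 0;

	for (i = 0; i < MAX_NUM_ADDRESS; i++)
		for (j = 0; j < 6; j++)
			if (conf.mac_array[i].addr_bytes[j] != 0)
				return 0;

	return 1;
}
```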
> + unsigned int i;
> + int ret;
> +
> + if (args && args->args) {
You may prefer 'args->drv_str', which is the newer name for the args.
<...>
> +static const struct rte_pci_id mana_pci_id_map[] = {
> + {
> + RTE_PCI_DEVICE(PCI_VENDOR_ID_MICROSOFT,
> + PCI_DEVICE_ID_MICROSOFT_MANA)
> + },
The PCI ID list should be terminated with a ".vendor_id = 0" entry,
otherwise the PCI bus scan loop may behave unexpectedly.
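A minimal sketch of why the terminator matters, assuming the scan walks the table until it sees a zero vendor ID. 'struct pci_id_stub' and id_table_match() are hypothetical stand-ins, not the real 'struct rte_pci_id' or the bus code; the IDs are the ones from mana.h in this patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rte_pci_id. */
struct pci_id_stub {
	uint16_t vendor_id;
	uint16_t device_id;
};

static const struct pci_id_stub id_map[] = {
	{ .vendor_id = 0x1414, .device_id = 0x00ba },
	{ .vendor_id = 0 }, /* sentinel: ends the walk below */
};

/*
 * Walk the table until the sentinel. Without the last entry the loop
 * would read past the end of the array.
 */
static int
id_table_match(const struct pci_id_stub *table, uint16_t vendor,
	       uint16_t device)
{
	const struct pci_id_stub *id;

	for (id = table; id->vendor_id != 0; id++) {
		if (id->vendor_id == vendor && id->device_id == device)
			return 1;
	}

	return 0;
}
```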
> +};
> +
> +static struct rte_pci_driver mana_pci_driver = {
> + .driver = {
> + .name = "mana_pci",
Driver names are mostly of the form 'net_<driver_name>', is there a
reason to diverge from it?
Also, if you use the 'RTE_PMD_REGISTER_PCI' macro, the name will be
standardised anyway.
> + },
> + .id_table = mana_pci_id_map,
> + .probe = mana_pci_probe,
> + .remove = mana_pci_remove,
> + .drv_flags = RTE_PCI_DRV_INTR_RMV,
> +};
> +
> +RTE_INIT(rte_mana_pmd_init)
> +{
> + rte_pci_register(&mana_pci_driver);
> +}
> +
Why not use the 'RTE_PMD_REGISTER_PCI()' macro instead?
> +RTE_PMD_EXPORT_NAME(net_mana, __COUNTER__);
> +RTE_PMD_REGISTER_PCI_TABLE(net_mana, mana_pci_id_map);
> +RTE_PMD_REGISTER_KMOD_DEP(net_mana, "* ib_uverbs & mana_ib");
> +RTE_LOG_REGISTER_SUFFIX(mana_logtype_init, init, NOTICE);
> +RTE_LOG_REGISTER_SUFFIX(mana_logtype_driver, driver, NOTICE);
> diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h
> new file mode 100644
> index 0000000000..e30c030b4e
> --- /dev/null
> +++ b/drivers/net/mana/mana.h
> @@ -0,0 +1,210 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2022 Microsoft Corporation
> + */
> +
> +#ifndef __MANA_H__
> +#define __MANA_H__
> +
> +enum {
> + PCI_VENDOR_ID_MICROSOFT = 0x1414,
> +};
> +
> +enum {
> + PCI_DEVICE_ID_MICROSOFT_MANA = 0x00ba,
> +};
> +
> +/* Shared data between primary/secondary processes */
> +struct mana_shared_data {
> + rte_spinlock_t lock;
> + int init_done;
> + unsigned int primary_cnt;
> + unsigned int secondary_cnt;
> +};
> +
> +#define MIN_RX_BUF_SIZE 1024
> +#define MAX_FRAME_SIZE RTE_ETHER_MAX_LEN
> +#define BNIC_MAX_MAC_ADDR 1
> +
What does the 'BNIC_' prefix stand for? If it is related to the PMD, what
do you think about using 'MANA_' as the prefix?
Same for multiple macros below.
<...>
> +
> +#define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
> +
> +const uint32_t *mana_supported_ptypes(struct rte_eth_dev *dev);
> +
This function is not defined in this patch, so the declaration can be dropped.
<...>
> diff --git a/drivers/net/mana/version.map b/drivers/net/mana/version.map
> new file mode 100644
> index 0000000000..c2e0723b4c
> --- /dev/null
> +++ b/drivers/net/mana/version.map
> @@ -0,0 +1,3 @@
> +DPDK_22 {
It is 'DPDK_23' now.