From: Xueming Li <xuemingl@nvidia.com>
To: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Cc: <dev@dpdk.org>, <xuemingl@nvidia.com>,
Matan Azrad <matan@nvidia.com>,
Shahaf Shuler <shahafs@nvidia.com>,
Anatoly Burakov <anatoly.burakov@intel.com>
Subject: [dpdk-dev] [RFC 14/14] net/mlx5: support SubFunction
Date: Thu, 27 May 2021 17:02:02 +0300
Message-ID: <20210527140202.19377-5-xuemingl@nvidia.com>
In-Reply-To: <20210527140202.19377-1-xuemingl@nvidia.com>

This patch introduces SubFunction (SF) support. Similar to a VF, an SF on
the auxiliary bus is a portion of the hardware PF. An SF takes no
representor or bonding parameters.

Devargs to support SF:
  -a auxiliary:mlx5_core.sf.8,dv_flow_en=1

New global syntax to support SF:
  -a bus=auxiliary,name=mlx5_core.sf.8/class=eth/driver=mlx5,dv_flow_en=1

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
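For illustration, the legacy devargs string in the commit message has the
layout <bus>:<device name>,<key>=<value>. The C sketch below splits such a
string into its three parts, only to make that layout concrete. The names
split_legacy_devargs() and struct devargs_parts are hypothetical and are not
part of DPDK; the real parsing is done by EAL, and the buffer sizes are
arbitrary assumptions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for illustration only, not a DPDK type. */
struct devargs_parts {
	char bus[32];   /* e.g. "auxiliary" */
	char name[64];  /* e.g. "mlx5_core.sf.8" */
	char args[128]; /* e.g. "dv_flow_en=1", empty if absent */
};

/*
 * Hypothetical helper: split "<bus>:<name>[,<key>=<value>...]".
 * Returns 0 on success, -1 if the string does not fit that layout.
 */
static int
split_legacy_devargs(const char *in, struct devargs_parts *out)
{
	const char *colon;
	const char *comma;
	size_t len;

	assert(in != NULL && out != NULL);
	colon = strchr(in, ':');
	if (colon == NULL)
		return -1;
	/* Bus name is everything before the first colon. */
	len = (size_t)(colon - in);
	if (len == 0 || len >= sizeof(out->bus))
		return -1;
	memcpy(out->bus, in, len);
	out->bus[len] = '\0';
	/* Device name runs up to the first comma, or to the end. */
	comma = strchr(colon + 1, ',');
	len = comma != NULL ? (size_t)(comma - colon - 1) : strlen(colon + 1);
	if (len == 0 || len >= sizeof(out->name))
		return -1;
	memcpy(out->name, colon + 1, len);
	out->name[len] = '\0';
	/* Remaining key=value list is kept verbatim. */
	out->args[0] = '\0';
	if (comma != NULL) {
		if (strlen(comma + 1) >= sizeof(out->args))
			return -1;
		strcpy(out->args, comma + 1);
	}
	return 0;
}
```

Calling split_legacy_devargs("auxiliary:mlx5_core.sf.8,dv_flow_en=1", &p)
fills p.bus with "auxiliary", p.name with "mlx5_core.sf.8" and p.args with
"dv_flow_en=1".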
doc/guides/nics/mlx5.rst | 339 +++++++++++++++++++++++-
drivers/net/mlx5/linux/mlx5_ethdev_os.c | 12 +-
drivers/net/mlx5/linux/mlx5_os.c | 142 +++++++---
drivers/net/mlx5/linux/mlx5_os.h | 2 +
drivers/net/mlx5/mlx5.c | 10 +-
drivers/net/mlx5/mlx5.h | 1 +
drivers/net/mlx5/mlx5_rxmode.c | 8 +-
drivers/net/mlx5/mlx5_trigger.c | 2 +-
8 files changed, 452 insertions(+), 64 deletions(-)
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 83299646dd..3f5692038c 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -403,6 +403,300 @@ Limitations
- Hairpin between two ports could only manual binding and explicit Tx flow mode. For single port hairpin, all the combinations of auto/manual binding and explicit/implicit Tx flow mode could be supported.
- Hairpin in switchdev SR-IOV mode is not supported till now.
+- Meter:
+
+Limitations
+-----------
+
+- Windows support:
+
+ On Windows, the features are limited:
+
+ - Promiscuous mode is not supported
+ - The following rules are supported:
+
+ - IPv4/UDP with CVLAN filtering
+ - Unicast MAC filtering
+
+- For secondary process:
+
+ - Forked secondary process not supported.
+ - External memory unregistered in EAL memseg list cannot be used for DMA
+ unless such memory has been registered by ``mlx5_mr_update_ext_mp()`` in
+ primary process and remapped to the same virtual address in secondary
+ process. If the external memory is registered by primary process but has
+ different virtual address in secondary process, unexpected error may happen.
+
+- When using Verbs flow engine (``dv_flow_en`` = 0), flow pattern without any
+ specific VLAN will match for VLAN packets as well:
+
+ When VLAN spec is not specified in the pattern, the matching rule will be created with VLAN as a wild card.
+ Meaning, the flow rule::
+
+ flow create 0 ingress pattern eth / vlan vid is 3 / ipv4 / end ...
+
+ Will only match VLAN packets with vid=3, and the flow rule::
+
+ flow create 0 ingress pattern eth / ipv4 / end ...
+
+ Will match any ipv4 packet (VLAN included).
+
+- When using Verbs flow engine (``dv_flow_en`` = 0), multi-tagged(QinQ) match is not supported.
+
+- When using DV flow engine (``dv_flow_en`` = 1), flow pattern with any VLAN specification will match only single-tagged packets unless the ETH item ``type`` field is 0x88A8 or the VLAN item ``has_more_vlan`` field is 1.
+ The flow rule::
+
+ flow create 0 ingress pattern eth / ipv4 / end ...
+
+ Will match any ipv4 packet.
+ The flow rules::
+
+ flow create 0 ingress pattern eth / vlan / end ...
+ flow create 0 ingress pattern eth has_vlan is 1 / end ...
+ flow create 0 ingress pattern eth type is 0x8100 / end ...
+
+ Will match single-tagged packets only, with any VLAN ID value.
+ The flow rules::
+
+ flow create 0 ingress pattern eth type is 0x88A8 / end ...
+ flow create 0 ingress pattern eth / vlan has_more_vlan is 1 / end ...
+
+ Will match multi-tagged packets only, with any VLAN ID value.
+
+- A flow pattern with 2 sequential VLAN items is not supported.
+
+- VLAN pop offload command:
+
+ - Flow rules having a VLAN pop offload command as one of their actions and
+ are lacking a match on VLAN as one of their items are not supported.
+ - The command is not supported on egress traffic in NIC mode.
+
+- VLAN push offload is not supported on ingress traffic in NIC mode.
+
+- VLAN set PCP offload is not supported on existing headers.
+
+- A multi-segment packet must not have more segments than reported by dev_infos_get()
+ in tx_desc_lim.nb_seg_max field. This value depends on maximal supported Tx descriptor
+ size and ``txq_inline_min`` settings and may be from 2 (worst case forced by maximal
+ inline settings) to 58.
+
+- Flows with a VXLAN Network Identifier equal (or ends to be equal)
+ to 0 are not supported.
+
+- L3 VXLAN and VXLAN-GPE tunnels cannot be supported together with MPLSoGRE and MPLSoUDP.
+
+- Match on Geneve header supports the following fields only:
+
+ - VNI
+ - OAM
+ - protocol type
+ - options length
+
+- Match on Geneve TLV option is supported on the following fields:
+
+ - Class
+ - Type
+ - Length
+ - Data
+
+ Only one Class/Type/Length Geneve TLV option is supported per shared device.
+ Class/Type/Length fields must be specified as well as masks.
+ Class/Type/Length specified masks must be full.
+ Matching Geneve TLV option without specifying data is not supported.
+ Matching Geneve TLV option with ``data & mask == 0`` is not supported.
+
+- VF: flow rules created on VF devices can only match traffic targeted at the
+ configured MAC addresses (see ``rte_eth_dev_mac_addr_add()``).
+
+- Match on GTP tunnel header item supports the following fields only:
+
+ - v_pt_rsv_flags: E flag, S flag, PN flag
+ - msg_type
+ - teid
+
+- Match on GTP extension header only for GTP PDU session container (next
+ extension header type = 0x85).
+- Match on GTP extension header is not supported in group 0.
+
+- No Tx metadata go to the E-Switch steering domain for the Flow group 0.
+ The flows within group 0 and set metadata action are rejected by hardware.
+
+.. note::
+
+ MAC addresses not already present in the bridge table of the associated
+ kernel network device will be added and cleaned up by the PMD when closing
+ the device. In case of ungraceful program termination, some entries may
+ remain present and should be removed manually by other means.
+
+- Buffer split offload is supported with regular Rx burst routine only,
+ no MPRQ feature or vectorized code can be engaged.
+
+- When Multi-Packet Rx queue is configured (``mprq_en``), a Rx packet can be
+ externally attached to a user-provided mbuf with having EXT_ATTACHED_MBUF in
+ ol_flags. As the mempool for the external buffer is managed by PMD, all the
+ Rx mbufs must be freed before the device is closed. Otherwise, the mempool of
+ the external buffers will be freed by PMD and the application which still
+ holds the external buffers may be corrupted.
+
+- If Multi-Packet Rx queue is configured (``mprq_en``) and Rx CQE compression is
+ enabled (``rxq_cqe_comp_en``) at the same time, RSS hash result is not fully
+ supported. Some Rx packets may not have PKT_RX_RSS_HASH.
+
+- IPv6 Multicast messages are not supported on VM, while promiscuous mode
+ and allmulticast mode are both set to off.
+ To receive IPv6 Multicast messages on VM, explicitly set the relevant
+ MAC address using rte_eth_dev_mac_addr_add() API.
+
+- To support a mixed traffic pattern (some buffers from local host memory, some
+ buffers from other devices) with high bandwidth, a mbuf flag is used.
+
+ An application hints the PMD whether or not it should try to inline the
+ given mbuf data buffer. PMD should do the best effort to act upon this request.
+
+ The hint flag ``RTE_PMD_MLX5_FINE_GRANULARITY_INLINE`` is dynamic,
+ registered by application with rte_mbuf_dynflag_register(). This flag is
+ purely driver-specific and declared in PMD specific header ``rte_pmd_mlx5.h``,
+ which is intended to be used by the application.
+
+ To query the supported specific flags in runtime,
+ the function ``rte_pmd_mlx5_get_dyn_flag_names`` returns the array of
+ currently (over present hardware and configuration) supported specific flags.
+ The "not inline hint" feature operating flow is the following one:
+
+ - application starts
+ - probe the devices, ports are created
+ - query the port capabilities
+ - if port supporting the feature is found
+ - register dynamic flag ``RTE_PMD_MLX5_FINE_GRANULARITY_INLINE``
+ - application starts the ports
+ - on ``dev_start()`` PMD checks whether the feature flag is registered and
+ enables the feature support in datapath
+ - application might set the registered flag bit in ``ol_flags`` field
+ of mbuf being sent and PMD will handle ones appropriately.
+
+- The number of descriptors in a Tx queue may be limited by data inline settings.
+ Inline data requires more descriptor building blocks, and the overall block
+ amount may exceed the hardware supported limits. The application should
+ reduce the requested Tx size or adjust data inline settings with
+ ``txq_inline_max`` and ``txq_inline_mpw`` devargs keys.
+
+- To provide the packet send scheduling on mbuf timestamps the ``tx_pp``
+ parameter should be specified.
+ When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME set on the packet
+ being sent it tries to synchronize the time of packet appearing on
+ the wire with the specified packet timestamp. If the specified one
+ is in the past it should be ignored; if it is in the distant future
+ it should be capped with some reasonable value (in range of seconds).
+ These specific cases ("too late" and "distant future") can be optionally
+ reported via device xstats to assist applications to detect the
+ time-related problems.
+
+ The timestamp upper "too-distant-future" limit
+ at the moment of invoking the Tx burst routine
+ can be estimated as ``tx_pp`` option (in nanoseconds) multiplied by 2^23.
+ Please note, for the testpmd txonly mode,
+ the limit is deduced from the expression::
+
+ (n_tx_descriptors / burst_size + 1) * inter_burst_gap
+
+ No packet reordering according to timestamps is performed,
+ neither within a packet burst nor between packets. It is entirely the
+ application's responsibility to generate packets and their timestamps
+ in the desired order. The timestamp can be put in the first packet
+ of the burst only, which schedules the entire burst.
+
+- E-Switch decapsulation Flow:
+
+ - can be applied to PF port only.
+ - must specify VF port action (packet redirection from PF to VF).
+ - optionally may specify tunnel inner source and destination MAC addresses.
+
+- E-Switch encapsulation Flow:
+
+ - can be applied to VF ports only.
+ - must specify PF port action (packet redirection from VF to PF).
+
+- Raw encapsulation:
+
+ - The input buffer, used as outer header, is not validated.
+
+- Raw decapsulation:
+
+ - The decapsulation is always done up to the outermost tunnel detected by the HW.
+ - The input buffer, providing the removal size, is not validated.
+ - The buffer size must match the length of the headers to be removed.
+
+- ICMP(code/type/identifier/sequence number) / ICMP6(code/type) matching, IP-in-IP and MPLS flow matching are all
+ mutually exclusive features which cannot be supported together
+ (see :ref:`mlx5_firmware_config`).
+
+- LRO:
+
+ - Requires DevX and DV flow to be enabled.
+ - KEEP_CRC offload cannot be supported with LRO.
+ - The first mbuf length, without head-room, must be big enough to include the
+ TCP header (122B).
+ - Rx queue with LRO offload enabled, receiving a non-LRO packet, can forward
+ it with size limited to max LRO size, not to max RX packet length.
+ - LRO can be used with outer header of TCP packets of the standard format:
+ eth (with or without vlan) / ipv4 or ipv6 / tcp / payload
+
+ Other TCP packets (e.g. with MPLS label) received on Rx queue with LRO enabled, will be received with bad checksum.
+ - LRO packet aggregation is performed by HW only for packet size larger than
+ ``lro_min_mss_size``. This value is reported on device start, when debug
+ mode is enabled.
+
+- CRC:
+
+ - ``DEV_RX_OFFLOAD_KEEP_CRC`` cannot be supported with decapsulation
+ for some NICs (such as ConnectX-6 Dx, ConnectX-6 Lx, and BlueField-2).
+ The capability bit ``scatter_fcs_w_decap_disable`` shows NIC support.
+
+- TX mbuf fast free:
+
+ - fast free offload assumes that all mbufs being sent originate from the
+ same memory pool and that there are no extra references to the mbufs (the
+ reference counter for each mbuf is equal to 1 on tx_burst call). The latter
+ means there should be no externally attached buffers in mbufs. It is
+ the application's responsibility to provide the correct mbufs if the fast
+ free offload is engaged. The mlx5 PMD implicitly produces mbufs with
+ externally attached buffers if the MPRQ option is enabled; hence, the fast
+ free offload is neither supported nor advertised if MPRQ is enabled.
+
+- Sample flow:
+
+ - Supports ``RTE_FLOW_ACTION_TYPE_SAMPLE`` action only within NIC Rx and
+ E-Switch steering domain.
+ - For E-Switch Sampling flow with sample ratio > 1, additional actions are not
+ supported in the sample actions list.
+ - For ConnectX-5, the ``RTE_FLOW_ACTION_TYPE_SAMPLE`` is typically used as
+ first action in the E-Switch egress flow if with header modify or
+ encapsulation actions.
+ - For NIC Rx flow, supports ``MARK``, ``COUNT``, ``QUEUE``, ``RSS`` in the
+ sample actions list.
+ - For E-Switch mirroring flow, supports ``RAW ENCAP``, ``Port ID``,
+ ``VXLAN ENCAP``, ``NVGRE ENCAP`` in the sample actions list.
+
+- Modify Field flow:
+
+ - Supports the 'set' operation only for ``RTE_FLOW_ACTION_TYPE_MODIFY_FIELD`` action.
+ - Modification of an arbitrary place in a packet via the special ``RTE_FLOW_FIELD_START`` Field ID is not supported.
+ - Modification of the 802.1Q Tag, VXLAN Network or GENEVE Network ID's is not supported.
+ - Encapsulation levels are not supported, can modify outermost header fields only.
+ - Offsets must be 32-bits aligned, cannot skip past the boundary of a field.
+
+- IPv6 header item 'proto' field, indicating the next header protocol, should
+ not be set as extension header.
+ In case the next header is an extension header, it should not be specified in
+ IPv6 header item 'proto' field.
+ The last extension header item 'next header' field can specify the following
+ header protocol type.
+
+- Hairpin:
+
+ - Hairpin between two ports supports only manual binding and explicit Tx flow mode. For single port hairpin, all the combinations of auto/manual binding and explicit/implicit Tx flow mode are supported.
+ - Hairpin in switchdev SR-IOV mode is not yet supported.
+
- Meter:
- All the meter colors with drop action will be counted only by the global drop statistics.
@@ -1438,13 +1732,17 @@ the DPDK application.
echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
-Sub-Function representor
-------------------------
+SubFunction support
+-------------------
+SubFunction is a portion of the PCI device. An SF netdev has its own
+dedicated queues (txq, rxq). An SF shares PCI-level resources with other SFs
+and/or with its parent PCI function.
-Sub-Function is a portion of the PCI device, a SF netdev has its own
-dedicated queues(txq, rxq). A SF netdev supports E-Switch representation
-offload similar to existing PF and VF representors. A SF shares PCI
-level resources with other SFs and/or with its parent PCI function.
+0. Requirement::
+
+ kernel version >= 5.12 or OFED version >= 5.6
+
+ iproute2 >= 5.11
1. Configure SF feature::
@@ -1457,21 +1755,34 @@ level resources with other SFs and/or with its parent PCI function.
2: 32 SFs
3: 64 SFs
-2. Reset the FW::
+2. Enable switchdev mode::
- mlxfwreset -d <mst device> reset
+ devlink dev eswitch set pci/<DBDF> mode switchdev
-3. Enable switchdev mode::
+3. Add SF port::
- echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
+ devlink port add pci/<DBDF> flavour pcisf pfnum 0 sfnum <sfnum>
+
+ Get SFID from output: pci/<DBDF>/<SFID>
+
+4. Modify MAC address::
+
+ devlink port function set pci/<DBDF>/<SFID> hw_addr <MAC>
+
+5. Activate SF port::
+
+ devlink port function set pci/<DBDF>/<SFID> state active
-4. Create SF::
+6. Devargs to probe SF device::
- mlnx-sf -d <PCI_BDF> -a create
+ auxiliary:mlx5_core.sf.9,dv_flow_en=1
-5. Probe SF representor::
+SubFunction representor support
+-------------------------------
+An SF netdev supports E-Switch representation offload similar to existing PF
+and VF representors. Use <sfnum> to probe the SF representor::
- testpmd> port attach <PCI_BDF>,representor=sf0,dv_flow_en=1
+ testpmd> port attach <PCI_BDF>,representor=sf<sfnum>,dv_flow_en=1
Performance tuning
------------------
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 6fdb310129..8678502595 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -128,6 +128,17 @@ struct ethtool_link_settings {
#define ETHTOOL_LINK_MODE_200000baseCR4_Full_BIT 2 /* 66 - 64 */
#endif
+/* Get interface index from SubFunction device name. */
+int
+mlx5_auxiliary_get_ifindex(const char *sf_name)
+{
+ char if_name[IF_NAMESIZE];
+
+ if (mlx5_auxiliary_get_child_name(sf_name, "/net",
+ if_name, sizeof(if_name)) != 0)
+ return -rte_errno;
+ return if_nametoindex(if_name);
+}
/**
* Get interface name from private structure.
@@ -1619,4 +1630,3 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[RTE_ETHER_ADDR_LEN])
memcpy(mac, request.ifr_hwaddr.sa_data, RTE_ETHER_ADDR_LEN);
return 0;
}
-
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 4f16230fa5..d74273a7ca 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -20,6 +20,7 @@
#include <ethdev_pci.h>
#include <rte_pci.h>
#include <rte_bus_pci.h>
+#include <rte_bus_auxiliary.h>
#include <rte_common.h>
#include <rte_kvargs.h>
#include <rte_rwlock.h>
@@ -1923,6 +1924,27 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
return pf;
}
+static void
+mlx5_os_config_default(struct mlx5_dev_config *config)
+{
+ memset(config, 0, sizeof(*config));
+ config->mps = MLX5_ARG_UNSET;
+ config->dbnc = MLX5_ARG_UNSET;
+ config->rx_vec_en = 1;
+ config->txq_inline_max = MLX5_ARG_UNSET;
+ config->txq_inline_min = MLX5_ARG_UNSET;
+ config->txq_inline_mpw = MLX5_ARG_UNSET;
+ config->txqs_inline = MLX5_ARG_UNSET;
+ config->vf_nl_en = 1;
+ config->mr_ext_memseg_en = 1;
+ config->mprq.max_memcpy_len = MLX5_MPRQ_MEMCPY_DEFAULT_LEN;
+ config->mprq.min_rxqs_num = MLX5_MPRQ_MIN_RXQS;
+ config->dv_esw_en = 1;
+ config->dv_flow_en = 1;
+ config->decap_en = 1;
+ config->log_hp_size = MLX5_ARG_UNSET;
+}
+
/**
* Register a PCI device within bonding.
*
@@ -2334,23 +2356,8 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
uint32_t restore;
/* Default configuration. */
- memset(&dev_config, 0, sizeof(struct mlx5_dev_config));
+ mlx5_os_config_default(&dev_config);
dev_config.vf = dev_config_vf;
- dev_config.mps = MLX5_ARG_UNSET;
- dev_config.dbnc = MLX5_ARG_UNSET;
- dev_config.rx_vec_en = 1;
- dev_config.txq_inline_max = MLX5_ARG_UNSET;
- dev_config.txq_inline_min = MLX5_ARG_UNSET;
- dev_config.txq_inline_mpw = MLX5_ARG_UNSET;
- dev_config.txqs_inline = MLX5_ARG_UNSET;
- dev_config.vf_nl_en = 1;
- dev_config.mr_ext_memseg_en = 1;
- dev_config.mprq.max_memcpy_len = MLX5_MPRQ_MEMCPY_DEFAULT_LEN;
- dev_config.mprq.min_rxqs_num = MLX5_MPRQ_MIN_RXQS;
- dev_config.dv_esw_en = 1;
- dev_config.dv_flow_en = 1;
- dev_config.decap_en = 1;
- dev_config.log_hp_size = MLX5_ARG_UNSET;
list[i].eth_dev = mlx5_dev_spawn(&pci_dev->device,
&list[i],
&dev_config,
@@ -2407,6 +2414,35 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
return ret;
}
+static int
+mlx5_os_parse_eth_devargs(struct rte_device *dev,
+ struct rte_eth_devargs *eth_da)
+{
+ int ret = 0;
+
+ if (dev->devargs == NULL)
+ return 0;
+ memset(eth_da, 0, sizeof(*eth_da));
+ /* Parse representor information first from class argument. */
+ if (dev->devargs->cls_str)
+ ret = rte_eth_devargs_parse(dev->devargs->cls_str, eth_da);
+ if (ret != 0) {
+ DRV_LOG(ERR, "failed to parse device arguments: %s",
+ dev->devargs->cls_str);
+ return -rte_errno;
+ }
+ if (eth_da->type == RTE_ETH_REPRESENTOR_NONE) {
+ /* Parse legacy device argument */
+ ret = rte_eth_devargs_parse(dev->devargs->args, eth_da);
+ if (ret) {
+ DRV_LOG(ERR, "failed to parse device arguments: %s",
+ dev->devargs->args);
+ return -rte_errno;
+ }
+ }
+ return 0;
+}
+
/**
* Callback to register a PCI device.
*
@@ -2421,31 +2457,13 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
static int
mlx5_os_pci_probe(struct rte_pci_device *pci_dev)
{
- struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
+ struct rte_eth_devargs eth_da = { .nb_ports = 0 };
int ret = 0;
uint16_t p;
- if (pci_dev->device.devargs) {
- /* Parse representor information from device argument. */
- if (pci_dev->device.devargs->cls_str)
- ret = rte_eth_devargs_parse
- (pci_dev->device.devargs->cls_str, &eth_da);
- if (ret) {
- DRV_LOG(ERR, "failed to parse device arguments: %s",
- pci_dev->device.devargs->cls_str);
- return -rte_errno;
- }
- if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
- /* Support legacy device argument */
- ret = rte_eth_devargs_parse
- (pci_dev->device.devargs->args, &eth_da);
- if (ret) {
- DRV_LOG(ERR, "failed to parse device arguments: %s",
- pci_dev->device.devargs->args);
- return -rte_errno;
- }
- }
- }
+ ret = mlx5_os_parse_eth_devargs(&pci_dev->device, &eth_da);
+ if (ret != 0)
+ return ret;
if (eth_da.nb_ports > 0) {
/* Iterate all port if devargs pf is range: "pf[0-1]vf[...]". */
@@ -2458,10 +2476,53 @@ mlx5_os_pci_probe(struct rte_pci_device *pci_dev)
return ret;
}
+/* Probe a single SF device on auxiliary bus, no representor support. */
+static int
+mlx5_os_auxiliary_probe(struct rte_device *dev)
+{
+ struct rte_eth_devargs eth_da = { .nb_ports = 0 };
+ struct mlx5_dev_config config;
+ struct mlx5_dev_spawn_data spawn = { .pf_bond = -1 };
+ struct rte_auxiliary_device *adev = RTE_DEV_TO_AUXILIARY(dev);
+ struct rte_eth_dev *eth_dev;
+ int ret = 0;
+
+ /* Parse ethdev devargs. */
+ ret = mlx5_os_parse_eth_devargs(dev, &eth_da);
+ if (ret != 0)
+ return ret;
+ /* Set default config data. */
+ mlx5_os_config_default(&config);
+ config.sf = 1;
+ /* Init spawn data. */
+ spawn.max_port = 1;
+ spawn.phys_port = 1;
+ spawn.phys_dev = mlx5_get_ibv_device(dev);
+ ret = mlx5_auxiliary_get_ifindex(dev->name);
+ if (ret < 0) {
+ DRV_LOG(ERR, "failed to get ethdev ifindex: %s", dev->name);
+ return ret;
+ }
+ spawn.ifindex = ret;
+ /* Spawn device. */
+ eth_dev = mlx5_dev_spawn(dev, &spawn, &config, &eth_da);
+ if (eth_dev == NULL)
+ return -rte_errno;
+ /* Post create. */
+ eth_dev->intr_handle = &adev->intr_handle;
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+ eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV;
+ eth_dev->data->numa_node = dev->numa_node;
+ }
+ rte_eth_dev_probing_finish(eth_dev);
+ return 0;
+}
+
/**
* Common bus driver callback to probe a device.
*
- * This function probe PCI bus device(s).
+ * This function probes PCI bus device(s) or a single SF on the auxiliary bus.
*
* @param[in] dev
* Pointer to the generic device.
@@ -2484,7 +2545,8 @@ mlx5_os_net_probe(struct rte_device *dev)
}
if (mlx5_dev_is_pci(dev))
return mlx5_os_pci_probe(RTE_DEV_TO_PCI(dev));
- return 0;
+ else
+ return mlx5_os_auxiliary_probe(dev);
}
static int
diff --git a/drivers/net/mlx5/linux/mlx5_os.h b/drivers/net/mlx5/linux/mlx5_os.h
index af7cbeb418..2991d37df2 100644
--- a/drivers/net/mlx5/linux/mlx5_os.h
+++ b/drivers/net/mlx5/linux/mlx5_os.h
@@ -19,4 +19,6 @@ enum {
#define MLX5_NAMESIZE IF_NAMESIZE
+int mlx5_auxiliary_get_ifindex(const char *sf_name);
+
#endif /* RTE_PMD_MLX5_OS_H_ */
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 3defdb2db3..69edd55b86 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -2319,10 +2319,12 @@ mlx5_eth_find_next(uint16_t port_id, struct rte_eth_dev *odev)
if (opriv->sh == priv->sh ||
odev->device == dev->device)
break;
- } else if (dev->device != NULL && dev->device->driver &&
- dev->device->driver->name &&
- !strcmp(dev->device->driver->name,
- MLX5_PCI_DRIVER_NAME)) {
+ } else if (dev->device != NULL && dev->device->driver != NULL &&
+ dev->device->driver->name != NULL &&
+ (strcmp(dev->device->driver->name,
+ MLX5_PCI_DRIVER_NAME) == 0 ||
+ strcmp(dev->device->driver->name,
+ MLX5_AUXILIARY_DRIVER_NAME) == 0)) {
/* odev not specified, found all mlx5 devices. */
break;
}
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 27bb34e827..b06f45fc54 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -220,6 +220,7 @@ struct mlx5_dev_config {
unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
unsigned int hw_padding:1; /* End alignment padding is supported. */
unsigned int vf:1; /* This is a VF. */
+ unsigned int sf:1; /* This is a SF. */
unsigned int tunnel_en:1;
/* Whether tunnel stateless offloads are supported. */
unsigned int mpls_en:1; /* MPLS over GRE/UDP is enabled. */
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index 25fb47c9ed..7f19b235c2 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -36,7 +36,7 @@ mlx5_promiscuous_enable(struct rte_eth_dev *dev)
dev->data->port_id);
return 0;
}
- if (priv->config.vf) {
+ if (priv->config.vf || priv->config.sf) {
ret = mlx5_os_set_promisc(dev, 1);
if (ret)
return ret;
@@ -69,7 +69,7 @@ mlx5_promiscuous_disable(struct rte_eth_dev *dev)
int ret;
dev->data->promiscuous = 0;
- if (priv->config.vf) {
+ if (priv->config.vf || priv->config.sf) {
ret = mlx5_os_set_promisc(dev, 0);
if (ret)
return ret;
@@ -109,7 +109,7 @@ mlx5_allmulticast_enable(struct rte_eth_dev *dev)
dev->data->port_id);
return 0;
}
- if (priv->config.vf) {
+ if (priv->config.vf || priv->config.sf) {
ret = mlx5_os_set_allmulti(dev, 1);
if (ret)
goto error;
@@ -142,7 +142,7 @@ mlx5_allmulticast_disable(struct rte_eth_dev *dev)
int ret;
dev->data->all_multicast = 0;
- if (priv->config.vf) {
+ if (priv->config.vf || priv->config.sf) {
ret = mlx5_os_set_allmulti(dev, 0);
if (ret)
goto error;
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 6c8a64ce03..e4e057a6f8 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -1259,7 +1259,7 @@ mlx5_traffic_enable(struct rte_eth_dev *dev)
}
mlx5_txq_release(dev, i);
}
- if (priv->config.dv_esw_en && !priv->config.vf) {
+ if (priv->config.dv_esw_en && !priv->config.vf && !priv->config.sf) {
if (mlx5_flow_create_esw_table_zero_flow(dev))
priv->fdb_def_rule = 1;
else
--
2.25.1
2021-05-27 13:37 ` [dpdk-dev] [RFC 06/14] compress/mlx5: " Xueming Li
2021-05-27 13:37 ` [dpdk-dev] [RFC 07/14] vdpa/mlx5: fix driver name Xueming Li
2021-05-27 13:37 ` [dpdk-dev] [RFC 08/14] vdpa/mlx5: remove PCI specifics Xueming Li
2021-05-27 13:37 ` [dpdk-dev] [RFC 09/14] common/mlx5: clean up legacy PCI bus driver Xueming Li
2021-05-27 14:01 ` [dpdk-dev] [RFC 10/14] bus/auxiliary: introduce auxiliary bus Xueming Li
2021-05-27 14:01 ` [dpdk-dev] [RFC 11/14] common/mlx5: support " Xueming Li
2021-05-27 14:02 ` [dpdk-dev] [RFC 12/14] common/mlx5: get PCI device address from any bus Xueming Li
2021-05-27 14:02 ` [dpdk-dev] [RFC 13/14] vdpa/mlx5: support SubFunction Xueming Li
2021-05-27 14:02 ` Xueming Li [this message]
2021-06-10 10:33 ` [dpdk-dev] [RFC 00/14] mlx5: " Ferruh Yigit
2021-06-10 13:23 ` Thomas Monjalon
2021-06-11 5:14 ` Xia, Chenbo
2021-06-11 7:54 ` Thomas Monjalon
2021-06-15 2:10 ` Xia, Chenbo
2021-06-15 4:04 ` Parav Pandit
2021-06-15 5:33 ` Xia, Chenbo
2021-06-15 5:43 ` Parav Pandit
2021-06-15 11:19 ` Xia, Chenbo
2021-06-15 12:47 ` Parav Pandit
2021-06-15 15:19 ` Jason Gunthorpe