DPDK patches and discussions
 help / color / mirror / Atom feed
From: Ferruh Yigit <ferruh.yigit@amd.com>
To: Jiawei Wang <jiaweiw@nvidia.com>,
	viacheslavo@nvidia.com, orika@nvidia.com, thomas@monjalon.net,
	andrew.rybchenko@oktetlabs.ru,
	Aman Singh <aman.deep.singh@intel.com>,
	Yuying Zhang <yuying.zhang@intel.com>
Cc: dev@dpdk.org, rasland@nvidia.com
Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API
Date: Thu, 9 Feb 2023 19:44:54 +0000	[thread overview]
Message-ID: <562e77e6-ea8a-7a56-6fbd-8aba70b295d4@amd.com> (raw)
In-Reply-To: <20230203133339.30271-2-jiaweiw@nvidia.com>

On 2/3/2023 1:33 PM, Jiawei Wang wrote:
> When multiple physical ports are connected to a single DPDK port,
> (example: kernel bonding, DPDK bonding, failsafe, etc.),
> we want to know which physical port is used for Rx and Tx.
> 

I assume "kernel bonding" is out of context, but this patch concerns
DPDK bonding, failsafe or softnic. (I will refer them as virtual bonding
device.)

To use specific queues of the virtual bonding device may interfere with
the logic of these devices, like bonding modes or RSS of the underlying
devices. I can see feature focuses on a very specific use case, but not
sure if all possible side effects taken into consideration.


And although the feature is only relavent to virtual bondiong device,
core ethdev structures are updated for this. Most use cases won't need
these, so is there a way to reduce the scope of the changes to virtual
bonding devices?


There are a few very core ethdev APIs, like:
rte_eth_dev_configure()
rte_eth_tx_queue_setup()
rte_eth_rx_queue_setup()
rte_eth_dev_start()
rte_eth_dev_info_get()

Almost every user of ehtdev uses these APIs, since these are so
fundemental I am for being a little more conservative on these APIs.

Every eccentric features are targetting these APIs first because they
are common and extending them gives an easy solution, but in long run
making these APIs more complex, harder to maintain and harder for PMDs
to support them correctly. So I am for not updating them unless it is a
generic use case.


Also as we talked about PMDs supporting them, I assume your coming PMD
patch will be implementing 'tx_phy_affinity' config option only for mlx
drivers. What will happen for other NICs? Will they silently ignore the
config option from user? So this is a problem for the DPDK application
portabiltiy.



As far as I understand target is application controlling which
sub-device is used under the virtual bonding device, can you pleaes give
more information why this is required, perhaps it can help to provide a
better/different solution.
Like adding the ability to use both bonding device and sub-device for
data path, this way application can use whichever it wants. (this is
just first solution I come with, I am not suggesting as replacement
solution, but if you can describe the problem more I am sure other
people can come with better solutions.)

And isn't this against the applicatio transparent to underneath device
being bonding device or actual device?


> This patch maps a DPDK Tx queue with a physical port,
> by adding tx_phy_affinity setting in Tx queue.
> The affinity number is the physical port ID where packets will be
> sent.
> Value 0 means no affinity and traffic could be routed to any
> connected physical ports, this is the default current behavior.
> 
> The number of physical ports is reported with rte_eth_dev_info_get().
> 
> The new tx_phy_affinity field is added into the padding hole of
> rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
> An ABI check rule needs to be added to avoid false warning.
> 
> Add the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
> 
> For example, there're two physical ports connected to
> a single DPDK port (port id 0), and phy_affinity 1 stood for
> the first physical port and phy_affinity 2 stood for the second
> physical port.
> Use the below commands to config tx phy affinity for per Tx Queue:
>         port config 0 txq 0 phy_affinity 1
>         port config 0 txq 1 phy_affinity 1
>         port config 0 txq 2 phy_affinity 2
>         port config 0 txq 3 phy_affinity 2
> 
> These commands config the Tx Queue index 0 and Tx Queue index 1 with
> phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets,
> these packets will be sent from the first physical port, and similar
> with the second physical port if sending packets with Tx Queue 2
> or Tx Queue 3.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> ---
>  app/test-pmd/cmdline.c                      | 100 ++++++++++++++++++++
>  app/test-pmd/config.c                       |   1 +
>  devtools/libabigail.abignore                |   5 +
>  doc/guides/rel_notes/release_23_03.rst      |   4 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  13 +++
>  lib/ethdev/rte_ethdev.h                     |  10 ++
>  6 files changed, 133 insertions(+)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index cb8c174020..f771fcf8ac 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void *parsed_result,
>  
>  			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
>  			"    Cleanup txq mbufs for a specific Tx queue\n\n"
> +
> +			"port config (port_id) txq (queue_id) phy_affinity (value)\n"
> +			"    Set the physical affinity value "
> +			"on a specific Tx queue\n\n"
>  		);
>  	}
>  
> @@ -12633,6 +12637,101 @@ static cmdline_parse_inst_t cmd_show_port_flow_transfer_proxy = {
>  	}
>  };
>  
> +/* *** configure port txq phy_affinity value *** */
> +struct cmd_config_tx_phy_affinity {
> +	cmdline_fixed_string_t port;
> +	cmdline_fixed_string_t config;
> +	portid_t portid;
> +	cmdline_fixed_string_t txq;
> +	uint16_t qid;
> +	cmdline_fixed_string_t phy_affinity;
> +	uint8_t value;
> +};
> +
> +static void
> +cmd_config_tx_phy_affinity_parsed(void *parsed_result,
> +				  __rte_unused struct cmdline *cl,
> +				  __rte_unused void *data)
> +{
> +	struct cmd_config_tx_phy_affinity *res = parsed_result;
> +	struct rte_eth_dev_info dev_info;
> +	struct rte_port *port;
> +	int ret;
> +
> +	if (port_id_is_invalid(res->portid, ENABLED_WARN))
> +		return;
> +
> +	if (res->portid == (portid_t)RTE_PORT_ALL) {
> +		printf("Invalid port id\n");
> +		return;
> +	}
> +
> +	port = &ports[res->portid];
> +
> +	if (strcmp(res->txq, "txq")) {
> +		printf("Unknown parameter\n");
> +		return;
> +	}
> +	if (tx_queue_id_is_invalid(res->qid))
> +		return;
> +
> +	ret = eth_dev_info_get_print_err(res->portid, &dev_info);
> +	if (ret != 0)
> +		return;
> +
> +	if (dev_info.nb_phy_ports == 0) {
> +		printf("Number of physical ports is 0 which is invalid for PHY Affinity\n");
> +		return;
> +	}
> +	printf("The number of physical ports is %u\n", dev_info.nb_phy_ports);
> +	if (dev_info.nb_phy_ports < res->value) {
> +		printf("The PHY affinity value %u is Invalid, exceeds the "
> +		       "number of physical ports\n", res->value);
> +		return;
> +	}
> +	port->txq[res->qid].conf.tx_phy_affinity = res->value;
> +
> +	cmd_reconfig_device_queue(res->portid, 0, 1);
> +}
> +
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 port, "port");
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 config, "config");
> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid =
> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 portid, RTE_UINT16);
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 txq, "txq");
> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid =
> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +			      qid, RTE_UINT16);
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 phy_affinity, "phy_affinity");
> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value =
> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +			      value, RTE_UINT8);
> +
> +static cmdline_parse_inst_t cmd_config_tx_phy_affinity = {
> +	.f = cmd_config_tx_phy_affinity_parsed,
> +	.data = (void *)0,
> +	.help_str = "port config <port_id> txq <queue_id> phy_affinity <value>",
> +	.tokens = {
> +		(void *)&cmd_config_tx_phy_affinity_port,
> +		(void *)&cmd_config_tx_phy_affinity_config,
> +		(void *)&cmd_config_tx_phy_affinity_portid,
> +		(void *)&cmd_config_tx_phy_affinity_txq,
> +		(void *)&cmd_config_tx_phy_affinity_qid,
> +		(void *)&cmd_config_tx_phy_affinity_hwport,
> +		(void *)&cmd_config_tx_phy_affinity_value,
> +		NULL,
> +	},
> +};
> +
>  /* ******************************************************************************** */
>  
>  /* list of instructions */
> @@ -12866,6 +12965,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
>  	(cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
>  	(cmdline_parse_inst_t *)&cmd_show_port_cman_config,
>  	(cmdline_parse_inst_t *)&cmd_set_port_cman_config,
> +	(cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity,
>  	NULL,
>  };
>  
> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
> index acccb6b035..b83fb17cfa 100644
> --- a/app/test-pmd/config.c
> +++ b/app/test-pmd/config.c
> @@ -936,6 +936,7 @@ port_infos_display(portid_t port_id)
>  		printf("unknown\n");
>  		break;
>  	}
> +	printf("Current number of physical ports: %u\n", dev_info.nb_phy_ports);
>  }
>  
>  void
> diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
> index 7a93de3ba1..ac7d3fb2da 100644
> --- a/devtools/libabigail.abignore
> +++ b/devtools/libabigail.abignore
> @@ -34,3 +34,8 @@
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>  ; Temporary exceptions till next major ABI version ;
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> +
> +; Ignore fields inserted in padding hole of rte_eth_txconf
> +[suppress_type]
> +        name = rte_eth_txconf
> +        has_data_member_inserted_between = {offset_of(tx_deferred_start), offset_of(offloads)}
> diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
> index 73f5d94e14..e99bd2dcb6 100644
> --- a/doc/guides/rel_notes/release_23_03.rst
> +++ b/doc/guides/rel_notes/release_23_03.rst
> @@ -55,6 +55,10 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
>  
> +* **Added affinity for multiple physical ports connected to a single DPDK port.**
> +
> +  * Added Tx affinity in queue setup to map a physical port.
> +
>  * **Updated AMD axgbe driver.**
>  
>    * Added multi-process support.
> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> index 79a1fa9cb7..5c716f7679 100644
> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> @@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only on a specific Tx queue::
>  
>  This command should be run when the port is stopped, or else it will fail.
>  
> +config per queue Tx physical affinity
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Configure a per queue physical affinity value only on a specific Tx queue::
> +
> +   testpmd> port (port_id) txq (queue_id) phy_affinity (value)
> +
> +* ``phy_affinity``: physical port to use for sending,
> +                    when multiple physical ports are connected to
> +                    a single DPDK port.
> +
> +This command should be run when the port is stopped, otherwise it fails.
> +
>  Config VXLAN Encap outer layers
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index c129ca1eaf..2fd971b7b5 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1138,6 +1138,14 @@ struct rte_eth_txconf {
>  				      less free descriptors than this value. */
>  
>  	uint8_t tx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
> +	/**
> +	 * Affinity with one of the multiple physical ports connected to the DPDK port.
> +	 * Value 0 means no affinity and traffic could be routed to any connected
> +	 * physical port.
> +	 * The first physical port is number 1 and so on.
> +	 * Number of physical ports is reported by nb_phy_ports in rte_eth_dev_info.
> +	 */
> +	uint8_t tx_phy_affinity;
>  	/**
>  	 * Per-queue Tx offloads to be set  using RTE_ETH_TX_OFFLOAD_* flags.
>  	 * Only offloads set on tx_queue_offload_capa or tx_offload_capa
> @@ -1744,6 +1752,8 @@ struct rte_eth_dev_info {
>  	/** Device redirection table size, the total number of entries. */
>  	uint16_t reta_size;
>  	uint8_t hash_key_size; /**< Hash key size in bytes */
> +	/** Number of physical ports connected with DPDK port. */
> +	uint8_t nb_phy_ports;
>  	/** Bit mask of RSS offloads, the bit offset also means flow type */
>  	uint64_t flow_type_rss_offloads;
>  	struct rte_eth_rxconf default_rxconf; /**< Default Rx configuration */


  parent reply	other threads:[~2023-02-09 19:45 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20221221102934.13822-1-jiaweiw@nvidia.com/>
2023-02-03  5:07 ` [PATCH v3 0/2] add new PHY affinity in the flow item and " Jiawei Wang
2023-02-03  5:07   ` [PATCH v3 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
2023-02-03  5:07   ` [PATCH v3 2/2] ethdev: add PHY affinity match item Jiawei Wang
2023-02-03 13:33   ` [PATCH v4 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
2023-02-03 13:33     ` [PATCH v4 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
2023-02-06 15:29       ` Jiawei(Jonny) Wang
2023-02-07  9:40       ` Ori Kam
2023-02-09 19:44       ` Ferruh Yigit [this message]
2023-02-10 14:06         ` Jiawei(Jonny) Wang
2023-02-14  9:38         ` Jiawei(Jonny) Wang
2023-02-14 10:01           ` Ferruh Yigit
2023-02-03 13:33     ` [PATCH v4 2/2] ethdev: add PHY affinity match item Jiawei Wang
2023-02-14 15:48   ` [PATCH v5 0/2] Added support for Tx queue mapping with an aggregated port Jiawei Wang
2023-02-14 15:48     ` [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports Jiawei Wang
2023-02-15 11:41       ` Jiawei(Jonny) Wang
2023-02-16 17:42       ` Thomas Monjalon
2023-02-17  6:45         ` Jiawei(Jonny) Wang
2023-02-16 17:58       ` Ferruh Yigit
2023-02-17  6:44         ` Jiawei(Jonny) Wang
2023-02-17  8:24         ` Andrew Rybchenko
2023-02-17  9:50           ` Jiawei(Jonny) Wang
2023-02-14 15:48     ` [PATCH v5 2/2] ethdev: add Aggregated affinity match item Jiawei Wang
2023-02-16 17:46       ` Thomas Monjalon
2023-02-17  6:45         ` Jiawei(Jonny) Wang
2023-02-17 10:50   ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Jiawei Wang
2023-02-17 10:50     ` [PATCH v6 1/2] ethdev: add " Jiawei Wang
2023-02-17 12:53       ` Ferruh Yigit
2023-02-17 12:56       ` Andrew Rybchenko
2023-02-17 12:59         ` Ferruh Yigit
2023-02-17 13:05           ` Jiawei(Jonny) Wang
2023-02-17 13:41         ` Jiawei(Jonny) Wang
2023-02-17 15:03           ` Andrew Rybchenko
2023-02-17 15:32             ` Ferruh Yigit
2023-02-17 10:50     ` [PATCH v6 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
2023-02-17 13:01     ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit
2023-02-17 13:07       ` Jiawei(Jonny) Wang
2023-02-17 15:47   ` [PATCH v7 " Jiawei Wang
2023-02-17 15:47     ` [PATCH v7 1/2] ethdev: add " Jiawei Wang
2023-02-17 15:47     ` [PATCH v7 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
2023-02-17 16:45     ` [PATCH v7 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=562e77e6-ea8a-7a56-6fbd-8aba70b295d4@amd.com \
    --to=ferruh.yigit@amd.com \
    --cc=aman.deep.singh@intel.com \
    --cc=andrew.rybchenko@oktetlabs.ru \
    --cc=dev@dpdk.org \
    --cc=jiaweiw@nvidia.com \
    --cc=orika@nvidia.com \
    --cc=rasland@nvidia.com \
    --cc=thomas@monjalon.net \
    --cc=viacheslavo@nvidia.com \
    --cc=yuying.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).