DPDK patches and discussions
From: Ferruh Yigit <ferruh.yigit@amd.com>
To: "Jiawei(Jonny) Wang" <jiaweiw@nvidia.com>,
	Slava Ovsiienko <viacheslavo@nvidia.com>,
	Ori Kam <orika@nvidia.com>,
	"NBU-Contact-Thomas Monjalon (EXTERNAL)" <thomas@monjalon.net>,
	"andrew.rybchenko@oktetlabs.ru" <andrew.rybchenko@oktetlabs.ru>,
	Aman Singh <aman.deep.singh@intel.com>,
	Yuying Zhang <yuying.zhang@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, Raslan Darawsheh <rasland@nvidia.com>
Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API
Date: Tue, 14 Feb 2023 10:01:31 +0000
Message-ID: <8bf62f42-c40b-536b-1946-f1158dbb31b0@amd.com>
In-Reply-To: <PH0PR12MB5451BED719E2C4C890EF96BEC6A29@PH0PR12MB5451.namprd12.prod.outlook.com>

On 2/14/2023 9:38 AM, Jiawei(Jonny) Wang wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Friday, February 10, 2023 3:45 AM
>> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
>> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
>> Yuying Zhang <yuying.zhang@intel.com>
>> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
>> Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue
>> API
>>
>> On 2/3/2023 1:33 PM, Jiawei Wang wrote:
>>> When multiple physical ports are connected to a single DPDK port,
>>> (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to
>>> know which physical port is used for Rx and Tx.
>>>
>>
>> I assume "kernel bonding" is out of context, but this patch concerns DPDK
>> bonding, failsafe or softnic. (I will refer to them as virtual bonding
>> devices.)
>>
>> Using specific queues of the virtual bonding device may interfere with the
>> logic of these devices, like bonding modes or RSS of the underlying devices. I
>> can see the feature focuses on a very specific use case, but I am not sure
>> all possible side effects have been taken into consideration.
>>
>>
>> And although the feature is only relevant to virtual bonding devices, core
>> ethdev structures are updated for this. Most use cases won't need these, so is
>> there a way to reduce the scope of the changes to virtual bonding devices?
>>
>>
>> There are a few very core ethdev APIs, like:
>> rte_eth_dev_configure()
>> rte_eth_tx_queue_setup()
>> rte_eth_rx_queue_setup()
>> rte_eth_dev_start()
>> rte_eth_dev_info_get()
>>
>> Almost every user of ethdev uses these APIs, and since they are so fundamental
>> I am for being a little more conservative with them.
>>
>> Every eccentric feature targets these APIs first because they are
>> common and extending them gives an easy solution, but in the long run this
>> makes these APIs more complex, harder to maintain and harder for PMDs to
>> support correctly. So I am for not updating them unless it is a generic use case.
>>
>>
>> Also, speaking of PMDs supporting them, I assume your upcoming PMD
>> patch will implement the 'tx_phy_affinity' config option only for mlx drivers.
>> What will happen for other NICs? Will they silently ignore the config option
>> from the user? That is a problem for DPDK application portability.
>>
>>
>>
>> As far as I understand, the target is for the application to control which
>> sub-device is used under the virtual bonding device. Can you please give more
>> information on why this is required? Perhaps it can help to find a
>> better/different solution, like adding the ability to use both the bonding
>> device and the sub-device for the data path, so that the application can use
>> whichever it wants. (This is just the first solution I came up with; I am not
>> suggesting it as a replacement, but if you can describe the problem more I am
>> sure other people can come up with better solutions.)
>>
>> And isn't this against the application being transparent to whether the
>> underlying device is a bonding device or an actual device?
>>
>>
> 
> OK, I will send a new version with separate functions in the ethdev layer
> to map a Tx queue to a port and to get the number of ports.
> These functions work through device ops callbacks; for other NICs they will
> report "unsupported" because the ops callback is NULL.
> 

OK, thanks Jonny, at least this separates the feature into its own APIs,
which reduces the impact on applications and drivers that are not using
this feature.
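
To illustrate the shape being discussed, below is a minimal standalone
sketch of a separate API backed by an optional driver callback. It does not
use the real ethdev structures: every name in it (demo_dev_ops,
demo_count_phy_ports(), demo_map_txq_phy_port(), the "bond" and "plain"
drivers) is hypothetical and made up for illustration, and the actual next
version of the patch may look quite different. The only point is that a
driver which leaves the callback NULL gets -ENOTSUP instead of a silently
ignored setting.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NB_TXQ 4 /* hypothetical number of Tx queues per device */

/* Hypothetical per-driver callback table; both callbacks are optional. */
struct demo_dev_ops {
	int (*count_phy_ports)(void *priv);
	int (*map_txq_phy_port)(void *priv, uint16_t txq, uint8_t phy_port);
};

struct demo_dev {
	const struct demo_dev_ops *ops;
	void *priv; /* driver private data */
};

/* Number of physical ports behind the device, or -ENOTSUP. */
static int
demo_count_phy_ports(const struct demo_dev *dev)
{
	assert(dev != NULL && dev->ops != NULL);
	if (dev->ops->count_phy_ports == NULL)
		return -ENOTSUP;
	return dev->ops->count_phy_ports(dev->priv);
}

/* Map a Tx queue to a physical port (1-based, 0 means no affinity). */
static int
demo_map_txq_phy_port(const struct demo_dev *dev, uint16_t txq,
		      uint8_t phy_port)
{
	int nb_ports = demo_count_phy_ports(dev);

	if (nb_ports < 0)
		return nb_ports;
	if (dev->ops->map_txq_phy_port == NULL)
		return -ENOTSUP;
	if (txq >= DEMO_NB_TXQ || phy_port > nb_ports)
		return -EINVAL;
	return dev->ops->map_txq_phy_port(dev->priv, txq, phy_port);
}

/* Hypothetical driver with two physical ports behind one device. */
static int
demo_bond_count(void *priv)
{
	(void)priv;
	return 2;
}

static int
demo_bond_map(void *priv, uint16_t txq, uint8_t phy_port)
{
	uint8_t *txq_to_port = priv; /* array of DEMO_NB_TXQ entries */

	txq_to_port[txq] = phy_port;
	return 0;
}

static const struct demo_dev_ops demo_bond_ops = {
	.count_phy_ports = demo_bond_count,
	.map_txq_phy_port = demo_bond_map,
};

/* Hypothetical driver that does not implement the feature at all. */
static const struct demo_dev_ops demo_plain_ops = { NULL, NULL };
```

With this split, an application calling demo_map_txq_phy_port() on a device
using demo_plain_ops gets -ENOTSUP and can decide what to do, and drivers
that do not care about the feature do not need to be touched.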


>>> This patch maps a DPDK Tx queue with a physical port, by adding
>>> tx_phy_affinity setting in Tx queue.
>>> The affinity number is the physical port ID where packets will be
>>> sent.
>>> Value 0 means no affinity and traffic could be routed to any connected
>>> physical ports, this is the default current behavior.
>>>
>>> The number of physical ports is reported with rte_eth_dev_info_get().
>>>
>>> The new tx_phy_affinity field is added into the padding hole of the
>>> rte_eth_txconf structure, so the size of rte_eth_txconf stays the same.
>>> An ABI check rule needs to be added to avoid a false warning.
>>>
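
The padding-hole argument can be checked at compile time. The sketch below
uses two simplified stand-in structures, not the real rte_eth_txconf; the
demo_* names and the reduced member list are assumptions for illustration.
It only shows that a one-byte field placed right after another one-byte
field, and before a 64-bit member, lands in existing padding and moves
nothing that follows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the start of the structure before the patch. */
struct demo_txconf_old {
	uint8_t thresh[3];
	uint16_t tx_rs_thresh;
	uint16_t tx_free_thresh;
	uint8_t tx_deferred_start;
	/* padding up to the alignment of the next member */
	uint64_t offloads;
};

/* Same layout with the new one-byte field placed in the padding. */
struct demo_txconf_new {
	uint8_t thresh[3];
	uint16_t tx_rs_thresh;
	uint16_t tx_free_thresh;
	uint8_t tx_deferred_start;
	uint8_t tx_phy_affinity;
	uint64_t offloads;
};

/* Neither the total size nor the offset of later members changes. */
static_assert(sizeof(struct demo_txconf_old) ==
	      sizeof(struct demo_txconf_new),
	      "new field must fit in existing padding");
static_assert(offsetof(struct demo_txconf_old, offloads) ==
	      offsetof(struct demo_txconf_new, offloads),
	      "members after the hole must not move");
```

The ABI checker still sees a new data member, which is why the commit
message mentions a suppression rule.
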
>>> Add the testpmd command line:
>>> testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
>>>
>>> For example, there are two physical ports connected to a single DPDK
>>> port (port id 0): phy_affinity 1 stands for the first physical port
>>> and phy_affinity 2 stands for the second physical port.
>>> Use the commands below to configure the Tx PHY affinity per Tx queue:
>>>         port config 0 txq 0 phy_affinity 1
>>>         port config 0 txq 1 phy_affinity 1
>>>         port config 0 txq 2 phy_affinity 2
>>>         port config 0 txq 3 phy_affinity 2
>>>
>>> These commands configure Tx Queue 0 and Tx Queue 1 with
>>> phy affinity 1: packets sent on Tx Queue 0 or Tx Queue 1
>>> leave through the first physical port. Likewise, packets sent
>>> on Tx Queue 2 or Tx Queue 3 leave through the second physical
>>> port.
>>>
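
The numbering convention described above (0 means no affinity, physical
ports are numbered from 1) can be sketched as a small standalone model. The
structure and function names here (demo_port, demo_set_affinity(),
demo_egress_port()) are hypothetical and not part of the patch; the checks
mirror the ones the testpmd command performs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NB_TXQ 4 /* hypothetical number of Tx queues */

/* Hypothetical per-port state: one affinity value per Tx queue. */
struct demo_port {
	uint8_t nb_phy_ports;              /* physical ports behind the port */
	uint8_t txq_affinity[DEMO_NB_TXQ]; /* 0 = none, 1..nb_phy_ports */
};

/* Store the affinity of a Tx queue after validating it. */
static int
demo_set_affinity(struct demo_port *port, uint16_t qid, uint8_t value)
{
	assert(port != NULL);
	if (qid >= DEMO_NB_TXQ)
		return -EINVAL;
	if (port->nb_phy_ports == 0 || value > port->nb_phy_ports)
		return -EINVAL;
	port->txq_affinity[qid] = value;
	return 0;
}

/*
 * Physical port (1-based) that carries packets sent on a Tx queue,
 * 0 when any physical port may be used, -EINVAL for a bad queue id.
 */
static int
demo_egress_port(const struct demo_port *port, uint16_t qid)
{
	assert(port != NULL);
	if (qid >= DEMO_NB_TXQ)
		return -EINVAL;
	return port->txq_affinity[qid];
}
```

Applying the four example commands to a port with nb_phy_ports = 2 would
leave queues 0 and 1 on physical port 1 and queues 2 and 3 on physical
port 2, while a value of 3 would be rejected.
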
>>> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
>>> ---
>>>  app/test-pmd/cmdline.c                      | 100 ++++++++++++++++++++
>>>  app/test-pmd/config.c                       |   1 +
>>>  devtools/libabigail.abignore                |   5 +
>>>  doc/guides/rel_notes/release_23_03.rst      |   4 +
>>>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  13 +++
>>>  lib/ethdev/rte_ethdev.h                     |  10 ++
>>>  6 files changed, 133 insertions(+)
>>>
>>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
>>> index cb8c174020..f771fcf8ac 100644
>>> --- a/app/test-pmd/cmdline.c
>>> +++ b/app/test-pmd/cmdline.c
>>> @@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void *parsed_result,
>>>
>>>  			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
>>>  			"    Cleanup txq mbufs for a specific Tx queue\n\n"
>>> +
>>> +			"port config (port_id) txq (queue_id) phy_affinity (value)\n"
>>> +			"    Set the physical affinity value "
>>> +			"on a specific Tx queue\n\n"
>>>  		);
>>>  	}
>>>
>>> @@ -12633,6 +12637,101 @@ static cmdline_parse_inst_t cmd_show_port_flow_transfer_proxy = {
>>>  	}
>>>  };
>>>
>>> +/* *** configure port txq phy_affinity value *** */
>>> +struct cmd_config_tx_phy_affinity {
>>> +	cmdline_fixed_string_t port;
>>> +	cmdline_fixed_string_t config;
>>> +	portid_t portid;
>>> +	cmdline_fixed_string_t txq;
>>> +	uint16_t qid;
>>> +	cmdline_fixed_string_t phy_affinity;
>>> +	uint8_t value;
>>> +};
>>> +
>>> +static void
>>> +cmd_config_tx_phy_affinity_parsed(void *parsed_result,
>>> +				  __rte_unused struct cmdline *cl,
>>> +				  __rte_unused void *data)
>>> +{
>>> +	struct cmd_config_tx_phy_affinity *res = parsed_result;
>>> +	struct rte_eth_dev_info dev_info;
>>> +	struct rte_port *port;
>>> +	int ret;
>>> +
>>> +	if (port_id_is_invalid(res->portid, ENABLED_WARN))
>>> +		return;
>>> +
>>> +	if (res->portid == (portid_t)RTE_PORT_ALL) {
>>> +		printf("Invalid port id\n");
>>> +		return;
>>> +	}
>>> +
>>> +	port = &ports[res->portid];
>>> +
>>> +	if (strcmp(res->txq, "txq")) {
>>> +		printf("Unknown parameter\n");
>>> +		return;
>>> +	}
>>> +	if (tx_queue_id_is_invalid(res->qid))
>>> +		return;
>>> +
>>> +	ret = eth_dev_info_get_print_err(res->portid, &dev_info);
>>> +	if (ret != 0)
>>> +		return;
>>> +
>>> +	if (dev_info.nb_phy_ports == 0) {
>>> +		printf("Number of physical ports is 0 which is invalid for PHY Affinity\n");
>>> +		return;
>>> +	}
>>> +	printf("The number of physical ports is %u\n", dev_info.nb_phy_ports);
>>> +	if (dev_info.nb_phy_ports < res->value) {
>>> +		printf("The PHY affinity value %u is Invalid, exceeds the "
>>> +		       "number of physical ports\n", res->value);
>>> +		return;
>>> +	}
>>> +	port->txq[res->qid].conf.tx_phy_affinity = res->value;
>>> +
>>> +	cmd_reconfig_device_queue(res->portid, 0, 1);
>>> +}
>>> +
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port =
>>> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +				 port, "port");
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config =
>>> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +				 config, "config");
>>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid =
>>> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +				 portid, RTE_UINT16);
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq =
>>> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +				 txq, "txq");
>>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid =
>>> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +			      qid, RTE_UINT16);
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport =
>>> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +				 phy_affinity, "phy_affinity");
>>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value =
>>> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +			      value, RTE_UINT8);
>>> +
>>> +static cmdline_parse_inst_t cmd_config_tx_phy_affinity = {
>>> +	.f = cmd_config_tx_phy_affinity_parsed,
>>> +	.data = (void *)0,
>>> +	.help_str = "port config <port_id> txq <queue_id> phy_affinity <value>",
>>> +	.tokens = {
>>> +		(void *)&cmd_config_tx_phy_affinity_port,
>>> +		(void *)&cmd_config_tx_phy_affinity_config,
>>> +		(void *)&cmd_config_tx_phy_affinity_portid,
>>> +		(void *)&cmd_config_tx_phy_affinity_txq,
>>> +		(void *)&cmd_config_tx_phy_affinity_qid,
>>> +		(void *)&cmd_config_tx_phy_affinity_hwport,
>>> +		(void *)&cmd_config_tx_phy_affinity_value,
>>> +		NULL,
>>> +	},
>>> +};
>>> +
>>>  /* ******************************************************************************** */
>>>
>>>  /* list of instructions */
>>> @@ -12866,6 +12965,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
>>>  	(cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
>>>  	(cmdline_parse_inst_t *)&cmd_show_port_cman_config,
>>>  	(cmdline_parse_inst_t *)&cmd_set_port_cman_config,
>>> +	(cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity,
>>>  	NULL,
>>>  };
>>>
>>> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
>>> index acccb6b035..b83fb17cfa 100644
>>> --- a/app/test-pmd/config.c
>>> +++ b/app/test-pmd/config.c
>>> @@ -936,6 +936,7 @@ port_infos_display(portid_t port_id)
>>>  		printf("unknown\n");
>>>  		break;
>>>  	}
>>> +	printf("Current number of physical ports: %u\n", dev_info.nb_phy_ports);
>>>  }
>>>
>>>  void
>>> diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
>>> index 7a93de3ba1..ac7d3fb2da 100644
>>> --- a/devtools/libabigail.abignore
>>> +++ b/devtools/libabigail.abignore
>>> @@ -34,3 +34,8 @@
>>>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>>  ; Temporary exceptions till next major ABI version ;
>>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>> +
>>> +; Ignore fields inserted in padding hole of rte_eth_txconf
>>> +[suppress_type]
>>> +        name = rte_eth_txconf
>>> +        has_data_member_inserted_between = {offset_of(tx_deferred_start), offset_of(offloads)}
>>> diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
>>> index 73f5d94e14..e99bd2dcb6 100644
>>> --- a/doc/guides/rel_notes/release_23_03.rst
>>> +++ b/doc/guides/rel_notes/release_23_03.rst
>>> @@ -55,6 +55,10 @@ New Features
>>>       Also, make sure to start the actual text at the margin.
>>>       =======================================================
>>>
>>> +* **Added affinity for multiple physical ports connected to a single DPDK port.**
>>> +
>>> +  * Added Tx affinity in queue setup to map a physical port.
>>> +
>>>  * **Updated AMD axgbe driver.**
>>>
>>>    * Added multi-process support.
>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> index 79a1fa9cb7..5c716f7679 100644
>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> @@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only on a specific Tx queue::
>>>
>>>  This command should be run when the port is stopped, or else it will fail.
>>>
>>> +config per queue Tx physical affinity
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Configure a per queue physical affinity value only on a specific Tx queue::
>>> +
>>> +   testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
>>> +
>>> +* ``phy_affinity``: physical port to use for sending,
>>> +                    when multiple physical ports are connected to
>>> +                    a single DPDK port.
>>> +
>>> +This command should be run when the port is stopped, otherwise it fails.
>>> +
>>>  Config VXLAN Encap outer layers
>>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>> index c129ca1eaf..2fd971b7b5 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -1138,6 +1138,14 @@ struct rte_eth_txconf {
>>>  				      less free descriptors than this value. */
>>>
>>>  	uint8_t tx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
>>> +	/**
>>> +	 * Affinity with one of the multiple physical ports connected to the DPDK port.
>>> +	 * Value 0 means no affinity and traffic could be routed to any connected
>>> +	 * physical port.
>>> +	 * The first physical port is number 1 and so on.
>>> +	 * Number of physical ports is reported by nb_phy_ports in rte_eth_dev_info.
>>> +	 */
>>> +	uint8_t tx_phy_affinity;
>>>  	/**
>>>  	 * Per-queue Tx offloads to be set  using RTE_ETH_TX_OFFLOAD_* flags.
>>>  	 * Only offloads set on tx_queue_offload_capa or tx_offload_capa
>>> @@ -1744,6 +1752,8 @@ struct rte_eth_dev_info {
>>>  	/** Device redirection table size, the total number of entries. */
>>>  	uint16_t reta_size;
>>>  	uint8_t hash_key_size; /**< Hash key size in bytes */
>>> +	/** Number of physical ports connected with DPDK port. */
>>> +	uint8_t nb_phy_ports;
>>>  	/** Bit mask of RSS offloads, the bit offset also means flow type */
>>>  	uint64_t flow_type_rss_offloads;
>>>  	struct rte_eth_rxconf default_rxconf; /**< Default Rx configuration */
> 



Thread overview: 40+ messages
     [not found] <20221221102934.13822-1-jiaweiw@nvidia.com/>
2023-02-03  5:07 ` [PATCH v3 0/2] add new PHY affinity in the flow item and " Jiawei Wang
2023-02-03  5:07   ` [PATCH v3 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
2023-02-03  5:07   ` [PATCH v3 2/2] ethdev: add PHY affinity match item Jiawei Wang
2023-02-03 13:33   ` [PATCH v4 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
2023-02-03 13:33     ` [PATCH v4 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
2023-02-06 15:29       ` Jiawei(Jonny) Wang
2023-02-07  9:40       ` Ori Kam
2023-02-09 19:44       ` Ferruh Yigit
2023-02-10 14:06         ` Jiawei(Jonny) Wang
2023-02-14  9:38         ` Jiawei(Jonny) Wang
2023-02-14 10:01           ` Ferruh Yigit [this message]
2023-02-03 13:33     ` [PATCH v4 2/2] ethdev: add PHY affinity match item Jiawei Wang
2023-02-14 15:48   ` [PATCH v5 0/2] Added support for Tx queue mapping with an aggregated port Jiawei Wang
2023-02-14 15:48     ` [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports Jiawei Wang
2023-02-15 11:41       ` Jiawei(Jonny) Wang
2023-02-16 17:42       ` Thomas Monjalon
2023-02-17  6:45         ` Jiawei(Jonny) Wang
2023-02-16 17:58       ` Ferruh Yigit
2023-02-17  6:44         ` Jiawei(Jonny) Wang
2023-02-17  8:24         ` Andrew Rybchenko
2023-02-17  9:50           ` Jiawei(Jonny) Wang
2023-02-14 15:48     ` [PATCH v5 2/2] ethdev: add Aggregated affinity match item Jiawei Wang
2023-02-16 17:46       ` Thomas Monjalon
2023-02-17  6:45         ` Jiawei(Jonny) Wang
2023-02-17 10:50   ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Jiawei Wang
2023-02-17 10:50     ` [PATCH v6 1/2] ethdev: add " Jiawei Wang
2023-02-17 12:53       ` Ferruh Yigit
2023-02-17 12:56       ` Andrew Rybchenko
2023-02-17 12:59         ` Ferruh Yigit
2023-02-17 13:05           ` Jiawei(Jonny) Wang
2023-02-17 13:41         ` Jiawei(Jonny) Wang
2023-02-17 15:03           ` Andrew Rybchenko
2023-02-17 15:32             ` Ferruh Yigit
2023-02-17 10:50     ` [PATCH v6 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
2023-02-17 13:01     ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit
2023-02-17 13:07       ` Jiawei(Jonny) Wang
2023-02-17 15:47   ` [PATCH v7 " Jiawei Wang
2023-02-17 15:47     ` [PATCH v7 1/2] ethdev: add " Jiawei Wang
2023-02-17 15:47     ` [PATCH v7 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
2023-02-17 16:45     ` [PATCH v7 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit
