From: Ferruh Yigit <ferruh.yigit@amd.com>
To: "Jiawei(Jonny) Wang" <jiaweiw@nvidia.com>,
Slava Ovsiienko <viacheslavo@nvidia.com>,
Ori Kam <orika@nvidia.com>,
"NBU-Contact-Thomas Monjalon (EXTERNAL)" <thomas@monjalon.net>,
"andrew.rybchenko@oktetlabs.ru" <andrew.rybchenko@oktetlabs.ru>,
Aman Singh <aman.deep.singh@intel.com>,
Yuying Zhang <yuying.zhang@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, Raslan Darawsheh <rasland@nvidia.com>
Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API
Date: Tue, 14 Feb 2023 10:01:31 +0000 [thread overview]
Message-ID: <8bf62f42-c40b-536b-1946-f1158dbb31b0@amd.com> (raw)
In-Reply-To: <PH0PR12MB5451BED719E2C4C890EF96BEC6A29@PH0PR12MB5451.namprd12.prod.outlook.com>
On 2/14/2023 9:38 AM, Jiawei(Jonny) Wang wrote:
> Hi,
>
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Friday, February 10, 2023 3:45 AM
>> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
>> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
>> Yuying Zhang <yuying.zhang@intel.com>
>> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
>> Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue
>> API
>>
>> On 2/3/2023 1:33 PM, Jiawei Wang wrote:
>>> When multiple physical ports are connected to a single DPDK port,
>>> (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to
>>> know which physical port is used for Rx and Tx.
>>>
>>
>> I assume "kernel bonding" is out of context, but this patch concerns DPDK
>> bonding, failsafe or softnic. (I will refer them as virtual bonding
>> device.)
>>
>> To use specific queues of the virtual bonding device may interfere with the
>> logic of these devices, like bonding modes or RSS of the underlying devices. I
>> can see feature focuses on a very specific use case, but not sure if all possible
>> side effects taken into consideration.
>>
>>
>> And although the feature is only relavent to virtual bondiong device, core
>> ethdev structures are updated for this. Most use cases won't need these, so is
>> there a way to reduce the scope of the changes to virtual bonding devices?
>>
>>
>> There are a few very core ethdev APIs, like:
>> rte_eth_dev_configure()
>> rte_eth_tx_queue_setup()
>> rte_eth_rx_queue_setup()
>> rte_eth_dev_start()
>> rte_eth_dev_info_get()
>>
>> Almost every user of ehtdev uses these APIs, since these are so fundemental I
>> am for being a little more conservative on these APIs.
>>
>> Every eccentric features are targetting these APIs first because they are
>> common and extending them gives an easy solution, but in long run making
>> these APIs more complex, harder to maintain and harder for PMDs to support
>> them correctly. So I am for not updating them unless it is a generic use case.
>>
>>
>> Also as we talked about PMDs supporting them, I assume your coming PMD
>> patch will be implementing 'tx_phy_affinity' config option only for mlx drivers.
>> What will happen for other NICs? Will they silently ignore the config option
>> from user? So this is a problem for the DPDK application portabiltiy.
>>
>>
>>
>> As far as I understand target is application controlling which sub-device is used
>> under the virtual bonding device, can you pleaes give more information why
>> this is required, perhaps it can help to provide a better/different solution.
>> Like adding the ability to use both bonding device and sub-device for data path,
>> this way application can use whichever it wants. (this is just first solution I
>> come with, I am not suggesting as replacement solution, but if you can describe
>> the problem more I am sure other people can come with better solutions.)
>>
>> And isn't this against the applicatio transparent to underneath device being
>> bonding device or actual device?
>>
>>
>
> OK, I will send the new version with separate functions in ethdev layer,
> to support the Map a Tx queue to port and get the number of ports.
> And these functions work with device ops callback, other NICs will reported
> The unsupported the ops callback is NULL.
>
OK, thanks Jonny, at least this separates the fetaure to its own APIs
which reduces the impact for applications and drivers that are not using
this feature.
>>> This patch maps a DPDK Tx queue with a physical port, by adding
>>> tx_phy_affinity setting in Tx queue.
>>> The affinity number is the physical port ID where packets will be
>>> sent.
>>> Value 0 means no affinity and traffic could be routed to any connected
>>> physical ports, this is the default current behavior.
>>>
>>> The number of physical ports is reported with rte_eth_dev_info_get().
>>>
>>> The new tx_phy_affinity field is added into the padding hole of
>>> rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
>>> An ABI check rule needs to be added to avoid false warning.
>>>
>>> Add the testpmd command line:
>>> testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
>>>
>>> For example, there're two physical ports connected to a single DPDK
>>> port (port id 0), and phy_affinity 1 stood for the first physical port
>>> and phy_affinity 2 stood for the second physical port.
>>> Use the below commands to config tx phy affinity for per Tx Queue:
>>> port config 0 txq 0 phy_affinity 1
>>> port config 0 txq 1 phy_affinity 1
>>> port config 0 txq 2 phy_affinity 2
>>> port config 0 txq 3 phy_affinity 2
>>>
>>> These commands config the Tx Queue index 0 and Tx Queue index 1 with
>>> phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these
>>> packets will be sent from the first physical port, and similar with
>>> the second physical port if sending packets with Tx Queue 2 or Tx
>>> Queue 3.
>>>
>>> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
>>> ---
>>> app/test-pmd/cmdline.c | 100 ++++++++++++++++++++
>>> app/test-pmd/config.c | 1 +
>>> devtools/libabigail.abignore | 5 +
>>> doc/guides/rel_notes/release_23_03.rst | 4 +
>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 13 +++
>>> lib/ethdev/rte_ethdev.h | 10 ++
>>> 6 files changed, 133 insertions(+)
>>>
>>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
>>> cb8c174020..f771fcf8ac 100644
>>> --- a/app/test-pmd/cmdline.c
>>> +++ b/app/test-pmd/cmdline.c
>>> @@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void
>>> *parsed_result,
>>>
>>> "port cleanup (port_id) txq (queue_id) (free_cnt)\n"
>>> " Cleanup txq mbufs for a specific Tx queue\n\n"
>>> +
>>> + "port config (port_id) txq (queue_id) phy_affinity
>> (value)\n"
>>> + " Set the physical affinity value "
>>> + "on a specific Tx queue\n\n"
>>> );
>>> }
>>>
>>> @@ -12633,6 +12637,101 @@ static cmdline_parse_inst_t
>> cmd_show_port_flow_transfer_proxy = {
>>> }
>>> };
>>>
>>> +/* *** configure port txq phy_affinity value *** */ struct
>>> +cmd_config_tx_phy_affinity {
>>> + cmdline_fixed_string_t port;
>>> + cmdline_fixed_string_t config;
>>> + portid_t portid;
>>> + cmdline_fixed_string_t txq;
>>> + uint16_t qid;
>>> + cmdline_fixed_string_t phy_affinity;
>>> + uint8_t value;
>>> +};
>>> +
>>> +static void
>>> +cmd_config_tx_phy_affinity_parsed(void *parsed_result,
>>> + __rte_unused struct cmdline *cl,
>>> + __rte_unused void *data)
>>> +{
>>> + struct cmd_config_tx_phy_affinity *res = parsed_result;
>>> + struct rte_eth_dev_info dev_info;
>>> + struct rte_port *port;
>>> + int ret;
>>> +
>>> + if (port_id_is_invalid(res->portid, ENABLED_WARN))
>>> + return;
>>> +
>>> + if (res->portid == (portid_t)RTE_PORT_ALL) {
>>> + printf("Invalid port id\n");
>>> + return;
>>> + }
>>> +
>>> + port = &ports[res->portid];
>>> +
>>> + if (strcmp(res->txq, "txq")) {
>>> + printf("Unknown parameter\n");
>>> + return;
>>> + }
>>> + if (tx_queue_id_is_invalid(res->qid))
>>> + return;
>>> +
>>> + ret = eth_dev_info_get_print_err(res->portid, &dev_info);
>>> + if (ret != 0)
>>> + return;
>>> +
>>> + if (dev_info.nb_phy_ports == 0) {
>>> + printf("Number of physical ports is 0 which is invalid for PHY
>> Affinity\n");
>>> + return;
>>> + }
>>> + printf("The number of physical ports is %u\n", dev_info.nb_phy_ports);
>>> + if (dev_info.nb_phy_ports < res->value) {
>>> + printf("The PHY affinity value %u is Invalid, exceeds the "
>>> + "number of physical ports\n", res->value);
>>> + return;
>>> + }
>>> + port->txq[res->qid].conf.tx_phy_affinity = res->value;
>>> +
>>> + cmd_reconfig_device_queue(res->portid, 0, 1); }
>>> +
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port =
>>> + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> + port, "port");
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config =
>>> + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> + config, "config");
>>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid =
>>> + TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> + portid, RTE_UINT16);
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq =
>>> + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> + txq, "txq");
>>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid =
>>> + TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> + qid, RTE_UINT16);
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport =
>>> + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> + phy_affinity, "phy_affinity");
>>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value =
>>> + TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> + value, RTE_UINT8);
>>> +
>>> +static cmdline_parse_inst_t cmd_config_tx_phy_affinity = {
>>> + .f = cmd_config_tx_phy_affinity_parsed,
>>> + .data = (void *)0,
>>> + .help_str = "port config <port_id> txq <queue_id> phy_affinity <value>",
>>> + .tokens = {
>>> + (void *)&cmd_config_tx_phy_affinity_port,
>>> + (void *)&cmd_config_tx_phy_affinity_config,
>>> + (void *)&cmd_config_tx_phy_affinity_portid,
>>> + (void *)&cmd_config_tx_phy_affinity_txq,
>>> + (void *)&cmd_config_tx_phy_affinity_qid,
>>> + (void *)&cmd_config_tx_phy_affinity_hwport,
>>> + (void *)&cmd_config_tx_phy_affinity_value,
>>> + NULL,
>>> + },
>>> +};
>>> +
>>> /*
>>>
>> ****************************************************************
>> ******
>>> ********** */
>>>
>>> /* list of instructions */
>>> @@ -12866,6 +12965,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
>>> (cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
>>> (cmdline_parse_inst_t *)&cmd_show_port_cman_config,
>>> (cmdline_parse_inst_t *)&cmd_set_port_cman_config,
>>> + (cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity,
>>> NULL,
>>> };
>>>
>>> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index
>>> acccb6b035..b83fb17cfa 100644
>>> --- a/app/test-pmd/config.c
>>> +++ b/app/test-pmd/config.c
>>> @@ -936,6 +936,7 @@ port_infos_display(portid_t port_id)
>>> printf("unknown\n");
>>> break;
>>> }
>>> + printf("Current number of physical ports: %u\n",
>>> +dev_info.nb_phy_ports);
>>> }
>>>
>>> void
>>> diff --git a/devtools/libabigail.abignore
>>> b/devtools/libabigail.abignore index 7a93de3ba1..ac7d3fb2da 100644
>>> --- a/devtools/libabigail.abignore
>>> +++ b/devtools/libabigail.abignore
>>> @@ -34,3 +34,8 @@
>>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>> ; Temporary exceptions till next major ABI version ;
>>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>> +
>>> +; Ignore fields inserted in padding hole of rte_eth_txconf
>>> +[suppress_type]
>>> + name = rte_eth_txconf
>>> + has_data_member_inserted_between =
>>> +{offset_of(tx_deferred_start), offset_of(offloads)}
>>> diff --git a/doc/guides/rel_notes/release_23_03.rst
>>> b/doc/guides/rel_notes/release_23_03.rst
>>> index 73f5d94e14..e99bd2dcb6 100644
>>> --- a/doc/guides/rel_notes/release_23_03.rst
>>> +++ b/doc/guides/rel_notes/release_23_03.rst
>>> @@ -55,6 +55,10 @@ New Features
>>> Also, make sure to start the actual text at the margin.
>>> =======================================================
>>>
>>> +* **Added affinity for multiple physical ports connected to a single
>>> +DPDK port.**
>>> +
>>> + * Added Tx affinity in queue setup to map a physical port.
>>> +
>>> * **Updated AMD axgbe driver.**
>>>
>>> * Added multi-process support.
>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> index 79a1fa9cb7..5c716f7679 100644
>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> @@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only
>> on a specific Tx queue::
>>>
>>> This command should be run when the port is stopped, or else it will fail.
>>>
>>> +config per queue Tx physical affinity
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Configure a per queue physical affinity value only on a specific Tx queue::
>>> +
>>> + testpmd> port (port_id) txq (queue_id) phy_affinity (value)
>>> +
>>> +* ``phy_affinity``: physical port to use for sending,
>>> + when multiple physical ports are connected to
>>> + a single DPDK port.
>>> +
>>> +This command should be run when the port is stopped, otherwise it fails.
>>> +
>>> Config VXLAN Encap outer layers
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
>>> c129ca1eaf..2fd971b7b5 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -1138,6 +1138,14 @@ struct rte_eth_txconf {
>>> less free descriptors than this value. */
>>>
>>> uint8_t tx_deferred_start; /**< Do not start queue with
>>> rte_eth_dev_start(). */
>>> + /**
>>> + * Affinity with one of the multiple physical ports connected to the
>> DPDK port.
>>> + * Value 0 means no affinity and traffic could be routed to any
>> connected
>>> + * physical port.
>>> + * The first physical port is number 1 and so on.
>>> + * Number of physical ports is reported by nb_phy_ports in
>> rte_eth_dev_info.
>>> + */
>>> + uint8_t tx_phy_affinity;
>>> /**
>>> * Per-queue Tx offloads to be set using RTE_ETH_TX_OFFLOAD_*
>> flags.
>>> * Only offloads set on tx_queue_offload_capa or tx_offload_capa @@
>>> -1744,6 +1752,8 @@ struct rte_eth_dev_info {
>>> /** Device redirection table size, the total number of entries. */
>>> uint16_t reta_size;
>>> uint8_t hash_key_size; /**< Hash key size in bytes */
>>> + /** Number of physical ports connected with DPDK port. */
>>> + uint8_t nb_phy_ports;
>>> /** Bit mask of RSS offloads, the bit offset also means flow type */
>>> uint64_t flow_type_rss_offloads;
>>> struct rte_eth_rxconf default_rxconf; /**< Default Rx configuration
>>> */
>
next prev parent reply other threads:[~2023-02-14 10:01 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20221221102934.13822-1-jiaweiw@nvidia.com/>
2023-02-03 5:07 ` [PATCH v3 0/2] add new PHY affinity in the flow item and " Jiawei Wang
2023-02-03 5:07 ` [PATCH v3 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
2023-02-03 5:07 ` [PATCH v3 2/2] ethdev: add PHY affinity match item Jiawei Wang
2023-02-03 13:33 ` [PATCH v4 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
2023-02-03 13:33 ` [PATCH v4 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
2023-02-06 15:29 ` Jiawei(Jonny) Wang
2023-02-07 9:40 ` Ori Kam
2023-02-09 19:44 ` Ferruh Yigit
2023-02-10 14:06 ` Jiawei(Jonny) Wang
2023-02-14 9:38 ` Jiawei(Jonny) Wang
2023-02-14 10:01 ` Ferruh Yigit [this message]
2023-02-03 13:33 ` [PATCH v4 2/2] ethdev: add PHY affinity match item Jiawei Wang
2023-02-14 15:48 ` [PATCH v5 0/2] Added support for Tx queue mapping with an aggregated port Jiawei Wang
2023-02-14 15:48 ` [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports Jiawei Wang
2023-02-15 11:41 ` Jiawei(Jonny) Wang
2023-02-16 17:42 ` Thomas Monjalon
2023-02-17 6:45 ` Jiawei(Jonny) Wang
2023-02-16 17:58 ` Ferruh Yigit
2023-02-17 6:44 ` Jiawei(Jonny) Wang
2023-02-17 8:24 ` Andrew Rybchenko
2023-02-17 9:50 ` Jiawei(Jonny) Wang
2023-02-14 15:48 ` [PATCH v5 2/2] ethdev: add Aggregated affinity match item Jiawei Wang
2023-02-16 17:46 ` Thomas Monjalon
2023-02-17 6:45 ` Jiawei(Jonny) Wang
2023-02-17 10:50 ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Jiawei Wang
2023-02-17 10:50 ` [PATCH v6 1/2] ethdev: add " Jiawei Wang
2023-02-17 12:53 ` Ferruh Yigit
2023-02-17 12:56 ` Andrew Rybchenko
2023-02-17 12:59 ` Ferruh Yigit
2023-02-17 13:05 ` Jiawei(Jonny) Wang
2023-02-17 13:41 ` Jiawei(Jonny) Wang
2023-02-17 15:03 ` Andrew Rybchenko
2023-02-17 15:32 ` Ferruh Yigit
2023-02-17 10:50 ` [PATCH v6 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
2023-02-17 13:01 ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit
2023-02-17 13:07 ` Jiawei(Jonny) Wang
2023-02-17 15:47 ` [PATCH v7 " Jiawei Wang
2023-02-17 15:47 ` [PATCH v7 1/2] ethdev: add " Jiawei Wang
2023-02-17 15:47 ` [PATCH v7 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
2023-02-17 16:45 ` [PATCH v7 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8bf62f42-c40b-536b-1946-f1158dbb31b0@amd.com \
--to=ferruh.yigit@amd.com \
--cc=aman.deep.singh@intel.com \
--cc=andrew.rybchenko@oktetlabs.ru \
--cc=dev@dpdk.org \
--cc=jiaweiw@nvidia.com \
--cc=orika@nvidia.com \
--cc=rasland@nvidia.com \
--cc=thomas@monjalon.net \
--cc=viacheslavo@nvidia.com \
--cc=yuying.zhang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).