From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 30D05A034F;
	Mon,  7 Feb 2022 08:24:18 +0100 (CET)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 0499A40685;
	Mon,  7 Feb 2022 08:24:18 +0100 (CET)
Received: from mail-io1-f54.google.com (mail-io1-f54.google.com
 [209.85.166.54]) by mails.dpdk.org (Postfix) with ESMTP id 862564067C
 for <dev@dpdk.org>; Mon,  7 Feb 2022 08:24:16 +0100 (CET)
Received: by mail-io1-f54.google.com with SMTP id d188so15582159iof.7
 for <dev@dpdk.org>; Sun, 06 Feb 2022 23:24:16 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112;
 h=mime-version:references:in-reply-to:from:date:message-id:subject:to
 :cc; bh=e5olDoLMjagucjl01w8HSYxGabJTlHUPcaySzi6WK88=;
 b=bqoMJf3xdqKW+I8VY7PojaSgcumThNS6RC4rfyGUAMuRD7aydISAo7a2vr78blN8o9
 w9xjntP0uCCiyTzHWB7e+kZQUQT1/R4dgZf2tRXg8NJ4p9OMemUmwxIdykSW9HxIWxx5
 SA330EvOFLhnmpVUg7c7uwBQDxy21zgHCh7D1XMr6I+y+XMYF42dg2r10KX3ZLghU+wB
 ANI3LwasZU+ZNxWKUni25VOd9UA9eAEvGsp0xgiYPllfvI4n7NrYZW+PHnKwDNnG7rNp
 9spBtrdJib2gmYachDFUzEq9tO8h6QIqmZaTqr2MP4dx8XiL2J0cIbveclmKo3xoqza4
 x32Q==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=e5olDoLMjagucjl01w8HSYxGabJTlHUPcaySzi6WK88=;
 b=Y9Y6cFGsSAdVWhLhYOo9rSeCLz9ueUGoI4xJZmvyoM1nVbrcii4Zd8tCmKh9Z4nQ5k
 +BLr3rpDWLjjEGvGUG8HymQMBV9uyFmqT1eQuaVneEyRJNfwzVhOii+o+JhWlrW1x2tQ
 5j/gIeOtHeWDQcj2VjDIuNaAOq51ve8HE7tT+lun44N4EHPaDylS5F5fG3Qb0Mm/d2We
 aP6+4VJGnDFAUvr8rhs307hc499LFAtWZ26V6BoD2O99+fqHgQJagej88UvTpDOwf4oB
 ypxSPA/6iMOj0xXtgt9te2hBlR1ZUlBEWoy4K0hHX2Yn37YIkKGj+3kKpnczVSZ/S5K3
 r5KQ==
X-Gm-Message-State: AOAM530o0Bpn6y9keuDyNLu51u1TuN8G99ODpVv6a8l75RpFyh4USOBp
 Nvn3OzZBUnfUHMZU/JBAsGWKYdqKdYvu0ltcuOU=
X-Google-Smtp-Source: ABdhPJyTVGmxLUMosedIoHeDO2ZT7kqT8sa5e5z3/TKzz+8ezvF8YrCOB0SWMBGjCg+eIHotkEAIQGIKg+1dFCb5QoQ=
X-Received: by 2002:a05:6602:1512:: with SMTP id
 g18mr5260426iow.121.1644218655625; 
 Sun, 06 Feb 2022 23:24:15 -0800 (PST)
MIME-Version: 1.0
References: <20220113102718.3167282-1-jerinj@marvell.com>
 <20220131180859.2662034-1-jerinj@marvell.com>
 <f113fd33-8049-8aec-d345-ac834ea1a5ac@intel.com>
In-Reply-To: <f113fd33-8049-8aec-d345-ac834ea1a5ac@intel.com>
From: Jerin Jacob <jerinjacobk@gmail.com>
Date: Mon, 7 Feb 2022 12:53:48 +0530
Message-ID: <CALBAE1M9bkjX6nC9y5t=rUnWbUts_Sj3VAJ5q2r57XSOx2ZQnQ@mail.gmail.com>
Subject: Re: [dpdk-dev] [PATCH v3 1/2] ethdev: support queue-based priority
 flow control
To: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: Jerin Jacob <jerinj@marvell.com>, dpdk-dev <dev@dpdk.org>, 
 Thomas Monjalon <thomas@monjalon.net>,
 Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>, 
 Ray Kinsella <mdr@ashroe.eu>, Ajit Khaparde <ajit.khaparde@broadcom.com>, 
 Andrew Boyer <aboyer@pensando.io>, Beilei Xing <beilei.xing@intel.com>, 
 "Richardson, Bruce" <bruce.richardson@intel.com>, Chas Williams <chas3@att.com>,
 "Xia, Chenbo" <chenbo.xia@intel.com>, Ciara Loftus <ciara.loftus@intel.com>, 
 Devendra Singh Rawat <dsinghrawat@marvell.com>,
 Ed Czeck <ed.czeck@atomicrules.com>, 
 Evgeny Schemeilin <evgenys@amazon.com>, Gaetan Rivet <grive@u256.net>,
 Gagandeep Singh <g.singh@nxp.com>, 
 Guoyang Zhou <zhouguoyang@huawei.com>, Haiyue Wang <haiyue.wang@intel.com>, 
 Harman Kalra <hkalra@marvell.com>, heinrich.kuhn@corigine.com, 
 Hemant Agrawal <hemant.agrawal@nxp.com>, Hyong Youb Kim <hyonkim@cisco.com>, 
 Igor Chauskin <igorch@amazon.com>, Igor Russkikh <irusskikh@marvell.com>, 
 Jakub Grajciar <jgrajcia@cisco.com>,
 Jasvinder Singh <jasvinder.singh@intel.com>, 
 Jian Wang <jianwang@trustnetic.com>, Jiawen Wu <jiawenwu@trustnetic.com>, 
 Jingjing Wu <jingjing.wu@intel.com>, John Daley <johndale@cisco.com>, 
 John Miller <john.miller@atomicrules.com>,
 "John W. Linville" <linville@tuxdriver.com>, 
 "Wiles, Keith" <keith.wiles@intel.com>, Kiran Kumar K <kirankumark@marvell.com>,
 Lijun Ou <oulijun@huawei.com>, Liron Himi <lironh@marvell.com>,
 Long Li <longli@microsoft.com>, 
 Marcin Wojtas <mw@semihalf.com>, Martin Spinler <spinler@cesnet.cz>,
 Matan Azrad <matan@nvidia.com>, Matt Peters <matt.peters@windriver.com>,
 Maxime Coquelin <maxime.coquelin@redhat.com>, 
 Michal Krawczyk <mk@semihalf.com>, "Min Hu (Connor" <humin29@huawei.com>, 
 Pradeep Kumar Nalla <pnalla@marvell.com>,
 Nithin Dabilpuram <ndabilpuram@marvell.com>, 
 Qiming Yang <qiming.yang@intel.com>, Qi Zhang <qi.z.zhang@intel.com>, 
 Radha Mohan Chintakuntla <radhac@marvell.com>,
 Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>, 
 Rasesh Mody <rmody@marvell.com>, Rosen Xu <rosen.xu@intel.com>, 
 Sachin Saxena <sachin.saxena@oss.nxp.com>, 
 Satha Koteswara Rao Kottidi <skoteshwar@marvell.com>,
 Shahed Shaikh <shshaikh@marvell.com>, Shai Brandes <shaibran@amazon.com>,
 Shepard Siegel <shepard.siegel@atomicrules.com>, 
 Somalapuram Amaranath <asomalap@amd.com>,
 Somnath Kotur <somnath.kotur@broadcom.com>, 
 Stephen Hemminger <sthemmin@microsoft.com>,
 Steven Webster <steven.webster@windriver.com>, 
 Sunil Kumar Kori <skori@marvell.com>, Tetsuya Mukawa <mtetsuyah@gmail.com>, 
 Veerasenareddy Burru <vburru@marvell.com>,
 Viacheslav Ovsiienko <viacheslavo@nvidia.com>, 
 Xiao Wang <xiao.w.wang@intel.com>, Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>,
 Yisen Zhuang <yisen.zhuang@huawei.com>, Yong Wang <yongwang@vmware.com>, 
 Ziyang Xuan <xuanziyang2@huawei.com>
Content-Type: text/plain; charset="UTF-8"
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org



On Thu, Feb 3, 2022 at 9:31 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>
> On 1/31/2022 6:08 PM, jerinj@marvell.com wrote:
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Based on device support and use-case need, there are two different ways
> > to enable PFC. The first case is the port level PFC configuration, in
> > this case, rte_eth_dev_priority_flow_ctrl_set() API shall be used to
> > configure the PFC, and PFC frames will be generated based on the VLAN
> > TC value.
> >
> > The second case is the queue level PFC configuration. In this
> > case, any packet field content can be used to steer the packet to the
> > specific queue using rte_flow or RSS and then use
> > rte_eth_dev_priority_flow_ctrl_queue_configure() to configure the
> > TC mapping on each queue.
> > Based on congestion selected on the specific queue, configured TC
> > shall be used to generate PFC frames.
> >
>
> Hi Jerin, Sunil,
>
> Please find below minor comments, mostly syntax issues.
>
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
> > ---
> >
> > v2..v1:
> > - Introduce rte_eth_dev_priority_flow_ctrl_queue_info_get() to
> > avoid updates to rte_eth_dev_info
> > - Removed devtools/libabigail.abignore changes
> > - Address the comment from Ferruh in
> > http://patches.dpdk.org/project/dpdk/patch/20220113102718.3167282-1-jerinj@marvell.com/
> >
> >   doc/guides/nics/features.rst           |   7 +-
> >   doc/guides/rel_notes/release_22_03.rst |   6 ++
> >   lib/ethdev/ethdev_driver.h             |  12 ++-
> >   lib/ethdev/rte_ethdev.c                | 132 +++++++++++++++++++++++++
> >   lib/ethdev/rte_ethdev.h                |  89 +++++++++++++++++
> >   lib/ethdev/version.map                 |   4 +
> >   6 files changed, 247 insertions(+), 3 deletions(-)
> >
> > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> > index 27be2d2576..1cacdc883a 100644
> > --- a/doc/guides/nics/features.rst
> > +++ b/doc/guides/nics/features.rst
> > @@ -379,9 +379,12 @@ Flow control
> >   Supports configuring link flow control.
> >
> >   * **[implements] eth_dev_ops**: ``flow_ctrl_get``, ``flow_ctrl_set``,
> > -  ``priority_flow_ctrl_set``.
> > +  ``priority_flow_ctrl_set``, ``priority_flow_ctrl_queue_info_get``,
> > +  ``priority_flow_ctrl_queue_configure``
> >   * **[related]    API**: ``rte_eth_dev_flow_ctrl_get()``, ``rte_eth_dev_flow_ctrl_set()``,
> > -  ``rte_eth_dev_priority_flow_ctrl_set()``.
> > +  ``rte_eth_dev_priority_flow_ctrl_set()``,
> > +  ``rte_eth_dev_priority_flow_ctrl_queue_info_get()``,
> > +  ``rte_eth_dev_priority_flow_ctrl_queue_configure()``.
> >
> >
> >   .. _nic_features_rate_limitation:
> > diff --git a/doc/guides/rel_notes/release_22_03.rst b/doc/guides/rel_notes/release_22_03.rst
> > index 3bc0630c7c..e988c104e8 100644
> > --- a/doc/guides/rel_notes/release_22_03.rst
> > +++ b/doc/guides/rel_notes/release_22_03.rst
> > @@ -69,6 +69,12 @@ New Features
> >
> >     The new API ``rte_event_eth_rx_adapter_event_port_get()`` was added.
> >
> > +* **Added an API to enable queue based priority flow ctrl(PFC).**
> > +
> > +  New APIs, ``rte_eth_dev_priority_flow_ctrl_queue_info_get()`` and
> > +  ``rte_eth_dev_priority_flow_ctrl_queue_configure()``, was added.
> > +
> > +
> >
>
>
> Can you please move this update before the ethdev driver updates?
> Also, there is no need for double empty lines.


Ack


>
> >   Removed Items
> >   -------------
> > diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> > index d95605a355..320a364766 100644
> > --- a/lib/ethdev/ethdev_driver.h
> > +++ b/lib/ethdev/ethdev_driver.h
> > @@ -533,6 +533,13 @@ typedef int (*flow_ctrl_set_t)(struct rte_eth_dev *dev,
> >   typedef int (*priority_flow_ctrl_set_t)(struct rte_eth_dev *dev,
> >                               struct rte_eth_pfc_conf *pfc_conf);
> >
> > +/** @internal Get info for queue based PFC on an Ethernet device. */
> > +typedef int (*priority_flow_ctrl_queue_info_get_t)(
> > +     struct rte_eth_dev *dev, struct rte_eth_pfc_queue_info *pfc_queue_info);
> > +/** @internal Configure queue based PFC parameter on an Ethernet device. */
> > +typedef int (*priority_flow_ctrl_queue_config_t)(
> > +     struct rte_eth_dev *dev, struct rte_eth_pfc_queue_conf *pfc_queue_conf);
> > +
>
> Instead of ending the line with an opening parenthesis '(', can you break the line after
> the first argument, like:
>
> typedef int (*priority_flow_ctrl_queue_config_t)(struct rte_eth_dev *dev,
>                                 struct rte_eth_pfc_queue_conf *pfc_queue_conf);


Ack

>
> Same for all instances.
>
> >   /** @internal Update RSS redirection table on an Ethernet device. */
> >   typedef int (*reta_update_t)(struct rte_eth_dev *dev,
> >                            struct rte_eth_rss_reta_entry64 *reta_conf,
> > @@ -1080,7 +1087,10 @@ struct eth_dev_ops {
> >       flow_ctrl_set_t            flow_ctrl_set; /**< Setup flow control */
> >       /** Setup priority flow control */
> >       priority_flow_ctrl_set_t   priority_flow_ctrl_set;
> > -
> > +     /** Priority flow control queue info get */
> > +     priority_flow_ctrl_queue_info_get_t priority_flow_ctrl_queue_info_get;
> > +     /** Priority flow control queue configure */
> > +     priority_flow_ctrl_queue_config_t priority_flow_ctrl_queue_config;
>
> Can you please keep empty line before next (hash) group?


Ack

>
> >       /** Set Unicast Table Array */
> >       eth_uc_hash_table_set_t    uc_hash_table_set;
> >       /** Set Unicast hash bitmap */
> > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> > index a1d475a292..2ce38cd2c5 100644
> > --- a/lib/ethdev/rte_ethdev.c
> > +++ b/lib/ethdev/rte_ethdev.c
> > @@ -4022,6 +4022,138 @@ rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
> >       return -ENOTSUP;
> >   }
> >
> > +static inline int
>
> Not sure there is value in asking for the function to be 'inline'; this is in the control
> path, so 'static' alone should be enough.

Ack

>
> > +validate_rx_pause_config(struct rte_eth_dev_info *dev_info, uint8_t tc_max,
> > +                      struct rte_eth_pfc_queue_conf *pfc_queue_conf)
> > +{
> > +     if ((pfc_queue_conf->mode == RTE_ETH_FC_RX_PAUSE) ||
> > +         (pfc_queue_conf->mode == RTE_ETH_FC_FULL)) {
>
> We don't align to parentheses in the DPDK coding convention [1]; it should be:
>
> if ((pfc_queue_conf->mode == RTE_ETH_FC_RX_PAUSE) ||
>                 (pfc_queue_conf->mode == RTE_ETH_FC_FULL)) {
>         if (pfc_queue_conf->rx_pause.tx_qid >= dev_info->nb_tx_queues) {
>                 ...
>          }
> }

Ack

>
>
> [1]
> Although I am aware many instances have sneaked in, I still think it is better to follow
> the convention.
>
> > +             if (pfc_queue_conf->rx_pause.tx_qid >= dev_info->nb_tx_queues) {
> > +                     RTE_ETHDEV_LOG(ERR, "Tx queue not in range for Rx pause"
> > +                                    " (requested: %d configured: %d)\n",
> > +                                    pfc_queue_conf->rx_pause.tx_qid,
> > +                                    dev_info->nb_tx_queues);
>
>
> Should log mention that this is related to the "priority flow Rx queue control"?

Prepended "PFC" to the log message.


>
> > +                     return -EINVAL;
> > +             }
> > +
> > +             if (pfc_queue_conf->rx_pause.tc >= tc_max) {
>
> Should we document somewhere that 'tc_max' value itself is an invalid value?

Updated the documentation to the following:

        struct {
                uint16_t tx_qid; /**< Tx queue ID */
-               uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */
+               uint8_t tc;
+               /**< Traffic class as per PFC (802.1Qbb) spec. The value must be
+                * in the range [0, rte_eth_pfc_queue_info::tc_max - 1]
+                */


>
> > +                     RTE_ETHDEV_LOG(ERR, "TC not in range for Rx pause"
> > +                                    " (requested: %d max: %d)\n",
> > +                                    pfc_queue_conf->rx_pause.tc, tc_max);
> > +                     return -EINVAL;
> > +             }
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static inline int
> > +validate_tx_pause_config(struct rte_eth_dev_info *dev_info, uint8_t tc_max,
> > +                      struct rte_eth_pfc_queue_conf *pfc_queue_conf)
> > +{
> > +     if ((pfc_queue_conf->mode == RTE_ETH_FC_TX_PAUSE) ||
> > +          (pfc_queue_conf->mode == RTE_ETH_FC_FULL)) {
> > +             if (pfc_queue_conf->tx_pause.rx_qid >= dev_info->nb_rx_queues) {
> > +                     RTE_ETHDEV_LOG(ERR, "Rx queue not in range for Tx pause"
> > +                                    "(requested: %d configured: %d)\n",
> > +                                    pfc_queue_conf->tx_pause.rx_qid,
> > +                                    dev_info->nb_rx_queues);
> > +                     return -EINVAL;
> > +             }
> > +
> > +             if (pfc_queue_conf->tx_pause.tc >= tc_max) {
> > +                     RTE_ETHDEV_LOG(ERR, "TC not in range for Tx pause"
> > +                                    "(requested: %d max: %d)\n",
> > +                                    pfc_queue_conf->tx_pause.tc, tc_max);
> > +                     return -EINVAL;
> > +             }
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +int
> > +rte_eth_dev_priority_flow_ctrl_queue_info_get(
> > +     uint16_t port_id, struct rte_eth_pfc_queue_info *pfc_queue_info)
> > +{
> > +     struct rte_eth_dev *dev;
> > +
> > +     RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +     dev = &rte_eth_devices[port_id];
> > +
> > +     if (pfc_queue_info == NULL) {
> > +             RTE_ETHDEV_LOG(ERR, "PFC info param is NULL for port (%u)\n",
> > +                            port_id);
> > +             return -EINVAL;
> > +     }
> > +
> > +     if (*dev->dev_ops->priority_flow_ctrl_queue_info_get)
> > +             return eth_err(port_id,
> > +                            (*dev->dev_ops->priority_flow_ctrl_queue_info_get)(
> > +                                    dev, pfc_queue_info));
> > +     return -ENOTSUP;
> > +}
> > +
> > +int
> > +rte_eth_dev_priority_flow_ctrl_queue_configure(
> > +     uint16_t port_id, struct rte_eth_pfc_queue_conf *pfc_queue_conf)
> > +{
> > +     struct rte_eth_pfc_queue_info pfc_info;
> > +     struct rte_eth_dev_info dev_info;
> > +     struct rte_eth_dev *dev;
> > +     int ret;
> > +
> > +     RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +     dev = &rte_eth_devices[port_id];
> > +
> > +     if (pfc_queue_conf == NULL) {
> > +             RTE_ETHDEV_LOG(ERR, "PFC parameters are NULL for port (%u)\n",
> > +                            port_id);
> > +             return -EINVAL;
> > +     }
> > +
> > +     ret = rte_eth_dev_info_get(port_id, &dev_info);
> > +     if (ret != 0)
> > +             return ret;
> > +
> > +     ret = rte_eth_dev_priority_flow_ctrl_queue_info_get(port_id, &pfc_info);
> > +     if (ret != 0)
> > +             return ret;
> > +
> > +     if (pfc_info.capa == 0) {
> > +             RTE_ETHDEV_LOG(ERR, "Ethdev port %u does not support PFC\n",
> > +                     port_id);
> > +             return -ENOTSUP;
> > +     }
> > +
> > +     if (pfc_info.tc_max == 0) {
> > +             RTE_ETHDEV_LOG(ERR,
> > +                     "Ethdev port %u does not support PFC TC values\n",
> > +                     port_id);
> > +             return -ENOTSUP;
> > +     }
> > +
> > +     if (pfc_info.capa & RTE_ETH_PFC_QUEUE_CAPA_RX_PAUSE) {
> > +             ret = validate_rx_pause_config(&dev_info, pfc_info.tc_max,
> > +                                            pfc_queue_conf);
>
> There are capability flags RTE_ETH_PFC_QUEUE_CAPA_RX_PAUSE and RTE_ETH_PFC_QUEUE_CAPA_TX_PAUSE;
> there are also config flags RTE_ETH_FC_RX_PAUSE, RTE_ETH_FC_TX_PAUSE and RTE_ETH_FC_FULL

Removed that and reused rte_eth_fc_mode

-/** Device supports Rx pause for queue based PFC. */
-#define RTE_ETH_PFC_QUEUE_CAPA_RX_PAUSE RTE_BIT64(0)
-/** Device supports Tx pause for queue based PFC. */
-#define RTE_ETH_PFC_QUEUE_CAPA_TX_PAUSE RTE_BIT64(1)
-
 /**
  * @warning
  * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
@@ -1424,8 +1419,8 @@ struct rte_eth_pfc_queue_info {
         * Maximum supported traffic class as per PFC (802.1Qbb) specification.
         */
        uint8_t tc_max;
-       /** PFC queue capabilities (RTE_ETH_PFC_QUEUE_CAPA_). */
-       uint64_t capa;
+       /** PFC queue mode capabilities. */
+       enum rte_eth_fc_mode mode_capa;
 };



>
>
> What should happen if the driver only supports RX_PAUSE but the app config requests only
> TX_PAUSE?
> As far as I can see, with the current code it passes validation, but should it?

Yes. Good catch. Added

        /* Check whether the requested mode is supported */
        if (pfc_info.mode_capa == RTE_ETH_FC_RX_PAUSE &&
                        pfc_queue_conf->mode == RTE_ETH_FC_TX_PAUSE) {
                RTE_ETHDEV_LOG(ERR, "PFC Tx pause unsupported for port (%d)\n",
                               port_id);
                return -EINVAL;
        }

        if (pfc_info.mode_capa == RTE_ETH_FC_TX_PAUSE &&
                        pfc_queue_conf->mode == RTE_ETH_FC_RX_PAUSE) {
                RTE_ETHDEV_LOG(ERR, "PFC Rx pause unsupported for port (%d)\n",
                               port_id);
                return -EINVAL;
        }
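
A minimal standalone sketch of the same idea, generalized a little. It is
not the patch code: demo_fc_mode is a local stand-in for enum
rte_eth_fc_mode, and demo_pfc_mode_supported() is a hypothetical helper.
It assumes that a device advertising FULL accepts either direction, that
a one-direction device rejects FULL, and that requesting NONE is always
allowed; the snippet above only covers the two single-direction
mismatches.

```c
#include <stdbool.h>

/* Local stand-in for enum rte_eth_fc_mode so the sketch compiles alone. */
enum demo_fc_mode {
	DEMO_FC_NONE = 0,
	DEMO_FC_RX_PAUSE,
	DEMO_FC_TX_PAUSE,
	DEMO_FC_FULL,
};

/*
 * Hypothetical helper: return true when the requested mode is covered by
 * the capability reported by the device.
 */
static bool
demo_pfc_mode_supported(enum demo_fc_mode mode_capa, enum demo_fc_mode mode)
{
	/* Assumption: turning PFC off needs no capability. */
	if (mode == DEMO_FC_NONE)
		return true;

	/* Assumption: FULL capability covers Rx-only, Tx-only and FULL. */
	if (mode_capa == DEMO_FC_FULL)
		return true;

	/* Otherwise the request must match the single supported direction. */
	return mode_capa == mode;
}
```

With this shape, an Rx-only device rejects both a Tx-only and a FULL
request, which may or may not be the behavior wanted in ethdev.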


>
>
> > +             if (ret != 0)
> > +                     return ret;
> > +     }
> > +
> > +     if (pfc_info.capa & RTE_ETH_PFC_QUEUE_CAPA_TX_PAUSE) {
> > +             ret = validate_tx_pause_config(&dev_info, pfc_info.tc_max,
> > +                                            pfc_queue_conf);
>
> syntax, please don't align to parentheses
>
> > +             if (ret != 0)
> > +                     return ret;
> > +     }
> > +
> > +     if (*dev->dev_ops->priority_flow_ctrl_queue_config)
> > +             return eth_err(port_id,
> > +                            (*dev->dev_ops->priority_flow_ctrl_queue_config)(
> > +                                    dev, pfc_queue_conf));
> > +     return -ENOTSUP;
> > +}
> > +
> >   static int
> >   eth_check_reta_mask(struct rte_eth_rss_reta_entry64 *reta_conf,
> >                       uint16_t reta_size)
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > index fa299c8ad7..383ad1cdd7 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -1395,6 +1395,48 @@ struct rte_eth_pfc_conf {
> >       uint8_t priority;          /**< VLAN User Priority. */
> >   };
> >
> > +/** Device supports Rx pause for queue based PFC. */
> > +#define RTE_ETH_PFC_QUEUE_CAPA_RX_PAUSE RTE_BIT64(0)
> > +/** Device supports Tx pause for queue based PFC. */
> > +#define RTE_ETH_PFC_QUEUE_CAPA_TX_PAUSE RTE_BIT64(1)
> > +
>
> There is already a flow control mode enum, 'enum rte_eth_fc_mode'; those
> enum items use FC as the abbreviation (RTE_ETH_FC_RX_PAUSE), while the ones above use PFC.
> Should they use the same prefix 'RTE_ETH_FC_' for consistency?
>
> And overall, for the structs and functions too, is the correct abbreviation
> 'pfc' or 'fc', since the old code has 'fc' as far as I see?

Removed the RTE_ETH_PFC_QUEUE_CAPA_* flags and reused enum rte_eth_fc_mode.
>
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > + *
> > + * A structure used to retrieve information of queue based PFC.
> > + */
> > +struct rte_eth_pfc_queue_info {
> > +     /**
> > +      * Maximum supported traffic class as per PFC (802.1Qbb) specification.
>
> Would it be redundant to say that a valid value should be greater than 0?

Updated the documentation as follows:
@@ -1440,13 +1435,19 @@ struct rte_eth_pfc_queue_conf {

        struct {
                uint16_t tx_qid; /**< Tx queue ID */
-               uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */
+               uint8_t tc;
+               /**< Traffic class as per PFC (802.1Qbb) spec. The value must be
+                * in the range [0, rte_eth_pfc_queue_info::tc_max - 1]
+                */



>
> > +      */
> > +     uint8_t tc_max;
> > +     /** PFC queue capabilities (RTE_ETH_PFC_QUEUE_CAPA_). */
>
> Can move doxygen comments to same line if they fit to 80 char limit:
>         uint64_t capa; /**< PFC queue capabilities (RTE_ETH_PFC_QUEUE_CAPA_). */
>
> > +     uint64_t capa;
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > + *
> > + * A structure used to configure Ethernet priority flow control parameter for
> > + * ethdev queues.
> > + */
> > +struct rte_eth_pfc_queue_conf {
> > +     enum rte_eth_fc_mode mode; /**< Link flow control mode */
> > +
> > +     struct {
> > +             uint16_t tx_qid; /**< Tx queue ID */
>
> 'tx_qid' is within the 'rx_pause' struct; this seems intentional, but just to double
> check, can you please describe the intended usage here?


Clarified the doc as follows:

@@ -1429,6 +1429,16 @@ struct rte_eth_pfc_queue_info {
  *
  * A structure used to configure Ethernet priority flow control parameter for
  * ethdev queues.
+ *
+ * rte_eth_pfc_queue_conf::rx_pause structure shall be used to configure given
+ * tx_qid with corresponding tc. When ethdev device receives PFC frame with
+ * rte_eth_pfc_queue_conf::rx_pause::tc, traffic will be paused on
+ * rte_eth_pfc_queue_conf::rx_pause::tx_qid for that tc.
+ *
+ * rte_eth_pfc_queue_conf::tx_pause structure shall be used to configure given
+ * rx_qid. When rx_qid is congested, PFC frames are generated with
+ * rte_eth_pfc_queue_conf::tx_pause::tc and
+ * rte_eth_pfc_queue_conf::tx_pause::pause_time to the peer.
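
To illustrate the intended pairing of queues, here is a small standalone
sketch of how an application might fill the structure for FULL mode. The
ex_ types are local copies of the layout quoted in this thread so that it
compiles without DPDK, and ex_make_full_conf() is a hypothetical helper,
not part of the patch. Using the same tc for both directions is only an
assumption made for brevity.

```c
#include <stdint.h>
#include <string.h>

/* Local stand-in for enum rte_eth_fc_mode. */
enum ex_fc_mode {
	EX_FC_NONE = 0,
	EX_FC_RX_PAUSE,
	EX_FC_TX_PAUSE,
	EX_FC_FULL,
};

/* Local copy of the rte_eth_pfc_queue_conf layout from the patch. */
struct ex_pfc_queue_conf {
	enum ex_fc_mode mode;
	struct {
		uint16_t tx_qid; /* Tx queue paused when a PFC frame for tc arrives */
		uint8_t tc;
	} rx_pause;
	struct {
		uint16_t pause_time; /* Pause quota carried in generated frames */
		uint16_t rx_qid;     /* Rx queue whose congestion triggers PFC */
		uint8_t tc;
	} tx_pause;
};

/* Hypothetical helper: build a FULL-mode config for one Rx/Tx queue pair. */
static struct ex_pfc_queue_conf
ex_make_full_conf(uint16_t rx_qid, uint16_t tx_qid, uint8_t tc,
		uint16_t pause_time)
{
	struct ex_pfc_queue_conf conf;

	memset(&conf, 0, sizeof(conf));
	conf.mode = EX_FC_FULL;

	/* Rx pause: a received PFC frame for tc pauses traffic on tx_qid. */
	conf.rx_pause.tx_qid = tx_qid;
	conf.rx_pause.tc = tc;

	/* Tx pause: congestion on rx_qid generates PFC frames for tc. */
	conf.tx_pause.rx_qid = rx_qid;
	conf.tx_pause.tc = tc;
	conf.tx_pause.pause_time = pause_time;

	return conf;
}
```

In an application using the real types, the filled structure would then be
passed to rte_eth_dev_priority_flow_ctrl_queue_configure(port_id, &conf).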

>
> > +             uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */
> > +     } rx_pause; /* Valid when (mode == FC_RX_PAUSE || mode == FC_FULL) */
> > +
> > +     struct {
> > +             uint16_t pause_time; /**< Pause quota in the Pause frame */
> > +             uint16_t rx_qid;     /**< Rx queue ID */
> > +             uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */
> > +     } tx_pause; /* Valid when (mode == FC_TX_PAUSE || mode == FC_FULL) */
> > +};
> > +
> >   /**
> >    * Tunnel type for device-specific classifier configuration.
> >    * @see rte_eth_udp_tunnel
> > @@ -4144,6 +4186,53 @@ int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id,
> >   int rte_eth_dev_mac_addr_add(uint16_t port_id, struct rte_ether_addr *mac_addr,
> >                               uint32_t pool);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Retrieve the information for queue based PFC.
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @param pfc_queue_info
> > + *   A pointer to a structure of type *rte_eth_pfc_queue_info* to be filled with
> > + *   the information about queue based PFC.
> > + * @return
> > + *   - (0) if successful.
> > + *   - (-ENOTSUP) if support for priority_flow_ctrl_queue_info_get does not exist.
> > + *   - (-ENODEV) if *port_id* invalid.
> > + *   - (-EINVAL) if bad parameter.
> > + */
> > +__rte_experimental
> > +int rte_eth_dev_priority_flow_ctrl_queue_info_get(uint16_t port_id,
> > +                                struct rte_eth_pfc_queue_info *pfc_queue_info);
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Configure the queue based priority flow control for a given queue
> > + * for Ethernet device.
> > + *
> > + * @note When an ethdev port switches to queue based PFC mode, the
> > + * unconfigured queues shall be configured by the driver with
> > + * default values such as lower priority value for TC etc.
> > + *
>
> It may be good to document that it has a dependency on the 'rte_eth_dev_info_get()' API?

There is no dependency on rte_eth_dev_info_get() in the spec.

>
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @param pfc_queue_conf
> > + *   The pointer to the structure of the priority flow control parameters
> > + *   for the queue.
> > + * @return
> > + *   - (0) if successful.
> > + *   - (-ENOTSUP) if hardware doesn't support queue based PFC mode.
> > + *   - (-ENODEV)  if *port_id* invalid.
> > + *   - (-EINVAL)  if bad parameter
> > + *   - (-EIO)     if flow control setup queue failure
> > + */
> > +__rte_experimental
> > +int rte_eth_dev_priority_flow_ctrl_queue_configure(uint16_t port_id,
> > +                           struct rte_eth_pfc_queue_conf *pfc_queue_conf);
> > +
> >   /**
> >    * Remove a MAC address from the internal array of addresses.
> >    *
> > diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> > index c2fb0669a4..49523ebc45 100644
> > --- a/lib/ethdev/version.map
> > +++ b/lib/ethdev/version.map
> > @@ -256,6 +256,10 @@ EXPERIMENTAL {
> >       rte_flow_flex_item_create;
> >       rte_flow_flex_item_release;
> >       rte_flow_pick_transfer_proxy;
> > +
> > +     # added in 22.03
> > +     rte_eth_dev_priority_flow_ctrl_queue_configure;
> > +     rte_eth_dev_priority_flow_ctrl_queue_info_get;
> >   };
> >
> >   INTERNAL {
>