From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Rybchenko
To: Ori Kam, Thomas Monjalon, Ferruh Yigit
Date: Thu, 26 Sep 2019 15:18:36 +0300
In-Reply-To: <1569479349-36962-2-git-send-email-orika@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH 01/13] ethdev: support setup function for hairpin queue

On 9/26/19 9:28 AM, Ori Kam wrote:
> This commit introduce the RX/TX hairpin setup function.

RX/TX should be Rx/Tx here and everywhere below.

> Hairpin is RX/TX queue that is used by the nic in order to offload
> wire to wire traffic.
>
> Each hairpin queue is binded to one or more queues from other type.
> For example TX hairpin queue should be binded to at least 1 RX hairpin
> queue and vice versa.

How should the application find out that hairpin queues are supported? How many?
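To make the question concrete, this is the kind of check I would expect an application to be able to write. The capability structure and its fields below are purely hypothetical, nothing like it exists in this patch:

```c
#include <stdint.h>

/* Hypothetical capability reporting; none of these fields exist in the
 * patch, they only illustrate what an application would need to query. */
struct hairpin_capa {
	uint16_t max_rx_hairpin_queues;
	uint16_t max_tx_hairpin_queues;
};

/* Can the application set up at least one Rx/Tx hairpin pair? */
static int
hairpin_pair_supported(const struct hairpin_capa *capa)
{
	return capa->max_rx_hairpin_queues >= 1 &&
	       capa->max_tx_hairpin_queues >= 1;
}
```

Without something like this the application can only try the setup call and interpret -ENOTSUP, which is not a great discovery mechanism.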
How should the application find out which ports/queues could be used for
hairpinning? Is a hairpin domain at device level sufficient to expose the
limitations?

> Signed-off-by: Ori Kam
> ---
>  lib/librte_ethdev/rte_ethdev.c           | 213 +++++++++++++++++++++++++++++++
>  lib/librte_ethdev/rte_ethdev.h           | 145 +++++++++++++++++++++
>  lib/librte_ethdev/rte_ethdev_core.h      |  18 +++
>  lib/librte_ethdev/rte_ethdev_version.map |   4 +
>  4 files changed, 380 insertions(+)
>
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 30b0c78..4021f38 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -1701,6 +1701,115 @@ struct rte_eth_dev *
>  }
>
>  int
> +rte_eth_rx_hairpin_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
> +                               uint16_t nb_rx_desc, unsigned int socket_id,
> +                               const struct rte_eth_rxconf *rx_conf,
> +                               const struct rte_eth_hairpin_conf *hairpin_conf)

The code below duplicates rte_eth_rx_queue_setup() a lot, which is very bad
from a maintenance point of view. Similar problem with the Tx hairpin queue
setup.
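For example, the descriptor-count validation that both setup functions repeat could live in one shared static helper. A sketch only; the helper name is mine, and I use a standalone limits struct so the snippet is self-contained, whereas the real code would take struct rte_eth_desc_lim:

```c
#include <stdint.h>

/* Stand-in for struct rte_eth_desc_lim so the sketch stands alone. */
struct desc_lim {
	uint16_t nb_max;
	uint16_t nb_min;
	uint16_t nb_align;
};

/* Shared by regular and hairpin queue setup: check a requested ring
 * size against the driver-reported limits. Returns 0 or -EINVAL. */
static int
eth_dev_check_nb_desc(uint16_t nb_desc, const struct desc_lim *lim)
{
	if (nb_desc > lim->nb_max ||
	    nb_desc < lim->nb_min ||
	    nb_desc % lim->nb_align != 0)
		return -22; /* -EINVAL */
	return 0;
}
```

The same applies to the port/queue-id validation, the -EBUSY checks and the offload checks: one helper each, called from both the regular and the hairpin setup paths.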
> +{
> +    int ret;
> +    struct rte_eth_dev *dev;
> +    struct rte_eth_dev_info dev_info;
> +    struct rte_eth_rxconf local_conf;
> +    void **rxq;
> +
> +    RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> +
> +    dev = &rte_eth_devices[port_id];
> +    if (rx_queue_id >= dev->data->nb_rx_queues) {
> +        RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n", rx_queue_id);
> +        return -EINVAL;
> +    }
> +
> +    RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get, -ENOTSUP);
> +    RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_hairpin_queue_setup,
> +                            -ENOTSUP);
> +
> +    rte_eth_dev_info_get(port_id, &dev_info);
> +
> +    /* Use default specified by driver, if nb_rx_desc is zero */
> +    if (nb_rx_desc == 0) {
> +        nb_rx_desc = dev_info.default_rxportconf.ring_size;
> +        /* If driver default is also zero, fall back on EAL default */
> +        if (nb_rx_desc == 0)
> +            nb_rx_desc = RTE_ETH_DEV_FALLBACK_RX_RINGSIZE;
> +    }
> +
> +    if (nb_rx_desc > dev_info.rx_desc_lim.nb_max ||
> +        nb_rx_desc < dev_info.rx_desc_lim.nb_min ||
> +        nb_rx_desc % dev_info.rx_desc_lim.nb_align != 0) {
> +
> +        RTE_ETHDEV_LOG(ERR,
> +            "Invalid value for nb_rx_desc(=%hu), should be: "
> +            "<= %hu, >= %hu, and a product of %hu\n",
> +            nb_rx_desc, dev_info.rx_desc_lim.nb_max,
> +            dev_info.rx_desc_lim.nb_min,
> +            dev_info.rx_desc_lim.nb_align);
> +        return -EINVAL;
> +    }
> +
> +    if (dev->data->dev_started &&
> +        !(dev_info.dev_capa &
> +          RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP))
> +        return -EBUSY;
> +
> +    if (dev->data->dev_started &&
> +        (dev->data->rx_queue_state[rx_queue_id] !=
> +         RTE_ETH_QUEUE_STATE_STOPPED))
> +        return -EBUSY;
> +
> +    rxq = dev->data->rx_queues;
> +    if (rxq[rx_queue_id]) {
> +        RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
> +                                -ENOTSUP);
> +        (*dev->dev_ops->rx_queue_release)(rxq[rx_queue_id]);
> +        rxq[rx_queue_id] = NULL;
> +    }
> +
> +    if (rx_conf == NULL)
> +        rx_conf = &dev_info.default_rxconf;
> +
> +    local_conf = *rx_conf;
> +
> +    /*
> +     * If an offloading has already been enabled in
> +     * rte_eth_dev_configure(), it has been enabled on all queues,
> +     * so there is no need to enable it in this queue again.
> +     * The local_conf.offloads input to underlying PMD only carries
> +     * those offloadings which are only enabled on this queue and
> +     * not enabled on all queues.
> +     */
> +    local_conf.offloads &= ~dev->data->dev_conf.rxmode.offloads;
> +
> +    /*
> +     * New added offloadings for this queue are those not enabled in
> +     * rte_eth_dev_configure() and they must be per-queue type.
> +     * A pure per-port offloading can't be enabled on a queue while
> +     * disabled on another queue. A pure per-port offloading can't
> +     * be enabled for any queue as new added one if it hasn't been
> +     * enabled in rte_eth_dev_configure().
> +     */
> +    if ((local_conf.offloads & dev_info.rx_queue_offload_capa) !=
> +        local_conf.offloads) {
> +        RTE_ETHDEV_LOG(ERR,
> +            "Ethdev port_id=%d rx_queue_id=%d, "
> +            "new added offloads 0x%"PRIx64" must be "
> +            "within per-queue offload capabilities "
> +            "0x%"PRIx64" in %s()\n",
> +            port_id, rx_queue_id, local_conf.offloads,
> +            dev_info.rx_queue_offload_capa,
> +            __func__);
> +        return -EINVAL;
> +    }
> +
> +    ret = (*dev->dev_ops->rx_hairpin_queue_setup)(dev, rx_queue_id,
> +                                                  nb_rx_desc, socket_id,
> +                                                  &local_conf,
> +                                                  hairpin_conf);
> +
> +    return eth_err(port_id, ret);
> +}
> +
> +int
>  rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
>                         uint16_t nb_tx_desc, unsigned int socket_id,
>                         const struct rte_eth_txconf *tx_conf)
> @@ -1799,6 +1908,110 @@ struct rte_eth_dev *
>              tx_queue_id, nb_tx_desc, socket_id, &local_conf));
>  }
>
> +int
> +rte_eth_tx_hairpin_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
> +                               uint16_t nb_tx_desc, unsigned int socket_id,
> +                               const struct rte_eth_txconf *tx_conf,
> +                               const struct rte_eth_hairpin_conf *hairpin_conf)
> +{
> +    struct rte_eth_dev *dev;
> +    struct rte_eth_dev_info dev_info;
> +    struct rte_eth_txconf local_conf;
> +    void **txq;
> +
> +    RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id,
> +                                    -EINVAL);
> +
> +    dev = &rte_eth_devices[port_id];
> +    if (tx_queue_id >= dev->data->nb_tx_queues) {
> +        RTE_ETHDEV_LOG(ERR, "Invalid TX queue_id=%u\n", tx_queue_id);
> +        return -EINVAL;
> +    }
> +
> +    RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get, -ENOTSUP);
> +    RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_hairpin_queue_setup,
> +                            -ENOTSUP);
> +
> +    rte_eth_dev_info_get(port_id, &dev_info);
> +
> +    /* Use default specified by driver, if nb_tx_desc is zero */
> +    if (nb_tx_desc == 0) {
> +        nb_tx_desc = dev_info.default_txportconf.ring_size;
> +        /* If driver default is zero, fall back on EAL default */
> +        if (nb_tx_desc == 0)
> +            nb_tx_desc = RTE_ETH_DEV_FALLBACK_TX_RINGSIZE;
> +    }
> +    if (nb_tx_desc > dev_info.tx_desc_lim.nb_max ||
> +        nb_tx_desc < dev_info.tx_desc_lim.nb_min ||
> +        nb_tx_desc % dev_info.tx_desc_lim.nb_align != 0) {
> +        RTE_ETHDEV_LOG(ERR,
> +            "Invalid value for nb_tx_desc(=%hu), "
> +            "should be: <= %hu, >= %hu, and a product of "
> +            " %hu\n",
> +            nb_tx_desc, dev_info.tx_desc_lim.nb_max,
> +            dev_info.tx_desc_lim.nb_min,
> +            dev_info.tx_desc_lim.nb_align);
> +        return -EINVAL;
> +    }
> +
> +    if (dev->data->dev_started &&
> +        !(dev_info.dev_capa &
> +          RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP))
> +        return -EBUSY;
> +
> +    if (dev->data->dev_started &&
> +        (dev->data->tx_queue_state[tx_queue_id] !=
> +         RTE_ETH_QUEUE_STATE_STOPPED))
> +        return -EBUSY;
> +
> +    txq = dev->data->tx_queues;
> +    if (txq[tx_queue_id]) {
> +        RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release,
> +                                -ENOTSUP);
> +        (*dev->dev_ops->tx_queue_release)(txq[tx_queue_id]);
> +        txq[tx_queue_id] = NULL;
> +    }
> +
> +    if (tx_conf == NULL)
> +        tx_conf = &dev_info.default_txconf;
> +
> +    local_conf = *tx_conf;
> +
> +    /*
> +     * If an offloading has already been enabled in
> +     * rte_eth_dev_configure(), it has been enabled on all queues,
> +     * so there is no need to enable it in this queue again.
> +     * The local_conf.offloads input to underlying PMD only carries
> +     * those offloadings which are only enabled on this queue and
> +     * not enabled on all queues.
> +     */
> +    local_conf.offloads &= ~dev->data->dev_conf.txmode.offloads;
> +
> +    /*
> +     * New added offloadings for this queue are those not enabled in
> +     * rte_eth_dev_configure() and they must be per-queue type.
> +     * A pure per-port offloading can't be enabled on a queue while
> +     * disabled on another queue. A pure per-port offloading can't
> +     * be enabled for any queue as new added one if it hasn't been
> +     * enabled in rte_eth_dev_configure().
> +     */
> +    if ((local_conf.offloads & dev_info.tx_queue_offload_capa) !=
> +        local_conf.offloads) {
> +        RTE_ETHDEV_LOG(ERR,
> +            "Ethdev port_id=%d tx_queue_id=%d, new added "
> +            "offloads 0x%"PRIx64" must be within "
> +            "per-queue offload capabilities 0x%"PRIx64" "
> +            "in %s()\n",
> +            port_id, tx_queue_id, local_conf.offloads,
> +            dev_info.tx_queue_offload_capa,
> +            __func__);
> +        return -EINVAL;
> +    }
> +
> +    return eth_err(port_id, (*dev->dev_ops->tx_hairpin_queue_setup)
> +                   (dev, tx_queue_id, nb_tx_desc, socket_id, &local_conf,
> +                    hairpin_conf));
> +}
> +
>  void
>  rte_eth_tx_buffer_drop_callback(struct rte_mbuf **pkts, uint16_t unsent,
>          void *userdata __rte_unused)
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index 475dbda..b3b1597 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -803,6 +803,30 @@ struct rte_eth_txconf {
>      uint64_t offloads;
>  };
>
> +#define RTE_ETH_MAX_HAIRPIN_PEERS 32
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * A structure used to hold hairpin peer data.
> + */
> +struct rte_eth_hairpin_peer {
> +    uint16_t port;  /**< Peer port. */
> +    uint16_t queue; /**< Peer queue. */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * A structure used to configure hairpin binding.
> + */
> +struct rte_eth_hairpin_conf {
> +    uint16_t peer_n; /**< The number of peers. */
> +    struct rte_eth_hairpin_peer peers[RTE_ETH_MAX_HAIRPIN_PEERS];
> +};
> +
>  /**
>   * A structure contains information about HW descriptor ring limitations.
>   */
> @@ -1769,6 +1793,60 @@ int rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
>      struct rte_mempool *mb_pool);
>
>  /**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Allocate and set up a hairpin receive queue for an Ethernet device.
> + *
> + * The function set up the selected queue to be used in hairpin.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param rx_queue_id
> + *   The index of the receive queue to set up.
> + *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().

May any Rx queue be set up as a hairpin queue? Can it still be used for
regular traffic?

> + * @param nb_rx_desc
> + *   The number of receive descriptors to allocate for the receive ring.

Does it still make sense for a hairpin queue?

> + * @param socket_id
> + *   The *socket_id* argument is the socket identifier in case of NUMA.
> + *   The value can be *SOCKET_ID_ANY* if there is no NUMA constraint for
> + *   the DMA memory allocated for the receive descriptors of the ring.

Is it still required to be provided for a hairpin Rx queue?

> + * @param rx_conf
> + *   The pointer to the configuration data to be used for the receive queue.
> + *   NULL value is allowed, in which case default RX configuration
> + *   will be used.
> + *   The *rx_conf* structure contains an *rx_thresh* structure with the values
> + *   of the Prefetch, Host, and Write-Back threshold registers of the receive
> + *   ring.
> + *   In addition it contains the hardware offloads features to activate using
> + *   the DEV_RX_OFFLOAD_* flags.
> + *   If an offloading set in rx_conf->offloads
> + *   hasn't been set in the input argument eth_conf->rxmode.offloads
> + *   to rte_eth_dev_configure(), it is a new added offloading, it must be
> + *   per-queue type and it is enabled for the queue.
> + *   No need to repeat any bit in rx_conf->offloads which has already been
> + *   enabled in rte_eth_dev_configure() at port level. An offloading enabled
> + *   at port level can't be disabled at queue level.

Which offloads still make sense in the case of a hairpin Rx queue?
What about thresholds, drop enable?

> + * @param hairpin_conf
> + *   The pointer to the hairpin binding configuration.
> + * @return
> + *   - 0: Success, receive queue correctly set up.
> + *   - -EINVAL: The size of network buffers which can be allocated from the
> + *      memory pool does not fit the various buffer sizes allowed by the
> + *      device controller.
> + *   - -ENOMEM: Unable to allocate the receive ring descriptors or to
> + *      allocate network memory buffers from the memory pool when
> + *      initializing receive descriptors.
> + */
> +__rte_experimental
> +int rte_eth_rx_hairpin_queue_setup
> +    (uint16_t port_id, uint16_t rx_queue_id,
> +     uint16_t nb_rx_desc, unsigned int socket_id,
> +     const struct rte_eth_rxconf *rx_conf,
> +     const struct rte_eth_hairpin_conf *hairpin_conf);
> +
> +/**
>   * Allocate and set up a transmit queue for an Ethernet device.
>   *
>   * @param port_id
> @@ -1821,6 +1899,73 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
>      const struct rte_eth_txconf *tx_conf);
>
>  /**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Allocate and set up a transmit hairpin queue for an Ethernet device.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param tx_queue_id
> + *   The index of the transmit queue to set up.
> + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().

May any Tx queue be set up as a hairpin queue?

> + * @param nb_tx_desc
> + *   The number of transmit descriptors to allocate for the transmit ring.

Is it really required for a hairpin queue? Are the min/max/align limits still
the same?

> + * @param socket_id
> + *   The *socket_id* argument is the socket identifier in case of NUMA.
> + *   Its value can be *SOCKET_ID_ANY* if there is no NUMA constraint for
> + *   the DMA memory allocated for the transmit descriptors of the ring.

Does it still make sense for a Tx hairpin queue?

> + * @param tx_conf
> + *   The pointer to the configuration data to be used for the transmit queue.
> + *   NULL value is allowed, in which case default RX configuration
> + *   will be used.
> + *   The *tx_conf* structure contains the following data:
> + *   - The *tx_thresh* structure with the values of the Prefetch, Host, and
> + *     Write-Back threshold registers of the transmit ring.
> + *     When setting Write-Back threshold to the value greater then zero,
> + *     *tx_rs_thresh* value should be explicitly set to one.
> + *   - The *tx_free_thresh* value indicates the [minimum] number of network
> + *     buffers that must be pending in the transmit ring to trigger their
> + *     [implicit] freeing by the driver transmit function.
> + *   - The *tx_rs_thresh* value indicates the [minimum] number of transmit
> + *     descriptors that must be pending in the transmit ring before setting the
> + *     RS bit on a descriptor by the driver transmit function.
> + *     The *tx_rs_thresh* value should be less or equal then
> + *     *tx_free_thresh* value, and both of them should be less then
> + *     *nb_tx_desc* - 3.

I'm not sure that everything above makes sense for a hairpin Tx queue.

> + *   - The *txq_flags* member contains flags to pass to the TX queue setup
> + *     function to configure the behavior of the TX queue. This should be set
> + *     to 0 if no special configuration is required.
> + *     This API is obsolete and will be deprecated. Applications
> + *     should set it to ETH_TXQ_FLAGS_IGNORE and use
> + *     the offloads field below.

There has been no txq_flags for a long time already, so I'm wondering when
this was copied from rte_eth_tx_queue_setup().

> + *   - The *offloads* member contains Tx offloads to be enabled.
> + *     If an offloading set in tx_conf->offloads
> + *     hasn't been set in the input argument eth_conf->txmode.offloads
> + *     to rte_eth_dev_configure(), it is a new added offloading, it must be
> + *     per-queue type and it is enabled for the queue.
> + *     No need to repeat any bit in tx_conf->offloads which has already been
> + *     enabled in rte_eth_dev_configure() at port level. An offloading enabled
> + *     at port level can't be disabled at queue level.

Which offloads really make sense and are valid for hairpin Tx queues?
Do we need separate capabilities for hairpin offloads?

> + *
> + *   Note that setting *tx_free_thresh* or *tx_rs_thresh* value to 0 forces
> + *   the transmit function to use default values.
> + * @param hairpin_conf
> + *   The hairpin binding configuration.
> + *
> + * @return
> + *   - 0: Success, the transmit queue is correctly set up.
> + *   - -ENOMEM: Unable to allocate the transmit ring descriptors.
> + */
> +__rte_experimental
> +int rte_eth_tx_hairpin_queue_setup
> +    (uint16_t port_id, uint16_t tx_queue_id,
> +     uint16_t nb_tx_desc, unsigned int socket_id,
> +     const struct rte_eth_txconf *tx_conf,
> +     const struct rte_eth_hairpin_conf *hairpin_conf);
> +
> +/**
>   * Return the NUMA socket to which an Ethernet device is connected
>   *
>   * @param port_id

[snip]
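For reference, my reading of how a caller would fill the proposed binding structure for the simple one-peer case. The struct definitions are copied from the patch so the snippet stands alone; the helper function is mine, and the actual rte_eth_rx/tx_hairpin_queue_setup() call is omitted:

```c
#include <stdint.h>

#define RTE_ETH_MAX_HAIRPIN_PEERS 32

/* Copied from the patch. */
struct rte_eth_hairpin_peer {
	uint16_t port;  /**< Peer port. */
	uint16_t queue; /**< Peer queue. */
};

struct rte_eth_hairpin_conf {
	uint16_t peer_n; /**< The number of peers. */
	struct rte_eth_hairpin_peer peers[RTE_ETH_MAX_HAIRPIN_PEERS];
};

/* Bind a queue to exactly one peer queue, e.g. an Rx hairpin queue
 * to its Tx counterpart on the same port. */
static struct rte_eth_hairpin_conf
hairpin_conf_single_peer(uint16_t peer_port, uint16_t peer_queue)
{
	struct rte_eth_hairpin_conf conf = { .peer_n = 1 };

	conf.peers[0].port = peer_port;
	conf.peers[0].queue = peer_queue;
	return conf;
}
```

If this is the intended usage, the documentation should say explicitly what peer_n > 1 means and how asymmetric Rx/Tx bindings are supposed to behave.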