From: Shahaf Shuler <shahafs@mellanox.com>
To: Tom Barbette <barbette@kth.se>, "dev@dpdk.org" <dev@dpdk.org>,
Ferruh Yigit <ferruh.yigit@intel.com>,
Thomas Monjalon <thomas@monjalon.net>,
Andrew Rybchenko <arybchenko@solarflare.com>,
"olivier.matz@6wind.com" <olivier.matz@6wind.com>
Cc: Yongseok Koh <yskoh@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH v4] mlx5: Support for rte_eth_rx_queue_count
Date: Thu, 1 Nov 2018 07:21:31 +0000 [thread overview]
Message-ID: <DB7PR05MB4426685FAA18491A43449207C3CE0@DB7PR05MB4426.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <1540976475938.69727@kth.se>
Wednesday, October 31, 2018 11:01 AM, Tom Barbette:
> Subject: RE: [PATCH v4] mlx5: Support for rte_eth_rx_queue_count
>
> Hi Shahaf,
>
> I don't see how rte_eth_rx_descriptor_status can actually give me the
> number of packets in the RX queue? It will tell me the status of a packet at a
> given offset, right?
It will tell you if in a given offset on the rxq you have a packet ready. I think it will fit your needs, see below.
>
> About the goal: we have a full view of a network (switches and servers), and
> we want to know where the queueing is happening. So queues from
> switches and servers are reported to the controller to deduce latency,
> congestion, ....
> On top of that, as CPU occupancy is somehow erratic with PMDs (ie we use
> approximations), the queuing helps to make better scheduling decisions
> about which servers could accept more load.
So how about the below heuristic using rte_eth_rx_descriptor_status.
Let's say you configure the rxq w/ N descriptors. Pick some threshold which will represent "this queue on this server has enough work, no need to send more", e.g. 3*N/4.
Monitor the rxq descriptor using rte_eth_rx_descriptor_status(port, rxq, 3*N/4). If you get RTE_ETH_RX_DESC_AVAIL, then the TH is not yet reached and the server can process more, otherwise if you get RTE_ETH_RX_DESC_DONE it means you should stop scheduling packets for this one.
You can also pick set of different TH to say queues is {not busy, partially busy, really busy} to deduce about the latency. But for the latency it is better to work w/ NIC host coherent clock + timestamps (like you implemented on a different patch).
>
> Considering the advance of Smart NICs, being able to monitor NICs as any
> networking equipment in a network is of even more importance (if I still
> need to make that point?).
>
>
> Tom
>
>
>
>
> ________________________________________
> De : Shahaf Shuler <shahafs@mellanox.com> Envoyé : dimanche 28 octobre
> 2018 10:37 À : Tom Barbette; dev@dpdk.org; Ferruh Yigit; Thomas Monjalon;
> Andrew Rybchenko; olivier.matz@6wind.com Cc : Yongseok Koh Objet : RE:
> [PATCH v4] mlx5: Support for rte_eth_rx_queue_count
>
> Hi Tom,
>
> Adding ethdev maintainers and Oliver as the author of the new API.
>
>
> Saturday, October 27, 2018 6:11 PM, Tom Barbette:
> > Subject: [PATCH v4] mlx5: Support for rte_eth_rx_queue_count
> >
>
> I have a more basic question.
> The rte_eth_rx_queue_count is a very old API, more or less from the
> beginning of DPDK.
> We progressed from then and a newer API to check the descriptor status
> was introduced @DPDK17.05 : rte_eth_rx_descriptor_status, see commit[1].
> There is also a plan to deprecate it once all drivers will support the new API.
>
> With the new API you can check the number of available descriptors on a
> similar way.
> So my question is, is the new API enough for your functionality? If not, what
> it is missing? I would prefer to improve the new one instead of starting to
> support the old one.
>
>
> [1]
> commit b1b700ce7d6fe0db9152f73e8e00394fc2756e60
> Author: Olivier Matz <olivier.matz@6wind.com>
> Date: Wed Mar 29 10:36:28 2017 +0200
>
> ethdev: add descriptor status API
>
> Introduce a new API to get the status of a descriptor.
>
> For Rx, it is almost similar to rx_descriptor_done API, except it
> differentiates "used" descriptors (which are hold by the driver and not
> returned to the hardware).
>
> For Tx, it is a new API.
>
> The descriptor_done() API, and probably the rx_queue_count() API could
> be replaced by this new API as soon as it is implemented on all PMDs.
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
>
>
> > This patch adds support for the rx_queue_count API in mlx5 driver
> >
> > Changes in v2:
> > * Fixed styling issues
> > * Fix missing return
> >
> > Changes in v3:
> > * Fix styling comments and checks as per Yongseok Koh
> > <yskoh@mellanox.com> comments. Thanks !
> >
> > Changes in v4:
> > * Fix compiling issue because of a line that disappeared in v3
> >
> > Signed-off-by: Tom Barbette <barbette@kth.se>
> > ---
> > drivers/net/mlx5/mlx5.c | 1 +
> > drivers/net/mlx5/mlx5_rxtx.c | 78
> > ++++++++++++++++++++++++++++++++++++++------
> > drivers/net/mlx5/mlx5_rxtx.h | 1 +
> > 3 files changed, 70 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> > ec63bc6..6fccadd 100644
> > --- a/drivers/net/mlx5/mlx5.c
> > +++ b/drivers/net/mlx5/mlx5.c
> > @@ -375,6 +375,7 @@ const struct eth_dev_ops mlx5_dev_ops = {
> > .filter_ctrl = mlx5_dev_filter_ctrl,
> > .rx_descriptor_status = mlx5_rx_descriptor_status,
> > .tx_descriptor_status = mlx5_tx_descriptor_status,
> > + .rx_queue_count = mlx5_rx_queue_count,
> > .rx_queue_intr_enable = mlx5_rx_intr_enable,
> > .rx_queue_intr_disable = mlx5_rx_intr_disable,
> > .is_removed = mlx5_is_removed,
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.c
> > b/drivers/net/mlx5/mlx5_rxtx.c index 2d14f8a..2126205 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.c
> > +++ b/drivers/net/mlx5/mlx5_rxtx.c
> > @@ -417,20 +417,17 @@ mlx5_tx_descriptor_status(void *tx_queue,
> > uint16_t offset) }
> >
> > /**
> > - * DPDK callback to check the status of a rx descriptor.
> > + * Internal function to compute the number of used descriptors in an
> > + RX queue
> > *
> > - * @param rx_queue
> > - * The rx queue.
> > - * @param[in] offset
> > - * The index of the descriptor in the ring.
> > + * @param rxq
> > + * The Rx queue.
> > *
> > * @return
> > - * The status of the tx descriptor.
> > + * The number of used rx descriptor.
> > */
> > -int
> > -mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset)
> > +static uint32_t
> > +rx_queue_count(struct mlx5_rxq_data *rxq)
> > {
> > - struct mlx5_rxq_data *rxq = rx_queue;
> > struct rxq_zip *zip = &rxq->zip;
> > volatile struct mlx5_cqe *cqe;
> > const unsigned int cqe_n = (1 << rxq->cqe_n); @@ -461,12 +458,73
> > @@ mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset)
> > cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
> > }
> > used = RTE_MIN(used, (1U << rxq->elts_n) - 1);
> > - if (offset < used)
> > + return used;
> > +}
> > +
> > +/**
> > + * DPDK callback to check the status of a rx descriptor.
> > + *
> > + * @param rx_queue
> > + * The Rx queue.
> > + * @param[in] offset
> > + * The index of the descriptor in the ring.
> > + *
> > + * @return
> > + * The status of the tx descriptor.
> > + */
> > +int
> > +mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset) {
> > + struct mlx5_rxq_data *rxq = rx_queue;
> > + struct mlx5_rxq_ctrl *rxq_ctrl =
> > + container_of(rxq, struct mlx5_rxq_ctrl, rxq);
> > + struct rte_eth_dev *dev = ETH_DEV(rxq_ctrl->priv);
> > +
> > + if (dev->rx_pkt_burst != mlx5_rx_burst) {
> > + rte_errno = ENOTSUP;
> > + return -rte_errno;
> > + }
> > + if (offset >= (1 << rxq->elts_n)) {
> > + rte_errno = EINVAL;
> > + return -rte_errno;
> > + }
> > + if (offset < rx_queue_count(rxq))
> > return RTE_ETH_RX_DESC_DONE;
> > return RTE_ETH_RX_DESC_AVAIL;
> > }
> >
> > /**
> > + * DPDK callback to get the number of used descriptors in a RX queue
> > + *
> > + * @param dev
> > + * Pointer to the device structure.
> > + *
> > + * @param rx_queue_id
> > + * The Rx queue.
> > + *
> > + * @return
> > + * The number of used rx descriptor.
> > + * -EINVAL if the queue is invalid
> > + */
> > +uint32_t
> > +mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id) {
> > + struct priv *priv = dev->data->dev_private;
> > + struct mlx5_rxq_data *rxq;
> > +
> > + if (dev->rx_pkt_burst != mlx5_rx_burst) {
> > + rte_errno = ENOTSUP;
> > + return -rte_errno;
> > + }
> > + rxq = (*priv->rxqs)[rx_queue_id];
> > + if (!rxq) {
> > + rte_errno = EINVAL;
> > + return -rte_errno;
> > + }
> > + return rx_queue_count(rxq);
> > +}
> > +
> > +/**
> > * DPDK callback for TX.
> > *
> > * @param dpdk_txq
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.h
> > b/drivers/net/mlx5/mlx5_rxtx.h index 48ed2b2..c82059b 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx.h
> > @@ -345,6 +345,7 @@ uint16_t removed_rx_burst(void *dpdk_rxq, struct
> > rte_mbuf **pkts,
> > uint16_t pkts_n); int
> > mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset); int
> > mlx5_tx_descriptor_status(void *tx_queue, uint16_t offset);
> > +uint32_t mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t
> > +rx_queue_id);
> >
> > /* Vectorized version of mlx5_rxtx.c */ int
> > mlx5_check_raw_vec_tx_support(struct rte_eth_dev *dev);
> > --
> > 2.7.4
next prev parent reply other threads:[~2018-11-01 7:21 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-27 15:10 Tom Barbette
2018-10-28 8:58 ` Tom Barbette
2018-10-28 9:37 ` Shahaf Shuler
2018-10-31 9:01 ` Tom Barbette
2018-11-01 7:21 ` Shahaf Shuler [this message]
2018-11-05 9:01 ` Tom Barbette
2018-11-05 9:55 ` Olivier Matz
2018-11-05 13:18 ` Shahaf Shuler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DB7PR05MB4426685FAA18491A43449207C3CE0@DB7PR05MB4426.eurprd05.prod.outlook.com \
--to=shahafs@mellanox.com \
--cc=arybchenko@solarflare.com \
--cc=barbette@kth.se \
--cc=dev@dpdk.org \
--cc=ferruh.yigit@intel.com \
--cc=olivier.matz@6wind.com \
--cc=thomas@monjalon.net \
--cc=yskoh@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).