DPDK patches and discussions
From: Shahaf Shuler <shahafs@mellanox.com>
To: Tom Barbette <barbette@kth.se>, "dev@dpdk.org" <dev@dpdk.org>,
	Ferruh Yigit <ferruh.yigit@intel.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	Andrew Rybchenko <arybchenko@solarflare.com>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>
Cc: Yongseok Koh <yskoh@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH v4] mlx5: Support for rte_eth_rx_queue_count
Date: Thu, 1 Nov 2018 07:21:31 +0000	[thread overview]
Message-ID: <DB7PR05MB4426685FAA18491A43449207C3CE0@DB7PR05MB4426.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <1540976475938.69727@kth.se>

Wednesday, October 31, 2018 11:01 AM, Tom Barbette:
> Subject: RE: [PATCH v4] mlx5: Support for rte_eth_rx_queue_count
> 
> Hi Shahaf,
> 
> I don't see how rte_eth_rx_descriptor_status can actually give me the
> number of packets in the RX queue? It will tell me the status of a packet at a
> given offset, right?

It will tell you whether a packet is ready at a given offset on the rxq. I think it will fit your needs; see below.

> 
> About the goal: we have a full view of a network (switches and servers), and
> we want to know where the queueing is happening. So queues from
> switches and servers are reported to the controller to deduce latency,
> congestion, ...
> On top of that, as CPU occupancy is somewhat erratic with PMDs (i.e. we use
> approximations), the queuing helps to make better scheduling decisions
> about which servers could accept more load.

So how about the below heuristic using rte_eth_rx_descriptor_status.
Let's say you configure the rxq with N descriptors. Pick a threshold that represents "this queue on this server has enough work, no need to send more", e.g. 3*N/4.
Monitor the rxq using rte_eth_rx_descriptor_status(port, rxq, 3*N/4). If you get RTE_ETH_RX_DESC_AVAIL, the threshold is not yet reached and the server can process more; if you get RTE_ETH_RX_DESC_DONE, you should stop scheduling packets to this one.
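
A minimal sketch of that check (the helper name, its parameters and the 3*N/4 value are illustrative, not part of the patch under discussion):

#include <rte_ethdev.h>

/* Return 1 if the rxq has reached the "busy" threshold, 0 if the server
 * can still take more load, or a negative errno from the ethdev layer.
 * The 3*N/4 threshold is only an example value. */
static int
rxq_above_threshold(uint16_t port_id, uint16_t queue_id, uint16_t nb_desc)
{
	uint16_t threshold = (3 * nb_desc) / 4;
	int status = rte_eth_rx_descriptor_status(port_id, queue_id, threshold);

	if (status < 0)
		return status; /* e.g. -ENOTSUP if the PMD lacks the callback */
	/* DONE means the HW already filled that descriptor, i.e. at least
	 * 3*N/4 packets are waiting in the queue. */
	return status == RTE_ETH_RX_DESC_DONE;
}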

You can also pick a set of different thresholds to classify the queue as {not busy, partially busy, really busy} and deduce something about the latency, for example as in the sketch below. But for latency it is better to work with the NIC host-coherent clock + timestamps (like you implemented in a different patch).
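
For example (again just a sketch; the N/4 and 3*N/4 levels are arbitrary illustrative choices):

/* Classify rxq occupancy: 0 = not busy, 1 = partially busy, 2 = really busy. */
static int
rxq_busy_level(uint16_t port_id, uint16_t queue_id, uint16_t nb_desc)
{
	if (rte_eth_rx_descriptor_status(port_id, queue_id,
					 (3 * nb_desc) / 4) == RTE_ETH_RX_DESC_DONE)
		return 2;
	if (rte_eth_rx_descriptor_status(port_id, queue_id,
					 nb_desc / 4) == RTE_ETH_RX_DESC_DONE)
		return 1;
	return 0;
}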

> 
> Considering the advance of Smart NICs, being able to monitor NICs like any
> other networking equipment in a network is of even more importance (if I still
> need to make that point?).
> 
> 
> Tom
> 
> 
> 
> 
> ________________________________________
> From: Shahaf Shuler <shahafs@mellanox.com>
> Sent: Sunday, October 28, 2018 10:37
> To: Tom Barbette; dev@dpdk.org; Ferruh Yigit; Thomas Monjalon; Andrew Rybchenko; olivier.matz@6wind.com
> Cc: Yongseok Koh
> Subject: RE: [PATCH v4] mlx5: Support for rte_eth_rx_queue_count
> 
> Hi Tom,
> 
> Adding ethdev maintainers and Olivier as the author of the new API.
> 
> 
> Saturday, October 27, 2018 6:11 PM, Tom Barbette:
> > Subject: [PATCH v4] mlx5: Support for rte_eth_rx_queue_count
> >
> 
> I have a more basic question.
> The rte_eth_rx_queue_count API is very old, more or less from the
> beginning of DPDK.
> We have progressed since then, and a newer API to check the descriptor
> status was introduced in DPDK 17.05: rte_eth_rx_descriptor_status, see commit [1].
> There is also a plan to deprecate the old API once all drivers support the new one.
> 
> With the new API you can check the number of available descriptors in a
> similar way.
> So my question is: is the new API enough for your functionality? If not, what
> is it missing? I would prefer to improve the new one instead of starting to
> support the old one.
> 
> 
> [1]
> commit b1b700ce7d6fe0db9152f73e8e00394fc2756e60
> Author: Olivier Matz <olivier.matz@6wind.com>
> Date:   Wed Mar 29 10:36:28 2017 +0200
> 
>     ethdev: add descriptor status API
> 
>     Introduce a new API to get the status of a descriptor.
> 
>     For Rx, it is almost similar to rx_descriptor_done API, except it
>     differentiates "used" descriptors (which are hold by the driver and not
>     returned to the hardware).
> 
>     For Tx, it is a new API.
> 
>     The descriptor_done() API, and probably the rx_queue_count() API could
>     be replaced by this new API as soon as it is implemented on all PMDs.
> 
>     Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>     Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
> 
> 
> > This patch adds support for the rx_queue_count API in mlx5 driver
> >
> > Changes in v2:
> >   * Fixed styling issues
> >   * Fix missing return
> >
> > Changes in v3:
> >   * Fix styling comments and checks as per Yongseok Koh
> >     <yskoh@mellanox.com> comments. Thanks !
> >
> > Changes in v4:
> >   * Fix compiling issue because of a line that disappeared in v3
> >
> > Signed-off-by: Tom Barbette <barbette@kth.se>
> > ---
> >  drivers/net/mlx5/mlx5.c      |  1 +
> >  drivers/net/mlx5/mlx5_rxtx.c | 78
> > ++++++++++++++++++++++++++++++++++++++------
> >  drivers/net/mlx5/mlx5_rxtx.h |  1 +
> >  3 files changed, 70 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> > ec63bc6..6fccadd 100644
> > --- a/drivers/net/mlx5/mlx5.c
> > +++ b/drivers/net/mlx5/mlx5.c
> > @@ -375,6 +375,7 @@ const struct eth_dev_ops mlx5_dev_ops = {
> >       .filter_ctrl = mlx5_dev_filter_ctrl,
> >       .rx_descriptor_status = mlx5_rx_descriptor_status,
> >       .tx_descriptor_status = mlx5_tx_descriptor_status,
> > +     .rx_queue_count = mlx5_rx_queue_count,
> >       .rx_queue_intr_enable = mlx5_rx_intr_enable,
> >       .rx_queue_intr_disable = mlx5_rx_intr_disable,
> >       .is_removed = mlx5_is_removed,
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.c
> > b/drivers/net/mlx5/mlx5_rxtx.c index 2d14f8a..2126205 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.c
> > +++ b/drivers/net/mlx5/mlx5_rxtx.c
> > @@ -417,20 +417,17 @@ mlx5_tx_descriptor_status(void *tx_queue,
> > uint16_t offset)  }
> >
> >  /**
> > - * DPDK callback to check the status of a rx descriptor.
> > + * Internal function to compute the number of used descriptors in an
> > + RX queue
> >   *
> > - * @param rx_queue
> > - *   The rx queue.
> > - * @param[in] offset
> > - *   The index of the descriptor in the ring.
> > + * @param rxq
> > + *   The Rx queue.
> >   *
> >   * @return
> > - *   The status of the tx descriptor.
> > + *   The number of used rx descriptor.
> >   */
> > -int
> > -mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset)
> > +static uint32_t
> > +rx_queue_count(struct mlx5_rxq_data *rxq)
> >  {
> > -     struct mlx5_rxq_data *rxq = rx_queue;
> >       struct rxq_zip *zip = &rxq->zip;
> >       volatile struct mlx5_cqe *cqe;
> >       const unsigned int cqe_n = (1 << rxq->cqe_n); @@ -461,12 +458,73
> > @@ mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset)
> >               cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
> >       }
> >       used = RTE_MIN(used, (1U << rxq->elts_n) - 1);
> > -     if (offset < used)
> > +     return used;
> > +}
> > +
> > +/**
> > + * DPDK callback to check the status of a rx descriptor.
> > + *
> > + * @param rx_queue
> > + *   The Rx queue.
> > + * @param[in] offset
> > + *   The index of the descriptor in the ring.
> > + *
> > + * @return
> > + *   The status of the tx descriptor.
> > + */
> > +int
> > +mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset) {
> > +     struct mlx5_rxq_data *rxq = rx_queue;
> > +     struct mlx5_rxq_ctrl *rxq_ctrl =
> > +                     container_of(rxq, struct mlx5_rxq_ctrl, rxq);
> > +     struct rte_eth_dev *dev = ETH_DEV(rxq_ctrl->priv);
> > +
> > +     if (dev->rx_pkt_burst != mlx5_rx_burst) {
> > +             rte_errno = ENOTSUP;
> > +             return -rte_errno;
> > +     }
> > +     if (offset >= (1 << rxq->elts_n)) {
> > +             rte_errno = EINVAL;
> > +             return -rte_errno;
> > +     }
> > +     if (offset < rx_queue_count(rxq))
> >               return RTE_ETH_RX_DESC_DONE;
> >       return RTE_ETH_RX_DESC_AVAIL;
> >  }
> >
> >  /**
> > + * DPDK callback to get the number of used descriptors in a RX queue
> > + *
> > + * @param dev
> > + *   Pointer to the device structure.
> > + *
> > + * @param rx_queue_id
> > + *   The Rx queue.
> > + *
> > + * @return
> > + *   The number of used rx descriptor.
> > + *   -EINVAL if the queue is invalid
> > + */
> > +uint32_t
> > +mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id) {
> > +     struct priv *priv = dev->data->dev_private;
> > +     struct mlx5_rxq_data *rxq;
> > +
> > +     if (dev->rx_pkt_burst != mlx5_rx_burst) {
> > +             rte_errno = ENOTSUP;
> > +             return -rte_errno;
> > +     }
> > +     rxq = (*priv->rxqs)[rx_queue_id];
> > +     if (!rxq) {
> > +             rte_errno = EINVAL;
> > +             return -rte_errno;
> > +     }
> > +     return rx_queue_count(rxq);
> > +}
> > +
> > +/**
> >   * DPDK callback for TX.
> >   *
> >   * @param dpdk_txq
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.h
> > b/drivers/net/mlx5/mlx5_rxtx.h index 48ed2b2..c82059b 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx.h
> > @@ -345,6 +345,7 @@ uint16_t removed_rx_burst(void *dpdk_rxq, struct
> > rte_mbuf **pkts,
> >                         uint16_t pkts_n);  int
> > mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset);  int
> > mlx5_tx_descriptor_status(void *tx_queue, uint16_t offset);
> > +uint32_t mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t
> > +rx_queue_id);
> >
> >  /* Vectorized version of mlx5_rxtx.c */  int
> > mlx5_check_raw_vec_tx_support(struct rte_eth_dev *dev);
> > --
> > 2.7.4


Thread overview: 8+ messages
2018-10-27 15:10 Tom Barbette
2018-10-28  8:58 ` Tom Barbette
2018-10-28  9:37 ` Shahaf Shuler
2018-10-31  9:01   ` Tom Barbette
2018-11-01  7:21     ` Shahaf Shuler [this message]
2018-11-05  9:01       ` Tom Barbette
2018-11-05  9:55         ` Olivier Matz
2018-11-05 13:18           ` Shahaf Shuler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DB7PR05MB4426685FAA18491A43449207C3CE0@DB7PR05MB4426.eurprd05.prod.outlook.com \
    --to=shahafs@mellanox.com \
    --cc=arybchenko@solarflare.com \
    --cc=barbette@kth.se \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=olivier.matz@6wind.com \
    --cc=thomas@monjalon.net \
    --cc=yskoh@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
