* [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors
@ 2016-11-24  9:54 Olivier Matz
  2016-11-24  9:54 ` [dpdk-dev] [RFC 1/9] ethdev: clarify api comments of rx queue count Olivier Matz
                   ` (10 more replies)
  0 siblings, 11 replies; 72+ messages in thread
From: Olivier Matz @ 2016-11-24  9:54 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
This RFC patchset introduces a new ethdev API function
rte_eth_tx_queue_count() which is the tx counterpart of
rte_eth_rx_queue_count(). It implements this API on some
Intel drivers for reference, and it also optimizes the
implementation of rte_eth_rx_queue_count().
The usage of these functions can be:
- on Rx, anticipate that the cpu is not fast enough to process
  all incoming packets, and take action to solve the
  problem (add more cpus, drop specific packets, ...)
- on Tx, detect that the link is overloaded, and take action
  to solve the problem (notify flow control, drop specific
  packets)
The tests I've done (instrumenting testpmd) show that browsing
the descriptors linearly is slow when the ring size increases.
Accessing the head/tail registers through pci is also slow
whatever the size of the ring. A binary search is a good compromise
that gives quite good performance whatever the size of the ring.
The remaining questions are:
- should we keep this name? I'd say "queue_count" is quite confusing,
  and I would expect it returns the number of queues, not the
  number of used descriptors
- how shall we count the used descriptors, knowing that the driver
  can hold some to free them by bulk, which reduces the effective
  size of the ring
I would be happy to have some feedback about this RFC before
I send it as a patch.
Here are some notes to help understand the code (I sometimes
take shortcuts, such as assuming 1 pkt == 1 desc).
RX side
=======
- sw advances the tail pointer
- hw advances the head pointer
- the software populates the ring with descs to buffers that are filled
  when the hw receives packets
- head == tail means there is no available buffer for hw to receive a packet
- head points to the next descriptor to be filled
- hw owns all descriptors between [head...tail]
- when a packet is written in a descriptor, the DD (descriptor done)
  bit is set, and the head is advanced
- the driver never reads the head (needs a pci transaction), instead it
  monitors the DD bit of next descriptor
- when a filled packet is retrieved by the software, the descriptor has
  to be populated with a new empty buffer. This is not done for each
  packet: the driver holds them and waits until it has many descriptors
  to populate, then does it in bulk.
  (by the way, it means that the effective size of a queue of size=N is
  lower than N since these descriptors cannot be used by the hw)
rxq->rx_tail: current value of the sw tail (the idx of the next packet to
  be received). The real tail (hw) can be different since the driver can
  hold descriptors.
rxq->nb_rx_hold: number of held descriptors
rxq->rxrearm_nb: same, but for vector driver
rxq->rx_free_thresh: when the number of held descriptors reaches this threshold,
  descriptors are populated with buffers to be filled, and sw advances the tail
Example with a ring size of 64:
|----------------------------------------------------------------|
|                    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx          |
|                    x buffers filled with data by hw x          |
|                    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx          |
|----------------------------------------------------------------|
                     ^hw_tail=20
                                    ^sw_tail=35
                                                       ^hw_head=54
                     <--- nb_hold -->
                                    <-pkts in hw queue->
The descriptors marked with 'x' have their DD bit set; the others
  (' ') reference empty buffers.
The next packet to be received by software is at index 35.
The software holds 15 descriptors that will be rearmed later.
There are 19 packets waiting in the hw queue.
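The two figures above are plain ring distances. A minimal sketch of that
arithmetic follows; ring_distance() and the two example_*() helpers are
hypothetical names used only for illustration, they are not part of the
patchset or of any driver.

```c
#include <assert.h>
#include <stdint.h>

/* Distance from index 'from' to index 'to' on a ring of 'size' entries,
 * walking forward and wrapping around at the end of the ring.
 * Hypothetical helper, for illustration only. */
uint16_t ring_distance(uint16_t from, uint16_t to, uint16_t size)
{
	assert(from < size && to < size);
	if (to >= from)
		return (uint16_t)(to - from);
	return (uint16_t)(size - from + to);
}

/* Values taken from the example above (ring size 64). */
uint16_t example_nb_hold(void)
{
	return ring_distance(20, 35, 64);	/* hw_tail -> sw_tail */
}

uint16_t example_pkts_in_hw_queue(void)
{
	return ring_distance(35, 54, 64);	/* sw_tail -> hw_head */
}
```

With the indexes of the diagram, example_nb_hold() gives 15 and
example_pkts_in_hw_queue() gives 19, matching the text.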
We want the function rx_queue_count() to return the number of
"used" descriptors. The first question is: what does that mean
exactly? Should it be pkts_in_hw_queue or pkts_in_hw_queue + nb_hold?
The current implementation returns pkts_in_hw_queue, but including
nb_hold could be useful to know how many descriptors are really
free (= size - used).
The current implementation checks the DD bit starting from sw_tail,
every 4 packets. It can be quite slow for large rings. An alternative
is to read the head register, but it's also slow.
This patchset optimizes rx_queue_count() by doing a binary
search (checking for DD) between sw_tail and hw_tail, instead of a
linear search.
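To illustrate the idea, here is a minimal sketch of a binary search on DD
bits over a simplified ring model. It is not the code of the patches: the
struct, the function name and the 'max' parameter are assumptions made for
this example, and unlike the drivers it returns an exact count instead of
stopping at a coarser resolution. It assumes that the descriptors with DD
set form one contiguous run starting at sw_tail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model of an Rx ring: dd[i] != 0 means that
 * descriptor i has its DD bit set. */
struct rx_ring_model {
	const uint8_t *dd;
	uint16_t size;		/* number of descriptors in the ring */
	uint16_t sw_tail;	/* next descriptor to be read by software */
};

/* Return the number of consecutive descriptors with DD set, starting at
 * sw_tail. 'max' is the number of descriptors that the hw may have
 * filled (for instance size - nb_hold). */
uint16_t rx_used_count(const struct rx_ring_model *r, uint16_t max)
{
	uint16_t lo = 0, hi = max;	/* invariant: result is in [lo, hi] */

	assert(max <= r->size);

	while (lo < hi) {
		uint16_t mid = (uint16_t)(lo + (hi - lo) / 2);
		uint16_t idx = (uint16_t)((r->sw_tail + mid) % r->size);

		if (r->dd[idx])
			lo = (uint16_t)(mid + 1);	/* filled: result > mid */
		else
			hi = mid;			/* empty: result <= mid */
	}

	return lo;
}
```

On the 64-entry example above (sw_tail=35, DD set on 35..53, max=49), the
search reads 6 descriptors, where a linear scan would read 20.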
TX side
=======
- sw advances the tail pointer
- hw advances the head pointer
- the software populates the ring with full buffers to be sent by
  the hw
- head points to the in-progress descriptor.
- sw writes new descriptors at tail
- head == tail means that the transmit queue is empty
- when the hw has processed a descriptor, it sets the DD bit if
  the descriptor has the RS (report status) bit.
- the driver never reads the head (needs a pci transaction), instead it
  monitors the DD bit of a descriptor that has the RS bit
txq->tx_tail: sw value for tail register
txq->tx_free_thresh: free buffers if count(free descriptors) < this value
txq->tx_rs_thresh: RS bit is set every X descriptors
txq->tx_next_dd: next desc to scan for DD bit
txq->tx_next_rs: next desc to set RS bit
txq->last_desc_cleaned: last descriptor that has been cleaned
txq->nb_tx_free: number of free descriptors
Example:
|----------------------------------------------------------------|
|               D       R       R       R                        |
|        ............xxxxxxxxxxxxxxxxxxxxxxxxx                   |
|        <descs sent><- descs not sent yet  ->                   |
|        ............xxxxxxxxxxxxxxxxxxxxxxxxx                   |
|----------------------------------------------------------------|
        ^last_desc_cleaned=8                    ^next_rs=47
                ^next_dd=15                   ^tail=45
                     ^hw_head=20
                     <----  nb_used  --------->
The hardware is currently processing the descriptor 20
'R' means the descriptor has the RS bit
'D' means the descriptor has the DD + RS bits
'x' are packets in txq (not sent)
'.' are packets already sent but not freed by sw
In this example, we have tx_rs_thresh=8. On the next call to ixgbe_tx_free_bufs(),
some buffers will be freed.
The new implementation does a binary search (checking for DD) between next_dd
and tail.
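The Tx case differs because DD is only written on descriptors that carry
the RS bit, so the search has to step by tx_rs_thresh. Below is a minimal
sketch on a simplified model, not the ixgbe code: the struct and function
names are assumptions for this example. It returns an upper bound on the
number of descriptors between next_dd and tail that the hw has not yet
reported as done, and it assumes that the RS descriptors with DD set form
one contiguous run starting at next_dd.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model of a Tx ring: dd[i] != 0 means that
 * descriptor i has DD set (only possible on RS descriptors). */
struct tx_ring_model {
	const uint8_t *dd;
	uint16_t size;		/* number of descriptors in the ring */
	uint16_t rs_thresh;	/* RS bit is set every rs_thresh descs */
	uint16_t next_dd;	/* next RS descriptor to check for DD */
	uint16_t tail;		/* sw tail */
};

uint16_t tx_pending_upper_bound(const struct tx_ring_model *t)
{
	uint16_t span, lo, hi;

	assert(t->rs_thresh > 0 && t->next_dd < t->size && t->tail < t->size);

	/* descriptors in [next_dd, tail), walking forward with wrap-around */
	span = (uint16_t)((t->tail + t->size - t->next_dd) % t->size);

	/* RS descriptors in that range sit at offsets 0, rs_thresh, ... */
	lo = 0;
	hi = (uint16_t)((span + t->rs_thresh - 1) / t->rs_thresh);

	/* binary search for the first RS descriptor without DD */
	while (lo < hi) {
		uint16_t mid = (uint16_t)(lo + (hi - lo) / 2);
		uint16_t idx = (uint16_t)((t->next_dd +
					   mid * t->rs_thresh) % t->size);

		if (t->dd[idx])
			lo = (uint16_t)(mid + 1);
		else
			hi = mid;
	}

	/* nothing reported done yet: all of [next_dd, tail) is pending */
	if (lo == 0)
		return span;

	/* everything up to the last RS descriptor with DD is done */
	return (uint16_t)(span - ((lo - 1) * t->rs_thresh + 1));
}
```

With the values of the diagram (next_dd=15, tail=45, rs_thresh=8, DD set
on descriptor 15 only), the sketch returns 29, while the real number of
unsent descriptors is tail - hw_head = 25: the result is only accurate to
within tx_rs_thresh descriptors, since the position of the head between
two RS descriptors cannot be seen without reading the register.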
Olivier Matz (9):
  ethdev: clarify api comments of rx queue count
  ethdev: move queue id check in generic layer
  ethdev: add handler for Tx queue descriptor count
  net/ixgbe: optimize Rx queue descriptor count
  net/ixgbe: add handler for Tx queue descriptor count
  net/igb: optimize rx queue descriptor count
  net/igb: add handler for tx queue descriptor count
  net/e1000: optimize rx queue descriptor count
  net/e1000: add handler for tx queue descriptor count
 drivers/net/e1000/e1000_ethdev.h |  10 +++-
 drivers/net/e1000/em_ethdev.c    |   1 +
 drivers/net/e1000/em_rxtx.c      | 109 ++++++++++++++++++++++++++++------
 drivers/net/e1000/igb_ethdev.c   |   1 +
 drivers/net/e1000/igb_rxtx.c     | 109 ++++++++++++++++++++++++++++------
 drivers/net/i40e/i40e_rxtx.c     |   5 --
 drivers/net/ixgbe/ixgbe_ethdev.c |   1 +
 drivers/net/ixgbe/ixgbe_ethdev.h |   4 +-
 drivers/net/ixgbe/ixgbe_rxtx.c   | 123 +++++++++++++++++++++++++++++++++------
 drivers/net/ixgbe/ixgbe_rxtx.h   |   2 +
 drivers/net/nfp/nfp_net.c        |   6 --
 lib/librte_ether/rte_ethdev.h    |  48 +++++++++++++--
 12 files changed, 344 insertions(+), 75 deletions(-)
-- 
2.8.1
* [dpdk-dev] [RFC 1/9] ethdev: clarify api comments of rx queue count
  2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
@ 2016-11-24  9:54 ` Olivier Matz
  2016-11-24 10:52   ` Ferruh Yigit
  2016-11-24  9:54 ` [dpdk-dev] [RFC 2/9] ethdev: move queue id check in generic layer Olivier Matz
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2016-11-24  9:54 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
The API comments are not consistent with each other.
The function rte_eth_rx_queue_count() returns the number of used
descriptors on a receive queue.
PR=52423
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
---
 lib/librte_ether/rte_ethdev.h | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 9678179..c3edc23 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1145,7 +1145,7 @@ typedef void (*eth_queue_release_t)(void *queue);
 
 typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
 					 uint16_t rx_queue_id);
-/**< @internal Get number of available descriptors on a receive queue of an Ethernet device. */
+/**< @internal Get number of used descriptors on a receive queue. */
 
 typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
 /**< @internal Check DD bit of specific RX descriptor */
@@ -1459,7 +1459,8 @@ struct eth_dev_ops {
 	eth_queue_stop_t           tx_queue_stop;/**< Stop TX for a queue.*/
 	eth_rx_queue_setup_t       rx_queue_setup;/**< Set up device RX queue.*/
 	eth_queue_release_t        rx_queue_release;/**< Release RX queue.*/
-	eth_rx_queue_count_t       rx_queue_count; /**< Get Rx queue count. */
+	eth_rx_queue_count_t       rx_queue_count;
+	/**< Get the number of used RX descriptors. */
 	eth_rx_descriptor_done_t   rx_descriptor_done;  /**< Check rxd DD bit */
 	/**< Enable Rx queue interrupt. */
 	eth_rx_enable_intr_t       rx_queue_intr_enable;
@@ -2684,7 +2685,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
 }
 
 /**
- * Get the number of used descriptors in a specific queue
+ * Get the number of used descriptors of a rx queue
  *
  * @param port_id
  *  The port identifier of the Ethernet device.
@@ -2699,9 +2700,11 @@ static inline int
 rte_eth_rx_queue_count(uint8_t port_id, uint16_t queue_id)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP);
-        return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
+
+	return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
 }
 
 /**
-- 
2.8.1
* [dpdk-dev] [RFC 2/9] ethdev: move queue id check in generic layer
  2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
  2016-11-24  9:54 ` [dpdk-dev] [RFC 1/9] ethdev: clarify api comments of rx queue count Olivier Matz
@ 2016-11-24  9:54 ` Olivier Matz
  2016-11-24 10:59   ` Ferruh Yigit
  2016-11-24  9:54 ` [dpdk-dev] [RFC 3/9] ethdev: add handler for Tx queue descriptor count Olivier Matz
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2016-11-24  9:54 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
The check of queue_id is done in all drivers implementing
rte_eth_rx_queue_count(). Factorize this check in the generic function.
Note that the nfp driver was doing the check differently, which could
induce crashes if the queue index was too big.
By the way, also move the is_supported test before the port valid and
queue valid test.
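As an illustration of the factorization, here is a self-contained sketch
in which the generic wrapper validates its arguments once and the driver
callback relies on that. All names (model_dev, model_devices,
model_rx_queue_count, ...) are hypothetical and do not exist in ethdev,
and the order of the checks (port first, then handler, then queue) is a
choice of this example, not necessarily the one made in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_PORTS 4	/* hypothetical number of ports */

struct model_dev;
typedef uint32_t (*model_rx_queue_count_t)(struct model_dev *dev,
					   uint16_t queue_id);

/* Hypothetical, reduced stand-in for an ethdev port. */
struct model_dev {
	int attached;				/* non-zero if port is valid */
	uint16_t nb_rx_queues;
	model_rx_queue_count_t rx_queue_count;	/* NULL if unsupported */
	const uint32_t *used_per_queue;		/* fake driver state */
};

struct model_dev model_devices[MODEL_MAX_PORTS];

/* Driver side: no queue_id check anymore, the caller guarantees it. */
uint32_t model_drv_rx_queue_count(struct model_dev *dev, uint16_t queue_id)
{
	assert(queue_id < dev->nb_rx_queues);
	return dev->used_per_queue[queue_id];
}

/* Generic side: all argument checks are done once, here. */
int model_rx_queue_count(uint8_t port_id, uint16_t queue_id)
{
	struct model_dev *dev;

	if (port_id >= MODEL_MAX_PORTS || !model_devices[port_id].attached)
		return -EINVAL;
	dev = &model_devices[port_id];
	if (dev->rx_queue_count == NULL)
		return -ENOTSUP;
	if (queue_id >= dev->nb_rx_queues)
		return -EINVAL;

	return (int)dev->rx_queue_count(dev, queue_id);
}
```

With this layout, a driver that forgets the check can no longer index
rx_queues[] out of bounds, since the invalid queue_id never reaches it.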
PR=52423
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
---
 drivers/net/e1000/em_rxtx.c    | 5 -----
 drivers/net/e1000/igb_rxtx.c   | 5 -----
 drivers/net/i40e/i40e_rxtx.c   | 5 -----
 drivers/net/ixgbe/ixgbe_rxtx.c | 5 -----
 drivers/net/nfp/nfp_net.c      | 6 ------
 lib/librte_ether/rte_ethdev.h  | 6 ++++--
 6 files changed, 4 insertions(+), 28 deletions(-)
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c
index 41f51c0..c1c724b 100644
--- a/drivers/net/e1000/em_rxtx.c
+++ b/drivers/net/e1000/em_rxtx.c
@@ -1390,11 +1390,6 @@ eth_em_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	struct em_rx_queue *rxq;
 	uint32_t desc = 0;
 
-	if (rx_queue_id >= dev->data->nb_rx_queues) {
-		PMD_RX_LOG(DEBUG, "Invalid RX queue_id=%d", rx_queue_id);
-		return 0;
-	}
-
 	rxq = dev->data->rx_queues[rx_queue_id];
 	rxdp = &(rxq->rx_ring[rxq->rx_tail]);
 
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index dbd37ac..e9aa356 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -1512,11 +1512,6 @@ eth_igb_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	struct igb_rx_queue *rxq;
 	uint32_t desc = 0;
 
-	if (rx_queue_id >= dev->data->nb_rx_queues) {
-		PMD_RX_LOG(ERR, "Invalid RX queue id=%d", rx_queue_id);
-		return 0;
-	}
-
 	rxq = dev->data->rx_queues[rx_queue_id];
 	rxdp = &(rxq->rx_ring[rxq->rx_tail]);
 
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 7ae7d9f..79a72f0 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1793,11 +1793,6 @@ i40e_dev_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	struct i40e_rx_queue *rxq;
 	uint16_t desc = 0;
 
-	if (unlikely(rx_queue_id >= dev->data->nb_rx_queues)) {
-		PMD_DRV_LOG(ERR, "Invalid RX queue id %u", rx_queue_id);
-		return 0;
-	}
-
 	rxq = dev->data->rx_queues[rx_queue_id];
 	rxdp = &(rxq->rx_ring[rxq->rx_tail]);
 	while ((desc < rxq->nb_rx_desc) &&
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index b2d9f45..1a8ea5f 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -2857,11 +2857,6 @@ ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	struct ixgbe_rx_queue *rxq;
 	uint32_t desc = 0;
 
-	if (rx_queue_id >= dev->data->nb_rx_queues) {
-		PMD_RX_LOG(ERR, "Invalid RX queue id=%d", rx_queue_id);
-		return 0;
-	}
-
 	rxq = dev->data->rx_queues[rx_queue_id];
 	rxdp = &(rxq->rx_ring[rxq->rx_tail]);
 
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index e315dd8..f1d00fb 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -1084,12 +1084,6 @@ nfp_net_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_idx)
 	uint32_t count;
 
 	rxq = (struct nfp_net_rxq *)dev->data->rx_queues[queue_idx];
-
-	if (rxq == NULL) {
-		PMD_INIT_LOG(ERR, "Bad queue: %u\n", queue_idx);
-		return 0;
-	}
-
 	idx = rxq->rd_p % rxq->rx_count;
 	rxds = &rxq->rxds[idx];
 
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index c3edc23..9551cfd 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2693,7 +2693,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
  *  The queue id on the specific port.
  * @return
  *  The number of used descriptors in the specific queue, or:
- *     (-EINVAL) if *port_id* is invalid
+ *     (-EINVAL) if *port_id* or *queue_id* is invalid
  *     (-ENOTSUP) if the device does not support this function
  */
 static inline int
@@ -2701,8 +2701,10 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t queue_id)
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
 
-	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP);
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
+	if (queue_id >= dev->data->nb_rx_queues)
+		return -EINVAL;
 
 	return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
 }
-- 
2.8.1
* [dpdk-dev] [RFC 3/9] ethdev: add handler for Tx queue descriptor count
  2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
  2016-11-24  9:54 ` [dpdk-dev] [RFC 1/9] ethdev: clarify api comments of rx queue count Olivier Matz
  2016-11-24  9:54 ` [dpdk-dev] [RFC 2/9] ethdev: move queue id check in generic layer Olivier Matz
@ 2016-11-24  9:54 ` Olivier Matz
  2016-11-24  9:54 ` [dpdk-dev] [RFC 4/9] net/ixgbe: optimize Rx " Olivier Matz
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2016-11-24  9:54 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
Implement the Tx counterpart of rte_eth_rx_queue_count() in ethdev API,
which returns the number of used descriptors in a Tx queue.
It can help an application to detect that a link is too slow and cannot
send at the desired rate. In this case, the application can decide to
decrease the rate, or drop the packets with the lowest priority.
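A minimal sketch of such an application-side policy is given below. The
function name, the 3/4 threshold and the priority convention (0 is the
highest priority) are assumptions made for this example. The 'used'
argument stands for the value returned by rte_eth_tx_queue_count() as
proposed in this RFC, so a negative value is an error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy: once at least 3/4 of the Tx ring is used, drop
 * every packet that does not have the highest priority (0).
 * 'used' is the return value of the Tx queue count function; if it is
 * negative (error or not supported), never drop. */
bool tx_should_drop(int used, uint16_t ring_size, uint8_t priority)
{
	assert(ring_size > 0);

	if (used < 0)
		return false;
	if (priority == 0)
		return false;

	return (uint32_t)used * 4 >= (uint32_t)ring_size * 3;
}
```

For a ring of 512 descriptors, the threshold is reached at 384 used
descriptors. An application would typically sample the count once per
burst rather than once per packet, since the call itself has a cost.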
PR=52423
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
---
 lib/librte_ether/rte_ethdev.h | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 9551cfd..8244807 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1147,6 +1147,10 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
 					 uint16_t rx_queue_id);
 /**< @internal Get number of used descriptors on a receive queue. */
 
+typedef uint32_t (*eth_tx_queue_count_t)(struct rte_eth_dev *dev,
+					 uint16_t tx_queue_id);
+/**< @internal Get number of used descriptors on a transmit queue */
+
 typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
 /**< @internal Check DD bit of specific RX descriptor */
 
@@ -1461,6 +1465,8 @@ struct eth_dev_ops {
 	eth_queue_release_t        rx_queue_release;/**< Release RX queue.*/
 	eth_rx_queue_count_t       rx_queue_count;
 	/**< Get the number of used RX descriptors. */
+	eth_tx_queue_count_t       tx_queue_count;
+	/**< Get the number of used TX descriptors. */
 	eth_rx_descriptor_done_t   rx_descriptor_done;  /**< Check rxd DD bit */
 	/**< Enable Rx queue interrupt. */
 	eth_rx_enable_intr_t       rx_queue_intr_enable;
@@ -2710,6 +2716,31 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t queue_id)
 }
 
 /**
+ * Get the number of used descriptors of a tx queue
+ *
+ * @param port_id
+ *  The port identifier of the Ethernet device.
+ * @param queue_id
+ *  The queue id on the specific port.
+ * @return
+ *  - number of used descriptors if positive or zero
+ *  - (-EINVAL) if *port_id* or *queue_id* is invalid.
+ *  - (-ENOTSUP) if the device does not support this function
+ */
+static inline int
+rte_eth_tx_queue_count(uint8_t port_id, uint16_t queue_id)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_count, -ENOTSUP);
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
+	if (queue_id >= dev->data->nb_tx_queues)
+		return -EINVAL;
+
+	return (*dev->dev_ops->tx_queue_count)(dev, queue_id);
+}
+
+/**
  * Check if the DD bit of the specific RX descriptor in the queue has been set
  *
  * @param port_id
-- 
2.8.1
* [dpdk-dev] [RFC 4/9] net/ixgbe: optimize Rx queue descriptor count
  2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
                   ` (2 preceding siblings ...)
  2016-11-24  9:54 ` [dpdk-dev] [RFC 3/9] ethdev: add handler for Tx queue descriptor count Olivier Matz
@ 2016-11-24  9:54 ` Olivier Matz
  2016-11-24  9:54 ` [dpdk-dev] [RFC 5/9] net/ixgbe: add handler for Tx " Olivier Matz
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2016-11-24  9:54 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
Use a binary search algorithm to find the first empty DD bit. The
ring-empty and ring-full cases are managed separately as they are more
likely to happen.
PR=52423
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 63 ++++++++++++++++++++++++++++++++----------
 1 file changed, 48 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 1a8ea5f..07509b4 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -2852,25 +2852,58 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
 uint32_t
 ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 {
-#define IXGBE_RXQ_SCAN_INTERVAL 4
-	volatile union ixgbe_adv_rx_desc *rxdp;
+	volatile uint32_t *status;
 	struct ixgbe_rx_queue *rxq;
-	uint32_t desc = 0;
+	uint32_t offset, interval, nb_hold, resolution;
+	int32_t idx;
 
 	rxq = dev->data->rx_queues[rx_queue_id];
-	rxdp = &(rxq->rx_ring[rxq->rx_tail]);
-
-	while ((desc < rxq->nb_rx_desc) &&
-		(rxdp->wb.upper.status_error &
-			rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) {
-		desc += IXGBE_RXQ_SCAN_INTERVAL;
-		rxdp += IXGBE_RXQ_SCAN_INTERVAL;
-		if (rxq->rx_tail + desc >= rxq->nb_rx_desc)
-			rxdp = &(rxq->rx_ring[rxq->rx_tail +
-				desc - rxq->nb_rx_desc]);
-	}
 
-	return desc;
+	/* check if ring empty */
+	idx = rxq->rx_tail;
+	status = &rxq->rx_ring[idx].wb.upper.status_error;
+	if (!(*status & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
+		return 0;
+
+	/* decrease the precision if ring is large */
+	if (rxq->nb_rx_desc <= 256)
+		resolution = 4;
+	else
+		resolution = 16;
+
+	/* check if ring full */
+#ifdef RTE_IXGBE_INC_VECTOR
+	if (rxq->rx_using_sse)
+		nb_hold = rxq->rxrearm_nb;
+	else
+#endif
+		nb_hold = rxq->nb_rx_hold;
+
+	idx = rxq->rx_tail - nb_hold - resolution;
+	if (idx < 0)
+		idx += rxq->nb_rx_desc;
+	status = &rxq->rx_ring[idx].wb.upper.status_error;
+	if (*status & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))
+		return rxq->nb_rx_desc;
+
+	/* use a binary search */
+	interval = (rxq->nb_rx_desc - nb_hold) >> 1;
+	offset = interval;
+
+	do {
+		idx = rxq->rx_tail + offset;
+		if (idx >= rxq->nb_rx_desc)
+			idx -= rxq->nb_rx_desc;
+
+		interval >>= 1;
+		status = &rxq->rx_ring[idx].wb.upper.status_error;
+		if (*status & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))
+			offset += interval;
+		else
+			offset -= interval;
+	} while (interval >= resolution);
+
+	return offset;
 }
 
 int
-- 
2.8.1
* [dpdk-dev] [RFC 5/9] net/ixgbe: add handler for Tx queue descriptor count
  2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
                   ` (3 preceding siblings ...)
  2016-11-24  9:54 ` [dpdk-dev] [RFC 4/9] net/ixgbe: optimize Rx " Olivier Matz
@ 2016-11-24  9:54 ` Olivier Matz
  2016-11-24  9:54 ` [dpdk-dev] [RFC 6/9] net/igb: optimize rx " Olivier Matz
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2016-11-24  9:54 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
Like for Rx, use a binary search algorithm to get the number of used Tx
descriptors.
PR=52423
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
---
 drivers/net/ixgbe/ixgbe_ethdev.c |  1 +
 drivers/net/ixgbe/ixgbe_ethdev.h |  4 ++-
 drivers/net/ixgbe/ixgbe_rxtx.c   | 57 ++++++++++++++++++++++++++++++++++++++++
 drivers/net/ixgbe/ixgbe_rxtx.h   |  2 ++
 4 files changed, 63 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index baffc71..0ba098a 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -553,6 +553,7 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
 	.rx_queue_intr_disable = ixgbe_dev_rx_queue_intr_disable,
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_queue_count       = ixgbe_dev_rx_queue_count,
+	.tx_queue_count       = ixgbe_dev_tx_queue_count,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
 	.tx_queue_setup       = ixgbe_dev_tx_queue_setup,
 	.tx_queue_release     = ixgbe_dev_tx_queue_release,
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 4ff6338..e060c3d 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -348,7 +348,9 @@ int  ixgbe_dev_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 		const struct rte_eth_txconf *tx_conf);
 
 uint32_t ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev,
-		uint16_t rx_queue_id);
+				  uint16_t rx_queue_id);
+uint32_t ixgbe_dev_tx_queue_count(struct rte_eth_dev *dev,
+				  uint16_t tx_queue_id);
 
 int ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
 int ixgbevf_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 07509b4..5bf6b1a 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -2437,6 +2437,7 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev *dev,
 
 	txq->nb_tx_desc = nb_desc;
 	txq->tx_rs_thresh = tx_rs_thresh;
+	txq->tx_rs_thresh_div = nb_desc / tx_rs_thresh;
 	txq->tx_free_thresh = tx_free_thresh;
 	txq->pthresh = tx_conf->tx_thresh.pthresh;
 	txq->hthresh = tx_conf->tx_thresh.hthresh;
@@ -2906,6 +2907,62 @@ ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	return offset;
 }
 
+uint32_t
+ixgbe_dev_tx_queue_count(struct rte_eth_dev *dev, uint16_t tx_queue_id)
+{
+	struct ixgbe_tx_queue *txq;
+	uint32_t status;
+	int32_t offset, interval, idx = 0;
+	int32_t max_offset, used_desc;
+
+	txq = dev->data->tx_queues[tx_queue_id];
+
+	/* if DD on next threshold desc is not set, assume used packets
+	 * are pending.
+	 */
+	status = txq->tx_ring[txq->tx_next_dd].wb.status;
+	if (!(status & rte_cpu_to_le_32(IXGBE_ADVTXD_STAT_DD)))
+		return txq->nb_tx_desc - txq->nb_tx_free - 1;
+
+	/* browse DD bits between tail starting from tx_next_dd: we have
+	 * to be careful since DD bits are only set every tx_rs_thresh
+	 * descriptor.
+	 */
+	interval = txq->tx_rs_thresh_div >> 1;
+	offset = interval * txq->tx_rs_thresh;
+
+	/* don't go beyond tail */
+	max_offset = txq->tx_tail - txq->tx_next_dd;
+	if (max_offset < 0)
+		max_offset += txq->nb_tx_desc;
+
+	do {
+		interval >>= 1;
+
+		if (offset >= max_offset) {
+			offset -= (interval * txq->tx_rs_thresh);
+			continue;
+		}
+
+		idx = txq->tx_next_dd + offset;
+		if (idx >= txq->nb_tx_desc)
+			idx -= txq->nb_tx_desc;
+
+		status = txq->tx_ring[idx].wb.status;
+		if (status & rte_cpu_to_le_32(IXGBE_ADVTXD_STAT_DD))
+			offset += (interval * txq->tx_rs_thresh);
+		else
+			offset -= (interval * txq->tx_rs_thresh);
+	} while (interval > 0);
+
+	/* idx is now the index of the head */
+	used_desc = txq->tx_tail - idx;
+	if (used_desc < 0)
+		used_desc += txq->nb_tx_desc;
+
+	return used_desc;
+}
+
 int
 ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset)
 {
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/ixgbe/ixgbe_rxtx.h
index 2608b36..f69b5de 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.h
+++ b/drivers/net/ixgbe/ixgbe_rxtx.h
@@ -221,6 +221,8 @@ struct ixgbe_tx_queue {
 	uint16_t            tx_free_thresh;
 	/** Number of TX descriptors to use before RS bit is set. */
 	uint16_t            tx_rs_thresh;
+	/** Number of TX descriptors divided by tx_rs_thresh. */
+	uint16_t            tx_rs_thresh_div;
 	/** Number of TX descriptors used since RS bit was set. */
 	uint16_t            nb_tx_used;
 	/** Index to last TX descriptor to have been cleaned. */
-- 
2.8.1
* [dpdk-dev] [RFC 6/9] net/igb: optimize rx queue descriptor count
  2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
                   ` (4 preceding siblings ...)
  2016-11-24  9:54 ` [dpdk-dev] [RFC 5/9] net/ixgbe: add handler for Tx " Olivier Matz
@ 2016-11-24  9:54 ` Olivier Matz
  2016-11-24  9:54 ` [dpdk-dev] [RFC 7/9] net/igb: add handler for tx " Olivier Matz
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2016-11-24  9:54 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
Use a binary search algorithm to find the first empty DD bit. The
ring-empty and ring-full cases are managed separately as they are more
likely to happen.
PR=52423
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
---
 drivers/net/e1000/igb_rxtx.c | 55 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 41 insertions(+), 14 deletions(-)
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index e9aa356..6b0111f 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -1507,24 +1507,51 @@ eth_igb_rx_queue_setup(struct rte_eth_dev *dev,
 uint32_t
 eth_igb_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 {
-#define IGB_RXQ_SCAN_INTERVAL 4
-	volatile union e1000_adv_rx_desc *rxdp;
+	volatile uint32_t *status;
 	struct igb_rx_queue *rxq;
-	uint32_t desc = 0;
+	uint32_t offset, interval, resolution;
+	int32_t idx;
 
 	rxq = dev->data->rx_queues[rx_queue_id];
-	rxdp = &(rxq->rx_ring[rxq->rx_tail]);
-
-	while ((desc < rxq->nb_rx_desc) &&
-		(rxdp->wb.upper.status_error & E1000_RXD_STAT_DD)) {
-		desc += IGB_RXQ_SCAN_INTERVAL;
-		rxdp += IGB_RXQ_SCAN_INTERVAL;
-		if (rxq->rx_tail + desc >= rxq->nb_rx_desc)
-			rxdp = &(rxq->rx_ring[rxq->rx_tail +
-				desc - rxq->nb_rx_desc]);
-	}
 
-	return desc;
+	/* check if ring empty */
+	idx = rxq->rx_tail;
+	status = &rxq->rx_ring[idx].wb.upper.status_error;
+	if (!(*status & rte_cpu_to_le_32(E1000_RXD_STAT_DD)))
+		return 0;
+
+	/* decrease the precision if ring is large */
+	if (rxq->nb_rx_desc <= 256)
+		resolution = 4;
+	else
+		resolution = 16;
+
+	/* check if ring full */
+	idx = rxq->rx_tail - rxq->nb_rx_hold - resolution;
+	if (idx < 0)
+		idx += rxq->nb_rx_desc;
+	status = &rxq->rx_ring[idx].wb.upper.status_error;
+	if (*status & rte_cpu_to_le_32(E1000_RXD_STAT_DD))
+		return rxq->nb_rx_desc;
+
+	/* use a binary search */
+	interval = (rxq->nb_rx_desc - rxq->nb_rx_hold) >> 1;
+	offset = interval;
+
+	do {
+		idx = rxq->rx_tail + offset;
+		if (idx >= rxq->nb_rx_desc)
+			idx -= rxq->nb_rx_desc;
+
+		interval >>= 1;
+		status = &rxq->rx_ring[idx].wb.upper.status_error;
+		if (*status & rte_cpu_to_le_32(E1000_RXD_STAT_DD))
+			offset += interval;
+		else
+			offset -= interval;
+	} while (interval >= resolution);
+
+	return offset;
 }
 
 int
-- 
2.8.1
* [dpdk-dev] [RFC 7/9] net/igb: add handler for tx queue descriptor count
  2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
                   ` (5 preceding siblings ...)
  2016-11-24  9:54 ` [dpdk-dev] [RFC 6/9] net/igb: optimize rx " Olivier Matz
@ 2016-11-24  9:54 ` Olivier Matz
  2016-11-24  9:54 ` [dpdk-dev] [RFC 8/9] net/e1000: optimize rx " Olivier Matz
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2016-11-24  9:54 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
Like for Rx, use a binary search algorithm to get the number of used Tx
descriptors.
PR=52423
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
---
 drivers/net/e1000/e1000_ethdev.h |  5 +++-
 drivers/net/e1000/igb_ethdev.c   |  1 +
 drivers/net/e1000/igb_rxtx.c     | 51 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 56 insertions(+), 1 deletion(-)
diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index 6c25c8d..ad9ddaf 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -300,7 +300,10 @@ int eth_igb_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
 		struct rte_mempool *mb_pool);
 
 uint32_t eth_igb_rx_queue_count(struct rte_eth_dev *dev,
-		uint16_t rx_queue_id);
+				uint16_t rx_queue_id);
+
+uint32_t eth_igb_tx_queue_count(struct rte_eth_dev *dev,
+				uint16_t tx_queue_id);
 
 int eth_igb_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 08f2a68..a54d374 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -399,6 +399,7 @@ static const struct eth_dev_ops eth_igb_ops = {
 	.rx_queue_intr_disable = eth_igb_rx_queue_intr_disable,
 	.rx_queue_release     = eth_igb_rx_queue_release,
 	.rx_queue_count       = eth_igb_rx_queue_count,
+	.tx_queue_count       = eth_igb_tx_queue_count,
 	.rx_descriptor_done   = eth_igb_rx_descriptor_done,
 	.tx_queue_setup       = eth_igb_tx_queue_setup,
 	.tx_queue_release     = eth_igb_tx_queue_release,
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 6b0111f..2ff2417 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -1554,6 +1554,57 @@ eth_igb_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	return offset;
 }
 
+uint32_t
+eth_igb_tx_queue_count(struct rte_eth_dev *dev, uint16_t tx_queue_id)
+{
+	volatile uint32_t *status;
+	struct igb_tx_queue *txq;
+	int32_t offset, interval, idx, resolution;
+
+	txq = dev->data->tx_queues[tx_queue_id];
+
+	/* check if ring empty */
+	idx = txq->tx_tail - 1;
+	if (idx < 0)
+		idx += txq->nb_tx_desc;
+	status = &txq->tx_ring[idx].wb.status;
+	if (*status & rte_cpu_to_le_32(E1000_TXD_STAT_DD))
+		return 0;
+
+	/* check if ring full */
+	idx = txq->tx_tail + 1;
+	if (idx >= txq->nb_tx_desc)
+		idx -= txq->nb_tx_desc;
+	status = &txq->tx_ring[idx].wb.status;
+	if (!(*status & rte_cpu_to_le_32(E1000_TXD_STAT_DD)))
+		return txq->nb_tx_desc;
+
+	/* decrease the precision if ring is large */
+	if (txq->nb_tx_desc <= 256)
+		resolution = 4;
+	else
+		resolution = 16;
+
+	/* use a binary search */
+	interval = txq->nb_tx_desc >> 1;
+	offset = interval;
+
+	do {
+		interval >>= 1;
+		idx = txq->tx_tail + offset;
+		if (idx >= txq->nb_tx_desc)
+			idx -= txq->nb_tx_desc;
+
+		status = &txq->tx_ring[idx].wb.status;
+		if (*status & rte_cpu_to_le_32(E1000_TXD_STAT_DD))
+			offset += interval;
+		else
+			offset -= interval;
+	} while (interval >= resolution);
+
+	return txq->nb_tx_desc - offset;
+}
+
 int
 eth_igb_rx_descriptor_done(void *rx_queue, uint16_t offset)
 {
-- 
2.8.1
^ permalink raw reply	[flat|nested] 72+ messages in thread
* [dpdk-dev] [RFC 8/9] net/e1000: optimize rx queue descriptor count
  2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
                   ` (6 preceding siblings ...)
  2016-11-24  9:54 ` [dpdk-dev] [RFC 7/9] net/igb: add handler for tx " Olivier Matz
@ 2016-11-24  9:54 ` Olivier Matz
  2016-11-24  9:54 ` [dpdk-dev] [RFC 9/9] net/e1000: add handler for tx " Olivier Matz
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2016-11-24  9:54 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
Use a binary search algorithm to find the first empty DD bit. The
ring-empty and ring-full cases are managed separately as they are more
likely to happen.
PR=52423
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
---
 drivers/net/e1000/em_rxtx.c | 55 +++++++++++++++++++++++++++++++++------------
 1 file changed, 41 insertions(+), 14 deletions(-)
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c
index c1c724b..a469fd7 100644
--- a/drivers/net/e1000/em_rxtx.c
+++ b/drivers/net/e1000/em_rxtx.c
@@ -1385,24 +1385,51 @@ eth_em_rx_queue_setup(struct rte_eth_dev *dev,
 uint32_t
 eth_em_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 {
-#define EM_RXQ_SCAN_INTERVAL 4
-	volatile struct e1000_rx_desc *rxdp;
+	volatile uint8_t *status;
 	struct em_rx_queue *rxq;
-	uint32_t desc = 0;
+	uint32_t offset, interval, resolution;
+	int32_t idx;
 
 	rxq = dev->data->rx_queues[rx_queue_id];
-	rxdp = &(rxq->rx_ring[rxq->rx_tail]);
-
-	while ((desc < rxq->nb_rx_desc) &&
-		(rxdp->status & E1000_RXD_STAT_DD)) {
-		desc += EM_RXQ_SCAN_INTERVAL;
-		rxdp += EM_RXQ_SCAN_INTERVAL;
-		if (rxq->rx_tail + desc >= rxq->nb_rx_desc)
-			rxdp = &(rxq->rx_ring[rxq->rx_tail +
-				desc - rxq->nb_rx_desc]);
-	}
 
-	return desc;
+	/* check if ring empty */
+	idx = rxq->rx_tail;
+	status = &rxq->rx_ring[idx].status;
+	if (!(*status & E1000_RXD_STAT_DD))
+		return 0;
+
+	/* decrease the precision if ring is large */
+	if (rxq->nb_rx_desc <= 256)
+		resolution = 4;
+	else
+		resolution = 16;
+
+	/* check if ring full */
+	idx = rxq->rx_tail - rxq->nb_rx_hold - resolution;
+	if (idx < 0)
+		idx += rxq->nb_rx_desc;
+	status = &rxq->rx_ring[idx].status;
+	if (*status & E1000_RXD_STAT_DD)
+		return rxq->nb_rx_desc;
+
+	/* use a binary search */
+	interval = (rxq->nb_rx_desc - rxq->nb_rx_hold) >> 1;
+	offset = interval;
+
+	do {
+		idx = rxq->rx_tail + offset;
+		if (idx >= rxq->nb_rx_desc)
+			idx -= rxq->nb_rx_desc;
+
+		interval >>= 1;
+		status = &rxq->rx_ring[idx].status;
+		if (*status & E1000_RXD_STAT_DD)
+			offset += interval;
+		else
+			offset -= interval;
+	} while (interval >= resolution);
+
+	return offset;
 }
 
 int
-- 
2.8.1
^ permalink raw reply	[flat|nested] 72+ messages in thread
* [dpdk-dev] [RFC 9/9] net/e1000: add handler for tx queue descriptor count
  2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
                   ` (7 preceding siblings ...)
  2016-11-24  9:54 ` [dpdk-dev] [RFC 8/9] net/e1000: optimize rx " Olivier Matz
@ 2016-11-24  9:54 ` Olivier Matz
  2017-01-13 16:44 ` [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
  10 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2016-11-24  9:54 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
As for Rx, use a binary search algorithm to get the number of used Tx
descriptors.
PR=52423
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
---
 drivers/net/e1000/e1000_ethdev.h |  5 +++-
 drivers/net/e1000/em_ethdev.c    |  1 +
 drivers/net/e1000/em_rxtx.c      | 51 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 56 insertions(+), 1 deletion(-)
diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index ad9ddaf..8945916 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -364,7 +364,10 @@ int eth_em_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
 		struct rte_mempool *mb_pool);
 
 uint32_t eth_em_rx_queue_count(struct rte_eth_dev *dev,
-		uint16_t rx_queue_id);
+			       uint16_t rx_queue_id);
+
+uint32_t eth_em_tx_queue_count(struct rte_eth_dev *dev,
+			       uint16_t tx_queue_id);
 
 int eth_em_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 866a5cf..7fe5e3b 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -190,6 +190,7 @@ static const struct eth_dev_ops eth_em_ops = {
 	.rx_queue_setup       = eth_em_rx_queue_setup,
 	.rx_queue_release     = eth_em_rx_queue_release,
 	.rx_queue_count       = eth_em_rx_queue_count,
+	.tx_queue_count       = eth_em_tx_queue_count,
 	.rx_descriptor_done   = eth_em_rx_descriptor_done,
 	.tx_queue_setup       = eth_em_tx_queue_setup,
 	.tx_queue_release     = eth_em_tx_queue_release,
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c
index a469fd7..8afcfda 100644
--- a/drivers/net/e1000/em_rxtx.c
+++ b/drivers/net/e1000/em_rxtx.c
@@ -1432,6 +1432,57 @@ eth_em_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	return offset;
 }
 
+uint32_t
+eth_em_tx_queue_count(struct rte_eth_dev *dev, uint16_t tx_queue_id)
+{
+	volatile uint8_t *status;
+	struct em_tx_queue *txq;
+	int32_t offset, interval, idx, resolution;
+
+	txq = dev->data->tx_queues[tx_queue_id];
+
+	/* check if ring empty */
+	idx = txq->tx_tail - 1;
+	if (idx < 0)
+		idx += txq->nb_tx_desc;
+	status = &txq->tx_ring[idx].upper.fields.status;
+	if (*status & E1000_TXD_STAT_DD)
+		return 0;
+
+	/* check if ring full */
+	idx = txq->tx_tail + 1;
+	if (idx >= txq->nb_tx_desc)
+		idx -= txq->nb_tx_desc;
+	status = &txq->tx_ring[idx].upper.fields.status;
+	if (!(*status & E1000_TXD_STAT_DD))
+		return txq->nb_tx_desc;
+
+	/* decrease the precision if ring is large */
+	if (txq->nb_tx_desc <= 256)
+		resolution = 4;
+	else
+		resolution = 16;
+
+	/* use a binary search */
+	offset = txq->nb_tx_desc >> 1;
+	interval = offset;
+
+	do {
+		idx = txq->tx_tail + offset;
+		if (idx >= txq->nb_tx_desc)
+			idx -= txq->nb_tx_desc;
+
+		interval >>= 1;
+		status = &txq->tx_ring[idx].upper.fields.status;
+		if (*status & E1000_TXD_STAT_DD)
+			offset += interval;
+		else
+			offset -= interval;
+	} while (interval >= resolution);
+
+	return txq->nb_tx_desc - offset;
+}
+
 int
 eth_em_rx_descriptor_done(void *rx_queue, uint16_t offset)
 {
-- 
2.8.1
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [RFC 1/9] ethdev: clarify api comments of rx queue count
  2016-11-24  9:54 ` [dpdk-dev] [RFC 1/9] ethdev: clarify api comments of rx queue count Olivier Matz
@ 2016-11-24 10:52   ` Ferruh Yigit
  2016-11-24 11:13     ` Olivier Matz
  0 siblings, 1 reply; 72+ messages in thread
From: Ferruh Yigit @ 2016-11-24 10:52 UTC (permalink / raw)
  To: Olivier Matz, dev
  Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
On 11/24/2016 9:54 AM, Olivier Matz wrote:
> The API comments are not consistent between each other.
> 
> The function rte_eth_rx_queue_count() returns the number of used
> descriptors on a receive queue.
> 
> PR=52423
What is this marker?
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> Acked-by: Ivan Boule <ivan.boule@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [RFC 2/9] ethdev: move queue id check in generic layer
  2016-11-24  9:54 ` [dpdk-dev] [RFC 2/9] ethdev: move queue id check in generic layer Olivier Matz
@ 2016-11-24 10:59   ` Ferruh Yigit
  2016-11-24 13:05     ` Olivier Matz
  0 siblings, 1 reply; 72+ messages in thread
From: Ferruh Yigit @ 2016-11-24 10:59 UTC (permalink / raw)
  To: Olivier Matz, dev
  Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
On 11/24/2016 9:54 AM, Olivier Matz wrote:
> The check of queue_id is done in all drivers implementing
> rte_eth_rx_queue_count(). Factorize this check in the generic function.
> 
> Note that the nfp driver was doing the check differently, which could
> induce crashes if the queue index was too big.
> 
> By the way, also move the is_supported test before the port valid and
> queue valid test.
> 
> PR=52423
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> Acked-by: Ivan Boule <ivan.boule@6wind.com>
> ---
<...>
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index c3edc23..9551cfd 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -2693,7 +2693,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
>   *  The queue id on the specific port.
>   * @return
>   *  The number of used descriptors in the specific queue, or:
> - *     (-EINVAL) if *port_id* is invalid
> + *     (-EINVAL) if *port_id* or *queue_id* is invalid
>   *     (-ENOTSUP) if the device does not support this function
>   */
>  static inline int
> @@ -2701,8 +2701,10 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t queue_id)
>  {
>  	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>  
> -	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
>  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP);
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
Doing port validity check before accessing dev->dev_ops->rx_queue_count
can be good idea.
What about validating port_id even before accessing
rte_eth_devices[port_id]?
> +	if (queue_id >= dev->data->nb_rx_queues)
> +		return -EINVAL;
>  
>  	return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
>  }
> 
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [RFC 1/9] ethdev: clarify api comments of rx queue count
  2016-11-24 10:52   ` Ferruh Yigit
@ 2016-11-24 11:13     ` Olivier Matz
  0 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2016-11-24 11:13 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
On Thu, 2016-11-24 at 10:52 +0000, Ferruh Yigit wrote:
> On 11/24/2016 9:54 AM, Olivier Matz wrote:
> > The API comments are not consistent between each other.
> > 
> > The function rte_eth_rx_queue_count() returns the number of used
> > descriptors on a receive queue.
> > 
> > PR=52423
> 
> What is this marker?
> 
Sorry, this is a mistake, it's an internal marker...
I hoped nobody would notice it ;)
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > Acked-by: Ivan Boule <ivan.boule@6wind.com>
> 
> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 
Thanks for reviewing!
Regards,
Olivier
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [RFC 2/9] ethdev: move queue id check in generic layer
  2016-11-24 10:59   ` Ferruh Yigit
@ 2016-11-24 13:05     ` Olivier Matz
  0 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2016-11-24 13:05 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang
Hi Ferruh,
On Thu, 2016-11-24 at 10:59 +0000, Ferruh Yigit wrote:
> On 11/24/2016 9:54 AM, Olivier Matz wrote:
> > The check of queue_id is done in all drivers implementing
> > rte_eth_rx_queue_count(). Factorize this check in the generic
> > function.
> > 
> > Note that the nfp driver was doing the check differently, which
> > could
> > induce crashes if the queue index was too big.
> > 
> > By the way, also move the is_supported test before the port valid
> > and
> > queue valid test.
> > 
> > PR=52423
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > Acked-by: Ivan Boule <ivan.boule@6wind.com>
> > ---
> 
> <...>
> 
> > diff --git a/lib/librte_ether/rte_ethdev.h
> > b/lib/librte_ether/rte_ethdev.h
> > index c3edc23..9551cfd 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -2693,7 +2693,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t
> > queue_id,
> >   *  The queue id on the specific port.
> >   * @return
> >   *  The number of used descriptors in the specific queue, or:
> > - *     (-EINVAL) if *port_id* is invalid
> > + *     (-EINVAL) if *port_id* or *queue_id* is invalid
> >   *     (-ENOTSUP) if the device does not support this function
> >   */
> >  static inline int
> > @@ -2701,8 +2701,10 @@ rte_eth_rx_queue_count(uint8_t port_id,
> > uint16_t queue_id)
> >  {
> >  	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> >  
> > -	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count,
> > -ENOTSUP);
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> 
> Doing port validity check before accessing dev->dev_ops-
> >rx_queue_count
> can be good idea.
> 
> What about validating port_id even before accessing
> rte_eth_devices[port_id]?
> 
oops right, we should not move this line, it's stupid...
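The corrected ordering can be sketched as follows. This is only a
standalone illustration, not the ethdev code: MAX_PORTS, struct dev,
devices[], fake_count() and rx_queue_count_checked() are hypothetical
stand-ins for the real table and macros, and the "attached" flag is an
assumption about how a valid port is recognized.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 4 /* hypothetical table size */

struct dev {
	int attached;                          /* non-zero if the port is in use */
	uint16_t nb_rx_queues;                 /* number of configured Rx queues */
	int (*rx_queue_count)(uint16_t queue); /* NULL if the driver lacks it */
};

static struct dev devices[MAX_PORTS];

/* hypothetical driver handler, used only to exercise the wrapper */
static int
fake_count(uint16_t queue)
{
	(void)queue;
	return 7;
}

static int
rx_queue_count_checked(uint16_t port_id, uint16_t queue_id)
{
	struct dev *dev;

	/* 1. validate the index before it is used to address the table */
	if (port_id >= MAX_PORTS || !devices[port_id].attached)
		return -EINVAL;
	dev = &devices[port_id];

	/* 2. only now is it safe to look at the ops pointer */
	if (dev->rx_queue_count == NULL)
		return -ENOTSUP;

	/* 3. the queue id check lives in the generic layer */
	if (queue_id >= dev->nb_rx_queues)
		return -EINVAL;

	return dev->rx_queue_count(queue_id);
}
```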
Thanks for the feedback,
Olivier
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors
  2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
                   ` (8 preceding siblings ...)
  2016-11-24  9:54 ` [dpdk-dev] [RFC 9/9] net/e1000: add handler for tx " Olivier Matz
@ 2017-01-13 16:44 ` Olivier Matz
  2017-01-13 17:32   ` Richardson, Bruce
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
  10 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-01-13 16:44 UTC (permalink / raw)
  To: dev
  Cc: thomas.monjalon, konstantin.ananyev, wenzhuo.lu, helin.zhang,
	Richardson, Bruce
Hi,
On Thu, 24 Nov 2016 10:54:12 +0100, Olivier Matz
<olivier.matz@6wind.com> wrote:
> This RFC patchset introduces a new ethdev API function
> rte_eth_tx_queue_count() which is the tx counterpart of
> rte_eth_rx_queue_count(). It implements this API on some
> Intel drivers for reference, and it also optimizes the
> implementation of rte_eth_rx_queue_count().
> 
I'm planning to send a new version of this patchset, fixing the issues
seen by Ferruh, plus a bug fix in the e1000 implementation.
Does anyone have any comment about the new API or about questions
raised in the cover letter? Especially about the real meaning of "used
descriptor": should it include the descriptors held by the driver?
Any comment about the method (binary search to find the used
descriptors)?
I'm also wondering about adding rte_eth_tx_descriptor_done() in the API
at the same time.
Regards,
Olivier
> The usage of these functions can be:
> - on Rx, anticipate that the cpu is not fast enough to process
>   all incoming packets, and take dispositions to solve the
>   problem (add more cpus, drop specific packets, ...)
> - on Tx, detect that the link is overloaded, and take dispositions
>   to solve the problem (notify flow control, drop specific
>   packets)
> 
> The tests I've done (instrumenting testpmd) show that browsing
> the descriptors linearly is slow when the ring size increases.
> Accessing the head/tail registers through pci is also slow
> whatever the size of the ring. A binary search is a good compromise
> that gives quite good performance whatever the size of the ring.
> 
> Remaining question are about:
> - should we keep this name? I'd say "queue_count" is quite confusing,
>   and I would expect it returns the number of queues, not the
>   number of used descriptors
> - how shall we count the used descriptors, knowing that the driver
>   can hold some to free them by bulk, which reduces the effective
>   size of the ring
> 
> I would be happy to have some feedback about this RFC before
> I send it as a patch.
> 
> Here are some helpers to understand the code more easily (I sometimes
> make some shortcuts between like 1 pkt == 1 desc).
> 
> RX side
> =======
> 
> - sw advances the tail pointer
> - hw advances the head pointer
> - the software populates the ring with descs to buffers that are
> filled when the hw receives packets
> - head == tail means there is no available buffer for hw to receive a
> packet
> - head points to the next descriptor to be filled
> - hw owns all descriptors between [head...tail]
> - when a packet is written in a descriptor, the DD (descriptor done)
>   bit is set, and the head is advanced
> - the driver never reads the head (needs a pci transaction), instead
> it monitors the DD bit of next descriptor
> - when a filled packet is retrieved by the software, the descriptor
> has to be populated with a new empty buffer. This is not done for each
>   packet: the driver holds them and waits until it has many
> descriptors to populate, and do it by bulk.
>   (by the way, it means that the effective size a queue of size=N is
>   lower than N since these descriptors cannot be used by the hw)
> 
> rxq->rx_tail: current value of the sw tail (the idx of the next
> packet to be received). The real tail (hw) can be different since the
> driver can hold descriptors.
> rxq->nb_rx_hold: number of held descriptors
> rxq->rxrearm_nb: same, but for vector driver
> rxq->rx_free_thresh: when the number of held descriptors reaches this
> threshold, descriptors are populated with buffers to be filled, and
> sw advances the tail
> 
> Example with a ring size of 64:
> 
> |----------------------------------------------------------------|
> |                    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx          |
> |                    x buffers filled with data by hw x          |
> |                    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx          |
> |----------------------------------------------------------------|
>                      ^hw_tail=20
>                                     ^sw_tail=35
>                                                        ^hw_head=54
>                      <--- nb_hold -->
>                                     <-pkts in hw queue->
> 
> The descriptors marked with 'x' has their DD bit set, the other
>   (' ') reference empty buffers.
> The next packet to be received by software is at index 35.
> The software holds 15 descriptors that will be rearmed later.
> There are 19 packets waiting in the hw queue.
> 
> We want the function rx_queue_count() to return the number of
> "used" descriptors. The first question is: what does that mean
> exactly? Should it be pkts_in_hw_queue or pkts_in_hw_queue + nb_hold?
> The current implementation returns pkts_in_hw_queue, but including
> nb_hold could be useful to know how many descriptors are really
> free (= size - used).
> 
> The current implementation checks the DD bit starting from sw_tail,
> every 4 packets. It can be quite slow for large rings. An alternative
> is to read the head register, but it's also slow.
> 
> This patchset optimizes rx_queue_count() by doing a binary
> search (checking for DD) between sw_tail and hw_tail, instead of a
> linear search.
> 
> TX side
> =======
> 
> - sw advances the tail pointer
> - hw advances the head pointer
> - the software populates the ring with full buffers to be sent by
>   the hw
> - head points to the in-progress descriptor.
> - sw writes new descriptors at tail
> - head == tail means that the transmit queue is empty
> - when the hw has processed a descriptor, it sets the DD bit if
>   the descriptor has the RS (report status) bit.
> - the driver never reads the head (needs a pci transaction), instead
> it monitors the DD bit of a descriptor that has the RS bit
> 
> txq->tx_tail: sw value for tail register
> txq->tx_free_thresh: free buffers if count(free descriptors) < this
> value txq->tx_rs_thresh: RS bit is set every X descriptor
> txq->tx_next_dd: next desc to scan for DD bit
> txq->tx_next_rs: next desc to set RS bit
> txq->last_desc_cleaned: last descriptor that have been cleaned
> txq->nb_tx_free: number of free descriptors
> 
> Example:
> 
> |----------------------------------------------------------------|
> |               D       R       R       R                        |
> |        ............xxxxxxxxxxxxxxxxxxxxxxxxx                   |
> |        <descs sent><- descs not sent yet  ->                   |
> |        ............xxxxxxxxxxxxxxxxxxxxxxxxx                   |
> |----------------------------------------------------------------|
>         ^last_desc_cleaned=8                    ^next_rs=47
>                 ^next_dd=15                   ^tail=45
>                      ^hw_head=20
> 
>                      <----  nb_used  --------->
> 
> The hardware is currently processing the descriptor 20
> 'R' means the descriptor has the RS bit
> 'D' means the descriptor has the DD + RS bits
> 'x' are packets in txq (not sent)
> '.' are packet already sent but not freed by sw
> 
> In this example, we have rs_thres=8. On next call to
> ixgbe_tx_free_bufs(), some buffers will be freed.
> 
> The new implementation does a binary search (checking for DD) between
> next_dd and tail.
> 
> 
> 
> Olivier Matz (9):
>   ethdev: clarify api comments of rx queue count
>   ethdev: move queue id check in generic layer
>   ethdev: add handler for Tx queue descriptor count
>   net/ixgbe: optimize Rx queue descriptor count
>   net/ixgbe: add handler for Tx queue descriptor count
>   net/igb: optimize rx queue descriptor count
>   net/igb: add handler for tx queue descriptor count
>   net/e1000: optimize rx queue descriptor count
>   net/e1000: add handler for tx queue descriptor count
> 
>  drivers/net/e1000/e1000_ethdev.h |  10 +++-
>  drivers/net/e1000/em_ethdev.c    |   1 +
>  drivers/net/e1000/em_rxtx.c      | 109
> ++++++++++++++++++++++++++++------ drivers/net/e1000/igb_ethdev.c
> |   1 + drivers/net/e1000/igb_rxtx.c     | 109
> ++++++++++++++++++++++++++++------ drivers/net/i40e/i40e_rxtx.c
> |   5 -- drivers/net/ixgbe/ixgbe_ethdev.c |   1 +
>  drivers/net/ixgbe/ixgbe_ethdev.h |   4 +-
>  drivers/net/ixgbe/ixgbe_rxtx.c   | 123
> +++++++++++++++++++++++++++++++++------
> drivers/net/ixgbe/ixgbe_rxtx.h   |   2 +
> drivers/net/nfp/nfp_net.c        |   6 --
> lib/librte_ether/rte_ethdev.h    |  48 +++++++++++++-- 12 files
> changed, 344 insertions(+), 75 deletions(-)
> 
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors
  2017-01-13 16:44 ` [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
@ 2017-01-13 17:32   ` Richardson, Bruce
  2017-01-17  8:24     ` Olivier Matz
  0 siblings, 1 reply; 72+ messages in thread
From: Richardson, Bruce @ 2017-01-13 17:32 UTC (permalink / raw)
  To: Olivier Matz, dev
  Cc: thomas.monjalon, Ananyev, Konstantin, Lu, Wenzhuo, Zhang, Helin
> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Friday, January 13, 2017 4:44 PM
> To: dev@dpdk.org
> Cc: thomas.monjalon@6wind.com; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang,
> Helin <helin.zhang@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors
> 
> Hi,
> 
> On Thu, 24 Nov 2016 10:54:12 +0100, Olivier Matz <olivier.matz@6wind.com>
> wrote:
> > This RFC patchset introduces a new ethdev API function
> > rte_eth_tx_queue_count() which is the tx counterpart of
> > rte_eth_rx_queue_count(). It implements this API on some Intel drivers
> > for reference, and it also optimizes the implementation of
> > rte_eth_rx_queue_count().
> >
> 
> I'm planning to send a new version of this patchset, fixing the issues
> seen by Ferruh, plus a bug fix in the e1000 implementation.
> 
> Does anyone have any comment about the new API or about questions raised
> in the cover letter? Especially about the real meaning of "used
> descriptor": should it include the descriptors hold by the driver?
For TX, I think we just need used/unused, since for TX any driver will reuse
a slot that has been completed by the NIC, and doesn't hold the mbufs back
for buffering at all.
For RX, strictly speaking, we should have three categories, rather than
trying to work it into 2. I don't see why we can't report a slot as
used/unused/unavailable.
> 
> Any comment about the method (binary search to find the used descriptors)?
I think binary search should work ok, though linear search may work better for
smaller ranges as we can prefetch ahead since we know what we will check next.
Linear can also go backward only if we want accuracy (going forward risks having
race conditions between read and NIC write). Overall, though I think binary
search should work well enough.
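To make the comparison concrete, here is a minimal sketch of both scans
on a simulated ring. It assumes the DD flags form one contiguous run
starting at the software tail and that nothing changes during the scan
(a real NIC writes concurrently). count_done_binary() and
count_done_linear() are hypothetical helpers, and the binary search is
an exact textbook lower-bound search, not the resolution-limited loop
from the patches.

```c
#include <stdint.h>

/*
 * Count the descriptors whose DD flag is set, starting at 'tail'.
 * dd[i] != 0 means descriptor i is done. Reads O(log ring_size) flags.
 */
static uint16_t
count_done_binary(const uint8_t *dd, uint16_t ring_size, uint16_t tail)
{
	uint16_t lo = 0, hi = ring_size; /* the answer lies in [lo, hi] */

	while (lo < hi) {
		uint16_t mid = (uint16_t)(lo + (hi - lo) / 2);
		uint16_t idx = (uint16_t)((tail + mid) % ring_size);

		if (dd[idx])
			lo = (uint16_t)(mid + 1); /* run extends past mid */
		else
			hi = mid;                 /* run ends at or before mid */
	}
	return lo;
}

/* Same result with a forward linear scan, reading up to ring_size flags. */
static uint16_t
count_done_linear(const uint8_t *dd, uint16_t ring_size, uint16_t tail)
{
	uint16_t n = 0;

	while (n < ring_size && dd[(tail + n) % ring_size])
		n++;
	return n;
}
```

On a 64-entry ring the binary version reads at most 7 flags, while the
linear one reads one flag per done descriptor plus one.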
> 
> I'm also wondering about adding rte_eth_tx_descriptor_done() in the API at
> the same time.
> 
Let me switch the question around - do we need the queue_count APIs at
all, and is it not more efficient to just supply the descriptor_done() APIs?
If an app wants to know the use of the ring, and take some action based on it,
that app is going to have one or more thresholds for taking the action, right?
In that case, rather than scanning descriptors to find the absolute number
of free/used descriptors, it would be more efficient for the app to just check
the descriptor on the threshold - and take action based just on that value.
Any app that really does need the absolute value of the ring capacity can
presumably do its own binary search or linear search to determine the value
itself. However, I think just doing a done function should encourage people
to use the more efficient solution of just checking the minimum number of
descriptors needed.
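As a rough sketch of the threshold idea on a simulated ring: since the
hardware completes descriptors in order, checking the single descriptor
at offset (threshold - 1) from the tail is enough to know whether at
least 'threshold' descriptors are done. rx_ring_above_threshold() and
the dd[] flag array are hypothetical, and the sketch assumes a
contiguous run of done descriptors starting at the tail.

```c
#include <stdint.h>

/*
 * Return 1 if at least 'threshold' descriptors starting at 'tail' are
 * done, 0 otherwise. dd[i] != 0 means descriptor i has its DD flag set.
 * Only one descriptor is read, whatever the ring size.
 */
static int
rx_ring_above_threshold(const uint8_t *dd, uint16_t ring_size,
			uint16_t tail, uint16_t threshold)
{
	uint16_t idx;

	if (threshold == 0)
		return 1; /* trivially reached */
	if (threshold > ring_size)
		return 0; /* can never be reached */

	idx = (uint16_t)((tail + threshold - 1) % ring_size);
	return dd[idx] != 0;
}
```

With the 64-entry example from the cover letter (sw_tail=35, 19 packets
waiting), a threshold of 16 reports true and a threshold of 32 reports
false, each with a single descriptor read.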
Regards,
/Bruce
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors
  2017-01-13 17:32   ` Richardson, Bruce
@ 2017-01-17  8:24     ` Olivier Matz
  2017-01-17 13:56       ` Bruce Richardson
  0 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-01-17  8:24 UTC (permalink / raw)
  To: Richardson, Bruce
  Cc: dev, thomas.monjalon, Ananyev, Konstantin, Lu, Wenzhuo, Zhang, Helin
Hi,
Thanks Bruce for the comments.
On Fri, 13 Jan 2017 17:32:38 +0000, "Richardson, Bruce"
<bruce.richardson@intel.com> wrote:
> > -----Original Message-----
> > From: Olivier Matz [mailto:olivier.matz@6wind.com]
> > Sent: Friday, January 13, 2017 4:44 PM
> > To: dev@dpdk.org
> > Cc: thomas.monjalon@6wind.com; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> > Zhang, Helin <helin.zhang@intel.com>; Richardson, Bruce
> > <bruce.richardson@intel.com>
> > Subject: Re: [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors
> > 
> > Hi,
> > 
> > On Thu, 24 Nov 2016 10:54:12 +0100, Olivier Matz
> > <olivier.matz@6wind.com> wrote:  
> > > This RFC patchset introduces a new ethdev API function
> > > rte_eth_tx_queue_count() which is the tx counterpart of
> > > rte_eth_rx_queue_count(). It implements this API on some Intel
> > > drivers for reference, and it also optimizes the implementation of
> > > rte_eth_rx_queue_count().
> > >  
> > 
> > I'm planning to send a new version of this patchset, fixing the
> > issues seen by Ferruh, plus a bug fix in the e1000 implementation.
> > 
> > Does anyone have any comment about the new API or about questions
> > raised in the cover letter? Especially about the real meaning of
> > "used descriptor": should it include the descriptors hold by the
> > driver?  
> For TX, I think we just need used/unused, since for TX any driver
> will reuse a slot that has been completed by the NIC, and doesn't
> hold the mbufs back for buffering at all.
Agree
> For RX, strictly speaking, we should have three categories, rather
> than trying to work it into 2. I don't see why we can't report a slot
> as used/unused/unavailable.
With the rte_eth_rx_queue_count() API, we don't have this opportunity
since it just returns an int.
Something I found a bit strange when doing this patchset is that the
user does not have full control of the number of held buffers. With
default parameters, the effective size of a ring of 128 is 64.
So yes, we could introduce an API to retrieve the status:
used/unused/unavailable.
> > Any comment about the method (binary search to find the used
> > descriptors)?  
> 
> I think binary search should work ok, though linear search may work
> better for smaller ranges as we can prefetch ahead since we know what
> we will check next. Linear can also go backward only if we want
> accuracy (going forward risks having race conditions between read and
> NIC write). Overall, though I think binary search should work well
> enough.
> 
> > 
> > I'm also wondering about adding rte_eth_tx_descriptor_done() in the
> > API at the same time.
> >   
> 
> Let me switch the question around - do we need the queue_count APIs at
> all, and is it not more efficient to just supply the
> descriptor_done() APIs? If an app wants to know the use of the ring,
> and take some action based on it, that app is going to have one or
> more thresholds for taking the action, right? In that case, rather
> than scanning descriptors to find the absolute number of free/used
> descriptors, it would be more efficient for the app to just check the
> descriptor on the threshold - and take action based just on that
> value. 
Yes, I reached the same conclusion (...after posting the RFC patchset,
unfortunately).
> Any app that really does need the absolute value of the ring
> capacity can presumably do its own binary search or linear search to
> determine the value itself. However, I think just doing a done
> function should encourage people to use the more efficient solution
> of just checking the minimum number of descriptors needed.
The question is: now that the work is done, is there any application
that would require these absolute values? For instance, monitoring.
If not, I have no problem reworking the patchset; I just need to validate my
application with a descriptor_done() API. In this case we can also
deprecate rx_queue_count() and tx_queue_count().
The rte_eth_rx_descriptor_done() function could be updated to:
/**
 * Check the status of a RX descriptor in the queue.
 *
 * @param port_id
 *  The port identifier of the Ethernet device.
 * @param queue_id
 *  The queue id on the specific port.
 * @param offset
 *  The offset of the descriptor ID from tail (0 is the next packet to
 *  be received by the driver).
 *
 * @return
 *  - (2) Descriptor is unavailable (held by driver, not yet returned to hw)
 *  - (1) Descriptor is done (filled by hw, but not processed by the driver,
 *        i.e. in the receive queue)
 *  - (0) Descriptor is available for the hardware to receive a packet.
 *  - (-ENODEV) if *port_id* invalid.
 *  - (-ENOTSUP) if the device does not support this function
 */
 static inline int rte_eth_rx_descriptor_done(uint8_t port_id,
 	uint16_t queue_id, uint16_t offset)
A similar rte_eth_tx_descriptor_done() would be introduced:
/**
 * Check the status of a TX descriptor in the queue.
 *
 * @param port_id
 *  The port identifier of the Ethernet device.
 * @param queue_id
 *  The queue id on the specific port.
 * @param offset
 *  The offset of the descriptor ID from tail (0 is the place where the next
 *  packet will be sent).
 *
 * @return
 *  - (1) Descriptor is being processed by the hw, i.e. in the transmit queue
 *  - (0) Descriptor is available for the driver to send a packet.
 *  - (-ENODEV) if *port_id* invalid.
 *  - (-ENOTSUP) if the device does not support this function
 */
 static inline int rte_eth_tx_descriptor_done(uint8_t port_id,
 	uint16_t queue_id, uint16_t offset)
An alternative would be to rename these functions to descriptor_status()
instead of descriptor_done().
Regards,
Olivier
* Re: [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors
  2017-01-17  8:24     ` Olivier Matz
@ 2017-01-17 13:56       ` Bruce Richardson
  0 siblings, 0 replies; 72+ messages in thread
From: Bruce Richardson @ 2017-01-17 13:56 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, thomas.monjalon, Ananyev, Konstantin, Lu, Wenzhuo, Zhang, Helin
On Tue, Jan 17, 2017 at 09:24:10AM +0100, Olivier Matz wrote:
> Hi,
> 
> Thanks Bruce for the comments.
> 
> On Fri, 13 Jan 2017 17:32:38 +0000, "Richardson, Bruce"
> <bruce.richardson@intel.com> wrote:
> > > -----Original Message-----
> > > From: Olivier Matz [mailto:olivier.matz@6wind.com]
> > > Sent: Friday, January 13, 2017 4:44 PM
> > > To: dev@dpdk.org
> > > Cc: thomas.monjalon@6wind.com; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> > > Zhang, Helin <helin.zhang@intel.com>; Richardson, Bruce
> > > <bruce.richardson@intel.com>
> > > Subject: Re: [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors
> > > 
> > > Hi,
> > > 
> > > On Thu, 24 Nov 2016 10:54:12 +0100, Olivier Matz
> > > <olivier.matz@6wind.com> wrote:  
> > > > This RFC patchset introduces a new ethdev API function
> > > > rte_eth_tx_queue_count() which is the tx counterpart of
> > > > rte_eth_rx_queue_count(). It implements this API on some Intel
> > > > drivers for reference, and it also optimizes the implementation of
> > > > rte_eth_rx_queue_count().
> > > >  
> > > 
> > > I'm planning to send a new version of this patchset, fixing the
> > > issues seen by Ferruh, plus a bug fix in the e1000 implementation.
> > > 
> > > Does anyone have any comment about the new API or about questions
> > > > raised in the cover letter? Especially about the real meaning of
> > > > "used descriptor": should it include the descriptors held by the
> > > driver?  
> > For TX, I think we just need used/unused, since for TX any driver
> > will reuse a slot that has been completed by the NIC, and doesn't
> > hold the mbufs back for buffering at all.
> 
> Agree
> 
> > For RX, strictly speaking, we should have three categories, rather
> > than trying to work it into 2. I don't see why we can't report a slot
> > as used/unused/unavailable.
> 
> With the rte_eth_rx_queue_count() API, we don't have this opportunity
> since it just returns an int.
> 
> Something I found a bit strange when doing this patchset is that the
> user does not have full control over the number of held buffers. With
> default parameters, the effective size of a ring of 128 is 64.
> 
> So yes, we could introduce an API to retrieve the status:
> used/unused/unavailable.
> 
> > > Any comment about the method (binary search to find the used
> > > descriptors)?  
> > 
> > I think binary search should work ok, though linear search may work
> > better for smaller ranges as we can prefetch ahead since we know what
> > we will check next. Linear can also go backward only if we want
> > accuracy (going forward risks having race conditions between read and
> > NIC write). Overall, though I think binary search should work well
> > enough.
> > 
> > > 
> > > I'm also wondering about adding rte_eth_tx_descriptor_done() in the
> > > API at the same time.
> > >   
> > 
> > Let me switch the question around - do we need the queue_count APIs at
> > all, and is it not more efficient to just supply the
> > descriptor_done() APIs? If an app wants to know the use of the ring,
> > and take some action based on it, that app is going to have one or
> > more thresholds for taking the action, right? In that case, rather
> > than scanning descriptors to find the absolute number of free/used
> > descriptors, it would be more efficient for the app to just check the
> > descriptor on the threshold - and take action based just on that
> > value. 
> 
> Yes, I reached the same conclusion (...after posting the RFC patchset,
> unfortunately).
> 
> > Any app that really does need the absolute value of the ring
> > capacity can presumably do its own binary search or linear search to
> > determine the value itself. However, I think just doing a done
> > function should encourage people to use the more efficient solution
> > of just checking the minimum number of descriptors needed.
> 
> 
> The question is: now that the work is done, is there any application
> that would require these absolute values? For instance, monitoring.
> 
> If not, I have no problem reworking the patchset; I just need to validate my
> application with a descriptor_done() API. In this case we can also
> deprecate rx_queue_count() and tx_queue_count().
I wouldn't have a problem with deprecating these functions.
> 
> The rte_eth_rx_descriptor_done() function could be updated into:
> 
> /**
>  * Check the status of a RX descriptor in the queue.
>  *
>  * @param port_id
>  *  The port identifier of the Ethernet device.
>  * @param queue_id
>  *  The queue id on the specific port.
>  * @param offset
>  *  The offset of the descriptor ID from tail (0 is the next packet to
>  *  be received by the driver).
>  *
>  * @return
>  *  - (2) Descriptor is unavailable (held by driver, not yet returned to hw)
>  *  - (1) Descriptor is done (filled by hw, but not processed by the driver,
>  *        i.e. in the receive queue)
>  *  - (0) Descriptor is available for the hardware to receive a packet.
>  *  - (-ENODEV) if *port_id* invalid.
>  *  - (-ENOTSUP) if the device does not support this function
>  */
>  static inline int rte_eth_rx_descriptor_done(uint8_t port_id,
>  	uint16_t queue_id, uint16_t offset)
> 
> 
> A similar rte_eth_tx_descriptor_done() would be introduced:
> 
> /**
>  * Check the status of a TX descriptor in the queue.
>  *
>  * @param port_id
>  *  The port identifier of the Ethernet device.
>  * @param queue_id
>  *  The queue id on the specific port.
>  * @param offset
>  *  The offset of the descriptor ID from tail (0 is the place where the next
>  *  packet will be sent).
>  *
>  * @return
>  *  - (1) Descriptor is being processed by the hw, i.e. in the transmit queue
>  *  - (0) Descriptor is available for the driver to send a packet.
>  *  - (-ENODEV) if *port_id* invalid.
>  *  - (-ENOTSUP) if the device does not support this function
>  */
>  static inline int rte_eth_tx_descriptor_done(uint8_t port_id,
>  	uint16_t queue_id, uint16_t offset)
> 
> 
> 
> An alternative would be to rename these functions to descriptor_status()
> instead of descriptor_done().
Seems good naming to me.
/Bruce
* [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
                   ` (9 preceding siblings ...)
  2017-01-13 16:44 ` [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
@ 2017-03-01 17:19 ` Olivier Matz
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API Olivier Matz
                     ` (9 more replies)
  10 siblings, 10 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-01 17:19 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson
This patchset introduces a new ethdev API:
- rte_eth_rx_descriptor_status()
- rte_eth_tx_descriptor_status()
The Rx API aims to replace rte_eth_rx_descriptor_done(), which
does almost the same, but does not differentiate the case of a
descriptor held by the driver (not yet returned to the hw).
The usage of these functions can be:
- on Rx, anticipate that the cpu is not fast enough to process
  all incoming packets, and take measures to solve the
  problem (add more cpus, drop specific packets, ...)
- on Tx, detect that the link is overloaded, and take measures
  to solve the problem (notify flow control, drop specific
  packets)
The patchset updates ixgbe, i40e, e1000, mlx5.
The other drivers that implement the descriptor_done() API are
fm10k, sfc, virtio. They are not updated.
If the new API is accepted, descriptor_done() can be deprecated,
and examples/l3fwd-power will be updated too.
RFC->v1:
- instead of optimizing an API that returns the number of used
  descriptors like rx_queue_count(), use a simpler API that
  returns the status of a descriptor, like rx_descriptor_done().
- remove ethdev api rework (first 2 patches), they have been
  sent separately
Olivier Matz (6):
  ethdev: add descriptor status API
  net/ixgbe: implement descriptor status API
  net/e1000: implement descriptor status API (igb)
  net/e1000: implement descriptor status API (em)
  net/mlx5: implement descriptor status API
  net/i40e: implement descriptor status API
 drivers/net/e1000/e1000_ethdev.h  | 10 +++++
 drivers/net/e1000/em_ethdev.c     |  2 +
 drivers/net/e1000/em_rxtx.c       | 49 ++++++++++++++++++++++
 drivers/net/e1000/igb_ethdev.c    |  2 +
 drivers/net/e1000/igb_rxtx.c      | 46 +++++++++++++++++++++
 drivers/net/i40e/i40e_ethdev.c    |  2 +
 drivers/net/i40e/i40e_ethdev_vf.c |  2 +
 drivers/net/i40e/i40e_rxtx.c      | 56 +++++++++++++++++++++++++
 drivers/net/i40e/i40e_rxtx.h      |  4 ++
 drivers/net/ixgbe/ixgbe_ethdev.c  |  4 ++
 drivers/net/ixgbe/ixgbe_ethdev.h  |  5 +++
 drivers/net/ixgbe/ixgbe_rxtx.c    | 55 +++++++++++++++++++++++++
 drivers/net/mlx5/mlx5.c           |  2 +
 drivers/net/mlx5/mlx5_rxtx.c      | 83 +++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.h      |  2 +
 lib/librte_ether/rte_ethdev.h     | 86 +++++++++++++++++++++++++++++++++++++++
 16 files changed, 410 insertions(+)
-- 
2.8.1
* [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
@ 2017-03-01 17:19   ` Olivier Matz
  2017-03-01 18:22     ` Andrew Rybchenko
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 2/6] net/ixgbe: implement " Olivier Matz
                     ` (8 subsequent siblings)
  9 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-01 17:19 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson
Introduce a new API to get the status of a descriptor.
For Rx, it is similar to the rx_descriptor_done API, except that it
differentiates "used" descriptors (which are held by the driver and not
yet returned to the hardware).
For Tx, it is a new API.
The descriptor_done() API, and probably the rx_queue_count() API could
be replaced by this new API as soon as it is implemented on all PMDs.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_ether/rte_ethdev.h | 86 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 97f3e2d..9ac9c61 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1179,6 +1179,14 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
 typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
 /**< @internal Check DD bit of specific RX descriptor */
 
+typedef int (*eth_rx_descriptor_status_t)(struct rte_eth_dev *dev,
+	uint16_t rx_queue_id, uint16_t offset);
+/**< @internal Check the status of a Rx descriptor */
+
+typedef int (*eth_tx_descriptor_status_t)(struct rte_eth_dev *dev,
+	uint16_t tx_queue_id, uint16_t offset);
+/**< @internal Check the status of a Tx descriptor */
+
 typedef int (*eth_fw_version_get_t)(struct rte_eth_dev *dev,
 				     char *fw_version, size_t fw_size);
 /**< @internal Get firmware information of an Ethernet device. */
@@ -1483,6 +1491,10 @@ struct eth_dev_ops {
 	eth_queue_release_t        rx_queue_release; /**< Release RX queue. */
 	eth_rx_queue_count_t       rx_queue_count;/**< Get Rx queue count. */
 	eth_rx_descriptor_done_t   rx_descriptor_done; /**< Check rxd DD bit. */
+	eth_rx_descriptor_status_t rx_descriptor_status;
+	/**< Check the status of a Rx descriptor. */
+	eth_tx_descriptor_status_t tx_descriptor_status;
+	/**< Check the status of a Tx descriptor. */
 	eth_rx_enable_intr_t       rx_queue_intr_enable;  /**< Enable Rx queue interrupt. */
 	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
 	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
@@ -2768,6 +2780,80 @@ rte_eth_rx_descriptor_done(uint8_t port_id, uint16_t queue_id, uint16_t offset)
 		dev->data->rx_queues[queue_id], offset);
 }
 
+#define RTE_ETH_RX_DESC_AVAIL 0 /**< Desc available for hw. */
+#define RTE_ETH_RX_DESC_DONE  1 /**< Desc done, filled by hw. */
+#define RTE_ETH_RX_DESC_USED  2 /**< Desc used by driver. */
+
+/**
+ * Check the status of a Rx descriptor in the queue
+ *
+ * @param port_id
+ *  The port identifier of the Ethernet device.
+ * @param queue_id
+ *  The Rx queue identifier on this port.
+ * @param offset
+ *  The offset of the descriptor starting from tail (0 is the next
+ *  packet to be received by the driver).
+ *
+ * @return
+ *  - (RTE_ETH_RX_DESC_AVAIL): Descriptor is available for the hardware to
+ *    receive a packet.
+ *  - (RTE_ETH_RX_DESC_DONE): Descriptor is done, it is filled by hw, but
+ *    not yet processed by the driver (i.e. in the receive queue).
+ *  - (RTE_ETH_RX_DESC_USED): Descriptor is unavailable (held by driver,
+ *    not yet returned to hw).
+ *  - (-ENODEV) if *port_id* invalid.
+ *  - (-EINVAL) bad descriptor offset.
+ *  - (-ENOTSUP) if the device does not support this function.
+ */
+static inline int
+rte_eth_rx_descriptor_status(uint8_t port_id, uint16_t queue_id,
+	uint16_t offset)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_descriptor_status, -ENOTSUP);
+
+	return (*dev->dev_ops->rx_descriptor_status)(dev, queue_id, offset);
+}
+
+#define RTE_ETH_TX_DESC_FULL 0 /**< Desc filled by pmd for hw, waiting xmit. */
+#define RTE_ETH_TX_DESC_DONE 1 /**< Desc done, packet is transmitted. */
+
+/**
+ * Check the status of a Tx descriptor in the queue.
+ *
+ * @param port_id
+ *  The port identifier of the Ethernet device.
+ * @param queue_id
+ *  The Tx queue identifier on this port.
+ * @param offset
+ *  The offset of the descriptor starting from tail (0 is the place where
+ *  the next packet will be sent).
+ *
+ * @return
+ *  - (RTE_ETH_TX_DESC_FULL) Descriptor is being processed by the hw, i.e.
+ *    in the transmit queue.
+ *  - (RTE_ETH_TX_DESC_DONE) Hardware is done with this descriptor, it can be
+ *    reused by the driver.
+ *  - (-ENODEV) if *port_id* invalid.
+ *  - (-EINVAL) bad descriptor offset.
+ *  - (-ENOTSUP) if the device does not support this function.
+ */
+static inline int rte_eth_tx_descriptor_status(uint8_t port_id,
+	uint16_t queue_id, uint16_t offset)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_descriptor_status, -ENOTSUP);
+
+	return (*dev->dev_ops->tx_descriptor_status)(dev, queue_id, offset);
+}
+
 /**
  * Send a burst of output packets on a transmit queue of an Ethernet device.
  *
-- 
2.8.1
* [dpdk-dev] [PATCH 2/6] net/ixgbe: implement descriptor status API
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API Olivier Matz
@ 2017-03-01 17:19   ` Olivier Matz
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 3/6] net/e1000: implement descriptor status API (igb) Olivier Matz
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-01 17:19 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
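Note: the Tx side rounds (tail + offset) up to the next descriptor that
carries the RS bit before reading its DD flag. A worked sketch of that
arithmetic follows. next_rs_desc() and its parameters are hypothetical
names used for illustration only, and the final reduction into
[0, nb_desc) is an assumption added here; the diff below indexes the
ring with the rounded value directly.

```c
#include <assert.h>
#include <stdint.h>

/* Round (tail + offset) up to a multiple of rs_thresh, then wrap the
 * result into [0, nb_desc). All names here are illustrative. */
static uint16_t
next_rs_desc(uint16_t tail, uint16_t offset, uint16_t rs_thresh,
	uint16_t nb_desc)
{
	uint32_t desc;

	assert(rs_thresh != 0 && nb_desc != 0);
	assert(tail < nb_desc && offset < nb_desc);

	desc = (uint32_t)tail + offset;
	/* integer round-up to the next multiple of rs_thresh */
	desc = ((desc + rs_thresh - 1) / rs_thresh) * rs_thresh;
	/* assumption: keep the index inside the ring */
	return (uint16_t)(desc % nb_desc);
}
```

With tail = 500, offset = 20, rs_thresh = 32 and nb_desc = 512, the sum
is 520, which rounds up to 544 and wraps to index 32.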
 drivers/net/ixgbe/ixgbe_ethdev.c |  4 +++
 drivers/net/ixgbe/ixgbe_ethdev.h |  5 ++++
 drivers/net/ixgbe/ixgbe_rxtx.c   | 55 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+)
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 7169007..34bd681 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -554,6 +554,8 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_queue_count       = ixgbe_dev_rx_queue_count,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
+	.rx_descriptor_status = ixgbe_dev_rx_descriptor_status,
+	.tx_descriptor_status = ixgbe_dev_tx_descriptor_status,
 	.tx_queue_setup       = ixgbe_dev_tx_queue_setup,
 	.tx_queue_release     = ixgbe_dev_tx_queue_release,
 	.dev_led_on           = ixgbe_dev_led_on,
@@ -632,6 +634,8 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
 	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
+	.rx_descriptor_status = ixgbe_dev_rx_descriptor_status,
+	.tx_descriptor_status = ixgbe_dev_tx_descriptor_status,
 	.tx_queue_setup       = ixgbe_dev_tx_queue_setup,
 	.tx_queue_release     = ixgbe_dev_tx_queue_release,
 	.rx_queue_intr_enable = ixgbevf_dev_rx_queue_intr_enable,
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 680d5d9..085e598 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -516,6 +516,11 @@ uint32_t ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev,
 int ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
 int ixgbevf_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
+int ixgbe_dev_rx_descriptor_status(struct rte_eth_dev *dev,
+	uint16_t rx_queue_id, uint16_t offset);
+int ixgbe_dev_tx_descriptor_status(struct rte_eth_dev *dev,
+	uint16_t tx_queue_id, uint16_t offset);
+
 int ixgbe_dev_rx_init(struct rte_eth_dev *dev);
 
 void ixgbe_dev_tx_init(struct rte_eth_dev *dev);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 9502432..0826a45 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -2950,6 +2950,61 @@ ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset)
 			rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD));
 }
 
+int
+ixgbe_dev_rx_descriptor_status(struct rte_eth_dev *dev, uint16_t rx_queue_id,
+	uint16_t offset)
+{
+	volatile uint32_t *status;
+	struct ixgbe_rx_queue *rxq;
+	uint32_t nb_hold, desc;
+
+	rxq = dev->data->rx_queues[rx_queue_id];
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+#ifdef RTE_IXGBE_INC_VECTOR
+	if (rxq->rx_using_sse)
+		nb_hold = rxq->rxrearm_nb;
+	else
+#endif
+		nb_hold = rxq->nb_rx_hold;
+	if (offset >= rxq->nb_rx_desc - nb_hold)
+		return RTE_ETH_RX_DESC_USED;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].wb.upper.status_error;
+	if (*status & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+ixgbe_dev_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t tx_queue_id,
+	uint16_t offset)
+{
+	volatile uint32_t *status;
+	struct ixgbe_tx_queue *txq;
+	uint32_t desc;
+
+	txq = dev->data->tx_queues[tx_queue_id];
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	/* go to next desc that has the RS bit */
+	desc = ((desc + txq->tx_rs_thresh - 1) / txq->tx_rs_thresh) *
+		txq->tx_rs_thresh;
+	status = &txq->tx_ring[desc].wb.status;
+	if (*status & rte_cpu_to_le_32(IXGBE_ADVTXD_STAT_DD))
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
 void __attribute__((cold))
 ixgbe_dev_clear_queues(struct rte_eth_dev *dev)
 {
-- 
2.8.1
* [dpdk-dev] [PATCH 3/6] net/e1000: implement descriptor status API (igb)
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API Olivier Matz
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 2/6] net/ixgbe: implement " Olivier Matz
@ 2017-03-01 17:19   ` Olivier Matz
  2017-03-02  1:28     ` Lu, Wenzhuo
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 4/6] net/e1000: implement descriptor status API (em) Olivier Matz
                     ` (6 subsequent siblings)
  9 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-01 17:19 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/e1000/e1000_ethdev.h |  5 +++++
 drivers/net/e1000/igb_ethdev.c   |  2 ++
 drivers/net/e1000/igb_rxtx.c     | 46 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 53 insertions(+)
diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index 81a6dbb..2b63b1a 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -311,6 +311,11 @@ uint32_t eth_igb_rx_queue_count(struct rte_eth_dev *dev,
 
 int eth_igb_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
+int eth_igb_rx_descriptor_status(struct rte_eth_dev *dev,
+	uint16_t rx_queue_id, uint16_t offset);
+int eth_igb_tx_descriptor_status(struct rte_eth_dev *dev,
+	uint16_t tx_queue_id, uint16_t offset);
+
 int eth_igb_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 		uint16_t nb_tx_desc, unsigned int socket_id,
 		const struct rte_eth_txconf *tx_conf);
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index a112b38..f6ed824 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -406,6 +406,8 @@ static const struct eth_dev_ops eth_igb_ops = {
 	.rx_queue_release     = eth_igb_rx_queue_release,
 	.rx_queue_count       = eth_igb_rx_queue_count,
 	.rx_descriptor_done   = eth_igb_rx_descriptor_done,
+	.rx_descriptor_status = eth_igb_rx_descriptor_status,
+	.tx_descriptor_status = eth_igb_tx_descriptor_status,
 	.tx_queue_setup       = eth_igb_tx_queue_setup,
 	.tx_queue_release     = eth_igb_tx_queue_release,
 	.dev_led_on           = eth_igb_led_on,
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index c9cf392..838413c 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -1606,6 +1606,52 @@ eth_igb_rx_descriptor_done(void *rx_queue, uint16_t offset)
 	return !!(rxdp->wb.upper.status_error & E1000_RXD_STAT_DD);
 }
 
+int
+eth_igb_rx_descriptor_status(struct rte_eth_dev *dev, uint16_t rx_queue_id,
+	uint16_t offset)
+{
+	volatile uint32_t *status;
+	struct igb_rx_queue *rxq;
+	uint32_t desc;
+
+	rxq = dev->data->rx_queues[rx_queue_id];
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+	if (offset >= rxq->nb_rx_desc - rxq->nb_rx_hold)
+		return RTE_ETH_RX_DESC_USED;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].wb.upper.status_error;
+	if (*status & rte_cpu_to_le_32(E1000_RXD_STAT_DD))
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+eth_igb_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t tx_queue_id,
+	uint16_t offset)
+{
+	volatile uint32_t *status;
+	struct igb_tx_queue *txq;
+	uint32_t desc;
+
+	txq = dev->data->tx_queues[tx_queue_id];
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	status = &txq->tx_ring[desc].wb.status;
+	if (*status & rte_cpu_to_le_32(E1000_TXD_STAT_DD))
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
 void
 igb_dev_clear_queues(struct rte_eth_dev *dev)
 {
-- 
2.8.1
* [dpdk-dev] [PATCH 4/6] net/e1000: implement descriptor status API (em)
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
                     ` (2 preceding siblings ...)
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 3/6] net/e1000: implement descriptor status API (igb) Olivier Matz
@ 2017-03-01 17:19   ` Olivier Matz
  2017-03-02  1:22     ` Lu, Wenzhuo
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 5/6] net/mlx5: implement descriptor status API Olivier Matz
                     ` (5 subsequent siblings)
  9 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-01 17:19 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/e1000/e1000_ethdev.h |  5 ++++
 drivers/net/e1000/em_ethdev.c    |  2 ++
 drivers/net/e1000/em_rxtx.c      | 49 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 56 insertions(+)
diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index 2b63b1a..e3fd7fc 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -380,6 +380,11 @@ uint32_t eth_em_rx_queue_count(struct rte_eth_dev *dev,
 
 int eth_em_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
+int eth_em_rx_descriptor_status(struct rte_eth_dev *dev,
+	uint16_t rx_queue_id, uint16_t offset);
+int eth_em_tx_descriptor_status(struct rte_eth_dev *dev,
+	uint16_t tx_queue_id, uint16_t offset);
+
 int eth_em_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 		uint16_t nb_tx_desc, unsigned int socket_id,
 		const struct rte_eth_txconf *tx_conf);
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 4066ef9..4f34c14 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -205,6 +205,8 @@ static const struct eth_dev_ops eth_em_ops = {
 	.rx_queue_release     = eth_em_rx_queue_release,
 	.rx_queue_count       = eth_em_rx_queue_count,
 	.rx_descriptor_done   = eth_em_rx_descriptor_done,
+	.rx_descriptor_status = eth_em_rx_descriptor_status,
+	.tx_descriptor_status = eth_em_tx_descriptor_status,
 	.tx_queue_setup       = eth_em_tx_queue_setup,
 	.tx_queue_release     = eth_em_tx_queue_release,
 	.rx_queue_intr_enable = eth_em_rx_queue_intr_enable,
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c
index d099d6a..1651454 100644
--- a/drivers/net/e1000/em_rxtx.c
+++ b/drivers/net/e1000/em_rxtx.c
@@ -1473,6 +1473,55 @@ eth_em_rx_descriptor_done(void *rx_queue, uint16_t offset)
 	return !!(rxdp->status & E1000_RXD_STAT_DD);
 }
 
+int
+eth_em_rx_descriptor_status(struct rte_eth_dev *dev, uint16_t rx_queue_id,
+	uint16_t offset)
+{
+	volatile uint8_t *status;
+	struct em_rx_queue *rxq;
+	uint32_t desc;
+
+	rxq = dev->data->rx_queues[rx_queue_id];
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+	if (offset >= rxq->nb_rx_desc - rxq->nb_rx_hold)
+		return RTE_ETH_RX_DESC_USED;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].status;
+	if (*status & E1000_RXD_STAT_DD)
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+eth_em_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t tx_queue_id,
+	uint16_t offset)
+{
+	volatile uint8_t *status;
+	struct em_tx_queue *txq;
+	uint32_t desc;
+
+	txq = dev->data->tx_queues[tx_queue_id];
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	/* go to next desc that has the RS bit */
+	desc = ((desc + txq->tx_rs_thresh - 1) / txq->tx_rs_thresh) *
+		txq->tx_rs_thresh;
+	status = &txq->tx_ring[desc].upper.fields.status;
+	if (*status & E1000_TXD_STAT_DD)
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
 void
 em_dev_clear_queues(struct rte_eth_dev *dev)
 {
-- 
2.8.1
* [dpdk-dev] [PATCH 5/6] net/mlx5: implement descriptor status API
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
                     ` (3 preceding siblings ...)
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 4/6] net/e1000: implement descriptor status API (em) Olivier Matz
@ 2017-03-01 17:19   ` Olivier Matz
  2017-03-02  7:56     ` Nélio Laranjeiro
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 6/6] net/i40e: " Olivier Matz
                     ` (4 subsequent siblings)
  9 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-01 17:19 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson
Since there is no "descriptor done" flag as in the Intel drivers, the
approach is different in the mlx5 driver.
- for Tx, we call txq_complete() to free descriptors processed by
  the hw, then we check if the descriptor is between tail and head
- for Rx, we need to browse the cqes, managing compressed ones,
  to get the number of used descriptors.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/mlx5/mlx5.c      |  2 ++
 drivers/net/mlx5/mlx5_rxtx.c | 83 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.h |  2 ++
 3 files changed, 87 insertions(+)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index d4bd469..4a6450c 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -222,6 +222,8 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.rss_hash_update = mlx5_rss_hash_update,
 	.rss_hash_conf_get = mlx5_rss_hash_conf_get,
 	.filter_ctrl = mlx5_dev_filter_ctrl,
+	.rx_descriptor_status = mlx5_rx_descriptor_status,
+	.tx_descriptor_status = mlx5_tx_descriptor_status,
 };
 
 static struct {
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 88b0354..b3375f6 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -345,6 +345,89 @@ mlx5_tx_dbrec(struct txq *txq, volatile struct mlx5_wqe *wqe)
 }
 
 /**
+ * DPDK callback to check the status of a tx descriptor.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param[in] tx_queue_id
+ *   The identifier of the tx queue.
+ * @param[in] offset
+ *   The index of the descriptor in the ring.
+ *
+ * @return
+ *   The status of the tx descriptor.
+ */
+int
+mlx5_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t tx_queue_id,
+			  uint16_t offset)
+{
+	struct txq *txq = dev->data->tx_queues[tx_queue_id];
+	const unsigned int elts_n = 1 << txq->elts_n;
+	const unsigned int elts_cnt = elts_n - 1;
+	unsigned int used;
+
+	txq_complete(txq);
+
+	used = (txq->elts_head - txq->elts_tail) & elts_cnt;
+	if (offset < used)
+		return RTE_ETH_TX_DESC_FULL;
+	return RTE_ETH_TX_DESC_DONE;
+}
+
+/**
+ * DPDK callback to check the status of a rx descriptor.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param[in] rx_queue_id
+ *   The identifier of the rx queue.
+ * @param[in] offset
+ *   The index of the descriptor in the ring.
+ *
+ * @return
+ *   The status of the rx descriptor.
+ */
+int
+mlx5_rx_descriptor_status(struct rte_eth_dev *dev, uint16_t rx_queue_id,
+			  uint16_t offset)
+{
+	struct rxq *rxq = dev->data->rx_queues[rx_queue_id];
+	struct rxq_zip *zip = &rxq->zip;
+	volatile struct mlx5_cqe *cqe;
+	const unsigned int cqe_n = (1 << rxq->cqe_n);
+	const unsigned int cqe_cnt = cqe_n - 1;
+	unsigned int cq_ci;
+	unsigned int used;
+
+	/* if we are processing a compressed cqe */
+	if (zip->ai) {
+		used = zip->cqe_cnt - zip->ca;
+		cq_ci = zip->cq_ci;
+	} else {
+		used = 0;
+		cq_ci = rxq->cq_ci;
+	}
+	cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
+	while (check_cqe(cqe, cqe_n, cq_ci) == 0) {
+		int8_t op_own;
+		unsigned int n;
+
+		op_own = cqe->op_own;
+		if (MLX5_CQE_FORMAT(op_own) == MLX5_COMPRESSED)
+			n = ntohl(cqe->byte_cnt);
+		else
+			n = 1;
+		cq_ci += n;
+		used += n;
+		cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
+	}
+	used = RTE_MIN(used, (1U << rxq->elts_n) - 1);
+	if (offset < used)
+		return RTE_ETH_RX_DESC_DONE;
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+/**
  * DPDK callback for TX.
  *
  * @param dpdk_txq
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 41a34d7..e864dcd 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -323,6 +323,8 @@ uint16_t mlx5_tx_burst_mpw_inline(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst(void *, struct rte_mbuf **, uint16_t);
 uint16_t removed_tx_burst(void *, struct rte_mbuf **, uint16_t);
 uint16_t removed_rx_burst(void *, struct rte_mbuf **, uint16_t);
+int mlx5_rx_descriptor_status(struct rte_eth_dev *, uint16_t, uint16_t);
+int mlx5_tx_descriptor_status(struct rte_eth_dev *, uint16_t, uint16_t);
 
 /* mlx5_mr.c */
 
-- 
2.8.1
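
Editorial note on the Tx check in this patch: the status is derived from
the distance between head and tail rather than from a per-descriptor
flag. The sketch below shows that computation in isolation. It assumes
free-running 16-bit head and tail counters and a power-of-two ring; the
names ring_used() and tx_offset_in_flight() are hypothetical and do not
exist in the mlx5 driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of entries between tail and head in a ring of 2^log2_n
 * entries. Head and tail are free-running counters, so the subtraction
 * is reduced with the ring mask.
 */
static unsigned int
ring_used(uint16_t head, uint16_t tail, unsigned int log2_n)
{
	const unsigned int mask = (1U << log2_n) - 1;

	return (unsigned int)(head - tail) & mask;
}

/* Return 1 if the entry at 'offset' from tail is still in flight. */
static int
tx_offset_in_flight(uint16_t head, uint16_t tail, unsigned int log2_n,
	uint16_t offset)
{
	return offset < ring_used(head, tail, log2_n);
}
```

The mask makes the result correct when the counters wrap past 65535.
It also means that a ring holding exactly 2^log2_n entries reads as
empty; the patch does not say whether the driver can reach that state.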
^ permalink raw reply	[flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH 6/6] net/i40e: implement descriptor status API
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
                     ` (4 preceding siblings ...)
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 5/6] net/mlx5: implement descriptor status API Olivier Matz
@ 2017-03-01 17:19   ` Olivier Matz
  2017-03-01 18:02   ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Andrew Rybchenko
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-01 17:19 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/i40e/i40e_ethdev.c    |  2 ++
 drivers/net/i40e/i40e_ethdev_vf.c |  2 ++
 drivers/net/i40e/i40e_rxtx.c      | 56 +++++++++++++++++++++++++++++++++++++++
 drivers/net/i40e/i40e_rxtx.h      |  4 +++
 4 files changed, 64 insertions(+)
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 303027b..8b5fd54 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -479,6 +479,8 @@ static const struct eth_dev_ops i40e_eth_dev_ops = {
 	.rx_queue_release             = i40e_dev_rx_queue_release,
 	.rx_queue_count               = i40e_dev_rx_queue_count,
 	.rx_descriptor_done           = i40e_dev_rx_descriptor_done,
+	.rx_descriptor_status         = i40e_dev_rx_descriptor_status,
+	.tx_descriptor_status         = i40e_dev_tx_descriptor_status,
 	.tx_queue_setup               = i40e_dev_tx_queue_setup,
 	.tx_queue_release             = i40e_dev_tx_queue_release,
 	.dev_led_on                   = i40e_dev_led_on,
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 55fd344..d3659c9 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -217,6 +217,8 @@ static const struct eth_dev_ops i40evf_eth_dev_ops = {
 	.rx_queue_intr_enable = i40evf_dev_rx_queue_intr_enable,
 	.rx_queue_intr_disable = i40evf_dev_rx_queue_intr_disable,
 	.rx_descriptor_done   = i40e_dev_rx_descriptor_done,
+	.rx_descriptor_status = i40e_dev_rx_descriptor_status,
+	.tx_descriptor_status = i40e_dev_tx_descriptor_status,
 	.tx_queue_setup       = i40e_dev_tx_queue_setup,
 	.tx_queue_release     = i40e_dev_tx_queue_release,
 	.rx_queue_count       = i40e_dev_rx_queue_count,
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 45af0d7..b912689 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1929,6 +1929,62 @@ i40e_dev_rx_descriptor_done(void *rx_queue, uint16_t offset)
 }
 
 int
+i40e_dev_rx_descriptor_status(struct rte_eth_dev *dev, uint16_t rx_queue_id,
+	uint16_t offset)
+{
+	struct i40e_rx_queue *rxq;
+	volatile uint64_t *status;
+	uint64_t mask;
+	uint32_t desc;
+
+	rxq = dev->data->rx_queues[rx_queue_id];
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+	if (offset >= rxq->nb_rx_desc - rxq->nb_rx_hold)
+		return RTE_ETH_RX_DESC_USED;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].wb.qword1.status_error_len;
+	mask = rte_le_to_cpu_64((1ULL << I40E_RX_DESC_STATUS_DD_SHIFT)
+		<< I40E_RXD_QW1_STATUS_SHIFT);
+	if (*status & mask)
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+i40e_dev_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t tx_queue_id,
+	uint16_t offset)
+{
+	struct i40e_tx_queue *txq;
+	volatile uint64_t *status;
+	uint64_t mask, expect;
+	uint32_t desc;
+
+	txq = dev->data->tx_queues[tx_queue_id];
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	if (desc >= txq->nb_tx_desc)
+		desc -= txq->nb_tx_desc;
+
+	status = &txq->tx_ring[desc].cmd_type_offset_bsz;
+	mask = rte_le_to_cpu_64(I40E_TXD_QW1_DTYPE_MASK);
+	expect = rte_cpu_to_le_64(
+		I40E_TX_DESC_DTYPE_DESC_DONE << I40E_TXD_QW1_DTYPE_SHIFT);
+	if ((*status & mask) == expect)
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
+int
 i40e_dev_tx_queue_setup(struct rte_eth_dev *dev,
 			uint16_t queue_idx,
 			uint16_t nb_desc,
diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h
index 9df8a56..c32519e 100644
--- a/drivers/net/i40e/i40e_rxtx.h
+++ b/drivers/net/i40e/i40e_rxtx.h
@@ -246,6 +246,10 @@ void i40e_rx_queue_release_mbufs(struct i40e_rx_queue *rxq);
 uint32_t i40e_dev_rx_queue_count(struct rte_eth_dev *dev,
 				 uint16_t rx_queue_id);
 int i40e_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
+int i40e_dev_rx_descriptor_status(struct rte_eth_dev *dev, uint16_t rx_queue_id,
+	uint16_t offset);
+int i40e_dev_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t tx_queue_id,
+	uint16_t offset);
 
 uint16_t i40e_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
 			    uint16_t nb_pkts);
-- 
2.8.1
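
Editorial note on the three Rx states returned by
i40e_dev_rx_descriptor_status(): the sketch below reproduces the
decision order (bad offset, held by driver, done, available) on a mock
queue. The struct mock_rxq, its dd[] array and the DESC_* constants are
stand-ins invented for illustration; the real driver reads the DD bit
from the descriptor write-back area instead.

```c
#include <assert.h>
#include <stdint.h>

enum { DESC_AVAIL = 0, DESC_DONE = 1, DESC_USED = 2 };

/* Mock queue: only the fields needed to classify a descriptor. */
struct mock_rxq {
	uint16_t nb_desc;  /* ring size */
	uint16_t tail;     /* next descriptor the driver will read */
	uint16_t nb_hold;  /* descriptors held by the driver */
	const uint8_t *dd; /* dd[i] != 0 once hw has filled descriptor i */
};

static int
mock_rx_status(const struct mock_rxq *q, uint16_t offset)
{
	uint32_t desc;

	if (offset >= q->nb_desc)
		return -1; /* stands in for -EINVAL */

	/* The last nb_hold offsets are not yet returned to the hw. */
	if (offset >= q->nb_desc - q->nb_hold)
		return DESC_USED;

	/* tail + offset is below 2 * nb_desc: one subtraction wraps it. */
	desc = (uint32_t)q->tail + offset;
	if (desc >= q->nb_desc)
		desc -= q->nb_desc;

	return q->dd[desc] ? DESC_DONE : DESC_AVAIL;
}
```

For an 8-entry ring with tail at 6 and two descriptors on hold, offsets
6 and 7 report "used", and offset 2 wraps around to descriptor 0.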
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
                     ` (5 preceding siblings ...)
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 6/6] net/i40e: " Olivier Matz
@ 2017-03-01 18:02   ` Andrew Rybchenko
  2017-03-02 13:40     ` Olivier Matz
  2017-03-01 18:07   ` Stephen Hemminger
                     ` (2 subsequent siblings)
  9 siblings, 1 reply; 72+ messages in thread
From: Andrew Rybchenko @ 2017-03-01 18:02 UTC (permalink / raw)
  To: Olivier Matz, dev, thomas.monjalon, konstantin.ananyev,
	wenzhuo.lu, helin.zhang, jingjing.wu, adrien.mazarguil,
	nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson
On 03/01/2017 08:19 PM, Olivier Matz wrote:
> This patchset introduces a new ethdev API:
> - rte_eth_rx_descriptor_status()
> - rte_eth_tx_descriptor_status()
Maybe the corresponding features should be added to the NICs documentation?
> The Rx API aims to replace rte_eth_rx_descriptor_done() which
> does almost the same, but does not differentiate the case of a
> descriptor used by the driver (not returned to the hw).
>
> The usage of these functions can be:
> - on Rx, anticipate that the cpu is not fast enough to process
>    all incoming packets, and take dispositions to solve the
>    problem (add more cpus, drop specific packets, ...)
> - on Tx, detect that the link is overloaded, and take dispositions
>    to solve the problem (notify flow control, drop specific
>    packets)
>
> The patchset updates ixgbe, i40e, e1000, mlx5.
> The other drivers that implement the descriptor_done() API are
> fm10k, sfc, virtio. They are not updated.
> If the new API is accepted, the descriptor_done() can be deprecated,
> and examples/l3fwd-power will be updated too.
>
>
> RFC->v1:
> - instead of optimizing an API that returns the number of used
>    descriptors like rx_queue_count(), use a more simple API that
>    returns the status of a descriptor, like rx_descriptor_done().
> - remove ethdev api rework (first 2 patches), they have been
>    sent separately
>
>
> Olivier Matz (6):
>    ethdev: add descriptor status API
>    net/ixgbe: implement descriptor status API
>    net/e1000: implement descriptor status API (igb)
>    net/e1000: implement descriptor status API (em)
>    net/mlx5: implement descriptor status API
>    net/i40e: implement descriptor status API
>
>   drivers/net/e1000/e1000_ethdev.h  | 10 +++++
>   drivers/net/e1000/em_ethdev.c     |  2 +
>   drivers/net/e1000/em_rxtx.c       | 49 ++++++++++++++++++++++
>   drivers/net/e1000/igb_ethdev.c    |  2 +
>   drivers/net/e1000/igb_rxtx.c      | 46 +++++++++++++++++++++
>   drivers/net/i40e/i40e_ethdev.c    |  2 +
>   drivers/net/i40e/i40e_ethdev_vf.c |  2 +
>   drivers/net/i40e/i40e_rxtx.c      | 56 +++++++++++++++++++++++++
>   drivers/net/i40e/i40e_rxtx.h      |  4 ++
>   drivers/net/ixgbe/ixgbe_ethdev.c  |  4 ++
>   drivers/net/ixgbe/ixgbe_ethdev.h  |  5 +++
>   drivers/net/ixgbe/ixgbe_rxtx.c    | 55 +++++++++++++++++++++++++
>   drivers/net/mlx5/mlx5.c           |  2 +
>   drivers/net/mlx5/mlx5_rxtx.c      | 83 +++++++++++++++++++++++++++++++++++++
>   drivers/net/mlx5/mlx5_rxtx.h      |  2 +
>   lib/librte_ether/rte_ethdev.h     | 86 +++++++++++++++++++++++++++++++++++++++
>   16 files changed, 410 insertions(+)
>
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
                     ` (6 preceding siblings ...)
  2017-03-01 18:02   ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Andrew Rybchenko
@ 2017-03-01 18:07   ` Stephen Hemminger
  2017-03-02 13:43     ` Olivier Matz
  2017-03-02 15:32   ` Bruce Richardson
  2017-03-07 15:59   ` [dpdk-dev] [PATCH v2 " Olivier Matz
  9 siblings, 1 reply; 72+ messages in thread
From: Stephen Hemminger @ 2017-03-01 18:07 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit, bruce.richardson
On Wed,  1 Mar 2017 18:19:06 +0100
Olivier Matz <olivier.matz@6wind.com> wrote:
> This patchset introduces a new ethdev API:
> - rte_eth_rx_descriptor_status()
> - rte_eth_tx_descriptor_status()
> 
> The Rx API aims to replace rte_eth_rx_descriptor_done() which
> does almost the same, but does not differentiate the case of a
> descriptor used by the driver (not returned to the hw).
> 
> The usage of these functions can be:
> - on Rx, anticipate that the cpu is not fast enough to process
>   all incoming packets, and take dispositions to solve the
>   problem (add more cpus, drop specific packets, ...)
> - on Tx, detect that the link is overloaded, and take dispositions
>   to solve the problem (notify flow control, drop specific
>   packets)
> 
> The patchset updates ixgbe, i40e, e1000, mlx5.
> The other drivers that implement the descriptor_done() API are
> fm10k, sfc, virtio. They are not updated.
> If the new API is accepted, the descriptor_done() can be deprecated,
> and examples/l3fwd-power will be updated too.
> 
> 
> RFC->v1:
> - instead of optimizing an API that returns the number of used
>   descriptors like rx_queue_count(), use a more simple API that
>   returns the status of a descriptor, like rx_descriptor_done().
> - remove ethdev api rework (first 2 patches), they have been
>   sent separately
> 
> 
> Olivier Matz (6):
>   ethdev: add descriptor status API
>   net/ixgbe: implement descriptor status API
>   net/e1000: implement descriptor status API (igb)
>   net/e1000: implement descriptor status API (em)
>   net/mlx5: implement descriptor status API
>   net/i40e: implement descriptor status API
> 
>  drivers/net/e1000/e1000_ethdev.h  | 10 +++++
>  drivers/net/e1000/em_ethdev.c     |  2 +
>  drivers/net/e1000/em_rxtx.c       | 49 ++++++++++++++++++++++
>  drivers/net/e1000/igb_ethdev.c    |  2 +
>  drivers/net/e1000/igb_rxtx.c      | 46 +++++++++++++++++++++
>  drivers/net/i40e/i40e_ethdev.c    |  2 +
>  drivers/net/i40e/i40e_ethdev_vf.c |  2 +
>  drivers/net/i40e/i40e_rxtx.c      | 56 +++++++++++++++++++++++++
>  drivers/net/i40e/i40e_rxtx.h      |  4 ++
>  drivers/net/ixgbe/ixgbe_ethdev.c  |  4 ++
>  drivers/net/ixgbe/ixgbe_ethdev.h  |  5 +++
>  drivers/net/ixgbe/ixgbe_rxtx.c    | 55 +++++++++++++++++++++++++
>  drivers/net/mlx5/mlx5.c           |  2 +
>  drivers/net/mlx5/mlx5_rxtx.c      | 83 +++++++++++++++++++++++++++++++++++++
>  drivers/net/mlx5/mlx5_rxtx.h      |  2 +
>  lib/librte_ether/rte_ethdev.h     | 86 +++++++++++++++++++++++++++++++++++++++
>  16 files changed, 410 insertions(+)
> 
Could you update examples to use this?
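
Editorial note on how an example application might use the Rx call
described in the cover letter: probing a single descriptor at a chosen
offset tells whether more than that many packets are waiting, without
counting the whole ring. The sketch is self-contained and does not link
against DPDK. The rx_status_fn type stands in for
rte_eth_rx_descriptor_status(), and rx_queue_is_congested(),
mock_status() and mock_filled are hypothetical names. It assumes the
hardware fills descriptors in order.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_RX_DESC_AVAIL 0
#define MOCK_RX_DESC_DONE  1

/* Same argument list as the proposed rte_eth_rx_descriptor_status(). */
typedef int (*rx_status_fn)(uint8_t port_id, uint16_t queue_id,
	uint16_t offset);

/*
 * Return 1 if the descriptor at 'threshold' is already filled, which
 * means more than 'threshold' packets are waiting in the queue.
 */
static int
rx_queue_is_congested(rx_status_fn status, uint8_t port_id,
	uint16_t queue_id, uint16_t threshold)
{
	return status(port_id, queue_id, threshold) == MOCK_RX_DESC_DONE;
}

/* Mock backend: pretend that 'mock_filled' packets are waiting. */
static uint16_t mock_filled;

static int
mock_status(uint8_t port_id, uint16_t queue_id, uint16_t offset)
{
	(void)port_id;
	(void)queue_id;
	return offset < mock_filled ? MOCK_RX_DESC_DONE : MOCK_RX_DESC_AVAIL;
}
```

An application could call such a check once per polling loop and react
(add a core, drop low-priority traffic) when it returns 1.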
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API Olivier Matz
@ 2017-03-01 18:22     ` Andrew Rybchenko
  2017-03-02 13:57       ` Olivier Matz
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Rybchenko @ 2017-03-01 18:22 UTC (permalink / raw)
  To: Olivier Matz, dev, thomas.monjalon, konstantin.ananyev,
	wenzhuo.lu, helin.zhang, jingjing.wu, adrien.mazarguil,
	nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson
On 03/01/2017 08:19 PM, Olivier Matz wrote:
> Introduce a new API to get the status of a descriptor.
>
> For Rx, it is almost similar to rx_descriptor_done API, except it
> differentiates "used" descriptors (which are held by the driver and not
> returned to the hardware).
>
> For Tx, it is a new API.
>
> The descriptor_done() API, and probably the rx_queue_count() API could
> be replaced by this new API as soon as it is implemented on all PMDs.
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>   lib/librte_ether/rte_ethdev.h | 86 +++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 86 insertions(+)
>
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 97f3e2d..9ac9c61 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1179,6 +1179,14 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
>   typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
>   /**< @internal Check DD bit of specific RX descriptor */
>   
> +typedef int (*eth_rx_descriptor_status_t)(struct rte_eth_dev *dev,
> +	uint16_t rx_queue_id, uint16_t offset);
> +/**< @internal Check the status of a Rx descriptor */
> +
> +typedef int (*eth_tx_descriptor_status_t)(struct rte_eth_dev *dev,
> +	uint16_t tx_queue_id, uint16_t offset);
> +/**< @internal Check the status of a Tx descriptor */
> +
>   typedef int (*eth_fw_version_get_t)(struct rte_eth_dev *dev,
>   				     char *fw_version, size_t fw_size);
>   /**< @internal Get firmware information of an Ethernet device. */
> @@ -1483,6 +1491,10 @@ struct eth_dev_ops {
>   	eth_queue_release_t        rx_queue_release; /**< Release RX queue. */
>   	eth_rx_queue_count_t       rx_queue_count;/**< Get Rx queue count. */
>   	eth_rx_descriptor_done_t   rx_descriptor_done; /**< Check rxd DD bit. */
> +	eth_rx_descriptor_status_t rx_descriptor_status;
> +	/**< Check the status of a Rx descriptor. */
> +	eth_tx_descriptor_status_t tx_descriptor_status;
> +	/**< Check the status of a Tx descriptor. */
>   	eth_rx_enable_intr_t       rx_queue_intr_enable;  /**< Enable Rx queue interrupt. */
>   	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
>   	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
> @@ -2768,6 +2780,80 @@ rte_eth_rx_descriptor_done(uint8_t port_id, uint16_t queue_id, uint16_t offset)
>   		dev->data->rx_queues[queue_id], offset);
>   }
>   
> +#define RTE_ETH_RX_DESC_AVAIL 0 /**< Desc available for hw. */
> +#define RTE_ETH_RX_DESC_DONE  1 /**< Desc done, filled by hw. */
> +#define RTE_ETH_RX_DESC_USED  2 /**< Desc used by driver. */
> +
> +/**
> + * Check the status of a Rx descriptor in the queue
I think it would be useful to highlight caller context.
Should it be the same CPU which receives packets from the queue?
> + *
> + * @param port_id
> + *  The port identifier of the Ethernet device.
> + * @param queue_id
> + *  The Rx queue identifier on this port.
> + * @param offset
> + *  The offset of the descriptor starting from tail (0 is the next
> + *  packet to be received by the driver).
> + * @return
> + *  - (RTE_ETH_RX_DESC_AVAIL): Descriptor is available for the hardware to
> + *    receive a packet.
> + *  - (RTE_ETH_RX_DESC_DONE): Descriptor is done, it is filled by hw, but
> + *    not yet processed by the driver (i.e. in the receive queue).
> + *  - (RTE_ETH_RX_DESC_USED): Descriptor is unavailable (held by driver,
> + *    not yet returned to hw).
It looks like it is the most suitable for descriptors which are reserved 
and never used.
> + *  - (-ENODEV) if *port_id* invalid.
> + *  - (-EINVAL) bad descriptor offset.
> + *  - (-ENOTSUP) if the device does not support this function.
What should be returned if queue_id is invalid?
What should be returned if the queue is stopped?
> + */
> +static inline int
> +rte_eth_rx_descriptor_status(uint8_t port_id, uint16_t queue_id,
> +	uint16_t offset)
> +{
> +	struct rte_eth_dev *dev;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	dev = &rte_eth_devices[port_id];
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_descriptor_status, -ENOTSUP);
> +
Maybe it makes sense to range check queue_id here to avoid such code in
each PMD?
> +	return (*dev->dev_ops->rx_descriptor_status)(dev, queue_id, offset);
> +}
> +
> +#define RTE_ETH_TX_DESC_FULL 0 /**< Desc filled by pmd for hw, waiting xmit. */
> +#define RTE_ETH_TX_DESC_DONE 1 /**< Desc done, packet is transmitted. */
I see no value suitable for a descriptor that is never used.
> +/**
> + * Check the status of a Tx descriptor in the queue.
> + *
> + * @param port_id
> + *  The port identifier of the Ethernet device.
> + * @param queue_id
> + *  The Tx queue identifier on this port.
> + * @param offset
> + *  The offset of the descriptor starting from tail (0 is the place where
> + *  the next packet will be sent).
> + *
> + * @return
> + *  - (RTE_ETH_TX_DESC_FULL) Descriptor is being processed by the hw, i.e.
> + *    in the transmit queue.
> + *  - (RTE_ETH_TX_DESC_DONE) Hardware is done with this descriptor, it can be
> + *    reused by the driver.
> + *  - (-ENODEV) if *port_id* invalid.
> + *  - (-EINVAL) bad descriptor offset.
> + *  - (-ENOTSUP) if the device does not support this function.
> + */
> +static inline int rte_eth_tx_descriptor_status(uint8_t port_id,
> +	uint16_t queue_id, uint16_t offset)
> +{
> +	struct rte_eth_dev *dev;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	dev = &rte_eth_devices[port_id];
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_descriptor_status, -ENOTSUP);
> +
> +	return (*dev->dev_ops->tx_descriptor_status)(dev, queue_id, offset);
> +}
> +
>   /**
>    * Send a burst of output packets on a transmit queue of an Ethernet device.
>    *
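
Editorial note on the queue_id range check suggested above: a minimal
standalone sketch of a wrapper that validates the queue index once, so
that individual PMDs would not need to. The struct mock_dev and both
function names are hypothetical stand-ins for the ethdev structures.
Returning -EINVAL for a bad queue_id is an assumption, since the patch
does not define that case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Mock device: only what the wrapper needs. */
struct mock_dev {
	uint16_t nb_rx_queues;
	int (*rx_descriptor_status)(struct mock_dev *dev,
		uint16_t queue_id, uint16_t offset);
};

/* Wrapper: reject unsupported devices and bad queue ids up front. */
static int
mock_rx_descriptor_status(struct mock_dev *dev, uint16_t queue_id,
	uint16_t offset)
{
	if (dev->rx_descriptor_status == NULL)
		return -ENOTSUP;
	if (queue_id >= dev->nb_rx_queues)
		return -EINVAL; /* assumption: errno not settled in the patch */
	return dev->rx_descriptor_status(dev, queue_id, offset);
}

/* Trivial backend used to exercise the wrapper. */
static int
always_avail(struct mock_dev *dev, uint16_t queue_id, uint16_t offset)
{
	(void)dev;
	(void)queue_id;
	(void)offset;
	return 0;
}
```

The cost is one extra comparison on each call, which may matter since
the proposed functions are inline and intended for the data path.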
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 4/6] net/e1000: implement descriptor status API (em)
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 4/6] net/e1000: implement descriptor status API (em) Olivier Matz
@ 2017-03-02  1:22     ` Lu, Wenzhuo
  2017-03-02 14:46       ` Olivier Matz
  0 siblings, 1 reply; 72+ messages in thread
From: Lu, Wenzhuo @ 2017-03-02  1:22 UTC (permalink / raw)
  To: Olivier Matz, dev, thomas.monjalon, Ananyev, Konstantin, Zhang,
	Helin, Wu, Jingjing, adrien.mazarguil, nelio.laranjeiro
  Cc: Yigit, Ferruh, Richardson, Bruce
Hi Olivier,
> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Thursday, March 2, 2017 1:19 AM
> To: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin; Lu,
> Wenzhuo; Zhang, Helin; Wu, Jingjing; adrien.mazarguil@6wind.com;
> nelio.laranjeiro@6wind.com
> Cc: Yigit, Ferruh; Richardson, Bruce
> Subject: [PATCH 4/6] net/e1000: implement descriptor status API (em)
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> +int
> +eth_em_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t tx_queue_id,
> +	uint16_t offset)
> +{
> +	volatile uint8_t *status;
> +	struct em_tx_queue *txq;
> +	uint32_t desc;
> +
> +	txq = dev->data->tx_queues[tx_queue_id];
> +	if (unlikely(offset >= txq->nb_tx_desc))
> +		return -EINVAL;
> +
> +	desc = txq->tx_tail + offset;
> +	/* go to next desc that has the RS bit */
> +	desc = ((desc + txq->tx_rs_thresh - 1) / txq->tx_rs_thresh) *
> +		txq->tx_rs_thresh;
The descriptor index may change here, so the returned status may not be for the descriptor at the requested offset. Why?
> +	status = &txq->tx_ring[desc].upper.fields.status;
> +	if (*status & E1000_TXD_STAT_DD)
> +		return RTE_ETH_TX_DESC_DONE;
> +
> +	return RTE_ETH_TX_DESC_FULL;
> +}
> +
>  void
>  em_dev_clear_queues(struct rte_eth_dev *dev)  {
> --
> 2.8.1
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 3/6] net/e1000: implement descriptor status API (igb)
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 3/6] net/e1000: implement descriptor status API (igb) Olivier Matz
@ 2017-03-02  1:28     ` Lu, Wenzhuo
  2017-03-02 13:58       ` Olivier Matz
  0 siblings, 1 reply; 72+ messages in thread
From: Lu, Wenzhuo @ 2017-03-02  1:28 UTC (permalink / raw)
  To: Olivier Matz, dev, thomas.monjalon, Ananyev, Konstantin, Zhang,
	Helin, Wu, Jingjing, adrien.mazarguil, nelio.laranjeiro
  Cc: Yigit, Ferruh, Richardson, Bruce
Hi Olivier,
> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Thursday, March 2, 2017 1:19 AM
> To: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin; Lu,
> Wenzhuo; Zhang, Helin; Wu, Jingjing; adrien.mazarguil@6wind.com;
> nelio.laranjeiro@6wind.com
> Cc: Yigit, Ferruh; Richardson, Bruce
> Subject: [PATCH 3/6] net/e1000: implement descriptor status API (igb)
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> +
> +int
> +eth_igb_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t tx_queue_id,
> +	uint16_t offset)
> +{
> +	volatile uint32_t *status;
> +	struct igb_tx_queue *txq;
> +	uint32_t desc;
> +
> +	txq = dev->data->tx_queues[tx_queue_id];
> +	if (unlikely(offset >= txq->nb_tx_desc))
> +		return -EINVAL;
> +
> +	desc = txq->tx_tail + offset;
Should we check desc against nb_tx_desc here? The same applies to em.
> +	status = &txq->tx_ring[desc].wb.status;
> +	if (*status & rte_cpu_to_le_32(E1000_TXD_STAT_DD))
> +		return RTE_ETH_TX_DESC_DONE;
> +
> +	return RTE_ETH_TX_DESC_FULL;
> +}
> +
>  void
>  igb_dev_clear_queues(struct rte_eth_dev *dev)  {
> --
> 2.8.1
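
Editorial note on the index check raised above: tx_tail + offset can
exceed the ring size when the tail is near the end of the ring. A
minimal sketch of one way to wrap the index follows. The helper name
ring_index() is hypothetical. It relies on both inputs being smaller
than nb_desc, which the offset check at the top of the function already
guarantees for offset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map (tail + offset) back into [0, nb_desc).
 * Because tail and offset are both below nb_desc, their sum is below
 * 2 * nb_desc and a single subtraction is enough (no modulo needed).
 */
static uint32_t
ring_index(uint16_t tail, uint16_t offset, uint16_t nb_desc)
{
	uint32_t desc;

	assert(tail < nb_desc && offset < nb_desc);
	desc = (uint32_t)tail + offset;
	if (desc >= nb_desc)
		desc -= nb_desc;
	return desc;
}
```

For a 512-entry ring with the tail at 510, offset 5 resolves to
descriptor 3 instead of reading past the end of the ring.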
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 5/6] net/mlx5: implement descriptor status API
  2017-03-01 17:19   ` [dpdk-dev] [PATCH 5/6] net/mlx5: implement descriptor status API Olivier Matz
@ 2017-03-02  7:56     ` Nélio Laranjeiro
  0 siblings, 0 replies; 72+ messages in thread
From: Nélio Laranjeiro @ 2017-03-02  7:56 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, ferruh.yigit,
	bruce.richardson
Hi Olivier,
On Wed, Mar 01, 2017 at 06:19:11PM +0100, Olivier Matz wrote:
> Since there is no "descriptor done" flag like on Intel drivers, the
> approach is different on mlx5 driver.
> - for Tx, we call txq_complete() to free descriptors processed by
>   the hw, then we check if the descriptor is between tail and head
> - for Rx, we need to browse the cqes, managing compressed ones,
>   to get the number of used descriptors.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>  drivers/net/mlx5/mlx5.c      |  2 ++
>  drivers/net/mlx5/mlx5_rxtx.c | 83 ++++++++++++++++++++++++++++++++++++++++++++
>  drivers/net/mlx5/mlx5_rxtx.h |  2 ++
>  3 files changed, 87 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index d4bd469..4a6450c 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -222,6 +222,8 @@ static const struct eth_dev_ops mlx5_dev_ops = {
>  	.rss_hash_update = mlx5_rss_hash_update,
>  	.rss_hash_conf_get = mlx5_rss_hash_conf_get,
>  	.filter_ctrl = mlx5_dev_filter_ctrl,
> +	.rx_descriptor_status = mlx5_rx_descriptor_status,
> +	.tx_descriptor_status = mlx5_tx_descriptor_status,
>  };
>  
>  static struct {
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 88b0354..b3375f6 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -345,6 +345,89 @@ mlx5_tx_dbrec(struct txq *txq, volatile struct mlx5_wqe *wqe)
>  }
>  
>  /**
> + * DPDK callback to check the status of a tx descriptor.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param[in] tx_queue_id
> + *   The identifier of the tx queue.
> + * @param[in] offset
> + *   The index of the descriptor in the ring.
> + *
> + * @return
> + *   The status of the tx descriptor.
> + */
> +int
> +mlx5_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t tx_queue_id,
> +			  uint16_t offset)
> +{
> +	struct txq *txq = dev->data->tx_queues[tx_queue_id];
> +	const unsigned int elts_n = 1 << txq->elts_n;
> +	const unsigned int elts_cnt = elts_n - 1;
> +	unsigned int used;
> +
> +	txq_complete(txq);
> +
If you submit a v2 please remove this empty line.
> +	used = (txq->elts_head - txq->elts_tail) & elts_cnt;
> +	if (offset < used)
> +		return RTE_ETH_TX_DESC_FULL;
> +	return RTE_ETH_TX_DESC_DONE;
> +}
> +
> +/**
> + * DPDK callback to check the status of a rx descriptor.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param[in] rx_queue_id
> + *   The identifier of the rx queue.
> + * @param[in] offset
> + *   The index of the descriptor in the ring.
> + *
> + * @return
> + *   The status of the rx descriptor.
> + */
> +int
> +mlx5_rx_descriptor_status(struct rte_eth_dev *dev, uint16_t rx_queue_id,
> +			  uint16_t offset)
> +{
> +	struct rxq *rxq = dev->data->rx_queues[rx_queue_id];
> +	struct rxq_zip *zip = &rxq->zip;
> +	volatile struct mlx5_cqe *cqe;
> +	const unsigned int cqe_n = (1 << rxq->cqe_n);
> +	const unsigned int cqe_cnt = cqe_n - 1;
> +	unsigned int cq_ci;
> +	unsigned int used;
> +
> +	/* if we are processing a compressed cqe */
> +	if (zip->ai) {
> +		used = zip->cqe_cnt - zip->ca;
> +		cq_ci = zip->cq_ci;
> +	} else {
> +		used = 0;
> +		cq_ci = rxq->cq_ci;
> +	}
> +	cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
> +	while (check_cqe(cqe, cqe_n, cq_ci) == 0) {
> +		int8_t op_own;
> +		unsigned int n;
> +
> +		op_own = cqe->op_own;
> +		if (MLX5_CQE_FORMAT(op_own) == MLX5_COMPRESSED)
> +			n = ntohl(cqe->byte_cnt);
> +		else
> +			n = 1;
> +		cq_ci += n;
> +		used += n;
> +		cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
> +	}
> +	used = RTE_MIN(used, (1U << rxq->elts_n) - 1);
> +	if (offset < used)
> +		return RTE_ETH_RX_DESC_DONE;
> +	return RTE_ETH_RX_DESC_AVAIL;
> +}
> +
> +/**
>   * DPDK callback for TX.
>   *
>   * @param dpdk_txq
> diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> index 41a34d7..e864dcd 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.h
> +++ b/drivers/net/mlx5/mlx5_rxtx.h
> @@ -323,6 +323,8 @@ uint16_t mlx5_tx_burst_mpw_inline(void *, struct rte_mbuf **, uint16_t);
>  uint16_t mlx5_rx_burst(void *, struct rte_mbuf **, uint16_t);
>  uint16_t removed_tx_burst(void *, struct rte_mbuf **, uint16_t);
>  uint16_t removed_rx_burst(void *, struct rte_mbuf **, uint16_t);
> +int mlx5_rx_descriptor_status(struct rte_eth_dev *, uint16_t, uint16_t);
> +int mlx5_tx_descriptor_status(struct rte_eth_dev *, uint16_t, uint16_t);
>  
>  /* mlx5_mr.c */
>  
> -- 
> 2.8.1
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
-- 
Nélio Laranjeiro
6WIND
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-01 18:02   ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Andrew Rybchenko
@ 2017-03-02 13:40     ` Olivier Matz
  2017-03-06 10:41       ` Thomas Monjalon
  0 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-02 13:40 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit, bruce.richardson
On Wed, 1 Mar 2017 21:02:16 +0300, Andrew Rybchenko <arybchenko@solarflare.com> wrote:
> On 03/01/2017 08:19 PM, Olivier Matz wrote:
> > This patchset introduces a new ethdev API:
> > - rte_eth_rx_descriptor_status()
> > - rte_eth_tx_descriptor_status()  
> 
> May be corresponding features should be added to the NICs documentation?
Yes, good idea.
I propose to use these straightforward names: "Rx Descriptor Status"
and "Tx Descriptor Status".
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-01 18:07   ` Stephen Hemminger
@ 2017-03-02 13:43     ` Olivier Matz
  2017-03-06 10:41       ` Thomas Monjalon
  0 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-02 13:43 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit, bruce.richardson
Hi Stephen,
On Wed, 1 Mar 2017 10:07:06 -0800, Stephen Hemminger <stephen@networkplumber.org> wrote:
> On Wed,  1 Mar 2017 18:19:06 +0100
> Olivier Matz <olivier.matz@6wind.com> wrote:
> 
> > This patchset introduces a new ethdev API:
> > - rte_eth_rx_descriptor_status()
> > - rte_eth_tx_descriptor_status()
> > 
> > The Rx API aims to replace rte_eth_rx_descriptor_done() which
> > does almost the same, but does not differentiate the case of a
> > descriptor used by the driver (not returned to the hw).
> > 
> > The usage of these functions can be:
> > - on Rx, anticipate that the cpu is not fast enough to process
> >   all incoming packets, and take dispositions to solve the
> >   problem (add more cpus, drop specific packets, ...)
> > - on Tx, detect that the link is overloaded, and take dispositions
> >   to solve the problem (notify flow control, drop specific
> >   packets)
> > 
> > The patchset updates ixgbe, i40e, e1000, mlx5.
> > The other drivers that implement the descriptor_done() API are
> > fm10k, sfc, virtio. They are not updated.
> > If the new API is accepted, the descriptor_done() can be deprecated,
> > and examples/l3fwd-power will be updated too.
> > 
> > 
> > RFC->v1:
> > - instead of optimizing an API that returns the number of used
> >   descriptors like rx_queue_count(), use a more simple API that
> >   returns the status of a descriptor, like rx_descriptor_done().
> > - remove ethdev api rework (first 2 patches), they have been
> >   sent separately
> > 
> > 
> > Olivier Matz (6):
> >   ethdev: add descriptor status API
> >   net/ixgbe: implement descriptor status API
> >   net/e1000: implement descriptor status API (igb)
> >   net/e1000: implement descriptor status API (em)
> >   net/mlx5: implement descriptor status API
> >   net/i40e: implement descriptor status API
> > 
> >  drivers/net/e1000/e1000_ethdev.h  | 10 +++++
> >  drivers/net/e1000/em_ethdev.c     |  2 +
> >  drivers/net/e1000/em_rxtx.c       | 49 ++++++++++++++++++++++
> >  drivers/net/e1000/igb_ethdev.c    |  2 +
> >  drivers/net/e1000/igb_rxtx.c      | 46 +++++++++++++++++++++
> >  drivers/net/i40e/i40e_ethdev.c    |  2 +
> >  drivers/net/i40e/i40e_ethdev_vf.c |  2 +
> >  drivers/net/i40e/i40e_rxtx.c      | 56 +++++++++++++++++++++++++
> >  drivers/net/i40e/i40e_rxtx.h      |  4 ++
> >  drivers/net/ixgbe/ixgbe_ethdev.c  |  4 ++
> >  drivers/net/ixgbe/ixgbe_ethdev.h  |  5 +++
> >  drivers/net/ixgbe/ixgbe_rxtx.c    | 55 +++++++++++++++++++++++++
> >  drivers/net/mlx5/mlx5.c           |  2 +
> >  drivers/net/mlx5/mlx5_rxtx.c      | 83 +++++++++++++++++++++++++++++++++++++
> >  drivers/net/mlx5/mlx5_rxtx.h      |  2 +
> >  lib/librte_ether/rte_ethdev.h     | 86 +++++++++++++++++++++++++++++++++++++++
> >  16 files changed, 410 insertions(+)
> >   
> 
> Could you update examples to use this?
I can update examples/l3fwd-power, but that would break support
for drivers that do not implement the new API. Maybe we could
do this in a second step, once all drivers are converted?
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API
  2017-03-01 18:22     ` Andrew Rybchenko
@ 2017-03-02 13:57       ` Olivier Matz
  2017-03-02 14:19         ` Andrew Rybchenko
  0 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-02 13:57 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit, bruce.richardson
Hi Andrew,
Thank you for the review. Comments inline.
On Wed, 1 Mar 2017 21:22:14 +0300, Andrew Rybchenko <arybchenko@solarflare.com> wrote:
> On 03/01/2017 08:19 PM, Olivier Matz wrote:
> > Introduce a new API to get the status of a descriptor.
> >
> > For Rx, it is almost similar to rx_descriptor_done API, except it
> > differentiates "used" descriptors (which are hold by the driver and not
> > returned to the hardware).
> >
> > For Tx, it is a new API.
> >
> > The descriptor_done() API, and probably the rx_queue_count() API could
> > be replaced by this new API as soon as it is implemented on all PMDs.
> >
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > ---
> >   lib/librte_ether/rte_ethdev.h | 86 +++++++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 86 insertions(+)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > index 97f3e2d..9ac9c61 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1179,6 +1179,14 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
> >   typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
> >   /**< @internal Check DD bit of specific RX descriptor */
> >   
> > +typedef int (*eth_rx_descriptor_status_t)(struct rte_eth_dev *dev,
> > +	uint16_t rx_queue_id, uint16_t offset);
> > +/**< @internal Check the status of a Rx descriptor */
> > +
> > +typedef int (*eth_tx_descriptor_status_t)(struct rte_eth_dev *dev,
> > +	uint16_t tx_queue_id, uint16_t offset);
> > +/**< @internal Check the status of a Tx descriptor */
> > +
> >   typedef int (*eth_fw_version_get_t)(struct rte_eth_dev *dev,
> >   				     char *fw_version, size_t fw_size);
> >   /**< @internal Get firmware information of an Ethernet device. */
> > @@ -1483,6 +1491,10 @@ struct eth_dev_ops {
> >   	eth_queue_release_t        rx_queue_release; /**< Release RX queue. */
> >   	eth_rx_queue_count_t       rx_queue_count;/**< Get Rx queue count. */
> >   	eth_rx_descriptor_done_t   rx_descriptor_done; /**< Check rxd DD bit. */
> > +	eth_rx_descriptor_status_t rx_descriptor_status;
> > +	/**< Check the status of a Rx descriptor. */
> > +	eth_tx_descriptor_status_t tx_descriptor_status;
> > +	/**< Check the status of a Tx descriptor. */
> >   	eth_rx_enable_intr_t       rx_queue_intr_enable;  /**< Enable Rx queue interrupt. */
> >   	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
> >   	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
> > @@ -2768,6 +2780,80 @@ rte_eth_rx_descriptor_done(uint8_t port_id, uint16_t queue_id, uint16_t offset)
> >   		dev->data->rx_queues[queue_id], offset);
> >   }
> >   
> > +#define RTE_ETH_RX_DESC_AVAIL 0 /**< Desc available for hw. */
> > +#define RTE_ETH_RX_DESC_DONE  1 /**< Desc done, filled by hw. */
> > +#define RTE_ETH_RX_DESC_USED  2 /**< Desc used by driver. */
> > +
> > +/**
> > + * Check the status of a Rx descriptor in the queue  
> 
> I think it would be useful to highlight caller context.
> Should it be the same CPU which receives packets from the queue?
Yes, you are right it would be useful. I suggest the following sentences:
  This function should be called on a dataplane core like the
  Rx function. They should not be called concurrently on the same
  queue.
> 
> > + *
> > + * @param port_id
> > + *  The port identifier of the Ethernet device.
> > + * @param queue_id
> > + *  The Rx queue identifier on this port.
> > + * @param offset
> > + *  The offset of the descriptor starting from tail (0 is the next
> > + *  packet to be received by the driver).
> > + * @return
> > + *  - (RTE_ETH_DESC_AVAIL): Descriptor is available for the hardware to
> > + *    receive a packet.
> > + *  - (RTE_ETH_DESC_DONE): Descriptor is done, it is filled by hw, but
> > + *    not yet processed by the driver (i.e. in the receive queue).
> > + *  - (RTE_ETH_DESC_USED): Descriptor is unavailable (hold by driver,
> > + *    not yet returned to hw).  
> 
> It looks like it is the most suitable for descriptors which are reserved 
> and never used.
Can you give some more details about what is a reserved but never
used descriptor? (same question for Tx)
> 
> > + *  - (-ENODEV) if *port_id* invalid.
> > + *  - (-EINVAL) bad descriptor offset.
> > + *  - (-ENOTSUP) if the device does not support this function.  
> 
> What should be returned if queue_id is invalid?
I'd say -ENODEV too. On the other hand, adding these checks may
not be a good idea since we are in the dataplane.
The previous rx_descriptor_done() API took the queue pointer
as a parameter, like the Rx/Tx functions. That is probably a better
idea.
> What should be returned if the queue is stopped?
For the same performance reasons, I think we should just highlight
in the API that this dataplane function should not be called on a
stopped queue.
> 
> > + */
> > +static inline int
> > +rte_eth_rx_descriptor_status(uint8_t port_id, uint16_t queue_id,
> > +	uint16_t offset)
> > +{
> > +	struct rte_eth_dev *dev;
> > +
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +	dev = &rte_eth_devices[port_id];
> > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_descriptor_status, -ENOTSUP);
> > +  
> 
> May be it makes sense to range check queue_id here to avoid such code in 
> each PMD?
If we keep this API, yes. If we switch to a queue pointer as proposed
above, we will assume (and highlight in the API doc) that the pointer
must be valid, like for Rx/Tx funcs.
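To make the two options concrete, here is a minimal sketch of an id-based
wrapper that does the port, queue id and callback checks once in the generic
layer, then hands a queue pointer to the driver. The toy_* names and the
simplified structure are hypothetical stand-ins, not the real ethdev
definitions; the -ENODEV value for a bad queue id follows the suggestion above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver callback taking a queue pointer, like Rx/Tx bursts. */
typedef int (*toy_desc_status_fn)(void *queue, uint16_t offset);

/* Hypothetical, simplified stand-in for the per-port device structure. */
struct toy_dev {
	uint16_t nb_rx_queues;
	void **rx_queues;
	toy_desc_status_fn rx_descriptor_status; /* NULL if unsupported */
};

/* Generic wrapper: range checks live here, so each PMD can skip them. */
static inline int
toy_rx_descriptor_status(const struct toy_dev *dev, uint16_t queue_id,
	uint16_t offset)
{
	if (dev == NULL || queue_id >= dev->nb_rx_queues)
		return -ENODEV;
	if (dev->rx_descriptor_status == NULL)
		return -ENOTSUP;
	return dev->rx_descriptor_status(dev->rx_queues[queue_id], offset);
}

/* Toy driver callback: pretends the first 4 descriptors are done. */
static int
toy_pmd_rx_status(void *queue, uint16_t offset)
{
	(void)queue;
	return offset < 4 ? 1 : 0;
}
```

With a pure queue-pointer API the wrapper disappears and the caller is
trusted to pass a valid pointer, which is the trade-off discussed above.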
Olivier
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 3/6] net/e1000: implement descriptor status API (igb)
  2017-03-02  1:28     ` Lu, Wenzhuo
@ 2017-03-02 13:58       ` Olivier Matz
  0 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-02 13:58 UTC (permalink / raw)
  To: Lu, Wenzhuo
  Cc: dev, thomas.monjalon, Ananyev, Konstantin, Zhang, Helin, Wu,
	Jingjing, adrien.mazarguil, nelio.laranjeiro, Yigit, Ferruh,
	Richardson, Bruce
Hi Wenzhuo,
On Thu, 2 Mar 2017 01:28:21 +0000, "Lu, Wenzhuo" <wenzhuo.lu@intel.com> wrote:
> Hi Olivier,
> 
> > -----Original Message-----
> > From: Olivier Matz [mailto:olivier.matz@6wind.com]
> > Sent: Thursday, March 2, 2017 1:19 AM
> > To: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin; Lu,
> > Wenzhuo; Zhang, Helin; Wu, Jingjing; adrien.mazarguil@6wind.com;
> > nelio.laranjeiro@6wind.com
> > Cc: Yigit, Ferruh; Richardson, Bruce
> > Subject: [PATCH 3/6] net/e1000: implement descriptor status API (igb)
> > 
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > +
> > +int
> > +eth_igb_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t tx_queue_id,
> > +	uint16_t offset)
> > +{
> > +	volatile uint32_t *status;
> > +	struct igb_tx_queue *txq;
> > +	uint32_t desc;
> > +
> > +	txq = dev->data->tx_queues[tx_queue_id];
> > +	if (unlikely(offset >= txq->nb_tx_desc))
> > +		return -EINVAL;
> > +
> > +	desc = txq->tx_tail + offset;  
> Should we check nb_tx_desc here? The same for em.
Correct, thanks.
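A minimal sketch of the missing wrap-around check, assuming the tail and the
offset are both below the ring size so that one subtraction is enough. The
helper name ring_desc_index() is hypothetical and is not part of the patch.

```c
#include <stdint.h>

/*
 * Ring index of the descriptor located `offset` entries after the tail.
 * Assumes tail < nb_desc and offset < nb_desc, so the sum is below
 * 2 * nb_desc and a single subtraction brings it back into the ring.
 */
static inline uint32_t
ring_desc_index(uint16_t tail, uint16_t offset, uint16_t nb_desc)
{
	uint32_t desc = (uint32_t)tail + offset;

	if (desc >= nb_desc)
		desc -= nb_desc;
	return desc;
}
```

For example, with a 512-entry ring, tail 500 and offset 20, the sum 520 wraps
to index 8.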
Olivier
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API
  2017-03-02 13:57       ` Olivier Matz
@ 2017-03-02 14:19         ` Andrew Rybchenko
  2017-03-02 14:54           ` Olivier Matz
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Rybchenko @ 2017-03-02 14:19 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit, bruce.richardson
Hi Olivier,
On 03/02/2017 04:57 PM, Olivier Matz wrote:
> Hi Andrew,
>
> Thank you for the review. Comments inline.
>
> On Wed, 1 Mar 2017 21:22:14 +0300, Andrew Rybchenko <arybchenko@solarflare.com> wrote:
>> On 03/01/2017 08:19 PM, Olivier Matz wrote:
>>> Introduce a new API to get the status of a descriptor.
>>>
>>> For Rx, it is almost similar to rx_descriptor_done API, except it
>>> differentiates "used" descriptors (which are hold by the driver and not
>>> returned to the hardware).
>>>
>>> For Tx, it is a new API.
>>>
>>> The descriptor_done() API, and probably the rx_queue_count() API could
>>> be replaced by this new API as soon as it is implemented on all PMDs.
>>>
>>> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>>> ---
>>>    lib/librte_ether/rte_ethdev.h | 86 +++++++++++++++++++++++++++++++++++++++++++
>>>    1 file changed, 86 insertions(+)
>>>
>>> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
>>> index 97f3e2d..9ac9c61 100644
>>> --- a/lib/librte_ether/rte_ethdev.h
>>> +++ b/lib/librte_ether/rte_ethdev.h
>>> @@ -1179,6 +1179,14 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
>>>    typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
>>>    /**< @internal Check DD bit of specific RX descriptor */
>>>    
>>> +typedef int (*eth_rx_descriptor_status_t)(struct rte_eth_dev *dev,
>>> +	uint16_t rx_queue_id, uint16_t offset);
>>> +/**< @internal Check the status of a Rx descriptor */
>>> +
>>> +typedef int (*eth_tx_descriptor_status_t)(struct rte_eth_dev *dev,
>>> +	uint16_t tx_queue_id, uint16_t offset);
>>> +/**< @internal Check the status of a Tx descriptor */
>>> +
>>>    typedef int (*eth_fw_version_get_t)(struct rte_eth_dev *dev,
>>>    				     char *fw_version, size_t fw_size);
>>>    /**< @internal Get firmware information of an Ethernet device. */
>>> @@ -1483,6 +1491,10 @@ struct eth_dev_ops {
>>>    	eth_queue_release_t        rx_queue_release; /**< Release RX queue. */
>>>    	eth_rx_queue_count_t       rx_queue_count;/**< Get Rx queue count. */
>>>    	eth_rx_descriptor_done_t   rx_descriptor_done; /**< Check rxd DD bit. */
>>> +	eth_rx_descriptor_status_t rx_descriptor_status;
>>> +	/**< Check the status of a Rx descriptor. */
>>> +	eth_tx_descriptor_status_t tx_descriptor_status;
>>> +	/**< Check the status of a Tx descriptor. */
>>>    	eth_rx_enable_intr_t       rx_queue_intr_enable;  /**< Enable Rx queue interrupt. */
>>>    	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
>>>    	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
>>> @@ -2768,6 +2780,80 @@ rte_eth_rx_descriptor_done(uint8_t port_id, uint16_t queue_id, uint16_t offset)
>>>    		dev->data->rx_queues[queue_id], offset);
>>>    }
>>>    
>>> +#define RTE_ETH_RX_DESC_AVAIL 0 /**< Desc available for hw. */
>>> +#define RTE_ETH_RX_DESC_DONE  1 /**< Desc done, filled by hw. */
>>> +#define RTE_ETH_RX_DESC_USED  2 /**< Desc used by driver. */
>>> +
>>> +/**
>>> + * Check the status of a Rx descriptor in the queue
>> I think it would be useful to highlight caller context.
>> Should it be the same CPU which receives packets from the queue?
> Yes, you are right it would be useful. I suggest the following sentences:
>
>    This function should be called on a dataplane core like the
>    Rx function. They should not be called concurrently on the same
>    queue.
The first sentence looks fine. In the second, it is unclear to me what
"They" refers to (functions? dataplane cores?). Maybe the first sentence
alone is sufficient.
>>> + *
>>> + * @param port_id
>>> + *  The port identifier of the Ethernet device.
>>> + * @param queue_id
>>> + *  The Rx queue identifier on this port.
>>> + * @param offset
>>> + *  The offset of the descriptor starting from tail (0 is the next
>>> + *  packet to be received by the driver).
>>> + * @return
>>> + *  - (RTE_ETH_DESC_AVAIL): Descriptor is available for the hardware to
>>> + *    receive a packet.
>>> + *  - (RTE_ETH_DESC_DONE): Descriptor is done, it is filled by hw, but
>>> + *    not yet processed by the driver (i.e. in the receive queue).
>>> + *  - (RTE_ETH_DESC_USED): Descriptor is unavailable (hold by driver,
>>> + *    not yet returned to hw).
>> It looks like it is the most suitable for descriptors which are reserved
>> and never used.
> Can you give some more details about what is a reserved but never
> used descriptor? (same question for Tx)
Our HW has a requirement to keep a few descriptors always unused (i.e.
a gap between tail and head). It is just a few descriptors, but an
invalid descriptor status may mislead the application. E.g. if the Rx
queue size is 512 and the offset is 510, that descriptor will always be
unused (since it is reserved). It is not an indication that the core is
too slow and cannot keep pace.
>>> + *  - (-ENODEV) if *port_id* invalid.
>>> + *  - (-EINVAL) bad descriptor offset.
>>> + *  - (-ENOTSUP) if the device does not support this function.
>> What should be returned if queue_id is invalid?
> I'd say -ENODEV too. On the other hand, adding these checks is
> maybe not a good idea as we are in dataplane.
>
> The previous API rx_descriptor_done() API was taking the queue
> pointer as parameter, like Rx/Tx functions. It's probably a better
> idea.
I think so too, since the Rx burst callback (used nearby, as the
description above says) gets the queue pointer.
>> What should be returned if the queue is stopped?
> For the same performance reasons, I think we should just highlight
> in the API that this dataplane function should not be called on a
> stopped queue.
OK.
>>> + */
>>> +static inline int
>>> +rte_eth_rx_descriptor_status(uint8_t port_id, uint16_t queue_id,
>>> +	uint16_t offset)
>>> +{
>>> +	struct rte_eth_dev *dev;
>>> +
>>> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>>> +	dev = &rte_eth_devices[port_id];
>>> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_descriptor_status, -ENOTSUP);
>>> +
>> May be it makes sense to range check queue_id here to avoid such code in
>> each PMD?
> If we keep this API, yes. If we switch to a queue pointer as proposed
> above, we will assume (and highlight in the API doc) that the pointer
> must be valid, like for Rx/Tx funcs.
I've simply seen patches which add the queue id range check in the 
generic place.
But I think switching to queue pointer is a better idea.
Thanks,
Andrew.
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 4/6] net/e1000: implement descriptor status API (em)
  2017-03-02  1:22     ` Lu, Wenzhuo
@ 2017-03-02 14:46       ` Olivier Matz
  2017-03-03  1:15         ` Lu, Wenzhuo
  0 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-02 14:46 UTC (permalink / raw)
  To: Lu, Wenzhuo
  Cc: dev, thomas.monjalon, Ananyev, Konstantin, Zhang, Helin, Wu,
	Jingjing, adrien.mazarguil, nelio.laranjeiro, Yigit, Ferruh,
	Richardson, Bruce
Hi Wenzhuo,
On Thu, 2 Mar 2017 01:22:25 +0000, "Lu, Wenzhuo" <wenzhuo.lu@intel.com> wrote:
> Hi Oliver,
> 
> > -----Original Message-----
> > From: Olivier Matz [mailto:olivier.matz@6wind.com]
> > Sent: Thursday, March 2, 2017 1:19 AM
> > To: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin; Lu,
> > Wenzhuo; Zhang, Helin; Wu, Jingjing; adrien.mazarguil@6wind.com;
> > nelio.laranjeiro@6wind.com
> > Cc: Yigit, Ferruh; Richardson, Bruce
> > Subject: [PATCH 4/6] net/e1000: implement descriptor status API (em)
> > 
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > +int
> > +eth_em_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t tx_queue_id,
> > +	uint16_t offset)
> > +{
> > +	volatile uint8_t *status;
> > +	struct em_tx_queue *txq;
> > +	uint32_t desc;
> > +
> > +	txq = dev->data->tx_queues[tx_queue_id];
> > +	if (unlikely(offset >= txq->nb_tx_desc))
> > +		return -EINVAL;
> > +
> > +	desc = txq->tx_tail + offset;
> > +	/* go to next desc that has the RS bit */
> > +	desc = ((desc + txq->tx_rs_thresh - 1) / txq->tx_rs_thresh) *
> > +		txq->tx_rs_thresh;  
> The descriptor may be changed here. So the return value may not be for the offset one. Why?
Yes, desc index can change. From what I understand, the
"descriptor done" (DD) bit is only set on descriptors which are
flagged with the "report status" (RS) bit.
Here is an example from:
http://www.dpdk.org/ml/archives/dev/2016-November/050679.html
|----------------------------------------------------------------|
|               D       R       R       R                        |
|        ............xxxxxxxxxxxxxxxxxxxxxxxxx                   |
|        <descs sent><- descs not sent yet  ->                   |
|        ............xxxxxxxxxxxxxxxxxxxxxxxxx                   |
|----------------------------------------------------------------|
        ^last_desc_cleaned=8                    ^next_rs=47
                ^next_dd=15                   ^tail=45
                     ^hw_head=20
                     <----  nb_used  --------->
The hardware is currently processing the descriptor 20
'R' means the descriptor has the RS bit
'D' means the descriptor has the DD + RS bits
'x' are packets in txq (not sent)
'.' are packets already sent but not freed by sw
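The index arithmetic from the quoted hunk can be sketched in isolation as
follows. This only mirrors the rounding shown in the patch and adds the
wrap-around; the helper name next_rs_desc() is hypothetical, and whether
multiples of tx_rs_thresh are exactly the descriptors carrying the RS bit
depends on the driver and is not verified here.

```c
#include <stdint.h>

/*
 * Index of the descriptor whose DD bit would be checked for the
 * descriptor at `offset` from the tail.
 * Assumes tail < nb_desc, offset < nb_desc, 0 < rs_thresh <= nb_desc,
 * and that nb_desc is a multiple of rs_thresh.
 */
static inline uint32_t
next_rs_desc(uint16_t tail, uint16_t offset, uint16_t rs_thresh,
	uint16_t nb_desc)
{
	uint32_t desc = (uint32_t)tail + offset;

	/* round up to the next multiple of rs_thresh */
	desc = ((desc + rs_thresh - 1) / rs_thresh) * rs_thresh;
	/* both the sum and the rounding can step past the ring end */
	while (desc >= nb_desc)
		desc -= nb_desc;
	return desc;
}
```

With an assumed rs_thresh of 32 and a 512-entry ring, the tail at 45 from the
diagram maps to index 64, and tail 500 plus offset 10 rounds up to 512, which
wraps to 0.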
Regards,
Olivier
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API
  2017-03-02 14:19         ` Andrew Rybchenko
@ 2017-03-02 14:54           ` Olivier Matz
  2017-03-02 15:05             ` Andrew Rybchenko
  0 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-02 14:54 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit, bruce.richardson
> >>> + * Check the status of a Rx descriptor in the queue  
> >> I think it would be useful to highlight caller context.
> >> Should it be the same CPU which receives packets from the queue?  
> > Yes, you are right it would be useful. I suggest the following sentences:
> >
> >    This function should be called on a dataplane core like the
> >    Rx function. They should not be called concurrently on the same
> >    queue.  
> 
> The first sentence looks fine. "They" (functions?, dataplane cores?) is 
> unclear for me in the second. May be the first one is simply sufficient.
Ok, I'll keep the first one at least, and see if I can reword the
second one to make it clear.
> >>> + *
> >>> + * @param port_id
> >>> + *  The port identifier of the Ethernet device.
> >>> + * @param queue_id
> >>> + *  The Rx queue identifier on this port.
> >>> + * @param offset
> >>> + *  The offset of the descriptor starting from tail (0 is the next
> >>> + *  packet to be received by the driver).
> >>> + * @return
> >>> + *  - (RTE_ETH_DESC_AVAIL): Descriptor is available for the hardware to
> >>> + *    receive a packet.
> >>> + *  - (RTE_ETH_DESC_DONE): Descriptor is done, it is filled by hw, but
> >>> + *    not yet processed by the driver (i.e. in the receive queue).
> >>> + *  - (RTE_ETH_DESC_USED): Descriptor is unavailable (hold by driver,
> >>> + *    not yet returned to hw).  
> >> It looks like it is the most suitable for descriptors which are reserved
> >> and never used.  
> > Can you give some more details about what is a reserved but never
> > used descriptor? (same question for Tx)  
> 
> Our HW has a requirement to keep few descriptors always unused (i.e. 
> some gap between tail and head). It is just a few descriptors, but 
> invalid descriptor status may misguide application. E.g. if Rx queue 
> size is 512 and offset 510, it will always be unused (since it is 
> reserved). It is not an indication that core is too slow and can't keep 
> the pace.
Understood.
I can change _USED into _UNAVAIL (add it for Tx), with the following
description:
- (RTE_ETH_DESC_UNAVAIL): Descriptor is unavailable: either held by driver
  and not yet returned to hw, or reserved by the hardware.
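As an illustration of the three states, here is a toy model of an Rx queue.
It is a sketch under stated assumptions: the toy_* names are hypothetical,
"done" is tracked with a counter instead of reading the DD bit from a real
descriptor, and the unavailable descriptors (held by the driver or reserved
by the hardware) are assumed to sit at the highest offsets from the tail.

```c
#include <errno.h>
#include <stdint.h>

enum toy_rx_desc_status {
	TOY_RX_DESC_AVAIL = 0,   /* available for hw */
	TOY_RX_DESC_DONE = 1,    /* filled by hw, not yet processed */
	TOY_RX_DESC_UNAVAIL = 2, /* held by driver or reserved by hw */
};

/* Hypothetical queue state; assumes nb_done + nb_unavail <= nb_desc. */
struct toy_rxq {
	uint16_t nb_desc;    /* ring size */
	uint16_t nb_done;    /* descriptors filled by hw, counted from tail */
	uint16_t nb_unavail; /* held by driver + reserved by hw */
};

static inline int
toy_rxq_status(const struct toy_rxq *q, uint16_t offset)
{
	if (offset >= q->nb_desc)
		return -EINVAL;
	if (offset >= q->nb_desc - q->nb_unavail)
		return TOY_RX_DESC_UNAVAIL;
	if (offset < q->nb_done)
		return TOY_RX_DESC_DONE;
	return TOY_RX_DESC_AVAIL;
}
```

With a 512-entry ring and 2 unavailable descriptors, offset 510 reports
UNAVAIL regardless of load, so an application does not mistake it for a
sign that the core is too slow.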
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API
  2017-03-02 14:54           ` Olivier Matz
@ 2017-03-02 15:05             ` Andrew Rybchenko
  2017-03-02 15:14               ` Olivier Matz
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Rybchenko @ 2017-03-02 15:05 UTC (permalink / raw)
  To: Olivier Matz, Andrew Rybchenko
  Cc: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit, bruce.richardson
On 03/02/2017 05:54 PM, Olivier Matz wrote:
>>>>> + *
>>>>> + * @param port_id
>>>>> + *  The port identifier of the Ethernet device.
>>>>> + * @param queue_id
>>>>> + *  The Rx queue identifier on this port.
>>>>> + * @param offset
>>>>> + *  The offset of the descriptor starting from tail (0 is the next
>>>>> + *  packet to be received by the driver).
>>>>> + * @return
>>>>> + *  - (RTE_ETH_DESC_AVAIL): Descriptor is available for the hardware to
>>>>> + *    receive a packet.
>>>>> + *  - (RTE_ETH_DESC_DONE): Descriptor is done, it is filled by hw, but
>>>>> + *    not yet processed by the driver (i.e. in the receive queue).
>>>>> + *  - (RTE_ETH_DESC_USED): Descriptor is unavailable (hold by driver,
>>>>> + *    not yet returned to hw).
>>>> It looks like it is the most suitable for descriptors which are reserved
>>>> and never used.
>>> Can you give some more details about what is a reserved but never
>>> used descriptor? (same question for Tx)
>> Our HW has a requirement to keep few descriptors always unused (i.e.
>> some gap between tail and head). It is just a few descriptors, but
>> invalid descriptor status may misguide application. E.g. if Rx queue
>> size is 512 and offset 510, it will always be unused (since it is
>> reserved). It is not an indication that core is too slow and can't keep
>> the pace.
> Understood.
>
> I can change _USED into _UNAVAIL (add it for Tx), with the following
> description:
>
> - (RTE_ETH_DESC_UNAVAIL): Descriptor is unavailable: either hold by driver
>    and not yet returned to hw, or reserved by the hardware.
Looks good. Do I understand correctly that it will be reported for 
descriptors which are not refilled (posted to HW) because of rx_free_thresh?
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API
  2017-03-02 15:05             ` Andrew Rybchenko
@ 2017-03-02 15:14               ` Olivier Matz
  0 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-02 15:14 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit, bruce.richardson
On Thu, 2 Mar 2017 18:05:52 +0300, Andrew Rybchenko <arybchenko@solarflare.com> wrote:
> On 03/02/2017 05:54 PM, Olivier Matz wrote:
> >>>>> + *
> >>>>> + * @param port_id
> >>>>> + *  The port identifier of the Ethernet device.
> >>>>> + * @param queue_id
> >>>>> + *  The Rx queue identifier on this port.
> >>>>> + * @param offset
> >>>>> + *  The offset of the descriptor starting from tail (0 is the next
> >>>>> + *  packet to be received by the driver).
> >>>>> + * @return
> >>>>> + *  - (RTE_ETH_DESC_AVAIL): Descriptor is available for the hardware to
> >>>>> + *    receive a packet.
> >>>>> + *  - (RTE_ETH_DESC_DONE): Descriptor is done, it is filled by hw, but
> >>>>> + *    not yet processed by the driver (i.e. in the receive queue).
> >>>>> + *  - (RTE_ETH_DESC_USED): Descriptor is unavailable (hold by driver,
> >>>>> + *    not yet returned to hw).  
> >>>> It looks like it is the most suitable for descriptors which are reserved
> >>>> and never used.  
> >>> Can you give some more details about what is a reserved but never
> >>> used descriptor? (same question for Tx)  
> >> Our HW has a requirement to keep few descriptors always unused (i.e.
> >> some gap between tail and head). It is just a few descriptors, but
> >> invalid descriptor status may misguide application. E.g. if Rx queue
> >> size is 512 and offset 510, it will always be unused (since it is
> >> reserved). It is not an indication that core is too slow and can't keep
> >> the pace.  
> > Understood.
> >
> > I can change _USED into _UNAVAIL (add it for Tx), with the following
> > description:
> >
> > - (RTE_ETH_DESC_UNAVAIL): Descriptor is unavailable: either hold by driver
> >    and not yet returned to hw, or reserved by the hardware.  
> 
> Looks good. Do I understand correctly that it will be reported for 
> descriptors which are not refilled (posted to HW) because of rx_free_thresh?
> 
Yes
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
                     ` (7 preceding siblings ...)
  2017-03-01 18:07   ` Stephen Hemminger
@ 2017-03-02 15:32   ` Bruce Richardson
  2017-03-02 16:14     ` Olivier Matz
  2017-03-07 15:59   ` [dpdk-dev] [PATCH v2 " Olivier Matz
  9 siblings, 1 reply; 72+ messages in thread
From: Bruce Richardson @ 2017-03-02 15:32 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit
On Wed, Mar 01, 2017 at 06:19:06PM +0100, Olivier Matz wrote:
> This patchset introduces a new ethdev API:
> - rte_eth_rx_descriptor_status()
> - rte_eth_tx_descriptor_status()
> 
> The Rx API aims to replace rte_eth_rx_descriptor_done() which
> does almost the same, but does not differentiate the case of a
> descriptor used by the driver (not returned to the hw).
> 
> The usage of these functions can be:
> - on Rx, anticipate that the cpu is not fast enough to process
>   all incoming packets, and take dispositions to solve the
>   problem (add more cpus, drop specific packets, ...)
> - on Tx, detect that the link is overloaded, and take dispositions
>   to solve the problem (notify flow control, drop specific
>   packets)
> 
Looking at it from a slightly higher level, are these APIs really going
to help in these situations? If something is overloaded, doing more work
by querying the ring status only makes things worse. I suspect that in
most cases better results can be got by just looking at the results of
RX and TX burst functions. For example, if RX burst is always returning
a full set of packets, chances are you are overloaded, or at least have
no scope for an unexpected burst or event.
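A rough sketch of that heuristic, assuming the application records how many
packets each RX burst returned. The burst_tracker structure, its threshold
and the helper name are hypothetical and only illustrate the idea of treating
a streak of full bursts as a sign of overload.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue state kept by the application. */
struct burst_tracker {
	uint16_t burst_size;  /* number of packets requested per RX burst */
	uint32_t full_streak; /* consecutive bursts that came back full */
	uint32_t threshold;   /* streak length treated as overload */
};

/* Feed the return value of each RX burst; true means "likely overloaded". */
static inline bool
burst_tracker_update(struct burst_tracker *t, uint16_t nb_rx)
{
	if (nb_rx >= t->burst_size)
		t->full_streak++;
	else
		t->full_streak = 0;
	return t->full_streak >= t->threshold;
}
```

This costs one comparison per burst and touches no descriptor memory, at the
price of not knowing how deep the backlog in the ring actually is.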
Are these really needed for real applications? I suspect our trivial
l3fwd power example can be made to work ok without them.
Regards,
/Bruce
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-02 15:32   ` Bruce Richardson
@ 2017-03-02 16:14     ` Olivier Matz
  2017-03-03 16:18       ` Venkatesan, Venky
  0 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-02 16:14 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit
On Thu, 2 Mar 2017 15:32:15 +0000, Bruce Richardson <bruce.richardson@intel.com> wrote:
> On Wed, Mar 01, 2017 at 06:19:06PM +0100, Olivier Matz wrote:
> > This patchset introduces a new ethdev API:
> > - rte_eth_rx_descriptor_status()
> > - rte_eth_tx_descriptor_status()
> > 
> > The Rx API aims to replace rte_eth_rx_descriptor_done() which
> > does almost the same, but does not differentiate the case of a
> > descriptor used by the driver (not returned to the hw).
> > 
> > The usage of these functions can be:
> > - on Rx, anticipate that the cpu is not fast enough to process
> >   all incoming packets, and take dispositions to solve the
> >   problem (add more cpus, drop specific packets, ...)
> > - on Tx, detect that the link is overloaded, and take dispositions
> >   to solve the problem (notify flow control, drop specific
> >   packets)
> >   
> Looking at it from a slightly higher level, are these APIs really going
> to help in these situations? If something is overloaded, doing more work
> by querying the ring status only makes things worse. I suspect that in
> most cases better results can be got by just looking at the results of
> RX and TX burst functions. For example, if RX burst is always returning
> a full set of packets, chances are you are overloaded, or at least have
> no scope for an unexpected burst or event.
> 
> Are these really needed for real applications? I suspect our trivial
> l3fwd power example can be made to work ok without them.
The l3fwd example uses the rte_eth_rx_descriptor_done() API, which
is very similar to what I'm adding here. The differences are:
- the new lib provides a Tx counterpart
- it differentiates done/avail/hold descriptors
The alternative was to update the descriptor_done API, but
I think we agreed to do that in this thread:
http://www.dpdk.org/ml/archives/dev/2017-January/054947.html
About the usefulness of the API, I confirm it is useful: for
instance, you can detect that your ring is more than half-full, and
take action to increase your processing power or select the
packets you want to drop first.
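To illustrate the half-full check mentioned above, here is a minimal
sketch. It does not call the real ethdev function: the status enum, the
rx_status_fn callback and the toy queue are all hypothetical stand-ins,
used only so that the logic can be compiled and tried in isolation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status values; they only mimic the done/avail/hold
 * distinction described above and are not the real ethdev constants. */
enum desc_status { DESC_AVAIL, DESC_DONE, DESC_HOLD };

/* Hypothetical callback standing in for the per-queue status query. */
typedef enum desc_status (*rx_status_fn)(void *queue, uint16_t offset);

/* Return true if the descriptor located half a ring ahead has already
 * been filled by the hardware, i.e. the ring is more than half-full. */
static bool
rx_ring_over_half(rx_status_fn status, void *queue, uint16_t nb_desc)
{
	assert(nb_desc >= 2);
	return status(queue, nb_desc / 2) == DESC_DONE;
}

/* Toy queue used only to exercise the helper: the first 'filled'
 * descriptors hold packets not yet read by the application. */
struct toy_rxq {
	uint16_t filled;
};

static enum desc_status
toy_rx_status(void *queue, uint16_t offset)
{
	const struct toy_rxq *q = queue;

	return offset < q->filled ? DESC_DONE : DESC_AVAIL;
}
```

A single probe per burst is enough here because descriptors are filled
in order: if the one at nb_desc / 2 is done, everything before it is too.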
Regards
Olivier
* Re: [dpdk-dev] [PATCH 4/6] net/e1000: implement descriptor status API (em)
  2017-03-02 14:46       ` Olivier Matz
@ 2017-03-03  1:15         ` Lu, Wenzhuo
  0 siblings, 0 replies; 72+ messages in thread
From: Lu, Wenzhuo @ 2017-03-03  1:15 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, thomas.monjalon, Ananyev, Konstantin, Zhang, Helin, Wu,
	Jingjing, adrien.mazarguil, nelio.laranjeiro, Yigit, Ferruh,
	Richardson, Bruce
Hi Olivier,
> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Thursday, March 2, 2017 10:47 PM
> To: Lu, Wenzhuo
> Cc: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin; Zhang,
> Helin; Wu, Jingjing; adrien.mazarguil@6wind.com; nelio.laranjeiro@6wind.com;
> Yigit, Ferruh; Richardson, Bruce
> Subject: Re: [PATCH 4/6] net/e1000: implement descriptor status API (em)
> 
> Hi Wenzhuo,
> 
> On Thu, 2 Mar 2017 01:22:25 +0000, "Lu, Wenzhuo" <wenzhuo.lu@intel.com>
> wrote:
> > Hi Oliver,
> >
> > > -----Original Message-----
> > > From: Olivier Matz [mailto:olivier.matz@6wind.com]
> > > Sent: Thursday, March 2, 2017 1:19 AM
> > > To: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin;
> > > Lu, Wenzhuo; Zhang, Helin; Wu, Jingjing; adrien.mazarguil@6wind.com;
> > > nelio.laranjeiro@6wind.com
> > > Cc: Yigit, Ferruh; Richardson, Bruce
> > > Subject: [PATCH 4/6] net/e1000: implement descriptor status API (em)
> > >
> > > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > > +int
> > > +eth_em_tx_descriptor_status(struct rte_eth_dev *dev, uint16_t
> tx_queue_id,
> > > +	uint16_t offset)
> > > +{
> > > +	volatile uint8_t *status;
> > > +	struct em_tx_queue *txq;
> > > +	uint32_t desc;
> > > +
> > > +	txq = dev->data->tx_queues[tx_queue_id];
> > > +	if (unlikely(offset >= txq->nb_tx_desc))
> > > +		return -EINVAL;
> > > +
> > > +	desc = txq->tx_tail + offset;
> > > +	/* go to next desc that has the RS bit */
> > > +	desc = ((desc + txq->tx_rs_thresh - 1) / txq->tx_rs_thresh) *
> > > +		txq->tx_rs_thresh;
> > The descriptor may be changed here. So the return value may not be for the
> offset one. Why?
> 
> Yes, desc index can change. From what I understand, the "descriptor done" (DD)
> bit is only set on descriptors which are flagged with the "report status" (RS) bit.
Yes, you're right. Sorry for the noise :)
> 
> Here is an example from:
> http://www.dpdk.org/ml/archives/dev/2016-November/050679.html
> 
> |----------------------------------------------------------------|
> |               D       R       R       R                        |
> |        ............xxxxxxxxxxxxxxxxxxxxxxxxx                   |
> |        <descs sent><- descs not sent yet  ->                   |
> |        ............xxxxxxxxxxxxxxxxxxxxxxxxx                   |
> |----------------------------------------------------------------|
>         ^last_desc_cleaned=8                    ^next_rs=47
>                 ^next_dd=15                   ^tail=45
>                      ^hw_head=20
> 
>                      <----  nb_used  --------->
> 
> The hardware is currently processing the descriptor 20
> 'R' means the descriptor has the RS bit
> 'D' means the descriptor has the DD + RS bits
> 'x' are packets in txq (not sent)
> '.' are packets already sent but not freed by sw
> 
> 
> Regards,
> Olivier
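The rounding step discussed above (moving from tail + offset to the next
descriptor that carries the RS bit) can be sketched on its own. The
rounding formula is the one from the quoted patch; the function name and
the modulo wrap at the end of the ring are assumptions made for this
sketch and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the first descriptor at or after (tail + offset) that is a
 * multiple of rs_thresh, wrapped to the ring size. The wrap with a
 * modulo is an assumption of this sketch. */
static uint32_t
next_rs_desc(uint16_t tail, uint16_t offset, uint16_t rs_thresh,
	     uint16_t nb_desc)
{
	uint32_t desc;

	assert(rs_thresh > 0 && nb_desc > 0);
	desc = (uint32_t)tail + offset;
	/* round up to the next multiple of rs_thresh */
	desc = ((desc + rs_thresh - 1) / rs_thresh) * rs_thresh;
	return desc % nb_desc;
}
```

For instance, with rs_thresh = 16 in a ring of 64 descriptors, a query
for descriptor 45 ends up reading descriptor 48, which is why the
returned status may not be that of the exact offset requested.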
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-02 16:14     ` Olivier Matz
@ 2017-03-03 16:18       ` Venkatesan, Venky
  2017-03-03 16:45         ` Olivier Matz
  0 siblings, 1 reply; 72+ messages in thread
From: Venkatesan, Venky @ 2017-03-03 16:18 UTC (permalink / raw)
  To: Olivier Matz, Richardson, Bruce
  Cc: dev, thomas.monjalon, Ananyev, Konstantin, Lu, Wenzhuo, Zhang,
	Helin, Wu, Jingjing, adrien.mazarguil, nelio.laranjeiro, Yigit,
	Ferruh
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz
> Sent: Thursday, March 2, 2017 8:15 AM
> To: Richardson, Bruce <bruce.richardson@intel.com>
> Cc: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; adrien.mazarguil@6wind.com;
> nelio.laranjeiro@6wind.com; Yigit, Ferruh <ferruh.yigit@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
> 
> On Thu, 2 Mar 2017 15:32:15 +0000, Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> > On Wed, Mar 01, 2017 at 06:19:06PM +0100, Olivier Matz wrote:
> > > This patchset introduces a new ethdev API:
> > > - rte_eth_rx_descriptor_status()
> > > - rte_eth_tx_descriptor_status()
> > >
> > > The Rx API is aims to replace rte_eth_rx_descriptor_done() which
> > > does almost the same, but does not differentiate the case of a
> > > descriptor used by the driver (not returned to the hw).
> > >
> > > The usage of these functions can be:
> > > - on Rx, anticipate that the cpu is not fast enough to process
> > >   all incoming packets, and take dispositions to solve the
> > >   problem (add more cpus, drop specific packets, ...)
> > > - on Tx, detect that the link is overloaded, and take dispositions
> > >   to solve the problem (notify flow control, drop specific
> > >   packets)
> > >
> > Looking at it from a slightly higher level, are these APIs really
> > going to help in these situations? If something is overloaded, doing
> > more work by querying the ring status only makes things worse. I
> > suspect that in most cases better results can be got by just looking
> > at the results of RX and TX burst functions. For example, if RX burst
> > is always returning a full set of packets, chances are you are
> > overloaded, or at least have no scope for an unexpected burst or event.
> >
> > Are these really needed for real applications? I suspect our trivial
> > l3fwd power example can be made to work ok without them.
> 
> The l3fwd example uses the rte_eth_rx_descriptor_done() API, which is very
> similar to what I'm adding here. The differences are:
> 
> - the new lib provides a Tx counterpart
> - it differentiates done/avail/hold descriptors
> 
> The alternative was to update the descriptor_done API, but I think we
> agreed to do that in this thread:
> http://www.dpdk.org/ml/archives/dev/2017-January/054947.html
> 
> About the usefulness of the API, I confirm it is useful: for instance, you can
> detect that you ring is more than half-full, and take dispositions to increase
> your processing power or select the packets you want to drop first.
> 
For either of those cases, you could still implement this in your application without any of these APIs. Simply keep reading rx_burst() till it returns zero. You now have all the packets that you want - look at how many and increase your processing power, or drop them. 
The issue I have with this newer instantiation of the API is that it is essentially to pick up a descriptor at a specified offset. In most cases, if you plan to read far enough ahead with the API (let's say 16 or 32 ahead, or even more), you are almost always guaranteed an L1/L2 miss - essentially making it a costly API call. In cases that don't have something like Data Direct I/O (DDIO), you are now going to hit memory and stall the CPU for a long time. In any case, the API becomes pretty useless unless you want to stay within a smaller look ahead offset. The rx_burst() methodology simply works better in most circumstances, and allows application level control.
So, NAK. My suggestion would be to go back to the older API.
-Venky
> Regards
> Olivier
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-03 16:18       ` Venkatesan, Venky
@ 2017-03-03 16:45         ` Olivier Matz
  2017-03-03 18:46           ` Venkatesan, Venky
  0 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-03 16:45 UTC (permalink / raw)
  To: Venkatesan, Venky
  Cc: Richardson, Bruce, dev, thomas.monjalon, Ananyev, Konstantin, Lu,
	Wenzhuo, Zhang, Helin, Wu, Jingjing, adrien.mazarguil,
	nelio.laranjeiro, Yigit, Ferruh
Hi Venky,
On Fri, 3 Mar 2017 16:18:52 +0000, "Venkatesan, Venky" <venky.venkatesan@intel.com> wrote:
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz
> > Sent: Thursday, March 2, 2017 8:15 AM
> > To: Richardson, Bruce <bruce.richardson@intel.com>
> > Cc: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> > Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> > <jingjing.wu@intel.com>; adrien.mazarguil@6wind.com;
> > nelio.laranjeiro@6wind.com; Yigit, Ferruh <ferruh.yigit@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
> > 
> > On Thu, 2 Mar 2017 15:32:15 +0000, Bruce Richardson
> > <bruce.richardson@intel.com> wrote:  
> > > On Wed, Mar 01, 2017 at 06:19:06PM +0100, Olivier Matz wrote:  
> > > > This patchset introduces a new ethdev API:
> > > > - rte_eth_rx_descriptor_status()
> > > > - rte_eth_tx_descriptor_status()
> > > >
> > > > The Rx API is aims to replace rte_eth_rx_descriptor_done() which
> > > > does almost the same, but does not differentiate the case of a
> > > > descriptor used by the driver (not returned to the hw).
> > > >
> > > > The usage of these functions can be:
> > > > - on Rx, anticipate that the cpu is not fast enough to process
> > > >   all incoming packets, and take dispositions to solve the
> > > >   problem (add more cpus, drop specific packets, ...)
> > > > - on Tx, detect that the link is overloaded, and take dispositions
> > > >   to solve the problem (notify flow control, drop specific
> > > >   packets)
> > > >  
> > > Looking at it from a slightly higher level, are these APIs really
> > > going to help in these situations? If something is overloaded, doing
> > > more work by querying the ring status only makes things worse. I
> > > suspect that in most cases better results can be got by just looking
> > > at the results of RX and TX burst functions. For example, if RX burst
> > > is always returning a full set of packets, chances are you are
> > > overloaded, or at least have no scope for an unexpected burst or event.
> > >
> > > Are these really needed for real applications? I suspect our trivial
> > > l3fwd power example can be made to work ok without them.  
> > 
> > The l3fwd example uses the rte_eth_rx_descriptor_done() API, which is very
> > similar to what I'm adding here. The differences are:
> > 
> > - the new lib provides a Tx counterpart
> > - it differentiates done/avail/hold descriptors
> > 
> > The alternative was to update the descriptor_done API, but I think we
> > agreed to do that in this thread:
> > http://www.dpdk.org/ml/archives/dev/2017-January/054947.html
> > 
> > About the usefulness of the API, I confirm it is useful: for instance, you can
> > detect that you ring is more than half-full, and take dispositions to increase
> > your processing power or select the packets you want to drop first.
> >   
> For either of those cases, you could still implement this in your application without any of these APIs. Simply keep reading rx_burst() till it returns zero. You now have all the packets that you want - look at how many and increase your processing power, or drop them. 
In my use case, I may have several thresholds, so it gives quick information
about the ring status.
Calling rx_burst() until it returns 0 will not work if the packet
rate is higher than (or close to) what the cpu is able to handle.
> 
> The issue I have with this newer instantiation of the API is that it is essentially to pick up a descriptor at a specified offset. In most cases, if you plan to read far enough ahead with the API (let's say 16 or 32 ahead, or even more), you are almost always guaranteed an L1/L2 miss - essentially making it a costly API call. In cases that don't have something like Data Direct I/O (DDIO), you are now going to hit memory and stall the CPU for a long time. In any case, the API becomes pretty useless unless you want to stay within a smaller look ahead offset. The rx_burst() methodology simply works better in most circumstances, and allows application level control.
> 
> So, NAK. My suggestion would be to go back to the older API.
I don't understand the reason for your NAK.
The old API is there (for Rx it works the same), and it is illustrated
in an example. Since your arguments also apply to the old API, why
are you saying we should keep the older API?
For Tx, I want to know if I have enough room to send my packets before
doing it. There is no API yet to do that.
And yes, this could trigger cache misses, but in some situations
it's preferable to be a bit slower (not all tests are test-iofwd)
and be able to anticipate that the ring is getting full.
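As a rough sketch of the Tx use case above (checking for room before
sending), assuming 1 packet == 1 descriptor and in-order completion.
The enum, the tx_status_fn callback and the toy queue are hypothetical
and only stand in for a per-queue Tx status query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status values for a Tx descriptor. */
enum tx_desc_status { TX_DESC_FULL, TX_DESC_DONE };

/* Hypothetical callback standing in for the Tx status query; offset 0
 * is the slot where the next packet would be written. */
typedef enum tx_desc_status (*tx_status_fn)(void *queue, uint16_t offset);

/* True if the next nb_pkts slots can be reused. Assuming descriptors
 * complete in order, testing the last slot of the range is enough. */
static bool
tx_has_room(tx_status_fn status, void *queue, uint16_t nb_pkts)
{
	assert(nb_pkts > 0);
	return status(queue, nb_pkts - 1) == TX_DESC_DONE;
}

/* Toy queue: 'in_flight' descriptors are still owned by the hardware,
 * so only the first (nb_desc - in_flight) slots after tail are free. */
struct toy_txq {
	uint16_t nb_desc;
	uint16_t in_flight;
};

static enum tx_desc_status
toy_tx_status(void *queue, uint16_t offset)
{
	const struct toy_txq *q = queue;

	return offset < q->nb_desc - q->in_flight ?
		TX_DESC_DONE : TX_DESC_FULL;
}
```

The application would call tx_has_room() with its burst size and decide
to send, queue or drop before touching the Tx burst function.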
Regards,
Olivier
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-03 16:45         ` Olivier Matz
@ 2017-03-03 18:46           ` Venkatesan, Venky
  2017-03-04 20:45             ` Olivier Matz
  0 siblings, 1 reply; 72+ messages in thread
From: Venkatesan, Venky @ 2017-03-03 18:46 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Richardson, Bruce, dev, thomas.monjalon, Ananyev, Konstantin, Lu,
	Wenzhuo, Zhang, Helin, Wu, Jingjing, adrien.mazarguil,
	nelio.laranjeiro, Yigit, Ferruh
Hi Olivier, 
> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Friday, March 3, 2017 8:45 AM
> To: Venkatesan, Venky <venky.venkatesan@intel.com>
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; dev@dpdk.org;
> thomas.monjalon@6wind.com; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; adrien.mazarguil@6wind.com;
> nelio.laranjeiro@6wind.com; Yigit, Ferruh <ferruh.yigit@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
> 
> Hi Venky,
> 
> On Fri, 3 Mar 2017 16:18:52 +0000, "Venkatesan, Venky"
> <venky.venkatesan@intel.com> wrote:
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz
> > > Sent: Thursday, March 2, 2017 8:15 AM
> > > To: Richardson, Bruce <bruce.richardson@intel.com>
> > > Cc: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>;
> > > Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; adrien.mazarguil@6wind.com;
> > > nelio.laranjeiro@6wind.com; Yigit, Ferruh <ferruh.yigit@intel.com>
> > > Subject: Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx
> > > descriptors
> > >
> > > On Thu, 2 Mar 2017 15:32:15 +0000, Bruce Richardson
> > > <bruce.richardson@intel.com> wrote:
> > > > On Wed, Mar 01, 2017 at 06:19:06PM +0100, Olivier Matz wrote:
> > > > > This patchset introduces a new ethdev API:
> > > > > - rte_eth_rx_descriptor_status()
> > > > > - rte_eth_tx_descriptor_status()
> > > > >
> > > > > The Rx API is aims to replace rte_eth_rx_descriptor_done() which
> > > > > does almost the same, but does not differentiate the case of a
> > > > > descriptor used by the driver (not returned to the hw).
> > > > >
> > > > > The usage of these functions can be:
> > > > > - on Rx, anticipate that the cpu is not fast enough to process
> > > > >   all incoming packets, and take dispositions to solve the
> > > > >   problem (add more cpus, drop specific packets, ...)
> > > > > - on Tx, detect that the link is overloaded, and take dispositions
> > > > >   to solve the problem (notify flow control, drop specific
> > > > >   packets)
> > > > >
> > > > Looking at it from a slightly higher level, are these APIs really
> > > > going to help in these situations? If something is overloaded,
> > > > doing more work by querying the ring status only makes things
> > > > worse. I suspect that in most cases better results can be got by
> > > > just looking at the results of RX and TX burst functions. For
> > > > example, if RX burst is always returning a full set of packets,
> > > > chances are you are overloaded, or at least have no scope for an
> unexpected burst or event.
> > > >
> > > > Are these really needed for real applications? I suspect our
> > > > trivial l3fwd power example can be made to work ok without them.
> > >
> > > The l3fwd example uses the rte_eth_rx_descriptor_done() API, which
> > > is very similar to what I'm adding here. The differences are:
> > >
> > > - the new lib provides a Tx counterpart
> > > - it differentiates done/avail/hold descriptors
> > >
> > > The alternative was to update the descriptor_done API, but I think
> > > we agreed to do that in this thread:
> > > http://www.dpdk.org/ml/archives/dev/2017-January/054947.html
> > >
> > > About the usefulness of the API, I confirm it is useful: for
> > > instance, you can detect that you ring is more than half-full, and
> > > take dispositions to increase your processing power or select the packets
> you want to drop first.
> > >
> > For either of those cases, you could still implement this in your application
> without any of these APIs. Simply keep reading rx_burst() till it returns zero.
> You now have all the packets that you want - look at how many and increase
> your processing power, or drop them.
> 
> In my use case, I may have several thresholds, so it gives a fast information
> about the ring status.
The problem is that it costs too much to return that status. Let me explain it this way - when processing a burst, it takes the IXGBE driver ~20 - 25 cycles to process a packet. Assuming memory latency is 75ns, a miss to memory costs ~150 cycles at 2.0 GHz, or 225 cycles at 3.0 GHz. Either way, it is between 7 - 10 packet times if you miss to memory. In the case you are suggesting where the CPU isn't able to keep up with the packets, all we have really done is to make it harder for the CPU to keep up. 
Could you use an RX alternative that returns a 0 or 1 if there are more packets remaining in the ring? That will be lower cost to implement (as a separate API or even as a part of the rx_burst API itself). But touching the ring at a random offset is almost always going to be a performance problem. 
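The cycle arithmetic above can be checked with a few lines of code. This
is only a sketch of the calculation (latency in ns times frequency in
GHz gives cycles); the function names are made up for the illustration,
and the inputs are the figures quoted in the paragraph.

```c
#include <assert.h>

/* Cycles lost on one memory miss: latency (ns) times frequency (GHz). */
static double
miss_cycles(double latency_ns, double freq_ghz)
{
	assert(latency_ns >= 0.0 && freq_ghz > 0.0);
	return latency_ns * freq_ghz;
}

/* The same cost expressed in "packet times", for a given per-packet
 * driver cost in cycles. */
static double
miss_packet_times(double latency_ns, double freq_ghz,
		  double cycles_per_pkt)
{
	assert(cycles_per_pkt > 0.0);
	return miss_cycles(latency_ns, freq_ghz) / cycles_per_pkt;
}
```

With a 75 ns latency this gives 150 cycles at 2.0 GHz and 225 cycles at
3.0 GHz, that is 7.5 packet times at 20 cycles per packet and 9 packet
times at 25 cycles per packet.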
> 
> Keeping reading rx_burst() until it returns 0 will not work if the packet rate is
> higher than (or close to) what the cpu is able to eat.
> 
If the packet input rate is higher than what the CPU is capable of handling, reading the packets gives you the option of dropping those that you consider lower priority. If you have an application that takes 400 cycles to process a packet, but the input rate is a packet every 100 cycles, then you have to read the packets, figure out a lighter weight way of making a per-packet drop decision (easy suggestion: use the RX filter API to tag packets) and drop them within 10-20 cycles per packet. Ideally, you would do this by draining some percentage of the descriptor ring and prioritizing and dropping those. A second, even better alternative is to use NIC facilities to prioritize packets into different queues. I don't see how adding another 150 cycles to the budget helps you keep up with packets. 
> >
> > The issue I have with this newer instantiation of the API is that it is
> essentially to pick up a descriptor at a specified offset. In most cases, if you
> plan to read far enough ahead with the API (let's say 16 or 32 ahead, or even
> more), you are almost always guaranteed an L1/L2 miss - essentially making it
> a costly API call. In cases that don't have something like Data Direct I/O
> (DDIO), you are now going to hit memory and stall the CPU for a long time. In
> any case, the API becomes pretty useless unless you want to stay within a
> smaller look ahead offset. The rx_burst() methodology simply works better
> in most circumstances, and allows application level control.
> >
> > So, NAK. My suggestion would be to go back to the older API.
> 
> I don't understand the reason of your nack.
> The old API is there (for Rx it works the same), and it is illustrated in an
> example. Since your arguments also applies to the old API, so why are you
> saying we should keep the older API?
> 
I am not a fan of the old API either. In hindsight, it was a mistake (which we didn't catch in time). As Bruce suggested, the example should be reworked to work without the API, and the API should be deprecated. 
> For Tx, I want to know if I have enough room to send my packets before
> doing it. There is no API yet to do that.
> 
Yes. This could be a lightweight API if it returned a count (txq->nb_tx_free) instead of actually touching the ring, which is what I have a problem with. If the implementation changes to that, that may be okay to do. The Tx API has more merit than the Rx API, but not coded around an offset.
> And yes, this could trigger cache misses, but in some situations it's preferable
> to be a bit slower (all tests are not test-iofwd) and be able to anticipate that
> the ring is getting full.
> 
> 
> Regards,
> Olivier
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-03 18:46           ` Venkatesan, Venky
@ 2017-03-04 20:45             ` Olivier Matz
  2017-03-06 11:02               ` Thomas Monjalon
  0 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-04 20:45 UTC (permalink / raw)
  To: Venkatesan, Venky, thomas.monjalon
  Cc: Richardson, Bruce, dev, Ananyev, Konstantin, Lu, Wenzhuo, Zhang,
	Helin, Wu, Jingjing, adrien.mazarguil, nelio.laranjeiro, Yigit,
	Ferruh
On Fri, 3 Mar 2017 18:46:52 +0000, "Venkatesan, Venky" <venky.venkatesan@intel.com> wrote:
> Hi Olivier, 
> 
> > -----Original Message-----
> > From: Olivier Matz [mailto:olivier.matz@6wind.com]
> > Sent: Friday, March 3, 2017 8:45 AM
> > To: Venkatesan, Venky <venky.venkatesan@intel.com>
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>; dev@dpdk.org;
> > thomas.monjalon@6wind.com; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> > Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> > <jingjing.wu@intel.com>; adrien.mazarguil@6wind.com;
> > nelio.laranjeiro@6wind.com; Yigit, Ferruh <ferruh.yigit@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
> > 
> > Hi Venky,
> > 
> > On Fri, 3 Mar 2017 16:18:52 +0000, "Venkatesan, Venky"
> > <venky.venkatesan@intel.com> wrote:  
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz
> > > > Sent: Thursday, March 2, 2017 8:15 AM
> > > > To: Richardson, Bruce <bruce.richardson@intel.com>
> > > > Cc: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin
> > > > <konstantin.ananyev@intel.com>; Lu, Wenzhuo  
> > <wenzhuo.lu@intel.com>;  
> > > > Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> > > > <jingjing.wu@intel.com>; adrien.mazarguil@6wind.com;
> > > > nelio.laranjeiro@6wind.com; Yigit, Ferruh <ferruh.yigit@intel.com>
> > > > Subject: Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx
> > > > descriptors
> > > >
> > > > On Thu, 2 Mar 2017 15:32:15 +0000, Bruce Richardson
> > > > <bruce.richardson@intel.com> wrote:  
> > > > > On Wed, Mar 01, 2017 at 06:19:06PM +0100, Olivier Matz wrote:  
> > > > > > This patchset introduces a new ethdev API:
> > > > > > - rte_eth_rx_descriptor_status()
> > > > > > - rte_eth_tx_descriptor_status()
> > > > > >
> > > > > > The Rx API is aims to replace rte_eth_rx_descriptor_done() which
> > > > > > does almost the same, but does not differentiate the case of a
> > > > > > descriptor used by the driver (not returned to the hw).
> > > > > >
> > > > > > The usage of these functions can be:
> > > > > > - on Rx, anticipate that the cpu is not fast enough to process
> > > > > >   all incoming packets, and take dispositions to solve the
> > > > > >   problem (add more cpus, drop specific packets, ...)
> > > > > > - on Tx, detect that the link is overloaded, and take dispositions
> > > > > >   to solve the problem (notify flow control, drop specific
> > > > > >   packets)
> > > > > >  
> > > > > Looking at it from a slightly higher level, are these APIs really
> > > > > going to help in these situations? If something is overloaded,
> > > > > doing more work by querying the ring status only makes things
> > > > > worse. I suspect that in most cases better results can be got by
> > > > > just looking at the results of RX and TX burst functions. For
> > > > > example, if RX burst is always returning a full set of packets,
> > > > > chances are you are overloaded, or at least have no scope for an  
> > unexpected burst or event.  
> > > > >
> > > > > Are these really needed for real applications? I suspect our
> > > > > trivial l3fwd power example can be made to work ok without them.  
> > > >
> > > > The l3fwd example uses the rte_eth_rx_descriptor_done() API, which
> > > > is very similar to what I'm adding here. The differences are:
> > > >
> > > > - the new lib provides a Tx counterpart
> > > > - it differentiates done/avail/hold descriptors
> > > >
> > > > The alternative was to update the descriptor_done API, but I think
> > > > we agreed to do that in this thread:
> > > > http://www.dpdk.org/ml/archives/dev/2017-January/054947.html
> > > >
> > > > About the usefulness of the API, I confirm it is useful: for
> > > > instance, you can detect that you ring is more than half-full, and
> > > > take dispositions to increase your processing power or select the packets  
> > you want to drop first.  
> > > >  
> > > For either of those cases, you could still implement this in your application  
> > without any of these APIs. Simply keep reading rx_burst() till it returns zero.
> > You now have all the packets that you want - look at how many and increase
> > your processing power, or drop them.
> > 
> > In my use case, I may have several thresholds, so it gives a fast information
> > about the ring status.  
> 
> The problem is that it costs too much to return that status. Let me explain it this way - when processing a burst, it takes the IXGBE driver ~20 - 25 cycles to process a packet. Assuming memory latency is 75ns, a miss to memory costs ~150 cycles at 2.0 GHz, or 225 cycles at 3.0 GHz. Either way, it is between 7 - 10 packet times if you miss to memory. In the case you are suggesting where the CPU isn't able to keep up with the packets, all we have really done is to make it harder for the CPU to keep up. 
Did you run a test to validate what you say? I did.
Let's add the following patch to testpmd:
--- a/app/test-pmd/iofwd.c
+++ b/app/test-pmd/iofwd.c
@@ -92,6 +92,8 @@ pkt_burst_io_forward(struct fwd_stream *fs)
        start_tsc = rte_rdtsc();
 #endif
+       rte_eth_rx_descriptor_status(fs->rx_port, fs->rx_queue, 128);
+       
        /*
         * Receive a burst of packets and forward them.
         */
If I measure the performance in iofwd (nb_rxd=512), I see no difference
between the results with and without the patch.
To me, your calculation does not apply to a real life application:
- this function is called once per burst (32 or 64), so the penalty
  of 200 cycles (if any) has to be divided by the number of packets
- you need to take into account the number of cycles for the whole
  processing of a packet, not for the ixgbe driver only. If you
  are doing forwarding, you are at least at 100 cycles / packet for
  a benchmark test case, not to mention IPsec.
So, the penalty in the worst case (burst of 32, 100c/pkt) is ~6%.
Given the information it provides, it is acceptable to me.
Note we are talking here about an optional API, that would only impact
people that use it.
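The ~6% figure above comes from dividing the cost of one call by the
cycles spent on a whole burst. A small sketch of that calculation, with
a made-up function name and the numbers from the text as inputs:

```c
#include <assert.h>

/* Relative overhead of one extra call per burst, given the cost of the
 * call in cycles, the burst size and the full per-packet cost. */
static double
per_burst_overhead(double call_cycles, unsigned int burst_size,
		   double cycles_per_pkt)
{
	assert(burst_size > 0 && cycles_per_pkt > 0.0);
	return call_cycles / (burst_size * cycles_per_pkt);
}
```

With 200 cycles per call, a burst of 32 and 100 cycles per packet, the
overhead is 200 / 3200 = 6.25%; with a burst of 64 it halves to about
3.1%.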
> Could you use an RX alternative that returns a 0 or 1 if there are more packets remaining in the ring? That will be lower cost to implement (as a separate API or even as a part of the Rx_burst API itself). But touching the ring at a random offset is almost always going to be a performance problem. 
About returning 0 or 1 if there are more packets in the ring, it does
not solve my issue since I want to know if the number of descriptors is
above or below a configurable threshold.
Also, changing the Rx burst function is much more likely to be refused
than adding an optional API.
> > Keeping reading rx_burst() until it returns 0 will not work if the packet rate is
> > higher than (or close to) what the cpu is able to eat.
> >   
> 
> If the packet input rate is higher than what the CPU is capable of handling, reading the packets gives you the option of dropping those that you consider lower priority - if you have an application that takes  400 cycles to process a packet, but the input rate is a packet every 100 cycles, then what you have to look at is to read the packets, figure out a lighter weight way of determining a drop per packet (easy suggestion, use the RX filter API to tag packets) and drop them within 10-20 cycles per packet. Ideally, you would do this by draining some percentage of the descriptor ring and prioritizing and dropping those. A second, even better alternative is to use NIC facilities to prioritize packets into different queues. I don't see how adding another 150 cycles to the budget helps you keep up with packets. 
The RX filter API is not available on all drivers, and is not as flexible
as a software filter. If I use this new API, I don't need to call this
software filter when the CPU is not overloaded, saving cycles in the common
case.
Trying to read all the packets in the ring before processing them is not
an option either, especially if the ring size is large (4096). This would
consume more mbufs (hw ring + sw processing queue), and it would also break
the mbuf prefetch policies done in the drivers.
> > > The issue I have with this newer instantiation of the API is that it is  
> > essentially to pick up a descriptor at a specified offset. In most cases, if you
> > plan to read far enough ahead with the API (let's say 16 or 32 ahead, or even
> > more), you are almost always guaranteed an L1/L2 miss - essentially making it
> > a costly API call. In cases that don't have something like Data Direct I/O
> > (DDIO), you are now going to hit memory and stall the CPU for a long time. In
> > any case, the API becomes pretty useless unless you want to stay within a
> > smaller look ahead offset. The rx_burst() methodology simply works better
> > in most circumstances, and allows application level control.  
> > >
> > > So, NAK. My suggestion would be to go back to the older API.  
> > 
> > I don't understand the reason of your nack.
> > The old API is there (for Rx it works the same), and it is illustrated in an
> > example. Since your arguments also applies to the old API, so why are you
> > saying we should keep the older API?
> >   
> 
> I am not a fan of the old API either. In hindsight, it was a mistake (which we didn't catch in time). As Bruce suggested, the example should be reworked to work without the API, and deprecate it. 
Before deprecating an API, I think we should check if people are using
it and if it can really be replaced. I think there are many things that
could be deprecated before this one.
> > For Tx, I want to know if I have enough room to send my packets before
> > doing it. There is no API yet to do that.
> >   
> 
> Yes. This could be a lightweight API if it returned a count (txq->nb_tx_free) instead of actually touching the ring, which is what I have a problem with. If the implementation changes to that, that may be okay to do. The Tx API has more merit than the Rx API, but not coded around an offset.
Returning txq->nb_tx_free does not work because it is a software view,
which becomes wrong as soon as the hardware has sent the packets.
Example:
1. Send many packets at a very high rate, the tx ring becomes full
2. Wait until the packets are transmitted
3. Read nb_tx_free, it returns 0, which is not what I want
So in my case there is also a need for a Tx descriptor status API.
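The three steps above can be reproduced with a toy model. This is only a
sketch under stated assumptions: struct toy_txq and the toy_* helpers are
hypothetical and are not taken from any PMD. The model assumes that the
software counter is refreshed only from the transmit path, while the
hardware reports completion through a per-descriptor "done" bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u

/* Values mirror RTE_ETH_TX_DESC_FULL / RTE_ETH_TX_DESC_DONE of the patch. */
enum { TOY_TX_DESC_FULL = 0, TOY_TX_DESC_DONE = 1 };

/* Hypothetical toy Tx queue, not a real driver structure. */
struct toy_txq {
	bool dd[TOY_RING_SIZE]; /* "descriptor done" bit, written by the hw */
	uint16_t tx_tail;       /* next descriptor the software will fill */
	uint16_t nb_tx_free;    /* software view, refreshed on transmit only */
};

static void toy_txq_init(struct toy_txq *q)
{
	for (unsigned int i = 0; i < TOY_RING_SIZE; i++)
		q->dd[i] = true; /* every descriptor is reusable at start */
	q->tx_tail = 0;
	q->nb_tx_free = TOY_RING_SIZE;
}

/* Refresh the software counter from the hw-written done bits. */
static void toy_reclaim(struct toy_txq *q)
{
	uint16_t n = 0;

	for (unsigned int i = 0; i < TOY_RING_SIZE; i++)
		n += q->dd[i] ? 1 : 0;
	q->nb_tx_free = n;
}

/* Software side: queue up to n packets, return how many were accepted. */
static uint16_t toy_tx_burst(struct toy_txq *q, uint16_t n)
{
	uint16_t sent = 0;

	toy_reclaim(q); /* the only place where nb_tx_free is refreshed */
	while (sent < n && q->nb_tx_free > 0) {
		q->dd[q->tx_tail] = false;
		q->tx_tail = (uint16_t)((q->tx_tail + 1u) % TOY_RING_SIZE);
		q->nb_tx_free--;
		sent++;
	}
	return sent;
}

/* Hardware side: everything pending is transmitted, only dd bits change. */
static void toy_hw_transmit_all(struct toy_txq *q)
{
	for (unsigned int i = 0; i < TOY_RING_SIZE; i++)
		q->dd[i] = true;
}

/* Status of the descriptor located 'offset' entries after the tail. */
static int toy_tx_descriptor_status(const struct toy_txq *q, uint16_t offset)
{
	assert(offset < TOY_RING_SIZE);
	return q->dd[(q->tx_tail + offset) % TOY_RING_SIZE] ?
		TOY_TX_DESC_DONE : TOY_TX_DESC_FULL;
}
```

In this model, after filling the ring and letting the toy hardware drain it,
nb_tx_free still reads 0 until the next toy_tx_burst() call, whereas
toy_tx_descriptor_status(q, 0) already reports the descriptor as done.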
Thomas, you are the maintainer of ethdev API, do you have an opinion?
Regards,
Olivier
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-02 13:40     ` Olivier Matz
@ 2017-03-06 10:41       ` Thomas Monjalon
  0 siblings, 0 replies; 72+ messages in thread
From: Thomas Monjalon @ 2017-03-06 10:41 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Andrew Rybchenko, dev, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit, bruce.richardson
2017-03-02 14:40, Olivier Matz:
> On Wed, 1 Mar 2017 21:02:16 +0300, Andrew Rybchenko <arybchenko@solarflare.com> wrote:
> > On 03/01/2017 08:19 PM, Olivier Matz wrote:
> > > This patchset introduces a new ethdev API:
> > > - rte_eth_rx_descriptor_status()
> > > - rte_eth_tx_descriptor_status()  
> > 
> > May be corresponding features should be added to the NICs documentation?
> 
> Yes, good idea.
> 
> I propose to use these straightforward names: "Rx Descriptor Status"
> and "Tx Descriptor Status".
Yes
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-02 13:43     ` Olivier Matz
@ 2017-03-06 10:41       ` Thomas Monjalon
  0 siblings, 0 replies; 72+ messages in thread
From: Thomas Monjalon @ 2017-03-06 10:41 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Stephen Hemminger, dev, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro,
	ferruh.yigit, bruce.richardson
2017-03-02 14:43, Olivier Matz:
> Hi Stephen,
> 
> On Wed, 1 Mar 2017 10:07:06 -0800, Stephen Hemminger <stephen@networkplumber.org> wrote:
> > On Wed,  1 Mar 2017 18:19:06 +0100
> > Olivier Matz <olivier.matz@6wind.com> wrote:
> > 
> > > This patchset introduces a new ethdev API:
> > > - rte_eth_rx_descriptor_status()
> > > - rte_eth_tx_descriptor_status()
[...]
> > Could you update examples to use this?
> 
> I can update examples/l3fwd-power, but it will break the
> support for drivers that do not implement the new API. Maybe we could
> do this in a second step, after all drivers are converted?
Yes, good idea
* Re: [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors
  2017-03-04 20:45             ` Olivier Matz
@ 2017-03-06 11:02               ` Thomas Monjalon
  0 siblings, 0 replies; 72+ messages in thread
From: Thomas Monjalon @ 2017-03-06 11:02 UTC (permalink / raw)
  To: Olivier Matz
  Cc: Venkatesan, Venky, Richardson, Bruce, dev, Ananyev, Konstantin,
	Lu, Wenzhuo, Zhang, Helin, Wu, Jingjing, adrien.mazarguil,
	nelio.laranjeiro, Yigit, Ferruh
2017-03-04 21:45, Olivier Matz:
> "Venkatesan, Venky" <venky.venkatesan@intel.com> wrote:
> > From: Olivier Matz
> > > On Fri, 3 Mar 2017 16:18:52 +0000, "Venkatesan, Venky" wrote:  
> > > > From: Olivier Matz
> > > > > On Thu, 2 Mar 2017 15:32:15 +0000, Bruce Richardson wrote:  
> > > > > > On Wed, Mar 01, 2017 at 06:19:06PM +0100, Olivier Matz wrote:  
> > > > > > > The usage of these functions can be:
> > > > > > > - on Rx, anticipate that the cpu is not fast enough to process
> > > > > > >   all incoming packets, and take dispositions to solve the
> > > > > > >   problem (add more cpus, drop specific packets, ...)
> > > > > > > - on Tx, detect that the link is overloaded, and take dispositions
> > > > > > >   to solve the problem (notify flow control, drop specific
> > > > > > >   packets)
> > > > > > >  
[...]
> > > > > > Are these really needed for real applications? I suspect our
> > > > > > trivial l3fwd power example can be made to work ok without them.
OK, please remove the use of this old API from the example.
[...]
> So, the penalty, in the worst case (burst of 32, 100c/pkt) is ~6%.
> Given the information it provides, it is acceptable to me.
Any penalty is acceptable, given it is not mandatory to call these functions.
> Note we are talking here about an optional API, that would only impact
> people that use it.
Yes, it just brings more information and can be used for some debug measures.
[...]
> Also, changing the Rx burst function is much more likely to be refused
> than adding an optional API.
Yes, changing the Rx/Tx API is not really an option and does not bring much
benefit.
[...]
> > > > So, NAK. My suggestion would be to go back to the older API.  
> > > 
> > > I don't understand the reason of your nack.
> > > The old API is there (for Rx it works the same), and it is illustrated in an
> > > example. Since your arguments also apply to the old API, why are you
> > > saying we should keep the older API?
> > >   
> > 
> > I am not a fan of the old API either. In hindsight, it was a mistake
> > (which we didn't catch in time). As Bruce suggested, the example
> > should be reworked to work without the API, and deprecate it.
Agreed to deprecate the old API.
However, there is no relation with this new optional API.
> Before deprecating an API, I think we should check if people are using
> it and if it can really be replaced. I think there are many things that
> could be deprecated before this one.
Yes we can discuss a lot of things but let's focus on this one :)
> > > For Tx, I want to know if I have enough room to send my packets before
> > > doing it. There is no API yet to do that.
> > 
> > Yes. This could be a lightweight API if it returned a count (txq->nb_tx_free) instead of actually touching the ring, which is what I have a problem with. If the implementation changes to that, that may be okay to do. The Tx API has more merit than the Rx API, but not coded around an offset.
> 
> Returning txq->nb_tx_free does not work because it is a software view,
> which becomes wrong as soon as the hardware has sent the packets.
> Example:
> 1. Send many packets at a very high rate, the tx ring becomes full
> 2. Wait until the packets are transmitted
> 3. Read nb_tx_free, it returns 0, which is not what I want
> 
> So in my case there is also a need for a Tx descriptor status API.
> 
> Thomas, you are the maintainer of ethdev API, do you have an opinion?
You show some benefits and it does not hurt any existing API.
So we cannot reject such a feature, even if its best use is for debug
or specific applications.
I think the concern here was the fear of seeing this called in some
benchmark applications. You just have to highlight in the API doc that
there are some performance penalties.
* [dpdk-dev] [PATCH v2 0/6] get status of Rx and Tx descriptors
  2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
                     ` (8 preceding siblings ...)
  2017-03-02 15:32   ` Bruce Richardson
@ 2017-03-07 15:59   ` Olivier Matz
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API Olivier Matz
                       ` (6 more replies)
  9 siblings, 7 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-07 15:59 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko
This patchset introduces a new ethdev API:
- rte_eth_rx_descriptor_status()
- rte_eth_tx_descriptor_status()
The Rx API aims to replace rte_eth_rx_descriptor_done() which
does almost the same, but does not differentiate the case of a
descriptor used by the driver (not returned to the hw).
The usage of these functions can be:
- on Rx, anticipate that the cpu is not fast enough to process
  all incoming packets, and take steps to solve the
  problem (add more cpus, drop specific packets, ...)
- on Tx, detect that the link is overloaded, and take steps
  to solve the problem (notify flow control, drop specific
  packets)
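As a rough illustration of the Rx use case, an application could probe a
single descriptor at a chosen watermark instead of counting the whole ring.
The sketch below is self-contained and does not call the ethdev API: struct
toy_rxq, its fields and the toy_* functions are hypothetical stand-ins, and
only the three status values follow the constants introduced by this
patchset.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Values mirror the RTE_ETH_RX_DESC_* constants of the patchset. */
enum {
	TOY_RX_DESC_AVAIL = 0,   /* available for the hw */
	TOY_RX_DESC_DONE = 1,    /* filled by the hw, not yet read by sw */
	TOY_RX_DESC_UNAVAIL = 2, /* held by the driver or reserved by hw */
};

/*
 * Hypothetical toy Rx queue. The model assumes
 * nb_done + nb_hold <= nb_rx_desc.
 */
struct toy_rxq {
	uint16_t nb_rx_desc; /* ring size */
	uint16_t nb_done;    /* descs filled by hw, starting at offset 0 */
	uint16_t nb_hold;    /* descs held by the driver, at the ring end */
};

/* Offset 0 is the next packet to be received by the driver. */
static int toy_rx_descriptor_status(const struct toy_rxq *q, uint16_t offset)
{
	if (offset >= q->nb_rx_desc)
		return -EINVAL;
	if (offset >= q->nb_rx_desc - q->nb_hold)
		return TOY_RX_DESC_UNAVAIL;
	if (offset < q->nb_done)
		return TOY_RX_DESC_DONE;
	return TOY_RX_DESC_AVAIL;
}

/*
 * True when at least 'watermark' packets are waiting in the ring.
 * A single probe is enough because the hw fills descriptors in order.
 */
static bool toy_rx_is_backlogged(const struct toy_rxq *q, uint16_t watermark)
{
	assert(watermark > 0);
	return toy_rx_descriptor_status(q, (uint16_t)(watermark - 1)) ==
		TOY_RX_DESC_DONE;
}
```

With a 512-entry ring, a caller could test a watermark of 256 to learn that
the ring is at least half full and only then enable a more expensive drop
policy. The watermark value is an application choice, not something defined
by the patchset.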
The patchset updates ixgbe, i40e, e1000, mlx5.
The other drivers that implement the descriptor_done() API are
fm10k, sfc, virtio. They are not updated.
If the new API is accepted, the descriptor_done() can be deprecated,
and examples/l3fwd-power will be updated too.
v1->v2:
- replace RTE_ETH_RX_DESC_USED by RTE_ETH_RX_DESC_UNAVAIL: it can be used when
  the descriptor is held by the driver or reserved by the hardware.
- add RTE_ETH_TX_DESC_UNAVAIL (same for Tx)
- change the ethdev callback api to use a queue pointer instead of port_id
  and queue_id
- like rx_burst/tx_burst, do not check the validity of port_id and queue_id
  except in debug mode
- better document the calling context, error status, possible performance
  impact
- add the feature in NIC documentation
- fix overflow of descriptor value in tx functions (ixgbe, igb, em)
- fix tx function to only check descs that have the rs bit (i40e)
- mlx: remove empty line
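The Tx index arithmetic touched by the overflow and RS-bit items can be
checked in isolation. The helper below mirrors the computation found in the
ixgbe Tx hunk of this series; the function name toy_next_rs_desc() is
hypothetical, and the sketch assumes that nb_tx_desc is a multiple of
tx_rs_thresh. A 32-bit intermediate is used because tail + offset, once
rounded up, can reach twice the ring size, which is why two wrap-around
subtractions are needed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the ring index that the status check reads for the descriptor
 * located 'offset' entries after the tail: tail + offset is rounded up
 * to the next multiple of tx_rs_thresh, then wrapped into the ring.
 */
static uint32_t toy_next_rs_desc(uint16_t tx_tail, uint16_t offset,
				 uint16_t tx_rs_thresh, uint16_t nb_tx_desc)
{
	uint32_t desc;

	/* Assumptions of this sketch, checked as preconditions. */
	assert(tx_rs_thresh > 0 && nb_tx_desc % tx_rs_thresh == 0);
	assert(tx_tail < nb_tx_desc && offset < nb_tx_desc);

	desc = (uint32_t)tx_tail + offset;
	/* round up to a multiple of tx_rs_thresh */
	desc = ((desc + tx_rs_thresh - 1) / tx_rs_thresh) * tx_rs_thresh;
	/* the rounded value can be as large as 2 * nb_tx_desc */
	if (desc >= nb_tx_desc) {
		desc -= nb_tx_desc;
		if (desc >= nb_tx_desc)
			desc -= nb_tx_desc;
	}
	return desc;
}
```

For example, with a ring of 512 descriptors and a threshold of 32, a tail of
500 and an offset of 510 give 1010, which rounds up to 1024 and wraps twice
to index 0.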
RFC->v1:
- instead of optimizing an API that returns the number of used
  descriptors like rx_queue_count(), use a simpler API that
  returns the status of a descriptor, like rx_descriptor_done().
- remove ethdev api rework (first 2 patches), they have been
  sent separately
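An application that still needs a count can derive one from a per-descriptor
status callback. The RFC cover letter reported that a binary search was a
good compromise for this. The sketch below assumes that "done" descriptors
form a contiguous run starting at offset 0. The callback type has the same
shape as the eth_rx_descriptor_status_t typedef added in patch 1/6, but
toy_status_fn, toy_status() and toy_count_done() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_RX_DESC_AVAIL = 0, TOY_RX_DESC_DONE = 1 };

/* Same shape as the per-queue status callback: (queue, offset) -> status. */
typedef int (*toy_status_fn)(void *queue, uint16_t offset);

/* Toy callback: 'queue' points to the number of descriptors that are done. */
static int toy_status(void *queue, uint16_t offset)
{
	const uint16_t *nb_done = queue;

	return offset < *nb_done ? TOY_RX_DESC_DONE : TOY_RX_DESC_AVAIL;
}

/*
 * Count the done descriptors with O(log n) status probes by searching
 * for the first offset that is not done.
 */
static uint16_t toy_count_done(toy_status_fn status, void *queue,
			       uint16_t nb_desc)
{
	uint16_t lo = 0, hi = nb_desc; /* the answer lies in [lo, hi] */

	assert(status != NULL);
	while (lo < hi) {
		uint16_t mid = (uint16_t)(lo + (hi - lo) / 2);

		if (status(queue, mid) == TOY_RX_DESC_DONE)
			lo = (uint16_t)(mid + 1); /* first not-done is after mid */
		else
			hi = mid;                 /* first not-done is at or before mid */
	}
	return lo;
}
```

Each probe may still miss the cache, so this trades the linear scan for about
log2(ring size) descriptor reads rather than removing the cost entirely.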
Olivier Matz (6):
  ethdev: add descriptor status API
  net/ixgbe: implement descriptor status API
  net/e1000: implement descriptor status API (igb)
  net/e1000: implement descriptor status API (em)
  net/mlx5: implement descriptor status API
  net/i40e: implement descriptor status API
 doc/guides/nics/features/default.ini      |   2 +
 doc/guides/nics/features/e1000.ini        |   2 +
 doc/guides/nics/features/i40e.ini         |   2 +
 doc/guides/nics/features/i40e_vec.ini     |   2 +
 doc/guides/nics/features/i40e_vf.ini      |   2 +
 doc/guides/nics/features/i40e_vf_vec.ini  |   2 +
 doc/guides/nics/features/igb.ini          |   2 +
 doc/guides/nics/features/igb_vf.ini       |   2 +
 doc/guides/nics/features/ixgbe.ini        |   2 +
 doc/guides/nics/features/ixgbe_vec.ini    |   2 +
 doc/guides/nics/features/ixgbe_vf.ini     |   2 +
 doc/guides/nics/features/ixgbe_vf_vec.ini |   2 +
 doc/guides/nics/features/mlx5.ini         |   2 +
 drivers/net/e1000/e1000_ethdev.h          |   6 ++
 drivers/net/e1000/em_ethdev.c             |   2 +
 drivers/net/e1000/em_rxtx.c               |  51 ++++++++++++
 drivers/net/e1000/igb_ethdev.c            |   2 +
 drivers/net/e1000/igb_rxtx.c              |  45 +++++++++++
 drivers/net/i40e/i40e_ethdev.c            |   2 +
 drivers/net/i40e/i40e_ethdev_vf.c         |   2 +
 drivers/net/i40e/i40e_rxtx.c              |  58 ++++++++++++++
 drivers/net/i40e/i40e_rxtx.h              |   2 +
 drivers/net/ixgbe/ixgbe_ethdev.c          |   4 +
 drivers/net/ixgbe/ixgbe_ethdev.h          |   3 +
 drivers/net/ixgbe/ixgbe_rxtx.c            |  57 ++++++++++++++
 drivers/net/mlx5/mlx5.c                   |   2 +
 drivers/net/mlx5/mlx5_rxtx.c              |  76 ++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.h              |   2 +
 lib/librte_ether/rte_ethdev.h             | 125 ++++++++++++++++++++++++++++++
 29 files changed, 465 insertions(+)
-- 
2.8.1
* [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API
  2017-03-07 15:59   ` [dpdk-dev] [PATCH v2 " Olivier Matz
@ 2017-03-07 15:59     ` Olivier Matz
  2017-03-09 11:49       ` Andrew Rybchenko
  2017-03-21  8:32       ` Yang, Qiming
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 2/6] net/ixgbe: implement " Olivier Matz
                       ` (5 subsequent siblings)
  6 siblings, 2 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-07 15:59 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko
Introduce a new API to get the status of a descriptor.
For Rx, it is similar to the rx_descriptor_done API, except that it
differentiates "used" descriptors (which are held by the driver and not
returned to the hardware).
For Tx, it is a new API.
The descriptor_done() API, and probably the rx_queue_count() API could
be replaced by this new API as soon as it is implemented on all PMDs.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/nics/features/default.ini |   2 +
 lib/librte_ether/rte_ethdev.h        | 125 +++++++++++++++++++++++++++++++++++
 2 files changed, 127 insertions(+)
diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index 9e363ff..0e6a78d 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -49,6 +49,8 @@ Inner L3 checksum    =
 Inner L4 checksum    =
 Packet type parsing  =
 Timesync             =
+Rx Descriptor Status =
+Tx Descriptor Status =
 Basic stats          =
 Extended stats       =
 Stats per queue      =
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 97f3e2d..904ecbe 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1179,6 +1179,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
 typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
 /**< @internal Check DD bit of specific RX descriptor */
 
+typedef int (*eth_rx_descriptor_status_t)(void *rxq, uint16_t offset);
+/**< @internal Check the status of a Rx descriptor */
+
+typedef int (*eth_tx_descriptor_status_t)(void *txq, uint16_t offset);
+/**< @internal Check the status of a Tx descriptor */
+
 typedef int (*eth_fw_version_get_t)(struct rte_eth_dev *dev,
 				     char *fw_version, size_t fw_size);
 /**< @internal Get firmware information of an Ethernet device. */
@@ -1483,6 +1489,10 @@ struct eth_dev_ops {
 	eth_queue_release_t        rx_queue_release; /**< Release RX queue. */
 	eth_rx_queue_count_t       rx_queue_count;/**< Get Rx queue count. */
 	eth_rx_descriptor_done_t   rx_descriptor_done; /**< Check rxd DD bit. */
+	eth_rx_descriptor_status_t rx_descriptor_status;
+	/**< Check the status of a Rx descriptor. */
+	eth_tx_descriptor_status_t tx_descriptor_status;
+	/**< Check the status of a Tx descriptor. */
 	eth_rx_enable_intr_t       rx_queue_intr_enable;  /**< Enable Rx queue interrupt. */
 	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
 	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
@@ -2768,6 +2778,121 @@ rte_eth_rx_descriptor_done(uint8_t port_id, uint16_t queue_id, uint16_t offset)
 		dev->data->rx_queues[queue_id], offset);
 }
 
+#define RTE_ETH_RX_DESC_AVAIL    0 /**< Desc available for hw. */
+#define RTE_ETH_RX_DESC_DONE     1 /**< Desc done, filled by hw. */
+#define RTE_ETH_RX_DESC_UNAVAIL  2 /**< Desc used by driver or hw. */
+
+/**
+ * Check the status of a Rx descriptor in the queue
+ *
+ * It should be called in the same context as the Rx function:
+ * - on a dataplane core
+ * - not concurrently on the same queue
+ *
+ * Since it's a dataplane function, no check is performed on port_id and
+ * queue_id. The caller must therefore ensure that the port is enabled
+ * and the queue is configured and running.
+ *
+ * Note: accessing a random descriptor in the ring may trigger cache
+ * misses and have a performance impact.
+ *
+ * @param port_id
+ *  A valid port identifier of the Ethernet device.
+ * @param queue_id
+ *  A valid Rx queue identifier on this port.
+ * @param offset
+ *  The offset of the descriptor starting from tail (0 is the next
+ *  packet to be received by the driver).
+ *
+ * @return
+ *  - (RTE_ETH_RX_DESC_AVAIL): Descriptor is available for the hardware to
+ *    receive a packet.
+ *  - (RTE_ETH_RX_DESC_DONE): Descriptor is done, it is filled by hw, but
+ *    not yet processed by the driver (i.e. in the receive queue).
+ *  - (RTE_ETH_RX_DESC_UNAVAIL): Descriptor is unavailable, either held by
+ *    the driver and not yet returned to hw, or reserved by the hw.
+ *  - (-EINVAL) bad descriptor offset.
+ *  - (-ENOTSUP) if the device does not support this function.
+ *  - (-ENODEV) bad port or queue (only if compiled with debug).
+ */
+static inline int
+rte_eth_rx_descriptor_status(uint8_t port_id, uint16_t queue_id,
+	uint16_t offset)
+{
+	struct rte_eth_dev *dev;
+	void *rxq;
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+#endif
+	dev = &rte_eth_devices[port_id];
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	if (queue_id >= dev->data->nb_rx_queues)
+		return -ENODEV;
+#endif
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_descriptor_status, -ENOTSUP);
+	rxq = dev->data->rx_queues[queue_id];
+
+	return (*dev->dev_ops->rx_descriptor_status)(rxq, offset);
+}
+
+#define RTE_ETH_TX_DESC_FULL    0 /**< Desc filled for hw, waiting xmit. */
+#define RTE_ETH_TX_DESC_DONE    1 /**< Desc done, packet is transmitted. */
+#define RTE_ETH_TX_DESC_UNAVAIL 2 /**< Desc used by driver or hw. */
+
+/**
+ * Check the status of a Tx descriptor in the queue.
+ *
+ * It should be called in the same context as the Tx function:
+ * - on a dataplane core
+ * - not concurrently on the same queue
+ *
+ * Since it's a dataplane function, no check is performed on port_id and
+ * queue_id. The caller must therefore ensure that the port is enabled
+ * and the queue is configured and running.
+ *
+ * Note: accessing a random descriptor in the ring may trigger cache
+ * misses and have a performance impact.
+ *
+ * @param port_id
+ *  A valid port identifier of the Ethernet device.
+ * @param queue_id
+ *  A valid Tx queue identifier on this port.
+ * @param offset
+ *  The offset of the descriptor starting from tail (0 is the place where
+ *  the next packet will be sent).
+ *
+ * @return
+ *  - (RTE_ETH_TX_DESC_FULL) Descriptor is being processed by the hw, i.e.
+ *    in the transmit queue.
+ *  - (RTE_ETH_TX_DESC_DONE) Hardware is done with this descriptor, it can
+ *    be reused by the driver.
+ *  - (RTE_ETH_TX_DESC_UNAVAIL): Descriptor is unavailable, reserved by the
+ *    driver or the hardware.
+ *  - (-EINVAL) bad descriptor offset.
+ *  - (-ENOTSUP) if the device does not support this function.
+ *  - (-ENODEV) bad port or queue (only if compiled with debug).
+ */
+static inline int rte_eth_tx_descriptor_status(uint8_t port_id,
+	uint16_t queue_id, uint16_t offset)
+{
+	struct rte_eth_dev *dev;
+	void *txq;
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+#endif
+	dev = &rte_eth_devices[port_id];
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	if (queue_id >= dev->data->nb_tx_queues)
+		return -ENODEV;
+#endif
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_descriptor_status, -ENOTSUP);
+	txq = dev->data->tx_queues[queue_id];
+
+	return (*dev->dev_ops->tx_descriptor_status)(txq, offset);
+}
+
 /**
  * Send a burst of output packets on a transmit queue of an Ethernet device.
  *
-- 
2.8.1
* [dpdk-dev] [PATCH v2 2/6] net/ixgbe: implement descriptor status API
  2017-03-07 15:59   ` [dpdk-dev] [PATCH v2 " Olivier Matz
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API Olivier Matz
@ 2017-03-07 15:59     ` Olivier Matz
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 3/6] net/e1000: implement descriptor status API (igb) Olivier Matz
                       ` (4 subsequent siblings)
  6 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-07 15:59 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/nics/features/ixgbe.ini        |  2 ++
 doc/guides/nics/features/ixgbe_vec.ini    |  2 ++
 doc/guides/nics/features/ixgbe_vf.ini     |  2 ++
 doc/guides/nics/features/ixgbe_vf_vec.ini |  2 ++
 drivers/net/ixgbe/ixgbe_ethdev.c          |  4 +++
 drivers/net/ixgbe/ixgbe_ethdev.h          |  3 ++
 drivers/net/ixgbe/ixgbe_rxtx.c            | 57 +++++++++++++++++++++++++++++++
 7 files changed, 72 insertions(+)
diff --git a/doc/guides/nics/features/ixgbe.ini b/doc/guides/nics/features/ixgbe.ini
index e65bbb8..d59ed43 100644
--- a/doc/guides/nics/features/ixgbe.ini
+++ b/doc/guides/nics/features/ixgbe.ini
@@ -42,6 +42,8 @@ Inner L3 checksum    = Y
 Inner L4 checksum    = Y
 Packet type parsing  = Y
 Timesync             = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Stats per queue      = Y
diff --git a/doc/guides/nics/features/ixgbe_vec.ini b/doc/guides/nics/features/ixgbe_vec.ini
index e1773dd..1a9326e 100644
--- a/doc/guides/nics/features/ixgbe_vec.ini
+++ b/doc/guides/nics/features/ixgbe_vec.ini
@@ -32,6 +32,8 @@ Flow control         = Y
 Rate limitation      = Y
 Traffic mirroring    = Y
 Timesync             = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Stats per queue      = Y
diff --git a/doc/guides/nics/features/ixgbe_vf.ini b/doc/guides/nics/features/ixgbe_vf.ini
index bf28215..8be1db8 100644
--- a/doc/guides/nics/features/ixgbe_vf.ini
+++ b/doc/guides/nics/features/ixgbe_vf.ini
@@ -25,6 +25,8 @@ L4 checksum offload  = Y
 Inner L3 checksum    = Y
 Inner L4 checksum    = Y
 Packet type parsing  = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Registers dump       = Y
diff --git a/doc/guides/nics/features/ixgbe_vf_vec.ini b/doc/guides/nics/features/ixgbe_vf_vec.ini
index 8b8c90b..f02251f 100644
--- a/doc/guides/nics/features/ixgbe_vf_vec.ini
+++ b/doc/guides/nics/features/ixgbe_vf_vec.ini
@@ -17,6 +17,8 @@ RSS hash             = Y
 RSS key update       = Y
 RSS reta update      = Y
 VLAN filter          = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Registers dump       = Y
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 7169007..34bd681 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -554,6 +554,8 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_queue_count       = ixgbe_dev_rx_queue_count,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
+	.rx_descriptor_status = ixgbe_dev_rx_descriptor_status,
+	.tx_descriptor_status = ixgbe_dev_tx_descriptor_status,
 	.tx_queue_setup       = ixgbe_dev_tx_queue_setup,
 	.tx_queue_release     = ixgbe_dev_tx_queue_release,
 	.dev_led_on           = ixgbe_dev_led_on,
@@ -632,6 +634,8 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
 	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
+	.rx_descriptor_status = ixgbe_dev_rx_descriptor_status,
+	.tx_descriptor_status = ixgbe_dev_tx_descriptor_status,
 	.tx_queue_setup       = ixgbe_dev_tx_queue_setup,
 	.tx_queue_release     = ixgbe_dev_tx_queue_release,
 	.rx_queue_intr_enable = ixgbevf_dev_rx_queue_intr_enable,
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 680d5d9..fc11d20 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -516,6 +516,9 @@ uint32_t ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev,
 int ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
 int ixgbevf_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
+int ixgbe_dev_rx_descriptor_status(void *rx_queue, uint16_t offset);
+int ixgbe_dev_tx_descriptor_status(void *tx_queue, uint16_t offset);
+
 int ixgbe_dev_rx_init(struct rte_eth_dev *dev);
 
 void ixgbe_dev_tx_init(struct rte_eth_dev *dev);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 9502432..216079a 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -2950,6 +2950,63 @@ ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset)
 			rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD));
 }
 
+int
+ixgbe_dev_rx_descriptor_status(void *rx_queue, uint16_t offset)
+{
+	struct ixgbe_rx_queue *rxq = rx_queue;
+	volatile uint32_t *status;
+	uint32_t nb_hold, desc;
+
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+#ifdef RTE_IXGBE_INC_VECTOR
+	if (rxq->rx_using_sse)
+		nb_hold = rxq->rxrearm_nb;
+	else
+#endif
+		nb_hold = rxq->nb_rx_hold;
+	if (offset >= rxq->nb_rx_desc - nb_hold)
+		return RTE_ETH_RX_DESC_UNAVAIL;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].wb.upper.status_error;
+	if (*status & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+ixgbe_dev_tx_descriptor_status(void *tx_queue, uint16_t offset)
+{
+	struct ixgbe_tx_queue *txq = tx_queue;
+	volatile uint32_t *status;
+	uint32_t desc;
+
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	/* go to next desc that has the RS bit */
+	desc = ((desc + txq->tx_rs_thresh - 1) / txq->tx_rs_thresh) *
+		txq->tx_rs_thresh;
+	if (desc >= txq->nb_tx_desc) {
+		desc -= txq->nb_tx_desc;
+		if (desc >= txq->nb_tx_desc)
+			desc -= txq->nb_tx_desc;
+	}
+
+	status = &txq->tx_ring[desc].wb.status;
+	if (*status & rte_cpu_to_le_32(IXGBE_ADVTXD_STAT_DD))
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
 void __attribute__((cold))
 ixgbe_dev_clear_queues(struct rte_eth_dev *dev)
 {
-- 
2.8.1
* [dpdk-dev] [PATCH v2 3/6] net/e1000: implement descriptor status API (igb)
  2017-03-07 15:59   ` [dpdk-dev] [PATCH v2 " Olivier Matz
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API Olivier Matz
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 2/6] net/ixgbe: implement " Olivier Matz
@ 2017-03-07 15:59     ` Olivier Matz
  2017-03-08  1:17       ` Lu, Wenzhuo
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 4/6] net/e1000: implement descriptor status API (em) Olivier Matz
                       ` (3 subsequent siblings)
  6 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-07 15:59 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/nics/features/igb.ini    |  2 ++
 doc/guides/nics/features/igb_vf.ini |  2 ++
 drivers/net/e1000/e1000_ethdev.h    |  3 +++
 drivers/net/e1000/igb_ethdev.c      |  2 ++
 drivers/net/e1000/igb_rxtx.c        | 45 +++++++++++++++++++++++++++++++++++++
 5 files changed, 54 insertions(+)
diff --git a/doc/guides/nics/features/igb.ini b/doc/guides/nics/features/igb.ini
index 26ae008..6a7df60 100644
--- a/doc/guides/nics/features/igb.ini
+++ b/doc/guides/nics/features/igb.ini
@@ -33,6 +33,8 @@ L3 checksum offload  = Y
 L4 checksum offload  = Y
 Packet type parsing  = Y
 Timesync             = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 FW version           = Y
diff --git a/doc/guides/nics/features/igb_vf.ini b/doc/guides/nics/features/igb_vf.ini
index b617820..653b5da 100644
--- a/doc/guides/nics/features/igb_vf.ini
+++ b/doc/guides/nics/features/igb_vf.ini
@@ -17,6 +17,8 @@ QinQ offload         = Y
 L3 checksum offload  = Y
 L4 checksum offload  = Y
 Packet type parsing  = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Registers dump       = Y
diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index 81a6dbb..564438d 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -311,6 +311,9 @@ uint32_t eth_igb_rx_queue_count(struct rte_eth_dev *dev,
 
 int eth_igb_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
+int eth_igb_rx_descriptor_status(void *rx_queue, uint16_t offset);
+int eth_igb_tx_descriptor_status(void *tx_queue, uint16_t offset);
+
 int eth_igb_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 		uint16_t nb_tx_desc, unsigned int socket_id,
 		const struct rte_eth_txconf *tx_conf);
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index a112b38..f6ed824 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -406,6 +406,8 @@ static const struct eth_dev_ops eth_igb_ops = {
 	.rx_queue_release     = eth_igb_rx_queue_release,
 	.rx_queue_count       = eth_igb_rx_queue_count,
 	.rx_descriptor_done   = eth_igb_rx_descriptor_done,
+	.rx_descriptor_status = eth_igb_rx_descriptor_status,
+	.tx_descriptor_status = eth_igb_tx_descriptor_status,
 	.tx_queue_setup       = eth_igb_tx_queue_setup,
 	.tx_queue_release     = eth_igb_tx_queue_release,
 	.dev_led_on           = eth_igb_led_on,
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index c9cf392..7190738 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -1606,6 +1606,51 @@ eth_igb_rx_descriptor_done(void *rx_queue, uint16_t offset)
 	return !!(rxdp->wb.upper.status_error & E1000_RXD_STAT_DD);
 }
 
+int
+eth_igb_rx_descriptor_status(void *rx_queue, uint16_t offset)
+{
+	struct igb_rx_queue *rxq = rx_queue;
+	volatile uint32_t *status;
+	uint32_t desc;
+
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+	if (offset >= rxq->nb_rx_desc - rxq->nb_rx_hold)
+		return RTE_ETH_RX_DESC_UNAVAIL;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].wb.upper.status_error;
+	if (*status & rte_cpu_to_le_32(E1000_RXD_STAT_DD))
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+eth_igb_tx_descriptor_status(void *tx_queue, uint16_t offset)
+{
+	struct igb_tx_queue *txq = tx_queue;
+	volatile uint32_t *status;
+	uint32_t desc;
+
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	if (desc >= txq->nb_tx_desc)
+		desc -= txq->nb_tx_desc;
+
+	status = &txq->tx_ring[desc].wb.status;
+	if (*status & rte_cpu_to_le_32(E1000_TXD_STAT_DD))
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
 void
 igb_dev_clear_queues(struct rte_eth_dev *dev)
 {
-- 
2.8.1
* [dpdk-dev] [PATCH v2 4/6] net/e1000: implement descriptor status API (em)
  2017-03-07 15:59   ` [dpdk-dev] [PATCH v2 " Olivier Matz
                       ` (2 preceding siblings ...)
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 3/6] net/e1000: implement descriptor status API (igb) Olivier Matz
@ 2017-03-07 15:59     ` Olivier Matz
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 5/6] net/mlx5: implement descriptor status API Olivier Matz
                       ` (2 subsequent siblings)
  6 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-07 15:59 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/nics/features/e1000.ini |  2 ++
 drivers/net/e1000/e1000_ethdev.h   |  3 +++
 drivers/net/e1000/em_ethdev.c      |  2 ++
 drivers/net/e1000/em_rxtx.c        | 51 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 58 insertions(+)
diff --git a/doc/guides/nics/features/e1000.ini b/doc/guides/nics/features/e1000.ini
index 7f6d55c..4b95730 100644
--- a/doc/guides/nics/features/e1000.ini
+++ b/doc/guides/nics/features/e1000.ini
@@ -20,6 +20,8 @@ VLAN offload         = Y
 QinQ offload         = Y
 L3 checksum offload  = Y
 L4 checksum offload  = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 BSD nic_uio          = Y
 Linux UIO            = Y
diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index 564438d..cb8d273 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -378,6 +378,9 @@ uint32_t eth_em_rx_queue_count(struct rte_eth_dev *dev,
 
 int eth_em_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
+int eth_em_rx_descriptor_status(void *rx_queue, uint16_t offset);
+int eth_em_tx_descriptor_status(void *tx_queue, uint16_t offset);
+
 int eth_em_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 		uint16_t nb_tx_desc, unsigned int socket_id,
 		const struct rte_eth_txconf *tx_conf);
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 4066ef9..4f34c14 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -205,6 +205,8 @@ static const struct eth_dev_ops eth_em_ops = {
 	.rx_queue_release     = eth_em_rx_queue_release,
 	.rx_queue_count       = eth_em_rx_queue_count,
 	.rx_descriptor_done   = eth_em_rx_descriptor_done,
+	.rx_descriptor_status = eth_em_rx_descriptor_status,
+	.tx_descriptor_status = eth_em_tx_descriptor_status,
 	.tx_queue_setup       = eth_em_tx_queue_setup,
 	.tx_queue_release     = eth_em_tx_queue_release,
 	.rx_queue_intr_enable = eth_em_rx_queue_intr_enable,
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c
index d099d6a..2c9c7f8 100644
--- a/drivers/net/e1000/em_rxtx.c
+++ b/drivers/net/e1000/em_rxtx.c
@@ -1473,6 +1473,57 @@ eth_em_rx_descriptor_done(void *rx_queue, uint16_t offset)
 	return !!(rxdp->status & E1000_RXD_STAT_DD);
 }
 
+int
+eth_em_rx_descriptor_status(void *rx_queue, uint16_t offset)
+{
+	struct em_rx_queue *rxq = rx_queue;
+	volatile uint8_t *status;
+	uint32_t desc;
+
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+	if (offset >= rxq->nb_rx_desc - rxq->nb_rx_hold)
+		return RTE_ETH_RX_DESC_UNAVAIL;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].status;
+	if (*status & E1000_RXD_STAT_DD)
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+eth_em_tx_descriptor_status(void *tx_queue, uint16_t offset)
+{
+	struct em_tx_queue *txq = tx_queue;
+	volatile uint8_t *status;
+	uint32_t desc;
+
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	/* go to next desc that has the RS bit */
+	desc = ((desc + txq->tx_rs_thresh - 1) / txq->tx_rs_thresh) *
+		txq->tx_rs_thresh;
+	if (desc >= txq->nb_tx_desc) {
+		desc -= txq->nb_tx_desc;
+		if (desc >= txq->nb_tx_desc)
+			desc -= txq->nb_tx_desc;
+	}
+
+	status = &txq->tx_ring[desc].upper.fields.status;
+	if (*status & E1000_TXD_STAT_DD)
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
 void
 em_dev_clear_queues(struct rte_eth_dev *dev)
 {
-- 
2.8.1
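
For readers following the Tx path in em_rxtx.c above, here is a small
standalone sketch of the index arithmetic. As the comment in the patch
indicates, the status is read from the next descriptor that carries the
RS bit, so the index is rounded up to a multiple of tx_rs_thresh and then
wrapped back into the ring. The helper name next_rs_desc() is made up for
this note, and the sketch assumes nb_desc is a multiple of rs_thresh; it
is not the driver code itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): return the ring index of
 * the first descriptor at or after (tail + offset) that sits on an
 * rs_thresh boundary.
 * Assumes offset < nb_desc, tail < nb_desc, nb_desc % rs_thresh == 0.
 */
static uint32_t
next_rs_desc(uint32_t tail, uint16_t offset, uint32_t rs_thresh,
	     uint32_t nb_desc)
{
	uint32_t desc;

	assert(rs_thresh != 0 && nb_desc % rs_thresh == 0);
	assert(tail < nb_desc && offset < nb_desc);

	desc = tail + offset;
	/* round up to the next multiple of rs_thresh */
	desc = ((desc + rs_thresh - 1) / rs_thresh) * rs_thresh;
	/* tail + offset can approach 2 * nb_desc, so wrap up to twice */
	if (desc >= nb_desc) {
		desc -= nb_desc;
		if (desc >= nb_desc)
			desc -= nb_desc;
	}
	return desc;
}
```

The second subtraction matters only when tail and offset are both close
to the end of the ring, for instance tail = 500 and offset = 511 on a
512-entry ring.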
^ permalink raw reply	[flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v2 5/6] net/mlx5: implement descriptor status API
  2017-03-07 15:59   ` [dpdk-dev] [PATCH v2 " Olivier Matz
                       ` (3 preceding siblings ...)
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 4/6] net/e1000: implement descriptor status API (em) Olivier Matz
@ 2017-03-07 15:59     ` Olivier Matz
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 6/6] net/i40e: " Olivier Matz
  2017-03-29  8:36     ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Olivier Matz
  6 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-07 15:59 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko
Since there is no "descriptor done" flag like on Intel drivers, the
approach is different on mlx5 driver.
- for Tx, we call txq_complete() to free descriptors processed by
  the hw, then we check if the descriptor is between tail and head
- for Rx, we need to browse the cqes, managing compressed ones,
  to get the number of used descriptors.
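
As a rough illustration of the Tx check described above (a standalone
sketch, not the mlx5 code): with a ring whose size is a power of two, the
number of descriptors still in flight is the head/tail distance masked by
size - 1, and any offset below that count is still waiting for the
hardware. The name ring_tx_status() and the local enum are invented for
this example; the enum values simply mirror RTE_ETH_TX_DESC_FULL (0) and
RTE_ETH_TX_DESC_DONE (1) from patch 1/6.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for RTE_ETH_TX_DESC_FULL / RTE_ETH_TX_DESC_DONE. */
enum { TX_DESC_FULL = 0, TX_DESC_DONE = 1 };

/*
 * Hypothetical helper: classify a Tx descriptor by its offset from tail.
 * head and tail are free-running 16-bit counters, the ring holds
 * (1 << log2_n) entries.
 */
static int
ring_tx_status(uint16_t head, uint16_t tail, unsigned int log2_n,
	       uint16_t offset)
{
	unsigned int mask;
	unsigned int used;

	assert(log2_n < 16);
	mask = (1u << log2_n) - 1;
	/* the mask makes the subtraction wrap-safe for power-of-two rings */
	used = (unsigned int)(head - tail) & mask;
	if (offset < used)
		return TX_DESC_FULL;
	return TX_DESC_DONE;
}
```

In the patch itself, txq_complete() is called first so that tail reflects
the completions already reported by the hardware; the sketch leaves that
step out.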
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/nics/features/mlx5.ini |  2 ++
 drivers/net/mlx5/mlx5.c           |  2 ++
 drivers/net/mlx5/mlx5_rxtx.c      | 76 +++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.h      |  2 ++
 4 files changed, 82 insertions(+)
diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini
index f20d214..1c92e37 100644
--- a/doc/guides/nics/features/mlx5.ini
+++ b/doc/guides/nics/features/mlx5.ini
@@ -27,6 +27,8 @@ VLAN offload         = Y
 L3 checksum offload  = Y
 L4 checksum offload  = Y
 Packet type parsing  = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Stats per queue      = Y
 Multiprocess aware   = Y
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index d4bd469..4a6450c 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -222,6 +222,8 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.rss_hash_update = mlx5_rss_hash_update,
 	.rss_hash_conf_get = mlx5_rss_hash_conf_get,
 	.filter_ctrl = mlx5_dev_filter_ctrl,
+	.rx_descriptor_status = mlx5_rx_descriptor_status,
+	.tx_descriptor_status = mlx5_tx_descriptor_status,
 };
 
 static struct {
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 88b0354..b8d2bf6 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -345,6 +345,82 @@ mlx5_tx_dbrec(struct txq *txq, volatile struct mlx5_wqe *wqe)
 }
 
 /**
+ * DPDK callback to check the status of a tx descriptor.
+ *
+ * @param tx_queue
+ *   The tx queue.
+ * @param[in] offset
+ *   The index of the descriptor in the ring.
+ *
+ * @return
+ *   The status of the tx descriptor.
+ */
+int
+mlx5_tx_descriptor_status(void *tx_queue, uint16_t offset)
+{
+	struct txq *txq = tx_queue;
+	const unsigned int elts_n = 1 << txq->elts_n;
+	const unsigned int elts_cnt = elts_n - 1;
+	unsigned int used;
+
+	txq_complete(txq);
+	used = (txq->elts_head - txq->elts_tail) & elts_cnt;
+	if (offset < used)
+		return RTE_ETH_TX_DESC_FULL;
+	return RTE_ETH_TX_DESC_DONE;
+}
+
+/**
+ * DPDK callback to check the status of a rx descriptor.
+ *
+ * @param rx_queue
+ *   The rx queue.
+ * @param[in] offset
+ *   The index of the descriptor in the ring.
+ *
+ * @return
+ *   The status of the rx descriptor.
+ */
+int
+mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset)
+{
+	struct rxq *rxq = rx_queue;
+	struct rxq_zip *zip = &rxq->zip;
+	volatile struct mlx5_cqe *cqe;
+	const unsigned int cqe_n = (1 << rxq->cqe_n);
+	const unsigned int cqe_cnt = cqe_n - 1;
+	unsigned int cq_ci;
+	unsigned int used;
+
+	/* if we are processing a compressed cqe */
+	if (zip->ai) {
+		used = zip->cqe_cnt - zip->ca;
+		cq_ci = zip->cq_ci;
+	} else {
+		used = 0;
+		cq_ci = rxq->cq_ci;
+	}
+	cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
+	while (check_cqe(cqe, cqe_n, cq_ci) == 0) {
+		int8_t op_own;
+		unsigned int n;
+
+		op_own = cqe->op_own;
+		if (MLX5_CQE_FORMAT(op_own) == MLX5_COMPRESSED)
+			n = ntohl(cqe->byte_cnt);
+		else
+			n = 1;
+		cq_ci += n;
+		used += n;
+		cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
+	}
+	used = RTE_MIN(used, (1U << rxq->elts_n) - 1);
+	if (offset < used)
+		return RTE_ETH_RX_DESC_DONE;
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+/**
  * DPDK callback for TX.
  *
  * @param dpdk_txq
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 41a34d7..012207e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -323,6 +323,8 @@ uint16_t mlx5_tx_burst_mpw_inline(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst(void *, struct rte_mbuf **, uint16_t);
 uint16_t removed_tx_burst(void *, struct rte_mbuf **, uint16_t);
 uint16_t removed_rx_burst(void *, struct rte_mbuf **, uint16_t);
+int mlx5_rx_descriptor_status(void *, uint16_t);
+int mlx5_tx_descriptor_status(void *, uint16_t);
 
 /* mlx5_mr.c */
 
-- 
2.8.1
^ permalink raw reply	[flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v2 6/6] net/i40e: implement descriptor status API
  2017-03-07 15:59   ` [dpdk-dev] [PATCH v2 " Olivier Matz
                       ` (4 preceding siblings ...)
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 5/6] net/mlx5: implement descriptor status API Olivier Matz
@ 2017-03-07 15:59     ` Olivier Matz
  2017-03-08  1:17       ` Lu, Wenzhuo
  2017-03-29  8:36     ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Olivier Matz
  6 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-07 15:59 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/nics/features/i40e.ini        |  2 ++
 doc/guides/nics/features/i40e_vec.ini    |  2 ++
 doc/guides/nics/features/i40e_vf.ini     |  2 ++
 doc/guides/nics/features/i40e_vf_vec.ini |  2 ++
 drivers/net/i40e/i40e_ethdev.c           |  2 ++
 drivers/net/i40e/i40e_ethdev_vf.c        |  2 ++
 drivers/net/i40e/i40e_rxtx.c             | 58 ++++++++++++++++++++++++++++++++
 drivers/net/i40e/i40e_rxtx.h             |  2 ++
 8 files changed, 72 insertions(+)
diff --git a/doc/guides/nics/features/i40e.ini b/doc/guides/nics/features/i40e.ini
index 6d11cce..4725f02 100644
--- a/doc/guides/nics/features/i40e.ini
+++ b/doc/guides/nics/features/i40e.ini
@@ -38,6 +38,8 @@ Inner L3 checksum    = Y
 Inner L4 checksum    = Y
 Packet type parsing  = Y
 Timesync             = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 FW version           = Y
diff --git a/doc/guides/nics/features/i40e_vec.ini b/doc/guides/nics/features/i40e_vec.ini
index edd6b71..2f95eb6 100644
--- a/doc/guides/nics/features/i40e_vec.ini
+++ b/doc/guides/nics/features/i40e_vec.ini
@@ -29,6 +29,8 @@ Flow director        = Y
 Flow control         = Y
 Traffic mirroring    = Y
 Timesync             = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Multiprocess aware   = Y
diff --git a/doc/guides/nics/features/i40e_vf.ini b/doc/guides/nics/features/i40e_vf.ini
index 2f82c6b..60bdf7a 100644
--- a/doc/guides/nics/features/i40e_vf.ini
+++ b/doc/guides/nics/features/i40e_vf.ini
@@ -26,6 +26,8 @@ L4 checksum offload  = Y
 Inner L3 checksum    = Y
 Inner L4 checksum    = Y
 Packet type parsing  = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Multiprocess aware   = Y
diff --git a/doc/guides/nics/features/i40e_vf_vec.ini b/doc/guides/nics/features/i40e_vf_vec.ini
index d6674f7..5766aaf 100644
--- a/doc/guides/nics/features/i40e_vf_vec.ini
+++ b/doc/guides/nics/features/i40e_vf_vec.ini
@@ -18,6 +18,8 @@ RSS key update       = Y
 RSS reta update      = Y
 VLAN filter          = Y
 Hash filter          = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Multiprocess aware   = Y
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 303027b..8b5fd54 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -479,6 +479,8 @@ static const struct eth_dev_ops i40e_eth_dev_ops = {
 	.rx_queue_release             = i40e_dev_rx_queue_release,
 	.rx_queue_count               = i40e_dev_rx_queue_count,
 	.rx_descriptor_done           = i40e_dev_rx_descriptor_done,
+	.rx_descriptor_status         = i40e_dev_rx_descriptor_status,
+	.tx_descriptor_status         = i40e_dev_tx_descriptor_status,
 	.tx_queue_setup               = i40e_dev_tx_queue_setup,
 	.tx_queue_release             = i40e_dev_tx_queue_release,
 	.dev_led_on                   = i40e_dev_led_on,
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 55fd344..d3659c9 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -217,6 +217,8 @@ static const struct eth_dev_ops i40evf_eth_dev_ops = {
 	.rx_queue_intr_enable = i40evf_dev_rx_queue_intr_enable,
 	.rx_queue_intr_disable = i40evf_dev_rx_queue_intr_disable,
 	.rx_descriptor_done   = i40e_dev_rx_descriptor_done,
+	.rx_descriptor_status = i40e_dev_rx_descriptor_status,
+	.tx_descriptor_status = i40e_dev_tx_descriptor_status,
 	.tx_queue_setup       = i40e_dev_tx_queue_setup,
 	.tx_queue_release     = i40e_dev_tx_queue_release,
 	.rx_queue_count       = i40e_dev_rx_queue_count,
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 48429cc..524be62 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1929,6 +1929,64 @@ i40e_dev_rx_descriptor_done(void *rx_queue, uint16_t offset)
 }
 
 int
+i40e_dev_rx_descriptor_status(void *rx_queue, uint16_t offset)
+{
+	struct i40e_rx_queue *rxq = rx_queue;
+	volatile uint64_t *status;
+	uint64_t mask;
+	uint32_t desc;
+
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+	if (offset >= rxq->nb_rx_desc - rxq->nb_rx_hold)
+		return RTE_ETH_RX_DESC_UNAVAIL;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].wb.qword1.status_error_len;
+	mask = rte_cpu_to_le_64((1ULL << I40E_RX_DESC_STATUS_DD_SHIFT)
+		<< I40E_RXD_QW1_STATUS_SHIFT);
+	if (*status & mask)
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+i40e_dev_tx_descriptor_status(void *tx_queue, uint16_t offset)
+{
+	struct i40e_tx_queue *txq = tx_queue;
+	volatile uint64_t *status;
+	uint64_t mask, expect;
+	uint32_t desc;
+
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	/* go to next desc that has the RS bit */
+	desc = ((desc + txq->tx_rs_thresh - 1) / txq->tx_rs_thresh) *
+		txq->tx_rs_thresh;
+	if (desc >= txq->nb_tx_desc) {
+		desc -= txq->nb_tx_desc;
+		if (desc >= txq->nb_tx_desc)
+			desc -= txq->nb_tx_desc;
+	}
+
+	status = &txq->tx_ring[desc].cmd_type_offset_bsz;
+	mask = rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK);
+	expect = rte_cpu_to_le_64(
+		I40E_TX_DESC_DTYPE_DESC_DONE << I40E_TXD_QW1_DTYPE_SHIFT);
+	if ((*status & mask) == expect)
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
+int
 i40e_dev_tx_queue_setup(struct rte_eth_dev *dev,
 			uint16_t queue_idx,
 			uint16_t nb_desc,
diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h
index 9df8a56..7f63328 100644
--- a/drivers/net/i40e/i40e_rxtx.h
+++ b/drivers/net/i40e/i40e_rxtx.h
@@ -246,6 +246,8 @@ void i40e_rx_queue_release_mbufs(struct i40e_rx_queue *rxq);
 uint32_t i40e_dev_rx_queue_count(struct rte_eth_dev *dev,
 				 uint16_t rx_queue_id);
 int i40e_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
+int i40e_dev_rx_descriptor_status(void *rx_queue, uint16_t offset);
+int i40e_dev_tx_descriptor_status(void *tx_queue, uint16_t offset);
 
 uint16_t i40e_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
 			    uint16_t nb_pkts);
-- 
2.8.1
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v2 3/6] net/e1000: implement descriptor status API (igb)
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 3/6] net/e1000: implement descriptor status API (igb) Olivier Matz
@ 2017-03-08  1:17       ` Lu, Wenzhuo
  0 siblings, 0 replies; 72+ messages in thread
From: Lu, Wenzhuo @ 2017-03-08  1:17 UTC (permalink / raw)
  To: Olivier Matz, dev, thomas.monjalon, Ananyev, Konstantin, Zhang,
	Helin, Wu, Jingjing, adrien.mazarguil, nelio.laranjeiro
  Cc: Yigit, Ferruh, Richardson, Bruce, Venkatesan, Venky, arybchenko
Hi,
> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Wednesday, March 8, 2017 12:00 AM
> To: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin; Lu,
> Wenzhuo; Zhang, Helin; Wu, Jingjing; adrien.mazarguil@6wind.com;
> nelio.laranjeiro@6wind.com
> Cc: Yigit, Ferruh; Richardson, Bruce; Venkatesan, Venky;
> arybchenko@solarflare.com
> Subject: [PATCH v2 3/6] net/e1000: implement descriptor status API (igb)
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v2 6/6] net/i40e: implement descriptor status API
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 6/6] net/i40e: " Olivier Matz
@ 2017-03-08  1:17       ` Lu, Wenzhuo
  0 siblings, 0 replies; 72+ messages in thread
From: Lu, Wenzhuo @ 2017-03-08  1:17 UTC (permalink / raw)
  To: Olivier Matz, dev, thomas.monjalon, Ananyev, Konstantin, Zhang,
	Helin, Wu, Jingjing, adrien.mazarguil, nelio.laranjeiro
  Cc: Yigit, Ferruh, Richardson, Bruce, Venkatesan, Venky, arybchenko
Hi,
> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Wednesday, March 8, 2017 12:00 AM
> To: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin; Lu,
> Wenzhuo; Zhang, Helin; Wu, Jingjing; adrien.mazarguil@6wind.com;
> nelio.laranjeiro@6wind.com
> Cc: Yigit, Ferruh; Richardson, Bruce; Venkatesan, Venky;
> arybchenko@solarflare.com
> Subject: [PATCH v2 6/6] net/i40e: implement descriptor status API
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API Olivier Matz
@ 2017-03-09 11:49       ` Andrew Rybchenko
  2017-03-21  8:32       ` Yang, Qiming
  1 sibling, 0 replies; 72+ messages in thread
From: Andrew Rybchenko @ 2017-03-09 11:49 UTC (permalink / raw)
  To: Olivier Matz, dev, thomas.monjalon, konstantin.ananyev,
	wenzhuo.lu, helin.zhang, jingjing.wu, adrien.mazarguil,
	nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan
On 03/07/2017 06:59 PM, Olivier Matz wrote:
> Introduce a new API to get the status of a descriptor.
>
> For Rx, it is almost similar to rx_descriptor_done API, except it
> differentiates "used" descriptors (which are held by the driver and not
> returned to the hardware).
>
> For Tx, it is a new API.
>
> The descriptor_done() API, and probably the rx_queue_count() API could
> be replaced by this new API as soon as it is implemented on all PMDs.
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>   doc/guides/nics/features/default.ini |   2 +
>   lib/librte_ether/rte_ethdev.h        | 125 +++++++++++++++++++++++++++++++++++
>   2 files changed, 127 insertions(+)
>
> diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
> index 9e363ff..0e6a78d 100644
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -49,6 +49,8 @@ Inner L3 checksum    =
>   Inner L4 checksum    =
>   Packet type parsing  =
>   Timesync             =
> +Rx Descriptor Status =
> +Tx Descriptor Status =
>   Basic stats          =
>   Extended stats       =
>   Stats per queue      =
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 97f3e2d..904ecbe 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1179,6 +1179,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
>   typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
>   /**< @internal Check DD bit of specific RX descriptor */
>   
> +typedef int (*eth_rx_descriptor_status_t)(void *rxq, uint16_t offset);
> +/**< @internal Check the status of a Rx descriptor */
> +
> +typedef int (*eth_tx_descriptor_status_t)(void *txq, uint16_t offset);
> +/**< @internal Check the status of a Tx descriptor */
> +
>   typedef int (*eth_fw_version_get_t)(struct rte_eth_dev *dev,
>   				     char *fw_version, size_t fw_size);
>   /**< @internal Get firmware information of an Ethernet device. */
> @@ -1483,6 +1489,10 @@ struct eth_dev_ops {
>   	eth_queue_release_t        rx_queue_release; /**< Release RX queue. */
>   	eth_rx_queue_count_t       rx_queue_count;/**< Get Rx queue count. */
>   	eth_rx_descriptor_done_t   rx_descriptor_done; /**< Check rxd DD bit. */
> +	eth_rx_descriptor_status_t rx_descriptor_status;
> +	/**< Check the status of a Rx descriptor. */
> +	eth_tx_descriptor_status_t tx_descriptor_status;
> +	/**< Check the status of a Tx descriptor. */
>   	eth_rx_enable_intr_t       rx_queue_intr_enable;  /**< Enable Rx queue interrupt. */
>   	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
>   	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
> @@ -2768,6 +2778,121 @@ rte_eth_rx_descriptor_done(uint8_t port_id, uint16_t queue_id, uint16_t offset)
>   		dev->data->rx_queues[queue_id], offset);
>   }
>   
> +#define RTE_ETH_RX_DESC_AVAIL    0 /**< Desc available for hw. */
> +#define RTE_ETH_RX_DESC_DONE     1 /**< Desc done, filled by hw. */
> +#define RTE_ETH_RX_DESC_UNAVAIL  2 /**< Desc used by driver or hw. */
> +
> +/**
> + * Check the status of a Rx descriptor in the queue
> + *
> + * It should be called in a similar context to the Rx function:
> + * - on a dataplane core
> + * - not concurrently on the same queue
> + *
> + * Since it's a dataplane function, no check is performed on port_id and
> + * queue_id. The caller must therefore ensure that the port is enabled
> + * and the queue is configured and running.
> + *
> + * Note: accessing a random descriptor in the ring may trigger cache
> + * misses and have a performance impact.
> + *
> + * @param port_id
> + *  The port identifier of the Ethernet device.
> + * @param queue_id
> + *  A valid Rx queue identifier on this port.
> + * @param offset
> + *  The offset of the descriptor starting from tail (0 is the next
> + *  packet to be received by the driver).
> + *
> + * @return
> + *  - (RTE_ETH_RX_DESC_AVAIL): Descriptor is available for the hardware to
> + *    receive a packet.
> + *  - (RTE_ETH_RX_DESC_DONE): Descriptor is done, it is filled by hw, but
> + *    not yet processed by the driver (i.e. in the receive queue).
> + *  - (RTE_ETH_RX_DESC_UNAVAIL): Descriptor is unavailable, either held by
> + *    the driver and not yet returned to hw, or reserved by the hw.
> + *  - (-EINVAL) bad descriptor offset.
> + *  - (-ENOTSUP) if the device does not support this function.
> + *  - (-ENODEV) bad port or queue (only if compiled with debug).
> + */
> +static inline int
> +rte_eth_rx_descriptor_status(uint8_t port_id, uint16_t queue_id,
> +	uint16_t offset)
> +{
> +	struct rte_eth_dev *dev;
> +	void *rxq;
> +
> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +#endif
> +	dev = &rte_eth_devices[port_id];
> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
> +	if (queue_id >= dev->data->nb_rx_queues)
> +		return -ENODEV;
> +#endif
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_descriptor_status, -ENOTSUP);
> +	rxq = dev->data->rx_queues[queue_id];
> +
> +	return (*dev->dev_ops->rx_descriptor_status)(rxq, offset);
> +}
> +
> +#define RTE_ETH_TX_DESC_FULL    0 /**< Desc filled for hw, waiting xmit. */
> +#define RTE_ETH_TX_DESC_DONE    1 /**< Desc done, packet is transmitted. */
> +#define RTE_ETH_TX_DESC_UNAVAIL 2 /**< Desc used by driver or hw. */
> +
> +/**
> + * Check the status of a Tx descriptor in the queue.
> + *
> + * It should be called in a similar context to the Tx function:
> + * - on a dataplane core
> + * - not concurrently on the same queue
> + *
> + * Since it's a dataplane function, no check is performed on port_id and
> + * queue_id. The caller must therefore ensure that the port is enabled
> + * and the queue is configured and running.
> + *
> + * Note: accessing a random descriptor in the ring may trigger cache
> + * misses and have a performance impact.
> + *
> + * @param port_id
> + *  The port identifier of the Ethernet device.
> + * @param queue_id
> + *  A valid Tx queue identifier on this port.
> + * @param offset
> + *  The offset of the descriptor starting from tail (0 is the place where
> + *  the next packet will be sent).
> + *
> + * @return
> + *  - (RTE_ETH_TX_DESC_FULL) Descriptor is being processed by the hw, i.e.
> + *    in the transmit queue.
> + *  - (RTE_ETH_TX_DESC_DONE) Hardware is done with this descriptor, it can
> + *    be reused by the driver.
> + *  - (RTE_ETH_TX_DESC_UNAVAIL): Descriptor is unavailable, reserved by the
> + *    driver or the hardware.
> + *  - (-EINVAL) bad descriptor offset.
> + *  - (-ENOTSUP) if the device does not support this function.
> + *  - (-ENODEV) bad port or queue (only if compiled with debug).
> + */
> +static inline int rte_eth_tx_descriptor_status(uint8_t port_id,
> +	uint16_t queue_id, uint16_t offset)
> +{
> +	struct rte_eth_dev *dev;
> +	void *txq;
> +
> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +#endif
> +	dev = &rte_eth_devices[port_id];
> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
> +	if (queue_id >= dev->data->nb_tx_queues)
> +		return -ENODEV;
> +#endif
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_descriptor_status, -ENOTSUP);
> +	txq = dev->data->tx_queues[queue_id];
> +
> +	return (*dev->dev_ops->tx_descriptor_status)(txq, offset);
> +}
> +
>   /**
>    * Send a burst of output packets on a transmit queue of an Ethernet device.
>    *
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
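
One possible use of the Rx status values quoted above, in the spirit of
the binary search mentioned in the cover letter: if the descriptors
reported as DONE form a contiguous run starting at offset 0 (an
assumption of this sketch, not something the API guarantees), their
number can be found in O(log n) status queries. The names count_done(),
fake_queue and fake_status() are hypothetical; fake_status() stands in
for a driver's rx_descriptor_status callback so that the example is
self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for RTE_ETH_RX_DESC_AVAIL / RTE_ETH_RX_DESC_DONE. */
#define RX_DESC_AVAIL 0
#define RX_DESC_DONE  1

/* Same shape as the eth_rx_descriptor_status_t callback. */
typedef int (*rx_status_fn)(void *rxq, uint16_t offset);

/*
 * Hypothetical helper: count the DONE descriptors in [0, nb_desc),
 * assuming they form a prefix of the ring starting at the tail.
 */
static uint16_t
count_done(rx_status_fn status, void *rxq, uint16_t nb_desc)
{
	uint16_t lo = 0;       /* every offset below lo is DONE */
	uint16_t hi = nb_desc; /* every offset at or above hi is not */

	while (lo < hi) {
		uint16_t mid = lo + (hi - lo) / 2;

		if (status(rxq, mid) == RX_DESC_DONE)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Fake queue for the example: the first 'done' descriptors are DONE. */
struct fake_queue {
	uint16_t done;
};

static int
fake_status(void *rxq, uint16_t offset)
{
	const struct fake_queue *q = rxq;

	return offset < q->done ? RX_DESC_DONE : RX_DESC_AVAIL;
}
```

With a 512-entry ring this takes at most about ten status queries per
call, whatever the fill level.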
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API Olivier Matz
  2017-03-09 11:49       ` Andrew Rybchenko
@ 2017-03-21  8:32       ` Yang, Qiming
  2017-03-24 12:49         ` Olivier Matz
  1 sibling, 1 reply; 72+ messages in thread
From: Yang, Qiming @ 2017-03-21  8:32 UTC (permalink / raw)
  To: Olivier Matz, dev, thomas.monjalon, Ananyev, Konstantin, Lu,
	Wenzhuo, Zhang, Helin, Wu, Jingjing, adrien.mazarguil,
	nelio.laranjeiro
  Cc: Yigit, Ferruh, Richardson, Bruce, Venkatesan, Venky, arybchenko
Hi, Olivier
You need to add your new API in lib/librte_ether/rte_ether_version.map and doc/guides/rel_notes/release_17_05.rst
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz
> Sent: Wednesday, March 8, 2017 12:00 AM
> To: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> adrien.mazarguil@6wind.com; nelio.laranjeiro@6wind.com
> Cc: Yigit, Ferruh <ferruh.yigit@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; Venkatesan, Venky
> <venky.venkatesan@intel.com>; arybchenko@solarflare.com
> Subject: [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API
> 
> Introduce a new API to get the status of a descriptor.
> 
> For Rx, it is almost similar to rx_descriptor_done API, except it differentiates
> "used" descriptors (which are held by the driver and not returned to the
> hardware).
> 
> For Tx, it is a new API.
> 
> The descriptor_done() API, and probably the rx_queue_count() API could be
> replaced by this new API as soon as it is implemented on all PMDs.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>  doc/guides/nics/features/default.ini |   2 +
>  lib/librte_ether/rte_ethdev.h        | 125 +++++++++++++++++++++++++++++++++++
>  2 files changed, 127 insertions(+)
> 
> diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
> index 9e363ff..0e6a78d 100644
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -49,6 +49,8 @@ Inner L3 checksum    =
>  Inner L4 checksum    =
>  Packet type parsing  =
>  Timesync             =
> +Rx Descriptor Status =
> +Tx Descriptor Status =
>  Basic stats          =
>  Extended stats       =
>  Stats per queue      =
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 97f3e2d..904ecbe 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1179,6 +1179,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
>  typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
>  /**< @internal Check DD bit of specific RX descriptor */
> 
> +typedef int (*eth_rx_descriptor_status_t)(void *rxq, uint16_t offset);
> +/**< @internal Check the status of a Rx descriptor */
> +
> +typedef int (*eth_tx_descriptor_status_t)(void *txq, uint16_t offset);
> +/**< @internal Check the status of a Tx descriptor */
> +
>  typedef int (*eth_fw_version_get_t)(struct rte_eth_dev *dev,
>  				     char *fw_version, size_t fw_size);
>  /**< @internal Get firmware information of an Ethernet device. */
> @@ -1483,6 +1489,10 @@ struct eth_dev_ops {
>  	eth_queue_release_t        rx_queue_release; /**< Release RX queue. */
>  	eth_rx_queue_count_t       rx_queue_count;/**< Get Rx queue count. */
>  	eth_rx_descriptor_done_t   rx_descriptor_done; /**< Check rxd DD bit. */
> +	eth_rx_descriptor_status_t rx_descriptor_status;
> +	/**< Check the status of a Rx descriptor. */
> +	eth_tx_descriptor_status_t tx_descriptor_status;
> +	/**< Check the status of a Tx descriptor. */
>  	eth_rx_enable_intr_t       rx_queue_intr_enable;  /**< Enable Rx queue interrupt. */
>  	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
>  	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
> @@ -2768,6 +2778,121 @@ rte_eth_rx_descriptor_done(uint8_t port_id, uint16_t queue_id, uint16_t offset)
>  		dev->data->rx_queues[queue_id], offset);
>  }
> 
> +#define RTE_ETH_RX_DESC_AVAIL    0 /**< Desc available for hw. */
> +#define RTE_ETH_RX_DESC_DONE     1 /**< Desc done, filled by hw. */
> +#define RTE_ETH_RX_DESC_UNAVAIL  2 /**< Desc used by driver or hw. */
> +
> +/**
> + * Check the status of a Rx descriptor in the queue
> + *
> + * It should be called in a similar context to the Rx function:
> + * - on a dataplane core
> + * - not concurrently on the same queue
> + *
> + * Since it's a dataplane function, no check is performed on port_id and
> + * queue_id. The caller must therefore ensure that the port is enabled
> + * and the queue is configured and running.
> + *
> + * Note: accessing a random descriptor in the ring may trigger cache
> + * misses and have a performance impact.
> + *
> + * @param port_id
> + *  The port identifier of the Ethernet device.
> + * @param queue_id
> + *  A valid Rx queue identifier on this port.
> + * @param offset
> + *  The offset of the descriptor starting from tail (0 is the next
> + *  packet to be received by the driver).
> + *
> + * @return
> + *  - (RTE_ETH_RX_DESC_AVAIL): Descriptor is available for the hardware to
> + *    receive a packet.
> + *  - (RTE_ETH_RX_DESC_DONE): Descriptor is done, it is filled by hw, but
> + *    not yet processed by the driver (i.e. in the receive queue).
> + *  - (RTE_ETH_RX_DESC_UNAVAIL): Descriptor is unavailable, either hold by
> + *    the driver and not yet returned to hw, or reserved by the hw.
> + *  - (-EINVAL) bad descriptor offset.
> + *  - (-ENOTSUP) if the device does not support this function.
> + *  - (-ENODEV) bad port or queue (only if compiled with debug).
> + */
> +static inline int
> +rte_eth_rx_descriptor_status(uint8_t port_id, uint16_t queue_id,
> +	uint16_t offset)
> +{
> +	struct rte_eth_dev *dev;
> +	void *rxq;
> +
> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); #endif
> +	dev = &rte_eth_devices[port_id];
> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
> +	if (queue_id >= dev->data->nb_rx_queues)
> +		return -ENODEV;
> +#endif
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops-
> >rx_descriptor_status, -ENOTSUP);
> +	rxq = dev->data->rx_queues[queue_id];
> +
> +	return (*dev->dev_ops->rx_descriptor_status)(rxq, offset); }
> +
> +#define RTE_ETH_TX_DESC_FULL    0 /**< Desc filled for hw, waiting xmit.
> */
> +#define RTE_ETH_TX_DESC_DONE    1 /**< Desc done, packet is
> transmitted. */
> +#define RTE_ETH_TX_DESC_UNAVAIL 2 /**< Desc used by driver or hw. */
> +
> +/**
> + * Check the status of a Tx descriptor in the queue.
> + *
> + * It should be called in a similar context than the Tx function:
> + * - on a dataplane core
> + * - not concurrently on the same queue
> + *
> + * Since it's a dataplane function, no check is performed on port_id
> +and
> + * queue_id. The caller must therefore ensure that the port is enabled
> + * and the queue is configured and running.
> + *
> + * Note: accessing to a random descriptor in the ring may trigger cache
> + * misses and have a performance impact.
> + *
> + * @param port_id
> + *  A valid port identifier of the Ethernet device which.
> + * @param queue_id
> + *  A valid Tx queue identifier on this port.
> + * @param offset
> + *  The offset of the descriptor starting from tail (0 is the place
> +where
> + *  the next packet will be send).
> + *
> + * @return
> + *  - (RTE_ETH_TX_DESC_FULL) Descriptor is being processed by the hw, i.e.
> + *    in the transmit queue.
> + *  - (RTE_ETH_TX_DESC_DONE) Hardware is done with this descriptor, it
> can
> + *    be reused by the driver.
> + *  - (RTE_ETH_TX_DESC_UNAVAIL): Descriptor is unavailable, reserved by
> the
> + *    driver or the hardware.
> + *  - (-EINVAL) bad descriptor offset.
> + *  - (-ENOTSUP) if the device does not support this function.
> + *  - (-ENODEV) bad port or queue (only if compiled with debug).
> + */
> +static inline int rte_eth_tx_descriptor_status(uint8_t port_id,
> +	uint16_t queue_id, uint16_t offset)
> +{
> +	struct rte_eth_dev *dev;
> +	void *txq;
> +
> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); #endif
> +	dev = &rte_eth_devices[port_id];
> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
> +	if (queue_id >= dev->data->nb_tx_queues)
> +		return -ENODEV;
> +#endif
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops-
> >tx_descriptor_status, -ENOTSUP);
> +	txq = dev->data->tx_queues[queue_id];
> +
> +	return (*dev->dev_ops->tx_descriptor_status)(txq, offset); }
> +
>  /**
>   * Send a burst of output packets on a transmit queue of an Ethernet device.
>   *
> --
> 2.8.1
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API
  2017-03-21  8:32       ` Yang, Qiming
@ 2017-03-24 12:49         ` Olivier Matz
  2017-03-27  1:28           ` Yang, Qiming
  0 siblings, 1 reply; 72+ messages in thread
From: Olivier Matz @ 2017-03-24 12:49 UTC (permalink / raw)
  To: Yang, Qiming
  Cc: dev, thomas.monjalon, Ananyev, Konstantin, Lu, Wenzhuo, Zhang,
	Helin, Wu, Jingjing, adrien.mazarguil, nelio.laranjeiro, Yigit,
	Ferruh, Richardson, Bruce, Venkatesan, Venky, arybchenko
Hi Qiming,
On Tue, 21 Mar 2017 08:32:17 +0000, "Yang, Qiming" <qiming.yang@intel.com> wrote:
> Hi, Olivier
> You need to add your new API in lib/librte_ether/rte_ether_version.map and doc/guides/rel_notes/release_17_05.rst
About rte_ether_version.map, I don't think it's needed since
the functions are static inline. Did you notice any issue?
About release_17_05.rst, you are right, I can add some words about
this new feature.
Thanks,
Olivier
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API
  2017-03-24 12:49         ` Olivier Matz
@ 2017-03-27  1:28           ` Yang, Qiming
  0 siblings, 0 replies; 72+ messages in thread
From: Yang, Qiming @ 2017-03-27  1:28 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, thomas.monjalon, Ananyev, Konstantin, Lu, Wenzhuo, Zhang,
	Helin, Wu, Jingjing, adrien.mazarguil, nelio.laranjeiro, Yigit,
	Ferruh, Richardson, Bruce, Venkatesan, Venky, arybchenko
Hi, Olivier
> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Friday, March 24, 2017 8:49 PM
> To: Yang, Qiming <qiming.yang@intel.com>
> Cc: dev@dpdk.org; thomas.monjalon@6wind.com; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> adrien.mazarguil@6wind.com; nelio.laranjeiro@6wind.com; Yigit, Ferruh
> <ferruh.yigit@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>;
> Venkatesan, Venky <venky.venkatesan@intel.com>;
> arybchenko@solarflare.com
> Subject: Re: [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API
> 
> Hi Qiming,
> 
> On Tue, 21 Mar 2017 08:32:17 +0000, "Yang, Qiming" <qiming.yang@intel.com>
> wrote:
> > Hi, Olivier
> > You need to add your new API in lib/librte_ether/rte_ether_version.map
> > and doc/guides/rel_notes/release_17_05.rst
> 
> About rte_ether_version.map, I don't think it's needed since the functions
> are static inline. Did you notice any issue?
> 
That's OK, I didn't notice that the functions are static inline, sorry.
> About release_17_05.rst, you are right, I can add some words about this new
> feature.
> 
> 
> Thanks,
> Olivier
^ permalink raw reply	[flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors
  2017-03-07 15:59   ` [dpdk-dev] [PATCH v2 " Olivier Matz
                       ` (5 preceding siblings ...)
  2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 6/6] net/i40e: " Olivier Matz
@ 2017-03-29  8:36     ` Olivier Matz
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 1/6] ethdev: add descriptor status API Olivier Matz
                         ` (6 more replies)
  6 siblings, 7 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-29  8:36 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko,
	qiming.yang
This patchset introduces a new ethdev API:
- rte_eth_rx_descriptor_status()
- rte_eth_tx_descriptor_status()
The Rx API aims to replace rte_eth_rx_descriptor_done(), which
does almost the same thing but does not distinguish the case of a
descriptor held by the driver (not yet returned to the hw).
Possible uses of these functions are:
- on Rx, anticipate that the cpu is not fast enough to process
  all incoming packets, and take measures to solve the
  problem (add more cpus, drop specific packets, ...)
- on Tx, detect that the link is overloaded, and take measures
  to solve the problem (notify flow control, drop specific
  packets)
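As a rough illustration of the Rx case, the sketch below checks the status
of a single descriptor at a fixed offset to decide whether the queue is
filling up. It does not call DPDK, so that it stays self-contained:
struct mock_rxq, mock_rx_descriptor_status() and rx_queue_is_congested()
are hypothetical stand-ins, and only the three RTE_ETH_RX_DESC_* values
come from the patch. A real application would call
rte_eth_rx_descriptor_status(port_id, queue_id, offset) instead of the mock.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Status values as defined by the patch in rte_ethdev.h. */
#define RTE_ETH_RX_DESC_AVAIL    0 /* available for hw */
#define RTE_ETH_RX_DESC_DONE     1 /* filled by hw, not yet processed */
#define RTE_ETH_RX_DESC_UNAVAIL  2 /* held by driver or reserved by hw */

/* Hypothetical stand-in for a driver Rx queue (not a DPDK type). */
struct mock_rxq {
	uint16_t nb_desc; /* ring size */
	uint16_t nb_done; /* descs filled by hw, waiting for the application */
	uint16_t nb_hold; /* descs held by the driver, not yet returned to hw */
};

/* Hypothetical stand-in for rte_eth_rx_descriptor_status(). */
static int
mock_rx_descriptor_status(const struct mock_rxq *rxq, uint16_t offset)
{
	if (offset >= rxq->nb_desc)
		return -EINVAL;
	if (offset < rxq->nb_done)
		return RTE_ETH_RX_DESC_DONE;
	if (offset >= rxq->nb_desc - rxq->nb_hold)
		return RTE_ETH_RX_DESC_UNAVAIL;
	return RTE_ETH_RX_DESC_AVAIL;
}

/*
 * Return 1 if the descriptor at offset `threshold` is already DONE, i.e.
 * more than `threshold` packets are waiting; 0 if not; a negative errno
 * value if the status call fails.
 */
static int
rx_queue_is_congested(const struct mock_rxq *rxq, uint16_t threshold)
{
	int status;

	assert(rxq != NULL);
	status = mock_rx_descriptor_status(rxq, threshold);
	if (status < 0)
		return status;
	return status == RTE_ETH_RX_DESC_DONE;
}
```

One status read per check keeps the cost low compared to counting every
used descriptor; the threshold (for instance half the ring size) is an
application choice, not something the patchset prescribes.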
The patchset updates ixgbe, i40e, e1000, mlx5.
The other drivers that implement the descriptor_done() API are
fm10k, sfc, virtio. They are not updated.
If the new API is accepted, descriptor_done() can be deprecated,
and examples/l3fwd-power will be updated too.
v2->v3:
- add an entry in the release note
- rebase on top of net-next/master
v1->v2:
- replace RTE_ETH_RX_DESC_USED by RTE_ETH_RX_DESC_UNAVAIL: it can be used when
  the descriptor is held by the driver or reserved by the hardware.
- add RTE_ETH_TX_DESC_UNAVAIL (same for Tx)
- change the ethdev callback api to use a queue pointer instead of port_id
  and queue_id
- like rx_burst/tx_burst, do not check the validity of port_id and queue_id
  except in debug mode
- better document the calling context, error status, possible performance
  impact
- add the feature in NIC documentation
- fix overflow of descriptor value in tx functions (ixgbe, igb, em)
- fix tx function to only check descs that have the rs bit (i40e)
- mlx: remove empty line
RFC->v1:
- instead of optimizing an API that returns the number of used
  descriptors like rx_queue_count(), use a simpler API that
  returns the status of a descriptor, like rx_descriptor_done().
- remove ethdev api rework (first 2 patches), they have been
  sent separately
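The RFC cover letter noted that a binary search performs well for counting
used descriptors. A count can still be rebuilt on top of a per-descriptor
status call, as in the sketch below. It assumes that the DONE descriptors
form one contiguous run starting at offset 0. The names desc_status_fn,
count_done_descs() and mock_status() are hypothetical and are not part of
the patchset; the callback only mirrors the (queue pointer, offset)
signature used by the new ethdev callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Status values as defined by the patch in rte_ethdev.h. */
#define RTE_ETH_RX_DESC_AVAIL 0
#define RTE_ETH_RX_DESC_DONE  1

/* Hypothetical callback type: status of the descriptor at `offset`. */
typedef int (*desc_status_fn)(void *rxq, uint16_t offset);

/*
 * Count the DONE descriptors with a binary search.
 * Assumption: DONE descriptors are contiguous and start at offset 0, so
 * "is DONE" is true for offsets below the count and false above it.
 */
static uint16_t
count_done_descs(desc_status_fn status, void *rxq, uint16_t nb_desc)
{
	uint16_t lo = 0, hi = nb_desc; /* the count lies in [lo, hi] */

	assert(status != NULL);
	while (lo < hi) {
		uint16_t mid = lo + (hi - lo) / 2;

		if (status(rxq, mid) == RTE_ETH_RX_DESC_DONE)
			lo = mid + 1; /* mid is done: count is > mid */
		else
			hi = mid;     /* mid is not done: count is <= mid */
	}
	return lo;
}

/* Mock queue for illustration: *rxq holds the number of DONE descs. */
static int
mock_status(void *rxq, uint16_t offset)
{
	uint16_t nb_done = *(uint16_t *)rxq;

	return offset < nb_done ? RTE_ETH_RX_DESC_DONE : RTE_ETH_RX_DESC_AVAIL;
}
```

For a ring of N descriptors this needs about log2(N) status reads, each of
which may miss the cache, so it only makes sense when an exact count is
really required.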
Olivier Matz (6):
  ethdev: add descriptor status API
  net/ixgbe: implement descriptor status API
  net/e1000: implement descriptor status API (igb)
  net/e1000: implement descriptor status API (em)
  net/mlx5: implement descriptor status API
  net/i40e: implement descriptor status API
 doc/guides/nics/features/default.ini      |   2 +
 doc/guides/nics/features/e1000.ini        |   2 +
 doc/guides/nics/features/i40e.ini         |   2 +
 doc/guides/nics/features/i40e_vec.ini     |   2 +
 doc/guides/nics/features/i40e_vf.ini      |   2 +
 doc/guides/nics/features/i40e_vf_vec.ini  |   2 +
 doc/guides/nics/features/igb.ini          |   2 +
 doc/guides/nics/features/igb_vf.ini       |   2 +
 doc/guides/nics/features/ixgbe.ini        |   2 +
 doc/guides/nics/features/ixgbe_vec.ini    |   2 +
 doc/guides/nics/features/ixgbe_vf.ini     |   2 +
 doc/guides/nics/features/ixgbe_vf_vec.ini |   2 +
 doc/guides/nics/features/mlx5.ini         |   2 +
 doc/guides/rel_notes/release_17_02.rst    |   8 ++
 drivers/net/e1000/e1000_ethdev.h          |   6 ++
 drivers/net/e1000/em_ethdev.c             |   2 +
 drivers/net/e1000/em_rxtx.c               |  51 ++++++++++++
 drivers/net/e1000/igb_ethdev.c            |   2 +
 drivers/net/e1000/igb_rxtx.c              |  45 +++++++++++
 drivers/net/i40e/i40e_ethdev.c            |   2 +
 drivers/net/i40e/i40e_ethdev_vf.c         |   2 +
 drivers/net/i40e/i40e_rxtx.c              |  58 ++++++++++++++
 drivers/net/i40e/i40e_rxtx.h              |   2 +
 drivers/net/ixgbe/ixgbe_ethdev.c          |   4 +
 drivers/net/ixgbe/ixgbe_ethdev.h          |   3 +
 drivers/net/ixgbe/ixgbe_rxtx.c            |  57 ++++++++++++++
 drivers/net/mlx5/mlx5.c                   |   2 +
 drivers/net/mlx5/mlx5_rxtx.c              |  76 ++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.h              |   2 +
 lib/librte_ether/rte_ethdev.h             | 125 ++++++++++++++++++++++++++++++
 30 files changed, 473 insertions(+)
-- 
2.11.0
^ permalink raw reply	[flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 1/6] ethdev: add descriptor status API
  2017-03-29  8:36     ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Olivier Matz
@ 2017-03-29  8:36       ` Olivier Matz
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 2/6] net/ixgbe: implement " Olivier Matz
                         ` (5 subsequent siblings)
  6 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-29  8:36 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko,
	qiming.yang
Introduce a new API to get the status of a descriptor.
For Rx, it is similar to the rx_descriptor_done API, except that it
differentiates "used" descriptors (which are held by the driver and not
returned to the hardware).
For Tx, it is a new API.
The descriptor_done() API, and probably the rx_queue_count() API could
be replaced by this new API as soon as it is implemented on all PMDs.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/nics/features/default.ini   |   2 +
 doc/guides/rel_notes/release_17_02.rst |   8 +++
 lib/librte_ether/rte_ethdev.h          | 125 +++++++++++++++++++++++++++++++++
 3 files changed, 135 insertions(+)
diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index 0135c0c4e..effdbe642 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -51,6 +51,8 @@ Inner L3 checksum    =
 Inner L4 checksum    =
 Packet type parsing  =
 Timesync             =
+Rx Descriptor Status =
+Tx Descriptor Status =
 Basic stats          =
 Extended stats       =
 Stats per queue      =
diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 357965ac9..524d2ac2b 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -250,6 +250,14 @@ New Features
   See the :ref:`Elastic Flow Distributor Library <Efd_Library>` documentation in
   the Programmers Guide document, for more information.
 
+* **Added descriptor status ethdev API.**
+
+  Added a new API to get the status of a descriptor.
+
+  For Rx, it is similar to the *rx_descriptor_done* API, except that
+  it differentiates descriptors which are held by the driver and not
+  returned to the hardware. For Tx, it is a new API.
+
 
 Resolved Issues
 ---------------
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index b3ee8728b..71a35f242 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1179,6 +1179,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
 typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
 /**< @internal Check DD bit of specific RX descriptor */
 
+typedef int (*eth_rx_descriptor_status_t)(void *rxq, uint16_t offset);
+/**< @internal Check the status of a Rx descriptor */
+
+typedef int (*eth_tx_descriptor_status_t)(void *txq, uint16_t offset);
+/**< @internal Check the status of a Tx descriptor */
+
 typedef int (*eth_fw_version_get_t)(struct rte_eth_dev *dev,
 				     char *fw_version, size_t fw_size);
 /**< @internal Get firmware information of an Ethernet device. */
@@ -1487,6 +1493,10 @@ struct eth_dev_ops {
 	eth_rx_queue_count_t       rx_queue_count;
 	/**< Get the number of used RX descriptors. */
 	eth_rx_descriptor_done_t   rx_descriptor_done; /**< Check rxd DD bit. */
+	eth_rx_descriptor_status_t rx_descriptor_status;
+	/**< Check the status of a Rx descriptor. */
+	eth_tx_descriptor_status_t tx_descriptor_status;
+	/**< Check the status of a Tx descriptor. */
 	eth_rx_enable_intr_t       rx_queue_intr_enable;  /**< Enable Rx queue interrupt. */
 	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
 	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
@@ -2778,6 +2788,121 @@ rte_eth_rx_descriptor_done(uint8_t port_id, uint16_t queue_id, uint16_t offset)
 		dev->data->rx_queues[queue_id], offset);
 }
 
+#define RTE_ETH_RX_DESC_AVAIL    0 /**< Desc available for hw. */
+#define RTE_ETH_RX_DESC_DONE     1 /**< Desc done, filled by hw. */
+#define RTE_ETH_RX_DESC_UNAVAIL  2 /**< Desc used by driver or hw. */
+
+/**
+ * Check the status of a Rx descriptor in the queue.
+ *
+ * It should be called in a similar context to the Rx function:
+ * - on a dataplane core
+ * - not concurrently on the same queue
+ *
+ * Since it's a dataplane function, no check is performed on port_id and
+ * queue_id. The caller must therefore ensure that the port is enabled
+ * and the queue is configured and running.
+ *
+ * Note: accessing a random descriptor in the ring may trigger cache
+ * misses and have a performance impact.
+ *
+ * @param port_id
+ *  A valid port identifier of the Ethernet device.
+ * @param queue_id
+ *  A valid Rx queue identifier on this port.
+ * @param offset
+ *  The offset of the descriptor starting from tail (0 is the next
+ *  packet to be received by the driver).
+ *
+ * @return
+ *  - (RTE_ETH_RX_DESC_AVAIL): Descriptor is available for the hardware to
+ *    receive a packet.
+ *  - (RTE_ETH_RX_DESC_DONE): Descriptor is done, it is filled by hw, but
+ *    not yet processed by the driver (i.e. in the receive queue).
+ *  - (RTE_ETH_RX_DESC_UNAVAIL): Descriptor is unavailable, either held by
+ *    the driver and not yet returned to hw, or reserved by the hw.
+ *  - (-EINVAL) bad descriptor offset.
+ *  - (-ENOTSUP) if the device does not support this function.
+ *  - (-ENODEV) bad port or queue (only if compiled with debug).
+ */
+static inline int
+rte_eth_rx_descriptor_status(uint8_t port_id, uint16_t queue_id,
+	uint16_t offset)
+{
+	struct rte_eth_dev *dev;
+	void *rxq;
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+#endif
+	dev = &rte_eth_devices[port_id];
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	if (queue_id >= dev->data->nb_rx_queues)
+		return -ENODEV;
+#endif
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_descriptor_status, -ENOTSUP);
+	rxq = dev->data->rx_queues[queue_id];
+
+	return (*dev->dev_ops->rx_descriptor_status)(rxq, offset);
+}
+
+#define RTE_ETH_TX_DESC_FULL    0 /**< Desc filled for hw, waiting xmit. */
+#define RTE_ETH_TX_DESC_DONE    1 /**< Desc done, packet is transmitted. */
+#define RTE_ETH_TX_DESC_UNAVAIL 2 /**< Desc used by driver or hw. */
+
+/**
+ * Check the status of a Tx descriptor in the queue.
+ *
+ * It should be called in a similar context to the Tx function:
+ * - on a dataplane core
+ * - not concurrently on the same queue
+ *
+ * Since it's a dataplane function, no check is performed on port_id and
+ * queue_id. The caller must therefore ensure that the port is enabled
+ * and the queue is configured and running.
+ *
+ * Note: accessing a random descriptor in the ring may trigger cache
+ * misses and have a performance impact.
+ *
+ * @param port_id
+ *  A valid port identifier of the Ethernet device.
+ * @param queue_id
+ *  A valid Tx queue identifier on this port.
+ * @param offset
+ *  The offset of the descriptor starting from tail (0 is the place where
+ *  the next packet will be sent).
+ *
+ * @return
+ *  - (RTE_ETH_TX_DESC_FULL): Descriptor is being processed by the hw, i.e.
+ *    in the transmit queue.
+ *  - (RTE_ETH_TX_DESC_DONE): Hardware is done with this descriptor, it can
+ *    be reused by the driver.
+ *  - (RTE_ETH_TX_DESC_UNAVAIL): Descriptor is unavailable, reserved by the
+ *    driver or the hardware.
+ *  - (-EINVAL) bad descriptor offset.
+ *  - (-ENOTSUP) if the device does not support this function.
+ *  - (-ENODEV) bad port or queue (only if compiled with debug).
+ */
+static inline int rte_eth_tx_descriptor_status(uint8_t port_id,
+	uint16_t queue_id, uint16_t offset)
+{
+	struct rte_eth_dev *dev;
+	void *txq;
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+#endif
+	dev = &rte_eth_devices[port_id];
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	if (queue_id >= dev->data->nb_tx_queues)
+		return -ENODEV;
+#endif
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_descriptor_status, -ENOTSUP);
+	txq = dev->data->tx_queues[queue_id];
+
+	return (*dev->dev_ops->tx_descriptor_status)(txq, offset);
+}
+
 /**
  * Send a burst of output packets on a transmit queue of an Ethernet device.
  *
-- 
2.11.0
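
As an illustration of how an application might use the Rx status call from the patch above, here is a minimal sketch that counts DONE descriptors with a binary search rather than a linear scan. The names `desc_status_fn`, `mock_status` and `count_done` are hypothetical and not part of ethdev; the mock stands in for rte_eth_rx_descriptor_status() so the sketch builds without DPDK. It assumes the DONE descriptors form one contiguous run starting at offset 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror the RTE_ETH_RX_DESC_* defines added by the patch. */
#define RX_DESC_AVAIL   0
#define RX_DESC_DONE    1
#define RX_DESC_UNAVAIL 2

/* Hypothetical callback type standing in for rte_eth_rx_descriptor_status(). */
typedef int (*desc_status_fn)(void *ctx, uint16_t offset);

/* Mock queue: the first *ctx offsets are DONE, all others are AVAIL. */
static int mock_status(void *ctx, uint16_t offset)
{
	uint16_t done = *(uint16_t *)ctx;

	return offset < done ? RX_DESC_DONE : RX_DESC_AVAIL;
}

/*
 * Count DONE descriptors, assuming they are contiguous from offset 0.
 * Needs O(log nb_desc) status queries. Any other return value
 * (AVAIL, UNAVAIL or a negative error) is treated as "not DONE".
 */
static uint16_t count_done(desc_status_fn status, void *ctx, uint16_t nb_desc)
{
	uint16_t lo = 0, hi = nb_desc; /* [0, lo) is DONE, [hi, nb_desc) is not */

	assert(status != NULL);
	while (lo < hi) {
		uint16_t mid = lo + (hi - lo) / 2;

		if (status(ctx, mid) == RX_DESC_DONE)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}
```

With a real port, the callback would simply forward to rte_eth_rx_descriptor_status(port_id, queue_id, offset). Each probe touches a different descriptor, so the cache-miss caveat from the API comment applies here too.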
* [dpdk-dev] [PATCH v3 2/6] net/ixgbe: implement descriptor status API
  2017-03-29  8:36     ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Olivier Matz
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 1/6] ethdev: add descriptor status API Olivier Matz
@ 2017-03-29  8:36       ` Olivier Matz
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 3/6] net/e1000: implement descriptor status API (igb) Olivier Matz
                         ` (4 subsequent siblings)
  6 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-29  8:36 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko,
	qiming.yang
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/nics/features/ixgbe.ini        |  2 ++
 doc/guides/nics/features/ixgbe_vec.ini    |  2 ++
 doc/guides/nics/features/ixgbe_vf.ini     |  2 ++
 doc/guides/nics/features/ixgbe_vf_vec.ini |  2 ++
 drivers/net/ixgbe/ixgbe_ethdev.c          |  4 +++
 drivers/net/ixgbe/ixgbe_ethdev.h          |  3 ++
 drivers/net/ixgbe/ixgbe_rxtx.c            | 57 +++++++++++++++++++++++++++++++
 7 files changed, 72 insertions(+)
diff --git a/doc/guides/nics/features/ixgbe.ini b/doc/guides/nics/features/ixgbe.ini
index e65bbb883..d59ed43fc 100644
--- a/doc/guides/nics/features/ixgbe.ini
+++ b/doc/guides/nics/features/ixgbe.ini
@@ -42,6 +42,8 @@ Inner L3 checksum    = Y
 Inner L4 checksum    = Y
 Packet type parsing  = Y
 Timesync             = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Stats per queue      = Y
diff --git a/doc/guides/nics/features/ixgbe_vec.ini b/doc/guides/nics/features/ixgbe_vec.ini
index e1773dd6a..1a9326e5d 100644
--- a/doc/guides/nics/features/ixgbe_vec.ini
+++ b/doc/guides/nics/features/ixgbe_vec.ini
@@ -32,6 +32,8 @@ Flow control         = Y
 Rate limitation      = Y
 Traffic mirroring    = Y
 Timesync             = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Stats per queue      = Y
diff --git a/doc/guides/nics/features/ixgbe_vf.ini b/doc/guides/nics/features/ixgbe_vf.ini
index bf28215da..8be1db8ac 100644
--- a/doc/guides/nics/features/ixgbe_vf.ini
+++ b/doc/guides/nics/features/ixgbe_vf.ini
@@ -25,6 +25,8 @@ L4 checksum offload  = Y
 Inner L3 checksum    = Y
 Inner L4 checksum    = Y
 Packet type parsing  = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Registers dump       = Y
diff --git a/doc/guides/nics/features/ixgbe_vf_vec.ini b/doc/guides/nics/features/ixgbe_vf_vec.ini
index 8b8c90ba8..f02251f8b 100644
--- a/doc/guides/nics/features/ixgbe_vf_vec.ini
+++ b/doc/guides/nics/features/ixgbe_vf_vec.ini
@@ -17,6 +17,8 @@ RSS hash             = Y
 RSS key update       = Y
 RSS reta update      = Y
 VLAN filter          = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Registers dump       = Y
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 8a249db91..4fd22c5ee 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -548,6 +548,8 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_queue_count       = ixgbe_dev_rx_queue_count,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
+	.rx_descriptor_status = ixgbe_dev_rx_descriptor_status,
+	.tx_descriptor_status = ixgbe_dev_tx_descriptor_status,
 	.tx_queue_setup       = ixgbe_dev_tx_queue_setup,
 	.tx_queue_release     = ixgbe_dev_tx_queue_release,
 	.dev_led_on           = ixgbe_dev_led_on,
@@ -626,6 +628,8 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
 	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
+	.rx_descriptor_status = ixgbe_dev_rx_descriptor_status,
+	.tx_descriptor_status = ixgbe_dev_tx_descriptor_status,
 	.tx_queue_setup       = ixgbe_dev_tx_queue_setup,
 	.tx_queue_release     = ixgbe_dev_tx_queue_release,
 	.rx_queue_intr_enable = ixgbevf_dev_rx_queue_intr_enable,
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 058ad8700..f4ff6ad04 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -527,6 +527,9 @@ uint32_t ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev,
 
 int ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
+int ixgbe_dev_rx_descriptor_status(void *rx_queue, uint16_t offset);
+int ixgbe_dev_tx_descriptor_status(void *tx_queue, uint16_t offset);
+
 int ixgbe_dev_rx_init(struct rte_eth_dev *dev);
 
 void ixgbe_dev_tx_init(struct rte_eth_dev *dev);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index c0805666b..c9993abc6 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -2945,6 +2945,63 @@ ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset)
 			rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD));
 }
 
+int
+ixgbe_dev_rx_descriptor_status(void *rx_queue, uint16_t offset)
+{
+	struct ixgbe_rx_queue *rxq = rx_queue;
+	volatile uint32_t *status;
+	uint32_t nb_hold, desc;
+
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+#ifdef RTE_IXGBE_INC_VECTOR
+	if (rxq->rx_using_sse)
+		nb_hold = rxq->rxrearm_nb;
+	else
+#endif
+		nb_hold = rxq->nb_rx_hold;
+	if (offset >= rxq->nb_rx_desc - nb_hold)
+		return RTE_ETH_RX_DESC_UNAVAIL;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].wb.upper.status_error;
+	if (*status & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+ixgbe_dev_tx_descriptor_status(void *tx_queue, uint16_t offset)
+{
+	struct ixgbe_tx_queue *txq = tx_queue;
+	volatile uint32_t *status;
+	uint32_t desc;
+
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	/* go to next desc that has the RS bit */
+	desc = ((desc + txq->tx_rs_thresh - 1) / txq->tx_rs_thresh) *
+		txq->tx_rs_thresh;
+	if (desc >= txq->nb_tx_desc) {
+		desc -= txq->nb_tx_desc;
+		if (desc >= txq->nb_tx_desc)
+			desc -= txq->nb_tx_desc;
+	}
+
+	status = &txq->tx_ring[desc].wb.status;
+	if (*status & rte_cpu_to_le_32(IXGBE_ADVTXD_STAT_DD))
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
 void __attribute__((cold))
 ixgbe_dev_clear_queues(struct rte_eth_dev *dev)
 {
-- 
2.11.0
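
The index arithmetic in ixgbe_dev_tx_descriptor_status() above can be checked in isolation. The sketch below uses a hypothetical helper, `tx_status_index`, which is not part of the driver. It assumes, as the asserts state, that the ring size is a multiple of tx_rs_thresh. It shows why the wrap needs two subtractions: tail + offset can approach 2 * nb_desc, and rounding up to the next multiple of the threshold can then land exactly on 2 * nb_desc.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Ring index whose DD bit is inspected for a given offset from tx_tail:
 * the sum is rounded up to a multiple of rs_thresh, then wrapped.
 * Hypothetical helper mirroring the arithmetic in the patch.
 */
static uint32_t tx_status_index(uint16_t tail, uint16_t offset,
				uint16_t nb_desc, uint16_t rs_thresh)
{
	uint32_t desc;

	/* Assumptions of this sketch, not checks made by the driver code. */
	assert(rs_thresh > 0 && nb_desc % rs_thresh == 0);
	assert(tail < nb_desc && offset < nb_desc);

	desc = (uint32_t)tail + offset;
	desc = ((desc + rs_thresh - 1) / rs_thresh) * rs_thresh;
	/* desc can be as large as 2 * nb_desc, so wrap up to twice. */
	if (desc >= nb_desc) {
		desc -= nb_desc;
		if (desc >= nb_desc)
			desc -= nb_desc;
	}
	return desc;
}
```

For example, with a 512-entry ring and a threshold of 32, tail 500 and offset 511 sum to 1011, which rounds up to 1024 and wraps to index 0 only after the second subtraction.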
* [dpdk-dev] [PATCH v3 3/6] net/e1000: implement descriptor status API (igb)
  2017-03-29  8:36     ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Olivier Matz
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 1/6] ethdev: add descriptor status API Olivier Matz
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 2/6] net/ixgbe: implement " Olivier Matz
@ 2017-03-29  8:36       ` Olivier Matz
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 4/6] net/e1000: implement descriptor status API (em) Olivier Matz
                         ` (3 subsequent siblings)
  6 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-29  8:36 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko,
	qiming.yang
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
---
 doc/guides/nics/features/igb.ini    |  2 ++
 doc/guides/nics/features/igb_vf.ini |  2 ++
 drivers/net/e1000/e1000_ethdev.h    |  3 +++
 drivers/net/e1000/igb_ethdev.c      |  2 ++
 drivers/net/e1000/igb_rxtx.c        | 45 +++++++++++++++++++++++++++++++++++++
 5 files changed, 54 insertions(+)
diff --git a/doc/guides/nics/features/igb.ini b/doc/guides/nics/features/igb.ini
index 26ae008b3..6a7df6071 100644
--- a/doc/guides/nics/features/igb.ini
+++ b/doc/guides/nics/features/igb.ini
@@ -33,6 +33,8 @@ L3 checksum offload  = Y
 L4 checksum offload  = Y
 Packet type parsing  = Y
 Timesync             = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 FW version           = Y
diff --git a/doc/guides/nics/features/igb_vf.ini b/doc/guides/nics/features/igb_vf.ini
index b61782028..653b5da87 100644
--- a/doc/guides/nics/features/igb_vf.ini
+++ b/doc/guides/nics/features/igb_vf.ini
@@ -17,6 +17,8 @@ QinQ offload         = Y
 L3 checksum offload  = Y
 L4 checksum offload  = Y
 Packet type parsing  = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Registers dump       = Y
diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index 39b2f435b..cb760e986 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -311,6 +311,9 @@ uint32_t eth_igb_rx_queue_count(struct rte_eth_dev *dev,
 
 int eth_igb_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
+int eth_igb_rx_descriptor_status(void *rx_queue, uint16_t offset);
+int eth_igb_tx_descriptor_status(void *tx_queue, uint16_t offset);
+
 int eth_igb_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 		uint16_t nb_tx_desc, unsigned int socket_id,
 		const struct rte_eth_txconf *tx_conf);
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 71d05a938..a73cd7a02 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -406,6 +406,8 @@ static const struct eth_dev_ops eth_igb_ops = {
 	.rx_queue_release     = eth_igb_rx_queue_release,
 	.rx_queue_count       = eth_igb_rx_queue_count,
 	.rx_descriptor_done   = eth_igb_rx_descriptor_done,
+	.rx_descriptor_status = eth_igb_rx_descriptor_status,
+	.tx_descriptor_status = eth_igb_tx_descriptor_status,
 	.tx_queue_setup       = eth_igb_tx_queue_setup,
 	.tx_queue_release     = eth_igb_tx_queue_release,
 	.tx_done_cleanup      = eth_igb_tx_done_cleanup,
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index cba3704ba..b3b601b79 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -1727,6 +1727,51 @@ eth_igb_rx_descriptor_done(void *rx_queue, uint16_t offset)
 	return !!(rxdp->wb.upper.status_error & E1000_RXD_STAT_DD);
 }
 
+int
+eth_igb_rx_descriptor_status(void *rx_queue, uint16_t offset)
+{
+	struct igb_rx_queue *rxq = rx_queue;
+	volatile uint32_t *status;
+	uint32_t desc;
+
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+	if (offset >= rxq->nb_rx_desc - rxq->nb_rx_hold)
+		return RTE_ETH_RX_DESC_UNAVAIL;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].wb.upper.status_error;
+	if (*status & rte_cpu_to_le_32(E1000_RXD_STAT_DD))
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+eth_igb_tx_descriptor_status(void *tx_queue, uint16_t offset)
+{
+	struct igb_tx_queue *txq = tx_queue;
+	volatile uint32_t *status;
+	uint32_t desc;
+
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	if (desc >= txq->nb_tx_desc)
+		desc -= txq->nb_tx_desc;
+
+	status = &txq->tx_ring[desc].wb.status;
+	if (*status & rte_cpu_to_le_32(E1000_TXD_STAT_DD))
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
 void
 igb_dev_clear_queues(struct rte_eth_dev *dev)
 {
-- 
2.11.0
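
The three-way Rx classification used by the ixgbe and igb patches above (bad offset, held by the driver, then DD bit) can be modelled without hardware. This is a simplified sketch: `struct toy_rxq` and `toy_rx_status` are hypothetical, the DD bit is reduced to one byte per descriptor, and -1 stands in for -EINVAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { RX_AVAIL = 0, RX_DONE = 1, RX_UNAVAIL = 2 };

/* Hypothetical, simplified Rx queue: one DD flag per descriptor. */
struct toy_rxq {
	uint16_t nb_desc;  /* ring size */
	uint16_t tail;     /* next descriptor the driver will read */
	uint16_t nb_hold;  /* processed by the driver, not yet returned to hw */
	const uint8_t *dd; /* dd[i] != 0 once hw has written back descriptor i */
};

static int toy_rx_status(const struct toy_rxq *q, uint16_t offset)
{
	uint32_t desc;

	assert(q != NULL && q->dd != NULL && q->nb_hold <= q->nb_desc);
	if (offset >= q->nb_desc)
		return -1; /* the drivers return -EINVAL here */
	/* The last nb_hold offsets map to descriptors still held by the driver. */
	if (offset >= q->nb_desc - q->nb_hold)
		return RX_UNAVAIL;
	desc = (uint32_t)q->tail + offset;
	if (desc >= q->nb_desc)
		desc -= q->nb_desc; /* single wrap is enough: desc < 2 * nb_desc */
	return q->dd[desc] ? RX_DONE : RX_AVAIL;
}
```

The held-descriptor test comes before the DD test on purpose: descriptors that the driver has consumed but not yet re-armed may still carry a stale DD bit, so reading their status word would be misleading.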
* [dpdk-dev] [PATCH v3 4/6] net/e1000: implement descriptor status API (em)
  2017-03-29  8:36     ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Olivier Matz
                         ` (2 preceding siblings ...)
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 3/6] net/e1000: implement descriptor status API (igb) Olivier Matz
@ 2017-03-29  8:36       ` Olivier Matz
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 5/6] net/mlx5: implement descriptor status API Olivier Matz
                         ` (2 subsequent siblings)
  6 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-29  8:36 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko,
	qiming.yang
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/nics/features/e1000.ini |  2 ++
 drivers/net/e1000/e1000_ethdev.h   |  3 +++
 drivers/net/e1000/em_ethdev.c      |  2 ++
 drivers/net/e1000/em_rxtx.c        | 51 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 58 insertions(+)
diff --git a/doc/guides/nics/features/e1000.ini b/doc/guides/nics/features/e1000.ini
index 3aed7d709..6a7c6c7d7 100644
--- a/doc/guides/nics/features/e1000.ini
+++ b/doc/guides/nics/features/e1000.ini
@@ -21,6 +21,8 @@ VLAN offload         = Y
 QinQ offload         = Y
 L3 checksum offload  = Y
 L4 checksum offload  = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 BSD nic_uio          = Y
 Linux UIO            = Y
diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index cb760e986..8352d0a79 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -380,6 +380,9 @@ uint32_t eth_em_rx_queue_count(struct rte_eth_dev *dev,
 
 int eth_em_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
+int eth_em_rx_descriptor_status(void *rx_queue, uint16_t offset);
+int eth_em_tx_descriptor_status(void *tx_queue, uint16_t offset);
+
 int eth_em_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 		uint16_t nb_tx_desc, unsigned int socket_id,
 		const struct rte_eth_txconf *tx_conf);
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index e76e34bbe..7110af344 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -205,6 +205,8 @@ static const struct eth_dev_ops eth_em_ops = {
 	.rx_queue_release     = eth_em_rx_queue_release,
 	.rx_queue_count       = eth_em_rx_queue_count,
 	.rx_descriptor_done   = eth_em_rx_descriptor_done,
+	.rx_descriptor_status = eth_em_rx_descriptor_status,
+	.tx_descriptor_status = eth_em_tx_descriptor_status,
 	.tx_queue_setup       = eth_em_tx_queue_setup,
 	.tx_queue_release     = eth_em_tx_queue_release,
 	.rx_queue_intr_enable = eth_em_rx_queue_intr_enable,
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c
index a265cb41c..31819c5bd 100644
--- a/drivers/net/e1000/em_rxtx.c
+++ b/drivers/net/e1000/em_rxtx.c
@@ -1468,6 +1468,57 @@ eth_em_rx_descriptor_done(void *rx_queue, uint16_t offset)
 	return !!(rxdp->status & E1000_RXD_STAT_DD);
 }
 
+int
+eth_em_rx_descriptor_status(void *rx_queue, uint16_t offset)
+{
+	struct em_rx_queue *rxq = rx_queue;
+	volatile uint8_t *status;
+	uint32_t desc;
+
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+	if (offset >= rxq->nb_rx_desc - rxq->nb_rx_hold)
+		return RTE_ETH_RX_DESC_UNAVAIL;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].status;
+	if (*status & E1000_RXD_STAT_DD)
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+eth_em_tx_descriptor_status(void *tx_queue, uint16_t offset)
+{
+	struct em_tx_queue *txq = tx_queue;
+	volatile uint8_t *status;
+	uint32_t desc;
+
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	/* go to next desc that has the RS bit */
+	desc = ((desc + txq->tx_rs_thresh - 1) / txq->tx_rs_thresh) *
+		txq->tx_rs_thresh;
+	if (desc >= txq->nb_tx_desc) {
+		desc -= txq->nb_tx_desc;
+		if (desc >= txq->nb_tx_desc)
+			desc -= txq->nb_tx_desc;
+	}
+
+	status = &txq->tx_ring[desc].upper.fields.status;
+	if (*status & E1000_TXD_STAT_DD)
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
 void
 em_dev_clear_queues(struct rte_eth_dev *dev)
 {
-- 
2.11.0
* [dpdk-dev] [PATCH v3 5/6] net/mlx5: implement descriptor status API
  2017-03-29  8:36     ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Olivier Matz
                         ` (3 preceding siblings ...)
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 4/6] net/e1000: implement descriptor status API (em) Olivier Matz
@ 2017-03-29  8:36       ` Olivier Matz
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 6/6] net/i40e: " Olivier Matz
  2017-03-30 13:30       ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Thomas Monjalon
  6 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-29  8:36 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko,
	qiming.yang
Since mlx5 has no "descriptor done" flag like the Intel drivers do, the
approach is different in the mlx5 driver:
- for Tx, we call txq_complete() to free descriptors processed by
  the hw, then we check if the descriptor is between tail and head
- for Rx, we need to browse the cqes, managing compressed ones,
  to get the number of used descriptors.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/nics/features/mlx5.ini |  2 ++
 drivers/net/mlx5/mlx5.c           |  2 ++
 drivers/net/mlx5/mlx5_rxtx.c      | 76 +++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.h      |  2 ++
 4 files changed, 82 insertions(+)
diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini
index 532c0ef3e..5b728ef06 100644
--- a/doc/guides/nics/features/mlx5.ini
+++ b/doc/guides/nics/features/mlx5.ini
@@ -31,6 +31,8 @@ L4 checksum offload  = Y
 Inner L3 checksum    = Y
 Inner L4 checksum    = Y
 Packet type parsing  = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Stats per queue      = Y
 Multiprocess aware   = Y
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index f034e889b..33a7f58b4 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -233,6 +233,8 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.filter_ctrl = mlx5_dev_filter_ctrl,
 	.rx_queue_intr_enable = mlx5_rx_intr_enable,
 	.rx_queue_intr_disable = mlx5_rx_intr_disable,
+	.rx_descriptor_status = mlx5_rx_descriptor_status,
+	.tx_descriptor_status = mlx5_tx_descriptor_status,
 };
 
 static struct {
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index a1dd84a14..c336081d1 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -401,6 +401,82 @@ mlx5_tx_dbrec(struct txq *txq, volatile struct mlx5_wqe *wqe)
 }
 
 /**
+ * DPDK callback to check the status of a tx descriptor.
+ *
+ * @param tx_queue
+ *   The tx queue.
+ * @param[in] offset
+ *   The index of the descriptor in the ring.
+ *
+ * @return
+ *   The status of the tx descriptor.
+ */
+int
+mlx5_tx_descriptor_status(void *tx_queue, uint16_t offset)
+{
+	struct txq *txq = tx_queue;
+	const unsigned int elts_n = 1 << txq->elts_n;
+	const unsigned int elts_cnt = elts_n - 1;
+	unsigned int used;
+
+	txq_complete(txq);
+	used = (txq->elts_head - txq->elts_tail) & elts_cnt;
+	if (offset < used)
+		return RTE_ETH_TX_DESC_FULL;
+	return RTE_ETH_TX_DESC_DONE;
+}
+
+/**
+ * DPDK callback to check the status of a rx descriptor.
+ *
+ * @param rx_queue
+ *   The rx queue.
+ * @param[in] offset
+ *   The index of the descriptor in the ring.
+ *
+ * @return
+ *   The status of the rx descriptor.
+ */
+int
+mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset)
+{
+	struct rxq *rxq = rx_queue;
+	struct rxq_zip *zip = &rxq->zip;
+	volatile struct mlx5_cqe *cqe;
+	const unsigned int cqe_n = (1 << rxq->cqe_n);
+	const unsigned int cqe_cnt = cqe_n - 1;
+	unsigned int cq_ci;
+	unsigned int used;
+
+	/* if we are processing a compressed cqe */
+	if (zip->ai) {
+		used = zip->cqe_cnt - zip->ca;
+		cq_ci = zip->cq_ci;
+	} else {
+		used = 0;
+		cq_ci = rxq->cq_ci;
+	}
+	cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
+	while (check_cqe(cqe, cqe_n, cq_ci) == 0) {
+		int8_t op_own;
+		unsigned int n;
+
+		op_own = cqe->op_own;
+		if (MLX5_CQE_FORMAT(op_own) == MLX5_COMPRESSED)
+			n = ntohl(cqe->byte_cnt);
+		else
+			n = 1;
+		cq_ci += n;
+		used += n;
+		cqe = &(*rxq->cqes)[cq_ci & cqe_cnt];
+	}
+	used = RTE_MIN(used, (1U << rxq->elts_n) - 1);
+	if (offset < used)
+		return RTE_ETH_RX_DESC_DONE;
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+/**
  * DPDK callback for TX.
  *
  * @param dpdk_txq
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 4a4bd8402..b777efa12 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -339,6 +339,8 @@ uint16_t removed_tx_burst(void *, struct rte_mbuf **, uint16_t);
 uint16_t removed_rx_burst(void *, struct rte_mbuf **, uint16_t);
 int mlx5_rx_intr_enable(struct rte_eth_dev *dev, uint16_t rx_queue_id);
 int mlx5_rx_intr_disable(struct rte_eth_dev *dev, uint16_t rx_queue_id);
+int mlx5_rx_descriptor_status(void *, uint16_t);
+int mlx5_tx_descriptor_status(void *, uint16_t);
 
 /* mlx5_mr.c */
 
-- 
2.11.0
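
The Tx side of the mlx5 patch above reduces to "how many elements lie between tail and head in a power-of-two ring". A minimal sketch of that computation follows. The helper names `tx_used` and `tx_in_flight` are hypothetical, and the sketch leaves out the txq_complete() step that the driver runs first to advance the tail.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of in-flight Tx elements in a ring of 2^log2_n entries.
 * Masking the difference gives the right answer both when head and
 * tail are ring indices and when they are free-running 16-bit counters
 * that have wrapped. A completely full ring would read as 0, so this
 * assumes at most 2^log2_n - 1 elements are in flight.
 */
static unsigned int tx_used(uint16_t head, uint16_t tail, unsigned int log2_n)
{
	unsigned int mask;

	assert(log2_n >= 1 && log2_n <= 16);
	mask = (1u << log2_n) - 1;
	return ((unsigned int)head - (unsigned int)tail) & mask;
}

/* 1 if the element at this offset is still in flight, 0 otherwise. */
static int tx_in_flight(uint16_t head, uint16_t tail, unsigned int log2_n,
			uint16_t offset)
{
	return offset < tx_used(head, tail, log2_n);
}
```

In the patch, an offset below the used count maps to RTE_ETH_TX_DESC_FULL and anything else to RTE_ETH_TX_DESC_DONE.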
* [dpdk-dev] [PATCH v3 6/6] net/i40e: implement descriptor status API
  2017-03-29  8:36     ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Olivier Matz
                         ` (4 preceding siblings ...)
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 5/6] net/mlx5: implement descriptor status API Olivier Matz
@ 2017-03-29  8:36       ` Olivier Matz
  2017-03-30 13:30       ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Thomas Monjalon
  6 siblings, 0 replies; 72+ messages in thread
From: Olivier Matz @ 2017-03-29  8:36 UTC (permalink / raw)
  To: dev, thomas.monjalon, konstantin.ananyev, wenzhuo.lu,
	helin.zhang, jingjing.wu, adrien.mazarguil, nelio.laranjeiro
  Cc: ferruh.yigit, bruce.richardson, venky.venkatesan, arybchenko,
	qiming.yang
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
---
 doc/guides/nics/features/i40e.ini        |  2 ++
 doc/guides/nics/features/i40e_vec.ini    |  2 ++
 doc/guides/nics/features/i40e_vf.ini     |  2 ++
 doc/guides/nics/features/i40e_vf_vec.ini |  2 ++
 drivers/net/i40e/i40e_ethdev.c           |  2 ++
 drivers/net/i40e/i40e_ethdev_vf.c        |  2 ++
 drivers/net/i40e/i40e_rxtx.c             | 58 ++++++++++++++++++++++++++++++++
 drivers/net/i40e/i40e_rxtx.h             |  2 ++
 8 files changed, 72 insertions(+)
diff --git a/doc/guides/nics/features/i40e.ini b/doc/guides/nics/features/i40e.ini
index 703001510..7c6c97cd0 100644
--- a/doc/guides/nics/features/i40e.ini
+++ b/doc/guides/nics/features/i40e.ini
@@ -38,6 +38,8 @@ Inner L3 checksum    = Y
 Inner L4 checksum    = Y
 Packet type parsing  = Y
 Timesync             = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 FW version           = Y
diff --git a/doc/guides/nics/features/i40e_vec.ini b/doc/guides/nics/features/i40e_vec.ini
index 5ec4088cb..372aa0154 100644
--- a/doc/guides/nics/features/i40e_vec.ini
+++ b/doc/guides/nics/features/i40e_vec.ini
@@ -29,6 +29,8 @@ Flow director        = Y
 Flow control         = Y
 Traffic mirroring    = Y
 Timesync             = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Multiprocess aware   = Y
diff --git a/doc/guides/nics/features/i40e_vf.ini b/doc/guides/nics/features/i40e_vf.ini
index 2f82c6b98..60bdf7aa5 100644
--- a/doc/guides/nics/features/i40e_vf.ini
+++ b/doc/guides/nics/features/i40e_vf.ini
@@ -26,6 +26,8 @@ L4 checksum offload  = Y
 Inner L3 checksum    = Y
 Inner L4 checksum    = Y
 Packet type parsing  = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Multiprocess aware   = Y
diff --git a/doc/guides/nics/features/i40e_vf_vec.ini b/doc/guides/nics/features/i40e_vf_vec.ini
index d6674f76e..5766aaf6f 100644
--- a/doc/guides/nics/features/i40e_vf_vec.ini
+++ b/doc/guides/nics/features/i40e_vf_vec.ini
@@ -18,6 +18,8 @@ RSS key update       = Y
 RSS reta update      = Y
 VLAN filter          = Y
 Hash filter          = Y
+Rx Descriptor Status = Y
+Tx Descriptor Status = Y
 Basic stats          = Y
 Extended stats       = Y
 Multiprocess aware   = Y
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 20636039b..af69c9cfc 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -488,6 +488,8 @@ static const struct eth_dev_ops i40e_eth_dev_ops = {
 	.rx_queue_release             = i40e_dev_rx_queue_release,
 	.rx_queue_count               = i40e_dev_rx_queue_count,
 	.rx_descriptor_done           = i40e_dev_rx_descriptor_done,
+	.rx_descriptor_status         = i40e_dev_rx_descriptor_status,
+	.tx_descriptor_status         = i40e_dev_tx_descriptor_status,
 	.tx_queue_setup               = i40e_dev_tx_queue_setup,
 	.tx_queue_release             = i40e_dev_tx_queue_release,
 	.dev_led_on                   = i40e_dev_led_on,
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 55fd34425..d3659c94b 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -217,6 +217,8 @@ static const struct eth_dev_ops i40evf_eth_dev_ops = {
 	.rx_queue_intr_enable = i40evf_dev_rx_queue_intr_enable,
 	.rx_queue_intr_disable = i40evf_dev_rx_queue_intr_disable,
 	.rx_descriptor_done   = i40e_dev_rx_descriptor_done,
+	.rx_descriptor_status = i40e_dev_rx_descriptor_status,
+	.tx_descriptor_status = i40e_dev_tx_descriptor_status,
 	.tx_queue_setup       = i40e_dev_tx_queue_setup,
 	.tx_queue_release     = i40e_dev_tx_queue_release,
 	.rx_queue_count       = i40e_dev_rx_queue_count,
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 02367b7f4..9b3b39ede 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1918,6 +1918,64 @@ i40e_dev_rx_descriptor_done(void *rx_queue, uint16_t offset)
 }
 
 int
+i40e_dev_rx_descriptor_status(void *rx_queue, uint16_t offset)
+{
+	struct i40e_rx_queue *rxq = rx_queue;
+	volatile uint64_t *status;
+	uint64_t mask;
+	uint32_t desc;
+
+	if (unlikely(offset >= rxq->nb_rx_desc))
+		return -EINVAL;
+
+	if (offset >= rxq->nb_rx_desc - rxq->nb_rx_hold)
+		return RTE_ETH_RX_DESC_UNAVAIL;
+
+	desc = rxq->rx_tail + offset;
+	if (desc >= rxq->nb_rx_desc)
+		desc -= rxq->nb_rx_desc;
+
+	status = &rxq->rx_ring[desc].wb.qword1.status_error_len;
+	mask = rte_cpu_to_le_64((1ULL << I40E_RX_DESC_STATUS_DD_SHIFT)
+		<< I40E_RXD_QW1_STATUS_SHIFT);
+	if (*status & mask)
+		return RTE_ETH_RX_DESC_DONE;
+
+	return RTE_ETH_RX_DESC_AVAIL;
+}
+
+int
+i40e_dev_tx_descriptor_status(void *tx_queue, uint16_t offset)
+{
+	struct i40e_tx_queue *txq = tx_queue;
+	volatile uint64_t *status;
+	uint64_t mask, expect;
+	uint32_t desc;
+
+	if (unlikely(offset >= txq->nb_tx_desc))
+		return -EINVAL;
+
+	desc = txq->tx_tail + offset;
+	/* go to next desc that has the RS bit */
+	desc = ((desc + txq->tx_rs_thresh - 1) / txq->tx_rs_thresh) *
+		txq->tx_rs_thresh;
+	if (desc >= txq->nb_tx_desc) {
+		desc -= txq->nb_tx_desc;
+		if (desc >= txq->nb_tx_desc)
+			desc -= txq->nb_tx_desc;
+	}
+
+	status = &txq->tx_ring[desc].cmd_type_offset_bsz;
+	mask = rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK);
+	expect = rte_cpu_to_le_64(
+		I40E_TX_DESC_DTYPE_DESC_DONE << I40E_TXD_QW1_DTYPE_SHIFT);
+	if ((*status & mask) == expect)
+		return RTE_ETH_TX_DESC_DONE;
+
+	return RTE_ETH_TX_DESC_FULL;
+}
+
+int
 i40e_dev_tx_queue_setup(struct rte_eth_dev *dev,
 			uint16_t queue_idx,
 			uint16_t nb_desc,
diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h
index a87bdb08e..1f8e6817d 100644
--- a/drivers/net/i40e/i40e_rxtx.h
+++ b/drivers/net/i40e/i40e_rxtx.h
@@ -236,6 +236,8 @@ void i40e_rx_queue_release_mbufs(struct i40e_rx_queue *rxq);
 uint32_t i40e_dev_rx_queue_count(struct rte_eth_dev *dev,
 				 uint16_t rx_queue_id);
 int i40e_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
+int i40e_dev_rx_descriptor_status(void *rx_queue, uint16_t offset);
+int i40e_dev_tx_descriptor_status(void *tx_queue, uint16_t offset);
 
 uint16_t i40e_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
 			    uint16_t nb_pkts);
-- 
2.11.0
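
As a reading aid for the Tx path in the patch above, the descriptor index
arithmetic can be isolated in a small standalone sketch. The helper name
tx_status_desc_index() and its parameters are hypothetical and are not part
of the driver; it only mirrors the round-up-and-wrap steps of
i40e_dev_tx_descriptor_status(), assuming tx_tail and offset are both below
nb_desc and nb_desc is a multiple of rs_thresh.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a driver symbol): return the ring index that
 * the Tx status check would inspect for a given offset from the tail.
 * Assumes tx_tail < nb_desc, offset < nb_desc, nb_desc % rs_thresh == 0.
 */
static uint32_t
tx_status_desc_index(uint32_t tx_tail, uint32_t offset,
		     uint32_t rs_thresh, uint32_t nb_desc)
{
	uint32_t desc = tx_tail + offset;

	/* round up to the next multiple of rs_thresh */
	desc = ((desc + rs_thresh - 1) / rs_thresh) * rs_thresh;

	/*
	 * tx_tail + offset is below 2 * nb_desc, but rounding up can reach
	 * exactly 2 * nb_desc, hence the second wrap.
	 */
	if (desc >= nb_desc) {
		desc -= nb_desc;
		if (desc >= nb_desc)
			desc -= nb_desc;
	}

	return desc;
}
```

With a 512-entry ring and rs_thresh = 32, a tail of 500 and an offset of 510
sum to 1010, round up to 1024, and wrap twice to index 0, which is why a
single subtraction is not enough.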
* Re: [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors
  2017-03-29  8:36     ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Olivier Matz
                         ` (5 preceding siblings ...)
  2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 6/6] net/i40e: " Olivier Matz
@ 2017-03-30 13:30       ` Thomas Monjalon
  2017-04-19 15:50         ` Ferruh Yigit
  6 siblings, 1 reply; 72+ messages in thread
From: Thomas Monjalon @ 2017-03-30 13:30 UTC (permalink / raw)
  To: Olivier Matz, ferruh.yigit
  Cc: dev, konstantin.ananyev, wenzhuo.lu, helin.zhang, jingjing.wu,
	adrien.mazarguil, nelio.laranjeiro, bruce.richardson,
	venky.venkatesan, arybchenko, qiming.yang
2017-03-29 10:36, Olivier Matz:
> This patchset introduces a new ethdev API:
> - rte_eth_rx_descriptor_status()
> - rte_eth_tx_descriptor_status()
> 
> The Rx API aims to replace rte_eth_rx_descriptor_done() which
> does almost the same, but does not differentiate the case of a
> descriptor used by the driver (not returned to the hw).
> 
> The usage of these functions can be:
> - on Rx, anticipate that the cpu is not fast enough to process
>   all incoming packets, and take dispositions to solve the
>   problem (add more cpus, drop specific packets, ...)
> - on Tx, detect that the link is overloaded, and take dispositions
>   to solve the problem (notify flow control, drop specific
>   packets)
> 
> The patchset updates ixgbe, i40e, e1000, mlx5.
> The other drivers that implement the descriptor_done() API are
> fm10k, sfc, virtio. They are not updated.
> If the new API is accepted, the descriptor_done() can be deprecated,
> and examples/l3fwd-power will be updated too.
Applied, thanks
Next steps:
- implement it in other drivers
- deprecate descriptor_done API
Ferruh, please, could you check with drivers maintainers?
Thanks
* Re: [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors
  2017-03-30 13:30       ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Thomas Monjalon
@ 2017-04-19 15:50         ` Ferruh Yigit
  0 siblings, 0 replies; 72+ messages in thread
From: Ferruh Yigit @ 2017-04-19 15:50 UTC (permalink / raw)
  To: Jing Chen, Andrew Rybchenko, Yuanhan Liu, Maxime Coquelin
  Cc: Thomas Monjalon, Olivier Matz, dev, konstantin.ananyev,
	wenzhuo.lu, helin.zhang, jingjing.wu, adrien.mazarguil,
	nelio.laranjeiro, bruce.richardson, venky.venkatesan, arybchenko,
	qiming.yang
On 3/30/2017 2:30 PM, Thomas Monjalon wrote:
> 2017-03-29 10:36, Olivier Matz:
>> This patchset introduces a new ethdev API:
>> - rte_eth_rx_descriptor_status()
>> - rte_eth_tx_descriptor_status()
>>
>> The Rx API aims to replace rte_eth_rx_descriptor_done() which
>> does almost the same, but does not differentiate the case of a
>> descriptor used by the driver (not returned to the hw).
>>
>> The usage of these functions can be:
>> - on Rx, anticipate that the cpu is not fast enough to process
>>   all incoming packets, and take dispositions to solve the
>>   problem (add more cpus, drop specific packets, ...)
>> - on Tx, detect that the link is overloaded, and take dispositions
>>   to solve the problem (notify flow control, drop specific
>>   packets)
>>
>> The patchset updates ixgbe, i40e, e1000, mlx5.
>> The other drivers that implement the descriptor_done() API are
>> fm10k, sfc, virtio. They are not updated.
Reminder for the following PMDs:
- fm10k
- sfc
- virtio
There are two new eth_dev_ops:
rx_descriptor_status
tx_descriptor_status
"rx_descriptor_status" replaces "rx_descriptor_done"
It is suggested to switch to the new one and to implement tx_descriptor_status.
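
For reference, a minimal sketch of the three-state Rx check on a toy ring,
modeled on the i40e implementation in this series. All toy_* names are made
up for illustration and are not ethdev or driver symbols; a real PMD would
return RTE_ETH_RX_DESC_AVAIL, RTE_ETH_RX_DESC_DONE or RTE_ETH_RX_DESC_UNAVAIL
and read the DD bit from its own descriptor format.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_RING_SIZE 8

/* Illustrative stand-ins for the RTE_ETH_RX_DESC_* return values */
enum toy_rx_status {
	TOY_RX_DESC_AVAIL,   /* owned by hw, not filled yet */
	TOY_RX_DESC_DONE,    /* filled by hw, not yet processed by the driver */
	TOY_RX_DESC_UNAVAIL, /* held by the driver, not returned to hw */
};

struct toy_rx_queue {
	uint8_t dd[TOY_RING_SIZE]; /* 1 once hw has written the descriptor back */
	uint16_t nb_desc;          /* ring size */
	uint16_t tail;             /* next descriptor the driver will process */
	uint16_t nb_hold;          /* descriptors held by the driver */
};

static int
toy_rx_descriptor_status(const struct toy_rx_queue *q, uint16_t offset)
{
	uint32_t desc;

	if (offset >= q->nb_desc)
		return -EINVAL;

	/* the last nb_hold slots have not been given back to hw */
	if (offset >= q->nb_desc - q->nb_hold)
		return TOY_RX_DESC_UNAVAIL;

	/* wrap around the end of the ring */
	desc = q->tail + offset;
	if (desc >= q->nb_desc)
		desc -= q->nb_desc;

	return q->dd[desc] ? TOY_RX_DESC_DONE : TOY_RX_DESC_AVAIL;
}
```

The UNAVAIL case is the one rx_descriptor_done cannot express: descriptors
held by the driver and not yet returned to the hw.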
Thanks,
ferruh
>> If the new API is accepted, the descriptor_done() can be deprecated,
>> and examples/l3fwd-power will be updated too.
> 
> Applied, thanks
> 
> Next steps:
> - implement it in other drivers
> - deprecate descriptor_done API
> 
> Ferruh, please, could you check with drivers maintainers?
> Thanks
> 
Thread overview: 72+ messages
2016-11-24  9:54 [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
2016-11-24  9:54 ` [dpdk-dev] [RFC 1/9] ethdev: clarify api comments of rx queue count Olivier Matz
2016-11-24 10:52   ` Ferruh Yigit
2016-11-24 11:13     ` Olivier Matz
2016-11-24  9:54 ` [dpdk-dev] [RFC 2/9] ethdev: move queue id check in generic layer Olivier Matz
2016-11-24 10:59   ` Ferruh Yigit
2016-11-24 13:05     ` Olivier Matz
2016-11-24  9:54 ` [dpdk-dev] [RFC 3/9] ethdev: add handler for Tx queue descriptor count Olivier Matz
2016-11-24  9:54 ` [dpdk-dev] [RFC 4/9] net/ixgbe: optimize Rx " Olivier Matz
2016-11-24  9:54 ` [dpdk-dev] [RFC 5/9] net/ixgbe: add handler for Tx " Olivier Matz
2016-11-24  9:54 ` [dpdk-dev] [RFC 6/9] net/igb: optimize rx " Olivier Matz
2016-11-24  9:54 ` [dpdk-dev] [RFC 7/9] net/igb: add handler for tx " Olivier Matz
2016-11-24  9:54 ` [dpdk-dev] [RFC 8/9] net/e1000: optimize rx " Olivier Matz
2016-11-24  9:54 ` [dpdk-dev] [RFC 9/9] net/e1000: add handler for tx " Olivier Matz
2017-01-13 16:44 ` [dpdk-dev] [RFC 0/9] get Rx and Tx used descriptors Olivier Matz
2017-01-13 17:32   ` Richardson, Bruce
2017-01-17  8:24     ` Olivier Matz
2017-01-17 13:56       ` Bruce Richardson
2017-03-01 17:19 ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Olivier Matz
2017-03-01 17:19   ` [dpdk-dev] [PATCH 1/6] ethdev: add descriptor status API Olivier Matz
2017-03-01 18:22     ` Andrew Rybchenko
2017-03-02 13:57       ` Olivier Matz
2017-03-02 14:19         ` Andrew Rybchenko
2017-03-02 14:54           ` Olivier Matz
2017-03-02 15:05             ` Andrew Rybchenko
2017-03-02 15:14               ` Olivier Matz
2017-03-01 17:19   ` [dpdk-dev] [PATCH 2/6] net/ixgbe: implement " Olivier Matz
2017-03-01 17:19   ` [dpdk-dev] [PATCH 3/6] net/e1000: implement descriptor status API (igb) Olivier Matz
2017-03-02  1:28     ` Lu, Wenzhuo
2017-03-02 13:58       ` Olivier Matz
2017-03-01 17:19   ` [dpdk-dev] [PATCH 4/6] net/e1000: implement descriptor status API (em) Olivier Matz
2017-03-02  1:22     ` Lu, Wenzhuo
2017-03-02 14:46       ` Olivier Matz
2017-03-03  1:15         ` Lu, Wenzhuo
2017-03-01 17:19   ` [dpdk-dev] [PATCH 5/6] net/mlx5: implement descriptor status API Olivier Matz
2017-03-02  7:56     ` Nélio Laranjeiro
2017-03-01 17:19   ` [dpdk-dev] [PATCH 6/6] net/i40e: " Olivier Matz
2017-03-01 18:02   ` [dpdk-dev] [PATCH 0/6] get status of Rx and Tx descriptors Andrew Rybchenko
2017-03-02 13:40     ` Olivier Matz
2017-03-06 10:41       ` Thomas Monjalon
2017-03-01 18:07   ` Stephen Hemminger
2017-03-02 13:43     ` Olivier Matz
2017-03-06 10:41       ` Thomas Monjalon
2017-03-02 15:32   ` Bruce Richardson
2017-03-02 16:14     ` Olivier Matz
2017-03-03 16:18       ` Venkatesan, Venky
2017-03-03 16:45         ` Olivier Matz
2017-03-03 18:46           ` Venkatesan, Venky
2017-03-04 20:45             ` Olivier Matz
2017-03-06 11:02               ` Thomas Monjalon
2017-03-07 15:59   ` [dpdk-dev] [PATCH v2 " Olivier Matz
2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 1/6] ethdev: add descriptor status API Olivier Matz
2017-03-09 11:49       ` Andrew Rybchenko
2017-03-21  8:32       ` Yang, Qiming
2017-03-24 12:49         ` Olivier Matz
2017-03-27  1:28           ` Yang, Qiming
2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 2/6] net/ixgbe: implement " Olivier Matz
2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 3/6] net/e1000: implement descriptor status API (igb) Olivier Matz
2017-03-08  1:17       ` Lu, Wenzhuo
2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 4/6] net/e1000: implement descriptor status API (em) Olivier Matz
2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 5/6] net/mlx5: implement descriptor status API Olivier Matz
2017-03-07 15:59     ` [dpdk-dev] [PATCH v2 6/6] net/i40e: " Olivier Matz
2017-03-08  1:17       ` Lu, Wenzhuo
2017-03-29  8:36     ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Olivier Matz
2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 1/6] ethdev: add descriptor status API Olivier Matz
2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 2/6] net/ixgbe: implement " Olivier Matz
2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 3/6] net/e1000: implement descriptor status API (igb) Olivier Matz
2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 4/6] net/e1000: implement descriptor status API (em) Olivier Matz
2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 5/6] net/mlx5: implement descriptor status API Olivier Matz
2017-03-29  8:36       ` [dpdk-dev] [PATCH v3 6/6] net/i40e: " Olivier Matz
2017-03-30 13:30       ` [dpdk-dev] [PATCH v3 0/6] get status of Rx and Tx descriptors Thomas Monjalon
2017-04-19 15:50         ` Ferruh Yigit