DPDK patches and discussions
* [dpdk-dev] [PATCH] vmxnet3: fix Rx deadlock
@ 2016-11-14 10:46 Stefan Puiu
  2016-11-30  4:59 ` Yong Wang
  2016-12-12  8:21 ` [dpdk-dev] [PATCH v2] " Stefan Puiu
  0 siblings, 2 replies; 11+ messages in thread
From: Stefan Puiu @ 2016-11-14 10:46 UTC
  To: dev; +Cc: mac_leehk, yongwang, Stefan Puiu

Our use case is an app that needs to keep mbufs around for a while.
We've seen cases where vmxnet3_post_rx_bufs(), called from
vmxnet3_recv_pkts(), fails to add any mbufs to any Rx descriptors (it
returns -err). Since there are no mbufs that the virtual hardware can
use, and since nobody calls vmxnet3_post_rx_bufs() after that, no
packets will be received from then on. I call this a deadlock for lack
of a better term: the virtual HW waits for free mbufs, while the app
waits for the hardware to notify it of data. Note that the app can't
recover from this state.

This fix is a rework of this patch by Marco Lee:
http://dpdk.org/dev/patchwork/patch/6575/. I had to forward-port it,
address review comments, and revert the allocation failure handling to
that of the first version of the patch
(http://dpdk.org/ml/archives/dev/2015-July/022079.html), since that's
the only approach that seems to work, and it seems to be what other
drivers do (I checked ixgbe and em). Reusing the mbuf that's being
passed to the application doesn't seem to make sense, and it was
causing weird issues in our app. Also, reusing rxm without checking
whether it's NULL could crash the code.
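
As an illustration only (not part of the patch), the allocate-first
approach can be sketched with a toy model. The names toy_pool, toy_rxq
and toy_recv_burst are hypothetical stand-ins for the mempool, the Rx
queue and vmxnet3_recv_pkts(); the real driver works on descriptor
rings, not counters. The point is that a failed allocation leaves the
completed packet in the ring, so the number of buffers owned by the
ring never drops and a later call can make progress.

```c
#include <assert.h>

/* Hypothetical stand-in for a mempool: just a count of free buffers. */
struct toy_pool {
	unsigned free_bufs;
};

/* Hypothetical stand-in for an Rx queue. */
struct toy_rxq {
	unsigned ring_bufs; /* buffers currently attached to descriptors */
	unsigned pending;   /* completed packets not yet given to the app */
};

/* Take one buffer from the pool; returns 0 on success, -1 if empty. */
static int
toy_pool_get(struct toy_pool *p)
{
	if (p->free_bufs == 0)
		return -1;
	p->free_bufs--;
	return 0;
}

/* Return one buffer to the pool (the app is done with a packet). */
static void
toy_pool_put(struct toy_pool *p)
{
	p->free_bufs++;
}

/*
 * Deliver up to nb_pkts packets. A replacement buffer is allocated
 * BEFORE the completed packet is consumed; on failure the loop stops
 * and the packet stays pending for the next call.
 */
static unsigned
toy_recv_burst(struct toy_rxq *q, struct toy_pool *p, unsigned nb_pkts)
{
	const unsigned ring_bufs_before = q->ring_bufs;
	unsigned nb_rx = 0;

	while (q->pending > 0 && nb_rx < nb_pkts) {
		if (toy_pool_get(p) != 0)
			break; /* nothing consumed, retry on a later call */

		q->pending--;
		q->ring_bufs--; /* filled buffer is handed to the app */
		q->ring_bufs++; /* replacement takes over its descriptor */
		nb_rx++;
	}

	/* The ring never loses buffers, whatever the pool state was. */
	assert(q->ring_bufs == ring_bufs_before);
	return nb_rx;
}
```

With one free buffer and three pending packets, the first call
delivers one packet, the second delivers none, and after the app
returns a buffer with toy_pool_put() the next call delivers again.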

Signed-off-by: Stefan Puiu <stefan.puiu@gmail.com>
---
 drivers/net/vmxnet3/vmxnet3_rxtx.c | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index b109168..c9d2488 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -518,6 +518,32 @@
 	return nb_tx;
 }
 
+static inline void
+vmxnet3_renew_desc(vmxnet3_rx_queue_t *rxq, uint8_t ring_id,
+		struct rte_mbuf *mbuf)
+{
+	uint32_t  val = 0;
+	struct vmxnet3_cmd_ring *ring = &rxq->cmd_ring[ring_id];
+	struct Vmxnet3_RxDesc *rxd =
+		(struct Vmxnet3_RxDesc *)(ring->base + ring->next2fill);
+	vmxnet3_buf_info_t *buf_info = &ring->buf_info[ring->next2fill];
+
+	if (ring_id == 0)
+		val = VMXNET3_RXD_BTYPE_HEAD;
+	else
+		val = VMXNET3_RXD_BTYPE_BODY;
+
+	buf_info->m = mbuf;
+	buf_info->len = (uint16_t)(mbuf->buf_len - RTE_PKTMBUF_HEADROOM);
+	buf_info->bufPA = rte_mbuf_data_dma_addr_default(mbuf);
+
+	rxd->addr = buf_info->bufPA;
+	rxd->btype = val;
+	rxd->len = buf_info->len;
+	rxd->gen = ring->gen;
+
+	vmxnet3_cmd_ring_adv_next2fill(ring);
+}
 /*
  *  Allocates mbufs and clusters. Post rx descriptors with buffer details
  *  so that device can receive packets in those buffers.
@@ -657,9 +683,17 @@
 	}
 
 	while (rcd->gen == rxq->comp_ring.gen) {
+		struct rte_mbuf *newm;
 		if (nb_rx >= nb_pkts)
 			break;
 
+		newm = rte_mbuf_raw_alloc(rxq->mp);
+		if (unlikely(newm == NULL)) {
+			PMD_RX_LOG(ERR, "Error allocating mbuf");
+			rxq->stats.rx_buf_alloc_failure++;
+			break;
+		}
+
 		idx = rcd->rxdIdx;
 		ring_idx = (uint8_t)((rcd->rqID == rxq->qid1) ? 0 : 1);
 		rxd = (Vmxnet3_RxDesc *)rxq->cmd_ring[ring_idx].base + idx;
@@ -759,8 +793,8 @@
 		VMXNET3_INC_RING_IDX_ONLY(rxq->cmd_ring[ring_idx].next2comp,
 					  rxq->cmd_ring[ring_idx].size);
 
-		/* It's time to allocate some new buf and renew descriptors */
-		vmxnet3_post_rx_bufs(rxq, ring_idx);
+		/* It's time to  renew descriptors */
+		vmxnet3_renew_desc(rxq, ring_idx, newm);
 		if (unlikely(rxq->shared->ctrl.updateRxProd)) {
 			VMXNET3_WRITE_BAR0_REG(hw, rxprod_reg[ring_idx] + (rxq->queue_id * VMXNET3_REG_ALIGN),
 					       rxq->cmd_ring[ring_idx].next2fill);
-- 
1.9.1

* Re: [dpdk-dev] [PATCH] vmxnet3: fix Rx deadlock
  2016-11-14 10:46 [dpdk-dev] [PATCH] vmxnet3: fix Rx deadlock Stefan Puiu
@ 2016-11-30  4:59 ` Yong Wang
  2016-12-12  8:27   ` Stefan Puiu
  2016-12-12  8:21 ` [dpdk-dev] [PATCH v2] " Stefan Puiu
  1 sibling, 1 reply; 11+ messages in thread
From: Yong Wang @ 2016-11-30  4:59 UTC
  To: Stefan Puiu, dev; +Cc: mac_leehk

> -----Original Message-----
> From: Stefan Puiu [mailto:stefan.puiu@gmail.com]
> Sent: Monday, November 14, 2016 2:46 AM
> To: dev@dpdk.org
> Cc: mac_leehk@yahoo.com.hk; Yong Wang <yongwang@vmware.com>;
> Stefan Puiu <stefan.puiu@gmail.com>
> Subject: [PATCH] vmxnet3: fix Rx deadlock
> 
> Our use case is that we have an app that needs to keep mbufs around
> for a while. We've seen cases when calling vmxnet3_post_rx_bufs() from
> vmxet3_recv_pkts(), it might not succeed to add any mbufs to any RX
> descriptors (where it returns -err). Since there are no mbufs that the
> virtual hardware can use, and since nobody calls
> vmxnet3_post_rx_bufs() after that, no packets will be received after

The patch looks good overall.

I think a more accurate description is that the particular descriptor's generation bit never got flipped properly when an mbuf failed to be refilled, which caused Rx to get stuck, rather than vmxnet3_post_rx_bufs() not being called afterwards.

> this. I call this a deadlock for lack of a better term - the virtual
> HW waits for free mbufs, while the app waits for the hardware to
> notify it for data. Note that after this, the app can't recover.
> 
> This fix is a rework of this patch by Marco Lee:
> http://dpdk.org/dev/patchwork/patch/6575/. I had to forward port it,
> address review comments and also reverted the allocation failure
> handing to the first version of the patch

s/handing/handling

> (http://dpdk.org/ml/archives/dev/2015-July/022079.html), since that's
> the only approach that seems to work, and seems to be what other
> drivers are doing (I checked ixgbe and em). Reusing the mbuf that's
> getting passed to the application doesn't seem to make sense, and it
> was causing weird issues in our app. Also, reusing rxm without
> checking if it's NULL could cause the code to crash.
> 
> Signed-off-by: Stefan Puiu <stefan.puiu@gmail.com>
> ---
>  drivers/net/vmxnet3/vmxnet3_rxtx.c | 38 ++++++++++++++++++++++++++++++++++++--
>  1 file changed, 36 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
> index b109168..c9d2488 100644
> --- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
> +++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
> @@ -518,6 +518,32 @@
>  	return nb_tx;
>  }
> 
> +static inline void
> +vmxnet3_renew_desc(vmxnet3_rx_queue_t *rxq, uint8_t ring_id,
> +		struct rte_mbuf *mbuf)

Nit: align the params here to be consistent with other functions.

> +{
> +	uint32_t  val = 0;

Nit: extra space before "val"

> +	struct vmxnet3_cmd_ring *ring = &rxq->cmd_ring[ring_id];
> +	struct Vmxnet3_RxDesc *rxd =
> +		(struct Vmxnet3_RxDesc *)(ring->base + ring->next2fill);
> +	vmxnet3_buf_info_t *buf_info = &ring->buf_info[ring->next2fill];
> +
> +	if (ring_id == 0)
> +		val = VMXNET3_RXD_BTYPE_HEAD;
> +	else
> +		val = VMXNET3_RXD_BTYPE_BODY;
> +
> +	buf_info->m = mbuf;
> +	buf_info->len = (uint16_t)(mbuf->buf_len - RTE_PKTMBUF_HEADROOM);
> +	buf_info->bufPA = rte_mbuf_data_dma_addr_default(mbuf);
> +
> +	rxd->addr = buf_info->bufPA;
> +	rxd->btype = val;
> +	rxd->len = buf_info->len;
> +	rxd->gen = ring->gen;
> +
> +	vmxnet3_cmd_ring_adv_next2fill(ring);
> +}
>  /*
>   *  Allocates mbufs and clusters. Post rx descriptors with buffer details
>   *  so that device can receive packets in those buffers.
> @@ -657,9 +683,17 @@
>  	}
> 
>  	while (rcd->gen == rxq->comp_ring.gen) {
> +		struct rte_mbuf *newm;

Nit: add a blank line here.

>  		if (nb_rx >= nb_pkts)
>  			break;
> 
> +		newm = rte_mbuf_raw_alloc(rxq->mp);
> +		if (unlikely(newm == NULL)) {
> +			PMD_RX_LOG(ERR, "Error allocating mbuf");
> +			rxq->stats.rx_buf_alloc_failure++;
> +			break;
> +		}
> +
>  		idx = rcd->rxdIdx;
>  		ring_idx = (uint8_t)((rcd->rqID == rxq->qid1) ? 0 : 1);
>  		rxd = (Vmxnet3_RxDesc *)rxq->cmd_ring[ring_idx].base + idx;
> @@ -759,8 +793,8 @@
>  		VMXNET3_INC_RING_IDX_ONLY(rxq->cmd_ring[ring_idx].next2comp,
>  					  rxq->cmd_ring[ring_idx].size);
> 
> -		/* It's time to allocate some new buf and renew descriptors */
> -		vmxnet3_post_rx_bufs(rxq, ring_idx);
> +		/* It's time to  renew descriptors */

Nit: extra space before "renew"

> +		vmxnet3_renew_desc(rxq, ring_idx, newm);
>  		if (unlikely(rxq->shared->ctrl.updateRxProd)) {
>  			VMXNET3_WRITE_BAR0_REG(hw, rxprod_reg[ring_idx] + (rxq->queue_id * VMXNET3_REG_ALIGN),
>  					       rxq->cmd_ring[ring_idx].next2fill);
> --
> 1.9.1


* [dpdk-dev] [PATCH v2] vmxnet3: fix Rx deadlock
  2016-11-14 10:46 [dpdk-dev] [PATCH] vmxnet3: fix Rx deadlock Stefan Puiu
  2016-11-30  4:59 ` Yong Wang
@ 2016-12-12  8:21 ` Stefan Puiu
  2016-12-16 15:36   ` [dpdk-dev] [PATCH v3] " Stefan Puiu
  1 sibling, 1 reply; 11+ messages in thread
From: Stefan Puiu @ 2016-12-12  8:21 UTC
  To: dev; +Cc: yongwang, mac_leehk, Stefan Puiu

Our use case is an app that needs to keep mbufs around for a while.
We've seen cases where vmxnet3_post_rx_bufs(), called from
vmxnet3_recv_pkts(), fails to add any mbufs to any Rx descriptors (it
returns -err). Since there are no mbufs that the virtual hardware can
use, no packets will be received from then on; nobody retries this
later, so the driver gets stuck in this state. I call this a deadlock
for lack of a better term: the virtual HW waits for free mbufs, while
the app waits for the hardware to notify it of data. Note that the app
can't recover from this state.

This fix is a rework of this patch by Marco Lee:
http://dpdk.org/dev/patchwork/patch/6575/. I had to forward-port it,
address review comments, and revert the allocation failure handling to
that of the first version of the patch
(http://dpdk.org/ml/archives/dev/2015-July/022079.html), since that's
the only approach that seems to work, and it seems to be what other
drivers do (I checked ixgbe and em). Reusing the mbuf that's being
passed to the application doesn't seem to make sense, and it was
causing weird issues in our app. Also, reusing rxm without checking
whether it's NULL could crash the code.

v2:
- address review comments, reworded description a bit
---
 drivers/net/vmxnet3/vmxnet3_rxtx.c | 39 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index b109168..93db10f 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -518,6 +518,32 @@
 	return nb_tx;
 }
 
+static inline void
+vmxnet3_renew_desc(vmxnet3_rx_queue_t *rxq, uint8_t ring_id,
+		   struct rte_mbuf *mbuf)
+{
+	uint32_t val = 0;
+	struct vmxnet3_cmd_ring *ring = &rxq->cmd_ring[ring_id];
+	struct Vmxnet3_RxDesc *rxd =
+		(struct Vmxnet3_RxDesc *)(ring->base + ring->next2fill);
+	vmxnet3_buf_info_t *buf_info = &ring->buf_info[ring->next2fill];
+
+	if (ring_id == 0)
+		val = VMXNET3_RXD_BTYPE_HEAD;
+	else
+		val = VMXNET3_RXD_BTYPE_BODY;
+
+	buf_info->m = mbuf;
+	buf_info->len = (uint16_t)(mbuf->buf_len - RTE_PKTMBUF_HEADROOM);
+	buf_info->bufPA = rte_mbuf_data_dma_addr_default(mbuf);
+
+	rxd->addr = buf_info->bufPA;
+	rxd->btype = val;
+	rxd->len = buf_info->len;
+	rxd->gen = ring->gen;
+
+	vmxnet3_cmd_ring_adv_next2fill(ring);
+}
 /*
  *  Allocates mbufs and clusters. Post rx descriptors with buffer details
  *  so that device can receive packets in those buffers.
@@ -657,9 +683,18 @@
 	}
 
 	while (rcd->gen == rxq->comp_ring.gen) {
+		struct rte_mbuf *newm;
+
 		if (nb_rx >= nb_pkts)
 			break;
 
+		newm = rte_mbuf_raw_alloc(rxq->mp);
+		if (unlikely(newm == NULL)) {
+			PMD_RX_LOG(ERR, "Error allocating mbuf");
+			rxq->stats.rx_buf_alloc_failure++;
+			break;
+		}
+
 		idx = rcd->rxdIdx;
 		ring_idx = (uint8_t)((rcd->rqID == rxq->qid1) ? 0 : 1);
 		rxd = (Vmxnet3_RxDesc *)rxq->cmd_ring[ring_idx].base + idx;
@@ -759,8 +794,8 @@
 		VMXNET3_INC_RING_IDX_ONLY(rxq->cmd_ring[ring_idx].next2comp,
 					  rxq->cmd_ring[ring_idx].size);
 
-		/* It's time to allocate some new buf and renew descriptors */
-		vmxnet3_post_rx_bufs(rxq, ring_idx);
+		/* It's time to renew descriptors */
+		vmxnet3_renew_desc(rxq, ring_idx, newm);
 		if (unlikely(rxq->shared->ctrl.updateRxProd)) {
 			VMXNET3_WRITE_BAR0_REG(hw, rxprod_reg[ring_idx] + (rxq->queue_id * VMXNET3_REG_ALIGN),
 					       rxq->cmd_ring[ring_idx].next2fill);
-- 
1.9.1

* Re: [dpdk-dev] [PATCH] vmxnet3: fix Rx deadlock
  2016-11-30  4:59 ` Yong Wang
@ 2016-12-12  8:27   ` Stefan Puiu
  2016-12-12 18:17     ` Yong Wang
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Puiu @ 2016-12-12  8:27 UTC
  To: Yong Wang; +Cc: dev, mac_leehk

Hello and thanks for reviewing the patch.

On Wed, Nov 30, 2016 at 6:59 AM, Yong Wang <yongwang@vmware.com> wrote:
[...]
> I think a more accurate description is that the particular descriptor's generation bit never got flipped properly when an mbuf failed to be refilled which caused the rx stuck, rather than vmxnet3_post_rx_bufs() not being called afterwards.
>

I'm not sure this level of detail is useful, but if you can
think of a better explanation to put in the changelist, I can add it.
I see the generation bit not flipping as a symptom, while the core
problem is that the hardware can't write to the descriptor. I felt the
explanation was going into too much detail anyway, so I've reworded it
a bit for v2. Let me know what you think.

Thanks,
Stefan.

* Re: [dpdk-dev] [PATCH] vmxnet3: fix Rx deadlock
  2016-12-12  8:27   ` Stefan Puiu
@ 2016-12-12 18:17     ` Yong Wang
  0 siblings, 0 replies; 11+ messages in thread
From: Yong Wang @ 2016-12-12 18:17 UTC
  To: Stefan Puiu; +Cc: dev, mac_leehk

> -----Original Message-----
> From: Stefan Puiu [mailto:stefan.puiu@gmail.com]
> Sent: Monday, December 12, 2016 12:27 AM
> To: Yong Wang <yongwang@vmware.com>
> Cc: dev@dpdk.org; mac_leehk@yahoo.com.hk
> Subject: Re: [PATCH] vmxnet3: fix Rx deadlock
> 
> Hello and thanks for reviewing the patch.
> 
> On Wed, Nov 30, 2016 at 6:59 AM, Yong Wang <yongwang@vmware.com>
> wrote:
> [...]
> > I think a more accurate description is that the particular descriptor's
> generation bit never got flipped properly when an mbuf failed to be refilled
> which caused the rx stuck, rather than vmxnet3_post_rx_bufs() not being
> called afterwards.
> >
> 
> Not sure if this kind of level of detail is useful, but if you can
> think of a better explanation to put in the changelist, I can add it.
> I see the generation bit not flipping as a symptom, while the core
> problem is the hardware can't write to the descriptor. I felt the
> explanation was going into too much detail anyway, so I've reworded it
> a bit for v2. Let me know what you think.

This is one of the cases where I prefer accuracy, and I think this level of detail is needed for whoever will work on this part of the code (datapath tx and rx routines).

The v2 description looks good to me except for the following sentence:

"nobody retries this later, so the driver gets stuck in this state."

The driver definitely retries vmxnet3_post_rx_bufs() after it has entered the problematic state, but because the descriptor's gen bit was not flipped, the driver won't refill an mbuf.  How about "the driver won't refill the mbuf after this so it gets stuck in this state."?
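
For readers unfamiliar with the generation-bit handshake, here is a toy model of it, offered as a rough sketch rather than the driver's actual logic. All names (toy_ring, toy_driver_post, toy_device_receive) are hypothetical, the ring is four entries, and driver and device state share one struct purely for brevity. The assumption illustrated is that the device only uses a descriptor whose gen field matches the generation it currently expects, so a descriptor that is never rewritten with the driver's current gen value stays unusable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RING_SIZE 4

/* Hypothetical Rx descriptor: only the generation bit is modelled. */
struct toy_desc {
	uint8_t gen;
};

/* Hypothetical ring; driver and device state are kept together. */
struct toy_ring {
	struct toy_desc desc[TOY_RING_SIZE];
	uint32_t next2fill; /* driver: next descriptor to post */
	uint8_t gen;        /* driver: generation written when posting */
	uint32_t dev_next;  /* device: next descriptor to use */
	uint8_t dev_gen;    /* device: generation it expects to see */
	uint32_t posted;    /* descriptors currently owned by the device */
};

static void
toy_ring_init(struct toy_ring *r)
{
	for (uint32_t i = 0; i < TOY_RING_SIZE; i++)
		r->desc[i].gen = 0; /* differs from dev_gen: not owned */
	r->next2fill = 0;
	r->gen = 1;
	r->dev_next = 0;
	r->dev_gen = 1;
	r->posted = 0;
}

/* Driver side: hand one descriptor to the device by writing its gen. */
static bool
toy_driver_post(struct toy_ring *r)
{
	assert(r->posted <= TOY_RING_SIZE);
	if (r->posted == TOY_RING_SIZE)
		return false; /* ring is full */

	r->desc[r->next2fill].gen = r->gen;
	if (++r->next2fill == TOY_RING_SIZE) {
		r->next2fill = 0;
		r->gen ^= 1; /* generation flips on every wrap */
	}
	r->posted++;
	return true;
}

/* Device side: a packet can land only in a descriptor it owns. */
static bool
toy_device_receive(struct toy_ring *r)
{
	if (r->desc[r->dev_next].gen != r->dev_gen)
		return false; /* gen mismatch: no buffer available */

	if (++r->dev_next == TOY_RING_SIZE) {
		r->dev_next = 0;
		r->dev_gen ^= 1;
	}
	r->posted--;
	return true;
}
```

In this model, once the device has used every posted descriptor and none has been reposted, toy_device_receive() keeps returning false until toy_driver_post() writes the current generation into the next descriptor.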

> Thanks,
> Stefan.

* [dpdk-dev] [PATCH v3] vmxnet3: fix Rx deadlock
  2016-12-12  8:21 ` [dpdk-dev] [PATCH v2] " Stefan Puiu
@ 2016-12-16 15:36   ` Stefan Puiu
  2016-12-16 17:47     ` Yong Wang
  2016-12-19  9:40     ` [dpdk-dev] [PATCH v4] " Stefan Puiu
  0 siblings, 2 replies; 11+ messages in thread
From: Stefan Puiu @ 2016-12-16 15:36 UTC
  To: dev; +Cc: yongwang, mac_leehk, Stefan Puiu

Our use case is an app that needs to keep mbufs around for a while.
We've seen cases where vmxnet3_post_rx_bufs(), called from
vmxnet3_recv_pkts(), fails to add any mbufs to any Rx descriptors (it
returns -err). Since there are no mbufs that the virtual hardware can
use, no packets will be received from then on; the driver won't refill
the mbuf after this so it gets stuck in this state. I call this a
deadlock for lack of a better term: the virtual HW waits for free
mbufs, while the app waits for the hardware to notify it of data (by
flipping the generation bit on the used Rx descriptors). Note that the
app can't recover from this state.

This fix is a rework of this patch by Marco Lee:
http://dpdk.org/dev/patchwork/patch/6575/. I had to forward-port it,
address review comments, and revert the allocation failure handling to
that of the first version of the patch
(http://dpdk.org/ml/archives/dev/2015-July/022079.html), since that's
the only approach that seems to work, and it seems to be what other
drivers do (I checked ixgbe and em). Reusing the mbuf that's being
passed to the application doesn't seem to make sense, and it was
causing weird issues in our app. Also, reusing rxm without checking
whether it's NULL could crash the code.
---
v3:
* rework description after review, explain how HW signals receipt of
  packets

v2:
* address review comments, reworded description a bit

 drivers/net/vmxnet3/vmxnet3_rxtx.c | 39 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index b109168..93db10f 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -518,6 +518,32 @@
 	return nb_tx;
 }
 
+static inline void
+vmxnet3_renew_desc(vmxnet3_rx_queue_t *rxq, uint8_t ring_id,
+		   struct rte_mbuf *mbuf)
+{
+	uint32_t val = 0;
+	struct vmxnet3_cmd_ring *ring = &rxq->cmd_ring[ring_id];
+	struct Vmxnet3_RxDesc *rxd =
+		(struct Vmxnet3_RxDesc *)(ring->base + ring->next2fill);
+	vmxnet3_buf_info_t *buf_info = &ring->buf_info[ring->next2fill];
+
+	if (ring_id == 0)
+		val = VMXNET3_RXD_BTYPE_HEAD;
+	else
+		val = VMXNET3_RXD_BTYPE_BODY;
+
+	buf_info->m = mbuf;
+	buf_info->len = (uint16_t)(mbuf->buf_len - RTE_PKTMBUF_HEADROOM);
+	buf_info->bufPA = rte_mbuf_data_dma_addr_default(mbuf);
+
+	rxd->addr = buf_info->bufPA;
+	rxd->btype = val;
+	rxd->len = buf_info->len;
+	rxd->gen = ring->gen;
+
+	vmxnet3_cmd_ring_adv_next2fill(ring);
+}
 /*
  *  Allocates mbufs and clusters. Post rx descriptors with buffer details
  *  so that device can receive packets in those buffers.
@@ -657,9 +683,18 @@
 	}
 
 	while (rcd->gen == rxq->comp_ring.gen) {
+		struct rte_mbuf *newm;
+
 		if (nb_rx >= nb_pkts)
 			break;
 
+		newm = rte_mbuf_raw_alloc(rxq->mp);
+		if (unlikely(newm == NULL)) {
+			PMD_RX_LOG(ERR, "Error allocating mbuf");
+			rxq->stats.rx_buf_alloc_failure++;
+			break;
+		}
+
 		idx = rcd->rxdIdx;
 		ring_idx = (uint8_t)((rcd->rqID == rxq->qid1) ? 0 : 1);
 		rxd = (Vmxnet3_RxDesc *)rxq->cmd_ring[ring_idx].base + idx;
@@ -759,8 +794,8 @@
 		VMXNET3_INC_RING_IDX_ONLY(rxq->cmd_ring[ring_idx].next2comp,
 					  rxq->cmd_ring[ring_idx].size);
 
-		/* It's time to allocate some new buf and renew descriptors */
-		vmxnet3_post_rx_bufs(rxq, ring_idx);
+		/* It's time to renew descriptors */
+		vmxnet3_renew_desc(rxq, ring_idx, newm);
 		if (unlikely(rxq->shared->ctrl.updateRxProd)) {
 			VMXNET3_WRITE_BAR0_REG(hw, rxprod_reg[ring_idx] + (rxq->queue_id * VMXNET3_REG_ALIGN),
 					       rxq->cmd_ring[ring_idx].next2fill);
-- 
1.9.1

* Re: [dpdk-dev] [PATCH v3] vmxnet3: fix Rx deadlock
  2016-12-16 15:36   ` [dpdk-dev] [PATCH v3] " Stefan Puiu
@ 2016-12-16 17:47     ` Yong Wang
  2016-12-19  9:40     ` [dpdk-dev] [PATCH v4] " Stefan Puiu
  1 sibling, 0 replies; 11+ messages in thread
From: Yong Wang @ 2016-12-16 17:47 UTC
  To: Stefan Puiu, dev; +Cc: mac_leehk

> -----Original Message-----
> From: Stefan Puiu [mailto:stefan.puiu@gmail.com]
> Sent: Friday, December 16, 2016 7:37 AM
> To: dev@dpdk.org
> Cc: Yong Wang <yongwang@vmware.com>; mac_leehk@yahoo.com.hk;
> Stefan Puiu <stefan.puiu@gmail.com>
> Subject: [PATCH v3] vmxnet3: fix Rx deadlock
> 
> Our use case is that we have an app that needs to keep mbufs around
> for a while. We've seen cases when calling vmxnet3_post_rx_bufs() from
> vmxet3_recv_pkts(), it might not succeed to add any mbufs to any RX
> descriptors (where it returns -err). Since there are no mbufs that the
> virtual hardware can use, no packets will be received after this; the
> driver won't refill the mbuf after this so it gets stuck in this
> state. I call this a deadlock for lack of a better term - the virtual
> HW waits for free mbufs, while the app waits for the hardware to
> notify it for data (by flipping the generation bit on the used Rx
> descriptors). Note that after this, the app can't recover.
> 
> This fix is a rework of this patch by Marco Lee:
> http://dpdk.org/dev/patchwork/patch/6575/. I had to forward port
> it, address review comments and also reverted the allocation
> failure handling to the first version of the patch
> (http://dpdk.org/ml/archives/dev/2015-July/022079.html), since
> that's the only approach that seems to work, and seems to be what
> other drivers are doing (I checked ixgbe and em). Reusing the mbuf
> that's getting passed to the application doesn't seem to make
> sense, and it was causing weird issues in our app. Also, reusing
> rxm without checking if it's NULL could cause the code to crash.
> ---

Signoff info is missing from the commit description.  Otherwise, looks good.

Acked-by: Yong Wang <yongwang@vmware.com>

* [dpdk-dev] [PATCH v4] vmxnet3: fix Rx deadlock
  2016-12-16 15:36   ` [dpdk-dev] [PATCH v3] " Stefan Puiu
  2016-12-16 17:47     ` Yong Wang
@ 2016-12-19  9:40     ` Stefan Puiu
  2016-12-19 10:41       ` Ferruh Yigit
  1 sibling, 1 reply; 11+ messages in thread
From: Stefan Puiu @ 2016-12-19  9:40 UTC
  To: dev; +Cc: yongwang, mac_leehk, Stefan Puiu

Our use case is an app that needs to keep mbufs around for a while.
We've seen cases where vmxnet3_post_rx_bufs(), called from
vmxnet3_recv_pkts(), fails to add any mbufs to any Rx descriptors (it
returns -err). Since there are no mbufs that the virtual hardware can
use, no packets will be received from then on; the driver won't refill
the mbuf after this so it gets stuck in this state. I call this a
deadlock for lack of a better term: the virtual HW waits for free
mbufs, while the app waits for the hardware to notify it of data (by
flipping the generation bit on the used Rx descriptors). Note that the
app can't recover from this state.

This fix is a rework of this patch by Marco Lee:
http://dpdk.org/dev/patchwork/patch/6575/. I had to forward-port it,
address review comments, and revert the allocation failure handling to
that of the first version of the patch
(http://dpdk.org/ml/archives/dev/2015-July/022079.html), since that's
the only approach that seems to work, and it seems to be what other
drivers do (I checked ixgbe and em). Reusing the mbuf that's being
passed to the application doesn't seem to make sense, and it was
causing weird issues in our app. Also, reusing rxm without checking
whether it's NULL could crash the code.

Signed-off-by: Stefan Puiu <stefan.puiu@gmail.com>
---
v4:
* no change, just added sign-off

v3:
* rework description after review, explain how HW signals receipt of
  packets

v2:
* address review comments, reworded description a bit

 drivers/net/vmxnet3/vmxnet3_rxtx.c | 39 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index b109168..93db10f 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -518,6 +518,32 @@
 	return nb_tx;
 }
 
+static inline void
+vmxnet3_renew_desc(vmxnet3_rx_queue_t *rxq, uint8_t ring_id,
+		   struct rte_mbuf *mbuf)
+{
+	uint32_t val = 0;
+	struct vmxnet3_cmd_ring *ring = &rxq->cmd_ring[ring_id];
+	struct Vmxnet3_RxDesc *rxd =
+		(struct Vmxnet3_RxDesc *)(ring->base + ring->next2fill);
+	vmxnet3_buf_info_t *buf_info = &ring->buf_info[ring->next2fill];
+
+	if (ring_id == 0)
+		val = VMXNET3_RXD_BTYPE_HEAD;
+	else
+		val = VMXNET3_RXD_BTYPE_BODY;
+
+	buf_info->m = mbuf;
+	buf_info->len = (uint16_t)(mbuf->buf_len - RTE_PKTMBUF_HEADROOM);
+	buf_info->bufPA = rte_mbuf_data_dma_addr_default(mbuf);
+
+	rxd->addr = buf_info->bufPA;
+	rxd->btype = val;
+	rxd->len = buf_info->len;
+	rxd->gen = ring->gen;
+
+	vmxnet3_cmd_ring_adv_next2fill(ring);
+}
 /*
  *  Allocates mbufs and clusters. Post rx descriptors with buffer details
  *  so that device can receive packets in those buffers.
@@ -657,9 +683,18 @@
 	}
 
 	while (rcd->gen == rxq->comp_ring.gen) {
+		struct rte_mbuf *newm;
+
 		if (nb_rx >= nb_pkts)
 			break;
 
+		newm = rte_mbuf_raw_alloc(rxq->mp);
+		if (unlikely(newm == NULL)) {
+			PMD_RX_LOG(ERR, "Error allocating mbuf");
+			rxq->stats.rx_buf_alloc_failure++;
+			break;
+		}
+
 		idx = rcd->rxdIdx;
 		ring_idx = (uint8_t)((rcd->rqID == rxq->qid1) ? 0 : 1);
 		rxd = (Vmxnet3_RxDesc *)rxq->cmd_ring[ring_idx].base + idx;
@@ -759,8 +794,8 @@
 		VMXNET3_INC_RING_IDX_ONLY(rxq->cmd_ring[ring_idx].next2comp,
 					  rxq->cmd_ring[ring_idx].size);
 
-		/* It's time to allocate some new buf and renew descriptors */
-		vmxnet3_post_rx_bufs(rxq, ring_idx);
+		/* It's time to renew descriptors */
+		vmxnet3_renew_desc(rxq, ring_idx, newm);
 		if (unlikely(rxq->shared->ctrl.updateRxProd)) {
 			VMXNET3_WRITE_BAR0_REG(hw, rxprod_reg[ring_idx] + (rxq->queue_id * VMXNET3_REG_ALIGN),
 					       rxq->cmd_ring[ring_idx].next2fill);
-- 
1.9.1

* Re: [dpdk-dev] [PATCH v4] vmxnet3: fix Rx deadlock
  2016-12-19  9:40     ` [dpdk-dev] [PATCH v4] " Stefan Puiu
@ 2016-12-19 10:41       ` Ferruh Yigit
  2016-12-19 12:26         ` Ferruh Yigit
  0 siblings, 1 reply; 11+ messages in thread
From: Ferruh Yigit @ 2016-12-19 10:41 UTC
  To: Stefan Puiu, dev; +Cc: yongwang, mac_leehk

On 12/19/2016 9:40 AM, Stefan Puiu wrote:
> Our use case is that we have an app that needs to keep mbufs around
> for a while. We've seen cases when calling vmxnet3_post_rx_bufs() from
> vmxet3_recv_pkts(), it might not succeed to add any mbufs to any RX
> descriptors (where it returns -err). Since there are no mbufs that the
> virtual hardware can use, no packets will be received after this; the
> driver won't refill the mbuf after this so it gets stuck in this
> state. I call this a deadlock for lack of a better term - the virtual
> HW waits for free mbufs, while the app waits for the hardware to
> notify it for data (by flipping the generation bit on the used Rx
> descriptors). Note that after this, the app can't recover.
> 
> This fix is a rework of this patch by Marco Lee:
> http://dpdk.org/dev/patchwork/patch/6575/. I had to forward port
> it, address review comments and also reverted the allocation
> failure handling to the first version of the patch
> (http://dpdk.org/ml/archives/dev/2015-July/022079.html), since
> that's the only approach that seems to work, and seems to be what
> other drivers are doing (I checked ixgbe and em). Reusing the mbuf
> that's getting passed to the application doesn't seem to make
> sense, and it was causing weird issues in our app. Also, reusing
> rxm without checking if it's NULL could cause the code to crash.
> 
> Signed-off-by: Stefan Puiu <stefan.puiu@gmail.com>

Applied to dpdk-next-net/master, thanks.

* Re: [dpdk-dev] [PATCH v4] vmxnet3: fix Rx deadlock
  2016-12-19 10:41       ` Ferruh Yigit
@ 2016-12-19 12:26         ` Ferruh Yigit
  0 siblings, 0 replies; 11+ messages in thread
From: Ferruh Yigit @ 2016-12-19 12:26 UTC
  To: Stefan Puiu, dev; +Cc: yongwang, mac_leehk, dpdk stable

On 12/19/2016 10:41 AM, Ferruh Yigit wrote:
> On 12/19/2016 9:40 AM, Stefan Puiu wrote:
>> Our use case is that we have an app that needs to keep mbufs around
>> for a while. We've seen cases when calling vmxnet3_post_rx_bufs() from
>> vmxet3_recv_pkts(), it might not succeed to add any mbufs to any RX
>> descriptors (where it returns -err). Since there are no mbufs that the
>> virtual hardware can use, no packets will be received after this; the
>> driver won't refill the mbuf after this so it gets stuck in this
>> state. I call this a deadlock for lack of a better term - the virtual
>> HW waits for free mbufs, while the app waits for the hardware to
>> notify it for data (by flipping the generation bit on the used Rx
>> descriptors). Note that after this, the app can't recover.
>>
>> This fix is a rework of this patch by Marco Lee:
>> http://dpdk.org/dev/patchwork/patch/6575/. I had to forward port
>> it, address review comments and also reverted the allocation
>> failure handling to the first version of the patch
>> (http://dpdk.org/ml/archives/dev/2015-July/022079.html), since
>> that's the only approach that seems to work, and seems to be what
>> other drivers are doing (I checked ixgbe and em). Reusing the mbuf
>> that's getting passed to the application doesn't seem to make
>> sense, and it was causing weird issues in our app. Also, reusing
>> rxm without checking if it's NULL could cause the code to crash.
>>
>> Signed-off-by: Stefan Puiu <stefan.puiu@gmail.com>
> 
> Applied to dpdk-next-net/master, thanks.
> 

CC:stable@dpdk.org

* [dpdk-dev] [PATCH] vmxnet3: fix Rx deadlock
@ 2016-11-14 10:40 Stefan Puiu
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Puiu @ 2016-11-14 10:40 UTC
  To: dev; +Cc: mac_leehk, yongwang, Stefan Puiu

Our use case is an app that needs to keep mbufs around for a while.
We've seen cases where vmxnet3_post_rx_bufs(), called from
vmxnet3_recv_pkts(), fails to add any mbufs to any Rx descriptors (it
returns -err). Since there are no mbufs that the virtual hardware can
use, and since nobody calls vmxnet3_post_rx_bufs() after that, no
packets will be received from then on. I call this a deadlock for lack
of a better term: the virtual HW waits for free mbufs, while the app
waits for the hardware to notify it of data. Note that the app can't
recover from this state.

This fix is a rework of this patch by Marco Lee:
http://dpdk.org/dev/patchwork/patch/6575/. I had to forward-port it,
address review comments, and revert the allocation failure handling to
that of the first version of the patch
(http://dpdk.org/ml/archives/dev/2015-July/022079.html), since that's
the only approach that seems to work, and it seems to be what other
drivers do (I checked ixgbe and em). Reusing the mbuf that's being
passed to the application doesn't seem to make sense, and it was
causing weird issues in our app. Also, reusing rxm without checking
whether it's NULL could crash the code.
---
 drivers/net/vmxnet3/vmxnet3_rxtx.c | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index b109168..c9d2488 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -518,6 +518,32 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	return nb_tx;
 }
 
+static inline void
+vmxnet3_renew_desc(vmxnet3_rx_queue_t *rxq, uint8_t ring_id,
+		struct rte_mbuf *mbuf)
+{
+	uint32_t  val = 0;
+	struct vmxnet3_cmd_ring *ring = &rxq->cmd_ring[ring_id];
+	struct Vmxnet3_RxDesc *rxd =
+		(struct Vmxnet3_RxDesc *)(ring->base + ring->next2fill);
+	vmxnet3_buf_info_t *buf_info = &ring->buf_info[ring->next2fill];
+
+	if (ring_id == 0)
+		val = VMXNET3_RXD_BTYPE_HEAD;
+	else
+		val = VMXNET3_RXD_BTYPE_BODY;
+
+	buf_info->m = mbuf;
+	buf_info->len = (uint16_t)(mbuf->buf_len - RTE_PKTMBUF_HEADROOM);
+	buf_info->bufPA = rte_mbuf_data_dma_addr_default(mbuf);
+
+	rxd->addr = buf_info->bufPA;
+	rxd->btype = val;
+	rxd->len = buf_info->len;
+	rxd->gen = ring->gen;
+
+	vmxnet3_cmd_ring_adv_next2fill(ring);
+}
 /*
  *  Allocates mbufs and clusters. Post rx descriptors with buffer details
  *  so that device can receive packets in those buffers.
@@ -657,9 +683,17 @@ vmxnet3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	}
 
 	while (rcd->gen == rxq->comp_ring.gen) {
+		struct rte_mbuf *newm;
 		if (nb_rx >= nb_pkts)
 			break;
 
+		newm = rte_mbuf_raw_alloc(rxq->mp);
+		if (unlikely(newm == NULL)) {
+			PMD_RX_LOG(ERR, "Error allocating mbuf");
+			rxq->stats.rx_buf_alloc_failure++;
+			break;
+		}
+
 		idx = rcd->rxdIdx;
 		ring_idx = (uint8_t)((rcd->rqID == rxq->qid1) ? 0 : 1);
 		rxd = (Vmxnet3_RxDesc *)rxq->cmd_ring[ring_idx].base + idx;
@@ -759,8 +793,8 @@ rcd_done:
 		VMXNET3_INC_RING_IDX_ONLY(rxq->cmd_ring[ring_idx].next2comp,
 					  rxq->cmd_ring[ring_idx].size);
 
-		/* It's time to allocate some new buf and renew descriptors */
-		vmxnet3_post_rx_bufs(rxq, ring_idx);
+		/* It's time to  renew descriptors */
+		vmxnet3_renew_desc(rxq, ring_idx, newm);
 		if (unlikely(rxq->shared->ctrl.updateRxProd)) {
 			VMXNET3_WRITE_BAR0_REG(hw, rxprod_reg[ring_idx] + (rxq->queue_id * VMXNET3_REG_ALIGN),
 					       rxq->cmd_ring[ring_idx].next2fill);
-- 
1.9.1

end of thread, other threads:[~2016-12-19 12:26 UTC | newest]

Thread overview: 11+ messages
2016-11-14 10:46 [dpdk-dev] [PATCH] vmxnet3: fix Rx deadlock Stefan Puiu
2016-11-30  4:59 ` Yong Wang
2016-12-12  8:27   ` Stefan Puiu
2016-12-12 18:17     ` Yong Wang
2016-12-12  8:21 ` [dpdk-dev] [PATCH v2] " Stefan Puiu
2016-12-16 15:36   ` [dpdk-dev] [PATCH v3] " Stefan Puiu
2016-12-16 17:47     ` Yong Wang
2016-12-19  9:40     ` [dpdk-dev] [PATCH v4] " Stefan Puiu
2016-12-19 10:41       ` Ferruh Yigit
2016-12-19 12:26         ` Ferruh Yigit
  -- strict thread matches above, loose matches on Subject: below --
2016-11-14 10:40 [dpdk-dev] [PATCH] " Stefan Puiu