DPDK patches and discussions
* [dpdk-dev] Surprisingly high TCP ACK packets drop counter
@ 2013-11-01 13:43 Alexander Belyakov
  2013-11-01 14:54 ` Wang, Shawn
  2013-11-02  5:29 ` Prashant Upadhyaya
  0 siblings, 2 replies; 14+ messages in thread
From: Alexander Belyakov @ 2013-11-01 13:43 UTC (permalink / raw)
  To: dev

Hello,

we have a simple test application on top of DPDK whose sole purpose is to
forward as many packets as possible. Generally we easily achieve 14.5Mpps
with two 82599EB NICs (one as input and one as output). The only surprising
exception is forwarding a pure TCP ACK flood, where performance always drops
to approximately 7Mpps.

For simplicity consider two different types of traffic:
1) TCP SYN flood is forwarded at 14.5Mpps rate,
2) pure TCP ACK flood is forwarded only at 7Mpps rate.

Both SYN and ACK packets have exactly the same length.

It is worth mentioning that this forwarding application looks at Ethernet and
IP headers, but never touches L4 headers.

We tracked the issue down to the RX path. To be specific, there are 4 RX queues
initialized on the input port, and rte_eth_stats_get() shows uniform packet
distribution (q_ipackets) among them, while q_errors remains zero for all
queues. The only drop counter quickly increasing in the case of a pure ACK
flood is ierrors, while rx_nombuf remains zero.
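
For reference, the counters above come from struct rte_eth_stats; a minimal
sketch of how such counters can be sampled is below. The helper name and the
queue count argument are illustrative only, this is not our actual
application code.

#include <inttypes.h>
#include <stdio.h>
#include <rte_ethdev.h>

/* Illustrative helper: dump per-port and per-queue RX counters. */
static void
dump_rx_stats(uint8_t port_id, unsigned int nb_rx_queues)
{
	struct rte_eth_stats stats;
	unsigned int q;

	rte_eth_stats_get(port_id, &stats);
	printf("port %u: ipackets=%" PRIu64 " ierrors=%" PRIu64
	       " rx_nombuf=%" PRIu64 "\n",
	       port_id, stats.ipackets, stats.ierrors, stats.rx_nombuf);
	for (q = 0; q < nb_rx_queues; q++)
		printf("  rxq %u: q_ipackets=%" PRIu64 " q_errors=%" PRIu64 "\n",
		       q, stats.q_ipackets[q], stats.q_errors[q]);
}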

We tried different kinds of traffic generators, but always got the same
result: 7Mpps (instead of the expected 14Mpps) for TCP packets with the ACK
flag bit set and all other flag bits cleared. Source IPs and ports are
selected randomly.

Please let us know if anyone is aware of such strange behavior and where
we should look to narrow down the problem.

Thanks in advance,
Alexander Belyakov

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-01 13:43 [dpdk-dev] Surprisingly high TCP ACK packets drop counter Alexander Belyakov
@ 2013-11-01 14:54 ` Wang, Shawn
  2013-11-01 15:14   ` Thomas Monjalon
                     ` (2 more replies)
  2013-11-02  5:29 ` Prashant Upadhyaya
  1 sibling, 3 replies; 14+ messages in thread
From: Wang, Shawn @ 2013-11-01 14:54 UTC (permalink / raw)
  To: Alexander Belyakov; +Cc: dev

Hi:

We had the same problem before. It turned out that RSC (receive side
coalescing) is enabled by default in DPDK, so we wrote this naïve patch to
disable it. The patch is based on DPDK 1.3; not sure whether 1.5 has changed
it or not.
After this patch, the ACK rate should go back to 14.5Mpps. For details, you
can refer to the Intel® 82599 10 GbE Controller Datasheet, section 7.11
(Receive Side Coalescing).

From: xingbow <xingbow@amazon.com>
Date: Wed, 21 Aug 2013 11:35:23 -0700
Subject: [PATCH] Disable RSC in ixgbe_dev_rx_init function in file

 ixgbe_rxtx.c
 
---

 DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h | 2 +-
 DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c       | 7 +++++++
 2 files changed, 8 insertions(+), 1 deletion(-)
 
diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
index 7fffd60..f03046f 100644
--- a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
+++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
@@ -1930,7 +1930,7 @@ enum {
 #define IXGBE_RFCTL_ISCSI_DIS		0x00000001
 #define IXGBE_RFCTL_ISCSI_DWC_MASK	0x0000003E
 #define IXGBE_RFCTL_ISCSI_DWC_SHIFT	1
-#define IXGBE_RFCTL_RSC_DIS		0x00000010
+#define IXGBE_RFCTL_RSC_DIS		0x00000020
 #define IXGBE_RFCTL_NFSW_DIS		0x00000040
 #define IXGBE_RFCTL_NFSR_DIS		0x00000080
 #define IXGBE_RFCTL_NFS_VER_MASK	0x00000300
diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 07830b7..ba6e05d 100755
--- a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -3007,6 +3007,7 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 	uint64_t bus_addr;
 	uint32_t rxctrl;
 	uint32_t fctrl;
+	uint32_t rfctl;
 	uint32_t hlreg0;
 	uint32_t maxfrs;
 	uint32_t srrctl;
@@ -3033,6 +3034,12 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 	fctrl |= IXGBE_FCTRL_PMCF;
 	IXGBE_WRITE_REG(hw, IXGBE_FCTRL, fctrl);
 
+	/* Disable RSC */
+	RTE_LOG(INFO, PMD, "Disable RSC\n");
+	rfctl = IXGBE_READ_REG(hw, IXGBE_RFCTL);
+	rfctl |= IXGBE_RFCTL_RSC_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_RFCTL, rfctl);
+
 	/*
 	 * Configure CRC stripping, if any.
 	 */
-- 


Thanks.
Wang, Xingbo




On 11/1/13 6:43 AM, "Alexander Belyakov" <abelyako@gmail.com> wrote:

>Hello,
>
>we have simple test application on top of DPDK which sole purpose is to
>forward as much packets as possible. Generally we easily achieve 14.5Mpps
>with two 82599EB (one as input and one as output). The only suprising
>exception is forwarding pure TCP ACK flood when performace always drops to
>approximately 7Mpps.
>
>For simplicity consider two different types of traffic:
>1) TCP SYN flood is forwarded at 14.5Mpps rate,
>2) pure TCP ACK flood is forwarded only at 7Mpps rate.
>
>Both SYN and ACK packets have exactly the same length.
>
>It is worth to mention, this forwarding application looks at Ethernet and
>IP headers, but never deals with L4 headers.
>
>We tracked down issue to RX circuit. To be specific, there are 4 RX queues
>initialized on input port and rte_eth_stats_get() shows uniform packet
>distribution (q_ipackets) among them, while q_errors remain zero for all
>queues. The only drop counter quickly increasing in the case of pure ACK
>flood is ierrors, while rx_nombuf remains zero.
>
>We tried different kinds of traffic generators, but always got the same
>result: 7Mpps (instead of expected 14Mpps) for TCP packets with ACK flag
>bit set while all other flag bits dropped. Source IPs and ports are
>selected randomly.
>
>Please let us know if anyone is aware of such strange behavior and where
>should we look at to narrow down the problem.
>
>Thanks in advance,
>Alexander Belyakov

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-01 14:54 ` Wang, Shawn
@ 2013-11-01 15:14   ` Thomas Monjalon
  2013-11-02  5:32   ` Prashant Upadhyaya
  2013-11-03 20:20   ` Alexander Belyakov
  2 siblings, 0 replies; 14+ messages in thread
From: Thomas Monjalon @ 2013-11-01 15:14 UTC (permalink / raw)
  To: Wang, Shawn; +Cc: dev

Hello,

01/11/2013 14:54, Wang, Shawn :
> We had the same problem before. It turned out that RSC (receive side
> coalescing) is enabled by default in DPDK. So we write this naïve patch to
> disable it. This patch is based on DPDK 1.3. Not sure 1.5 has changed it
> or not.
> After this patch, ACK rate should go back to 14.5Mpps. For details, you
> can refer to Intel® 82599 10 GbE Controller Datasheet. (7.11 Receive Side
> Coalescing).

Please, could you resend this patch with a properly formatted commit log,
in order to review it for integration ?

Thank you
-- 
Thomas

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-01 13:43 [dpdk-dev] Surprisingly high TCP ACK packets drop counter Alexander Belyakov
  2013-11-01 14:54 ` Wang, Shawn
@ 2013-11-02  5:29 ` Prashant Upadhyaya
  2013-11-03 20:32   ` Alexander Belyakov
  1 sibling, 1 reply; 14+ messages in thread
From: Prashant Upadhyaya @ 2013-11-02  5:29 UTC (permalink / raw)
  To: Alexander Belyakov, dev

Hi Alexander,

Regarding your following statement --
"
The only drop counter quickly increasing in the case of pure ACK flood is ierrors, while rx_nombuf remains zero.
"

Can you please explain the significance of the "ierrors" counter, since I am not familiar with it.

Further, you said you have 4 queues; how many cores are you using to poll the queues? Hopefully 4 cores, one per queue, without locks.
[It is absolutely critical that all 4 queues be polled]
Further, is it possible for your application itself to report the traffic received in packets per second on each queue? [Don't try to forward the traffic here; simply receive and drop in your app and sample the counters every second]
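
Something along these lines is what I mean, just a rough sketch (the port id,
queue count and counter array names are illustrative only):

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* Illustrative per-queue counters; another core can sample and print
 * these once per second to get packets per second on each queue. */
static volatile uint64_t queue_pkts[4];

/* Receive and drop on one queue; run one such loop per core/queue. */
static int
rx_drop_loop(void *arg)
{
	const uint16_t queue_id = *(const uint16_t *)arg;
	const uint8_t port_id = 0;	/* illustrative: single input port */
	struct rte_mbuf *bufs[BURST_SIZE];
	uint16_t nb_rx, i;

	for (;;) {
		nb_rx = rte_eth_rx_burst(port_id, queue_id, bufs, BURST_SIZE);
		queue_pkts[queue_id] += nb_rx;
		for (i = 0; i < nb_rx; i++)
			rte_pktmbuf_free(bufs[i]);	/* drop, do not forward */
	}
	return 0;
}

Each such loop would be started on its own core with rte_eal_remote_launch().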

Regards
-Prashant


-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alexander Belyakov
Sent: Friday, November 01, 2013 7:13 PM
To: dev@dpdk.org
Subject: [dpdk-dev] Surprisingly high TCP ACK packets drop counter

Hello,

we have simple test application on top of DPDK which sole purpose is to forward as much packets as possible. Generally we easily achieve 14.5Mpps with two 82599EB (one as input and one as output). The only suprising exception is forwarding pure TCP ACK flood when performace always drops to approximately 7Mpps.

For simplicity consider two different types of traffic:
1) TCP SYN flood is forwarded at 14.5Mpps rate,
2) pure TCP ACK flood is forwarded only at 7Mpps rate.

Both SYN and ACK packets have exactly the same length.

It is worth to mention, this forwarding application looks at Ethernet and IP headers, but never deals with L4 headers.

We tracked down issue to RX circuit. To be specific, there are 4 RX queues initialized on input port and rte_eth_stats_get() shows uniform packet distribution (q_ipackets) among them, while q_errors remain zero for all queues. The only drop counter quickly increasing in the case of pure ACK flood is ierrors, while rx_nombuf remains zero.

We tried different kinds of traffic generators, but always got the same
result: 7Mpps (instead of expected 14Mpps) for TCP packets with ACK flag bit set while all other flag bits dropped. Source IPs and ports are selected randomly.

Please let us know if anyone is aware of such strange behavior and where should we look at to narrow down the problem.

Thanks in advance,
Alexander Belyakov





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-01 14:54 ` Wang, Shawn
  2013-11-01 15:14   ` Thomas Monjalon
@ 2013-11-02  5:32   ` Prashant Upadhyaya
  2013-11-03 20:20   ` Alexander Belyakov
  2 siblings, 0 replies; 14+ messages in thread
From: Prashant Upadhyaya @ 2013-11-02  5:32 UTC (permalink / raw)
  To: Wang, Shawn, Alexander Belyakov; +Cc: dev

Hi,

I have used DPDK 1.4 and DPDK 1.5, and the packets do fan out nicely on the RX queues in some use cases I have.
Alexander, can you please try using DPDK 1.4 or 1.5 and share the results?

Regards
-Prashant


-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Wang, Shawn
Sent: Friday, November 01, 2013 8:24 PM
To: Alexander Belyakov
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter

Hi:

We had the same problem before. It turned out that RSC (receive side
coalescing) is enabled by default in DPDK. So we write this naïve patch to disable it. This patch is based on DPDK 1.3. Not sure 1.5 has changed it or not.
After this patch, ACK rate should go back to 14.5Mpps. For details, you can refer to Intel® 82599 10 GbE Controller Datasheet. (7.11 Receive Side Coalescing).

From: xingbow <xingbow@amazon.com>
Date: Wed, 21 Aug 2013 11:35:23 -0700
Subject: [PATCH] Disable RSC in ixgbe_dev_rx_init function in file

 ixgbe_rxtx.c

---

 DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h | 2 +-
 DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c       | 7 +++++++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
index 7fffd60..f03046f 100644

--- a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h

+++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h

@@ -1930,7 +1930,7 @@ enum {

 #define IXGBE_RFCTL_ISCSI_DIS          0x00000001
 #define IXGBE_RFCTL_ISCSI_DWC_MASK     0x0000003E
 #define IXGBE_RFCTL_ISCSI_DWC_SHIFT    1
-#define IXGBE_RFCTL_RSC_DIS            0x00000010

+#define IXGBE_RFCTL_RSC_DIS            0x00000020

 #define IXGBE_RFCTL_NFSW_DIS           0x00000040
 #define IXGBE_RFCTL_NFSR_DIS           0x00000080
 #define IXGBE_RFCTL_NFS_VER_MASK       0x00000300
diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 07830b7..ba6e05d 100755

--- a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c

+++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c

@@ -3007,6 +3007,7 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)

        uint64_t bus_addr;
        uint32_t rxctrl;
        uint32_t fctrl;
+       uint32_t rfctl;

        uint32_t hlreg0;
        uint32_t maxfrs;
        uint32_t srrctl;
@@ -3033,6 +3034,12 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)

        fctrl |= IXGBE_FCTRL_PMCF;
        IXGBE_WRITE_REG(hw, IXGBE_FCTRL, fctrl);

+       /* Disable RSC */
+       RTE_LOG(INFO, PMD, "Disable RSC\n");
+       rfctl = IXGBE_READ_REG(hw, IXGBE_RFCTL);
+       rfctl |= IXGBE_RFCTL_RSC_DIS;
+       IXGBE_WRITE_REG(hw, IXGBE_RFCTL, rfctl);
+

        /*
         * Configure CRC stripping, if any.
         */
--


Thanks.
Wang, Xingbo




On 11/1/13 6:43 AM, "Alexander Belyakov" <abelyako@gmail.com> wrote:

>Hello,
>
>we have simple test application on top of DPDK which sole purpose is to
>forward as much packets as possible. Generally we easily achieve
>14.5Mpps with two 82599EB (one as input and one as output). The only
>suprising exception is forwarding pure TCP ACK flood when performace
>always drops to approximately 7Mpps.
>
>For simplicity consider two different types of traffic:
>1) TCP SYN flood is forwarded at 14.5Mpps rate,
>2) pure TCP ACK flood is forwarded only at 7Mpps rate.
>
>Both SYN and ACK packets have exactly the same length.
>
>It is worth to mention, this forwarding application looks at Ethernet
>and IP headers, but never deals with L4 headers.
>
>We tracked down issue to RX circuit. To be specific, there are 4 RX
>queues initialized on input port and rte_eth_stats_get() shows uniform
>packet distribution (q_ipackets) among them, while q_errors remain zero
>for all queues. The only drop counter quickly increasing in the case of
>pure ACK flood is ierrors, while rx_nombuf remains zero.
>
>We tried different kinds of traffic generators, but always got the same
>result: 7Mpps (instead of expected 14Mpps) for TCP packets with ACK
>flag bit set while all other flag bits dropped. Source IPs and ports
>are selected randomly.
>
>Please let us know if anyone is aware of such strange behavior and
>where should we look at to narrow down the problem.
>
>Thanks in advance,
>Alexander Belyakov






^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-01 14:54 ` Wang, Shawn
  2013-11-01 15:14   ` Thomas Monjalon
  2013-11-02  5:32   ` Prashant Upadhyaya
@ 2013-11-03 20:20   ` Alexander Belyakov
  2013-11-04  3:06     ` Prashant Upadhyaya
  2 siblings, 1 reply; 14+ messages in thread
From: Alexander Belyakov @ 2013-11-03 20:20 UTC (permalink / raw)
  To: Wang, Shawn; +Cc: dev

Hi,

thanks for the patch and explanation. We have tried DPDK 1.3 and 1.5 - both
have the same issue.

Regards,
Alexander


On Fri, Nov 1, 2013 at 6:54 PM, Wang, Shawn <xingbow@amazon.com> wrote:

> Hi:
>
> We had the same problem before. It turned out that RSC (receive side
> coalescing) is enabled by default in DPDK. So we write this naïve patch to
> disable it. This patch is based on DPDK 1.3. Not sure 1.5 has changed it
> or not.
> After this patch, ACK rate should go back to 14.5Mpps. For details, you
> can refer to Intel® 82599 10 GbE Controller Datasheet. (7.11 Receive Side
> Coalescing).
>
> From: xingbow <xingbow@amazon.com>
> Date: Wed, 21 Aug 2013 11:35:23 -0700
> Subject: [PATCH] Disable RSC in ixgbe_dev_rx_init function in file
>
>  ixgbe_rxtx.c
>
> ---
>
>  DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h | 2 +-
>  DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c       | 7 +++++++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
> b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
> index 7fffd60..f03046f 100644
>
> --- a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
>
> +++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
>
> @@ -1930,7 +1930,7 @@ enum {
>
>  #define IXGBE_RFCTL_ISCSI_DIS          0x00000001
>  #define IXGBE_RFCTL_ISCSI_DWC_MASK     0x0000003E
>  #define IXGBE_RFCTL_ISCSI_DWC_SHIFT    1
> -#define IXGBE_RFCTL_RSC_DIS            0x00000010
>
> +#define IXGBE_RFCTL_RSC_DIS            0x00000020
>
>  #define IXGBE_RFCTL_NFSW_DIS           0x00000040
>  #define IXGBE_RFCTL_NFSR_DIS           0x00000080
>  #define IXGBE_RFCTL_NFS_VER_MASK       0x00000300
> diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index 07830b7..ba6e05d 100755
>
> --- a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
>
> +++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
>
> @@ -3007,6 +3007,7 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
>
>         uint64_t bus_addr;
>         uint32_t rxctrl;
>         uint32_t fctrl;
> +       uint32_t rfctl;
>
>         uint32_t hlreg0;
>         uint32_t maxfrs;
>         uint32_t srrctl;
> @@ -3033,6 +3034,12 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
>
>         fctrl |= IXGBE_FCTRL_PMCF;
>         IXGBE_WRITE_REG(hw, IXGBE_FCTRL, fctrl);
>
> +       /* Disable RSC */
> +       RTE_LOG(INFO, PMD, "Disable RSC\n");
> +       rfctl = IXGBE_READ_REG(hw, IXGBE_RFCTL);
> +       rfctl |= IXGBE_RFCTL_RSC_DIS;
> +       IXGBE_WRITE_REG(hw, IXGBE_RFCTL, rfctl);
> +
>
>         /*
>          * Configure CRC stripping, if any.
>          */
> --
>
>
> Thanks.
> Wang, Xingbo
>
>
>
>
> On 11/1/13 6:43 AM, "Alexander Belyakov" <abelyako@gmail.com> wrote:
>
> >Hello,
> >
> >we have simple test application on top of DPDK which sole purpose is to
> >forward as much packets as possible. Generally we easily achieve 14.5Mpps
> >with two 82599EB (one as input and one as output). The only suprising
> >exception is forwarding pure TCP ACK flood when performace always drops to
> >approximately 7Mpps.
> >
> >For simplicity consider two different types of traffic:
> >1) TCP SYN flood is forwarded at 14.5Mpps rate,
> >2) pure TCP ACK flood is forwarded only at 7Mpps rate.
> >
> >Both SYN and ACK packets have exactly the same length.
> >
> >It is worth to mention, this forwarding application looks at Ethernet and
> >IP headers, but never deals with L4 headers.
> >
> >We tracked down issue to RX circuit. To be specific, there are 4 RX queues
> >initialized on input port and rte_eth_stats_get() shows uniform packet
> >distribution (q_ipackets) among them, while q_errors remain zero for all
> >queues. The only drop counter quickly increasing in the case of pure ACK
> >flood is ierrors, while rx_nombuf remains zero.
> >
> >We tried different kinds of traffic generators, but always got the same
> >result: 7Mpps (instead of expected 14Mpps) for TCP packets with ACK flag
> >bit set while all other flag bits dropped. Source IPs and ports are
> >selected randomly.
> >
> >Please let us know if anyone is aware of such strange behavior and where
> >should we look at to narrow down the problem.
> >
> >Thanks in advance,
> >Alexander Belyakov
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-02  5:29 ` Prashant Upadhyaya
@ 2013-11-03 20:32   ` Alexander Belyakov
  0 siblings, 0 replies; 14+ messages in thread
From: Alexander Belyakov @ 2013-11-03 20:32 UTC (permalink / raw)
  To: Prashant Upadhyaya; +Cc: dev

Hello,


On Sat, Nov 2, 2013 at 9:29 AM, Prashant Upadhyaya <
prashant.upadhyaya@aricent.com> wrote:

> Hi Alexander,
>
> Regarding your following statement --
> "
> The only drop counter quickly increasing in the case of pure ACK flood is
> ierrors, while rx_nombuf remains zero.
> "
>
> Can you please explain the significance of "ierrors" counter since I am
> not familiar with that.
>
>
I was speaking about struct rte_eth_stats fields.
http://dpdk.org/doc/api/structrte__eth__stats.html


> Further,  you said you have 4 queues, how many cores  are you using for
> polling the queues ? Hopefully 4 cores for one queue each without locks.
> [It is absolutely critical that all 4 queues be polled]
>

There was one independent core per RX queue, of course.

> Further, is it possible so that your application itself reports the traffic
> receive in packets per second on each queue ? [Don't try to forward the
> traffic here, simply receive and drop in your app and sample the counters
> every second]
>

There are DPDK counters for RX packets per queue in the same struct
rte_eth_stats. TX was not an issue in this case.


>
> Regards
> -Prashant
>
>
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alexander Belyakov
> Sent: Friday, November 01, 2013 7:13 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
>
> Hello,
>
> we have simple test application on top of DPDK which sole purpose is to
> forward as much packets as possible. Generally we easily achieve 14.5Mpps
> with two 82599EB (one as input and one as output). The only suprising
> exception is forwarding pure TCP ACK flood when performace always drops to
> approximately 7Mpps.
>
> For simplicity consider two different types of traffic:
> 1) TCP SYN flood is forwarded at 14.5Mpps rate,
> 2) pure TCP ACK flood is forwarded only at 7Mpps rate.
>
> Both SYN and ACK packets have exactly the same length.
>
> It is worth to mention, this forwarding application looks at Ethernet and
> IP headers, but never deals with L4 headers.
>
> We tracked down issue to RX circuit. To be specific, there are 4 RX queues
> initialized on input port and rte_eth_stats_get() shows uniform packet
> distribution (q_ipackets) among them, while q_errors remain zero for all
> queues. The only drop counter quickly increasing in the case of pure ACK
> flood is ierrors, while rx_nombuf remains zero.
>
> We tried different kinds of traffic generators, but always got the same
> result: 7Mpps (instead of expected 14Mpps) for TCP packets with ACK flag
> bit set while all other flag bits dropped. Source IPs and ports are
> selected randomly.
>
> Please let us know if anyone is aware of such strange behavior and where
> should we look at to narrow down the problem.
>
> Thanks in advance,
> Alexander Belyakov
>
>
>
>
>
> ===============================================================================
> Please refer to http://www.aricent.com/legal/email_disclaimer.html
> for important disclosures regarding this electronic communication.
>
> ===============================================================================
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-03 20:20   ` Alexander Belyakov
@ 2013-11-04  3:06     ` Prashant Upadhyaya
  2013-11-05  9:29       ` Alexander Belyakov
  0 siblings, 1 reply; 14+ messages in thread
From: Prashant Upadhyaya @ 2013-11-04  3:06 UTC (permalink / raw)
  To: Alexander Belyakov, Wang, Shawn; +Cc: dev

Hi Alexander,

Please confirm if the patch works for you.

@Wang, are you saying that without the patch the NIC does not fan out the messages properly on all the receive queues?
So what exactly happens?

Regards
-Prashant


-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alexander Belyakov
Sent: Monday, November 04, 2013 1:51 AM
To: Wang, Shawn
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter

Hi,

thanks for the patch and explanation. We have tried DPDK 1.3 and 1.5 - both have the same issue.

Regards,
Alexander


On Fri, Nov 1, 2013 at 6:54 PM, Wang, Shawn <xingbow@amazon.com> wrote:

> Hi:
>
> We had the same problem before. It turned out that RSC (receive side
> coalescing) is enabled by default in DPDK. So we write this naïve
> patch to disable it. This patch is based on DPDK 1.3. Not sure 1.5 has
> changed it or not.
> After this patch, ACK rate should go back to 14.5Mpps. For details,
> you can refer to Intel® 82599 10 GbE Controller Datasheet. (7.11
> Receive Side Coalescing).
>
> From: xingbow <xingbow@amazon.com>
> Date: Wed, 21 Aug 2013 11:35:23 -0700
> Subject: [PATCH] Disable RSC in ixgbe_dev_rx_init function in file
>
>  ixgbe_rxtx.c
>
> ---
>
>  DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h | 2 +-
>  DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c       | 7 +++++++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
> b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
> index 7fffd60..f03046f 100644
>
> --- a/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
>
> +++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h
>
> @@ -1930,7 +1930,7 @@ enum {
>
>  #define IXGBE_RFCTL_ISCSI_DIS          0x00000001
>  #define IXGBE_RFCTL_ISCSI_DWC_MASK     0x0000003E
>  #define IXGBE_RFCTL_ISCSI_DWC_SHIFT    1
> -#define IXGBE_RFCTL_RSC_DIS            0x00000010
>
> +#define IXGBE_RFCTL_RSC_DIS            0x00000020
>
>  #define IXGBE_RFCTL_NFSW_DIS           0x00000040
>  #define IXGBE_RFCTL_NFSR_DIS           0x00000080
>  #define IXGBE_RFCTL_NFS_VER_MASK       0x00000300
> diff --git a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index 07830b7..ba6e05d 100755
>
> --- a/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
>
> +++ b/DPDK/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
>
> @@ -3007,6 +3007,7 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
>
>         uint64_t bus_addr;
>         uint32_t rxctrl;
>         uint32_t fctrl;
> +       uint32_t rfctl;
>
>         uint32_t hlreg0;
>         uint32_t maxfrs;
>         uint32_t srrctl;
> @@ -3033,6 +3034,12 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
>
>         fctrl |= IXGBE_FCTRL_PMCF;
>         IXGBE_WRITE_REG(hw, IXGBE_FCTRL, fctrl);
>
> +       /* Disable RSC */
> +       RTE_LOG(INFO, PMD, "Disable RSC\n");
> +       rfctl = IXGBE_READ_REG(hw, IXGBE_RFCTL);
> +       rfctl |= IXGBE_RFCTL_RSC_DIS;
> +       IXGBE_WRITE_REG(hw, IXGBE_RFCTL, rfctl);
> +
>
>         /*
>          * Configure CRC stripping, if any.
>          */
> --
>
>
> Thanks.
> Wang, Xingbo
>
>
>
>
> On 11/1/13 6:43 AM, "Alexander Belyakov" <abelyako@gmail.com> wrote:
>
> >Hello,
> >
> >we have simple test application on top of DPDK which sole purpose is
> >to forward as much packets as possible. Generally we easily achieve
> >14.5Mpps with two 82599EB (one as input and one as output). The only
> >suprising exception is forwarding pure TCP ACK flood when performace
> >always drops to approximately 7Mpps.
> >
> >For simplicity consider two different types of traffic:
> >1) TCP SYN flood is forwarded at 14.5Mpps rate,
> >2) pure TCP ACK flood is forwarded only at 7Mpps rate.
> >
> >Both SYN and ACK packets have exactly the same length.
> >
> >It is worth to mention, this forwarding application looks at Ethernet
> >and IP headers, but never deals with L4 headers.
> >
> >We tracked down issue to RX circuit. To be specific, there are 4 RX
> >queues initialized on input port and rte_eth_stats_get() shows
> >uniform packet distribution (q_ipackets) among them, while q_errors
> >remain zero for all queues. The only drop counter quickly increasing
> >in the case of pure ACK flood is ierrors, while rx_nombuf remains zero.
> >
> >We tried different kinds of traffic generators, but always got the
> >same
> >result: 7Mpps (instead of expected 14Mpps) for TCP packets with ACK
> >flag bit set while all other flag bits dropped. Source IPs and ports
> >are selected randomly.
> >
> >Please let us know if anyone is aware of such strange behavior and
> >where should we look at to narrow down the problem.
> >
> >Thanks in advance,
> >Alexander Belyakov
>
>





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-04  3:06     ` Prashant Upadhyaya
@ 2013-11-05  9:29       ` Alexander Belyakov
  2013-11-05 10:10         ` Olivier MATZ
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Belyakov @ 2013-11-05  9:29 UTC (permalink / raw)
  To: Prashant Upadhyaya; +Cc: dev

Hello,

On Mon, Nov 4, 2013 at 7:06 AM, Prashant Upadhyaya <
prashant.upadhyaya@aricent.com> wrote:

> Hi Alexander,
>
> Please confirm if the patch works for you.
>

Disabling RSC (DPDK 1.3) indeed brings ACK flood forwarding performance to
14.5+ Mpps. No negative side effects have been discovered so far, but we're
still testing.


>
> @Wang, are you saying that without the patch the NIC does not fan out the
> messages properly on all the receive queues ?
> So what exactly happens ?
>
>
The patch deals with RSC (receive side coalescing), not RSS (receive side
scaling).


> Regards
> -Prashant
>
>
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Alexander Belyakov
> Sent: Monday, November 04, 2013 1:51 AM
> To: Wang, Shawn
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
>
> Hi,
>
> thanks for the patch and explanation. We have tried DPDK 1.3 and 1.5 -
> both have the same issue.
>
> Regards,
> Alexander
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-05  9:29       ` Alexander Belyakov
@ 2013-11-05 10:10         ` Olivier MATZ
  2013-11-05 11:59           ` Alexander Belyakov
  0 siblings, 1 reply; 14+ messages in thread
From: Olivier MATZ @ 2013-11-05 10:10 UTC (permalink / raw)
  To: Alexander Belyakov; +Cc: dev

Hi,

 > Disabling RSC (DPDK 1.3) indeed brings ACK flood forwarding
 > performance to 14,5+ Mpps. No negative side affects were discovered
 > so far, but we're still testing.

The role of RSC is to reassemble input TCP segments, so it is possible
that the number of TCP packets sent to the DPDK is lower but some
packets may contain more data. Can you confirm that?
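
One easy way to check, assuming the usual struct rte_eth_stats fields, is to
compare the average received packet size with and without the patch; if RSC
really coalesces segments, ibytes/ipackets should grow well beyond the wire
size of a bare ACK. A sketch (the helper name is just for illustration):

#include <rte_ethdev.h>

/* Illustrative check: average bytes per packet received on a port. */
static double
avg_rx_pkt_size(uint8_t port_id)
{
	struct rte_eth_stats stats;

	rte_eth_stats_get(port_id, &stats);
	return stats.ipackets ?
		(double)stats.ibytes / (double)stats.ipackets : 0.0;
}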

In my opinion, this mechanism should be disabled by default because it
could break PMTU discovery on a router. However it could be useful for
somebody doing TCP termination only.

Regards,
Olivier

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-05 10:10         ` Olivier MATZ
@ 2013-11-05 11:59           ` Alexander Belyakov
  2013-11-05 14:40             ` Prashant Upadhyaya
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Belyakov @ 2013-11-05 11:59 UTC (permalink / raw)
  To: Olivier MATZ; +Cc: dev

Hello,

The role of RSC is to reassemble input TCP segments, so it is possible
> that the number of TCP packets sent to the DPDK is lower but some
> packets may contain more data. Can you confirm that?
>
>
I don't think our test case can answer your question, because all generated
TCP ACK packets were as small as possible (no TCP payload at all). Source
IPs and ports were picked at random for each packet, so most (adjacent)
packets belong to different TCP sessions.


> In my opinion, this mechanism should be disabled by default because it
> could break PMTU discovery on a router. However it could be useful for
> somebody doing TCP termination only.
>
>
I was thinking about a new rte_eth_rxmode structure option:

@@ -280,6 +280,7 @@ struct rte_eth_rxmode {
                hw_vlan_strip    : 1, /**< VLAN strip enable. */
                hw_vlan_extend   : 1, /**< Extended VLAN enable. */
                jumbo_frame      : 1, /**< Jumbo Frame Receipt enable. */
+               disable_rsc      : 1, /**< Disable RSC (receive side coalescing). */
                hw_strip_crc     : 1; /**< Enable CRC stripping by hardware. */
 };
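
On the PMD side, the check could then look roughly like this in
ixgbe_dev_rx_init() (a hypothetical sketch only, since the flag above is just
a proposal):

	if (dev->data->dev_conf.rxmode.disable_rsc) {
		/* honor the proposed flag instead of disabling RSC
		 * unconditionally */
		uint32_t rfctl = IXGBE_READ_REG(hw, IXGBE_RFCTL);
		rfctl |= IXGBE_RFCTL_RSC_DIS;
		IXGBE_WRITE_REG(hw, IXGBE_RFCTL, rfctl);
	}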


Regards,
Alexander

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-05 11:59           ` Alexander Belyakov
@ 2013-11-05 14:40             ` Prashant Upadhyaya
  2013-11-05 17:03               ` Wang, Shawn
  2013-11-06  8:42               ` Alexander Belyakov
  0 siblings, 2 replies; 14+ messages in thread
From: Prashant Upadhyaya @ 2013-11-05 14:40 UTC (permalink / raw)
  To: Alexander Belyakov, Olivier MATZ; +Cc: dev

Hi Alexander,

I am also wondering like Olivier – yours is a nice test case and setup, hence I am requesting the information below instead of spending a lot of time reinventing the test case at my end.
If you have the time on your side, it would be interesting to know the number of packets per second received inside your application on each of your 4 queues individually in both use cases – with and without RSC.

I am just wondering (since your throughput goes down almost exactly 50%) whether your apparent randomization of packets may not really be random enough, and with RSC enabled the packets are coming in on only two queues, or there is an uneven distribution.
Or it may well be that the NIC gets overwhelmed with RSC processing and that brings down the throughput.

Either way, it would be very interesting to get stats for packets per second on each queue in both the usecases.

Regards
-Prashant


From: Alexander Belyakov [mailto:abelyako@gmail.com]
Sent: Tuesday, November 05, 2013 5:29 PM
To: Olivier MATZ
Cc: Prashant Upadhyaya; dev@dpdk.org
Subject: Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter

Hello,

The role of RSC is to reassemble input TCP segments, so it is possible
that the number of TCP packets sent to the DPDK is lower but some
packets may contain more data. Can you confirm that?

I don't think out test case can answer your question, because all generated TCP ACK packets were as small as possible (no tcp payload at all). Source IPs and ports were picked at random for each packet, so most of (adjacent) packets belong to different TCP sessions.

In my opinion, this mechanism should be disabled by default because it
could break PMTU discovery on a router. However it could be useful for
somebody doing TCP termination only.

I was thinking about new rte_eth_rxmode structure option:

@@ -280,6 +280,7 @@ struct rte_eth_rxmode {
                hw_vlan_strip    : 1, /**< VLAN strip enable. */
                hw_vlan_extend   : 1, /**< Extended VLAN enable. */
                jumbo_frame      : 1, /**< Jumbo Frame Receipt enable. */
+               disable_rsc      : 1, /**< Disable RSC (receive side convalescing). */
                hw_strip_crc     : 1; /**< Enable CRC stripping by hardware. */
 };

Regards,
Alexander





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-05 14:40             ` Prashant Upadhyaya
@ 2013-11-05 17:03               ` Wang, Shawn
  2013-11-06  8:42               ` Alexander Belyakov
  1 sibling, 0 replies; 14+ messages in thread
From: Wang, Shawn @ 2013-11-05 17:03 UTC (permalink / raw)
  To: Prashant Upadhyaya, Alexander Belyakov, Olivier MATZ; +Cc: dev

My test is almost the same as Alexander's, but we only use one RX queue.


Sent from Samsung Mobile



-------- Original message --------
From: Prashant Upadhyaya <prashant.upadhyaya@aricent.com>
Date: 11/05/2013 6:41 AM (GMT-08:00)
To: Alexander Belyakov <abelyako@gmail.com>,Olivier MATZ <olivier.matz@6wind.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter


Hi Alexander,

I am also wondering like Olivier – yours is a nice testcase and setup, hence requesting the information below instead of spending a lot of time reinventing the test case at my end.
If you have the time on your side, it would be interesting to know what is the number of packets per second received inside your application on each of your 4 queues individually in both the usecases – with and without RSC.

I am just wondering (since your throughput almost exactly goes down 50 %), that your apparent randomization of packets may not really be random enough and with RSC enabled the packets are coming on two queues only or there might be an uneven distribution.
Or it may well be that NIC gets overwhelmed with RSC processing and that brings down the throughput.

Either way, it would be very interesting to get stats for packets per second on each queue in both the usecases.

Regards
-Prashant


From: Alexander Belyakov [mailto:abelyako@gmail.com]
Sent: Tuesday, November 05, 2013 5:29 PM
To: Olivier MATZ
Cc: Prashant Upadhyaya; dev@dpdk.org
Subject: Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter

Hello,

The role of RSC is to reassemble input TCP segments, so it is possible
that the number of TCP packets sent to the DPDK is lower but some
packets may contain more data. Can you confirm that?

I don't think out test case can answer your question, because all generated TCP ACK packets were as small as possible (no tcp payload at all). Source IPs and ports were picked at random for each packet, so most of (adjacent) packets belong to different TCP sessions.

In my opinion, this mechanism should be disabled by default because it
could break PMTU discovery on a router. However it could be useful for
somebody doing TCP termination only.

I was thinking about new rte_eth_rxmode structure option:

@@ -280,6 +280,7 @@ struct rte_eth_rxmode {
                hw_vlan_strip    : 1, /**< VLAN strip enable. */
                hw_vlan_extend   : 1, /**< Extended VLAN enable. */
                jumbo_frame      : 1, /**< Jumbo Frame Receipt enable. */
+               disable_rsc      : 1, /**< Disable RSC (receive side convalescing). */
                hw_strip_crc     : 1; /**< Enable CRC stripping by hardware. */
 };

Regards,
Alexander





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [dpdk-dev] Surprisingly high TCP ACK packets drop counter
  2013-11-05 14:40             ` Prashant Upadhyaya
  2013-11-05 17:03               ` Wang, Shawn
@ 2013-11-06  8:42               ` Alexander Belyakov
  1 sibling, 0 replies; 14+ messages in thread
From: Alexander Belyakov @ 2013-11-06  8:42 UTC (permalink / raw)
  To: Prashant Upadhyaya; +Cc: dev

Hello,

On Tue, Nov 5, 2013 at 6:40 PM, Prashant Upadhyaya <
prashant.upadhyaya@aricent.com> wrote:

>  Hi Alexander,
>
>
>
> I am also wondering like Olivier – yours is a nice testcase and setup,
> hence requesting the information below instead of spending a lot of time
> reinventing the test case at my end.
>
> If you have the time on your side, it would be interesting to know what is
> the number of packets per second received inside your application on each
> of your 4 queues individually in both the usecases – with and without RSC.
>
>
There is even packet distribution among all RX queues in both cases, with
and without RSC.

Regards,
Alexander

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2013-11-06  8:41 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-01 13:43 [dpdk-dev] Surprisingly high TCP ACK packets drop counter Alexander Belyakov
2013-11-01 14:54 ` Wang, Shawn
2013-11-01 15:14   ` Thomas Monjalon
2013-11-02  5:32   ` Prashant Upadhyaya
2013-11-03 20:20   ` Alexander Belyakov
2013-11-04  3:06     ` Prashant Upadhyaya
2013-11-05  9:29       ` Alexander Belyakov
2013-11-05 10:10         ` Olivier MATZ
2013-11-05 11:59           ` Alexander Belyakov
2013-11-05 14:40             ` Prashant Upadhyaya
2013-11-05 17:03               ` Wang, Shawn
2013-11-06  8:42               ` Alexander Belyakov
2013-11-02  5:29 ` Prashant Upadhyaya
2013-11-03 20:32   ` Alexander Belyakov
