DPDK patches and discussions
 help / color / Atom feed
* [dpdk-dev] [PATCH 0/2] IXGBE vPMD changes for aarch64
@ 2019-08-13 10:02 Ruifeng Wang
  2019-08-13 10:02 ` [dpdk-dev] [PATCH 1/2] net/ixgbe: remove barrier in vPMD " Ruifeng Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Ruifeng Wang @ 2019-08-13 10:02 UTC (permalink / raw)
  To: jerinj, gavin.hu; +Cc: dev, honnappa.nagarahalli, nd, Ruifeng Wang

Couple of changes to IXGBE vector PMD on aarch64 platform.
An unnecessary memory barrier was identified and removed.
Also part of processing was replaced with NEON intrinsics.
Both of the changes will help to improve performance.

Ruifeng Wang (2):
  net/ixgbe: remove barrier in vPMD for aarch64
  net/ixgbe: use neon intrinsics to count packet for aarch64

 drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 32 ++++++++++++-------------
 1 file changed, 16 insertions(+), 16 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [dpdk-dev] [PATCH 1/2] net/ixgbe: remove barrier in vPMD for aarch64
  2019-08-13 10:02 [dpdk-dev] [PATCH 0/2] IXGBE vPMD changes for aarch64 Ruifeng Wang
@ 2019-08-13 10:02 ` " Ruifeng Wang
  2019-08-13 10:02 ` [dpdk-dev] [PATCH 2/2] net/ixgbe: use neon intrinsics to count packet " Ruifeng Wang
  2019-08-25  1:33 ` [dpdk-dev] [PATCH 0/2] IXGBE vPMD changes " Ye Xiaolong
  2 siblings, 0 replies; 4+ messages in thread
From: Ruifeng Wang @ 2019-08-13 10:02 UTC (permalink / raw)
  To: jerinj, gavin.hu; +Cc: dev, honnappa.nagarahalli, nd, Ruifeng Wang

The memory barrier was intended for descriptor data integrity (see
comments in [1]). However, since NEON loads are atomic, there is
no need for the memory barrier. Remove it accordingly.

Corrected couple of code comments.

In terms of performance, observed slightly higher average throughput
in tests with 82599ES NIC.

[1] http://patches.dpdk.org/patch/18153/

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
 drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
index edb138354..86fb3afdb 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
@@ -214,13 +214,13 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		uint32_t var = 0;
 		uint32_t stat;
 
-		/* B.1 load 1 mbuf point */
+		/* B.1 load 2 mbuf point */
 		mbp1 = vld1q_u64((uint64_t *)&sw_ring[pos]);
 
 		/* B.2 copy 2 mbuf point into rx_pkts  */
 		vst1q_u64((uint64_t *)&rx_pkts[pos], mbp1);
 
-		/* B.1 load 1 mbuf point */
+		/* B.1 load 2 mbuf point */
 		mbp2 = vld1q_u64((uint64_t *)&sw_ring[pos + 2]);
 
 		/* A. load 4 pkts descs */
@@ -228,7 +228,6 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		descs[1] =  vld1q_u64((uint64_t *)(rxdp + 1));
 		descs[2] =  vld1q_u64((uint64_t *)(rxdp + 2));
 		descs[3] =  vld1q_u64((uint64_t *)(rxdp + 3));
-		rte_smp_rmb();
 
 		/* B.2 copy 2 mbuf point into rx_pkts  */
 		vst1q_u64((uint64_t *)&rx_pkts[pos + 2], mbp2);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [dpdk-dev] [PATCH 2/2] net/ixgbe: use neon intrinsics to count packet for aarch64
  2019-08-13 10:02 [dpdk-dev] [PATCH 0/2] IXGBE vPMD changes for aarch64 Ruifeng Wang
  2019-08-13 10:02 ` [dpdk-dev] [PATCH 1/2] net/ixgbe: remove barrier in vPMD " Ruifeng Wang
@ 2019-08-13 10:02 ` " Ruifeng Wang
  2019-08-25  1:33 ` [dpdk-dev] [PATCH 0/2] IXGBE vPMD changes " Ye Xiaolong
  2 siblings, 0 replies; 4+ messages in thread
From: Ruifeng Wang @ 2019-08-13 10:02 UTC (permalink / raw)
  To: jerinj, gavin.hu; +Cc: dev, honnappa.nagarahalli, nd, Ruifeng Wang

vPMD for aarch64 calculates the number of received packets using a loop.
Change to use NEON intrinsics for calculation. This saves CPU cycles
and has slightly better performance.

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
 drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 27 +++++++++++++------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
index 86fb3afdb..eeb825911 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
@@ -144,6 +144,7 @@ desc_to_olflags_v(uint8x16x2_t sterr_tmp1, uint8x16x2_t sterr_tmp2,
 
 #define IXGBE_VPMD_DESC_DD_MASK		0x01010101
 #define IXGBE_VPMD_DESC_EOP_MASK	0x02020202
+#define IXGBE_UINT8_BIT			(CHAR_BIT * sizeof(uint8_t))
 
 static inline uint16_t
 _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
@@ -211,7 +212,6 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		uint64x2_t mbp1, mbp2;
 		uint8x16_t staterr;
 		uint16x8_t tmp;
-		uint32_t var = 0;
 		uint32_t stat;
 
 		/* B.1 load 2 mbuf point */
@@ -256,7 +256,6 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 
 		/* C.2 get 4 pkts staterr value  */
 		staterr = vzipq_u8(sterr_tmp1.val[1], sterr_tmp2.val[1]).val[0];
-		stat = vgetq_lane_u32(vreinterpretq_u32_u8(staterr), 0);
 
 		/* set ol_flags with vlan packet type */
 		desc_to_olflags_v(sterr_tmp1, sterr_tmp2, staterr,
@@ -282,12 +281,20 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 
 		/* C* extract and record EOP bit */
 		if (split_packet) {
+			stat = vgetq_lane_u32(vreinterpretq_u32_u8(staterr), 0);
 			/* and with mask to extract bits, flipping 1-0 */
 			*(int *)split_packet = ~stat & IXGBE_VPMD_DESC_EOP_MASK;
 
 			split_packet += RTE_IXGBE_DESCS_PER_LOOP;
 		}
 
+		/* C.4 expand DD bit to saturate UINT8 */
+		staterr = vshlq_n_u8(staterr, IXGBE_UINT8_BIT - 1);
+		staterr = vreinterpretq_u8_s8
+				(vshrq_n_s8(vreinterpretq_s8_u8(staterr),
+					IXGBE_UINT8_BIT - 1));
+		stat = ~vgetq_lane_u32(vreinterpretq_u32_u8(staterr), 0);
+
 		rte_prefetch_non_temporal(rxdp + RTE_IXGBE_DESCS_PER_LOOP);
 
 		/* D.3 copy final 1,2 data to rx_pkts */
@@ -296,18 +303,12 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		vst1q_u8((uint8_t *)&rx_pkts[pos]->rx_descriptor_fields1,
 			 pkt_mb1);
 
-		stat &= IXGBE_VPMD_DESC_DD_MASK;
-
-		/* C.4 calc avaialbe number of desc */
-		if (likely(stat != IXGBE_VPMD_DESC_DD_MASK)) {
-			while (stat & 0x01) {
-				++var;
-				stat = stat >> 8;
-			}
-			nb_pkts_recd += var;
-			break;
-		} else {
+		/* C.5 calc available number of desc */
+		if (unlikely(stat == 0)) {
 			nb_pkts_recd += RTE_IXGBE_DESCS_PER_LOOP;
+		} else {
+			nb_pkts_recd += __builtin_ctz(stat) / IXGBE_UINT8_BIT;
+			break;
 		}
 	}
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [dpdk-dev] [PATCH 0/2] IXGBE vPMD changes for aarch64
  2019-08-13 10:02 [dpdk-dev] [PATCH 0/2] IXGBE vPMD changes for aarch64 Ruifeng Wang
  2019-08-13 10:02 ` [dpdk-dev] [PATCH 1/2] net/ixgbe: remove barrier in vPMD " Ruifeng Wang
  2019-08-13 10:02 ` [dpdk-dev] [PATCH 2/2] net/ixgbe: use neon intrinsics to count packet " Ruifeng Wang
@ 2019-08-25  1:33 ` " Ye Xiaolong
  2 siblings, 0 replies; 4+ messages in thread
From: Ye Xiaolong @ 2019-08-25  1:33 UTC (permalink / raw)
  To: Ruifeng Wang; +Cc: jerinj, gavin.hu, dev, honnappa.nagarahalli, nd

Hi, 

Thanks for the patches, could you also provide the Fixes tag and cc stable?
The patchset looks good to me.

Thanks,
Xiaolong

On 08/13, Ruifeng Wang wrote:
>Couple of changes to IXGBE vector PMD on aarch64 platform.
>An unnecessary memory barrier was identified and removed.
>Also part of processing was replaced with NEON intrinsics.
>Both of the changes will help to improve performance.
>
>Ruifeng Wang (2):
>  net/ixgbe: remove barrier in vPMD for aarch64
>  net/ixgbe: use neon intrinsics to count packet for aarch64
>
> drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 32 ++++++++++++-------------
> 1 file changed, 16 insertions(+), 16 deletions(-)
>
>-- 
>2.17.1
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-13 10:02 [dpdk-dev] [PATCH 0/2] IXGBE vPMD changes for aarch64 Ruifeng Wang
2019-08-13 10:02 ` [dpdk-dev] [PATCH 1/2] net/ixgbe: remove barrier in vPMD " Ruifeng Wang
2019-08-13 10:02 ` [dpdk-dev] [PATCH 2/2] net/ixgbe: use neon intrinsics to count packet " Ruifeng Wang
2019-08-25  1:33 ` [dpdk-dev] [PATCH 0/2] IXGBE vPMD changes " Ye Xiaolong

DPDK patches and discussions

Archives are clonable:
	git clone --mirror http://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ http://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev


Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox