From: Rahul Bhansali
To: David Christensen, Ruifeng Wang, Bruce Richardson, Konstantin Ananyev
CC: Rahul Bhansali
Subject: [PATCH v3 1/2] examples/l3fwd: common packet group functionality
Date: Thu, 23 Jun 2022 15:08:15 +0530
Message-ID: <20220623093816.254830-1-rbhansali@marvell.com>
In-Reply-To: <20220524095717.3875284-1-rbhansali@marvell.com>
References: <20220524095717.3875284-1-rbhansali@marvell.com>

Make the packet grouping functionality common so that other examples
can reuse it as needed. For each architecture (sse/neon/altivec), a
port_group.h header is created in its own subdirectory of
examples/common/.

Signed-off-by: Rahul Bhansali
---
Changes in v3: Created common port-group headers for architectures
sse/neon/altivec as suggested by Konstantin.

Changes in v2: New patch to address review comment.
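
Note for readers of the table below: the lookup that port_groupx4()
performs can be sketched in scalar C. This is only an illustration and is
not part of the patch. port_group_mask(), port_group_entry() and
struct grp_entry are hypothetical names, and the lane order assumes the
little-endian layout that the u16/u64 union in port_groupx4() appears to
rely on.

```c
#include <stdint.h>

#define FWDSTEP 4

/* Hypothetical mirror of one gptbl[] entry. */
struct grp_entry {
	uint64_t pnum; /* four packed 16-bit run lengths */
	int32_t idx;   /* start of the run that contains 'e' */
	uint16_t lpv;  /* packets that extend the previous run */
};

/*
 * Hypothetical scalar stand-in for the vector compare:
 * bit i is set when dst_port[i] == dst_port[i + 1].
 */
static inline int32_t
port_group_mask(const uint16_t dst_port[FWDSTEP + 1])
{
	int32_t v = 0;
	int i;

	for (i = 0; i < FWDSTEP; i++)
		v |= (dst_port[i] == dst_port[i + 1]) << i;

	return v;
}

/* Derive the entry for mask v instead of reading it from a table. */
static inline struct grp_entry
port_group_entry(int32_t v)
{
	struct grp_entry e = { 0, FWDSTEP, 0 };
	uint64_t run = 1;
	int i;

	/* lane i holds the length of the run that starts at packet i. */
	for (i = FWDSTEP - 1; i >= 0; i--) {
		run = ((v >> i) & 1) ? run + 1 : 1;
		e.pnum |= run << (16 * i);
	}

	/* equalities chained from 'a' extend the run lp points to. */
	for (i = 0; i < FWDSTEP && ((v >> i) & 1); i++)
		e.lpv++;

	/* equalities chained back from 'e' move the new run start. */
	for (i = FWDSTEP - 1; i >= 0 && ((v >> i) & 1); i--)
		e.idx--;

	return e;
}
```

For dst_port[] = {1, 1, 2, 2, 3} the mask is 5 (a == b and c == d), and
the derived entry matches entry 5 of gptbl: pnum 0x0001000200010002,
idx 4, lpv 1.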
 examples/common/altivec/port_group.h |  48 +++++++++
 examples/common/neon/port_group.h    |  50 ++++++++++
 examples/common/pkt_group.h          | 139 +++++++++++++++++++++++++++
 examples/common/sse/port_group.h     |  47 +++++++++
 examples/l3fwd/Makefile              |   5 +-
 examples/l3fwd/l3fwd.h               |   2 -
 examples/l3fwd/l3fwd_altivec.h       |  37 +------
 examples/l3fwd/l3fwd_common.h        | 129 +------------------------
 examples/l3fwd/l3fwd_neon.h          |  39 +-------
 examples/l3fwd/l3fwd_sse.h           |  36 +------
 examples/meson.build                 |   2 +-
 11 files changed, 293 insertions(+), 241 deletions(-)
 create mode 100644 examples/common/altivec/port_group.h
 create mode 100644 examples/common/neon/port_group.h
 create mode 100644 examples/common/pkt_group.h
 create mode 100644 examples/common/sse/port_group.h

diff --git a/examples/common/altivec/port_group.h b/examples/common/altivec/port_group.h
new file mode 100644
index 0000000000..d96d14ca94
--- /dev/null
+++ b/examples/common/altivec/port_group.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2016 Intel Corporation.
+ * Copyright(c) 2017 IBM Corporation.
+ * Copyright(C) 2022 Marvell.
+ */
+
+#ifndef _PORT_GROUP_H_
+#define _PORT_GROUP_H_
+
+#include "pkt_group.h"
+
+/*
+ * Group consecutive packets with the same destination port in bursts of 4.
+ * Suppose we have an array of destination ports:
+ * dst_port[] = {a, b, c, d, e, ... }
+ * dp1 should contain: <a, b, c, d>, dp2: <b, c, d, e>.
+ * We do 4 comparisons at once and the result is a 4-bit mask.
+ * This mask is used as an index into a prebuilt array of pnum values.
+ */
+static inline uint16_t *
+port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
+		__vector unsigned short dp1,
+		__vector unsigned short dp2)
+{
+	union {
+		uint16_t u16[FWDSTEP + 1];
+		uint64_t u64;
+	} *pnum = (void *)pn;
+
+	int32_t v;
+
+	v = vec_any_eq(dp1, dp2);
+
+
+	/* update last port counter. */
+	lp[0] += gptbl[v].lpv;
+
+	/* if dest port value has changed. */
+	if (v != GRPMSK) {
+		pnum->u64 = gptbl[v].pnum;
+		pnum->u16[FWDSTEP] = 1;
+		lp = pnum->u16 + gptbl[v].idx;
+	}
+
+	return lp;
+}
+
+#endif /* _PORT_GROUP_H_ */
diff --git a/examples/common/neon/port_group.h b/examples/common/neon/port_group.h
new file mode 100644
index 0000000000..82c6ed6d73
--- /dev/null
+++ b/examples/common/neon/port_group.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2016-2018 Intel Corporation.
+ * Copyright(c) 2017-2018 Linaro Limited.
+ * Copyright(C) 2022 Marvell.
+ */
+
+#ifndef _PORT_GROUP_H_
+#define _PORT_GROUP_H_
+
+#include "pkt_group.h"
+
+/*
+ * Group consecutive packets with the same destination port in bursts of 4.
+ * Suppose we have an array of destination ports:
+ * dst_port[] = {a, b, c, d, e, ... }
+ * dp1 should contain: <a, b, c, d>, dp2: <b, c, d, e>.
+ * We do 4 comparisons at once and the result is a 4-bit mask.
+ * This mask is used as an index into a prebuilt array of pnum values.
+ */
+static inline uint16_t *
+port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, uint16x8_t dp1,
+	     uint16x8_t dp2)
+{
+	union {
+		uint16_t u16[FWDSTEP + 1];
+		uint64_t u64;
+	} *pnum = (void *)pn;
+
+	uint16x8_t mask = {1, 2, 4, 8, 0, 0, 0, 0};
+	int32_t v;
+
+	dp1 = vceqq_u16(dp1, dp2);
+	dp1 = vandq_u16(dp1, mask);
+	v = vaddvq_u16(dp1);
+
+	/* update last port counter. */
+	lp[0] += gptbl[v].lpv;
+	rte_compiler_barrier();
+
+	/* if dest port value has changed. */
+	if (v != GRPMSK) {
+		pnum->u64 = gptbl[v].pnum;
+		pnum->u16[FWDSTEP] = 1;
+		lp = pnum->u16 + gptbl[v].idx;
+	}
+
+	return lp;
+}
+
+#endif /* _PORT_GROUP_H_ */
diff --git a/examples/common/pkt_group.h b/examples/common/pkt_group.h
new file mode 100644
index 0000000000..8b26d9380f
--- /dev/null
+++ b/examples/common/pkt_group.h
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2016-2018 Intel Corporation.
+ * Copyright(c) 2017-2018 Linaro Limited.
+ * Copyright(C) 2022 Marvell.
+ */
+
+#ifndef _PKT_GROUP_H_
+#define _PKT_GROUP_H_
+
+#define FWDSTEP	4
+
+/*
+ * Group consecutive packets with the same destination port into one burst.
+ * To avoid extra latency this is done together with some other packet
+ * processing, but after we made a final decision about packet's destination.
+ * To do this we maintain:
+ * pnum - array of number of consecutive packets with the same dest port for
+ * each packet in the input burst.
+ * lp - pointer to the last updated element in the pnum.
+ * dlp - dest port value lp corresponds to.
+ */
+
+#define	GRPSZ	(1 << FWDSTEP)
+#define	GRPMSK	(GRPSZ - 1)
+
+#define	GROUP_PORT_STEP(dlp, dcp, lp, pn, idx)	do { \
+	if (likely((dlp) == (dcp)[(idx)])) { \
+		(lp)[0]++; \
+	} else { \
+		(dlp) = (dcp)[idx]; \
+		(lp) = (pn) + (idx); \
+		(lp)[0] = 1; \
+	} \
+} while (0)
+
+static const struct {
+	uint64_t pnum; /* prebuilt 4 values for pnum[]. */
+	int32_t  idx;  /* index for new last updated element. */
+	uint16_t lpv;  /* add value to the last updated element. */
+} gptbl[GRPSZ] = {
+	{
+		/* 0: a != b, b != c, c != d, d != e */
+		.pnum = UINT64_C(0x0001000100010001),
+		.idx = 4,
+		.lpv = 0,
+	},
+	{
+		/* 1: a == b, b != c, c != d, d != e */
+		.pnum = UINT64_C(0x0001000100010002),
+		.idx = 4,
+		.lpv = 1,
+	},
+	{
+		/* 2: a != b, b == c, c != d, d != e */
+		.pnum = UINT64_C(0x0001000100020001),
+		.idx = 4,
+		.lpv = 0,
+	},
+	{
+		/* 3: a == b, b == c, c != d, d != e */
+		.pnum = UINT64_C(0x0001000100020003),
+		.idx = 4,
+		.lpv = 2,
+	},
+	{
+		/* 4: a != b, b != c, c == d, d != e */
+		.pnum = UINT64_C(0x0001000200010001),
+		.idx = 4,
+		.lpv = 0,
+	},
+	{
+		/* 5: a == b, b != c, c == d, d != e */
+		.pnum = UINT64_C(0x0001000200010002),
+		.idx = 4,
+		.lpv = 1,
+	},
+	{
+		/* 6: a != b, b == c, c == d, d != e */
+		.pnum = UINT64_C(0x0001000200030001),
+		.idx = 4,
+		.lpv = 0,
+	},
+	{
+		/* 7: a == b, b == c, c == d, d != e */
+		.pnum = UINT64_C(0x0001000200030004),
+		.idx = 4,
+		.lpv = 3,
+	},
+	{
+		/* 8: a != b, b != c, c != d, d == e */
+		.pnum = UINT64_C(0x0002000100010001),
+		.idx = 3,
+		.lpv = 0,
+	},
+	{
+		/* 9: a == b, b != c, c != d, d == e */
+		.pnum = UINT64_C(0x0002000100010002),
+		.idx = 3,
+		.lpv = 1,
+	},
+	{
+		/* 0xa: a != b, b == c, c != d, d == e */
+		.pnum = UINT64_C(0x0002000100020001),
+		.idx = 3,
+		.lpv = 0,
+	},
+	{
+		/* 0xb: a == b, b == c, c != d, d == e */
+		.pnum = UINT64_C(0x0002000100020003),
+		.idx = 3,
+		.lpv = 2,
+	},
+	{
+		/* 0xc: a != b, b != c, c == d, d == e */
+		.pnum = UINT64_C(0x0002000300010001),
+		.idx = 2,
+		.lpv = 0,
+	},
+	{
+		/* 0xd: a == b, b != c, c == d, d == e */
+		.pnum = UINT64_C(0x0002000300010002),
+		.idx = 2,
+		.lpv = 1,
+	},
+	{
+		/* 0xe: a != b, b == c, c == d, d == e */
+		.pnum = UINT64_C(0x0002000300040001),
+		.idx = 1,
+		.lpv = 0,
+	},
+	{
+		/* 0xf: a == b, b == c, c == d, d == e */
+		.pnum = UINT64_C(0x0002000300040005),
+		.idx = 0,
+		.lpv = 4,
+	},
+};
+
+#endif /* _PKT_GROUP_H_ */
diff --git a/examples/common/sse/port_group.h b/examples/common/sse/port_group.h
new file mode 100644
index 0000000000..1ec09f8e4e
--- /dev/null
+++ b/examples/common/sse/port_group.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2016 Intel Corporation.
+ * Copyright(C) 2022 Marvell.
+ */
+
+#ifndef _PORT_GROUP_H_
+#define _PORT_GROUP_H_
+
+#include "pkt_group.h"
+
+/*
+ * Group consecutive packets with the same destination port in bursts of 4.
+ * Suppose we have an array of destination ports:
+ * dst_port[] = {a, b, c, d, e, ... }
+ * dp1 should contain: <a, b, c, d>, dp2: <b, c, d, e>.
+ * We do 4 comparisons at once and the result is a 4-bit mask.
+ * This mask is used as an index into a prebuilt array of pnum values.
+ */
+static inline uint16_t *
+port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, __m128i dp1,
+	     __m128i dp2)
+{
+	union {
+		uint16_t u16[FWDSTEP + 1];
+		uint64_t u64;
+	} *pnum = (void *)pn;
+
+	int32_t v;
+
+	dp1 = _mm_cmpeq_epi16(dp1, dp2);
+	dp1 = _mm_unpacklo_epi16(dp1, dp1);
+	v = _mm_movemask_ps((__m128)dp1);
+
+	/* update last port counter. */
+	lp[0] += gptbl[v].lpv;
+
+	/* if dest port value has changed. */
+	if (v != GRPMSK) {
+		pnum->u64 = gptbl[v].pnum;
+		pnum->u16[FWDSTEP] = 1;
+		lp = pnum->u16 + gptbl[v].idx;
+	}
+
+	return lp;
+}
+
+#endif /* _PORT_GROUP_H_ */
diff --git a/examples/l3fwd/Makefile b/examples/l3fwd/Makefile
index 8efe6378e2..8dbe85c2e6 100644
--- a/examples/l3fwd/Makefile
+++ b/examples/l3fwd/Makefile
@@ -22,6 +22,7 @@ shared: build/$(APP)-shared
 static: build/$(APP)-static
 	ln -sf $(APP)-static build/$(APP)

+INCLUDES =-I../common
 PC_FILE := $(shell $(PKGCONF) --path libdpdk 2>/dev/null)
 CFLAGS += -O3 $(shell $(PKGCONF) --cflags libdpdk)
 # Added for 'rte_eth_link_to_str()'
@@ -38,10 +39,10 @@ endif
 endif

 build/$(APP)-shared: $(SRCS-y) Makefile $(PC_FILE) | build
-	$(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_SHARED)
+	$(CC) $(CFLAGS) $(SRCS-y) $(INCLUDES) -o $@ $(LDFLAGS) $(LDFLAGS_SHARED)

 build/$(APP)-static: $(SRCS-y) Makefile $(PC_FILE) | build
-	$(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_STATIC)
+	$(CC) $(CFLAGS) $(SRCS-y) $(INCLUDES) -o $@ $(LDFLAGS) $(LDFLAGS_STATIC)

 build:
 	@mkdir -p $@
diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
index 8a52c90755..40b5f32a9e 100644
--- a/examples/l3fwd/l3fwd.h
+++ b/examples/l3fwd/l3fwd.h
@@ -44,8 +44,6 @@
 /* Used to mark destination port as 'invalid'. */
 #define BAD_PORT ((uint16_t)-1)

-#define FWDSTEP	4
-
 /* replace first 12B of the ethernet header. */
 #define MASK_ETH 0x3f

diff --git a/examples/l3fwd/l3fwd_altivec.h b/examples/l3fwd/l3fwd_altivec.h
index 88fb41843b..87018f5dbe 100644
--- a/examples/l3fwd/l3fwd_altivec.h
+++ b/examples/l3fwd/l3fwd_altivec.h
@@ -8,6 +8,7 @@
 #define _L3FWD_ALTIVEC_H_

 #include "l3fwd.h"
+#include "altivec/port_group.h"
 #include "l3fwd_common.h"

 /*
@@ -82,42 +83,6 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 			&dst_port[3], pkt[3]->packet_type);
 }

-/*
- * Group consecutive packets with the same destination port in bursts of 4.
- * Suppose we have array of destination ports:
- * dst_port[] = {a, b, c, d,, e, ... }
- * dp1 should contain: <a, b, c, d>, dp2: <b, c, d, e>.
- * We doing 4 comparisons at once and the result is 4 bit mask.
- * This mask is used as an index into prebuild array of pnum values.
- */
-static inline uint16_t *
-port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
-		__vector unsigned short dp1,
-		__vector unsigned short dp2)
-{
-	union {
-		uint16_t u16[FWDSTEP + 1];
-		uint64_t u64;
-	} *pnum = (void *)pn;
-
-	int32_t v;
-
-	v = vec_any_eq(dp1, dp2);
-
-
-	/* update last port counter. */
-	lp[0] += gptbl[v].lpv;
-
-	/* if dest port value has changed. */
-	if (v != GRPMSK) {
-		pnum->u64 = gptbl[v].pnum;
-		pnum->u16[FWDSTEP] = 1;
-		lp = pnum->u16 + gptbl[v].idx;
-	}
-
-	return lp;
-}
-
 /**
  * Process one packet:
  * Update source and destination MAC addresses in the ethernet header.
diff --git a/examples/l3fwd/l3fwd_common.h b/examples/l3fwd/l3fwd_common.h
index 8e4c27218f..224b1c08e8 100644
--- a/examples/l3fwd/l3fwd_common.h
+++ b/examples/l3fwd/l3fwd_common.h
@@ -7,6 +7,8 @@
 #ifndef _L3FWD_COMMON_H_
 #define _L3FWD_COMMON_H_

+#include "pkt_group.h"
+
 #ifdef DO_RFC_1812_CHECKS

 #define	IPV4_MIN_VER_IHL	0x45
@@ -50,133 +52,6 @@ rfc1812_process(struct rte_ipv4_hdr *ipv4_hdr, uint16_t *dp, uint32_t ptype)
 #define	rfc1812_process(mb, dp, ptype)	do { } while (0)
 #endif /* DO_RFC_1812_CHECKS */

-/*
- * We group consecutive packets with the same destination port into one burst.
- * To avoid extra latency this is done together with some other packet
- * processing, but after we made a final decision about packet's destination.
- * To do this we maintain:
- * pnum - array of number of consecutive packets with the same dest port for
- * each packet in the input burst.
- * lp - pointer to the last updated element in the pnum.
- * dlp - dest port value lp corresponds to.
- */
-
-#define	GRPSZ	(1 << FWDSTEP)
-#define	GRPMSK	(GRPSZ - 1)
-
-#define	GROUP_PORT_STEP(dlp, dcp, lp, pn, idx)	do { \
-	if (likely((dlp) == (dcp)[(idx)])) { \
-		(lp)[0]++; \
-	} else { \
-		(dlp) = (dcp)[idx]; \
-		(lp) = (pn) + (idx); \
-		(lp)[0] = 1; \
-	} \
-} while (0)
-
-static const struct {
-	uint64_t pnum; /* prebuild 4 values for pnum[]. */
-	int32_t  idx;  /* index for new last updated element. */
-	uint16_t lpv;  /* add value to the last updated element. */
-} gptbl[GRPSZ] = {
-	{
-		/* 0: a != b, b != c, c != d, d != e */
-		.pnum = UINT64_C(0x0001000100010001),
-		.idx = 4,
-		.lpv = 0,
-	},
-	{
-		/* 1: a == b, b != c, c != d, d != e */
-		.pnum = UINT64_C(0x0001000100010002),
-		.idx = 4,
-		.lpv = 1,
-	},
-	{
-		/* 2: a != b, b == c, c != d, d != e */
-		.pnum = UINT64_C(0x0001000100020001),
-		.idx = 4,
-		.lpv = 0,
-	},
-	{
-		/* 3: a == b, b == c, c != d, d != e */
-		.pnum = UINT64_C(0x0001000100020003),
-		.idx = 4,
-		.lpv = 2,
-	},
-	{
-		/* 4: a != b, b != c, c == d, d != e */
-		.pnum = UINT64_C(0x0001000200010001),
-		.idx = 4,
-		.lpv = 0,
-	},
-	{
-		/* 5: a == b, b != c, c == d, d != e */
-		.pnum = UINT64_C(0x0001000200010002),
-		.idx = 4,
-		.lpv = 1,
-	},
-	{
-		/* 6: a != b, b == c, c == d, d != e */
-		.pnum = UINT64_C(0x0001000200030001),
-		.idx = 4,
-		.lpv = 0,
-	},
-	{
-		/* 7: a == b, b == c, c == d, d != e */
-		.pnum = UINT64_C(0x0001000200030004),
-		.idx = 4,
-		.lpv = 3,
-	},
-	{
-		/* 8: a != b, b != c, c != d, d == e */
-		.pnum = UINT64_C(0x0002000100010001),
-		.idx = 3,
-		.lpv = 0,
-	},
-	{
-		/* 9: a == b, b != c, c != d, d == e */
-		.pnum = UINT64_C(0x0002000100010002),
-		.idx = 3,
-		.lpv = 1,
-	},
-	{
-		/* 0xa: a != b, b == c, c != d, d == e */
-		.pnum = UINT64_C(0x0002000100020001),
-		.idx = 3,
-		.lpv = 0,
-	},
-	{
-		/* 0xb: a == b, b == c, c != d, d == e */
-		.pnum = UINT64_C(0x0002000100020003),
-		.idx = 3,
-		.lpv = 2,
-	},
-	{
-		/* 0xc: a != b, b != c, c == d, d == e */
-		.pnum = UINT64_C(0x0002000300010001),
-		.idx = 2,
-		.lpv = 0,
-	},
-	{
-		/* 0xd: a == b, b != c, c == d, d == e */
-		.pnum = UINT64_C(0x0002000300010002),
-		.idx = 2,
-		.lpv = 1,
-	},
-	{
-		/* 0xe: a != b, b == c, c == d, d == e */
-		.pnum = UINT64_C(0x0002000300040001),
-		.idx = 1,
-		.lpv = 0,
-	},
-	{
-		/* 0xf: a == b, b == c, c == d, d == e */
-		.pnum = UINT64_C(0x0002000300040005),
-		.idx = 0,
-		.lpv = 4,
-	},
-};
-
 static __rte_always_inline void
 send_packetsx4(struct lcore_conf *qconf, uint16_t port, struct rte_mbuf *m[],
 		uint32_t num)
diff --git a/examples/l3fwd/l3fwd_neon.h b/examples/l3fwd/l3fwd_neon.h
index e3d33a5229..ce515e0bc4 100644
--- a/examples/l3fwd/l3fwd_neon.h
+++ b/examples/l3fwd/l3fwd_neon.h
@@ -7,6 +7,7 @@
 #define _L3FWD_NEON_H_

 #include "l3fwd.h"
+#include "neon/port_group.h"
 #include "l3fwd_common.h"

 /*
@@ -62,44 +63,6 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 			&dst_port[3], pkt[3]->packet_type);
 }

-/*
- * Group consecutive packets with the same destination port in bursts of 4.
- * Suppose we have array of destination ports:
- * dst_port[] = {a, b, c, d,, e, ... }
- * dp1 should contain: <a, b, c, d>, dp2: <b, c, d, e>.
- * We doing 4 comparisons at once and the result is 4 bit mask.
- * This mask is used as an index into prebuild array of pnum values.
- */
-static inline uint16_t *
-port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, uint16x8_t dp1,
-	     uint16x8_t dp2)
-{
-	union {
-		uint16_t u16[FWDSTEP + 1];
-		uint64_t u64;
-	} *pnum = (void *)pn;
-
-	int32_t v;
-	uint16x8_t mask = {1, 2, 4, 8, 0, 0, 0, 0};
-
-	dp1 = vceqq_u16(dp1, dp2);
-	dp1 = vandq_u16(dp1, mask);
-	v = vaddvq_u16(dp1);
-
-	/* update last port counter. */
-	lp[0] += gptbl[v].lpv;
-	rte_compiler_barrier();
-
-	/* if dest port value has changed. */
-	if (v != GRPMSK) {
-		pnum->u64 = gptbl[v].pnum;
-		pnum->u16[FWDSTEP] = 1;
-		lp = pnum->u16 + gptbl[v].idx;
-	}
-
-	return lp;
-}
-
 /**
  * Process one packet:
  * Update source and destination MAC addresses in the ethernet header.
diff --git a/examples/l3fwd/l3fwd_sse.h b/examples/l3fwd/l3fwd_sse.h
index d5a717e18c..0f0d0323a2 100644
--- a/examples/l3fwd/l3fwd_sse.h
+++ b/examples/l3fwd/l3fwd_sse.h
@@ -7,6 +7,7 @@
 #define _L3FWD_SSE_H_

 #include "l3fwd.h"
+#include "sse/port_group.h"
 #include "l3fwd_common.h"

 /*
@@ -62,41 +63,6 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 			&dst_port[3], pkt[3]->packet_type);
 }

-/*
- * Group consecutive packets with the same destination port in bursts of 4.
- * Suppose we have array of destination ports:
- * dst_port[] = {a, b, c, d,, e, ... }
- * dp1 should contain: <a, b, c, d>, dp2: <b, c, d, e>.
- * We doing 4 comparisons at once and the result is 4 bit mask.
- * This mask is used as an index into prebuild array of pnum values.
- */
-static inline uint16_t *
-port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, __m128i dp1, __m128i dp2)
-{
-	union {
-		uint16_t u16[FWDSTEP + 1];
-		uint64_t u64;
-	} *pnum = (void *)pn;
-
-	int32_t v;
-
-	dp1 = _mm_cmpeq_epi16(dp1, dp2);
-	dp1 = _mm_unpacklo_epi16(dp1, dp1);
-	v = _mm_movemask_ps((__m128)dp1);
-
-	/* update last port counter. */
-	lp[0] += gptbl[v].lpv;
-
-	/* if dest port value has changed. */
-	if (v != GRPMSK) {
-		pnum->u64 = gptbl[v].pnum;
-		pnum->u16[FWDSTEP] = 1;
-		lp = pnum->u16 + gptbl[v].idx;
-	}
-
-	return lp;
-}
-
 /**
  * Process one packet:
  * Update source and destination MAC addresses in the ethernet header.
diff --git a/examples/meson.build b/examples/meson.build
index 78de0e1f37..81e93799f2 100644
--- a/examples/meson.build
+++ b/examples/meson.build
@@ -97,7 +97,7 @@ foreach example: examples
     ldflags = default_ldflags

     ext_deps = []
-    includes = [include_directories(example)]
+    includes = [include_directories(example, 'common')]
     deps = ['eal', 'mempool', 'net', 'mbuf', 'ethdev', 'cmdline']
     subdir(example)

--
2.25.1