DPDK patches and discussions
From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	Jerin Jacob Kollanukkaran <jerinj@marvell.com>, nd <nd@arm.com>,
	Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
Cc: "dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>, nd <nd@arm.com>
Subject: RE: [PATCH v2 2/3] ip_frag: improve reassembly lookup performance
Date: Tue, 23 May 2023 17:58:34 +0000
Message-ID: <CO6PR18MB4084CBF6E5D072BEE00C0A2BDE409@CO6PR18MB4084.namprd18.prod.outlook.com>
In-Reply-To: <DBAPR08MB5814890A343C28626EF9A86D98409@DBAPR08MB5814.eurprd08.prod.outlook.com>

> > -----Original Message-----
> > From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> > Sent: Tuesday, May 23, 2023 9:39 AM
> > To: jerinj@marvell.com; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>; Konstantin Ananyev
> > <konstantin.v.ananyev@yandex.ru>
> > Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> > Subject: [PATCH v2 2/3] ip_frag: improve reassembly lookup performance
> >
> > From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >
> > Improve reassembly lookup performance by using NEON intrinsics for key
> > validation.
> What improvement do you see with this?

On Neoverse-N2, I see an improvement of roughly 300-600 cycles per flow and ~200 cycles per fragment insert.
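
As a rough cross-check of those figures, the sketch below subtracts the "with patch" numbers from the "without patch" numbers for a few IPv4 rows copied from the tables in this mail. The struct, the function names and the choice of rows are illustrative only and are not part of the patch. Note that the last row (LINEAR, 8 fragments, 100 outstanding) is slower with the patch in these results.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One benchmark row: cycles measured without and with the patch. */
struct row {
	const char *name;
	unsigned int before;
	unsigned int after;
};

/* IPv4 values copied from the result tables in this mail. */
static const struct row ipv4_rows[] = {
	{ "LINEAR/2 cycles per flow",     1244,  950 },
	{ "RANDOM/2 cycles per flow",     1653, 1013 },
	{ "LINEAR/2 cycles per insert",    919,  717 },
	{ "LINEAR/8/100 cycles per flow",  829, 1019 },
};

/* Cycles saved by the patch; a negative value means a slowdown. */
static int
cycles_saved(const struct row *r)
{
	return (int)r->before - (int)r->after;
}

/* Print the saving for every row and return the smallest one seen. */
static int
min_saving(const struct row *rows, size_t n)
{
	int min;
	size_t i;

	assert(n > 0);
	min = cycles_saved(&rows[0]);
	for (i = 0; i < n; i++) {
		int d = cycles_saved(&rows[i]);

		printf("%-30s %5u -> %5u  (%+d)\n", rows[i].name,
		       rows[i].before, rows[i].after, d);
		if (d < min)
			min = d;
	}
	return min;
}
```

Calling `min_saving(ipv4_rows, 4)` from a `main()` prints one line per row. The same arithmetic can be repeated for any other row of the tables.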

Here are some test results.

Without patch:
+==========================================================================================================+
| IPV4                            | Flow Count : 32768                                                     |
+================+================+=============+=============+========================+===================+
| Fragment Order | Fragments/Flow | Outstanding | Cycles/Flow | Cycles/Fragment insert | Cycles/Reassembly |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 0           | 1244        | 919                    | 114               |
+================+================+=============+=============+========================+===================+
| RANDOM         | 2              | 0           | 1653        | 968                    | 128               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 3              | 0           | 1379        | 503                    | 110               |
+================+================+=============+=============+========================+===================+
| RANDOM         | 3              | 0           | 1613        | 520                    | 139               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 0           | 2030        | 199                    | 190               |
+================+================+=============+=============+========================+===================+
| RANDOM         | 8              | 0           | 4393        | 309                    | 402               |
+================+================+=============+=============+========================+===================+
| LINEAR         | RANDOM         | 0           | 1531        | 333                    | 147               |
+================+================+=============+=============+========================+===================+
| RANDOM         | RANDOM         | 0           | 2771        | 357                    | 213               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 100         | 1228        | 920                    | 102               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 500         | 1197        | 905                    | 103               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 1000        | 1183        | 904                    | 104               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 2000        | 1153        | 921                    | 105               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 3000        | 1123        | 911                    | 111               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 100         | 829         | 193                    | 690               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 500         | 830         | 195                    | 682               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 1000        | 817         | 211                    | 690               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 2000        | 819         | 195                    | 690               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 3000        | 823         | 223                    | 676               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 2              | 0           | 1765        | 1038                   | 177               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 3              | 0           | 2588        | 699                    | 190               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 8              | 0           | 5253        | 265                    | 403               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | RANDOM         | 0           | 3398        | 493                    | 301               |
+================+================+=============+=============+========================+===================+

+==========================================================================================================+
| IPV6                            | Flow Count : 32768                                                     |
+================+================+=============+=============+========================+===================+
| Fragment Order | Fragments/Flow | Outstanding | Cycles/Flow | Cycles/Fragment insert | Cycles/Reassembly |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 0           | 1838        | 1176                   | 136               |
+================+================+=============+=============+========================+===================+
| RANDOM         | 2              | 0           | 1892        | 1188                   | 160               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 3              | 0           | 1986        | 628                    | 143               |
+================+================+=============+=============+========================+===================+
| RANDOM         | 3              | 0           | 2670        | 646                    | 155               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 0           | 3152        | 261                    | 271               |
+================+================+=============+=============+========================+===================+
| RANDOM         | 8              | 0           | 5127        | 324                    | 434               |
+================+================+=============+=============+========================+===================+
| LINEAR         | RANDOM         | 0           | 2169        | 427                    | 203               |
+================+================+=============+=============+========================+===================+
| RANDOM         | RANDOM         | 0           | 3382        | 452                    | 255               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 100         | 1837        | 1164                   | 124               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 500         | 1790        | 1158                   | 126               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 1000        | 1807        | 1161                   | 138               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 2000        | 1776        | 1160                   | 138               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 3000        | 1715        | 1169                   | 144               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 100         | 1488        | 256                    | 1228              |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 500         | 1461        | 300                    | 1205              |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 1000        | 1457        | 303                    | 1202              |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 2000        | 1456        | 305                    | 1201              |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 3000        | 1460        | 308                    | 1205              |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 2              | 0           | 2145        | 1330                   | 296               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 3              | 0           | 2778        | 830                    | 330               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 8              | 0           | 5715        | 324                    | 444               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | RANDOM         | 0           | 3625        | 550                    | 363               |
+================+================+=============+=============+========================+===================+

With patch:

+==========================================================================================================+
| IPV4                            | Flow Count : 32768                                                     |
+================+================+=============+=============+========================+===================+
| Fragment Order | Fragments/Flow | Outstanding | Cycles/Flow | Cycles/Fragment insert | Cycles/Reassembly |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 0           | 950         | 717                    | 98                |
+================+================+=============+=============+========================+===================+
| RANDOM         | 2              | 0           | 1013        | 706                    | 108               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 3              | 0           | 1096        | 397                    | 115               |
+================+================+=============+=============+========================+===================+
| RANDOM         | 3              | 0           | 1150        | 412                    | 128               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 0           | 1783        | 166                    | 202               |
+================+================+=============+=============+========================+===================+
| RANDOM         | 8              | 0           | 3933        | 284                    | 424               |
+================+================+=============+=============+========================+===================+
| LINEAR         | RANDOM         | 0           | 1288        | 267                    | 159               |
+================+================+=============+=============+========================+===================+
| RANDOM         | RANDOM         | 0           | 2393        | 302                    | 235               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 100         | 956         | 703                    | 110               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 500         | 937         | 693                    | 112               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 1000        | 912         | 670                    | 121               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 2000        | 908         | 688                    | 122               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 3000        | 894         | 688                    | 128               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 100         | 1019        | 179                    | 865               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 500         | 1052        | 176                    | 895               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 1000        | 1130        | 180                    | 1003              |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 2000        | 1143        | 180                    | 1020              |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 3000        | 1130        | 181                    | 985               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 2              | 0           | 1582        | 710                    | 168               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 3              | 0           | 2162        | 446                    | 194               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 8              | 0           | 4997        | 214                    | 426               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | RANDOM         | 0           | 2921        | 341                    | 311               |
+================+================+=============+=============+========================+===================+

+==========================================================================================================+
| IPV6                            | Flow Count : 32768                                                     |
+================+================+=============+=============+========================+===================+
| Fragment Order | Fragments/Flow | Outstanding | Cycles/Flow | Cycles/Fragment insert | Cycles/Reassembly |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 0           | 1275        | 687                    | 125               |
+================+================+=============+=============+========================+===================+
| RANDOM         | 2              | 0           | 1335        | 721                    | 169               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 3              | 0           | 1388        | 415                    | 169               |
+================+================+=============+=============+========================+===================+
| RANDOM         | 3              | 0           | 2117        | 393                    | 163               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 0           | 2811        | 172                    | 241               |
+================+================+=============+=============+========================+===================+
| RANDOM         | 8              | 0           | 4322        | 227                    | 401               |
+================+================+=============+=============+========================+===================+
| LINEAR         | RANDOM         | 0           | 1730        | 270                    | 192               |
+================+================+=============+=============+========================+===================+
| RANDOM         | RANDOM         | 0           | 2839        | 317                    | 264               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 100         | 1152        | 662                    | 126               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 500         | 1107        | 658                    | 130               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 1000        | 1190        | 647                    | 138               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 2000        | 1086        | 635                    | 141               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 2              | 3000        | 1064        | 645                    | 150               |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 100         | 1560        | 172                    | 1296              |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 500         | 1536        | 226                    | 1274              |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 1000        | 1543        | 228                    | 1282              |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 2000        | 1548        | 228                    | 1287              |
+================+================+=============+=============+========================+===================+
| LINEAR         | 8              | 3000        | 1541        | 227                    | 1280              |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 2              | 0           | 1585        | 769                    | 281               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 3              | 0           | 2222        | 536                    | 327               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | 8              | 0           | 4962        | 232                    | 439               |
+================+================+=============+=============+========================+===================+
| INTERLEAVED    | RANDOM         | 0           | 2998        | 373                    | 360               |
+================+================+=============+=============+========================+===================+
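
For readers following the quoted patch below, the lane-mask walk in its NEON loop can be modelled in portable scalar C. This is only a sketch under assumptions: it assumes the `vceqzq_u32` + `vshrn_n_u32(..., 16)` step yields one 16-bit lane per entry (all-ones when the entry's key length is zero), which is why the patch masks with 0x8000800080008000 and shifts the `__builtin_ctzll` result right by 4. The helper names here are hypothetical, and `__builtin_ctzll` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Top bit of each of the four 16-bit lanes in a 64-bit word. */
#define LANE_MSB_MASK UINT64_C(0x8000800080008000)

/*
 * Scalar stand-in for the compare-with-zero and narrowing step:
 * lane i (bits 16*i .. 16*i+15) is all-ones when key_len[i] == 0,
 * i.e. the slot is empty, and all-zeros otherwise.
 */
static uint64_t
pack_zero_mask(const uint32_t key_len[4])
{
	uint64_t zmask = 0;
	int i;

	for (i = 0; i < 4; i++)
		if (key_len[i] == 0)
			zmask |= UINT64_C(0xffff) << (16 * i);
	return zmask;
}

/* Index (0..3) of the lowest lane whose top bit is set. */
static unsigned int
first_lane(uint64_t lane_mask)
{
	assert((lane_mask & LANE_MSB_MASK) != 0);
	/* Bit 15 + 16*i shifted right by 4 gives i. */
	return (unsigned int)__builtin_ctzll(lane_mask & LANE_MSB_MASK) >> 4;
}

/* Collect the occupied lanes in ascending order; returns their count. */
static unsigned int
occupied_lanes(uint64_t zmask, unsigned int out[4])
{
	uint64_t vmask = ~zmask & LANE_MSB_MASK;
	unsigned int n = 0;

	/* vmask &= vmask - 1 clears the lowest set bit on each pass. */
	for (; vmask != 0; vmask &= vmask - 1)
		out[n++] = first_lane(vmask);
	return n;
}
```

Keeping a single bit per lane before the walk means each `vmask &= vmask - 1` step retires exactly one entry, so only occupied slots reach the full key comparison.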

> 
> >
> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > ---
> >  lib/ip_frag/ip_frag_internal.c   | 224 +++++++++++++++++++++++++------
> >  lib/ip_frag/ip_reassembly.h      |   6 +
> >  lib/ip_frag/rte_ip_frag_common.c |  10 ++
> >  3 files changed, 196 insertions(+), 44 deletions(-)
> >
> > diff --git a/lib/ip_frag/ip_frag_internal.c b/lib/ip_frag/ip_frag_internal.c
> > index 7cbef647df..de78a0ed8f 100644
> > --- a/lib/ip_frag/ip_frag_internal.c
> > +++ b/lib/ip_frag/ip_frag_internal.c
> > @@ -4,8 +4,9 @@
> >
> >  #include <stddef.h>
> >
> > -#include <rte_jhash.h>
> >  #include <rte_hash_crc.h>
> > +#include <rte_jhash.h>
> > +#include <rte_vect.h>
> >
> >  #include "ip_frag_common.h"
> >
> > @@ -280,10 +281,166 @@ ip_frag_find(struct rte_ip_frag_tbl *tbl, struct rte_ip_frag_death_row *dr,
> >  	return pkt;
> >  }
> >
> > -struct ip_frag_pkt *
> > -ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
> > -	const struct ip_frag_key *key, uint64_t tms,
> > -	struct ip_frag_pkt **free, struct ip_frag_pkt **stale)
> > +static inline void
> > +ip_frag_dbg(struct rte_ip_frag_tbl *tbl, struct ip_frag_pkt *p,
> > +	    uint32_t list_idx, uint32_t list_cnt) {
> > +	RTE_SET_USED(tbl);
> > +	RTE_SET_USED(list_idx);
> > +	RTE_SET_USED(list_cnt);
> > +	if (p->key.key_len == IPV4_KEYLEN)
> > +		IP_FRAG_LOG(DEBUG,
> > +			    "%s:%d:\n"
> > +			    "tbl: %p, max_entries: %u, use_entries: %u\n"
> > +			    "ipv4_frag_pkt line0: %p, index: %u from %u\n"
> > +			    "key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
> > +			    __func__, __LINE__, tbl, tbl->max_entries,
> > +			    tbl->use_entries, p, list_idx, list_cnt,
> > +			    p->key.src_dst[0], p->key.id, p->start);
> > +	else
> > +		IP_FRAG_LOG(DEBUG,
> > +			    "%s:%d:\n"
> > +			    "tbl: %p, max_entries: %u, use_entries: %u\n"
> > +			    "ipv6_frag_pkt line0: %p, index: %u from %u\n"
> > +			    "key: <" IPv6_KEY_BYTES_FMT
> > +			    ", %#x>, start: %" PRIu64 "\n",
> > +			    __func__, __LINE__, tbl, tbl->max_entries,
> > +			    tbl->use_entries, p, list_idx, list_cnt,
> > +			    IPv6_KEY_BYTES(p->key.src_dst), p->key.id,
> > +			    p->start);
> > +}
> > +
> > +#if defined(RTE_ARCH_ARM64)
> > +static inline struct ip_frag_pkt *
> > +ip_frag_lookup_neon(struct rte_ip_frag_tbl *tbl, const struct ip_frag_key *key, uint64_t tms,
> > +		    struct ip_frag_pkt **free, struct ip_frag_pkt **stale) {
> > +	struct ip_frag_pkt *empty, *old;
> > +	struct ip_frag_pkt *p1, *p2;
> > +	uint32_t assoc, sig1, sig2;
> > +	uint64_t max_cycles;
> > +
> > +	empty = NULL;
> > +	old = NULL;
> > +
> > +	max_cycles = tbl->max_cycles;
> > +	assoc = tbl->bucket_entries;
> > +
> > +	if (tbl->last != NULL && ip_frag_key_cmp(key, &tbl->last->key) == 0)
> > +		return tbl->last;
> > +
> > +	/* different hashing methods for IPv4 and IPv6 */
> > +	if (key->key_len == IPV4_KEYLEN)
> > +		ipv4_frag_hash(key, &sig1, &sig2);
> > +	else
> > +		ipv6_frag_hash(key, &sig1, &sig2);
> > +
> > +	p1 = IP_FRAG_TBL_POS(tbl, sig1);
> > +	p2 = IP_FRAG_TBL_POS(tbl, sig2);
> > +
> > +	uint64x2_t key0, key1, key2, key3;
> > +	uint64_t vmask, zmask, ts_mask;
> > +	uint64x2_t ts0, ts1;
> > +	uint32x4_t nz_key;
> > +	uint8_t idx;
> > +	/* Bucket entries are always power of 2. */
> > +	rte_prefetch0(&p1[0].key);
> > +	rte_prefetch0(&p1[1].key);
> > +	rte_prefetch0(&p2[0].key);
> > +	rte_prefetch0(&p2[1].key);
> > +
> > +	while (assoc > 1) {
> > +		if (assoc > 2) {
> > +			rte_prefetch0(&p1[2].key);
> > +			rte_prefetch0(&p1[3].key);
> > +			rte_prefetch0(&p2[2].key);
> > +			rte_prefetch0(&p2[3].key);
> > +		}
> > +		struct ip_frag_pkt *p[] = {&p1[0], &p2[0], &p1[1], &p2[1]};
> > +		key0 = vld1q_u64(&p[0]->key.id_key_len);
> > +		key1 = vld1q_u64(&p[1]->key.id_key_len);
> > +		key2 = vld1q_u64(&p[2]->key.id_key_len);
> > +		key3 = vld1q_u64(&p[3]->key.id_key_len);
> > +
> > +		nz_key = vsetq_lane_u32(vgetq_lane_u32(vreinterpretq_u32_u64(key0), 1), nz_key, 0);
> > +		nz_key = vsetq_lane_u32(vgetq_lane_u32(vreinterpretq_u32_u64(key1), 1), nz_key, 1);
> > +		nz_key = vsetq_lane_u32(vgetq_lane_u32(vreinterpretq_u32_u64(key2), 1), nz_key, 2);
> > +		nz_key = vsetq_lane_u32(vgetq_lane_u32(vreinterpretq_u32_u64(key3), 1), nz_key, 3);
> > +
> > +		nz_key = vceqzq_u32(nz_key);
> > +		zmask = vget_lane_u64(vreinterpret_u64_u16(vshrn_n_u32(nz_key, 16)), 0);
> > +		vmask = ~zmask;
> > +
> > +		vmask &= 0x8000800080008000;
> > +		for (; vmask > 0; vmask &= vmask - 1) {
> > +			idx = __builtin_ctzll(vmask) >> 4;
> > +			if (ip_frag_key_cmp(key, &p[idx]->key) == 0)
> > +				return p[idx];
> > +		}
> > +
> > +		vmask = ~zmask;
> > +		if (zmask && empty == NULL) {
> > +			zmask &= 0x8000800080008000;
> > +			idx = __builtin_ctzll(zmask) >> 4;
> > +			empty = p[idx];
> > +		}
> > +
> > +		if (vmask && old == NULL) {
> > +			const uint64x2_t max_cyc = vdupq_n_u64(max_cycles);
> > +			const uint64x2_t cur_cyc = vdupq_n_u64(tms);
> > +
> > +			ts0 = vsetq_lane_u64(vgetq_lane_u64(key0, 1), ts0, 0);
> > +			ts0 = vsetq_lane_u64(vgetq_lane_u64(key1, 1), ts0, 1);
> > +			ts1 = vsetq_lane_u64(vgetq_lane_u64(key2, 1), ts1, 0);
> > +			ts1 = vsetq_lane_u64(vgetq_lane_u64(key3, 1), ts1, 1);
> > +
> > +			ts0 = vcgtq_u64(cur_cyc, vaddq_u64(ts0, max_cyc));
> > +			ts1 = vcgtq_u64(cur_cyc, vaddq_u64(ts1, max_cyc));
> > +
> > +			ts_mask = vget_lane_u64(vreinterpret_u64_u16(vshrn_n_u32(
> > +							vuzp1q_u32(vreinterpretq_u32_u64(ts0),
> > +								   vreinterpretq_u32_u64(ts1)),
> > +							16)),
> > +						0);
> > +			vmask &= 0x8000800080008000;
> > +			ts_mask &= vmask;
> > +			if (ts_mask) {
> > +				idx = __builtin_ctzll(ts_mask) >> 4;
> > +				old = p[idx];
> > +			}
> > +		}
> > +		p1 += 2;
> > +		p2 += 2;
> > +		assoc -= 4;
> > +	}
> > +	while (assoc) {
> > +		if (ip_frag_key_cmp(key, &p1->key) == 0)
> > +			return p1;
> > +		else if (ip_frag_key_is_empty(&p1->key))
> > +			empty = (empty == NULL) ? p1 : empty;
> > +		else if (max_cycles + p1->start < tms)
> > +			old = (old == NULL) ? p1 : old;
> > +
> > +		if (ip_frag_key_cmp(key, &p2->key) == 0)
> > +			return p2;
> > +		else if (ip_frag_key_is_empty(&p2->key))
> > +			empty = (empty == NULL) ? p2 : empty;
> > +		else if (max_cycles + p2->start < tms)
> > +			old = (old == NULL) ? p2 : old;
> > +		p1++;
> > +		p2++;
> > +		assoc--;
> > +	}
> > +
> > +	*free = empty;
> > +	*stale = old;
> > +	return NULL;
> > +}
> > +#endif
> > +
> > +static struct ip_frag_pkt *
> > +ip_frag_lookup_scalar(struct rte_ip_frag_tbl *tbl, const struct ip_frag_key *key, uint64_t tms,
> > +		      struct ip_frag_pkt **free, struct ip_frag_pkt **stale)
> >  {
> >  	struct ip_frag_pkt *p1, *p2;
> >  	struct ip_frag_pkt *empty, *old;
> > @@ -309,25 +466,7 @@ ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
> >  	p2 = IP_FRAG_TBL_POS(tbl, sig2);
> >
> >  	for (i = 0; i != assoc; i++) {
> > -		if (p1->key.key_len == IPV4_KEYLEN)
> > -			IP_FRAG_LOG(DEBUG, "%s:%d:\n"
> > -					"tbl: %p, max_entries: %u, use_entries: %u\n"
> > -					"ipv4_frag_pkt line0: %p, index: %u from %u\n"
> > -			"key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
> > -					__func__, __LINE__,
> > -					tbl, tbl->max_entries, tbl->use_entries,
> > -					p1, i, assoc,
> > -			p1[i].key.src_dst[0], p1[i].key.id, p1[i].start);
> > -		else
> > -			IP_FRAG_LOG(DEBUG, "%s:%d:\n"
> > -					"tbl: %p, max_entries: %u, use_entries: %u\n"
> > -					"ipv6_frag_pkt line0: %p, index: %u from %u\n"
> > -			"key: <" IPv6_KEY_BYTES_FMT ", %#x>, start: %" PRIu64 "\n",
> > -					__func__, __LINE__,
> > -					tbl, tbl->max_entries, tbl->use_entries,
> > -					p1, i, assoc,
> > -			IPv6_KEY_BYTES(p1[i].key.src_dst), p1[i].key.id, p1[i].start);
> > -
> > +		ip_frag_dbg(tbl, &p1[i], i, assoc);
> >  		if (ip_frag_key_cmp(key, &p1[i].key) == 0)
> >  			return p1 + i;
> >  		else if (ip_frag_key_is_empty(&p1[i].key))
> > @@ -335,29 +474,11 @@ ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
> >  		else if (max_cycles + p1[i].start < tms)
> >  			old = (old == NULL) ? (p1 + i) : old;
> >
> > -		if (p2->key.key_len == IPV4_KEYLEN)
> > -			IP_FRAG_LOG(DEBUG, "%s:%d:\n"
> > -					"tbl: %p, max_entries: %u, use_entries: %u\n"
> > -					"ipv4_frag_pkt line1: %p, index: %u from %u\n"
> > -			"key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
> > -					__func__, __LINE__,
> > -					tbl, tbl->max_entries, tbl->use_entries,
> > -					p2, i, assoc,
> > -			p2[i].key.src_dst[0], p2[i].key.id, p2[i].start);
> > -		else
> > -			IP_FRAG_LOG(DEBUG, "%s:%d:\n"
> > -					"tbl: %p, max_entries: %u, use_entries: %u\n"
> > -					"ipv6_frag_pkt line1: %p, index: %u from %u\n"
> > -			"key: <" IPv6_KEY_BYTES_FMT ", %#x>, start: %" PRIu64 "\n",
> > -					__func__, __LINE__,
> > -					tbl, tbl->max_entries, tbl->use_entries,
> > -					p2, i, assoc,
> > -			IPv6_KEY_BYTES(p2[i].key.src_dst), p2[i].key.id, p2[i].start);
> > -
> > +		ip_frag_dbg(tbl, &p2[i], i, assoc);
> >  		if (ip_frag_key_cmp(key, &p2[i].key) == 0)
> >  			return p2 + i;
> >  		else if (ip_frag_key_is_empty(&p2[i].key))
> > -			empty = (empty == NULL) ?( p2 + i) : empty;
> > +			empty = (empty == NULL) ? (p2 + i) : empty;
> >  		else if (max_cycles + p2[i].start < tms)
> >  			old = (old == NULL) ? (p2 + i) : old;
> >  	}
> > @@ -366,3 +487,18 @@ ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
> >  	*stale = old;
> >  	return NULL;
> >  }
> > +
> > +struct ip_frag_pkt *
> > +ip_frag_lookup(struct rte_ip_frag_tbl *tbl, const struct ip_frag_key *key, uint64_t tms,
> > +	       struct ip_frag_pkt **free, struct ip_frag_pkt **stale)
> > +{
> > +	switch (tbl->lookup_fn) {
> > +#if defined(RTE_ARCH_ARM64)
> > +	case REASSEMBLY_LOOKUP_NEON:
> > +		return ip_frag_lookup_neon(tbl, key, tms, free, stale);
> > +#endif
> > +	case REASSEMBLY_LOOKUP_SCALAR:
> > +	default:
> > +		return ip_frag_lookup_scalar(tbl, key, tms, free, stale);
> > +	}
> > +}
> > diff --git a/lib/ip_frag/ip_reassembly.h b/lib/ip_frag/ip_reassembly.h
> > index ef9d8c0d75..049437ae32 100644
> > --- a/lib/ip_frag/ip_reassembly.h
> > +++ b/lib/ip_frag/ip_reassembly.h
> > @@ -12,6 +12,11 @@
> >
> >  #include <rte_ip_frag.h>
> >
> > +enum ip_frag_lookup_func {
> > +	REASSEMBLY_LOOKUP_SCALAR = 0,
> > +	REASSEMBLY_LOOKUP_NEON,
> > +};
> > +
> >  enum {
> >  	IP_LAST_FRAG_IDX,    /* index of last fragment */
> >  	IP_FIRST_FRAG_IDX,   /* index of first fragment */
> > @@ -83,6 +88,7 @@ struct rte_ip_frag_tbl {
> >  	struct ip_frag_pkt *last;     /* last used entry. */
> >  	struct ip_pkt_list lru;       /* LRU list for table entries. */
> >  	struct ip_frag_tbl_stat stat; /* statistics counters. */
> > +	enum ip_frag_lookup_func lookup_fn;	/* hash table lookup function. */
> >  	__extension__ struct ip_frag_pkt pkt[]; /* hash table. */
> >  };
> >
> > diff --git a/lib/ip_frag/rte_ip_frag_common.c b/lib/ip_frag/rte_ip_frag_common.c
> > index c1de2e81b6..ef3c104e45 100644
> > --- a/lib/ip_frag/rte_ip_frag_common.c
> > +++ b/lib/ip_frag/rte_ip_frag_common.c
> > @@ -5,7 +5,9 @@
> >  #include <stddef.h>
> >  #include <stdio.h>
> >
> > +#include <rte_cpuflags.h>
> >  #include <rte_log.h>
> > +#include <rte_vect.h>
> >
> >  #include "ip_frag_common.h"
> >
> > @@ -75,6 +77,14 @@ rte_ip_frag_table_create(uint32_t bucket_num, uint32_t bucket_entries,
> >  	tbl->bucket_entries = bucket_entries;
> >  	tbl->entry_mask = (tbl->nb_entries - 1) & ~(tbl->bucket_entries  - 1);
> >
> > +#if defined(RTE_ARCH_ARM64)
> > +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_NEON) &&
> > +	    rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_128)
> > +		tbl->lookup_fn = REASSEMBLY_LOOKUP_NEON;
> > +	else
> > +#endif
> > +		tbl->lookup_fn = REASSEMBLY_LOOKUP_SCALAR;
> > +
> >  	TAILQ_INIT(&(tbl->lru));
> >  	return tbl;
> >  }
> > --
> > 2.25.1
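
The quoted patch picks the lookup variant once, when the table is created, and then dispatches through a switch on every lookup. A minimal standalone sketch of that pattern is below. It is not the DPDK code: the `demo_*` names are hypothetical stand-ins, the lookup bodies are placeholders that only report which variant ran, and the CPU capability is passed in as plain arguments instead of being queried through rte_cpu_get_flag_enabled() and rte_vect_get_max_simd_bitwidth().

```c
#include <stdbool.h>

/* Hypothetical stand-in for enum ip_frag_lookup_func in the patch. */
enum demo_lookup_func {
	DEMO_LOOKUP_SCALAR = 0,
	DEMO_LOOKUP_VECTOR,
};

/* Hypothetical stand-in for the table; only the selector field is modelled. */
struct demo_tbl {
	enum demo_lookup_func lookup_fn;
};

/* Placeholder lookups: each one just reports which variant was called. */
static enum demo_lookup_func demo_lookup_scalar(void)
{
	return DEMO_LOOKUP_SCALAR;
}

static enum demo_lookup_func demo_lookup_vector(void)
{
	return DEMO_LOOKUP_VECTOR;
}

/*
 * Choose the variant once at creation time. The vector path is used only
 * when the CPU reports SIMD support and the allowed SIMD width is at
 * least 128 bits; otherwise fall back to scalar.
 */
static void demo_tbl_init(struct demo_tbl *tbl, bool cpu_has_simd,
			  unsigned int max_simd_bits)
{
	if (cpu_has_simd && max_simd_bits >= 128)
		tbl->lookup_fn = DEMO_LOOKUP_VECTOR;
	else
		tbl->lookup_fn = DEMO_LOOKUP_SCALAR;
}

/* Per-lookup dispatch: unknown selector values fall through to scalar. */
static enum demo_lookup_func demo_tbl_lookup(const struct demo_tbl *tbl)
{
	switch (tbl->lookup_fn) {
	case DEMO_LOOKUP_VECTOR:
		return demo_lookup_vector();
	case DEMO_LOOKUP_SCALAR:
	default:
		return demo_lookup_scalar();
	}
}
```

Making scalar the default case means a table whose selector was never set (zero-initialised) still gets a working lookup, which appears to be why the patch gives REASSEMBLY_LOOKUP_SCALAR the value 0.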


Thread overview: 28+ messages
2023-05-23 12:54 [PATCH 1/3] ip_frag: optimize key compare and hash generation pbhagavatula
2023-05-23 12:54 ` [PATCH 2/3] ip_frag: improve reassembly lookup performance pbhagavatula
2023-05-23 12:54 ` [PATCH 3/3] test: add reassembly perf test pbhagavatula
2023-05-23 14:39 ` [PATCH v2 1/3] ip_frag: optimize key compare and hash generation pbhagavatula
2023-05-23 14:39   ` [PATCH v2 2/3] ip_frag: improve reassembly lookup performance pbhagavatula
2023-05-23 16:22     ` Honnappa Nagarahalli
2023-05-23 17:58       ` Pavan Nikhilesh Bhagavatula [this message]
2023-05-23 22:23         ` Pavan Nikhilesh Bhagavatula
2023-05-23 22:30     ` Stephen Hemminger
2023-05-29 13:17       ` [EXT] " Pavan Nikhilesh Bhagavatula
2023-05-23 14:39   ` [PATCH v2 3/3] test: add reassembly perf test pbhagavatula
2023-05-29 14:55   ` [PATCH v3 1/2] ip_frag: optimize key compare and hash generation pbhagavatula
2023-05-29 14:55     ` [PATCH v3 2/2] test: add reassembly perf test pbhagavatula
2023-05-30 10:51       ` [EXT] " Amit Prakash Shukla
2023-05-30  3:09     ` [PATCH v3 1/2] ip_frag: optimize key compare and hash generation Stephen Hemminger
2023-05-30 17:50       ` [EXT] " Pavan Nikhilesh Bhagavatula
2023-05-30  7:44     ` Ruifeng Wang
2023-05-31  4:26     ` [PATCH v4 " pbhagavatula
2023-05-31  4:26       ` [PATCH v4 2/2] test: add reassembly perf test pbhagavatula
2023-06-05 11:12         ` Константин Ананьев
2023-06-02 17:01       ` [PATCH v5 1/2] ip_frag: optimize key compare and hash generation pbhagavatula
2023-06-02 17:01         ` [PATCH v5 2/2] test: add reassembly perf test pbhagavatula
2023-06-27  9:36           ` Konstantin Ananyev
2023-06-05 11:09         ` [PATCH v5 1/2] ip_frag: optimize key compare and hash generation Константин Ананьев
2023-06-27  9:23         ` Konstantin Ananyev
2023-07-11 16:52         ` [PATCH v6 " pbhagavatula
2023-07-11 16:52           ` [PATCH v6 2/2] test: add reassembly perf test pbhagavatula
2023-07-12 14:59           ` [PATCH v6 1/2] ip_frag: optimize key compare and hash generation Thomas Monjalon
