* [dpdk-dev] [PATCH 00/13] IPv4/IPv6 fragmentation/reassembly library
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
@ 2014-05-28 17:32 ` Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 01/13] ip_frag: Moving fragmentation/reassembly headers into a separate library Anatoly Burakov
` (15 subsequent siblings)
16 siblings, 0 replies; 18+ messages in thread
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
This patch series is mostly a refactoring of the fragmentation/reassembly
code that was already present in the sample applications. It also adds
IPv6 support, although somewhat limited, since full IPv6 support
would require a proper IP stack.
The ipv4_frag example app is renamed to ip_fragmentation, and IPv6
support is added to both the ip_fragmentation and ip_reassembly apps,
simplifying them in the process (e.g. dropping support for
exact match and using only LPM/LPM6 for routing).
Anatoly Burakov (13):
ip_frag: Moving fragmentation/reassembly headers into a separate
library
Refactored IPv4 fragmentation into a proper library
Fixing issues reported by checkpatch
ip_frag: new internal common header
ip_frag: removed unneeded check and macro
ip_frag: renaming structures in fragmentation table to be more generic
ip_frag: refactored reassembly code and made it a proper library
ip_frag: renamed ipv4 frag function
ip_frag: added IPv6 fragmentation support
examples: renamed ipv4_frag example app to ip_fragmentation
example: overhaul of ip_fragmentation example app
ip_frag: add support for IPv6 reassembly
examples: overhaul of ip_reassembly app
config/common_bsdapp | 7 +
config/common_linuxapp | 7 +
examples/{ipv4_frag => ip_fragmentation}/Makefile | 2 +-
examples/{ipv4_frag => ip_fragmentation}/main.c | 536 ++++++--
examples/{ipv4_frag => ip_fragmentation}/main.h | 0
examples/ip_reassembly/Makefile | 1 -
examples/ip_reassembly/ipv4_frag_tbl.h | 400 ------
examples/ip_reassembly/ipv4_rsmbl.h | 425 ------
examples/ip_reassembly/main.c | 1348 +++++++-------------
lib/Makefile | 1 +
lib/librte_ip_frag/Makefile | 55 +
lib/librte_ip_frag/ip_frag_common.h | 193 +++
lib/librte_ip_frag/ip_frag_internal.c | 421 ++++++
lib/librte_ip_frag/rte_ip_frag.h | 344 +++++
lib/librte_ip_frag/rte_ip_frag_common.c | 142 +++
.../librte_ip_frag/rte_ipv4_fragmentation.c | 91 +-
lib/librte_ip_frag/rte_ipv4_reassembly.c | 191 +++
lib/librte_ip_frag/rte_ipv6_fragmentation.c | 219 ++++
lib/librte_ip_frag/rte_ipv6_reassembly.c | 218 ++++
mk/rte.app.mk | 4 +
20 files changed, 2668 insertions(+), 1937 deletions(-)
rename examples/{ipv4_frag => ip_fragmentation}/Makefile (99%)
rename examples/{ipv4_frag => ip_fragmentation}/main.c (57%)
rename examples/{ipv4_frag => ip_fragmentation}/main.h (100%)
delete mode 100644 examples/ip_reassembly/ipv4_frag_tbl.h
delete mode 100644 examples/ip_reassembly/ipv4_rsmbl.h
create mode 100644 lib/librte_ip_frag/Makefile
create mode 100644 lib/librte_ip_frag/ip_frag_common.h
create mode 100644 lib/librte_ip_frag/ip_frag_internal.c
create mode 100644 lib/librte_ip_frag/rte_ip_frag.h
create mode 100644 lib/librte_ip_frag/rte_ip_frag_common.c
rename examples/ipv4_frag/rte_ipv4_frag.h => lib/librte_ip_frag/rte_ipv4_fragmentation.c (80%)
create mode 100644 lib/librte_ip_frag/rte_ipv4_reassembly.c
create mode 100644 lib/librte_ip_frag/rte_ipv6_fragmentation.c
create mode 100644 lib/librte_ip_frag/rte_ipv6_reassembly.c
--
1.8.1.4
* [dpdk-dev] [PATCH 01/13] ip_frag: Moving fragmentation/reassembly headers into a separate library
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 00/13] IPv4/IPv6 fragmentation/reassembly library Anatoly Burakov
@ 2014-05-28 17:32 ` Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 02/13] Refactored IPv4 fragmentation into a proper library Anatoly Burakov
` (14 subsequent siblings)
16 siblings, 0 replies; 18+ messages in thread
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
config/common_bsdapp | 5 +++
config/common_linuxapp | 5 +++
examples/ip_reassembly/main.c | 2 +-
examples/ipv4_frag/main.c | 2 +-
lib/Makefile | 1 +
lib/librte_ip_frag/Makefile | 42 ++++++++++++++++++++++
.../librte_ip_frag}/ipv4_frag_tbl.h | 0
.../librte_ip_frag/rte_ip_frag.h | 0
.../librte_ip_frag/rte_ipv4_rsmbl.h | 0
9 files changed, 55 insertions(+), 2 deletions(-)
create mode 100644 lib/librte_ip_frag/Makefile
rename {examples/ip_reassembly => lib/librte_ip_frag}/ipv4_frag_tbl.h (100%)
rename examples/ipv4_frag/rte_ipv4_frag.h => lib/librte_ip_frag/rte_ip_frag.h (100%)
rename examples/ip_reassembly/ipv4_rsmbl.h => lib/librte_ip_frag/rte_ipv4_rsmbl.h (100%)
diff --git a/config/common_bsdapp b/config/common_bsdapp
index 2cc7b80..d30802e 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -258,6 +258,11 @@ CONFIG_RTE_MAX_LCORE_FREQS=64
CONFIG_RTE_LIBRTE_NET=y
#
+# Compile librte_ip_frag
+#
+CONFIG_RTE_LIBRTE_IP_FRAG=y
+
+#
# Compile librte_meter
#
CONFIG_RTE_LIBRTE_METER=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 62619c6..074d961 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -285,6 +285,11 @@ CONFIG_RTE_MAX_LCORE_FREQS=64
CONFIG_RTE_LIBRTE_NET=y
#
+# Compile librte_ip_frag
+#
+CONFIG_RTE_LIBRTE_IP_FRAG=y
+
+#
# Compile librte_meter
#
CONFIG_RTE_LIBRTE_METER=y
diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
index bafa8d9..42ade5c 100644
--- a/examples/ip_reassembly/main.c
+++ b/examples/ip_reassembly/main.c
@@ -94,7 +94,7 @@
#define MAX_PKT_BURST 32
-#include "ipv4_rsmbl.h"
+#include "rte_ipv4_rsmbl.h"
#ifndef IPv6_BYTES
#define IPv6_BYTES_FMT "%02x%02x:%02x%02x:%02x%02x:%02x%02x:"\
diff --git a/examples/ipv4_frag/main.c b/examples/ipv4_frag/main.c
index 329f2ce..3c2c960 100644
--- a/examples/ipv4_frag/main.c
+++ b/examples/ipv4_frag/main.c
@@ -71,7 +71,7 @@
#include <rte_lpm.h>
#include <rte_ip.h>
-#include "rte_ipv4_frag.h"
+#include "rte_ip_frag.h"
#include "main.h"
#define RTE_LOGTYPE_L3FWD RTE_LOGTYPE_USER1
diff --git a/lib/Makefile b/lib/Makefile
index b92b392..99f60d0 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -55,6 +55,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_METER) += librte_meter
DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += librte_sched
DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
DIRS-$(CONFIG_RTE_LIBRTE_KVARGS) += librte_kvargs
+DIRS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += librte_ip_frag
ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_ip_frag/Makefile b/lib/librte_ip_frag/Makefile
new file mode 100644
index 0000000..3054c1f
--- /dev/null
+++ b/lib/librte_ip_frag/Makefile
@@ -0,0 +1,42 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_IP_FRAG)-include += rte_ip_frag.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_IP_FRAG)-include += ipv4_frag_tbl.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_IP_FRAG)-include += rte_ipv4_rsmbl.h
+
+# this library depends on rte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += lib/librte_mempool lib/librte_ether
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/examples/ip_reassembly/ipv4_frag_tbl.h b/lib/librte_ip_frag/ipv4_frag_tbl.h
similarity index 100%
rename from examples/ip_reassembly/ipv4_frag_tbl.h
rename to lib/librte_ip_frag/ipv4_frag_tbl.h
diff --git a/examples/ipv4_frag/rte_ipv4_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
similarity index 100%
rename from examples/ipv4_frag/rte_ipv4_frag.h
rename to lib/librte_ip_frag/rte_ip_frag.h
diff --git a/examples/ip_reassembly/ipv4_rsmbl.h b/lib/librte_ip_frag/rte_ipv4_rsmbl.h
similarity index 100%
rename from examples/ip_reassembly/ipv4_rsmbl.h
rename to lib/librte_ip_frag/rte_ipv4_rsmbl.h
--
1.8.1.4
* [dpdk-dev] [PATCH 02/13] Refactored IPv4 fragmentation into a proper library
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 00/13] IPv4/IPv6 fragmentation/reassembly library Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 01/13] ip_frag: Moving fragmentation/reassembly headers into a separate library Anatoly Burakov
@ 2014-05-28 17:32 ` Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 03/13] Fixing issues reported by checkpatch Anatoly Burakov
` (13 subsequent siblings)
16 siblings, 0 replies; 18+ messages in thread
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
examples/ipv4_frag/main.c | 11 ++
lib/librte_ip_frag/Makefile | 9 ++
lib/librte_ip_frag/rte_ip_frag.h | 186 +---------------------
lib/librte_ip_frag/rte_ipv4_fragmentation.c | 239 ++++++++++++++++++++++++++++
mk/rte.app.mk | 4 +
5 files changed, 267 insertions(+), 182 deletions(-)
create mode 100644 lib/librte_ip_frag/rte_ipv4_fragmentation.c
diff --git a/examples/ipv4_frag/main.c b/examples/ipv4_frag/main.c
index 3c2c960..05a26b1 100644
--- a/examples/ipv4_frag/main.c
+++ b/examples/ipv4_frag/main.c
@@ -74,6 +74,17 @@
#include "rte_ip_frag.h"
#include "main.h"
+/*
+ * Default byte size for the IPv4 Maximum Transfer Unit (MTU).
+ * This value includes the size of IPv4 header.
+ */
+#define IPV4_MTU_DEFAULT ETHER_MTU
+
+/*
+ * Default payload in bytes for the IPv4 packet.
+ */
+#define IPV4_DEFAULT_PAYLOAD (IPV4_MTU_DEFAULT - sizeof(struct ipv4_hdr))
+
#define RTE_LOGTYPE_L3FWD RTE_LOGTYPE_USER1
#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
diff --git a/lib/librte_ip_frag/Makefile b/lib/librte_ip_frag/Makefile
index 3054c1f..13a83b1 100644
--- a/lib/librte_ip_frag/Makefile
+++ b/lib/librte_ip_frag/Makefile
@@ -31,6 +31,15 @@
include $(RTE_SDK)/mk/rte.vars.mk
+# library name
+LIB = librte_ip_frag.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+#source files
+SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_fragmentation.c
+
# install this header file
SYMLINK-$(CONFIG_RTE_LIBRTE_IP_FRAG)-include += rte_ip_frag.h
SYMLINK-$(CONFIG_RTE_LIBRTE_IP_FRAG)-include += ipv4_frag_tbl.h
diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
index 84fa9c9..0cf3878 100644
--- a/lib/librte_ip_frag/rte_ip_frag.h
+++ b/lib/librte_ip_frag/rte_ip_frag.h
@@ -31,9 +31,8 @@
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
-#ifndef __INCLUDE_RTE_IPV4_FRAG_H__
-#define __INCLUDE_RTE_IPV4_FRAG_H__
-#include <rte_ip.h>
+#ifndef _RTE_IP_FRAG_H__
+#define _RTE_IP_FRAG_H__
/**
* @file
@@ -43,67 +42,6 @@
*
*/
-/*
- * Default byte size for the IPv4 Maximum Transfer Unit (MTU).
- * This value includes the size of IPv4 header.
- */
-#define IPV4_MTU_DEFAULT ETHER_MTU
-
-/*
- * Default payload in bytes for the IPv4 packet.
- */
-#define IPV4_DEFAULT_PAYLOAD (IPV4_MTU_DEFAULT - sizeof(struct ipv4_hdr))
-
-/*
- * MAX number of fragments per packet allowed.
- */
-#define IPV4_MAX_FRAGS_PER_PACKET 0x80
-
-
-/* Debug on/off */
-#ifdef RTE_IPV4_FRAG_DEBUG
-
-#define RTE_IPV4_FRAG_ASSERT(exp) \
-if (!(exp)) { \
- rte_panic("function %s, line%d\tassert \"" #exp "\" failed\n", \
- __func__, __LINE__); \
-}
-
-#else /*RTE_IPV4_FRAG_DEBUG*/
-
-#define RTE_IPV4_FRAG_ASSERT(exp) do { } while(0)
-
-#endif /*RTE_IPV4_FRAG_DEBUG*/
-
-/* Fragment Offset */
-#define IPV4_HDR_DF_SHIFT 14
-#define IPV4_HDR_MF_SHIFT 13
-#define IPV4_HDR_FO_SHIFT 3
-
-#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT)
-#define IPV4_HDR_MF_MASK (1 << IPV4_HDR_MF_SHIFT)
-
-#define IPV4_HDR_FO_MASK ((1 << IPV4_HDR_FO_SHIFT) - 1)
-
-static inline void __fill_ipv4hdr_frag(struct ipv4_hdr *dst,
- const struct ipv4_hdr *src, uint16_t len, uint16_t fofs,
- uint16_t dofs, uint32_t mf)
-{
- rte_memcpy(dst, src, sizeof(*dst));
- fofs = (uint16_t)(fofs + (dofs >> IPV4_HDR_FO_SHIFT));
- fofs = (uint16_t)(fofs | mf << IPV4_HDR_MF_SHIFT);
- dst->fragment_offset = rte_cpu_to_be_16(fofs);
- dst->total_length = rte_cpu_to_be_16(len);
- dst->hdr_checksum = 0;
-}
-
-static inline void __free_fragments(struct rte_mbuf *mb[], uint32_t num)
-{
- uint32_t i;
- for (i = 0; i != num; i++)
- rte_pktmbuf_free(mb[i]);
-}
-
/**
* IPv4 fragmentation.
*
@@ -125,127 +63,11 @@ static inline void __free_fragments(struct rte_mbuf *mb[], uint32_t num)
* in the pkts_out array.
* Otherwise - (-1) * <errno>.
*/
-static inline int32_t rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
+int32_t rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
struct rte_mbuf **pkts_out,
uint16_t nb_pkts_out,
uint16_t mtu_size,
struct rte_mempool *pool_direct,
- struct rte_mempool *pool_indirect)
-{
- struct rte_mbuf *in_seg = NULL;
- struct ipv4_hdr *in_hdr;
- uint32_t out_pkt_pos, in_seg_data_pos;
- uint32_t more_in_segs;
- uint16_t fragment_offset, flag_offset, frag_size;
-
- frag_size = (uint16_t)(mtu_size - sizeof(struct ipv4_hdr));
-
- /* Fragment size should be a multiply of 8. */
- RTE_IPV4_FRAG_ASSERT((frag_size & IPV4_HDR_FO_MASK) == 0);
-
- /* Fragment size should be a multiply of 8. */
- RTE_IPV4_FRAG_ASSERT(IPV4_MAX_FRAGS_PER_PACKET * frag_size >=
- (uint16_t)(pkt_in->pkt.pkt_len - sizeof (struct ipv4_hdr)));
-
- in_hdr = (struct ipv4_hdr*) pkt_in->pkt.data;
- flag_offset = rte_cpu_to_be_16(in_hdr->fragment_offset);
-
- /* If Don't Fragment flag is set */
- if (unlikely ((flag_offset & IPV4_HDR_DF_MASK) != 0))
- return (-ENOTSUP);
-
- /* Check that pkts_out is big enough to hold all fragments */
- if (unlikely (frag_size * nb_pkts_out <
- (uint16_t)(pkt_in->pkt.pkt_len - sizeof (struct ipv4_hdr))))
- return (-EINVAL);
-
- in_seg = pkt_in;
- in_seg_data_pos = sizeof(struct ipv4_hdr);
- out_pkt_pos = 0;
- fragment_offset = 0;
-
- more_in_segs = 1;
- while (likely(more_in_segs)) {
- struct rte_mbuf *out_pkt = NULL, *out_seg_prev = NULL;
- uint32_t more_out_segs;
- struct ipv4_hdr *out_hdr;
-
- /* Allocate direct buffer */
- out_pkt = rte_pktmbuf_alloc(pool_direct);
- if (unlikely(out_pkt == NULL)) {
- __free_fragments(pkts_out, out_pkt_pos);
- return (-ENOMEM);
- }
-
- /* Reserve space for the IP header that will be built later */
- out_pkt->pkt.data_len = sizeof(struct ipv4_hdr);
- out_pkt->pkt.pkt_len = sizeof(struct ipv4_hdr);
-
- out_seg_prev = out_pkt;
- more_out_segs = 1;
- while (likely(more_out_segs && more_in_segs)) {
- struct rte_mbuf *out_seg = NULL;
- uint32_t len;
-
- /* Allocate indirect buffer */
- out_seg = rte_pktmbuf_alloc(pool_indirect);
- if (unlikely(out_seg == NULL)) {
- rte_pktmbuf_free(out_pkt);
- __free_fragments(pkts_out, out_pkt_pos);
- return (-ENOMEM);
- }
- out_seg_prev->pkt.next = out_seg;
- out_seg_prev = out_seg;
-
- /* Prepare indirect buffer */
- rte_pktmbuf_attach(out_seg, in_seg);
- len = mtu_size - out_pkt->pkt.pkt_len;
- if (len > (in_seg->pkt.data_len - in_seg_data_pos)) {
- len = in_seg->pkt.data_len - in_seg_data_pos;
- }
- out_seg->pkt.data = (char*) in_seg->pkt.data + (uint16_t)in_seg_data_pos;
- out_seg->pkt.data_len = (uint16_t)len;
- out_pkt->pkt.pkt_len = (uint16_t)(len +
- out_pkt->pkt.pkt_len);
- out_pkt->pkt.nb_segs += 1;
- in_seg_data_pos += len;
-
- /* Current output packet (i.e. fragment) done ? */
- if (unlikely(out_pkt->pkt.pkt_len >= mtu_size)) {
- more_out_segs = 0;
- }
-
- /* Current input segment done ? */
- if (unlikely(in_seg_data_pos == in_seg->pkt.data_len)) {
- in_seg = in_seg->pkt.next;
- in_seg_data_pos = 0;
-
- if (unlikely(in_seg == NULL)) {
- more_in_segs = 0;
- }
- }
- }
-
- /* Build the IP header */
-
- out_hdr = (struct ipv4_hdr*) out_pkt->pkt.data;
-
- __fill_ipv4hdr_frag(out_hdr, in_hdr,
- (uint16_t)out_pkt->pkt.pkt_len,
- flag_offset, fragment_offset, more_in_segs);
-
- fragment_offset = (uint16_t)(fragment_offset +
- out_pkt->pkt.pkt_len - sizeof(struct ipv4_hdr));
-
- out_pkt->ol_flags |= PKT_TX_IP_CKSUM;
- out_pkt->pkt.vlan_macip.f.l3_len = sizeof(struct ipv4_hdr);
-
- /* Write the fragment to the output list */
- pkts_out[out_pkt_pos] = out_pkt;
- out_pkt_pos ++;
- }
-
- return (out_pkt_pos);
-}
+ struct rte_mempool *pool_indirect);
#endif
diff --git a/lib/librte_ip_frag/rte_ipv4_fragmentation.c b/lib/librte_ip_frag/rte_ipv4_fragmentation.c
new file mode 100644
index 0000000..2d33a7b
--- /dev/null
+++ b/lib/librte_ip_frag/rte_ipv4_fragmentation.c
@@ -0,0 +1,239 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <errno.h>
+
+#include <rte_byteorder.h>
+#include <rte_memcpy.h>
+#include <rte_mempool.h>
+#include <rte_debug.h>
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+
+#include "rte_ip_frag.h"
+
+/*
+ * MAX number of fragments per packet allowed.
+ */
+#define IPV4_MAX_FRAGS_PER_PACKET 0x80
+
+/* Debug on/off */
+#ifdef RTE_IPV4_FRAG_DEBUG
+
+#define RTE_IPV4_FRAG_ASSERT(exp) \
+if (!(exp)) { \
+ rte_panic("function %s, line%d\tassert \"" #exp "\" failed\n", \
+ __func__, __LINE__); \
+}
+
+#else /*RTE_IPV4_FRAG_DEBUG*/
+
+#define RTE_IPV4_FRAG_ASSERT(exp) do { } while(0)
+
+#endif /*RTE_IPV4_FRAG_DEBUG*/
+
+/* Fragment Offset */
+#define IPV4_HDR_DF_SHIFT 14
+#define IPV4_HDR_MF_SHIFT 13
+#define IPV4_HDR_FO_SHIFT 3
+
+#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT)
+#define IPV4_HDR_MF_MASK (1 << IPV4_HDR_MF_SHIFT)
+
+#define IPV4_HDR_FO_MASK ((1 << IPV4_HDR_FO_SHIFT) - 1)
+
+static inline void __fill_ipv4hdr_frag(struct ipv4_hdr *dst,
+ const struct ipv4_hdr *src, uint16_t len, uint16_t fofs,
+ uint16_t dofs, uint32_t mf)
+{
+ rte_memcpy(dst, src, sizeof(*dst));
+ fofs = (uint16_t)(fofs + (dofs >> IPV4_HDR_FO_SHIFT));
+ fofs = (uint16_t)(fofs | mf << IPV4_HDR_MF_SHIFT);
+ dst->fragment_offset = rte_cpu_to_be_16(fofs);
+ dst->total_length = rte_cpu_to_be_16(len);
+ dst->hdr_checksum = 0;
+}
+
+static inline void __free_fragments(struct rte_mbuf *mb[], uint32_t num)
+{
+ uint32_t i;
+ for (i = 0; i != num; i++)
+ rte_pktmbuf_free(mb[i]);
+}
+
+/**
+ * IPv4 fragmentation.
+ *
+ * This function implements the fragmentation of IPv4 packets.
+ *
+ * @param pkt_in
+ * The input packet.
+ * @param pkts_out
+ * Array storing the output fragments.
+ * @param mtu_size
+ * Size in bytes of the Maximum Transfer Unit (MTU) for the outgoing IPv4
+ * datagrams. This value includes the size of the IPv4 header.
+ * @param pool_direct
+ * MBUF pool used for allocating direct buffers for the output fragments.
+ * @param pool_indirect
+ * MBUF pool used for allocating indirect buffers for the output fragments.
+ * @return
+ * Upon successful completion - number of output fragments placed
+ * in the pkts_out array.
+ * Otherwise - (-1) * <errno>.
+ */
+int32_t
+rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
+ struct rte_mbuf **pkts_out,
+ uint16_t nb_pkts_out,
+ uint16_t mtu_size,
+ struct rte_mempool *pool_direct,
+ struct rte_mempool *pool_indirect)
+{
+ struct rte_mbuf *in_seg = NULL;
+ struct ipv4_hdr *in_hdr;
+ uint32_t out_pkt_pos, in_seg_data_pos;
+ uint32_t more_in_segs;
+ uint16_t fragment_offset, flag_offset, frag_size;
+
+ frag_size = (uint16_t)(mtu_size - sizeof(struct ipv4_hdr));
+
+ /* Fragment size should be a multiply of 8. */
+ RTE_IPV4_FRAG_ASSERT((frag_size & IPV4_HDR_FO_MASK) == 0);
+
+ /* Fragment size should be a multiply of 8. */
+ RTE_IPV4_FRAG_ASSERT(IPV4_MAX_FRAGS_PER_PACKET * frag_size >=
+ (uint16_t)(pkt_in->pkt.pkt_len - sizeof (struct ipv4_hdr)));
+
+ in_hdr = (struct ipv4_hdr*) pkt_in->pkt.data;
+ flag_offset = rte_cpu_to_be_16(in_hdr->fragment_offset);
+
+ /* If Don't Fragment flag is set */
+ if (unlikely ((flag_offset & IPV4_HDR_DF_MASK) != 0))
+ return (-ENOTSUP);
+
+ /* Check that pkts_out is big enough to hold all fragments */
+ if (unlikely (frag_size * nb_pkts_out <
+ (uint16_t)(pkt_in->pkt.pkt_len - sizeof (struct ipv4_hdr))))
+ return (-EINVAL);
+
+ in_seg = pkt_in;
+ in_seg_data_pos = sizeof(struct ipv4_hdr);
+ out_pkt_pos = 0;
+ fragment_offset = 0;
+
+ more_in_segs = 1;
+ while (likely(more_in_segs)) {
+ struct rte_mbuf *out_pkt = NULL, *out_seg_prev = NULL;
+ uint32_t more_out_segs;
+ struct ipv4_hdr *out_hdr;
+
+ /* Allocate direct buffer */
+ out_pkt = rte_pktmbuf_alloc(pool_direct);
+ if (unlikely(out_pkt == NULL)) {
+ __free_fragments(pkts_out, out_pkt_pos);
+ return (-ENOMEM);
+ }
+
+ /* Reserve space for the IP header that will be built later */
+ out_pkt->pkt.data_len = sizeof(struct ipv4_hdr);
+ out_pkt->pkt.pkt_len = sizeof(struct ipv4_hdr);
+
+ out_seg_prev = out_pkt;
+ more_out_segs = 1;
+ while (likely(more_out_segs && more_in_segs)) {
+ struct rte_mbuf *out_seg = NULL;
+ uint32_t len;
+
+ /* Allocate indirect buffer */
+ out_seg = rte_pktmbuf_alloc(pool_indirect);
+ if (unlikely(out_seg == NULL)) {
+ rte_pktmbuf_free(out_pkt);
+ __free_fragments(pkts_out, out_pkt_pos);
+ return (-ENOMEM);
+ }
+ out_seg_prev->pkt.next = out_seg;
+ out_seg_prev = out_seg;
+
+ /* Prepare indirect buffer */
+ rte_pktmbuf_attach(out_seg, in_seg);
+ len = mtu_size - out_pkt->pkt.pkt_len;
+ if (len > (in_seg->pkt.data_len - in_seg_data_pos)) {
+ len = in_seg->pkt.data_len - in_seg_data_pos;
+ }
+ out_seg->pkt.data = (char*) in_seg->pkt.data + (uint16_t)in_seg_data_pos;
+ out_seg->pkt.data_len = (uint16_t)len;
+ out_pkt->pkt.pkt_len = (uint16_t)(len +
+ out_pkt->pkt.pkt_len);
+ out_pkt->pkt.nb_segs += 1;
+ in_seg_data_pos += len;
+
+ /* Current output packet (i.e. fragment) done ? */
+ if (unlikely(out_pkt->pkt.pkt_len >= mtu_size)) {
+ more_out_segs = 0;
+ }
+
+ /* Current input segment done ? */
+ if (unlikely(in_seg_data_pos == in_seg->pkt.data_len)) {
+ in_seg = in_seg->pkt.next;
+ in_seg_data_pos = 0;
+
+ if (unlikely(in_seg == NULL)) {
+ more_in_segs = 0;
+ }
+ }
+ }
+
+ /* Build the IP header */
+
+ out_hdr = (struct ipv4_hdr*) out_pkt->pkt.data;
+
+ __fill_ipv4hdr_frag(out_hdr, in_hdr,
+ (uint16_t)out_pkt->pkt.pkt_len,
+ flag_offset, fragment_offset, more_in_segs);
+
+ fragment_offset = (uint16_t)(fragment_offset +
+ out_pkt->pkt.pkt_len - sizeof(struct ipv4_hdr));
+
+ out_pkt->ol_flags |= PKT_TX_IP_CKSUM;
+ out_pkt->pkt.vlan_macip.f.l3_len = sizeof(struct ipv4_hdr);
+
+ /* Write the fragment to the output list */
+ pkts_out[out_pkt_pos] = out_pkt;
+ out_pkt_pos ++;
+ }
+
+ return (out_pkt_pos);
+}
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index a836577..058e362 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -113,6 +113,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_MBUF),y)
LDLIBS += -lrte_mbuf
endif
+ifeq ($(CONFIG_RTE_LIBRTE_IP_FRAG),y)
+LDLIBS += -lrte_ip_frag
+endif
+
ifeq ($(CONFIG_RTE_LIBRTE_ETHER),y)
LDLIBS += -lethdev
endif
--
1.8.1.4
* [dpdk-dev] [PATCH 03/13] Fixing issues reported by checkpatch
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
` (2 preceding siblings ...)
2014-05-28 17:32 ` [dpdk-dev] [PATCH 02/13] Refactored IPv4 fragmentation into a proper library Anatoly Burakov
@ 2014-05-28 17:32 ` Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 04/13] ip_frag: new internal common header Anatoly Burakov
` (12 subsequent siblings)
16 siblings, 0 replies; 18+ messages in thread
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_ip_frag/rte_ipv4_fragmentation.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/lib/librte_ip_frag/rte_ipv4_fragmentation.c b/lib/librte_ip_frag/rte_ipv4_fragmentation.c
index 2d33a7b..5f67417 100644
--- a/lib/librte_ip_frag/rte_ipv4_fragmentation.c
+++ b/lib/librte_ip_frag/rte_ipv4_fragmentation.c
@@ -60,7 +60,7 @@ if (!(exp)) { \
#else /*RTE_IPV4_FRAG_DEBUG*/
-#define RTE_IPV4_FRAG_ASSERT(exp) do { } while(0)
+#define RTE_IPV4_FRAG_ASSERT(exp) do { } while (0)
#endif /*RTE_IPV4_FRAG_DEBUG*/
@@ -135,19 +135,19 @@ rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
/* Fragment size should be a multiply of 8. */
RTE_IPV4_FRAG_ASSERT(IPV4_MAX_FRAGS_PER_PACKET * frag_size >=
- (uint16_t)(pkt_in->pkt.pkt_len - sizeof (struct ipv4_hdr)));
+ (uint16_t)(pkt_in->pkt.pkt_len - sizeof(struct ipv4_hdr)));
- in_hdr = (struct ipv4_hdr*) pkt_in->pkt.data;
+ in_hdr = (struct ipv4_hdr *) pkt_in->pkt.data;
flag_offset = rte_cpu_to_be_16(in_hdr->fragment_offset);
/* If Don't Fragment flag is set */
if (unlikely ((flag_offset & IPV4_HDR_DF_MASK) != 0))
- return (-ENOTSUP);
+ return -ENOTSUP;
/* Check that pkts_out is big enough to hold all fragments */
- if (unlikely (frag_size * nb_pkts_out <
+ if (unlikely(frag_size * nb_pkts_out <
(uint16_t)(pkt_in->pkt.pkt_len - sizeof (struct ipv4_hdr))))
- return (-EINVAL);
+ return -EINVAL;
in_seg = pkt_in;
in_seg_data_pos = sizeof(struct ipv4_hdr);
@@ -164,7 +164,7 @@ rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
out_pkt = rte_pktmbuf_alloc(pool_direct);
if (unlikely(out_pkt == NULL)) {
__free_fragments(pkts_out, out_pkt_pos);
- return (-ENOMEM);
+ return -ENOMEM;
}
/* Reserve space for the IP header that will be built later */
@@ -182,7 +182,7 @@ rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
if (unlikely(out_seg == NULL)) {
rte_pktmbuf_free(out_pkt);
__free_fragments(pkts_out, out_pkt_pos);
- return (-ENOMEM);
+ return -ENOMEM;
}
out_seg_prev->pkt.next = out_seg;
out_seg_prev = out_seg;
@@ -201,18 +201,16 @@ rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
in_seg_data_pos += len;
/* Current output packet (i.e. fragment) done ? */
- if (unlikely(out_pkt->pkt.pkt_len >= mtu_size)) {
+ if (unlikely(out_pkt->pkt.pkt_len >= mtu_size))
more_out_segs = 0;
- }
/* Current input segment done ? */
if (unlikely(in_seg_data_pos == in_seg->pkt.data_len)) {
in_seg = in_seg->pkt.next;
in_seg_data_pos = 0;
- if (unlikely(in_seg == NULL)) {
+ if (unlikely(in_seg == NULL))
more_in_segs = 0;
- }
}
}
@@ -235,5 +233,5 @@ rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
out_pkt_pos ++;
}
- return (out_pkt_pos);
+ return out_pkt_pos;
}
--
1.8.1.4
* [dpdk-dev] [PATCH 04/13] ip_frag: new internal common header
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
` (3 preceding siblings ...)
2014-05-28 17:32 ` [dpdk-dev] [PATCH 03/13] Fixing issues reported by checkpatch Anatoly Burakov
@ 2014-05-28 17:32 ` Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 05/13] ip_frag: removed unneeded check and macro Anatoly Burakov
` (11 subsequent siblings)
16 siblings, 0 replies; 18+ messages in thread
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
Moved the debug macros out into a common header, as the reassembly code will later need them as well.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_ip_frag/ip_frag_common.h | 52 +++++++++++++++++++++++++++++
lib/librte_ip_frag/rte_ipv4_fragmentation.c | 20 ++---------
2 files changed, 55 insertions(+), 17 deletions(-)
create mode 100644 lib/librte_ip_frag/ip_frag_common.h
diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h
new file mode 100644
index 0000000..c9741c0
--- /dev/null
+++ b/lib/librte_ip_frag/ip_frag_common.h
@@ -0,0 +1,52 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _IP_FRAG_COMMON_H_
+#define _IP_FRAG_COMMON_H_
+
+/* Debug on/off */
+#ifdef RTE_IP_FRAG_DEBUG
+
+#define RTE_IP_FRAG_ASSERT(exp) \
+if (!(exp)) { \
+ rte_panic("function %s, line%d\tassert \"" #exp "\" failed\n", \
+ __func__, __LINE__); \
+}
+
+#else /*RTE_IP_FRAG_DEBUG*/
+
+#define RTE_IP_FRAG_ASSERT(exp) do { } while (0)
+
+#endif /*RTE_IP_FRAG_DEBUG*/
+
+#endif
diff --git a/lib/librte_ip_frag/rte_ipv4_fragmentation.c b/lib/librte_ip_frag/rte_ipv4_fragmentation.c
index 5f67417..46ed583 100644
--- a/lib/librte_ip_frag/rte_ipv4_fragmentation.c
+++ b/lib/librte_ip_frag/rte_ipv4_fragmentation.c
@@ -43,27 +43,13 @@
#include <rte_ip.h>
#include "rte_ip_frag.h"
+#include "ip_frag_common.h"
/*
* MAX number of fragments per packet allowed.
*/
#define IPV4_MAX_FRAGS_PER_PACKET 0x80
-/* Debug on/off */
-#ifdef RTE_IPV4_FRAG_DEBUG
-
-#define RTE_IPV4_FRAG_ASSERT(exp) \
-if (!(exp)) { \
- rte_panic("function %s, line%d\tassert \"" #exp "\" failed\n", \
- __func__, __LINE__); \
-}
-
-#else /*RTE_IPV4_FRAG_DEBUG*/
-
-#define RTE_IPV4_FRAG_ASSERT(exp) do { } while (0)
-
-#endif /*RTE_IPV4_FRAG_DEBUG*/
-
/* Fragment Offset */
#define IPV4_HDR_DF_SHIFT 14
#define IPV4_HDR_MF_SHIFT 13
@@ -131,10 +117,10 @@ rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
frag_size = (uint16_t)(mtu_size - sizeof(struct ipv4_hdr));
/* Fragment size should be a multiple of 8. */
- RTE_IPV4_FRAG_ASSERT((frag_size & IPV4_HDR_FO_MASK) == 0);
+ RTE_IP_FRAG_ASSERT((frag_size & IPV4_HDR_FO_MASK) == 0);
/* Fragment size should be a multiple of 8. */
- RTE_IPV4_FRAG_ASSERT(IPV4_MAX_FRAGS_PER_PACKET * frag_size >=
+ RTE_IP_FRAG_ASSERT(IPV4_MAX_FRAGS_PER_PACKET * frag_size >=
(uint16_t)(pkt_in->pkt.pkt_len - sizeof(struct ipv4_hdr)));
in_hdr = (struct ipv4_hdr *) pkt_in->pkt.data;
--
1.8.1.4
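The common header introduced above carries a compile-time-toggled assert. A minimal standalone sketch of the same pattern, with `abort()` standing in for `rte_panic()` and the body wrapped in `do { } while (0)` so it expands to a single statement, is:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Enabled only when the debug flag is defined; in release builds the
 * macro expands to nothing, so the checks cost nothing at run time.
 * IP_FRAG_DEBUG here mirrors RTE_IP_FRAG_DEBUG from the patch. */
#ifdef IP_FRAG_DEBUG
#define IP_FRAG_ASSERT(exp) \
do { \
	if (!(exp)) { \
		fprintf(stderr, "function %s, line %d: assert \"" #exp \
			"\" failed\n", __func__, __LINE__); \
		abort(); \
	} \
} while (0)
#else
#define IP_FRAG_ASSERT(exp) do { } while (0)
#endif
```

Without `IP_FRAG_DEBUG`, even `IP_FRAG_ASSERT(0)` compiles away, which is why the library can leave the assertions in hot paths.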
* [dpdk-dev] [PATCH 05/13] ip_frag: removed unneeded check and macro
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
` (4 preceding siblings ...)
2014-05-28 17:32 ` [dpdk-dev] [PATCH 04/13] ip_frag: new internal common header Anatoly Burakov
@ 2014-05-28 17:32 ` Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 06/13] ip_frag: renaming structures in fragmentation table to be more generic Anatoly Burakov
` (10 subsequent siblings)
16 siblings, 0 replies; 18+ messages in thread
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_ip_frag/rte_ipv4_fragmentation.c | 9 ---------
1 file changed, 9 deletions(-)
diff --git a/lib/librte_ip_frag/rte_ipv4_fragmentation.c b/lib/librte_ip_frag/rte_ipv4_fragmentation.c
index 46ed583..6e5feb6 100644
--- a/lib/librte_ip_frag/rte_ipv4_fragmentation.c
+++ b/lib/librte_ip_frag/rte_ipv4_fragmentation.c
@@ -45,11 +45,6 @@
#include "rte_ip_frag.h"
#include "ip_frag_common.h"
-/*
- * MAX number of fragments per packet allowed.
- */
-#define IPV4_MAX_FRAGS_PER_PACKET 0x80
-
/* Fragment Offset */
#define IPV4_HDR_DF_SHIFT 14
#define IPV4_HDR_MF_SHIFT 13
@@ -119,10 +114,6 @@ rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
/* Fragment size should be a multiple of 8. */
RTE_IP_FRAG_ASSERT((frag_size & IPV4_HDR_FO_MASK) == 0);
- /* Fragment size should be a multiply of 8. */
- RTE_IP_FRAG_ASSERT(IPV4_MAX_FRAGS_PER_PACKET * frag_size >=
- (uint16_t)(pkt_in->pkt.pkt_len - sizeof(struct ipv4_hdr)));
-
in_hdr = (struct ipv4_hdr *) pkt_in->pkt.data;
flag_offset = rte_cpu_to_be_16(in_hdr->fragment_offset);
--
1.8.1.4
* [dpdk-dev] [PATCH 06/13] ip_frag: renaming structures in fragmentation table to be more generic
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
` (5 preceding siblings ...)
2014-05-28 17:32 ` [dpdk-dev] [PATCH 05/13] ip_frag: removed unneeded check and macro Anatoly Burakov
@ 2014-05-28 17:32 ` Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 07/13] ip_frag: refactored reassembly code and made it a proper library Anatoly Burakov
` (9 subsequent siblings)
16 siblings, 0 replies; 18+ messages in thread
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
Technically, the fragmentation table can work for both IPv4 and IPv6
packets, so we're renaming everything to be generic enough to make sense
in an IPv6 context.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
examples/ip_reassembly/main.c | 16 ++---
lib/librte_ip_frag/ip_frag_common.h | 2 +
lib/librte_ip_frag/ipv4_frag_tbl.h | 130 ++++++++++++++++++------------------
lib/librte_ip_frag/rte_ipv4_rsmbl.h | 92 ++++++++++++-------------
4 files changed, 122 insertions(+), 118 deletions(-)
diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
index 42ade5c..23ec4be 100644
--- a/examples/ip_reassembly/main.c
+++ b/examples/ip_reassembly/main.c
@@ -407,9 +407,9 @@ struct lcore_conf {
#else
lookup_struct_t * ipv6_lookup_struct;
#endif
- struct ipv4_frag_tbl *frag_tbl[MAX_RX_QUEUE_PER_LCORE];
+ struct ip_frag_tbl *frag_tbl[MAX_RX_QUEUE_PER_LCORE];
struct rte_mempool *pool[MAX_RX_QUEUE_PER_LCORE];
- struct ipv4_frag_death_row death_row;
+ struct ip_frag_death_row death_row;
struct mbuf_table *tx_mbufs[MAX_PORTS];
struct tx_lcore_stat tx_stat;
} __rte_cache_aligned;
@@ -673,8 +673,8 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
if (ip_flag != 0 || ip_ofs != 0) {
struct rte_mbuf *mo;
- struct ipv4_frag_tbl *tbl;
- struct ipv4_frag_death_row *dr;
+ struct ip_frag_tbl *tbl;
+ struct ip_frag_death_row *dr;
tbl = qconf->frag_tbl[queue];
dr = &qconf->death_row;
@@ -684,7 +684,7 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
m->pkt.vlan_macip.f.l3_len = sizeof(*ipv4_hdr);
/* process this fragment. */
- if ((mo = ipv4_frag_mbuf(tbl, dr, m, tms, ipv4_hdr,
+ if ((mo = rte_ipv4_reassemble_packet(tbl, dr, m, tms, ipv4_hdr,
ip_ofs, ip_flag)) == NULL)
/* no packet to send out. */
return;
@@ -822,7 +822,7 @@ main_loop(__attribute__((unused)) void *dummy)
i, qconf, cur_tsc);
}
- ipv4_frag_free_death_row(&qconf->death_row,
+ rte_ip_frag_free_death_row(&qconf->death_row,
PREFETCH_OFFSET);
}
}
@@ -1456,7 +1456,7 @@ setup_queue_tbl(struct lcore_conf *qconf, uint32_t lcore, int socket,
frag_cycles = (rte_get_tsc_hz() + MS_PER_S - 1) / MS_PER_S *
max_flow_ttl;
- if ((qconf->frag_tbl[queue] = ipv4_frag_tbl_create(max_flow_num,
+ if ((qconf->frag_tbl[queue] = rte_ip_frag_table_create(max_flow_num,
IPV4_FRAG_TBL_BUCKET_ENTRIES, max_flow_num, frag_cycles,
socket)) == NULL)
rte_exit(EXIT_FAILURE, "ipv4_frag_tbl_create(%u) on "
@@ -1501,7 +1501,7 @@ queue_dump_stat(void)
"rxqueueid=%hhu frag tbl stat:\n",
lcore, qconf->rx_queue_list[i].port_id,
qconf->rx_queue_list[i].queue_id);
- ipv4_frag_tbl_dump_stat(stdout, qconf->frag_tbl[i]);
+ rte_ip_frag_table_statistics_dump(stdout, qconf->frag_tbl[i]);
fprintf(stdout, "TX bursts:\t%" PRIu64 "\n"
"TX packets _queued:\t%" PRIu64 "\n"
"TX packets dropped:\t%" PRIu64 "\n"
diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h
index c9741c0..6d4706a 100644
--- a/lib/librte_ip_frag/ip_frag_common.h
+++ b/lib/librte_ip_frag/ip_frag_common.h
@@ -34,6 +34,8 @@
#ifndef _IP_FRAG_COMMON_H_
#define _IP_FRAG_COMMON_H_
+#include "rte_ip_frag.h"
+
/* Debug on/off */
#ifdef RTE_IP_FRAG_DEBUG
diff --git a/lib/librte_ip_frag/ipv4_frag_tbl.h b/lib/librte_ip_frag/ipv4_frag_tbl.h
index 5487230..fa3291d 100644
--- a/lib/librte_ip_frag/ipv4_frag_tbl.h
+++ b/lib/librte_ip_frag/ipv4_frag_tbl.h
@@ -43,7 +43,7 @@
*/
/*
- * The ipv4_frag_tbl is a simple hash table:
+ * The ip_frag_tbl is a simple hash table:
* The basic idea is to use two hash functions and <bucket_entries>
* associativity. This provides 2 * <bucket_entries> possible locations in
* the hash table for each key. Sort of simplified Cuckoo hashing,
@@ -64,9 +64,9 @@
#define PRIME_VALUE 0xeaad8405
-TAILQ_HEAD(ipv4_pkt_list, ipv4_frag_pkt);
+TAILQ_HEAD(ip_pkt_list, ip_frag_pkt);
-struct ipv4_frag_tbl_stat {
+struct ip_frag_tbl_stat {
uint64_t find_num; /* total # of find/insert attempts. */
uint64_t add_num; /* # of add ops. */
uint64_t del_num; /* # of del ops. */
@@ -75,7 +75,7 @@ struct ipv4_frag_tbl_stat {
uint64_t fail_nospace; /* # of 'no space' add failures. */
} __rte_cache_aligned;
-struct ipv4_frag_tbl {
+struct ip_frag_tbl {
uint64_t max_cycles; /* ttl for table entries. */
uint32_t entry_mask; /* hash value mask. */
uint32_t max_entries; /* max entries allowed. */
@@ -83,25 +83,25 @@ struct ipv4_frag_tbl {
uint32_t bucket_entries; /* hash assocaitivity. */
uint32_t nb_entries; /* total size of the table. */
uint32_t nb_buckets; /* num of associativity lines. */
- struct ipv4_frag_pkt *last; /* last used entry. */
- struct ipv4_pkt_list lru; /* LRU list for table entries. */
- struct ipv4_frag_tbl_stat stat; /* statistics counters. */
- struct ipv4_frag_pkt pkt[0]; /* hash table. */
+ struct ip_frag_pkt *last; /* last used entry. */
+ struct ip_pkt_list lru; /* LRU list for table entries. */
+ struct ip_frag_tbl_stat stat; /* statistics counters. */
+ struct ip_frag_pkt pkt[0]; /* hash table. */
};
-#define IPV4_FRAG_TBL_POS(tbl, sig) \
+#define IP_FRAG_TBL_POS(tbl, sig) \
((tbl)->pkt + ((sig) & (tbl)->entry_mask))
-#define IPV4_FRAG_HASH_FNUM 2
+#define IP_FRAG_HASH_FNUM 2
-#ifdef IPV4_FRAG_TBL_STAT
-#define IPV4_FRAG_TBL_STAT_UPDATE(s, f, v) ((s)->f += (v))
+#ifdef IP_FRAG_TBL_STAT
+#define IP_FRAG_TBL_STAT_UPDATE(s, f, v) ((s)->f += (v))
#else
-#define IPV4_FRAG_TBL_STAT_UPDATE(s, f, v) do {} while (0)
+#define IP_FRAG_TBL_STAT_UPDATE(s, f, v) do {} while (0)
#endif /* IPV4_FRAG_TBL_STAT */
static inline void
-ipv4_frag_hash(const struct ipv4_frag_key *key, uint32_t *v1, uint32_t *v2)
+ipv4_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2)
{
uint32_t v;
const uint32_t *p;
@@ -125,9 +125,9 @@ ipv4_frag_hash(const struct ipv4_frag_key *key, uint32_t *v1, uint32_t *v2)
* Update the table, after we finish processing its entry.
*/
static inline void
-ipv4_frag_inuse(struct ipv4_frag_tbl *tbl, const struct ipv4_frag_pkt *fp)
+ip_frag_inuse(struct ip_frag_tbl *tbl, const struct ip_frag_pkt *fp)
{
- if (IPV4_FRAG_KEY_EMPTY(&fp->key)) {
+ if (IP_FRAG_KEY_EMPTY(&fp->key)) {
TAILQ_REMOVE(&tbl->lru, fp, lru);
tbl->use_entries--;
}
@@ -138,13 +138,13 @@ ipv4_frag_inuse(struct ipv4_frag_tbl *tbl, const struct ipv4_frag_pkt *fp)
* If such entry doesn't exist, will return free and/or timed-out entry,
* that can be used for that key.
*/
-static inline struct ipv4_frag_pkt *
-ipv4_frag_lookup(struct ipv4_frag_tbl *tbl,
- const struct ipv4_frag_key *key, uint64_t tms,
- struct ipv4_frag_pkt **free, struct ipv4_frag_pkt **stale)
+static inline struct ip_frag_pkt *
+ip_frag_lookup(struct ip_frag_tbl *tbl,
+ const struct ip_frag_key *key, uint64_t tms,
+ struct ip_frag_pkt **free, struct ip_frag_pkt **stale)
{
- struct ipv4_frag_pkt *p1, *p2;
- struct ipv4_frag_pkt *empty, *old;
+ struct ip_frag_pkt *p1, *p2;
+ struct ip_frag_pkt *empty, *old;
uint64_t max_cycles;
uint32_t i, assoc, sig1, sig2;
@@ -154,43 +154,43 @@ ipv4_frag_lookup(struct ipv4_frag_tbl *tbl,
max_cycles = tbl->max_cycles;
assoc = tbl->bucket_entries;
- if (tbl->last != NULL && IPV4_FRAG_KEY_CMP(&tbl->last->key, key) == 0)
+ if (tbl->last != NULL && IP_FRAG_KEY_CMP(&tbl->last->key, key) == 0)
return (tbl->last);
ipv4_frag_hash(key, &sig1, &sig2);
- p1 = IPV4_FRAG_TBL_POS(tbl, sig1);
- p2 = IPV4_FRAG_TBL_POS(tbl, sig2);
+ p1 = IP_FRAG_TBL_POS(tbl, sig1);
+ p2 = IP_FRAG_TBL_POS(tbl, sig2);
for (i = 0; i != assoc; i++) {
- IPV4_FRAG_LOG(DEBUG, "%s:%d:\n"
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
"tbl: %p, max_entries: %u, use_entries: %u\n"
- "ipv4_frag_pkt line0: %p, index: %u from %u\n"
+ "ip_frag_pkt line0: %p, index: %u from %u\n"
"key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
__func__, __LINE__,
tbl, tbl->max_entries, tbl->use_entries,
p1, i, assoc,
p1[i].key.src_dst, p1[i].key.id, p1[i].start);
- if (IPV4_FRAG_KEY_CMP(&p1[i].key, key) == 0)
+ if (IP_FRAG_KEY_CMP(&p1[i].key, key) == 0)
return (p1 + i);
- else if (IPV4_FRAG_KEY_EMPTY(&p1[i].key))
+ else if (IP_FRAG_KEY_EMPTY(&p1[i].key))
empty = (empty == NULL) ? (p1 + i) : empty;
else if (max_cycles + p1[i].start < tms)
old = (old == NULL) ? (p1 + i) : old;
- IPV4_FRAG_LOG(DEBUG, "%s:%d:\n"
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
"tbl: %p, max_entries: %u, use_entries: %u\n"
- "ipv4_frag_pkt line1: %p, index: %u from %u\n"
+ "ip_frag_pkt line1: %p, index: %u from %u\n"
"key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
__func__, __LINE__,
tbl, tbl->max_entries, tbl->use_entries,
p2, i, assoc,
p2[i].key.src_dst, p2[i].key.id, p2[i].start);
- if (IPV4_FRAG_KEY_CMP(&p2[i].key, key) == 0)
+ if (IP_FRAG_KEY_CMP(&p2[i].key, key) == 0)
return (p2 + i);
- else if (IPV4_FRAG_KEY_EMPTY(&p2[i].key))
+ else if (IP_FRAG_KEY_EMPTY(&p2[i].key))
empty = (empty == NULL) ?( p2 + i) : empty;
else if (max_cycles + p2[i].start < tms)
old = (old == NULL) ? (p2 + i) : old;
@@ -202,36 +202,36 @@ ipv4_frag_lookup(struct ipv4_frag_tbl *tbl,
}
static inline void
-ipv4_frag_tbl_del(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
- struct ipv4_frag_pkt *fp)
+ip_frag_tbl_del(struct ip_frag_tbl *tbl, struct ip_frag_death_row *dr,
+ struct ip_frag_pkt *fp)
{
- ipv4_frag_free(fp, dr);
- IPV4_FRAG_KEY_INVALIDATE(&fp->key);
+ ip_frag_free(fp, dr);
+ IP_FRAG_KEY_INVALIDATE(&fp->key);
TAILQ_REMOVE(&tbl->lru, fp, lru);
tbl->use_entries--;
- IPV4_FRAG_TBL_STAT_UPDATE(&tbl->stat, del_num, 1);
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, del_num, 1);
}
static inline void
-ipv4_frag_tbl_add(struct ipv4_frag_tbl *tbl, struct ipv4_frag_pkt *fp,
- const struct ipv4_frag_key *key, uint64_t tms)
+ip_frag_tbl_add(struct ip_frag_tbl *tbl, struct ip_frag_pkt *fp,
+ const struct ip_frag_key *key, uint64_t tms)
{
fp->key = key[0];
- ipv4_frag_reset(fp, tms);
+ ip_frag_reset(fp, tms);
TAILQ_INSERT_TAIL(&tbl->lru, fp, lru);
tbl->use_entries++;
- IPV4_FRAG_TBL_STAT_UPDATE(&tbl->stat, add_num, 1);
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, add_num, 1);
}
static inline void
-ipv4_frag_tbl_reuse(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
- struct ipv4_frag_pkt *fp, uint64_t tms)
+ip_frag_tbl_reuse(struct ip_frag_tbl *tbl, struct ip_frag_death_row *dr,
+ struct ip_frag_pkt *fp, uint64_t tms)
{
- ipv4_frag_free(fp, dr);
- ipv4_frag_reset(fp, tms);
+ ip_frag_free(fp, dr);
+ ip_frag_reset(fp, tms);
TAILQ_REMOVE(&tbl->lru, fp, lru);
TAILQ_INSERT_TAIL(&tbl->lru, fp, lru);
- IPV4_FRAG_TBL_STAT_UPDATE(&tbl->stat, reuse_num, 1);
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, reuse_num, 1);
}
/*
@@ -239,11 +239,11 @@ ipv4_frag_tbl_reuse(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
* If such entry is not present, then allocate a new one.
* If the entry is stale, then free and reuse it.
*/
-static inline struct ipv4_frag_pkt *
-ipv4_frag_find(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
- const struct ipv4_frag_key *key, uint64_t tms)
+static inline struct ip_frag_pkt *
+ip_frag_find(struct ip_frag_tbl *tbl, struct ip_frag_death_row *dr,
+ const struct ip_frag_key *key, uint64_t tms)
{
- struct ipv4_frag_pkt *pkt, *free, *stale, *lru;
+ struct ip_frag_pkt *pkt, *free, *stale, *lru;
uint64_t max_cycles;
/*
@@ -254,13 +254,13 @@ ipv4_frag_find(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
stale = NULL;
max_cycles = tbl->max_cycles;
- IPV4_FRAG_TBL_STAT_UPDATE(&tbl->stat, find_num, 1);
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, find_num, 1);
- if ((pkt = ipv4_frag_lookup(tbl, key, tms, &free, &stale)) == NULL) {
+ if ((pkt = ip_frag_lookup(tbl, key, tms, &free, &stale)) == NULL) {
/*timed-out entry, free and invalidate it*/
if (stale != NULL) {
- ipv4_frag_tbl_del(tbl, dr, stale);
+ ip_frag_tbl_del(tbl, dr, stale);
free = stale;
/*
@@ -272,17 +272,17 @@ ipv4_frag_find(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
tbl->max_entries <= tbl->use_entries) {
lru = TAILQ_FIRST(&tbl->lru);
if (max_cycles + lru->start < tms) {
- ipv4_frag_tbl_del(tbl, dr, lru);
+ ip_frag_tbl_del(tbl, dr, lru);
} else {
free = NULL;
- IPV4_FRAG_TBL_STAT_UPDATE(&tbl->stat,
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat,
fail_nospace, 1);
}
}
/* found a free entry to reuse. */
if (free != NULL) {
- ipv4_frag_tbl_add(tbl, free, key, tms);
+ ip_frag_tbl_add(tbl, free, key, tms);
pkt = free;
}
@@ -292,10 +292,10 @@ ipv4_frag_find(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
* and reuse it.
*/
} else if (max_cycles + pkt->start < tms) {
- ipv4_frag_tbl_reuse(tbl, dr, pkt, tms);
+ ip_frag_tbl_reuse(tbl, dr, pkt, tms);
}
- IPV4_FRAG_TBL_STAT_UPDATE(&tbl->stat, fail_total, (pkt == NULL));
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, fail_total, (pkt == NULL));
tbl->last = pkt;
return (pkt);
@@ -319,17 +319,17 @@ ipv4_frag_find(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
* @return
* The pointer to the new allocated mempool, on success. NULL on error.
*/
-static struct ipv4_frag_tbl *
-ipv4_frag_tbl_create(uint32_t bucket_num, uint32_t bucket_entries,
+static struct ip_frag_tbl *
+rte_ip_frag_table_create(uint32_t bucket_num, uint32_t bucket_entries,
uint32_t max_entries, uint64_t max_cycles, int socket_id)
{
- struct ipv4_frag_tbl *tbl;
+ struct ip_frag_tbl *tbl;
size_t sz;
uint64_t nb_entries;
nb_entries = rte_align32pow2(bucket_num);
nb_entries *= bucket_entries;
- nb_entries *= IPV4_FRAG_HASH_FNUM;
+ nb_entries *= IP_FRAG_HASH_FNUM;
/* check input parameters. */
if (rte_is_power_of_2(bucket_entries) == 0 ||
@@ -363,13 +363,13 @@ ipv4_frag_tbl_create(uint32_t bucket_num, uint32_t bucket_entries,
}
static inline void
-ipv4_frag_tbl_destroy( struct ipv4_frag_tbl *tbl)
+rte_ip_frag_table_destroy( struct ip_frag_tbl *tbl)
{
rte_free(tbl);
}
static void
-ipv4_frag_tbl_dump_stat(FILE *f, const struct ipv4_frag_tbl *tbl)
+rte_ip_frag_table_statistics_dump(FILE *f, const struct ip_frag_tbl *tbl)
{
uint64_t fail_total, fail_nospace;
diff --git a/lib/librte_ip_frag/rte_ipv4_rsmbl.h b/lib/librte_ip_frag/rte_ipv4_rsmbl.h
index 58ec1ee..82cb9b5 100644
--- a/lib/librte_ip_frag/rte_ipv4_rsmbl.h
+++ b/lib/librte_ip_frag/rte_ipv4_rsmbl.h
@@ -34,6 +34,8 @@
#ifndef _IPV4_RSMBL_H_
#define _IPV4_RSMBL_H_
+#include "ip_frag_common.h"
+
/**
* @file
* IPv4 reassemble
@@ -49,7 +51,7 @@ enum {
MAX_FRAG_NUM = 4,
};
-struct ipv4_frag {
+struct ip_frag {
uint16_t ofs;
uint16_t len;
struct rte_mbuf *mb;
@@ -58,15 +60,15 @@ struct ipv4_frag {
/*
* Use <src addr, dst_addr, id> to uniquely identify fragmented datagram.
*/
-struct ipv4_frag_key {
+struct ip_frag_key {
uint64_t src_dst;
uint32_t id;
};
-#define IPV4_FRAG_KEY_INVALIDATE(k) ((k)->src_dst = 0)
-#define IPV4_FRAG_KEY_EMPTY(k) ((k)->src_dst == 0)
+#define IP_FRAG_KEY_INVALIDATE(k) ((k)->src_dst = 0)
+#define IP_FRAG_KEY_EMPTY(k) ((k)->src_dst == 0)
-#define IPV4_FRAG_KEY_CMP(k1, k2) \
+#define IP_FRAG_KEY_CMP(k1, k2) \
(((k1)->src_dst ^ (k2)->src_dst) | ((k1)->id ^ (k2)->id))
@@ -74,37 +76,37 @@ struct ipv4_frag_key {
* Fragmented packet to reassemble.
* First two entries in the frags[] array are for the last and first fragments.
*/
-struct ipv4_frag_pkt {
- TAILQ_ENTRY(ipv4_frag_pkt) lru; /* LRU list */
- struct ipv4_frag_key key;
+struct ip_frag_pkt {
+ TAILQ_ENTRY(ip_frag_pkt) lru; /* LRU list */
+ struct ip_frag_key key;
uint64_t start; /* creation timestamp */
uint32_t total_size; /* expected reassembled size */
uint32_t frag_size; /* size of fragments received */
uint32_t last_idx; /* index of next entry to fill */
- struct ipv4_frag frags[MAX_FRAG_NUM];
+ struct ip_frag frags[MAX_FRAG_NUM];
} __rte_cache_aligned;
-struct ipv4_frag_death_row {
+struct ip_frag_death_row {
uint32_t cnt;
struct rte_mbuf *row[MAX_PKT_BURST * (MAX_FRAG_NUM + 1)];
};
-#define IPV4_FRAG_MBUF2DR(dr, mb) ((dr)->row[(dr)->cnt++] = (mb))
+#define IP_FRAG_MBUF2DR(dr, mb) ((dr)->row[(dr)->cnt++] = (mb))
/* logging macros. */
-#ifdef IPV4_FRAG_DEBUG
-#define IPV4_FRAG_LOG(lvl, fmt, args...) RTE_LOG(lvl, USER1, fmt, ##args)
+#ifdef IP_FRAG_DEBUG
+#define IP_FRAG_LOG(lvl, fmt, args...) RTE_LOG(lvl, USER1, fmt, ##args)
#else
-#define IPV4_FRAG_LOG(lvl, fmt, args...) do {} while(0)
-#endif /* IPV4_FRAG_DEBUG */
+#define IP_FRAG_LOG(lvl, fmt, args...) do {} while(0)
+#endif /* IP_FRAG_DEBUG */
static inline void
-ipv4_frag_reset(struct ipv4_frag_pkt *fp, uint64_t tms)
+ip_frag_reset(struct ip_frag_pkt *fp, uint64_t tms)
{
- static const struct ipv4_frag zero_frag = {
+ static const struct ip_frag zero_frag = {
.ofs = 0,
.len = 0,
.mb = NULL,
@@ -119,7 +121,7 @@ ipv4_frag_reset(struct ipv4_frag_pkt *fp, uint64_t tms)
}
static inline void
-ipv4_frag_free(struct ipv4_frag_pkt *fp, struct ipv4_frag_death_row *dr)
+ip_frag_free(struct ip_frag_pkt *fp, struct ip_frag_death_row *dr)
{
uint32_t i, k;
@@ -136,7 +138,7 @@ ipv4_frag_free(struct ipv4_frag_pkt *fp, struct ipv4_frag_death_row *dr)
}
static inline void
-ipv4_frag_free_death_row(struct ipv4_frag_death_row *dr, uint32_t prefetch)
+rte_ip_frag_free_death_row(struct ip_frag_death_row *dr, uint32_t prefetch)
{
uint32_t i, k, n;
@@ -163,7 +165,7 @@ ipv4_frag_free_death_row(struct ipv4_frag_death_row *dr, uint32_t prefetch)
* chains them into one mbuf.
*/
static inline void
-ipv4_frag_chain(struct rte_mbuf *mn, struct rte_mbuf *mp)
+ip_frag_chain(struct rte_mbuf *mn, struct rte_mbuf *mp)
{
struct rte_mbuf *ms;
@@ -188,7 +190,7 @@ ipv4_frag_chain(struct rte_mbuf *mn, struct rte_mbuf *mp)
* Reassemble fragments into one packet.
*/
static inline struct rte_mbuf *
-ipv4_frag_reassemble(const struct ipv4_frag_pkt *fp)
+ipv4_frag_reassemble(const struct ip_frag_pkt *fp)
{
struct ipv4_hdr *ip_hdr;
struct rte_mbuf *m, *prev;
@@ -210,7 +212,7 @@ ipv4_frag_reassemble(const struct ipv4_frag_pkt *fp)
/* previous fragment found. */
if(fp->frags[i].ofs + fp->frags[i].len == ofs) {
- ipv4_frag_chain(fp->frags[i].mb, m);
+ ip_frag_chain(fp->frags[i].mb, m);
/* update our last fragment and offset. */
m = fp->frags[i].mb;
@@ -225,14 +227,14 @@ ipv4_frag_reassemble(const struct ipv4_frag_pkt *fp)
}
/* chain with the first fragment. */
- ipv4_frag_chain(fp->frags[FIRST_FRAG_IDX].mb, m);
+ ip_frag_chain(fp->frags[FIRST_FRAG_IDX].mb, m);
m = fp->frags[FIRST_FRAG_IDX].mb;
/* update mbuf fields for reassembled packet. */
m->ol_flags |= PKT_TX_IP_CKSUM;
/* update ipv4 header for the reassembled packet */
- ip_hdr = (struct ipv4_hdr*)(rte_pktmbuf_mtod(m, uint8_t *) +
+ ip_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(m, uint8_t *) +
m->pkt.vlan_macip.f.l2_len);
ip_hdr->total_length = rte_cpu_to_be_16((uint16_t)(fp->total_size +
@@ -245,7 +247,7 @@ ipv4_frag_reassemble(const struct ipv4_frag_pkt *fp)
}
static inline struct rte_mbuf *
-ipv4_frag_process(struct ipv4_frag_pkt *fp, struct ipv4_frag_death_row *dr,
+ip_frag_process(struct ip_frag_pkt *fp, struct ip_frag_death_row *dr,
struct rte_mbuf *mb, uint16_t ofs, uint16_t len, uint16_t more_frags)
{
uint32_t idx;
@@ -276,7 +278,7 @@ ipv4_frag_process(struct ipv4_frag_pkt *fp, struct ipv4_frag_death_row *dr,
if (idx >= sizeof (fp->frags) / sizeof (fp->frags[0])) {
/* report an error. */
- IPV4_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
+ IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
"ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, "
"total_size: %u, frag_size: %u, last_idx: %u\n"
"first fragment: ofs: %u, len: %u\n"
@@ -290,9 +292,9 @@ ipv4_frag_process(struct ipv4_frag_pkt *fp, struct ipv4_frag_death_row *dr,
fp->frags[LAST_FRAG_IDX].len);
/* free all fragments, invalidate the entry. */
- ipv4_frag_free(fp, dr);
- IPV4_FRAG_KEY_INVALIDATE(&fp->key);
- IPV4_FRAG_MBUF2DR(dr, mb);
+ ip_frag_free(fp, dr);
+ IP_FRAG_KEY_INVALIDATE(&fp->key);
+ IP_FRAG_MBUF2DR(dr, mb);
return (NULL);
}
@@ -317,7 +319,7 @@ ipv4_frag_process(struct ipv4_frag_pkt *fp, struct ipv4_frag_death_row *dr,
if (mb == NULL) {
/* report an error. */
- IPV4_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
+ IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
"ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, "
"total_size: %u, frag_size: %u, last_idx: %u\n"
"first fragment: ofs: %u, len: %u\n"
@@ -331,11 +333,11 @@ ipv4_frag_process(struct ipv4_frag_pkt *fp, struct ipv4_frag_death_row *dr,
fp->frags[LAST_FRAG_IDX].len);
/* free associated resources. */
- ipv4_frag_free(fp, dr);
+ ip_frag_free(fp, dr);
}
/* we are done with that entry, invalidate it. */
- IPV4_FRAG_KEY_INVALIDATE(&fp->key);
+ IP_FRAG_KEY_INVALIDATE(&fp->key);
return (mb);
}
@@ -362,12 +364,12 @@ ipv4_frag_process(struct ipv4_frag_pkt *fp, struct ipv4_frag_death_row *dr,
* - not all fragments of the packet are collected yet.
*/
static inline struct rte_mbuf *
-ipv4_frag_mbuf(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
- struct rte_mbuf *mb, uint64_t tms, struct ipv4_hdr *ip_hdr,
- uint16_t ip_ofs, uint16_t ip_flag)
+rte_ipv4_reassemble_packet(struct ip_frag_tbl *tbl,
+ struct ip_frag_death_row *dr, struct rte_mbuf *mb, uint64_t tms,
+ struct ipv4_hdr *ip_hdr, uint16_t ip_ofs, uint16_t ip_flag)
{
- struct ipv4_frag_pkt *fp;
- struct ipv4_frag_key key;
+ struct ip_frag_pkt *fp;
+ struct ip_frag_key key;
const uint64_t *psd;
uint16_t ip_len;
@@ -379,7 +381,7 @@ ipv4_frag_mbuf(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
ip_len = (uint16_t)(rte_be_to_cpu_16(ip_hdr->total_length) -
mb->pkt.vlan_macip.f.l3_len);
- IPV4_FRAG_LOG(DEBUG, "%s:%d:\n"
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
"mbuf: %p, tms: %" PRIu64
", key: <%" PRIx64 ", %#x>, ofs: %u, len: %u, flags: %#x\n"
"tbl: %p, max_cycles: %" PRIu64 ", entry_mask: %#x, "
@@ -390,12 +392,12 @@ ipv4_frag_mbuf(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
tbl->use_entries);
/* try to find/add entry into the fragment's table. */
- if ((fp = ipv4_frag_find(tbl, dr, &key, tms)) == NULL) {
- IPV4_FRAG_MBUF2DR(dr, mb);
- return (NULL);
+ if ((fp = ip_frag_find(tbl, dr, &key, tms)) == NULL) {
+ IP_FRAG_MBUF2DR(dr, mb);
+ return NULL;
}
- IPV4_FRAG_LOG(DEBUG, "%s:%d:\n"
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
"tbl: %p, max_entries: %u, use_entries: %u\n"
"ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, start: %" PRIu64
", total_size: %u, frag_size: %u, last_idx: %u\n\n",
@@ -406,10 +408,10 @@ ipv4_frag_mbuf(struct ipv4_frag_tbl *tbl, struct ipv4_frag_death_row *dr,
/* process the fragmented packet. */
- mb = ipv4_frag_process(fp, dr, mb, ip_ofs, ip_len, ip_flag);
- ipv4_frag_inuse(tbl, fp);
+ mb = ip_frag_process(fp, dr, mb, ip_ofs, ip_len, ip_flag);
+ ip_frag_inuse(tbl, fp);
- IPV4_FRAG_LOG(DEBUG, "%s:%d:\n"
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
"mbuf: %p\n"
"tbl: %p, max_entries: %u, use_entries: %u\n"
"ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, start: %" PRIu64
--
1.8.1.4
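The table described in this patch places each key in one of two hash-selected bucket lines and compares keys with a branch-free XOR. A standalone sketch of those two ideas, using illustrative names rather than the library's real ones (only PRIME_VALUE and the key layout come from the patch; the hash mix itself is an assumption):

```c
#include <assert.h>
#include <stdint.h>

/* Key layout as in ip_frag_key: <src/dst addresses, packet id>. */
struct frag_key {
	uint64_t src_dst;
	uint32_t id;
};

/* XOR-based compare in the spirit of IP_FRAG_KEY_CMP:
 * the result is 0 if and only if the keys are equal. */
static uint64_t key_cmp(const struct frag_key *k1, const struct frag_key *k2)
{
	return (k1->src_dst ^ k2->src_dst) | (k1->id ^ k2->id);
}

#define PRIME_VALUE 0xeaad8405 /* same constant as in ipv4_frag_tbl.h */

/* Derive two candidate bucket indices from one key, giving each key
 * 2 * bucket_entries possible slots (the simplified cuckoo scheme
 * described above). */
static void frag_hash2(const struct frag_key *k, uint32_t mask,
	uint32_t *i1, uint32_t *i2)
{
	uint32_t v = (uint32_t)k->src_dst ^ (uint32_t)(k->src_dst >> 32) ^ k->id;

	*i1 = v & mask;
	*i2 = (v * PRIME_VALUE) & mask;
}
```

On lookup both lines are scanned; a match returns the entry, while an empty or timed-out slot is remembered as a candidate for insertion, exactly as in `ip_frag_lookup()` above.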
* [dpdk-dev] [PATCH 07/13] ip_frag: refactored reassembly code and made it a proper library
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
` (6 preceding siblings ...)
2014-05-28 17:32 ` [dpdk-dev] [PATCH 06/13] ip_frag: renaming structures in fragmentation table to be more generic Anatoly Burakov
@ 2014-05-28 17:32 ` Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 08/13] ip_frag: renamed ipv4 frag function Anatoly Burakov
` (8 subsequent siblings)
16 siblings, 0 replies; 18+ messages in thread
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
config/common_bsdapp | 2 +
config/common_linuxapp | 2 +
examples/ip_reassembly/main.c | 24 +-
lib/librte_ip_frag/Makefile | 6 +-
lib/librte_ip_frag/ip_frag_common.h | 134 +++++++++-
lib/librte_ip_frag/ip_frag_internal.c | 337 ++++++++++++++++++++++++
lib/librte_ip_frag/ipv4_frag_tbl.h | 400 -----------------------------
lib/librte_ip_frag/rte_ip_frag.h | 223 +++++++++++++++-
lib/librte_ip_frag/rte_ip_frag_common.c | 142 ++++++++++
lib/librte_ip_frag/rte_ipv4_reassembly.c | 189 ++++++++++++++
lib/librte_ip_frag/rte_ipv4_rsmbl.h | 427 -------------------------------
11 files changed, 1023 insertions(+), 863 deletions(-)
create mode 100644 lib/librte_ip_frag/ip_frag_internal.c
delete mode 100644 lib/librte_ip_frag/ipv4_frag_tbl.h
create mode 100644 lib/librte_ip_frag/rte_ip_frag_common.c
create mode 100644 lib/librte_ip_frag/rte_ipv4_reassembly.c
delete mode 100644 lib/librte_ip_frag/rte_ipv4_rsmbl.h
diff --git a/config/common_bsdapp b/config/common_bsdapp
index d30802e..be56ca7 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -261,6 +261,8 @@ CONFIG_RTE_LIBRTE_NET=y
# Compile librte_net
#
CONFIG_RTE_LIBRTE_IP_FRAG=y
+CONFIG_RTE_LIBRTE_IP_FRAG_DEBUG=n
+CONFIG_RTE_LIBRTE_IP_FRAG_MAX_FRAG=4
#
# Compile librte_meter
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 074d961..4d58496 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -288,6 +288,8 @@ CONFIG_RTE_LIBRTE_NET=y
# Compile librte_net
#
CONFIG_RTE_LIBRTE_IP_FRAG=y
+CONFIG_RTE_LIBRTE_IP_FRAG_DEBUG=n
+CONFIG_RTE_LIBRTE_IP_FRAG_MAX_FRAG=4
#
# Compile librte_meter
diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
index 23ec4be..6c40d76 100644
--- a/examples/ip_reassembly/main.c
+++ b/examples/ip_reassembly/main.c
@@ -94,7 +94,7 @@
#define MAX_PKT_BURST 32
-#include "rte_ipv4_rsmbl.h"
+#include "rte_ip_frag.h"
#ifndef IPv6_BYTES
#define IPv6_BYTES_FMT "%02x%02x:%02x%02x:%02x%02x:%02x%02x:"\
@@ -407,9 +407,9 @@ struct lcore_conf {
#else
lookup_struct_t * ipv6_lookup_struct;
#endif
- struct ip_frag_tbl *frag_tbl[MAX_RX_QUEUE_PER_LCORE];
+ struct rte_ip_frag_tbl *frag_tbl[MAX_RX_QUEUE_PER_LCORE];
struct rte_mempool *pool[MAX_RX_QUEUE_PER_LCORE];
- struct ip_frag_death_row death_row;
+ struct rte_ip_frag_death_row death_row;
struct mbuf_table *tx_mbufs[MAX_PORTS];
struct tx_lcore_stat tx_stat;
} __rte_cache_aligned;
@@ -645,7 +645,6 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
struct ipv4_hdr *ipv4_hdr;
void *d_addr_bytes;
uint8_t dst_port;
- uint16_t flag_offset, ip_flag, ip_ofs;
eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
@@ -665,16 +664,12 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
++(ipv4_hdr->hdr_checksum);
#endif
- flag_offset = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
- ip_ofs = (uint16_t)(flag_offset & IPV4_HDR_OFFSET_MASK);
- ip_flag = (uint16_t)(flag_offset & IPV4_HDR_MF_FLAG);
-
/* if it is a fragmented packet, then try to reassemble. */
- if (ip_flag != 0 || ip_ofs != 0) {
+ if (rte_ipv4_frag_pkt_is_fragmented(ipv4_hdr)) {
struct rte_mbuf *mo;
- struct ip_frag_tbl *tbl;
- struct ip_frag_death_row *dr;
+ struct rte_ip_frag_tbl *tbl;
+ struct rte_ip_frag_death_row *dr;
tbl = qconf->frag_tbl[queue];
dr = &qconf->death_row;
@@ -684,8 +679,8 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
m->pkt.vlan_macip.f.l3_len = sizeof(*ipv4_hdr);
/* process this fragment. */
- if ((mo = rte_ipv4_reassemble_packet(tbl, dr, m, tms, ipv4_hdr,
- ip_ofs, ip_flag)) == NULL)
+ if ((mo = rte_ipv4_frag_reassemble_packet(tbl, dr, m, tms,
+ ipv4_hdr)) == NULL)
/* no packet to send out. */
return;
@@ -1469,7 +1464,8 @@ setup_queue_tbl(struct lcore_conf *qconf, uint32_t lcore, int socket,
* Plus, each TX queue can hold up to <max_flow_num> packets.
*/
- nb_mbuf = 2 * RTE_MAX(max_flow_num, 2UL * MAX_PKT_BURST) * MAX_FRAG_NUM;
+ nb_mbuf = 2 * RTE_MAX(max_flow_num, 2UL * MAX_PKT_BURST) *
+ RTE_LIBRTE_IP_FRAG_MAX_FRAG;
nb_mbuf *= (port_conf.rxmode.max_rx_pkt_len + BUF_SIZE - 1) / BUF_SIZE;
nb_mbuf += RTE_TEST_RX_DESC_DEFAULT + RTE_TEST_TX_DESC_DEFAULT;
diff --git a/lib/librte_ip_frag/Makefile b/lib/librte_ip_frag/Makefile
index 13a83b1..022092d 100644
--- a/lib/librte_ip_frag/Makefile
+++ b/lib/librte_ip_frag/Makefile
@@ -39,11 +39,13 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
#source files
SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_fragmentation.c
+SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c
+SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ip_frag_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += ip_frag_internal.c
# install this header file
SYMLINK-$(CONFIG_RTE_LIBRTE_IP_FRAG)-include += rte_ip_frag.h
-SYMLINK-$(CONFIG_RTE_LIBRTE_IP_FRAG)-include += ipv4_frag_tbl.h
-SYMLINK-$(CONFIG_RTE_LIBRTE_IP_FRAG)-include += rte_ipv4_rsmbl.h
+
# this library depends on rte_ether
DEPDIRS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += lib/librte_mempool lib/librte_ether
diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h
index 6d4706a..3e588a0 100644
--- a/lib/librte_ip_frag/ip_frag_common.h
+++ b/lib/librte_ip_frag/ip_frag_common.h
@@ -36,19 +36,141 @@
#include "rte_ip_frag.h"
-/* Debug on/off */
-#ifdef RTE_IP_FRAG_DEBUG
+/* logging macros. */
+#ifdef RTE_LIBRTE_IP_FRAG_DEBUG
+
+#define IP_FRAG_LOG(lvl, fmt, args...) RTE_LOG(lvl, USER1, fmt, ##args)
#define RTE_IP_FRAG_ASSERT(exp) \
if (!(exp)) { \
rte_panic("function %s, line%d\tassert \"" #exp "\" failed\n", \
__func__, __LINE__); \
}
+#else
+#define IP_FRAG_LOG(lvl, fmt, args...) do {} while (0)
+#define RTE_IP_FRAG_ASSERT(exp) do {} while (0)
+#endif /* RTE_LIBRTE_IP_FRAG_DEBUG */
+
+/* helper macros */
+#define IP_FRAG_MBUF2DR(dr, mb) ((dr)->row[(dr)->cnt++] = (mb))
+
+/* internal functions declarations */
+struct rte_mbuf * ip_frag_process(struct rte_ip_frag_pkt *fp,
+ struct rte_ip_frag_death_row *dr, struct rte_mbuf *mb,
+ uint16_t ofs, uint16_t len, uint16_t more_frags);
+
+struct rte_ip_frag_pkt * ip_frag_find(struct rte_ip_frag_tbl *tbl,
+ struct rte_ip_frag_death_row *dr,
+ const struct ip_frag_key *key, uint64_t tms);
+
+struct rte_ip_frag_pkt * ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
+ const struct ip_frag_key *key, uint64_t tms,
+ struct rte_ip_frag_pkt **free, struct rte_ip_frag_pkt **stale);
+
+/* this function needs to be declared here as ip_frag_process relies on it */
+struct rte_mbuf * ipv4_frag_reassemble(const struct rte_ip_frag_pkt *fp);
+
+
+
+/*
+ * misc frag key functions
+ */
+
+/* check if key is empty */
+static inline int
+ip_frag_key_is_empty(const struct ip_frag_key * key)
+{
+ if (key->src_dst != 0)
+ return 0;
+ return 1;
+}
-#else /*RTE_IP_FRAG_DEBUG*/
+/* empty the key */
+static inline void
+ip_frag_key_invalidate(struct ip_frag_key * key)
+{
+ key->src_dst = 0;
+}
+
+/* compare two keys */
+static inline int
+ip_frag_key_cmp(const struct ip_frag_key * k1, const struct ip_frag_key * k2)
+{
+ return k1->src_dst ^ k2->src_dst;
+}
-#define RTE_IP_FRAG_ASSERT(exp) do { } while (0)
+/*
+ * misc fragment functions
+ */
+
+/* put fragment on death row */
+static inline void
+ip_frag_free(struct rte_ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr)
+{
+ uint32_t i, k;
+
+ k = dr->cnt;
+ for (i = 0; i != fp->last_idx; i++) {
+ if (fp->frags[i].mb != NULL) {
+ dr->row[k++] = fp->frags[i].mb;
+ fp->frags[i].mb = NULL;
+ }
+ }
+
+ fp->last_idx = 0;
+ dr->cnt = k;
+}
+
+/* if key is empty, mark key as in use */
+static inline void
+ip_frag_inuse(struct rte_ip_frag_tbl *tbl, const struct rte_ip_frag_pkt *fp)
+{
+ if (ip_frag_key_is_empty(&fp->key)) {
+ TAILQ_REMOVE(&tbl->lru, fp, lru);
+ tbl->use_entries--;
+ }
+}
+
+/* reset the fragment */
+static inline void
+ip_frag_reset(struct rte_ip_frag_pkt *fp, uint64_t tms)
+{
+ static const struct ip_frag zero_frag = {
+ .ofs = 0,
+ .len = 0,
+ .mb = NULL,
+ };
+
+ fp->start = tms;
+ fp->total_size = UINT32_MAX;
+ fp->frag_size = 0;
+ fp->last_idx = IP_MIN_FRAG_NUM;
+ fp->frags[IP_LAST_FRAG_IDX] = zero_frag;
+ fp->frags[IP_FIRST_FRAG_IDX] = zero_frag;
+}
+
+/* chain two mbufs */
+static inline void
+ip_frag_chain(struct rte_mbuf *mn, struct rte_mbuf *mp)
+{
+ struct rte_mbuf *ms;
+
+ /* adjust start of the last fragment data. */
+ rte_pktmbuf_adj(mp, (uint16_t)(mp->pkt.vlan_macip.f.l2_len +
+ mp->pkt.vlan_macip.f.l3_len));
+
+ /* chain two fragments. */
+ ms = rte_pktmbuf_lastseg(mn);
+ ms->pkt.next = mp;
+
+ /* accumulate number of segments and total length. */
+ mn->pkt.nb_segs = (uint8_t)(mn->pkt.nb_segs + mp->pkt.nb_segs);
+ mn->pkt.pkt_len += mp->pkt.pkt_len;
+
+ /* reset pkt_len and nb_segs for chained fragment. */
+ mp->pkt.pkt_len = mp->pkt.data_len;
+ mp->pkt.nb_segs = 1;
+}
-#endif /*RTE_IP_FRAG_DEBUG*/
-#endif
+#endif /* _IP_FRAG_COMMON_H_ */
diff --git a/lib/librte_ip_frag/ip_frag_internal.c b/lib/librte_ip_frag/ip_frag_internal.c
new file mode 100644
index 0000000..2f5a4b8
--- /dev/null
+++ b/lib/librte_ip_frag/ip_frag_internal.c
@@ -0,0 +1,337 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+
+#include <rte_byteorder.h>
+#include <rte_jhash.h>
+#ifdef RTE_MACHINE_CPUFLAG_SSE4_2
+#include <rte_hash_crc.h>
+#endif /* RTE_MACHINE_CPUFLAG_SSE4_2 */
+
+#include "rte_ip_frag.h"
+#include "ip_frag_common.h"
+
+#define PRIME_VALUE 0xeaad8405
+
+#define IP_FRAG_TBL_POS(tbl, sig) \
+ ((tbl)->pkt + ((sig) & (tbl)->entry_mask))
+
+#ifdef RTE_LIBRTE_IP_FRAG_TBL_STAT
+#define IP_FRAG_TBL_STAT_UPDATE(s, f, v) ((s)->f += (v))
+#else
+#define IP_FRAG_TBL_STAT_UPDATE(s, f, v) do {} while (0)
+#endif /* RTE_LIBRTE_IP_FRAG_TBL_STAT */
+
+/* local frag table helper functions */
+static inline void
+ip_frag_tbl_del(struct rte_ip_frag_tbl *tbl, struct rte_ip_frag_death_row *dr,
+ struct rte_ip_frag_pkt *fp)
+{
+ ip_frag_free(fp, dr);
+ ip_frag_key_invalidate(&fp->key);
+ TAILQ_REMOVE(&tbl->lru, fp, lru);
+ tbl->use_entries--;
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, del_num, 1);
+}
+
+static inline void
+ip_frag_tbl_add(struct rte_ip_frag_tbl *tbl, struct rte_ip_frag_pkt *fp,
+ const struct ip_frag_key *key, uint64_t tms)
+{
+ fp->key = key[0];
+ ip_frag_reset(fp, tms);
+ TAILQ_INSERT_TAIL(&tbl->lru, fp, lru);
+ tbl->use_entries++;
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, add_num, 1);
+}
+
+static inline void
+ip_frag_tbl_reuse(struct rte_ip_frag_tbl *tbl, struct rte_ip_frag_death_row *dr,
+ struct rte_ip_frag_pkt *fp, uint64_t tms)
+{
+ ip_frag_free(fp, dr);
+ ip_frag_reset(fp, tms);
+ TAILQ_REMOVE(&tbl->lru, fp, lru);
+ TAILQ_INSERT_TAIL(&tbl->lru, fp, lru);
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, reuse_num, 1);
+}
+
+
+static inline void
+ipv4_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2)
+{
+ uint32_t v;
+ const uint32_t *p;
+
+ p = (const uint32_t *)&key->src_dst;
+
+#ifdef RTE_MACHINE_CPUFLAG_SSE4_2
+ v = rte_hash_crc_4byte(p[0], PRIME_VALUE);
+ v = rte_hash_crc_4byte(p[1], v);
+ v = rte_hash_crc_4byte(key->id, v);
+#else
+
+ v = rte_jhash_3words(p[0], p[1], key->id, PRIME_VALUE);
+#endif /* RTE_MACHINE_CPUFLAG_SSE4_2 */
+
+ *v1 = v;
+ *v2 = (v << 7) + (v >> 14);
+}
+
+struct rte_mbuf *
+ip_frag_process(struct rte_ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
+ struct rte_mbuf *mb, uint16_t ofs, uint16_t len, uint16_t more_frags)
+{
+ uint32_t idx;
+
+ fp->frag_size += len;
+
+ /* this is the first fragment. */
+ if (ofs == 0) {
+ idx = (fp->frags[IP_FIRST_FRAG_IDX].mb == NULL) ?
+ IP_FIRST_FRAG_IDX : UINT32_MAX;
+
+ /* this is the last fragment. */
+ } else if (more_frags == 0) {
+ fp->total_size = ofs + len;
+ idx = (fp->frags[IP_LAST_FRAG_IDX].mb == NULL) ?
+ IP_LAST_FRAG_IDX : UINT32_MAX;
+
+ /* this is the intermediate fragment. */
+ } else if ((idx = fp->last_idx) <
+ sizeof (fp->frags) / sizeof (fp->frags[0])) {
+ fp->last_idx++;
+ }
+
+ /*
+ * erroneous packet: either exceeded the max allowed number of fragments,
+ * or a duplicate first/last fragment was encountered.
+ */
+ if (idx >= sizeof (fp->frags) / sizeof (fp->frags[0])) {
+
+ /* report an error. */
+ IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
+ "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, "
+ "total_size: %u, frag_size: %u, last_idx: %u\n"
+ "first fragment: ofs: %u, len: %u\n"
+ "last fragment: ofs: %u, len: %u\n\n",
+ __func__, __LINE__,
+ fp, fp->key.src_dst, fp->key.id,
+ fp->total_size, fp->frag_size, fp->last_idx,
+ fp->frags[IP_FIRST_FRAG_IDX].ofs,
+ fp->frags[IP_FIRST_FRAG_IDX].len,
+ fp->frags[IP_LAST_FRAG_IDX].ofs,
+ fp->frags[IP_LAST_FRAG_IDX].len);
+
+ /* free all fragments, invalidate the entry. */
+ ip_frag_free(fp, dr);
+ ip_frag_key_invalidate(&fp->key);
+ IP_FRAG_MBUF2DR(dr, mb);
+
+ return (NULL);
+ }
+
+ fp->frags[idx].ofs = ofs;
+ fp->frags[idx].len = len;
+ fp->frags[idx].mb = mb;
+
+ mb = NULL;
+
+ /* not all fragments are collected yet. */
+ if (likely (fp->frag_size < fp->total_size)) {
+ return (mb);
+
+ /* if we collected all fragments, then try to reassemble. */
+ } else if (fp->frag_size == fp->total_size &&
+ fp->frags[IP_FIRST_FRAG_IDX].mb != NULL)
+ mb = ipv4_frag_reassemble(fp);
+
+ /* erroneous set of fragments. */
+ if (mb == NULL) {
+
+ /* report an error. */
+ IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
+ "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, "
+ "total_size: %u, frag_size: %u, last_idx: %u\n"
+ "first fragment: ofs: %u, len: %u\n"
+ "last fragment: ofs: %u, len: %u\n\n",
+ __func__, __LINE__,
+ fp, fp->key.src_dst, fp->key.id,
+ fp->total_size, fp->frag_size, fp->last_idx,
+ fp->frags[IP_FIRST_FRAG_IDX].ofs,
+ fp->frags[IP_FIRST_FRAG_IDX].len,
+ fp->frags[IP_LAST_FRAG_IDX].ofs,
+ fp->frags[IP_LAST_FRAG_IDX].len);
+
+ /* free associated resources. */
+ ip_frag_free(fp, dr);
+ }
+
+ /* we are done with that entry, invalidate it. */
+ ip_frag_key_invalidate(&fp->key);
+ return (mb);
+}
+
+
+/*
+ * Find an entry in the table for the corresponding fragment.
+ * If such entry is not present, then allocate a new one.
+ * If the entry is stale, then free and reuse it.
+ */
+struct rte_ip_frag_pkt *
+ip_frag_find(struct rte_ip_frag_tbl *tbl, struct rte_ip_frag_death_row *dr,
+ const struct ip_frag_key *key, uint64_t tms)
+{
+ struct rte_ip_frag_pkt *pkt, *free, *stale, *lru;
+ uint64_t max_cycles;
+
+ /*
+ * Actually the two lines below are totally redundant.
+ * They are here just to make gcc 4.6 happy.
+ */
+ free = NULL;
+ stale = NULL;
+ max_cycles = tbl->max_cycles;
+
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, find_num, 1);
+
+ if ((pkt = ip_frag_lookup(tbl, key, tms, &free, &stale)) == NULL) {
+
+ /*timed-out entry, free and invalidate it*/
+ if (stale != NULL) {
+ ip_frag_tbl_del(tbl, dr, stale);
+ free = stale;
+
+ /*
+ * we found a free entry, check if we can use it.
+ * If we run out of free entries in the table, then
+ * check if we have a timed out entry to delete.
+ */
+ } else if (free != NULL &&
+ tbl->max_entries <= tbl->use_entries) {
+ lru = TAILQ_FIRST(&tbl->lru);
+ if (max_cycles + lru->start < tms) {
+ ip_frag_tbl_del(tbl, dr, lru);
+ } else {
+ free = NULL;
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat,
+ fail_nospace, 1);
+ }
+ }
+
+ /* found a free entry to reuse. */
+ if (free != NULL) {
+ ip_frag_tbl_add(tbl, free, key, tms);
+ pkt = free;
+ }
+
+ /*
+ * we found the flow, but it is already timed out,
+ * so free associated resources, reposition it in the LRU list,
+ * and reuse it.
+ */
+ } else if (max_cycles + pkt->start < tms) {
+ ip_frag_tbl_reuse(tbl, dr, pkt, tms);
+ }
+
+ IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, fail_total, (pkt == NULL));
+
+ tbl->last = pkt;
+ return (pkt);
+}
+
+struct rte_ip_frag_pkt *
+ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
+ const struct ip_frag_key *key, uint64_t tms,
+ struct rte_ip_frag_pkt **free, struct rte_ip_frag_pkt **stale)
+{
+ struct rte_ip_frag_pkt *p1, *p2;
+ struct rte_ip_frag_pkt *empty, *old;
+ uint64_t max_cycles;
+ uint32_t i, assoc, sig1, sig2;
+
+ empty = NULL;
+ old = NULL;
+
+ max_cycles = tbl->max_cycles;
+ assoc = tbl->bucket_entries;
+
+ if (tbl->last != NULL && ip_frag_key_cmp(&tbl->last->key, key) == 0)
+ return (tbl->last);
+
+ ipv4_frag_hash(key, &sig1, &sig2);
+
+ p1 = IP_FRAG_TBL_POS(tbl, sig1);
+ p2 = IP_FRAG_TBL_POS(tbl, sig2);
+
+ for (i = 0; i != assoc; i++) {
+
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "tbl: %p, max_entries: %u, use_entries: %u\n"
+ "ip_frag_pkt line0: %p, index: %u from %u\n"
+ "key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
+ __func__, __LINE__,
+ tbl, tbl->max_entries, tbl->use_entries,
+ p1, i, assoc,
+ p1[i].key.src_dst, p1[i].key.id, p1[i].start);
+
+ if (ip_frag_key_cmp(&p1[i].key, key) == 0)
+ return (p1 + i);
+ else if (ip_frag_key_is_empty(&p1[i].key))
+ empty = (empty == NULL) ? (p1 + i) : empty;
+ else if (max_cycles + p1[i].start < tms)
+ old = (old == NULL) ? (p1 + i) : old;
+
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "tbl: %p, max_entries: %u, use_entries: %u\n"
+ "ip_frag_pkt line1: %p, index: %u from %u\n"
+ "key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
+ __func__, __LINE__,
+ tbl, tbl->max_entries, tbl->use_entries,
+ p2, i, assoc,
+ p2[i].key.src_dst, p2[i].key.id, p2[i].start);
+
+ if (ip_frag_key_cmp(&p2[i].key, key) == 0)
+ return (p2 + i);
+ else if (ip_frag_key_is_empty(&p2[i].key))
+ empty = (empty == NULL) ? (p2 + i) : empty;
+ else if (max_cycles + p2[i].start < tms)
+ old = (old == NULL) ? (p2 + i) : old;
+ }
+
+ *free = empty;
+ *stale = old;
+ return (NULL);
+}
diff --git a/lib/librte_ip_frag/ipv4_frag_tbl.h b/lib/librte_ip_frag/ipv4_frag_tbl.h
deleted file mode 100644
index fa3291d..0000000
--- a/lib/librte_ip_frag/ipv4_frag_tbl.h
+++ /dev/null
@@ -1,400 +0,0 @@
-/*-
- * BSD LICENSE
- *
- * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- * All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * * Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in
- * the documentation and/or other materials provided with the
- * distribution.
- * * Neither the name of Intel Corporation nor the names of its
- * contributors may be used to endorse or promote products derived
- * from this software without specific prior written permission.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _IPV4_FRAG_TBL_H_
-#define _IPV4_FRAG_TBL_H_
-
-/**
- * @file
- * IPv4 fragments table.
- *
- * Implementation of IPv4 fragment table create/destroy/find/update.
- *
- */
-
-/*
- * The ip_frag_tbl is a simple hash table:
- * The basic idea is to use two hash functions and <bucket_entries>
- * associativity. This provides 2 * <bucket_entries> possible locations in
- * the hash table for each key. Sort of simplified Cuckoo hashing,
- * when the collision occurs and all 2 * <bucket_entries> are occupied,
- * instead of resinserting existing keys into alternative locations, we just
- * return a faiure.
- * Another thing timing: entries that resides in the table longer then
- * <max_cycles> are considered as invalid, and could be removed/replaced
- * byt the new ones.
- * <key, data> pair is stored together, all add/update/lookup opearions are not
- * MT safe.
- */
-
-#include <rte_jhash.h>
-#ifdef RTE_MACHINE_CPUFLAG_SSE4_2
-#include <rte_hash_crc.h>
-#endif /* RTE_MACHINE_CPUFLAG_SSE4_2 */
-
-#define PRIME_VALUE 0xeaad8405
-
-TAILQ_HEAD(ip_pkt_list, ip_frag_pkt);
-
-struct ip_frag_tbl_stat {
- uint64_t find_num; /* total # of find/insert attempts. */
- uint64_t add_num; /* # of add ops. */
- uint64_t del_num; /* # of del ops. */
- uint64_t reuse_num; /* # of reuse (del/add) ops. */
- uint64_t fail_total; /* total # of add failures. */
- uint64_t fail_nospace; /* # of 'no space' add failures. */
-} __rte_cache_aligned;
-
-struct ip_frag_tbl {
- uint64_t max_cycles; /* ttl for table entries. */
- uint32_t entry_mask; /* hash value mask. */
- uint32_t max_entries; /* max entries allowed. */
- uint32_t use_entries; /* entries in use. */
- uint32_t bucket_entries; /* hash assocaitivity. */
- uint32_t nb_entries; /* total size of the table. */
- uint32_t nb_buckets; /* num of associativity lines. */
- struct ip_frag_pkt *last; /* last used entry. */
- struct ip_pkt_list lru; /* LRU list for table entries. */
- struct ip_frag_tbl_stat stat; /* statistics counters. */
- struct ip_frag_pkt pkt[0]; /* hash table. */
-};
-
-#define IP_FRAG_TBL_POS(tbl, sig) \
- ((tbl)->pkt + ((sig) & (tbl)->entry_mask))
-
-#define IP_FRAG_HASH_FNUM 2
-
-#ifdef IP_FRAG_TBL_STAT
-#define IP_FRAG_TBL_STAT_UPDATE(s, f, v) ((s)->f += (v))
-#else
-#define IP_FRAG_TBL_STAT_UPDATE(s, f, v) do {} while (0)
-#endif /* IPV4_FRAG_TBL_STAT */
-
-static inline void
-ipv4_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2)
-{
- uint32_t v;
- const uint32_t *p;
-
- p = (const uint32_t *)&key->src_dst;
-
-#ifdef RTE_MACHINE_CPUFLAG_SSE4_2
- v = rte_hash_crc_4byte(p[0], PRIME_VALUE);
- v = rte_hash_crc_4byte(p[1], v);
- v = rte_hash_crc_4byte(key->id, v);
-#else
-
- v = rte_jhash_3words(p[0], p[1], key->id, PRIME_VALUE);
-#endif /* RTE_MACHINE_CPUFLAG_SSE4_2 */
-
- *v1 = v;
- *v2 = (v << 7) + (v >> 14);
-}
-
-/*
- * Update the table, after we finish processing it's entry.
- */
-static inline void
-ip_frag_inuse(struct ip_frag_tbl *tbl, const struct ip_frag_pkt *fp)
-{
- if (IP_FRAG_KEY_EMPTY(&fp->key)) {
- TAILQ_REMOVE(&tbl->lru, fp, lru);
- tbl->use_entries--;
- }
-}
-
-/*
- * For the given key, try to find an existing entry.
- * If such entry doesn't exist, will return free and/or timed-out entry,
- * that can be used for that key.
- */
-static inline struct ip_frag_pkt *
-ip_frag_lookup(struct ip_frag_tbl *tbl,
- const struct ip_frag_key *key, uint64_t tms,
- struct ip_frag_pkt **free, struct ip_frag_pkt **stale)
-{
- struct ip_frag_pkt *p1, *p2;
- struct ip_frag_pkt *empty, *old;
- uint64_t max_cycles;
- uint32_t i, assoc, sig1, sig2;
-
- empty = NULL;
- old = NULL;
-
- max_cycles = tbl->max_cycles;
- assoc = tbl->bucket_entries;
-
- if (tbl->last != NULL && IP_FRAG_KEY_CMP(&tbl->last->key, key) == 0)
- return (tbl->last);
-
- ipv4_frag_hash(key, &sig1, &sig2);
- p1 = IP_FRAG_TBL_POS(tbl, sig1);
- p2 = IP_FRAG_TBL_POS(tbl, sig2);
-
- for (i = 0; i != assoc; i++) {
-
- IP_FRAG_LOG(DEBUG, "%s:%d:\n"
- "tbl: %p, max_entries: %u, use_entries: %u\n"
- "ip_frag_pkt line0: %p, index: %u from %u\n"
- "key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
- __func__, __LINE__,
- tbl, tbl->max_entries, tbl->use_entries,
- p1, i, assoc,
- p1[i].key.src_dst, p1[i].key.id, p1[i].start);
-
- if (IP_FRAG_KEY_CMP(&p1[i].key, key) == 0)
- return (p1 + i);
- else if (IP_FRAG_KEY_EMPTY(&p1[i].key))
- empty = (empty == NULL) ? (p1 + i) : empty;
- else if (max_cycles + p1[i].start < tms)
- old = (old == NULL) ? (p1 + i) : old;
-
- IP_FRAG_LOG(DEBUG, "%s:%d:\n"
- "tbl: %p, max_entries: %u, use_entries: %u\n"
- "ip_frag_pkt line1: %p, index: %u from %u\n"
- "key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
- __func__, __LINE__,
- tbl, tbl->max_entries, tbl->use_entries,
- p2, i, assoc,
- p2[i].key.src_dst, p2[i].key.id, p2[i].start);
-
- if (IP_FRAG_KEY_CMP(&p2[i].key, key) == 0)
- return (p2 + i);
- else if (IP_FRAG_KEY_EMPTY(&p2[i].key))
- empty = (empty == NULL) ?( p2 + i) : empty;
- else if (max_cycles + p2[i].start < tms)
- old = (old == NULL) ? (p2 + i) : old;
- }
-
- *free = empty;
- *stale = old;
- return (NULL);
-}
-
-static inline void
-ip_frag_tbl_del(struct ip_frag_tbl *tbl, struct ip_frag_death_row *dr,
- struct ip_frag_pkt *fp)
-{
- ip_frag_free(fp, dr);
- IP_FRAG_KEY_INVALIDATE(&fp->key);
- TAILQ_REMOVE(&tbl->lru, fp, lru);
- tbl->use_entries--;
- IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, del_num, 1);
-}
-
-static inline void
-ip_frag_tbl_add(struct ip_frag_tbl *tbl, struct ip_frag_pkt *fp,
- const struct ip_frag_key *key, uint64_t tms)
-{
- fp->key = key[0];
- ip_frag_reset(fp, tms);
- TAILQ_INSERT_TAIL(&tbl->lru, fp, lru);
- tbl->use_entries++;
- IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, add_num, 1);
-}
-
-static inline void
-ip_frag_tbl_reuse(struct ip_frag_tbl *tbl, struct ip_frag_death_row *dr,
- struct ip_frag_pkt *fp, uint64_t tms)
-{
- ip_frag_free(fp, dr);
- ip_frag_reset(fp, tms);
- TAILQ_REMOVE(&tbl->lru, fp, lru);
- TAILQ_INSERT_TAIL(&tbl->lru, fp, lru);
- IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, reuse_num, 1);
-}
-
-/*
- * Find an entry in the table for the corresponding fragment.
- * If such entry is not present, then allocate a new one.
- * If the entry is stale, then free and reuse it.
- */
-static inline struct ip_frag_pkt *
-ip_frag_find(struct ip_frag_tbl *tbl, struct ip_frag_death_row *dr,
- const struct ip_frag_key *key, uint64_t tms)
-{
- struct ip_frag_pkt *pkt, *free, *stale, *lru;
- uint64_t max_cycles;
-
- /*
- * Actually the two line below are totally redundant.
- * they are here, just to make gcc 4.6 happy.
- */
- free = NULL;
- stale = NULL;
- max_cycles = tbl->max_cycles;
-
- IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, find_num, 1);
-
- if ((pkt = ip_frag_lookup(tbl, key, tms, &free, &stale)) == NULL) {
-
- /*timed-out entry, free and invalidate it*/
- if (stale != NULL) {
- ip_frag_tbl_del(tbl, dr, stale);
- free = stale;
-
- /*
- * we found a free entry, check if we can use it.
- * If we run out of free entries in the table, then
- * check if we have a timed out entry to delete.
- */
- } else if (free != NULL &&
- tbl->max_entries <= tbl->use_entries) {
- lru = TAILQ_FIRST(&tbl->lru);
- if (max_cycles + lru->start < tms) {
- ip_frag_tbl_del(tbl, dr, lru);
- } else {
- free = NULL;
- IP_FRAG_TBL_STAT_UPDATE(&tbl->stat,
- fail_nospace, 1);
- }
- }
-
- /* found a free entry to reuse. */
- if (free != NULL) {
- ip_frag_tbl_add(tbl, free, key, tms);
- pkt = free;
- }
-
- /*
- * we found the flow, but it is already timed out,
- * so free associated resources, reposition it in the LRU list,
- * and reuse it.
- */
- } else if (max_cycles + pkt->start < tms) {
- ip_frag_tbl_reuse(tbl, dr, pkt, tms);
- }
-
- IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, fail_total, (pkt == NULL));
-
- tbl->last = pkt;
- return (pkt);
-}
-
-/*
- * Create a new IPV4 Frag table.
- * @param bucket_num
- * Number of buckets in the hash table.
- * @param bucket_entries
- * Number of entries per bucket (e.g. hash associativity).
- * Should be power of two.
- * @param max_entries
- * Maximum number of entries that could be stored in the table.
- * The value should be less or equal then bucket_num * bucket_entries.
- * @param max_cycles
- * Maximum TTL in cycles for each fragmented packet.
- * @param socket_id
- * The *socket_id* argument is the socket identifier in the case of
- * NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA constraints.
- * @return
- * The pointer to the new allocated mempool, on success. NULL on error.
- */
-static struct ip_frag_tbl *
-rte_ip_frag_table_create(uint32_t bucket_num, uint32_t bucket_entries,
- uint32_t max_entries, uint64_t max_cycles, int socket_id)
-{
- struct ip_frag_tbl *tbl;
- size_t sz;
- uint64_t nb_entries;
-
- nb_entries = rte_align32pow2(bucket_num);
- nb_entries *= bucket_entries;
- nb_entries *= IP_FRAG_HASH_FNUM;
-
- /* check input parameters. */
- if (rte_is_power_of_2(bucket_entries) == 0 ||
- nb_entries > UINT32_MAX || nb_entries == 0 ||
- nb_entries < max_entries) {
- RTE_LOG(ERR, USER1, "%s: invalid input parameter\n", __func__);
- return (NULL);
- }
-
- sz = sizeof (*tbl) + nb_entries * sizeof (tbl->pkt[0]);
- if ((tbl = rte_zmalloc_socket(__func__, sz, CACHE_LINE_SIZE,
- socket_id)) == NULL) {
- RTE_LOG(ERR, USER1,
- "%s: allocation of %zu bytes at socket %d failed do\n",
- __func__, sz, socket_id);
- return (NULL);
- }
-
- RTE_LOG(INFO, USER1, "%s: allocated of %zu bytes at socket %d\n",
- __func__, sz, socket_id);
-
- tbl->max_cycles = max_cycles;
- tbl->max_entries = max_entries;
- tbl->nb_entries = (uint32_t)nb_entries;
- tbl->nb_buckets = bucket_num;
- tbl->bucket_entries = bucket_entries;
- tbl->entry_mask = (tbl->nb_entries - 1) & ~(tbl->bucket_entries - 1);
-
- TAILQ_INIT(&(tbl->lru));
- return (tbl);
-}
-
-static inline void
-rte_ip_frag_table_destroy( struct ip_frag_tbl *tbl)
-{
- rte_free(tbl);
-}
-
-static void
-rte_ip_frag_table_statistics_dump(FILE *f, const struct ip_frag_tbl *tbl)
-{
- uint64_t fail_total, fail_nospace;
-
- fail_total = tbl->stat.fail_total;
- fail_nospace = tbl->stat.fail_nospace;
-
- fprintf(f, "max entries:\t%u;\n"
- "entries in use:\t%u;\n"
- "finds/inserts:\t%" PRIu64 ";\n"
- "entries added:\t%" PRIu64 ";\n"
- "entries deleted by timeout:\t%" PRIu64 ";\n"
- "entries reused by timeout:\t%" PRIu64 ";\n"
- "total add failures:\t%" PRIu64 ";\n"
- "add no-space failures:\t%" PRIu64 ";\n"
- "add hash-collisions failures:\t%" PRIu64 ";\n",
- tbl->max_entries,
- tbl->use_entries,
- tbl->stat.find_num,
- tbl->stat.add_num,
- tbl->stat.del_num,
- tbl->stat.reuse_num,
- fail_total,
- fail_nospace,
- fail_total - fail_nospace);
-}
-
-
-#endif /* _IPV4_FRAG_TBL_H_ */
diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
index 0cf3878..327e1f1 100644
--- a/lib/librte_ip_frag/rte_ip_frag.h
+++ b/lib/librte_ip_frag/rte_ip_frag.h
@@ -1,13 +1,13 @@
/*-
* BSD LICENSE
- *
+ *
* Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
* All rights reserved.
- *
+ *
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
- *
+ *
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
@@ -17,7 +17,7 @@
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
- *
+ *
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
@@ -31,16 +31,147 @@
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
-#ifndef _RTE_IP_FRAG_H__
-#define _RTE_IP_FRAG_H__
+#ifndef _RTE_IP_FRAG_H_
+#define _RTE_IP_FRAG_H_
/**
* @file
- * RTE IPv4 Fragmentation
+ * RTE IPv4 Fragmentation and Reassembly
+ *
+ * Implementation of IPv4 packet fragmentation and reassembly.
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+
+#include <rte_malloc.h>
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+
+enum {
+ IP_LAST_FRAG_IDX, /**< index of last fragment */
+ IP_FIRST_FRAG_IDX, /**< index of first fragment */
+ IP_MIN_FRAG_NUM, /**< minimum number of fragments */
+ IP_MAX_FRAG_NUM = RTE_LIBRTE_IP_FRAG_MAX_FRAG,
+ /**< maximum number of fragments per packet */
+};
+
+/** @internal fragmented mbuf */
+struct ip_frag {
+ uint16_t ofs; /**< offset into the packet */
+ uint16_t len; /**< length of fragment */
+ struct rte_mbuf *mb; /**< fragment mbuf */
+};
+
+/** @internal <src addr, dst addr, id> to uniquely identify fragmented datagram. */
+struct ip_frag_key {
+ uint64_t src_dst; /**< src and dst address */
+ uint32_t id; /**< packet id */
+};
+
+/**
+ * @internal Fragmented packet to reassemble.
+ * First two entries in the frags[] array are for the last and first fragments.
+ */
+struct rte_ip_frag_pkt {
+ TAILQ_ENTRY(rte_ip_frag_pkt) lru; /**< LRU list */
+ struct ip_frag_key key; /**< fragmentation key */
+ uint64_t start; /**< creation timestamp */
+ uint32_t total_size; /**< expected reassembled size */
+ uint32_t frag_size; /**< size of fragments received */
+ uint32_t last_idx; /**< index of next entry to fill */
+ struct ip_frag frags[IP_MAX_FRAG_NUM]; /**< fragments */
+} __rte_cache_aligned;
+
+#define IP_FRAG_DEATH_ROW_LEN 32 /**< death row size (in packets) */
+
+/** mbuf death row (packets to be freed) */
+struct rte_ip_frag_death_row {
+ uint32_t cnt; /**< number of mbufs currently on death row */
+ struct rte_mbuf *row[IP_FRAG_DEATH_ROW_LEN * (IP_MAX_FRAG_NUM + 1)];
+ /**< mbufs to be freed */
+};
+
+TAILQ_HEAD(rte_ip_pkt_list, rte_ip_frag_pkt); /**< @internal fragments tailq */
+
+/** fragmentation table statistics */
+struct rte_ip_frag_tbl_stat {
+ uint64_t find_num; /**< total # of find/insert attempts. */
+ uint64_t add_num; /**< # of add ops. */
+ uint64_t del_num; /**< # of del ops. */
+ uint64_t reuse_num; /**< # of reuse (del/add) ops. */
+ uint64_t fail_total; /**< total # of add failures. */
+ uint64_t fail_nospace; /**< # of 'no space' add failures. */
+} __rte_cache_aligned;
+
+/** fragmentation table */
+struct rte_ip_frag_tbl {
+ uint64_t max_cycles; /**< ttl for table entries. */
+ uint32_t entry_mask; /**< hash value mask. */
+ uint32_t max_entries; /**< max entries allowed. */
+ uint32_t use_entries; /**< entries in use. */
+ uint32_t bucket_entries; /**< hash associativity. */
+ uint32_t nb_entries; /**< total size of the table. */
+ uint32_t nb_buckets; /**< num of associativity lines. */
+ struct rte_ip_frag_pkt *last; /**< last used entry. */
+ struct rte_ip_pkt_list lru; /**< LRU list for table entries. */
+ struct rte_ip_frag_tbl_stat stat; /**< statistics counters. */
+ struct rte_ip_frag_pkt pkt[0]; /**< hash table. */
+};
+
+/** IPv6 fragment extension header */
+struct ipv6_extension_fragment {
+ uint8_t next_header; /**< Next header type */
+ uint8_t reserved1; /**< Reserved */
+ union {
+ struct {
+ uint16_t frag_offset:13; /**< Offset from the start of the packet */
+ uint16_t reserved2:2; /**< Reserved */
+ uint16_t more_frags:1;
+ /**< 1 if more fragments left, 0 if last fragment */
+ };
+ uint16_t frag_data;
+ /**< union of all fragmentation data */
+ };
+ uint32_t id; /**< Packet ID */
+} __attribute__((__packed__));
+
+
+
+/**
+ * Create a new IP fragmentation table.
*
- * Implementation of IPv4 fragmentation.
+ * @param bucket_num
+ * Number of buckets in the hash table.
+ * @param bucket_entries
+ * Number of entries per bucket (e.g. hash associativity).
+ * Should be a power of two.
+ * @param max_entries
+ * Maximum number of entries that could be stored in the table.
+ * The value should be less than or equal to bucket_num * bucket_entries.
+ * @param max_cycles
+ * Maximum TTL in cycles for each fragmented packet.
+ * @param socket_id
+ * The *socket_id* argument is the socket identifier in the case of
+ * NUMA. The value can be *SOCKET_ID_ANY* if there are no NUMA constraints.
+ * @return
+ * Pointer to the newly allocated fragmentation table on success; NULL on error.
+ */
+struct rte_ip_frag_tbl *rte_ip_frag_table_create(uint32_t bucket_num,
+ uint32_t bucket_entries, uint32_t max_entries,
+ uint64_t max_cycles, int socket_id);
+
+/**
+ * Free allocated IP fragmentation table.
*
+ * @param tbl
+ * Fragmentation table to free.
*/
+static inline void
+rte_ip_frag_table_destroy(struct rte_ip_frag_tbl *tbl)
+{
+ rte_free(tbl);
+}
/**
* IPv4 fragmentation.
@@ -64,10 +195,74 @@
* Otherwise - (-1) * <errno>.
*/
int32_t rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
- struct rte_mbuf **pkts_out,
- uint16_t nb_pkts_out,
- uint16_t mtu_size,
- struct rte_mempool *pool_direct,
- struct rte_mempool *pool_indirect);
+ struct rte_mbuf **pkts_out,
+ uint16_t nb_pkts_out, uint16_t mtu_size,
+ struct rte_mempool *pool_direct,
+ struct rte_mempool *pool_indirect);
+
+/**
+ * This function implements reassembly of fragmented IPv4 packets.
+ * Incoming mbufs should have their l2_len/l3_len fields set up correctly.
+ *
+ * @param tbl
+ * Table where to lookup/add the fragmented packet.
+ * @param dr
+ * Death row to free buffers to
+ * @param mb
+ * Incoming mbuf with IPv4 fragment.
+ * @param tms
+ * Fragment arrival timestamp.
+ * @param ip_hdr
+ * Pointer to the IPV4 header inside the fragment.
+ * @return
+ * Pointer to mbuf for the reassembled packet, or NULL if:
+ * - an error occurred.
+ * - not all fragments of the packet have been collected yet.
+ */
+struct rte_mbuf *rte_ipv4_frag_reassemble_packet(struct rte_ip_frag_tbl *tbl,
+ struct rte_ip_frag_death_row *dr,
+ struct rte_mbuf *mb, uint64_t tms, struct ipv4_hdr *ip_hdr);
+
+/**
+ * Check if the IPv4 packet is fragmented.
+ *
+ * @param hdr
+ * IPv4 header of the packet
+ * @return
+ * 1 if fragmented, 0 if not fragmented
+ */
+static inline int
+rte_ipv4_frag_pkt_is_fragmented(const struct ipv4_hdr *hdr)
+{
+ uint16_t flag_offset, ip_flag, ip_ofs;
+
+ flag_offset = rte_be_to_cpu_16(hdr->fragment_offset);
+ ip_ofs = (uint16_t)(flag_offset & IPV4_HDR_OFFSET_MASK);
+ ip_flag = (uint16_t)(flag_offset & IPV4_HDR_MF_FLAG);
+
+ return ip_flag != 0 || ip_ofs != 0;
+}
+
+/**
+ * Free mbufs on a given death row.
+ *
+ * @param dr
+ * Death row to free mbufs in.
+ * @param prefetch
+ * How many buffers to prefetch before freeing.
+ */
+void rte_ip_frag_free_death_row(struct rte_ip_frag_death_row *dr,
+ uint32_t prefetch);
+
+
+/**
+ * Dump fragmentation table statistics to file.
+ *
+ * @param f
+ * File to dump statistics to
+ * @param tbl
+ * Fragmentation table to dump statistics from
+ */
+void
+rte_ip_frag_table_statistics_dump(FILE *f, const struct rte_ip_frag_tbl *tbl);
-#endif
+#endif /* _RTE_IP_FRAG_H_ */
diff --git a/lib/librte_ip_frag/rte_ip_frag_common.c b/lib/librte_ip_frag/rte_ip_frag_common.c
new file mode 100644
index 0000000..acd1864
--- /dev/null
+++ b/lib/librte_ip_frag/rte_ip_frag_common.c
@@ -0,0 +1,142 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+
+#include <rte_memory.h>
+#include <rte_log.h>
+#include <rte_byteorder.h>
+
+#include "rte_ip_frag.h"
+#include "ip_frag_common.h"
+
+#define IP_FRAG_HASH_FNUM 2
+
+/* free mbufs from death row */
+void
+rte_ip_frag_free_death_row(struct rte_ip_frag_death_row *dr,
+ uint32_t prefetch)
+{
+ uint32_t i, k, n;
+
+ k = RTE_MIN(prefetch, dr->cnt);
+ n = dr->cnt;
+
+ for (i = 0; i != k; i++)
+ rte_prefetch0(dr->row[i]);
+
+ for (i = 0; i != n - k; i++) {
+ rte_prefetch0(dr->row[i + k]);
+ rte_pktmbuf_free(dr->row[i]);
+ }
+
+ for (; i != n; i++)
+ rte_pktmbuf_free(dr->row[i]);
+
+ dr->cnt = 0;
+}
+
+/* create fragmentation table */
+struct rte_ip_frag_tbl *
+rte_ip_frag_table_create(uint32_t bucket_num, uint32_t bucket_entries,
+ uint32_t max_entries, uint64_t max_cycles, int socket_id)
+{
+ struct rte_ip_frag_tbl *tbl;
+ size_t sz;
+ uint64_t nb_entries;
+
+ nb_entries = rte_align32pow2(bucket_num);
+ nb_entries *= bucket_entries;
+ nb_entries *= IP_FRAG_HASH_FNUM;
+
+ /* check input parameters. */
+ if (rte_is_power_of_2(bucket_entries) == 0 ||
+ nb_entries > UINT32_MAX || nb_entries == 0 ||
+ nb_entries < max_entries) {
+ RTE_LOG(ERR, USER1, "%s: invalid input parameter\n", __func__);
+ return (NULL);
+ }
+
+ sz = sizeof (*tbl) + nb_entries * sizeof (tbl->pkt[0]);
+ if ((tbl = rte_zmalloc_socket(__func__, sz, CACHE_LINE_SIZE,
+ socket_id)) == NULL) {
+ RTE_LOG(ERR, USER1,
+ "%s: allocation of %zu bytes at socket %d failed\n",
+ __func__, sz, socket_id);
+ return (NULL);
+ }
+
+ RTE_LOG(INFO, USER1, "%s: allocated %zu bytes at socket %d\n",
+ __func__, sz, socket_id);
+
+ tbl->max_cycles = max_cycles;
+ tbl->max_entries = max_entries;
+ tbl->nb_entries = (uint32_t)nb_entries;
+ tbl->nb_buckets = bucket_num;
+ tbl->bucket_entries = bucket_entries;
+ tbl->entry_mask = (tbl->nb_entries - 1) & ~(tbl->bucket_entries - 1);
+
+ TAILQ_INIT(&(tbl->lru));
+ return (tbl);
+}
+
+/* dump frag table statistics to file */
+void
+rte_ip_frag_table_statistics_dump(FILE *f, const struct rte_ip_frag_tbl *tbl)
+{
+ uint64_t fail_total, fail_nospace;
+
+ fail_total = tbl->stat.fail_total;
+ fail_nospace = tbl->stat.fail_nospace;
+
+ fprintf(f, "max entries:\t%u;\n"
+ "entries in use:\t%u;\n"
+ "finds/inserts:\t%" PRIu64 ";\n"
+ "entries added:\t%" PRIu64 ";\n"
+ "entries deleted by timeout:\t%" PRIu64 ";\n"
+ "entries reused by timeout:\t%" PRIu64 ";\n"
+ "total add failures:\t%" PRIu64 ";\n"
+ "add no-space failures:\t%" PRIu64 ";\n"
+ "add hash-collisions failures:\t%" PRIu64 ";\n",
+ tbl->max_entries,
+ tbl->use_entries,
+ tbl->stat.find_num,
+ tbl->stat.add_num,
+ tbl->stat.del_num,
+ tbl->stat.reuse_num,
+ fail_total,
+ fail_nospace,
+ fail_total - fail_nospace);
+}
diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c b/lib/librte_ip_frag/rte_ipv4_reassembly.c
new file mode 100644
index 0000000..483fb95
--- /dev/null
+++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c
@@ -0,0 +1,189 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+
+#include <stddef.h>
+#include <stdint.h>
+
+#include <rte_byteorder.h>
+#include <rte_mbuf.h>
+#include <rte_debug.h>
+#include <rte_tailq.h>
+#include <rte_malloc.h>
+#include <rte_ip.h>
+
+#include "rte_ip_frag.h"
+#include "ip_frag_common.h"
+
+/*
+ * Reassemble fragments into one packet.
+ */
+struct rte_mbuf *
+ipv4_frag_reassemble(const struct rte_ip_frag_pkt *fp)
+{
+ struct ipv4_hdr *ip_hdr;
+ struct rte_mbuf *m, *prev;
+ uint32_t i, n, ofs, first_len;
+
+ first_len = fp->frags[IP_FIRST_FRAG_IDX].len;
+ n = fp->last_idx - 1;
+
+ /* start from the last fragment. */
+ m = fp->frags[IP_LAST_FRAG_IDX].mb;
+ ofs = fp->frags[IP_LAST_FRAG_IDX].ofs;
+
+ while (ofs != first_len) {
+
+ prev = m;
+
+ for (i = n; i != IP_FIRST_FRAG_IDX && ofs != first_len; i--) {
+
+ /* previous fragment found. */
+ if (fp->frags[i].ofs + fp->frags[i].len == ofs) {
+
+ ip_frag_chain(fp->frags[i].mb, m);
+
+ /* update our last fragment and offset. */
+ m = fp->frags[i].mb;
+ ofs = fp->frags[i].ofs;
+ }
+ }
+
+ /* error - hole in the packet. */
+ if (m == prev) {
+ return (NULL);
+ }
+ }
+
+ /* chain with the first fragment. */
+ ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
+ m = fp->frags[IP_FIRST_FRAG_IDX].mb;
+
+ /* update mbuf fields for reassembled packet. */
+ m->ol_flags |= PKT_TX_IP_CKSUM;
+
+ /* update ipv4 header for the reassembled packet */
+ ip_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(m, uint8_t *) +
+ m->pkt.vlan_macip.f.l2_len);
+
+ ip_hdr->total_length = rte_cpu_to_be_16((uint16_t)(fp->total_size +
+ m->pkt.vlan_macip.f.l3_len));
+ ip_hdr->fragment_offset = (uint16_t)(ip_hdr->fragment_offset &
+ rte_cpu_to_be_16(IPV4_HDR_DF_FLAG));
+ ip_hdr->hdr_checksum = 0;
+
+ return (m);
+}
+
+/*
+ * Process a new mbuf with a fragment of an IPv4 packet.
+ * Incoming mbuf should have its l2_len/l3_len fields set up correctly.
+ * @param tbl
+ * Table where to lookup/add the fragmented packet.
+ * @param mb
+ * Incoming mbuf with IPV4 fragment.
+ * @param tms
+ * Fragment arrival timestamp.
+ * @param ip_hdr
+ * Pointer to the IPV4 header inside the fragment.
+ * @return
+ * Pointer to mbuf for the reassembled packet, or NULL if:
+ * - an error occurred.
+ * - not all fragments of the packet are collected yet.
+ */
+struct rte_mbuf *
+rte_ipv4_frag_reassemble_packet(struct rte_ip_frag_tbl *tbl,
+ struct rte_ip_frag_death_row *dr, struct rte_mbuf *mb, uint64_t tms,
+ struct ipv4_hdr *ip_hdr)
+{
+ struct rte_ip_frag_pkt *fp;
+ struct ip_frag_key key;
+ const uint64_t *psd;
+ uint16_t ip_len;
+ uint16_t flag_offset, ip_ofs, ip_flag;
+
+ flag_offset = rte_be_to_cpu_16(ip_hdr->fragment_offset);
+ ip_ofs = (uint16_t)(flag_offset & IPV4_HDR_OFFSET_MASK);
+ ip_flag = (uint16_t)(flag_offset & IPV4_HDR_MF_FLAG);
+
+ psd = (uint64_t *)&ip_hdr->src_addr;
+ key.src_dst = *psd;
+ key.id = ip_hdr->packet_id;
+
+ ip_ofs *= IPV4_HDR_OFFSET_UNITS;
+ ip_len = (uint16_t)(rte_be_to_cpu_16(ip_hdr->total_length) -
+ mb->pkt.vlan_macip.f.l3_len);
+
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "mbuf: %p, tms: %" PRIu64
+ ", key: <%" PRIx64 ", %#x>, ofs: %u, len: %u, flags: %#x\n"
+ "tbl: %p, max_cycles: %" PRIu64 ", entry_mask: %#x, "
+ "max_entries: %u, use_entries: %u\n\n",
+ __func__, __LINE__,
+ mb, tms, key.src_dst, key.id, ip_ofs, ip_len, ip_flag,
+ tbl, tbl->max_cycles, tbl->entry_mask, tbl->max_entries,
+ tbl->use_entries);
+
+ /* try to find/add entry into the fragment's table. */
+ if ((fp = ip_frag_find(tbl, dr, &key, tms)) == NULL) {
+ IP_FRAG_MBUF2DR(dr, mb);
+ return (NULL);
+ }
+
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "tbl: %p, max_entries: %u, use_entries: %u\n"
+ "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, start: %" PRIu64
+ ", total_size: %u, frag_size: %u, last_idx: %u\n\n",
+ __func__, __LINE__,
+ tbl, tbl->max_entries, tbl->use_entries,
+ fp, fp->key.src_dst, fp->key.id, fp->start,
+ fp->total_size, fp->frag_size, fp->last_idx);
+
+
+ /* process the fragmented packet. */
+ mb = ip_frag_process(fp, dr, mb, ip_ofs, ip_len, ip_flag);
+ ip_frag_inuse(tbl, fp);
+
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "mbuf: %p\n"
+ "tbl: %p, max_entries: %u, use_entries: %u\n"
+ "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, start: %" PRIu64
+ ", total_size: %u, frag_size: %u, last_idx: %u\n\n",
+ __func__, __LINE__, mb,
+ tbl, tbl->max_entries, tbl->use_entries,
+ fp, fp->key.src_dst, fp->key.id, fp->start,
+ fp->total_size, fp->frag_size, fp->last_idx);
+
+ return (mb);
+}
diff --git a/lib/librte_ip_frag/rte_ipv4_rsmbl.h b/lib/librte_ip_frag/rte_ipv4_rsmbl.h
deleted file mode 100644
index 82cb9b5..0000000
--- a/lib/librte_ip_frag/rte_ipv4_rsmbl.h
+++ /dev/null
@@ -1,427 +0,0 @@
-/*-
- * BSD LICENSE
- *
- * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- * All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * * Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in
- * the documentation and/or other materials provided with the
- * distribution.
- * * Neither the name of Intel Corporation nor the names of its
- * contributors may be used to endorse or promote products derived
- * from this software without specific prior written permission.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _IPV4_RSMBL_H_
-#define _IPV4_RSMBL_H_
-
-#include "ip_frag_common.h"
-
-/**
- * @file
- * IPv4 reassemble
- *
- * Implementation of IPv4 reassemble.
- *
- */
-
-enum {
- LAST_FRAG_IDX,
- FIRST_FRAG_IDX,
- MIN_FRAG_NUM,
- MAX_FRAG_NUM = 4,
-};
-
-struct ip_frag {
- uint16_t ofs;
- uint16_t len;
- struct rte_mbuf *mb;
-};
-
-/*
- * Use <src addr, dst_addr, id> to uniquely indetify fragmented datagram.
- */
-struct ip_frag_key {
- uint64_t src_dst;
- uint32_t id;
-};
-
-#define IP_FRAG_KEY_INVALIDATE(k) ((k)->src_dst = 0)
-#define IP_FRAG_KEY_EMPTY(k) ((k)->src_dst == 0)
-
-#define IP_FRAG_KEY_CMP(k1, k2) \
- (((k1)->src_dst ^ (k2)->src_dst) | ((k1)->id ^ (k2)->id))
-
-
-/*
- * Fragmented packet to reassemble.
- * First two entries in the frags[] array are for the last and first fragments.
- */
-struct ip_frag_pkt {
- TAILQ_ENTRY(ip_frag_pkt) lru; /* LRU list */
- struct ip_frag_key key;
- uint64_t start; /* creation timestamp */
- uint32_t total_size; /* expected reassembled size */
- uint32_t frag_size; /* size of fragments received */
- uint32_t last_idx; /* index of next entry to fill */
- struct ip_frag frags[MAX_FRAG_NUM];
-} __rte_cache_aligned;
-
-
-struct ip_frag_death_row {
- uint32_t cnt;
- struct rte_mbuf *row[MAX_PKT_BURST * (MAX_FRAG_NUM + 1)];
-};
-
-#define IP_FRAG_MBUF2DR(dr, mb) ((dr)->row[(dr)->cnt++] = (mb))
-
-/* logging macros. */
-
-#ifdef IP_FRAG_DEBUG
-#define IP_FRAG_LOG(lvl, fmt, args...) RTE_LOG(lvl, USER1, fmt, ##args)
-#else
-#define IP_FRAG_LOG(lvl, fmt, args...) do {} while(0)
-#endif /* IP_FRAG_DEBUG */
-
-
-static inline void
-ip_frag_reset(struct ip_frag_pkt *fp, uint64_t tms)
-{
- static const struct ip_frag zero_frag = {
- .ofs = 0,
- .len = 0,
- .mb = NULL,
- };
-
- fp->start = tms;
- fp->total_size = UINT32_MAX;
- fp->frag_size = 0;
- fp->last_idx = MIN_FRAG_NUM;
- fp->frags[LAST_FRAG_IDX] = zero_frag;
- fp->frags[FIRST_FRAG_IDX] = zero_frag;
-}
-
-static inline void
-ip_frag_free(struct ip_frag_pkt *fp, struct ip_frag_death_row *dr)
-{
- uint32_t i, k;
-
- k = dr->cnt;
- for (i = 0; i != fp->last_idx; i++) {
- if (fp->frags[i].mb != NULL) {
- dr->row[k++] = fp->frags[i].mb;
- fp->frags[i].mb = NULL;
- }
- }
-
- fp->last_idx = 0;
- dr->cnt = k;
-}
-
-static inline void
-rte_ip_frag_free_death_row(struct ip_frag_death_row *dr, uint32_t prefetch)
-{
- uint32_t i, k, n;
-
- k = RTE_MIN(prefetch, dr->cnt);
- n = dr->cnt;
-
- for (i = 0; i != k; i++)
- rte_prefetch0(dr->row[i]);
-
- for (i = 0; i != n - k; i++) {
- rte_prefetch0(dr->row[i + k]);
- rte_pktmbuf_free(dr->row[i]);
- }
-
- for (; i != n; i++)
- rte_pktmbuf_free(dr->row[i]);
-
- dr->cnt = 0;
-}
-
-/*
- * Helper function.
- * Takes 2 mbufs that represents two framents of the same packet and
- * chains them into one mbuf.
- */
-static inline void
-ip_frag_chain(struct rte_mbuf *mn, struct rte_mbuf *mp)
-{
- struct rte_mbuf *ms;
-
- /* adjust start of the last fragment data. */
- rte_pktmbuf_adj(mp, (uint16_t)(mp->pkt.vlan_macip.f.l2_len +
- mp->pkt.vlan_macip.f.l3_len));
-
- /* chain two fragments. */
- ms = rte_pktmbuf_lastseg(mn);
- ms->pkt.next = mp;
-
- /* accumulate number of segments and total length. */
- mn->pkt.nb_segs = (uint8_t)(mn->pkt.nb_segs + mp->pkt.nb_segs);
- mn->pkt.pkt_len += mp->pkt.pkt_len;
-
- /* reset pkt_len and nb_segs for chained fragment. */
- mp->pkt.pkt_len = mp->pkt.data_len;
- mp->pkt.nb_segs = 1;
-}
-
-/*
- * Reassemble fragments into one packet.
- */
-static inline struct rte_mbuf *
-ipv4_frag_reassemble(const struct ip_frag_pkt *fp)
-{
- struct ipv4_hdr *ip_hdr;
- struct rte_mbuf *m, *prev;
- uint32_t i, n, ofs, first_len;
-
- first_len = fp->frags[FIRST_FRAG_IDX].len;
- n = fp->last_idx - 1;
-
- /*start from the last fragment. */
- m = fp->frags[LAST_FRAG_IDX].mb;
- ofs = fp->frags[LAST_FRAG_IDX].ofs;
-
- while (ofs != first_len) {
-
- prev = m;
-
- for (i = n; i != FIRST_FRAG_IDX && ofs != first_len; i--) {
-
- /* previous fragment found. */
- if(fp->frags[i].ofs + fp->frags[i].len == ofs) {
-
- ip_frag_chain(fp->frags[i].mb, m);
-
- /* update our last fragment and offset. */
- m = fp->frags[i].mb;
- ofs = fp->frags[i].ofs;
- }
- }
-
- /* error - hole in the packet. */
- if (m == prev) {
- return (NULL);
- }
- }
-
- /* chain with the first fragment. */
- ip_frag_chain(fp->frags[FIRST_FRAG_IDX].mb, m);
- m = fp->frags[FIRST_FRAG_IDX].mb;
-
- /* update mbuf fields for reassembled packet. */
- m->ol_flags |= PKT_TX_IP_CKSUM;
-
- /* update ipv4 header for the reassmebled packet */
- ip_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(m, uint8_t *) +
- m->pkt.vlan_macip.f.l2_len);
-
- ip_hdr->total_length = rte_cpu_to_be_16((uint16_t)(fp->total_size +
- m->pkt.vlan_macip.f.l3_len));
- ip_hdr->fragment_offset = (uint16_t)(ip_hdr->fragment_offset &
- rte_cpu_to_be_16(IPV4_HDR_DF_FLAG));
- ip_hdr->hdr_checksum = 0;
-
- return (m);
-}
-
-static inline struct rte_mbuf *
-ip_frag_process(struct ip_frag_pkt *fp, struct ip_frag_death_row *dr,
- struct rte_mbuf *mb, uint16_t ofs, uint16_t len, uint16_t more_frags)
-{
- uint32_t idx;
-
- fp->frag_size += len;
-
- /* this is the first fragment. */
- if (ofs == 0) {
- idx = (fp->frags[FIRST_FRAG_IDX].mb == NULL) ?
- FIRST_FRAG_IDX : UINT32_MAX;
-
- /* this is the last fragment. */
- } else if (more_frags == 0) {
- fp->total_size = ofs + len;
- idx = (fp->frags[LAST_FRAG_IDX].mb == NULL) ?
- LAST_FRAG_IDX : UINT32_MAX;
-
- /* this is the intermediate fragment. */
- } else if ((idx = fp->last_idx) <
- sizeof (fp->frags) / sizeof (fp->frags[0])) {
- fp->last_idx++;
- }
-
- /*
- * errorneous packet: either exceeed max allowed number of fragments,
- * or duplicate first/last fragment encountered.
- */
- if (idx >= sizeof (fp->frags) / sizeof (fp->frags[0])) {
-
- /* report an error. */
- IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
- "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, "
- "total_size: %u, frag_size: %u, last_idx: %u\n"
- "first fragment: ofs: %u, len: %u\n"
- "last fragment: ofs: %u, len: %u\n\n",
- __func__, __LINE__,
- fp, fp->key.src_dst, fp->key.id,
- fp->total_size, fp->frag_size, fp->last_idx,
- fp->frags[FIRST_FRAG_IDX].ofs,
- fp->frags[FIRST_FRAG_IDX].len,
- fp->frags[LAST_FRAG_IDX].ofs,
- fp->frags[LAST_FRAG_IDX].len);
-
- /* free all fragments, invalidate the entry. */
- ip_frag_free(fp, dr);
- IP_FRAG_KEY_INVALIDATE(&fp->key);
- IP_FRAG_MBUF2DR(dr, mb);
-
- return (NULL);
- }
-
- fp->frags[idx].ofs = ofs;
- fp->frags[idx].len = len;
- fp->frags[idx].mb = mb;
-
- mb = NULL;
-
- /* not all fragments are collected yet. */
- if (likely (fp->frag_size < fp->total_size)) {
- return (mb);
-
- /* if we collected all fragments, then try to reassemble. */
- } else if (fp->frag_size == fp->total_size &&
- fp->frags[FIRST_FRAG_IDX].mb != NULL) {
- mb = ipv4_frag_reassemble(fp);
- }
-
- /* errorenous set of fragments. */
- if (mb == NULL) {
-
- /* report an error. */
- IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
- "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, "
- "total_size: %u, frag_size: %u, last_idx: %u\n"
- "first fragment: ofs: %u, len: %u\n"
- "last fragment: ofs: %u, len: %u\n\n",
- __func__, __LINE__,
- fp, fp->key.src_dst, fp->key.id,
- fp->total_size, fp->frag_size, fp->last_idx,
- fp->frags[FIRST_FRAG_IDX].ofs,
- fp->frags[FIRST_FRAG_IDX].len,
- fp->frags[LAST_FRAG_IDX].ofs,
- fp->frags[LAST_FRAG_IDX].len);
-
- /* free associated resources. */
- ip_frag_free(fp, dr);
- }
-
- /* we are done with that entry, invalidate it. */
- IP_FRAG_KEY_INVALIDATE(&fp->key);
- return (mb);
-}
-
-#include "ipv4_frag_tbl.h"
-
-/*
- * Process new mbuf with fragment of IPV4 packet.
- * Incoming mbuf should have it's l2_len/l3_len fields setuped correclty.
- * @param tbl
- * Table where to lookup/add the fragmented packet.
- * @param mb
- * Incoming mbuf with IPV4 fragment.
- * @param tms
- * Fragment arrival timestamp.
- * @param ip_hdr
- * Pointer to the IPV4 header inside the fragment.
- * @param ip_ofs
- * Fragment's offset (as extracted from the header).
- * @param ip_flag
- * Fragment's MF flag.
- * @return
- * Pointer to mbuf for reassebled packet, or NULL if:
- * - an error occured.
- * - not all fragments of the packet are collected yet.
- */
-static inline struct rte_mbuf *
-rte_ipv4_reassemble_packet(struct ip_frag_tbl *tbl,
- struct ip_frag_death_row *dr, struct rte_mbuf *mb, uint64_t tms,
- struct ipv4_hdr *ip_hdr, uint16_t ip_ofs, uint16_t ip_flag)
-{
- struct ip_frag_pkt *fp;
- struct ip_frag_key key;
- const uint64_t *psd;
- uint16_t ip_len;
-
- psd = (uint64_t *)&ip_hdr->src_addr;
- key.src_dst = psd[0];
- key.id = ip_hdr->packet_id;
-
- ip_ofs *= IPV4_HDR_OFFSET_UNITS;
- ip_len = (uint16_t)(rte_be_to_cpu_16(ip_hdr->total_length) -
- mb->pkt.vlan_macip.f.l3_len);
-
- IP_FRAG_LOG(DEBUG, "%s:%d:\n"
- "mbuf: %p, tms: %" PRIu64
- ", key: <%" PRIx64 ", %#x>, ofs: %u, len: %u, flags: %#x\n"
- "tbl: %p, max_cycles: %" PRIu64 ", entry_mask: %#x, "
- "max_entries: %u, use_entries: %u\n\n",
- __func__, __LINE__,
- mb, tms, key.src_dst, key.id, ip_ofs, ip_len, ip_flag,
- tbl, tbl->max_cycles, tbl->entry_mask, tbl->max_entries,
- tbl->use_entries);
-
- /* try to find/add entry into the fragment's table. */
- if ((fp = ip_frag_find(tbl, dr, &key, tms)) == NULL) {
- IP_FRAG_MBUF2DR(dr, mb);
- return NULL;
- }
-
- IP_FRAG_LOG(DEBUG, "%s:%d:\n"
- "tbl: %p, max_entries: %u, use_entries: %u\n"
- "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, start: %" PRIu64
- ", total_size: %u, frag_size: %u, last_idx: %u\n\n",
- __func__, __LINE__,
- tbl, tbl->max_entries, tbl->use_entries,
- fp, fp->key.src_dst, fp->key.id, fp->start,
- fp->total_size, fp->frag_size, fp->last_idx);
-
-
- /* process the fragmented packet. */
- mb = ip_frag_process(fp, dr, mb, ip_ofs, ip_len, ip_flag);
- ip_frag_inuse(tbl, fp);
-
- IP_FRAG_LOG(DEBUG, "%s:%d:\n"
- "mbuf: %p\n"
- "tbl: %p, max_entries: %u, use_entries: %u\n"
- "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, start: %" PRIu64
- ", total_size: %u, frag_size: %u, last_idx: %u\n\n",
- __func__, __LINE__, mb,
- tbl, tbl->max_entries, tbl->use_entries,
- fp, fp->key.src_dst, fp->key.id, fp->start,
- fp->total_size, fp->frag_size, fp->last_idx);
-
- return (mb);
-}
-
-#endif /* _IPV4_RSMBL_H_ */
--
1.8.1.4
^ permalink raw reply [flat|nested] 18+ messages in thread
* [dpdk-dev] [PATCH 08/13] ip_frag: renamed ipv4 frag function
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
` (7 preceding siblings ...)
2014-05-28 17:32 ` [dpdk-dev] [PATCH 07/13] ip_frag: refactored reassembly code and made it a proper library Anatoly Burakov
@ 2014-05-28 17:32 ` Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 09/13] ip_frag: added IPv6 fragmentation support Anatoly Burakov
` (7 subsequent siblings)
16 siblings, 0 replies; 18+ messages in thread
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
examples/ipv4_frag/main.c | 2 +-
lib/librte_ip_frag/rte_ip_frag.h | 2 +-
lib/librte_ip_frag/rte_ipv4_fragmentation.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/examples/ipv4_frag/main.c b/examples/ipv4_frag/main.c
index 05a26b1..7aff99b 100644
--- a/examples/ipv4_frag/main.c
+++ b/examples/ipv4_frag/main.c
@@ -272,7 +272,7 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t port_in)
qconf->tx_mbufs[port_out].m_table[len] = m;
len2 = 1;
} else {
- len2 = rte_ipv4_fragmentation(m,
+ len2 = rte_ipv4_fragment_packet(m,
&qconf->tx_mbufs[port_out].m_table[len],
(uint16_t)(MBUF_TABLE_SIZE - len),
IPV4_MTU_DEFAULT,
diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
index 327e1f1..ecae782 100644
--- a/lib/librte_ip_frag/rte_ip_frag.h
+++ b/lib/librte_ip_frag/rte_ip_frag.h
@@ -194,7 +194,7 @@ rte_ip_frag_table_destroy( struct rte_ip_frag_tbl *tbl)
* in the pkts_out array.
* Otherwise - (-1) * <errno>.
*/
-int32_t rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
+int32_t rte_ipv4_fragment_packet(struct rte_mbuf *pkt_in,
struct rte_mbuf **pkts_out,
uint16_t nb_pkts_out, uint16_t mtu_size,
struct rte_mempool *pool_direct,
diff --git a/lib/librte_ip_frag/rte_ipv4_fragmentation.c b/lib/librte_ip_frag/rte_ipv4_fragmentation.c
index 6e5feb6..7ec20cf 100644
--- a/lib/librte_ip_frag/rte_ipv4_fragmentation.c
+++ b/lib/librte_ip_frag/rte_ipv4_fragmentation.c
@@ -96,7 +96,7 @@ static inline void __free_fragments(struct rte_mbuf *mb[], uint32_t num)
* Otherwise - (-1) * <errno>.
*/
int32_t
-rte_ipv4_fragmentation(struct rte_mbuf *pkt_in,
+rte_ipv4_fragment_packet(struct rte_mbuf *pkt_in,
struct rte_mbuf **pkts_out,
uint16_t nb_pkts_out,
uint16_t mtu_size,
--
1.8.1.4
* [dpdk-dev] [PATCH 09/13] ip_frag: added IPv6 fragmentation support
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
` (8 preceding siblings ...)
2014-05-28 17:32 ` [dpdk-dev] [PATCH 08/13] ip_frag: renamed ipv4 frag function Anatoly Burakov
@ 2014-05-28 17:32 ` Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 10/13] examples: renamed ipv4_frag example app to ip_fragmentation Anatoly Burakov
` (6 subsequent siblings)
16 siblings, 0 replies; 18+ messages in thread
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
Mostly a copy-paste of IPv4.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_ip_frag/Makefile | 1 +
lib/librte_ip_frag/rte_ip_frag.h | 27 ++++
lib/librte_ip_frag/rte_ipv6_fragmentation.c | 219 ++++++++++++++++++++++++++++
3 files changed, 247 insertions(+)
create mode 100644 lib/librte_ip_frag/rte_ipv6_fragmentation.c
diff --git a/lib/librte_ip_frag/Makefile b/lib/librte_ip_frag/Makefile
index 022092d..13a4f9f 100644
--- a/lib/librte_ip_frag/Makefile
+++ b/lib/librte_ip_frag/Makefile
@@ -40,6 +40,7 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
#source files
SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_fragmentation.c
SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c
+SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv6_fragmentation.c
SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ip_frag_common.c
SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += ip_frag_internal.c
diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
index ecae782..4a4b5c3 100644
--- a/lib/librte_ip_frag/rte_ip_frag.h
+++ b/lib/librte_ip_frag/rte_ip_frag.h
@@ -174,6 +174,33 @@ rte_ip_frag_table_destroy( struct rte_ip_frag_tbl *tbl)
}
/**
+ * This function implements the fragmentation of IPv6 packets.
+ *
+ * @param pkt_in
+ * The input packet.
+ * @param pkts_out
+ * Array storing the output fragments.
+ * @param nb_pkts_out
+ * Maximum number of fragments that can be stored in the pkts_out array.
+ * @param mtu_size
+ * Size in bytes of the Maximum Transmission Unit (MTU) for the outgoing
+ * IPv6 datagrams. This value includes the size of the IPv6 header.
+ * @param pool_direct
+ * MBUF pool used for allocating direct buffers for the output fragments.
+ * @param pool_indirect
+ * MBUF pool used for allocating indirect buffers for the output fragments.
+ * @return
+ * Upon successful completion - number of output fragments placed
+ * in the pkts_out array.
+ * Otherwise - (-1) * <errno>.
+ */
+int32_t
+rte_ipv6_fragment_packet(struct rte_mbuf *pkt_in,
+ struct rte_mbuf **pkts_out,
+ uint16_t nb_pkts_out,
+ uint16_t mtu_size,
+ struct rte_mempool *pool_direct,
+ struct rte_mempool *pool_indirect);
+
+/**
* IPv4 fragmentation.
*
* This function implements the fragmentation of IPv4 packets.
diff --git a/lib/librte_ip_frag/rte_ipv6_fragmentation.c b/lib/librte_ip_frag/rte_ipv6_fragmentation.c
new file mode 100644
index 0000000..e8f137c
--- /dev/null
+++ b/lib/librte_ip_frag/rte_ipv6_fragmentation.c
@@ -0,0 +1,219 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <errno.h>
+
+#include <rte_byteorder.h>
+#include <rte_memcpy.h>
+#include <rte_ip.h>
+
+#include "rte_ip_frag.h"
+#include "ip_frag_common.h"
+
+/**
+ * @file
+ * RTE IPv6 Fragmentation
+ *
+ * Implementation of IPv6 fragmentation.
+ *
+ */
+
+/* Fragment Extension Header */
+#define IPV6_HDR_MF_SHIFT 0
+#define IPV6_HDR_FO_SHIFT 3
+#define IPV6_HDR_MF_MASK (1 << IPV6_HDR_MF_SHIFT)
+#define IPV6_HDR_FO_MASK ((1 << IPV6_HDR_FO_SHIFT) - 1)
+
+static inline void
+__fill_ipv6hdr_frag(struct ipv6_hdr *dst,
+ const struct ipv6_hdr *src, uint16_t len, uint16_t fofs,
+ uint32_t mf)
+{
+ struct ipv6_extension_fragment *fh;
+
+ rte_memcpy(dst, src, sizeof(*dst));
+ dst->payload_len = rte_cpu_to_be_16(len);
+ dst->proto = IPPROTO_FRAGMENT;
+
+ fh = (struct ipv6_extension_fragment *) ++dst;
+ fh->next_header = src->proto;
+ fh->reserved1 = 0;
+ fh->frag_offset = rte_cpu_to_be_16(fofs);
+ fh->reserved2 = 0;
+ fh->more_frags = rte_cpu_to_be_16(mf);
+ fh->id = 0;
+}
+
+static inline void
+__free_fragments(struct rte_mbuf *mb[], uint32_t num)
+{
+ uint32_t i;
+ for (i = 0; i < num; i++)
+ rte_pktmbuf_free(mb[i]);
+}
+
+/**
+ * IPv6 fragmentation.
+ *
+ * This function implements the fragmentation of IPv6 packets.
+ *
+ * @param pkt_in
+ * The input packet.
+ * @param pkts_out
+ * Array storing the output fragments.
+ * @param nb_pkts_out
+ * Maximum number of fragments that can be stored in the pkts_out array.
+ * @param mtu_size
+ * Size in bytes of the Maximum Transmission Unit (MTU) for the outgoing
+ * IPv6 datagrams. This value includes the size of the IPv6 header.
+ * @param pool_direct
+ * MBUF pool used for allocating direct buffers for the output fragments.
+ * @param pool_indirect
+ * MBUF pool used for allocating indirect buffers for the output fragments.
+ * @return
+ * Upon successful completion - number of output fragments placed
+ * in the pkts_out array.
+ * Otherwise - (-1) * <errno>.
+ */
+int32_t
+rte_ipv6_fragment_packet(struct rte_mbuf *pkt_in,
+ struct rte_mbuf **pkts_out,
+ uint16_t nb_pkts_out,
+ uint16_t mtu_size,
+ struct rte_mempool *pool_direct,
+ struct rte_mempool *pool_indirect)
+{
+ struct rte_mbuf *in_seg = NULL;
+ struct ipv6_hdr *in_hdr;
+ uint32_t out_pkt_pos, in_seg_data_pos;
+ uint32_t more_in_segs;
+ uint16_t fragment_offset, frag_size;
+
+ frag_size = (uint16_t)(mtu_size - sizeof(struct ipv6_hdr));
+
+ /* Fragment size should be a multiple of 8. */
+ RTE_IP_FRAG_ASSERT((frag_size & IPV6_HDR_FO_MASK) == 0);
+
+ /* Check that pkts_out is big enough to hold all fragments */
+ if (unlikely (frag_size * nb_pkts_out <
+ (uint16_t)(pkt_in->pkt.pkt_len - sizeof (struct ipv6_hdr))))
+ return (-EINVAL);
+
+ in_hdr = (struct ipv6_hdr *) pkt_in->pkt.data;
+
+ in_seg = pkt_in;
+ in_seg_data_pos = sizeof(struct ipv6_hdr);
+ out_pkt_pos = 0;
+ fragment_offset = 0;
+
+ more_in_segs = 1;
+ while (likely(more_in_segs)) {
+ struct rte_mbuf *out_pkt = NULL, *out_seg_prev = NULL;
+ uint32_t more_out_segs;
+ struct ipv6_hdr *out_hdr;
+
+ /* Allocate direct buffer */
+ out_pkt = rte_pktmbuf_alloc(pool_direct);
+ if (unlikely(out_pkt == NULL)) {
+ __free_fragments(pkts_out, out_pkt_pos);
+ return (-ENOMEM);
+ }
+
+ /* Reserve space for the IP header that will be built later */
+ out_pkt->pkt.data_len = sizeof(struct ipv6_hdr) + sizeof(struct ipv6_extension_fragment);
+ out_pkt->pkt.pkt_len = sizeof(struct ipv6_hdr) + sizeof(struct ipv6_extension_fragment);
+
+ out_seg_prev = out_pkt;
+ more_out_segs = 1;
+ while (likely(more_out_segs && more_in_segs)) {
+ struct rte_mbuf *out_seg = NULL;
+ uint32_t len;
+
+ /* Allocate indirect buffer */
+ out_seg = rte_pktmbuf_alloc(pool_indirect);
+ if (unlikely(out_seg == NULL)) {
+ rte_pktmbuf_free(out_pkt);
+ __free_fragments(pkts_out, out_pkt_pos);
+ return (-ENOMEM);
+ }
+ out_seg_prev->pkt.next = out_seg;
+ out_seg_prev = out_seg;
+
+ /* Prepare indirect buffer */
+ rte_pktmbuf_attach(out_seg, in_seg);
+ len = mtu_size - out_pkt->pkt.pkt_len;
+ if (len > (in_seg->pkt.data_len - in_seg_data_pos)) {
+ len = in_seg->pkt.data_len - in_seg_data_pos;
+ }
+ out_seg->pkt.data = (char *) in_seg->pkt.data + (uint16_t) in_seg_data_pos;
+ out_seg->pkt.data_len = (uint16_t)len;
+ out_pkt->pkt.pkt_len = (uint16_t)(len +
+ out_pkt->pkt.pkt_len);
+ out_pkt->pkt.nb_segs += 1;
+ in_seg_data_pos += len;
+
+ /* Current output packet (i.e. fragment) done ? */
+ if (unlikely(out_pkt->pkt.pkt_len >= mtu_size)) {
+ more_out_segs = 0;
+ }
+
+ /* Current input segment done ? */
+ if (unlikely(in_seg_data_pos == in_seg->pkt.data_len)) {
+ in_seg = in_seg->pkt.next;
+ in_seg_data_pos = 0;
+
+ if (unlikely(in_seg == NULL)) {
+ more_in_segs = 0;
+ }
+ }
+ }
+
+ /* Build the IP header */
+
+ out_hdr = (struct ipv6_hdr *) out_pkt->pkt.data;
+
+ __fill_ipv6hdr_frag(out_hdr, in_hdr,
+ (uint16_t) out_pkt->pkt.pkt_len - sizeof(struct ipv6_hdr),
+ fragment_offset, more_in_segs);
+
+ fragment_offset = (uint16_t)(fragment_offset +
+ out_pkt->pkt.pkt_len - sizeof(struct ipv6_hdr)
+ - sizeof(struct ipv6_extension_fragment));
+
+ /* Write the fragment to the output list */
+ pkts_out[out_pkt_pos] = out_pkt;
+ out_pkt_pos++;
+ }
+
+ return (out_pkt_pos);
+}
--
1.8.1.4
* [dpdk-dev] [PATCH 10/13] examples: renamed ipv4_frag example app to ip_fragmentation
@ 2014-05-28 17:32 ` Anatoly Burakov
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
examples/{ipv4_frag => ip_fragmentation}/Makefile | 2 +-
examples/{ipv4_frag => ip_fragmentation}/main.c | 0
examples/{ipv4_frag => ip_fragmentation}/main.h | 0
3 files changed, 1 insertion(+), 1 deletion(-)
rename examples/{ipv4_frag => ip_fragmentation}/Makefile (99%)
rename examples/{ipv4_frag => ip_fragmentation}/main.c (100%)
rename examples/{ipv4_frag => ip_fragmentation}/main.h (100%)
diff --git a/examples/ipv4_frag/Makefile b/examples/ip_fragmentation/Makefile
similarity index 99%
rename from examples/ipv4_frag/Makefile
rename to examples/ip_fragmentation/Makefile
index 5fc4d9e..1482772 100644
--- a/examples/ipv4_frag/Makefile
+++ b/examples/ip_fragmentation/Makefile
@@ -44,7 +44,7 @@ $(error This application requires RTE_MBUF_SCATTER_GATHER to be enabled)
endif
# binary name
-APP = ipv4_frag
+APP = ip_fragmentation
# all source are stored in SRCS-y
SRCS-y := main.c
diff --git a/examples/ipv4_frag/main.c b/examples/ip_fragmentation/main.c
similarity index 100%
rename from examples/ipv4_frag/main.c
rename to examples/ip_fragmentation/main.c
diff --git a/examples/ipv4_frag/main.h b/examples/ip_fragmentation/main.h
similarity index 100%
rename from examples/ipv4_frag/main.h
rename to examples/ip_fragmentation/main.h
--
1.8.1.4
* [dpdk-dev] [PATCH 11/13] example: overhaul of ip_fragmentation example app
@ 2014-05-28 17:32 ` Anatoly Burakov
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
New stuff:
* Support for non-IP traffic as well as IPv4 and IPv6
* Simplified config
* Routing table printed out on start
* Uses LPM/LPM6 for lookup
* Unmatched traffic is sent to the originating port
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
examples/ip_fragmentation/main.c | 547 ++++++++++++++++++++++++++++-----------
1 file changed, 403 insertions(+), 144 deletions(-)
diff --git a/examples/ip_fragmentation/main.c b/examples/ip_fragmentation/main.c
index 7aff99b..2ce564c 100644
--- a/examples/ip_fragmentation/main.c
+++ b/examples/ip_fragmentation/main.c
@@ -69,23 +69,15 @@
#include <rte_mempool.h>
#include <rte_mbuf.h>
#include <rte_lpm.h>
+#include <rte_lpm6.h>
#include <rte_ip.h>
+#include <rte_string_fns.h>
-#include "rte_ip_frag.h"
-#include "main.h"
-
-/*
- * Default byte size for the IPv4 Maximum Transfer Unit (MTU).
- * This value includes the size of IPv4 header.
- */
-#define IPV4_MTU_DEFAULT ETHER_MTU
+#include <rte_ip_frag.h>
-/*
- * Default payload in bytes for the IPv4 packet.
- */
-#define IPV4_DEFAULT_PAYLOAD (IPV4_MTU_DEFAULT - sizeof(struct ipv4_hdr))
+#include "main.h"
-#define RTE_LOGTYPE_L3FWD RTE_LOGTYPE_USER1
+#define RTE_LOGTYPE_IP_FRAG RTE_LOGTYPE_USER1
#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
@@ -95,9 +87,22 @@
#define ROUNDUP_DIV(a, b) (((a) + (b) - 1) / (b))
/*
- * Max number of fragments per packet expected.
+ * Default byte size for the IPv4/IPv6 Maximum Transmission Unit (MTU).
+ * This value includes the size of the IP header.
+ */
+#define IPV4_MTU_DEFAULT ETHER_MTU
+#define IPV6_MTU_DEFAULT ETHER_MTU
+
+/*
+ * Default payload in bytes for the IPv4/IPv6 packet.
+ */
+#define IPV4_DEFAULT_PAYLOAD (IPV4_MTU_DEFAULT - sizeof(struct ipv4_hdr))
+#define IPV6_DEFAULT_PAYLOAD (IPV6_MTU_DEFAULT - sizeof(struct ipv6_hdr))
+
+/*
+ * Max number of fragments per packet expected - defined by config file.
*/
-#define MAX_PACKET_FRAG ROUNDUP_DIV(JUMBO_FRAME_MAX_SIZE, IPV4_DEFAULT_PAYLOAD)
+#define MAX_PACKET_FRAG RTE_LIBRTE_IP_FRAG_MAX_FRAG
#define NB_MBUF 8192
@@ -136,8 +141,27 @@ static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
/* ethernet addresses of ports */
static struct ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];
-static struct ether_addr remote_eth_addr =
- {{0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}};
+
+#ifndef IPv4_BYTES
+#define IPv4_BYTES_FMT "%" PRIu8 ".%" PRIu8 ".%" PRIu8 ".%" PRIu8
+#define IPv4_BYTES(addr) \
+ (uint8_t) (((addr) >> 24) & 0xFF),\
+ (uint8_t) (((addr) >> 16) & 0xFF),\
+ (uint8_t) (((addr) >> 8) & 0xFF),\
+ (uint8_t) ((addr) & 0xFF)
+#endif
+
+#ifndef IPv6_BYTES
+#define IPv6_BYTES_FMT "%02x%02x:%02x%02x:%02x%02x:%02x%02x:"\
+ "%02x%02x:%02x%02x:%02x%02x:%02x%02x"
+#define IPv6_BYTES(addr) \
+ addr[0], addr[1], addr[2], addr[3], \
+ addr[4], addr[5], addr[6], addr[7], \
+ addr[8], addr[9], addr[10], addr[11],\
+ addr[12], addr[13], addr[14], addr[15]
+#endif
+
+#define IPV6_ADDR_LEN 16
/* mask of enabled ports */
static int enabled_port_mask = 0;
@@ -151,14 +175,21 @@ struct mbuf_table {
struct rte_mbuf *m_table[MBUF_TABLE_SIZE];
};
+struct rx_queue {
+ struct rte_mempool * direct_pool;
+ struct rte_mempool * indirect_pool;
+ struct rte_lpm * lpm;
+ struct rte_lpm6 * lpm6;
+ uint8_t portid;
+};
+
#define MAX_RX_QUEUE_PER_LCORE 16
#define MAX_TX_QUEUE_PER_PORT 16
struct lcore_queue_conf {
uint16_t n_rx_queue;
- uint8_t rx_queue_list[MAX_RX_QUEUE_PER_LCORE];
uint16_t tx_queue_id[RTE_MAX_ETHPORTS];
+ struct rx_queue rx_queue_list[MAX_RX_QUEUE_PER_LCORE];
struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
-
} __rte_cache_aligned;
struct lcore_queue_conf lcore_queue_conf[RTE_MAX_LCORE];
@@ -167,7 +198,7 @@ static const struct rte_eth_conf port_conf = {
.max_rx_pkt_len = JUMBO_FRAME_MAX_SIZE,
.split_hdr_size = 0,
.header_split = 0, /**< Header Split disabled */
- .hw_ip_checksum = 0, /**< IP checksum offload disabled */
+ .hw_ip_checksum = 1, /**< IP checksum offload enabled */
.hw_vlan_filter = 0, /**< VLAN filtering disabled */
.jumbo_frame = 1, /**< Jumbo Frame Support enabled */
.hw_strip_crc = 0, /**< CRC stripped by hardware */
@@ -195,27 +226,61 @@ static const struct rte_eth_txconf tx_conf = {
.tx_rs_thresh = 0, /* Use PMD default values */
};
-struct rte_mempool *pool_direct = NULL, *pool_indirect = NULL;
-
-struct l3fwd_route {
+/*
+ * IPv4 forwarding table
+ */
+struct l3fwd_ipv4_route {
uint32_t ip;
uint8_t depth;
uint8_t if_out;
};
-struct l3fwd_route l3fwd_route_array[] = {
- {IPv4(100,10,0,0), 16, 2},
- {IPv4(100,20,0,0), 16, 2},
- {IPv4(100,30,0,0), 16, 0},
- {IPv4(100,40,0,0), 16, 0},
+struct l3fwd_ipv4_route l3fwd_ipv4_route_array[] = {
+ {IPv4(100,10,0,0), 16, 0},
+ {IPv4(100,20,0,0), 16, 1},
+ {IPv4(100,30,0,0), 16, 2},
+ {IPv4(100,40,0,0), 16, 3},
+ {IPv4(100,50,0,0), 16, 4},
+ {IPv4(100,60,0,0), 16, 5},
+ {IPv4(100,70,0,0), 16, 6},
+ {IPv4(100,80,0,0), 16, 7},
};
-#define L3FWD_NUM_ROUTES \
- (sizeof(l3fwd_route_array) / sizeof(l3fwd_route_array[0]))
+/*
+ * IPv6 forwarding table
+ */
+
+struct l3fwd_ipv6_route {
+ uint8_t ip[IPV6_ADDR_LEN];
+ uint8_t depth;
+ uint8_t if_out;
+};
+
+static struct l3fwd_ipv6_route l3fwd_ipv6_route_array[] = {
+ {{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 0},
+ {{2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 1},
+ {{3,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 2},
+ {{4,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 3},
+ {{5,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 4},
+ {{6,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 5},
+ {{7,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 6},
+ {{8,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 7},
+};
+
+#define LPM_MAX_RULES 1024
+#define LPM6_MAX_RULES 1024
+#define LPM6_NUMBER_TBL8S (1 << 16)
-#define L3FWD_LPM_MAX_RULES 1024
+struct rte_lpm6_config lpm6_config = {
+ .max_rules = LPM6_MAX_RULES,
+ .number_tbl8s = LPM6_NUMBER_TBL8S,
+ .flags = 0
+};
-struct rte_lpm *l3fwd_lpm = NULL;
+static struct rte_mempool *socket_direct_pool[RTE_MAX_NUMA_NODES];
+static struct rte_mempool *socket_indirect_pool[RTE_MAX_NUMA_NODES];
+static struct rte_lpm *socket_lpm[RTE_MAX_NUMA_NODES];
+static struct rte_lpm6 *socket_lpm6[RTE_MAX_NUMA_NODES];
/* Send burst of packets on an output interface */
static inline int
@@ -239,54 +304,108 @@ send_burst(struct lcore_queue_conf *qconf, uint16_t n, uint8_t port)
}
static inline void
-l3fwd_simple_forward(struct rte_mbuf *m, uint8_t port_in)
+l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
+ uint8_t queueid, uint8_t port_in)
{
- struct lcore_queue_conf *qconf;
- struct ipv4_hdr *ip_hdr;
- uint32_t i, len, lcore_id, ip_dst;
- uint8_t next_hop, port_out;
+ struct rx_queue *rxq;
+ uint32_t i, len;
+ uint8_t next_hop, port_out, ipv6;
int32_t len2;
- lcore_id = rte_lcore_id();
- qconf = &lcore_queue_conf[lcore_id];
+ ipv6 = 0;
+ rxq = &qconf->rx_queue_list[queueid];
+
+ /* by default, send everything back to the source port */
+ port_out = port_in;
/* Remove the Ethernet header and trailer from the input packet */
rte_pktmbuf_adj(m, (uint16_t)sizeof(struct ether_hdr));
- /* Read the lookup key (i.e. ip_dst) from the input packet */
- ip_hdr = rte_pktmbuf_mtod(m, struct ipv4_hdr *);
- ip_dst = rte_be_to_cpu_32(ip_hdr->dst_addr);
-
- /* Find destination port */
- if (rte_lpm_lookup(l3fwd_lpm, ip_dst, &next_hop) == 0 &&
- (enabled_port_mask & 1 << next_hop) != 0)
- port_out = next_hop;
- else
- port_out = port_in;
-
/* Build transmission burst */
len = qconf->tx_mbufs[port_out].len;
- /* if we don't need to do any fragmentation */
- if (likely (IPV4_MTU_DEFAULT >= m->pkt.pkt_len)) {
+ /* if this is an IPv4 packet */
+ if (m->ol_flags & PKT_RX_IPV4_HDR) {
+ struct ipv4_hdr *ip_hdr;
+ uint32_t ip_dst;
+ /* Read the lookup key (i.e. ip_dst) from the input packet */
+ ip_hdr = rte_pktmbuf_mtod(m, struct ipv4_hdr *);
+ ip_dst = rte_be_to_cpu_32(ip_hdr->dst_addr);
+
+ /* Find destination port */
+ if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop) == 0 &&
+ (enabled_port_mask & 1 << next_hop) != 0) {
+ port_out = next_hop;
+
+ /* Build transmission burst for new port */
+ len = qconf->tx_mbufs[port_out].len;
+ }
+
+ /* if we don't need to do any fragmentation */
+ if (likely (IPV4_MTU_DEFAULT >= m->pkt.pkt_len)) {
+ qconf->tx_mbufs[port_out].m_table[len] = m;
+ len2 = 1;
+ } else {
+ len2 = rte_ipv4_fragment_packet(m,
+ &qconf->tx_mbufs[port_out].m_table[len],
+ (uint16_t)(MBUF_TABLE_SIZE - len),
+ IPV4_MTU_DEFAULT,
+ rxq->direct_pool, rxq->indirect_pool);
+
+ /* Free input packet */
+ rte_pktmbuf_free(m);
+
+ /* If we fail to fragment the packet */
+ if (unlikely (len2 < 0))
+ return;
+ }
+ }
+ /* if this is an IPv6 packet */
+ else if (m->ol_flags & PKT_RX_IPV6_HDR) {
+ struct ipv6_hdr *ip_hdr;
+
+ ipv6 = 1;
+
+ /* Read the lookup key (i.e. ip_dst) from the input packet */
+ ip_hdr = rte_pktmbuf_mtod(m, struct ipv6_hdr *);
+
+ /* Find destination port */
+ if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr, &next_hop) == 0 &&
+ (enabled_port_mask & 1 << next_hop) != 0) {
+ port_out = next_hop;
+
+ /* Build transmission burst for new port */
+ len = qconf->tx_mbufs[port_out].len;
+ }
+
+ /* if we don't need to do any fragmentation */
+ if (likely (IPV6_MTU_DEFAULT >= m->pkt.pkt_len)) {
+ qconf->tx_mbufs[port_out].m_table[len] = m;
+ len2 = 1;
+ } else {
+ len2 = rte_ipv6_fragment_packet(m,
+ &qconf->tx_mbufs[port_out].m_table[len],
+ (uint16_t)(MBUF_TABLE_SIZE - len),
+ IPV6_MTU_DEFAULT,
+ rxq->direct_pool, rxq->indirect_pool);
+
+ /* Free input packet */
+ rte_pktmbuf_free(m);
+
+ /* If we fail to fragment the packet */
+ if (unlikely (len2 < 0))
+ return;
+ }
+ }
+ /* else, just forward the packet */
+ else {
qconf->tx_mbufs[port_out].m_table[len] = m;
len2 = 1;
- } else {
- len2 = rte_ipv4_fragment_packet(m,
- &qconf->tx_mbufs[port_out].m_table[len],
- (uint16_t)(MBUF_TABLE_SIZE - len),
- IPV4_MTU_DEFAULT,
- pool_direct, pool_indirect);
-
- /* Free input packet */
- rte_pktmbuf_free(m);
-
- /* If we fail to fragment the packet */
- if (unlikely (len2 < 0))
- return;
}
for (i = len; i < len + len2; i ++) {
+ void *d_addr_bytes;
+
m = qconf->tx_mbufs[port_out].m_table[i];
struct ether_hdr *eth_hdr = (struct ether_hdr *)
rte_pktmbuf_prepend(m, (uint16_t)sizeof(struct ether_hdr));
@@ -296,9 +415,16 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t port_in)
m->pkt.vlan_macip.f.l2_len = sizeof(struct ether_hdr);
- ether_addr_copy(&remote_eth_addr, ð_hdr->d_addr);
+ /* 02:00:00:00:00:xx */
+ d_addr_bytes = ð_hdr->d_addr.addr_bytes[0];
+ *((uint64_t *)d_addr_bytes) = 0x000000000002 + ((uint64_t)port_out << 40);
+
+ /* src addr */
ether_addr_copy(&ports_eth_addr[port_out], ð_hdr->s_addr);
- eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv4);
+ if (ipv6)
+ eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv6);
+ else
+ eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv4);
}
len += len2;
@@ -331,17 +457,17 @@ main_loop(__attribute__((unused)) void *dummy)
qconf = &lcore_queue_conf[lcore_id];
if (qconf->n_rx_queue == 0) {
- RTE_LOG(INFO, L3FWD, "lcore %u has nothing to do\n", lcore_id);
+ RTE_LOG(INFO, IP_FRAG, "lcore %u has nothing to do\n", lcore_id);
return 0;
}
- RTE_LOG(INFO, L3FWD, "entering main loop on lcore %u\n", lcore_id);
+ RTE_LOG(INFO, IP_FRAG, "entering main loop on lcore %u\n", lcore_id);
for (i = 0; i < qconf->n_rx_queue; i++) {
- portid = qconf->rx_queue_list[i];
- RTE_LOG(INFO, L3FWD, " -- lcoreid=%u portid=%d\n", lcore_id,
- (int) portid);
+ portid = qconf->rx_queue_list[i].portid;
+ RTE_LOG(INFO, IP_FRAG, " -- lcoreid=%u portid=%d\n", lcore_id,
+ (int) portid);
}
while (1) {
@@ -375,7 +501,7 @@ main_loop(__attribute__((unused)) void *dummy)
*/
for (i = 0; i < qconf->n_rx_queue; i++) {
- portid = qconf->rx_queue_list[i];
+ portid = qconf->rx_queue_list[i].portid;
nb_rx = rte_eth_rx_burst(portid, 0, pkts_burst,
MAX_PKT_BURST);
@@ -389,12 +515,12 @@ main_loop(__attribute__((unused)) void *dummy)
for (j = 0; j < (nb_rx - PREFETCH_OFFSET); j++) {
rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[
j + PREFETCH_OFFSET], void *));
- l3fwd_simple_forward(pkts_burst[j], portid);
+ l3fwd_simple_forward(pkts_burst[j], qconf, i, portid);
}
/* Forward remaining prefetched packets */
for (; j < nb_rx; j++) {
- l3fwd_simple_forward(pkts_burst[j], portid);
+ l3fwd_simple_forward(pkts_burst[j], qconf, i, portid);
}
}
}
@@ -570,17 +696,164 @@ check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
/* set the print_flag if all ports up or timeout */
if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
print_flag = 1;
- printf("done\n");
+ printf("\ndone\n");
}
}
}
+static int
+init_routing_table(void)
+{
+ struct rte_lpm * lpm;
+ struct rte_lpm6 * lpm6;
+ int socket, ret;
+ unsigned i;
+
+ for (socket = 0; socket < RTE_MAX_NUMA_NODES; socket++) {
+ if (socket_lpm[socket]) {
+ lpm = socket_lpm[socket];
+ /* populate the LPM table */
+ for (i = 0; i < RTE_DIM(l3fwd_ipv4_route_array); i++) {
+ ret = rte_lpm_add(lpm,
+ l3fwd_ipv4_route_array[i].ip,
+ l3fwd_ipv4_route_array[i].depth,
+ l3fwd_ipv4_route_array[i].if_out);
+
+ if (ret < 0) {
+ RTE_LOG(ERR, IP_FRAG, "Unable to add entry %i to the l3fwd "
+ "LPM table\n", i);
+ return -1;
+ }
+
+ RTE_LOG(INFO, IP_FRAG, "Socket %i: adding route " IPv4_BYTES_FMT
+ "/%d (port %d)\n",
+ socket,
+ IPv4_BYTES(l3fwd_ipv4_route_array[i].ip),
+ l3fwd_ipv4_route_array[i].depth,
+ l3fwd_ipv4_route_array[i].if_out);
+ }
+ }
+
+ if (socket_lpm6[socket]) {
+ lpm6 = socket_lpm6[socket];
+ /* populate the LPM6 table */
+ for (i = 0; i < RTE_DIM(l3fwd_ipv6_route_array); i++) {
+ ret = rte_lpm6_add(lpm6,
+ l3fwd_ipv6_route_array[i].ip,
+ l3fwd_ipv6_route_array[i].depth,
+ l3fwd_ipv6_route_array[i].if_out);
+
+ if (ret < 0) {
+ RTE_LOG(ERR, IP_FRAG, "Unable to add entry %i to the l3fwd "
+ "LPM6 table\n", i);
+ return -1;
+ }
+
+ RTE_LOG(INFO, IP_FRAG, "Socket %i: adding route " IPv6_BYTES_FMT
+ "/%d (port %d)\n",
+ socket,
+ IPv6_BYTES(l3fwd_ipv6_route_array[i].ip),
+ l3fwd_ipv6_route_array[i].depth,
+ l3fwd_ipv6_route_array[i].if_out);
+ }
+ }
+ }
+ return 0;
+}
+
+static int
+init_mem(void)
+{
+ char buf[PATH_MAX];
+ struct rte_mempool * mp;
+ struct rte_lpm * lpm;
+ struct rte_lpm6 * lpm6;
+ int socket;
+ unsigned lcore_id;
+
+ /* traverse through lcores and initialize structures on each socket */
+
+ for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+
+ if (rte_lcore_is_enabled(lcore_id) == 0)
+ continue;
+
+ socket = rte_lcore_to_socket_id(lcore_id);
+
+ if (socket == SOCKET_ID_ANY)
+ socket = 0;
+
+ if (socket_direct_pool[socket] == NULL) {
+ RTE_LOG(INFO, IP_FRAG, "Creating direct mempool on socket %i\n",
+ socket);
+ rte_snprintf(buf, sizeof(buf), "pool_direct_%i", socket);
+
+ mp = rte_mempool_create(buf, NB_MBUF,
+ MBUF_SIZE, 32,
+ sizeof(struct rte_pktmbuf_pool_private),
+ rte_pktmbuf_pool_init, NULL,
+ rte_pktmbuf_init, NULL,
+ socket, 0);
+ if (mp == NULL) {
+ RTE_LOG(ERR, IP_FRAG, "Cannot create direct mempool\n");
+ return -1;
+ }
+ socket_direct_pool[socket] = mp;
+ }
+
+ if (socket_indirect_pool[socket] == NULL) {
+ RTE_LOG(INFO, IP_FRAG, "Creating indirect mempool on socket %i\n",
+ socket);
+ rte_snprintf(buf, sizeof(buf), "pool_indirect_%i", socket);
+
+ mp = rte_mempool_create(buf, NB_MBUF,
+ sizeof(struct rte_mbuf), 32,
+ 0,
+ NULL, NULL,
+ rte_pktmbuf_init, NULL,
+ socket, 0);
+ if (mp == NULL) {
+ RTE_LOG(ERR, IP_FRAG, "Cannot create indirect mempool\n");
+ return -1;
+ }
+ socket_indirect_pool[socket] = mp;
+ }
+
+ if (socket_lpm[socket] == NULL) {
+ RTE_LOG(INFO, IP_FRAG, "Creating LPM table on socket %i\n", socket);
+ rte_snprintf(buf, sizeof(buf), "IP_FRAG_LPM_%i", socket);
+
+ lpm = rte_lpm_create(buf, socket, LPM_MAX_RULES, 0);
+ if (lpm == NULL) {
+ RTE_LOG(ERR, IP_FRAG, "Cannot create LPM table\n");
+ return -1;
+ }
+ socket_lpm[socket] = lpm;
+ }
+
+ if (socket_lpm6[socket] == NULL) {
+ RTE_LOG(INFO, IP_FRAG, "Creating LPM6 table on socket %i\n", socket);
+ rte_snprintf(buf, sizeof(buf), "IP_FRAG_LPM6_%i", socket);
+
+ lpm6 = rte_lpm6_create(buf, socket, &lpm6_config);
+ if (lpm6 == NULL) {
+ RTE_LOG(ERR, IP_FRAG, "Cannot create LPM6 table\n");
+ return -1;
+ }
+ socket_lpm6[socket] = lpm6;
+ }
+ }
+
+ return 0;
+}
+
int
MAIN(int argc, char **argv)
{
struct lcore_queue_conf *qconf;
- int ret;
- unsigned nb_ports, i;
+ struct rx_queue * rxq;
+ int socket, ret;
+ unsigned nb_ports;
uint16_t queueid = 0;
unsigned lcore_id = 0, rx_lcore_id = 0;
uint32_t n_tx_queue, nb_lcores;
@@ -598,36 +871,21 @@ MAIN(int argc, char **argv)
if (ret < 0)
rte_exit(EXIT_FAILURE, "Invalid arguments");
- /* create the mbuf pools */
- pool_direct =
- rte_mempool_create("pool_direct", NB_MBUF,
- MBUF_SIZE, 32,
- sizeof(struct rte_pktmbuf_pool_private),
- rte_pktmbuf_pool_init, NULL,
- rte_pktmbuf_init, NULL,
- rte_socket_id(), 0);
- if (pool_direct == NULL)
- rte_panic("Cannot init direct mbuf pool\n");
-
- pool_indirect =
- rte_mempool_create("pool_indirect", NB_MBUF,
- sizeof(struct rte_mbuf), 32,
- 0,
- NULL, NULL,
- rte_pktmbuf_init, NULL,
- rte_socket_id(), 0);
- if (pool_indirect == NULL)
- rte_panic("Cannot init indirect mbuf pool\n");
-
if (rte_eal_pci_probe() < 0)
rte_panic("Cannot probe PCI\n");
nb_ports = rte_eth_dev_count();
if (nb_ports > RTE_MAX_ETHPORTS)
nb_ports = RTE_MAX_ETHPORTS;
+ else if (nb_ports == 0)
+ rte_exit(EXIT_FAILURE, "No ports found!\n");
nb_lcores = rte_lcore_count();
+ /* initialize structures (mempools, lpm etc.) */
+ if (init_mem() < 0)
+ rte_panic("Cannot initialize memory structures!\n");
+
/* initialize all ports */
for (portid = 0; portid < nb_ports; portid++) {
/* skip ports that are not enabled */
@@ -648,11 +906,21 @@ MAIN(int argc, char **argv)
qconf = &lcore_queue_conf[rx_lcore_id];
}
- qconf->rx_queue_list[qconf->n_rx_queue] = portid;
+
+ socket = rte_eth_dev_socket_id(portid);
+ if (socket == SOCKET_ID_ANY)
+ socket = 0;
+
+ rxq = &qconf->rx_queue_list[qconf->n_rx_queue];
+ rxq->portid = portid;
+ rxq->direct_pool = socket_direct_pool[socket];
+ rxq->indirect_pool = socket_indirect_pool[socket];
+ rxq->lpm = socket_lpm[socket];
+ rxq->lpm6 = socket_lpm6[socket];
qconf->n_rx_queue++;
/* init port */
- printf("Initializing port %d on lcore %u... ", portid,
+ printf("Initializing port %d on lcore %u...", portid,
rx_lcore_id);
fflush(stdout);
@@ -661,82 +929,73 @@ MAIN(int argc, char **argv)
n_tx_queue = MAX_TX_QUEUE_PER_PORT;
ret = rte_eth_dev_configure(portid, 1, (uint16_t)n_tx_queue,
&port_conf);
- if (ret < 0)
+ if (ret < 0) {
+ printf("\n");
rte_exit(EXIT_FAILURE, "Cannot configure device: "
"err=%d, port=%d\n",
ret, portid);
-
- rte_eth_macaddr_get(portid, &ports_eth_addr[portid]);
- print_ethaddr(" Address:", &ports_eth_addr[portid]);
- printf(", ");
+ }
/* init one RX queue */
- queueid = 0;
- printf("rxq=%d ", queueid);
- fflush(stdout);
- ret = rte_eth_rx_queue_setup(portid, queueid, nb_rxd,
- rte_eth_dev_socket_id(portid), &rx_conf,
- pool_direct);
- if (ret < 0)
- rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup: "
+ ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
+ socket, &rx_conf,
+ socket_direct_pool[socket]);
+ if (ret < 0) {
+ printf("\n");
+ rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup: "
"err=%d, port=%d\n",
ret, portid);
+ }
+
+ rte_eth_macaddr_get(portid, &ports_eth_addr[portid]);
+ print_ethaddr(" Address:", &ports_eth_addr[portid]);
+ printf("\n");
/* init one TX queue per couple (lcore,port) */
queueid = 0;
for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
if (rte_lcore_is_enabled(lcore_id) == 0)
continue;
+
+ socket = (int) rte_lcore_to_socket_id(lcore_id);
printf("txq=%u,%d ", lcore_id, queueid);
fflush(stdout);
ret = rte_eth_tx_queue_setup(portid, queueid, nb_txd,
- rte_eth_dev_socket_id(portid), &tx_conf);
- if (ret < 0)
+ socket, &tx_conf);
+ if (ret < 0) {
+ printf("\n");
rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup: "
"err=%d, port=%d\n", ret, portid);
+ }
qconf = &lcore_queue_conf[lcore_id];
qconf->tx_queue_id[portid] = queueid;
queueid++;
}
+ printf("\n");
+ }
+
+ printf("\n");
+
+ /* start ports */
+ for (portid = 0; portid < nb_ports; portid++) {
+ if ((enabled_port_mask & (1 << portid)) == 0) {
+ continue;
+ }
/* Start device */
ret = rte_eth_dev_start(portid);
if (ret < 0)
- rte_exit(EXIT_FAILURE, "rte_eth_dev_start: "
- "err=%d, port=%d\n",
+ rte_exit(EXIT_FAILURE, "rte_eth_dev_start: err=%d, port=%d\n",
ret, portid);
- printf("done: ");
-
- /* Set port in promiscuous mode */
rte_eth_promiscuous_enable(portid);
}
- check_all_ports_link_status((uint8_t)nb_ports, enabled_port_mask);
-
- /* create the LPM table */
- l3fwd_lpm = rte_lpm_create("L3FWD_LPM", rte_socket_id(), L3FWD_LPM_MAX_RULES, 0);
- if (l3fwd_lpm == NULL)
- rte_panic("Unable to create the l3fwd LPM table\n");
-
- /* populate the LPM table */
- for (i = 0; i < L3FWD_NUM_ROUTES; i++) {
- ret = rte_lpm_add(l3fwd_lpm,
- l3fwd_route_array[i].ip,
- l3fwd_route_array[i].depth,
- l3fwd_route_array[i].if_out);
-
- if (ret < 0) {
- rte_panic("Unable to add entry %u to the l3fwd "
- "LPM table\n", i);
- }
+ if (init_routing_table() < 0)
+ rte_exit(EXIT_FAILURE, "Cannot init routing table\n");
- printf("Adding route 0x%08x / %d (%d)\n",
- (unsigned) l3fwd_route_array[i].ip,
- l3fwd_route_array[i].depth,
- l3fwd_route_array[i].if_out);
- }
+ check_all_ports_link_status((uint8_t)nb_ports, enabled_port_mask);
/* launch per-lcore init on every lcore */
rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
--
1.8.1.4
* [dpdk-dev] [PATCH 12/13] ip_frag: add support for IPv6 reassembly
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
Mostly a copy-paste of the IPv4 code, with a few caveats.
The only supported packets are those in which the fragment extension
header comes immediately after the fixed IPv6 header.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_ip_frag/Makefile | 1 +
lib/librte_ip_frag/ip_frag_common.h | 25 +++-
lib/librte_ip_frag/ip_frag_internal.c | 172 +++++++++++++++++-------
lib/librte_ip_frag/rte_ip_frag.h | 51 +++++++-
lib/librte_ip_frag/rte_ipv4_reassembly.c | 4 +-
lib/librte_ip_frag/rte_ipv6_reassembly.c | 218 +++++++++++++++++++++++++++++++
6 files changed, 421 insertions(+), 50 deletions(-)
create mode 100644 lib/librte_ip_frag/rte_ipv6_reassembly.c
diff --git a/lib/librte_ip_frag/Makefile b/lib/librte_ip_frag/Makefile
index 13a4f9f..29aa36f 100644
--- a/lib/librte_ip_frag/Makefile
+++ b/lib/librte_ip_frag/Makefile
@@ -41,6 +41,7 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_fragmentation.c
SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c
SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv6_fragmentation.c
+SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv6_reassembly.c
SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ip_frag_common.c
SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += ip_frag_internal.c
diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h
index 3e588a0..ac5cd61 100644
--- a/lib/librte_ip_frag/ip_frag_common.h
+++ b/lib/librte_ip_frag/ip_frag_common.h
@@ -51,9 +51,17 @@ if (!(exp)) { \
#define RTE_IP_FRAG_ASSERT(exp) do { } while(0)
#endif /* IP_FRAG_DEBUG */
+#define IPV4_KEYLEN 1
+#define IPV6_KEYLEN 4
+
/* helper macros */
#define IP_FRAG_MBUF2DR(dr, mb) ((dr)->row[(dr)->cnt++] = (mb))
+#define IPv6_KEY_BYTES(key) \
+ (key)[0], (key)[1], (key)[2], (key)[3]
+#define IPv6_KEY_BYTES_FMT \
+ "%08" PRIx64 "%08" PRIx64 "%08" PRIx64 "%08" PRIx64
+
/* internal functions declarations */
struct rte_mbuf * ip_frag_process(struct rte_ip_frag_pkt *fp,
struct rte_ip_frag_death_row *dr, struct rte_mbuf *mb,
@@ -69,6 +77,7 @@ struct rte_ip_frag_pkt * ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
/* these functions need to be declared here as ip_frag_process relies on them */
struct rte_mbuf * ipv4_frag_reassemble(const struct rte_ip_frag_pkt *fp);
+struct rte_mbuf * ipv6_frag_reassemble(const struct rte_ip_frag_pkt *fp);
@@ -80,8 +89,10 @@ struct rte_mbuf * ipv4_frag_reassemble(const struct rte_ip_frag_pkt *fp);
static inline int
ip_frag_key_is_empty(const struct ip_frag_key * key)
{
- if (key->src_dst != 0)
- return 0;
+ uint32_t i;
+ for (i = 0; i < key->key_len; i++)
+ if (key->src_dst[i] != 0)
+ return 0;
return 1;
}
@@ -89,14 +100,20 @@ ip_frag_key_is_empty(const struct ip_frag_key * key)
static inline void
ip_frag_key_invalidate(struct ip_frag_key * key)
{
- key->src_dst = 0;
+ uint32_t i;
+ for (i = 0; i < key->key_len; i++)
+ key->src_dst[i] = 0;
}
/* compare two keys */
static inline int
ip_frag_key_cmp(const struct ip_frag_key * k1, const struct ip_frag_key * k2)
{
- return k1->src_dst ^ k2->src_dst;
+ uint32_t i;
+ uint64_t val;
+ val = k1->id ^ k2->id;
+ for (i = 0; i < k1->key_len; i++)
+ val |= k1->src_dst[i] ^ k2->src_dst[i];
+ /* 64-bit accumulator: differences in the high address words are not truncated */
+ return (val != 0);
}
/*
diff --git a/lib/librte_ip_frag/ip_frag_internal.c b/lib/librte_ip_frag/ip_frag_internal.c
index 2f5a4b8..5d35037 100644
--- a/lib/librte_ip_frag/ip_frag_internal.c
+++ b/lib/librte_ip_frag/ip_frag_internal.c
@@ -110,6 +110,35 @@ ipv4_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2)
*v2 = (v << 7) + (v >> 14);
}
+static inline void
+ipv6_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2)
+{
+ uint32_t v;
+ const uint32_t *p;
+
+ p = (const uint32_t *) &key->src_dst;
+
+#ifdef RTE_MACHINE_CPUFLAG_SSE4_2
+ v = rte_hash_crc_4byte(p[0], PRIME_VALUE);
+ v = rte_hash_crc_4byte(p[1], v);
+ v = rte_hash_crc_4byte(p[2], v);
+ v = rte_hash_crc_4byte(p[3], v);
+ v = rte_hash_crc_4byte(p[4], v);
+ v = rte_hash_crc_4byte(p[5], v);
+ v = rte_hash_crc_4byte(p[6], v);
+ v = rte_hash_crc_4byte(p[7], v);
+ v = rte_hash_crc_4byte(key->id, v);
+#else
+
+ v = rte_jhash_3words(p[0], p[1], p[2], PRIME_VALUE);
+ v = rte_jhash_3words(p[3], p[4], p[5], v);
+ v = rte_jhash_3words(p[6], p[7], key->id, v);
+#endif /* RTE_MACHINE_CPUFLAG_SSE4_2 */
+
+ *v1 = v;
+ *v2 = (v << 7) + (v >> 14);
+}
+
struct rte_mbuf *
ip_frag_process(struct rte_ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
struct rte_mbuf *mb, uint16_t ofs, uint16_t len, uint16_t more_frags)
@@ -142,18 +171,32 @@ ip_frag_process(struct rte_ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
if (idx >= sizeof (fp->frags) / sizeof (fp->frags[0])) {
/* report an error. */
- IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
- "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, "
- "total_size: %u, frag_size: %u, last_idx: %u\n"
- "first fragment: ofs: %u, len: %u\n"
- "last fragment: ofs: %u, len: %u\n\n",
- __func__, __LINE__,
- fp, fp->key.src_dst[0], fp->key.id,
- fp->total_size, fp->frag_size, fp->last_idx,
- fp->frags[IP_FIRST_FRAG_IDX].ofs,
- fp->frags[IP_FIRST_FRAG_IDX].len,
- fp->frags[IP_LAST_FRAG_IDX].ofs,
- fp->frags[IP_LAST_FRAG_IDX].len);
+ if (fp->key.key_len == IPV4_KEYLEN)
+ IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
+ "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, "
+ "total_size: %u, frag_size: %u, last_idx: %u\n"
+ "first fragment: ofs: %u, len: %u\n"
+ "last fragment: ofs: %u, len: %u\n\n",
+ __func__, __LINE__,
+ fp, fp->key.src_dst[0], fp->key.id,
+ fp->total_size, fp->frag_size, fp->last_idx,
+ fp->frags[IP_FIRST_FRAG_IDX].ofs,
+ fp->frags[IP_FIRST_FRAG_IDX].len,
+ fp->frags[IP_LAST_FRAG_IDX].ofs,
+ fp->frags[IP_LAST_FRAG_IDX].len);
+ else
+ IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
+ "ipv6_frag_pkt: %p, key: <" IPv6_KEY_BYTES_FMT ", %#x>, "
+ "total_size: %u, frag_size: %u, last_idx: %u\n"
+ "first fragment: ofs: %u, len: %u\n"
+ "last fragment: ofs: %u, len: %u\n\n",
+ __func__, __LINE__,
+ fp, IPv6_KEY_BYTES(fp->key.src_dst), fp->key.id,
+ fp->total_size, fp->frag_size, fp->last_idx,
+ fp->frags[IP_FIRST_FRAG_IDX].ofs,
+ fp->frags[IP_FIRST_FRAG_IDX].len,
+ fp->frags[IP_LAST_FRAG_IDX].ofs,
+ fp->frags[IP_LAST_FRAG_IDX].len);
/* free all fragments, invalidate the entry. */
ip_frag_free(fp, dr);
@@ -175,25 +218,43 @@ ip_frag_process(struct rte_ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
/* if we collected all fragments, then try to reassemble. */
} else if (fp->frag_size == fp->total_size &&
- fp->frags[IP_FIRST_FRAG_IDX].mb != NULL)
- mb = ipv4_frag_reassemble(fp);
+ fp->frags[IP_FIRST_FRAG_IDX].mb != NULL) {
+ if (fp->key.key_len == IPV4_KEYLEN)
+ mb = ipv4_frag_reassemble(fp);
+ else
+ mb = ipv6_frag_reassemble(fp);
+ }
/* erroneous set of fragments. */
if (mb == NULL) {
/* report an error. */
- IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
- "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, "
- "total_size: %u, frag_size: %u, last_idx: %u\n"
- "first fragment: ofs: %u, len: %u\n"
- "last fragment: ofs: %u, len: %u\n\n",
- __func__, __LINE__,
- fp, fp->key.src_dst[0], fp->key.id,
- fp->total_size, fp->frag_size, fp->last_idx,
- fp->frags[IP_FIRST_FRAG_IDX].ofs,
- fp->frags[IP_FIRST_FRAG_IDX].len,
- fp->frags[IP_LAST_FRAG_IDX].ofs,
- fp->frags[IP_LAST_FRAG_IDX].len);
+ if (fp->key.key_len == IPV4_KEYLEN)
+ IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
+ "ipv4_frag_pkt: %p, key: <%" PRIx64 ", %#x>, "
+ "total_size: %u, frag_size: %u, last_idx: %u\n"
+ "first fragment: ofs: %u, len: %u\n"
+ "last fragment: ofs: %u, len: %u\n\n",
+ __func__, __LINE__,
+ fp, fp->key.src_dst[0], fp->key.id,
+ fp->total_size, fp->frag_size, fp->last_idx,
+ fp->frags[IP_FIRST_FRAG_IDX].ofs,
+ fp->frags[IP_FIRST_FRAG_IDX].len,
+ fp->frags[IP_LAST_FRAG_IDX].ofs,
+ fp->frags[IP_LAST_FRAG_IDX].len);
+ else
+ IP_FRAG_LOG(DEBUG, "%s:%d invalid fragmented packet:\n"
+ "ipv6_frag_pkt: %p, key: <" IPv6_KEY_BYTES_FMT ", %#x>, "
+ "total_size: %u, frag_size: %u, last_idx: %u\n"
+ "first fragment: ofs: %u, len: %u\n"
+ "last fragment: ofs: %u, len: %u\n\n",
+ __func__, __LINE__,
+ fp, IPv6_KEY_BYTES(fp->key.src_dst), fp->key.id,
+ fp->total_size, fp->frag_size, fp->last_idx,
+ fp->frags[IP_FIRST_FRAG_IDX].ofs,
+ fp->frags[IP_FIRST_FRAG_IDX].len,
+ fp->frags[IP_LAST_FRAG_IDX].ofs,
+ fp->frags[IP_LAST_FRAG_IDX].len);
/* free associated resources. */
ip_frag_free(fp, dr);
@@ -291,21 +352,34 @@ ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
if (tbl->last != NULL && ip_frag_key_cmp(&tbl->last->key, key) == 0)
return (tbl->last);
- ipv4_frag_hash(key, &sig1, &sig2);
+ /* different hashing methods for IPv4 and IPv6 */
+ if (key->key_len == 1)
+ ipv4_frag_hash(key, &sig1, &sig2);
+ else
+ ipv6_frag_hash(key, &sig1, &sig2);
p1 = IP_FRAG_TBL_POS(tbl, sig1);
p2 = IP_FRAG_TBL_POS(tbl, sig2);
for (i = 0; i != assoc; i++) {
-
- IP_FRAG_LOG(DEBUG, "%s:%d:\n"
- "tbl: %p, max_entries: %u, use_entries: %u\n"
- "ipv6_frag_pkt line0: %p, index: %u from %u\n"
- "key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
- __func__, __LINE__,
- tbl, tbl->max_entries, tbl->use_entries,
- p1, i, assoc,
- p1[i].key.src_dst[0], p1[i].key.id, p1[i].start);
+ if (p1->key.key_len == IPV4_KEYLEN)
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "tbl: %p, max_entries: %u, use_entries: %u\n"
+ "ipv6_frag_pkt line0: %p, index: %u from %u\n"
+ "key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
+ __func__, __LINE__,
+ tbl, tbl->max_entries, tbl->use_entries,
+ p1, i, assoc,
+ p1[i].key.src_dst[0], p1[i].key.id, p1[i].start);
+ else
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "tbl: %p, max_entries: %u, use_entries: %u\n"
+ "ipv6_frag_pkt line0: %p, index: %u from %u\n"
+ "key: <" IPv6_KEY_BYTES_FMT ", %#x>, start: %" PRIu64 "\n",
+ __func__, __LINE__,
+ tbl, tbl->max_entries, tbl->use_entries,
+ p1, i, assoc,
+ IPv6_KEY_BYTES(p1[i].key.src_dst), p1[i].key.id, p1[i].start);
if (ip_frag_key_cmp(&p1[i].key, key) == 0)
return (p1 + i);
@@ -314,14 +388,24 @@ ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
else if (max_cycles + p1[i].start < tms)
old = (old == NULL) ? (p1 + i) : old;
- IP_FRAG_LOG(DEBUG, "%s:%d:\n"
- "tbl: %p, max_entries: %u, use_entries: %u\n"
- "ipv6_frag_pkt line1: %p, index: %u from %u\n"
- "key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
- __func__, __LINE__,
- tbl, tbl->max_entries, tbl->use_entries,
- p2, i, assoc,
- p2[i].key.src_dst[0], p2[i].key.id, p2[i].start);
+ if (p2->key.key_len == IPV4_KEYLEN)
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "tbl: %p, max_entries: %u, use_entries: %u\n"
+ "ipv6_frag_pkt line1: %p, index: %u from %u\n"
+ "key: <%" PRIx64 ", %#x>, start: %" PRIu64 "\n",
+ __func__, __LINE__,
+ tbl, tbl->max_entries, tbl->use_entries,
+ p2, i, assoc,
+ p2[i].key.src_dst[0], p2[i].key.id, p2[i].start);
+ else
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "tbl: %p, max_entries: %u, use_entries: %u\n"
+ "ipv6_frag_pkt line1: %p, index: %u from %u\n"
+ "key: <" IPv6_KEY_BYTES_FMT ", %#x>, start: %" PRIu64 "\n",
+ __func__, __LINE__,
+ tbl, tbl->max_entries, tbl->use_entries,
+ p2, i, assoc,
+ IPv6_KEY_BYTES(p2[i].key.src_dst), p2[i].key.id, p2[i].start);
if (ip_frag_key_cmp(&p2[i].key, key) == 0)
return (p2 + i);
diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
index 4a4b5c3..8546c3a 100644
--- a/lib/librte_ip_frag/rte_ip_frag.h
+++ b/lib/librte_ip_frag/rte_ip_frag.h
@@ -65,8 +65,9 @@ struct ip_frag {
/** @internal <src addr, dst addr, id> to uniquely identify fragmented datagram. */
struct ip_frag_key {
- uint64_t src_dst; /**< src address */
+ uint64_t src_dst[4]; /**< src and dst addresses; only first 8 bytes used for IPv4 */
uint32_t id; /**< fragment id */
+ uint32_t key_len; /**< src/dst key length */
};
/*
@@ -200,6 +201,54 @@ rte_ipv6_fragment_packet(struct rte_mbuf *pkt_in,
struct rte_mempool *pool_direct,
struct rte_mempool *pool_indirect);
+
+/*
+ * This function implements reassembly of fragmented IPv6 packets.
+ * Incoming mbuf should have its l2_len/l3_len fields setup correctly.
+ *
+ * @param tbl
+ * Table where to lookup/add the fragmented packet.
+ * @param dr
+ * Death row to free buffers to
+ * @param mb
+ * Incoming mbuf with IPv6 fragment.
+ * @param tms
+ * Fragment arrival timestamp.
+ * @param ip_hdr
+ * Pointer to the IPv6 header.
+ * @param frag_hdr
+ * Pointer to the IPv6 fragment extension header.
+ * @return
+ * Pointer to mbuf for reassembled packet, or NULL if:
+ * - an error occurred.
+ * - not all fragments of the packet are collected yet.
+ */
+struct rte_mbuf * rte_ipv6_frag_reassemble_packet(struct rte_ip_frag_tbl *tbl,
+ struct rte_ip_frag_death_row *dr,
+ struct rte_mbuf *mb, uint64_t tms, struct ipv6_hdr *ip_hdr,
+ struct ipv6_extension_fragment *frag_hdr);
+
+/*
+ * Return a pointer to the packet's fragment header, if found.
+ * It only looks at the extension header that's right after the fixed IPv6
+ * header, and doesn't follow the whole chain of extension headers.
+ *
+ * @param hdr
+ * Pointer to the IPv6 header.
+ * @return
+ * Pointer to the IPv6 fragment extension header, or NULL if it's not
+ * present.
+ */
+static inline struct ipv6_extension_fragment *
+rte_ipv6_frag_get_ipv6_fragment_header(struct ipv6_hdr *hdr)
+{
+ if (hdr->proto == IPPROTO_FRAGMENT)
+ return (struct ipv6_extension_fragment *) (hdr + 1);
+ else
+ return NULL;
+}
+
/**
* IPv4 fragmentation.
*
diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c b/lib/librte_ip_frag/rte_ipv4_reassembly.c
index 483fb95..cc9a9c8 100644
--- a/lib/librte_ip_frag/rte_ipv4_reassembly.c
+++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c
@@ -138,8 +138,10 @@ rte_ipv4_frag_reassemble_packet(struct rte_ip_frag_tbl *tbl,
ip_flag = (uint16_t)(flag_offset & IPV4_HDR_MF_FLAG);
psd = (uint64_t *)&ip_hdr->src_addr;
- key.src_dst = *psd;
+ /* use first 8 bytes only */
+ key.src_dst[0] = psd[0];
key.id = ip_hdr->packet_id;
+ key.key_len = IPV4_KEYLEN;
ip_ofs *= IPV4_HDR_OFFSET_UNITS;
ip_len = (uint16_t)(rte_be_to_cpu_16(ip_hdr->total_length) -
diff --git a/lib/librte_ip_frag/rte_ipv6_reassembly.c b/lib/librte_ip_frag/rte_ipv6_reassembly.c
new file mode 100644
index 0000000..0cc7f93
--- /dev/null
+++ b/lib/librte_ip_frag/rte_ipv6_reassembly.c
@@ -0,0 +1,218 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+
+#include <rte_malloc.h>
+#include <rte_memcpy.h>
+#include <rte_byteorder.h>
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+
+#include "rte_ip_frag.h"
+#include "ip_frag_common.h"
+
+/**
+ * @file
+ * IPv6 reassembly
+ *
+ * Implementation of IPv6 reassembly.
+ *
+ */
+
+/*
+ * Reassemble fragments into one packet.
+ */
+struct rte_mbuf *
+ipv6_frag_reassemble(const struct rte_ip_frag_pkt *fp)
+{
+ struct ipv6_hdr *ip_hdr;
+ struct ipv6_extension_fragment * frag_hdr;
+ struct rte_mbuf *m, *prev;
+ uint32_t i, n, ofs, first_len;
+ uint32_t last_len, move_len, payload_len;
+
+ first_len = fp->frags[IP_FIRST_FRAG_IDX].len;
+ n = fp->last_idx - 1;
+
+ /* start from the last fragment. */
+ m = fp->frags[IP_LAST_FRAG_IDX].mb;
+ ofs = fp->frags[IP_LAST_FRAG_IDX].ofs;
+ last_len = fp->frags[IP_LAST_FRAG_IDX].len;
+
+ payload_len = ofs + last_len;
+
+ while (ofs != first_len) {
+
+ prev = m;
+
+ for (i = n; i != IP_FIRST_FRAG_IDX && ofs != first_len; i--) {
+
+ /* previous fragment found. */
+ if (fp->frags[i].ofs + fp->frags[i].len == ofs) {
+
+ ip_frag_chain(fp->frags[i].mb, m);
+
+ /* update our last fragment and offset. */
+ m = fp->frags[i].mb;
+ ofs = fp->frags[i].ofs;
+ }
+ }
+
+ /* error - hole in the packet. */
+ if (m == prev) {
+ return (NULL);
+ }
+ }
+
+ /* chain with the first fragment. */
+ ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
+ m = fp->frags[IP_FIRST_FRAG_IDX].mb;
+
+ /* update mbuf fields for reassembled packet. */
+ m->ol_flags |= PKT_TX_IP_CKSUM;
+
+ /* update ipv6 header for the reassembled datagram */
+ ip_hdr = (struct ipv6_hdr *) (rte_pktmbuf_mtod(m, uint8_t *) +
+ m->pkt.vlan_macip.f.l2_len);
+
+ ip_hdr->payload_len = rte_cpu_to_be_16(payload_len);
+
+ /*
+ * remove the fragment extension header. note that per RFC 2460 the
+ * "next header" field of the last unfragmentable header must be set to
+ * the type of the first fragmentable header; since we don't support
+ * other extension headers, we assume there are none and update the
+ * main IPv6 header instead.
+ */
+ move_len = m->pkt.vlan_macip.f.l2_len + m->pkt.vlan_macip.f.l3_len -
+ sizeof(*frag_hdr);
+ frag_hdr = (struct ipv6_extension_fragment *) (ip_hdr + 1);
+ ip_hdr->proto = frag_hdr->next_header;
+
+ memmove(rte_pktmbuf_mtod(m, char*) + sizeof(*frag_hdr),
+ rte_pktmbuf_mtod(m, char*), move_len);
+
+ rte_pktmbuf_adj(m, sizeof(*frag_hdr));
+
+ return (m);
+}
+
+/*
+ * Process new mbuf with fragment of IPV6 datagram.
+ * Incoming mbuf should have its l2_len/l3_len fields setup correctly.
+ * @param tbl
+ * Table where to lookup/add the fragmented packet.
+ * @param mb
+ * Incoming mbuf with IPV6 fragment.
+ * @param tms
+ * Fragment arrival timestamp.
+ * @param ip_hdr
+ * Pointer to the IPV6 header.
+ * @param frag_hdr
+ * Pointer to the IPV6 fragment extension header.
+ * @return
+ * Pointer to mbuf for reassembled packet, or NULL if:
+ * - an error occurred.
+ * - not all fragments of the packet are collected yet.
+ */
+#define MORE_FRAGS(x) (((x) & 0x100) >> 8)
+#define FRAG_OFFSET(x) (rte_cpu_to_be_16(x) >> 3)
+struct rte_mbuf *
+rte_ipv6_frag_reassemble_packet(struct rte_ip_frag_tbl *tbl,
+ struct rte_ip_frag_death_row *dr, struct rte_mbuf *mb, uint64_t tms,
+ struct ipv6_hdr *ip_hdr, struct ipv6_extension_fragment *frag_hdr)
+{
+ struct rte_ip_frag_pkt *fp;
+ struct ip_frag_key key;
+ uint16_t ip_len, ip_ofs;
+
+ rte_memcpy(&key.src_dst[0], ip_hdr->src_addr, 16);
+ rte_memcpy(&key.src_dst[2], ip_hdr->dst_addr, 16);
+
+ key.id = frag_hdr->id;
+ key.key_len = IPV6_KEYLEN;
+
+ ip_ofs = FRAG_OFFSET(frag_hdr->frag_data) * 8;
+
+ /*
+ * as per RFC2460, payload length contains all extension headers as well.
+ * since we don't support anything but frag headers, this is what we remove
+ * from the payload len.
+ */
+ ip_len = rte_be_to_cpu_16(ip_hdr->payload_len) - sizeof(*frag_hdr);
+
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "mbuf: %p, tms: %" PRIu64
+ ", key: <" IPv6_KEY_BYTES_FMT ", %#x>, ofs: %u, len: %u, flags: %#x\n"
+ "tbl: %p, max_cycles: %" PRIu64 ", entry_mask: %#x, "
+ "max_entries: %u, use_entries: %u\n\n",
+ __func__, __LINE__,
+ mb, tms, IPv6_KEY_BYTES(key.src_dst), key.id, ip_ofs, ip_len, frag_hdr->more_frags,
+ tbl, tbl->max_cycles, tbl->entry_mask, tbl->max_entries,
+ tbl->use_entries);
+
+ /* try to find/add entry into the fragment's table. */
+ if ((fp = ip_frag_find(tbl, dr, &key, tms)) == NULL) {
+ IP_FRAG_MBUF2DR(dr, mb);
+ return (NULL);
+ }
+
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "tbl: %p, max_entries: %u, use_entries: %u\n"
+ "ipv6_frag_pkt: %p, key: <" IPv6_KEY_BYTES_FMT ", %#x>, start: %" PRIu64
+ ", total_size: %u, frag_size: %u, last_idx: %u\n\n",
+ __func__, __LINE__,
+ tbl, tbl->max_entries, tbl->use_entries,
+ fp, IPv6_KEY_BYTES(fp->key.src_dst), fp->key.id, fp->start,
+ fp->total_size, fp->frag_size, fp->last_idx);
+
+
+ /* process the fragmented packet. */
+ mb = ip_frag_process(fp, dr, mb, ip_ofs, ip_len,
+ MORE_FRAGS(frag_hdr->frag_data));
+ ip_frag_inuse(tbl, fp);
+
+ IP_FRAG_LOG(DEBUG, "%s:%d:\n"
+ "mbuf: %p\n"
+ "tbl: %p, max_entries: %u, use_entries: %u\n"
+ "ipv6_frag_pkt: %p, key: <" IPv6_KEY_BYTES_FMT ", %#x>, start: %" PRIu64
+ ", total_size: %u, frag_size: %u, last_idx: %u\n\n",
+ __func__, __LINE__, mb,
+ tbl, tbl->max_entries, tbl->use_entries,
+ fp, IPv6_KEY_BYTES(fp->key.src_dst), fp->key.id, fp->start,
+ fp->total_size, fp->frag_size, fp->last_idx);
+
+ return (mb);
+}
--
1.8.1.4
* [dpdk-dev] [PATCH 13/13] examples: overhaul of ip_reassembly app
From: Anatoly Burakov @ 2014-05-28 17:32 UTC (permalink / raw)
To: dev
New stuff:
* Support for regular (unfragmented) traffic as well as IPv4 and IPv6 reassembly
* Simplified config
* Routing table is printed out on start
* Uses LPM/LPM6 for route lookup
* Unmatched traffic is sent back to the originating port
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
examples/ip_reassembly/Makefile | 1 -
examples/ip_reassembly/main.c | 1344 +++++++++++++--------------------------
2 files changed, 435 insertions(+), 910 deletions(-)
diff --git a/examples/ip_reassembly/Makefile b/examples/ip_reassembly/Makefile
index 3115b95..9c9e0fa 100644
--- a/examples/ip_reassembly/Makefile
+++ b/examples/ip_reassembly/Makefile
@@ -52,7 +52,6 @@ CFLAGS += $(WERROR_FLAGS)
# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
CFLAGS_main.o += -Wno-return-type
-CFLAGS_main.o += -DIPV4_FRAG_TBL_STAT
endif
include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
index 6c40d76..da3a0db 100644
--- a/examples/ip_reassembly/main.c
+++ b/examples/ip_reassembly/main.c
@@ -1,13 +1,13 @@
/*-
* BSD LICENSE
- *
+ *
* Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
* All rights reserved.
- *
+ *
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
- *
+ *
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
@@ -17,7 +17,7 @@
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
- *
+ *
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
@@ -42,6 +42,7 @@
#include <errno.h>
#include <getopt.h>
#include <signal.h>
+#include <sys/param.h>
#include <rte_common.h>
#include <rte_byteorder.h>
@@ -73,54 +74,29 @@
#include <rte_tcp.h>
#include <rte_udp.h>
#include <rte_string_fns.h>
-#include "main.h"
-
-#define APP_LOOKUP_EXACT_MATCH 0
-#define APP_LOOKUP_LPM 1
-#define DO_RFC_1812_CHECKS
-
-#ifndef APP_LOOKUP_METHOD
-#define APP_LOOKUP_METHOD APP_LOOKUP_LPM
-#endif
-
-#if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
-#include <rte_hash.h>
-#elif (APP_LOOKUP_METHOD == APP_LOOKUP_LPM)
#include <rte_lpm.h>
#include <rte_lpm6.h>
-#else
-#error "APP_LOOKUP_METHOD set to incorrect value"
-#endif
-#define MAX_PKT_BURST 32
-
-#include "rte_ip_frag.h"
+#include <rte_ip_frag.h>
-#ifndef IPv6_BYTES
-#define IPv6_BYTES_FMT "%02x%02x:%02x%02x:%02x%02x:%02x%02x:"\
- "%02x%02x:%02x%02x:%02x%02x:%02x%02x"
-#define IPv6_BYTES(addr) \
- addr[0], addr[1], addr[2], addr[3], \
- addr[4], addr[5], addr[6], addr[7], \
- addr[8], addr[9], addr[10], addr[11],\
- addr[12], addr[13],addr[14], addr[15]
-#endif
+#include "main.h"
+#define MAX_PKT_BURST 32
-#define RTE_LOGTYPE_L3FWD RTE_LOGTYPE_USER1
-#define MAX_PORTS RTE_MAX_ETHPORTS
+#define RTE_LOGTYPE_IP_RSMBL RTE_LOGTYPE_USER1
#define MAX_JUMBO_PKT_LEN 9600
-#define IPV6_ADDR_LEN 16
-
-#define MEMPOOL_CACHE_SIZE 256
-
#define BUF_SIZE 2048
#define MBUF_SIZE \
(BUF_SIZE + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+#define NB_MBUF 8192
+
+/* allow max jumbo frame 9.5 KB */
+#define JUMBO_FRAME_MAX_SIZE 0x2600
+
#define MAX_FLOW_NUM UINT16_MAX
#define MIN_FLOW_NUM 1
#define DEF_FLOW_NUM 0x1000
@@ -130,10 +106,10 @@
#define MIN_FLOW_TTL 1
#define DEF_FLOW_TTL MS_PER_S
-#define DEF_MBUF_NUM 0x400
+#define MAX_FRAG_NUM RTE_LIBRTE_IP_FRAG_MAX_FRAG
/* Should be power of two. */
-#define IPV4_FRAG_TBL_BUCKET_ENTRIES 2
+#define IP_FRAG_TBL_BUCKET_ENTRIES 16
static uint32_t max_flow_num = DEF_FLOW_NUM;
static uint32_t max_flow_ttl = DEF_FLOW_TTL;
@@ -174,12 +150,33 @@ static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
/* ethernet addresses of ports */
-static struct ether_addr ports_eth_addr[MAX_PORTS];
+static struct ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];
+
+#ifndef IPv4_BYTES
+#define IPv4_BYTES_FMT "%" PRIu8 ".%" PRIu8 ".%" PRIu8 ".%" PRIu8
+#define IPv4_BYTES(addr) \
+ (uint8_t) (((addr) >> 24) & 0xFF),\
+ (uint8_t) (((addr) >> 16) & 0xFF),\
+ (uint8_t) (((addr) >> 8) & 0xFF),\
+ (uint8_t) ((addr) & 0xFF)
+#endif
+
+#ifndef IPv6_BYTES
+#define IPv6_BYTES_FMT "%02x%02x:%02x%02x:%02x%02x:%02x%02x:"\
+ "%02x%02x:%02x%02x:%02x%02x:%02x%02x"
+#define IPv6_BYTES(addr) \
+ addr[0], addr[1], addr[2], addr[3], \
+ addr[4], addr[5], addr[6], addr[7], \
+ addr[8], addr[9], addr[10], addr[11],\
+ addr[12], addr[13],addr[14], addr[15]
+#endif
+
+#define IPV6_ADDR_LEN 16
/* mask of enabled ports */
static uint32_t enabled_port_mask = 0;
-static int promiscuous_on = 0; /**< Ports set in promiscuous mode off by default. */
-static int numa_on = 1; /**< NUMA is enabled by default. */
+
+static int rx_queue_per_lcore = 1;
struct mbuf_table {
uint32_t len;
@@ -188,54 +185,50 @@ struct mbuf_table {
struct rte_mbuf *m_table[0];
};
-struct lcore_rx_queue {
- uint8_t port_id;
- uint8_t queue_id;
-} __rte_cache_aligned;
+struct rx_queue {
+ struct rte_ip_frag_tbl * frag_tbl;
+ struct rte_mempool * pool;
+ struct rte_lpm * lpm;
+ struct rte_lpm6 * lpm6;
+ uint8_t portid;
+};
+
+struct tx_lcore_stat {
+ uint64_t call;
+ uint64_t drop;
+ uint64_t queue;
+ uint64_t send;
+};
#define MAX_RX_QUEUE_PER_LCORE 16
-#define MAX_TX_QUEUE_PER_PORT MAX_PORTS
+#define MAX_TX_QUEUE_PER_PORT 16
#define MAX_RX_QUEUE_PER_PORT 128
-#define MAX_LCORE_PARAMS 1024
-struct lcore_params {
- uint8_t port_id;
- uint8_t queue_id;
- uint8_t lcore_id;
+struct lcore_queue_conf {
+ uint16_t n_rx_queue;
+ struct rx_queue rx_queue_list[MAX_RX_QUEUE_PER_LCORE];
+ uint16_t tx_queue_id[RTE_MAX_ETHPORTS];
+ struct rte_ip_frag_death_row death_row;
+ struct mbuf_table *tx_mbufs[RTE_MAX_ETHPORTS];
+ struct tx_lcore_stat tx_stat;
} __rte_cache_aligned;
-
-static struct lcore_params lcore_params_array[MAX_LCORE_PARAMS];
-static struct lcore_params lcore_params_array_default[] = {
- {0, 0, 2},
- {0, 1, 2},
- {0, 2, 2},
- {1, 0, 2},
- {1, 1, 2},
- {1, 2, 2},
- {2, 0, 2},
- {3, 0, 3},
- {3, 1, 3},
-};
-
-static struct lcore_params * lcore_params = lcore_params_array_default;
-static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
- sizeof(lcore_params_array_default[0]);
+static struct lcore_queue_conf lcore_queue_conf[RTE_MAX_LCORE];
static struct rte_eth_conf port_conf = {
.rxmode = {
- .mq_mode = ETH_MQ_RX_RSS,
- .max_rx_pkt_len = ETHER_MAX_LEN,
+ .mq_mode = ETH_MQ_RX_RSS,
+ .max_rx_pkt_len = JUMBO_FRAME_MAX_SIZE,
.split_hdr_size = 0,
.header_split = 0, /**< Header Split disabled */
.hw_ip_checksum = 1, /**< IP checksum offload enabled */
.hw_vlan_filter = 0, /**< VLAN filtering disabled */
- .jumbo_frame = 0, /**< Jumbo Frame Support disabled */
+ .jumbo_frame = 1, /**< Jumbo Frame Support enabled */
.hw_strip_crc = 0, /**< CRC stripped by hardware */
},
.rx_adv_conf = {
- .rss_conf = {
- .rss_key = NULL,
- .rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6,
+ .rss_conf = {
+ .rss_key = NULL,
+ .rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6,
},
},
.txmode = {
@@ -263,102 +256,37 @@ static const struct rte_eth_txconf tx_conf = {
.txq_flags = 0x0,
};
-#if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
-
-#ifdef RTE_MACHINE_CPUFLAG_SSE4_2
-#include <rte_hash_crc.h>
-#define DEFAULT_HASH_FUNC rte_hash_crc
-#else
-#include <rte_jhash.h>
-#define DEFAULT_HASH_FUNC rte_jhash
-#endif
-
-struct ipv4_5tuple {
- uint32_t ip_dst;
- uint32_t ip_src;
- uint16_t port_dst;
- uint16_t port_src;
- uint8_t proto;
-} __attribute__((__packed__));
-
-struct ipv6_5tuple {
- uint8_t ip_dst[IPV6_ADDR_LEN];
- uint8_t ip_src[IPV6_ADDR_LEN];
- uint16_t port_dst;
- uint16_t port_src;
- uint8_t proto;
-} __attribute__((__packed__));
-
-struct ipv4_l3fwd_route {
- struct ipv4_5tuple key;
- uint8_t if_out;
-};
-
-struct ipv6_l3fwd_route {
- struct ipv6_5tuple key;
- uint8_t if_out;
-};
-
-static struct ipv4_l3fwd_route ipv4_l3fwd_route_array[] = {
- {{IPv4(100,10,0,1), IPv4(200,10,0,1), 101, 11, IPPROTO_TCP}, 0},
- {{IPv4(100,20,0,2), IPv4(200,20,0,2), 102, 12, IPPROTO_TCP}, 1},
- {{IPv4(100,30,0,3), IPv4(200,30,0,3), 103, 13, IPPROTO_TCP}, 2},
- {{IPv4(100,40,0,4), IPv4(200,40,0,4), 104, 14, IPPROTO_TCP}, 3},
-};
-
-static struct ipv6_l3fwd_route ipv6_l3fwd_route_array[] = {
- {
- {
- {0xfe, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
- 0x02, 0x1b, 0x21, 0xff, 0xfe, 0x91, 0x38, 0x05},
- {0xfe, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
- 0x02, 0x1e, 0x67, 0xff, 0xfe, 0x0d, 0xb6, 0x0a},
- 1, 10, IPPROTO_UDP
- }, 4
- },
-};
-
-typedef struct rte_hash lookup_struct_t;
-static lookup_struct_t *ipv4_l3fwd_lookup_struct[NB_SOCKETS];
-static lookup_struct_t *ipv6_l3fwd_lookup_struct[NB_SOCKETS];
-
-#define L3FWD_HASH_ENTRIES 1024
-
-#define IPV4_L3FWD_NUM_ROUTES \
- (sizeof(ipv4_l3fwd_route_array) / sizeof(ipv4_l3fwd_route_array[0]))
-
-#define IPV6_L3FWD_NUM_ROUTES \
- (sizeof(ipv6_l3fwd_route_array) / sizeof(ipv6_l3fwd_route_array[0]))
-
-static uint8_t ipv4_l3fwd_out_if[L3FWD_HASH_ENTRIES] __rte_cache_aligned;
-static uint8_t ipv6_l3fwd_out_if[L3FWD_HASH_ENTRIES] __rte_cache_aligned;
-#endif
-
-#if (APP_LOOKUP_METHOD == APP_LOOKUP_LPM)
-struct ipv4_l3fwd_route {
+/*
+ * IPv4 forwarding table
+ */
+struct l3fwd_ipv4_route {
uint32_t ip;
uint8_t depth;
uint8_t if_out;
};
-struct ipv6_l3fwd_route {
- uint8_t ip[16];
- uint8_t depth;
- uint8_t if_out;
+struct l3fwd_ipv4_route l3fwd_ipv4_route_array[] = {
+ {IPv4(100,10,0,0), 16, 0},
+ {IPv4(100,20,0,0), 16, 1},
+ {IPv4(100,30,0,0), 16, 2},
+ {IPv4(100,40,0,0), 16, 3},
+ {IPv4(100,50,0,0), 16, 4},
+ {IPv4(100,60,0,0), 16, 5},
+ {IPv4(100,70,0,0), 16, 6},
+ {IPv4(100,80,0,0), 16, 7},
};
-static struct ipv4_l3fwd_route ipv4_l3fwd_route_array[] = {
- {IPv4(1,1,1,0), 24, 0},
- {IPv4(2,1,1,0), 24, 1},
- {IPv4(3,1,1,0), 24, 2},
- {IPv4(4,1,1,0), 24, 3},
- {IPv4(5,1,1,0), 24, 4},
- {IPv4(6,1,1,0), 24, 5},
- {IPv4(7,1,1,0), 24, 6},
- {IPv4(8,1,1,0), 24, 7},
+/*
+ * IPv6 forwarding table
+ */
+
+struct l3fwd_ipv6_route {
+ uint8_t ip[IPV6_ADDR_LEN];
+ uint8_t depth;
+ uint8_t if_out;
};
-static struct ipv6_l3fwd_route ipv6_l3fwd_route_array[] = {
+static struct l3fwd_ipv6_route l3fwd_ipv6_route_array[] = {
{{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 0},
{{2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 1},
{{3,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 2},
@@ -369,59 +297,31 @@ static struct ipv6_l3fwd_route ipv6_l3fwd_route_array[] = {
{{8,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 7},
};
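The route arrays above feed DPDK's LPM/LPM6 libraries, where a lookup returns the entry with the longest matching prefix. As a rough illustration of that selection rule only (a linear-scan sketch, not the trie-based `rte_lpm` implementation or its API):

```c
#include <assert.h>
#include <stdint.h>

struct route { uint32_t ip; uint8_t depth; uint8_t if_out; };

/* Return the output port of the longest matching prefix, or -1 if none. */
static int lpm_lookup(const struct route *tbl, unsigned n, uint32_t dst)
{
    int best = -1;
    unsigned best_depth = 0;
    unsigned i;

    for (i = 0; i < n; i++) {
        /* /depth prefix mask; depth 0 matches everything */
        uint32_t mask = tbl[i].depth ? ~0u << (32 - tbl[i].depth) : 0;

        if ((dst & mask) == (tbl[i].ip & mask) &&
            (best < 0 || tbl[i].depth >= best_depth)) {
            best = tbl[i].if_out;
            best_depth = tbl[i].depth;
        }
    }
    return best;
}
```

With the table above, a destination of 100.10.1.1 falls under 100.10.0.0/16 and resolves to port 0; an address under no configured prefix falls through to -1 (the app then keeps the input port as destination).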
-#define IPV4_L3FWD_NUM_ROUTES \
- (sizeof(ipv4_l3fwd_route_array) / sizeof(ipv4_l3fwd_route_array[0]))
-#define IPV6_L3FWD_NUM_ROUTES \
- (sizeof(ipv6_l3fwd_route_array) / sizeof(ipv6_l3fwd_route_array[0]))
-
-#define IPV4_L3FWD_LPM_MAX_RULES 1024
-#define IPV6_L3FWD_LPM_MAX_RULES 1024
-#define IPV6_L3FWD_LPM_NUMBER_TBL8S (1 << 16)
-
-typedef struct rte_lpm lookup_struct_t;
-typedef struct rte_lpm6 lookup6_struct_t;
-static lookup_struct_t *ipv4_l3fwd_lookup_struct[NB_SOCKETS];
-static lookup6_struct_t *ipv6_l3fwd_lookup_struct[NB_SOCKETS];
-#endif
+#define LPM_MAX_RULES 1024
+#define LPM6_MAX_RULES 1024
+#define LPM6_NUMBER_TBL8S (1 << 16)
-struct tx_lcore_stat {
- uint64_t call;
- uint64_t drop;
- uint64_t queue;
- uint64_t send;
+struct rte_lpm6_config lpm6_config = {
+ .max_rules = LPM6_MAX_RULES,
+ .number_tbl8s = LPM6_NUMBER_TBL8S,
+ .flags = 0
};
-#ifdef IPV4_FRAG_TBL_STAT
-#define TX_LCORE_STAT_UPDATE(s, f, v) ((s)->f += (v))
-#else
-#define TX_LCORE_STAT_UPDATE(s, f, v) do {} while (0)
-#endif /* IPV4_FRAG_TBL_STAT */
+static struct rte_lpm *socket_lpm[RTE_MAX_NUMA_NODES];
+static struct rte_lpm6 *socket_lpm6[RTE_MAX_NUMA_NODES];
-struct lcore_conf {
- uint16_t n_rx_queue;
- struct lcore_rx_queue rx_queue_list[MAX_RX_QUEUE_PER_LCORE];
- uint16_t tx_queue_id[MAX_PORTS];
- lookup_struct_t * ipv4_lookup_struct;
-#if (APP_LOOKUP_METHOD == APP_LOOKUP_LPM)
- lookup6_struct_t * ipv6_lookup_struct;
+#ifdef IPV6_FRAG_TBL_STAT
+#define TX_LCORE_STAT_UPDATE(s, f, v) ((s)->f += (v))
#else
- lookup_struct_t * ipv6_lookup_struct;
-#endif
- struct rte_ip_frag_tbl *frag_tbl[MAX_RX_QUEUE_PER_LCORE];
- struct rte_mempool *pool[MAX_RX_QUEUE_PER_LCORE];
- struct rte_ip_frag_death_row death_row;
- struct mbuf_table *tx_mbufs[MAX_PORTS];
- struct tx_lcore_stat tx_stat;
-} __rte_cache_aligned;
-
-static struct lcore_conf lcore_conf[RTE_MAX_LCORE];
+#define TX_LCORE_STAT_UPDATE(s, f, v) do {} while (0)
+#endif /* IPV6_FRAG_TBL_STAT */
/*
* If the number of queued packets reaches the given threshold, then
* send a burst of packets on an output interface.
*/
static inline uint32_t
-send_burst(struct lcore_conf *qconf, uint32_t thresh, uint8_t port)
+send_burst(struct lcore_queue_conf *qconf, uint32_t thresh, uint8_t port)
{
uint32_t fill, len, k, n;
struct mbuf_table *txmb;
@@ -434,7 +334,7 @@ send_burst(struct lcore_conf *qconf, uint32_t thresh, uint8_t port)
if (fill >= thresh) {
n = RTE_MIN(len - txmb->tail, fill);
-
+
k = rte_eth_tx_burst(port, qconf->tx_queue_id[port],
txmb->m_table + txmb->tail, (uint16_t)n);
@@ -454,11 +354,11 @@ static inline int
send_single_packet(struct rte_mbuf *m, uint8_t port)
{
uint32_t fill, lcore_id, len;
- struct lcore_conf *qconf;
+ struct lcore_queue_conf *qconf;
struct mbuf_table *txmb;
lcore_id = rte_lcore_id();
- qconf = &lcore_conf[lcore_id];
+ qconf = &lcore_queue_conf[lcore_id];
txmb = qconf->tx_mbufs[port];
len = txmb->len;
@@ -471,7 +371,7 @@ send_single_packet(struct rte_mbuf *m, uint8_t port)
if (++txmb->tail == len)
txmb->tail = 0;
}
-
+
TX_LCORE_STAT_UPDATE(&qconf->tx_stat, queue, 1);
txmb->m_table[txmb->head] = m;
if(++txmb->head == len)
@@ -480,207 +380,43 @@ send_single_packet(struct rte_mbuf *m, uint8_t port)
return (0);
}
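send_single_packet() above queues each mbuf into a per-port circular buffer and, when the buffer is full, drops the oldest entry before enqueuing. The head/tail arithmetic can be sketched like this (ints standing in for mbufs; RING_LEN is an arbitrary stand-in for the table length):

```c
#include <assert.h>
#include <stdint.h>

#define RING_LEN 4  /* stand-in for mbuf_table len */

struct ring {
    uint32_t head, tail;        /* head: next write slot, tail: oldest */
    int m_table[RING_LEN];
};

/* Number of queued entries, accounting for wrap-around. */
static uint32_t ring_fill(const struct ring *r)
{
    return (r->head >= r->tail) ? r->head - r->tail
                                : r->head + RING_LEN - r->tail;
}

/* Enqueue like send_single_packet(): overwrite the oldest when full
 * (capacity is RING_LEN - 1, one slot stays empty to tell full from empty). */
static void ring_put(struct ring *r, int v)
{
    if (ring_fill(r) == RING_LEN - 1) {   /* full: drop oldest */
        if (++r->tail == RING_LEN)
            r->tail = 0;
    }
    r->m_table[r->head] = v;
    if (++r->head == RING_LEN)
        r->head = 0;
}
```

Keeping one slot empty is what lets `head == tail` unambiguously mean "empty"; the real code additionally frees the dropped mbuf and bumps the drop counter.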
-#ifdef DO_RFC_1812_CHECKS
-static inline int
-is_valid_ipv4_pkt(struct ipv4_hdr *pkt, uint32_t link_len)
-{
- /* From http://www.rfc-editor.org/rfc/rfc1812.txt section 5.2.2 */
- /*
- * 1. The packet length reported by the Link Layer must be large
- * enough to hold the minimum length legal IP datagram (20 bytes).
- */
- if (link_len < sizeof(struct ipv4_hdr))
- return -1;
-
- /* 2. The IP checksum must be correct. */
- /* this is checked in H/W */
-
- /*
- * 3. The IP version number must be 4. If the version number is not 4
- * then the packet may be another version of IP, such as IPng or
- * ST-II.
- */
- if (((pkt->version_ihl) >> 4) != 4)
- return -3;
- /*
- * 4. The IP header length field must be large enough to hold the
- * minimum length legal IP datagram (20 bytes = 5 words).
- */
- if ((pkt->version_ihl & 0xf) < 5)
- return -4;
-
- /*
- * 5. The IP total length field must be large enough to hold the IP
- * datagram header, whose length is specified in the IP header length
- * field.
- */
- if (rte_cpu_to_be_16(pkt->total_length) < sizeof(struct ipv4_hdr))
- return -5;
-
- return 0;
-}
-#endif
-
-#if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
-static void
-print_ipv4_key(struct ipv4_5tuple key)
-{
- printf("IP dst = %08x, IP src = %08x, port dst = %d, port src = %d, proto = %d\n",
- (unsigned)key.ip_dst, (unsigned)key.ip_src, key.port_dst, key.port_src, key.proto);
-}
-static void
-print_ipv6_key(struct ipv6_5tuple key)
-{
- printf( "IP dst = " IPv6_BYTES_FMT ", IP src = " IPv6_BYTES_FMT ", "
- "port dst = %d, port src = %d, proto = %d\n",
- IPv6_BYTES(key.ip_dst), IPv6_BYTES(key.ip_src),
- key.port_dst, key.port_src, key.proto);
-}
-
-static inline uint8_t
-get_ipv4_dst_port(struct ipv4_hdr *ipv4_hdr, uint8_t portid, lookup_struct_t * ipv4_l3fwd_lookup_struct)
-{
- struct ipv4_5tuple key;
- struct tcp_hdr *tcp;
- struct udp_hdr *udp;
- int ret = 0;
-
- key.ip_dst = rte_be_to_cpu_32(ipv4_hdr->dst_addr);
- key.ip_src = rte_be_to_cpu_32(ipv4_hdr->src_addr);
- key.proto = ipv4_hdr->next_proto_id;
-
- switch (ipv4_hdr->next_proto_id) {
- case IPPROTO_TCP:
- tcp = (struct tcp_hdr *)((unsigned char *) ipv4_hdr +
- sizeof(struct ipv4_hdr));
- key.port_dst = rte_be_to_cpu_16(tcp->dst_port);
- key.port_src = rte_be_to_cpu_16(tcp->src_port);
- break;
-
- case IPPROTO_UDP:
- udp = (struct udp_hdr *)((unsigned char *) ipv4_hdr +
- sizeof(struct ipv4_hdr));
- key.port_dst = rte_be_to_cpu_16(udp->dst_port);
- key.port_src = rte_be_to_cpu_16(udp->src_port);
- break;
-
- default:
- key.port_dst = 0;
- key.port_src = 0;
- break;
- }
-
- /* Find destination port */
- ret = rte_hash_lookup(ipv4_l3fwd_lookup_struct, (const void *)&key);
- return (uint8_t)((ret < 0)? portid : ipv4_l3fwd_out_if[ret]);
-}
-
-static inline uint8_t
-get_ipv6_dst_port(struct ipv6_hdr *ipv6_hdr, uint8_t portid, lookup_struct_t * ipv6_l3fwd_lookup_struct)
-{
- struct ipv6_5tuple key;
- struct tcp_hdr *tcp;
- struct udp_hdr *udp;
- int ret = 0;
-
- memcpy(key.ip_dst, ipv6_hdr->dst_addr, IPV6_ADDR_LEN);
- memcpy(key.ip_src, ipv6_hdr->src_addr, IPV6_ADDR_LEN);
-
- key.proto = ipv6_hdr->proto;
-
- switch (ipv6_hdr->proto) {
- case IPPROTO_TCP:
- tcp = (struct tcp_hdr *)((unsigned char *) ipv6_hdr +
- sizeof(struct ipv6_hdr));
- key.port_dst = rte_be_to_cpu_16(tcp->dst_port);
- key.port_src = rte_be_to_cpu_16(tcp->src_port);
- break;
-
- case IPPROTO_UDP:
- udp = (struct udp_hdr *)((unsigned char *) ipv6_hdr +
- sizeof(struct ipv6_hdr));
- key.port_dst = rte_be_to_cpu_16(udp->dst_port);
- key.port_src = rte_be_to_cpu_16(udp->src_port);
- break;
-
- default:
- key.port_dst = 0;
- key.port_src = 0;
- break;
- }
-
- /* Find destination port */
- ret = rte_hash_lookup(ipv6_l3fwd_lookup_struct, (const void *)&key);
- return (uint8_t)((ret < 0)? portid : ipv6_l3fwd_out_if[ret]);
-}
-#endif
-
-#if (APP_LOOKUP_METHOD == APP_LOOKUP_LPM)
-static inline uint8_t
-get_ipv4_dst_port(struct ipv4_hdr *ipv4_hdr, uint8_t portid, lookup_struct_t * ipv4_l3fwd_lookup_struct)
-{
- uint8_t next_hop;
-
- return (uint8_t) ((rte_lpm_lookup(ipv4_l3fwd_lookup_struct,
- rte_be_to_cpu_32(ipv4_hdr->dst_addr), &next_hop) == 0)?
- next_hop : portid);
-}
-
-static inline uint8_t
-get_ipv6_dst_port(struct ipv6_hdr *ipv6_hdr, uint8_t portid, lookup6_struct_t * ipv6_l3fwd_lookup_struct)
-{
- uint8_t next_hop;
-
- return (uint8_t) ((rte_lpm6_lookup(ipv6_l3fwd_lookup_struct,
- ipv6_hdr->dst_addr, &next_hop) == 0)?
- next_hop : portid);
-}
-#endif
-
static inline void
-l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
- struct lcore_conf *qconf, uint64_t tms)
+reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
+ struct lcore_queue_conf *qconf, uint64_t tms)
{
struct ether_hdr *eth_hdr;
- struct ipv4_hdr *ipv4_hdr;
+ struct rte_ip_frag_tbl *tbl;
+ struct rte_ip_frag_death_row *dr;
+ struct rx_queue *rxq;
void *d_addr_bytes;
- uint8_t dst_port;
+ uint8_t next_hop, dst_port;
+
+ rxq = &qconf->rx_queue_list[queue];
eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
- if (m->ol_flags & PKT_RX_IPV4_HDR) {
- /* Handle IPv4 headers.*/
- ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+ dst_port = portid;
-#ifdef DO_RFC_1812_CHECKS
- /* Check to make sure the packet is valid (RFC1812) */
- if (is_valid_ipv4_pkt(ipv4_hdr, m->pkt.pkt_len) < 0) {
- rte_pktmbuf_free(m);
- return;
- }
+ /* if packet is IPv4 */
+ if (m->ol_flags & (PKT_RX_IPV4_HDR)) {
+ struct ipv4_hdr *ip_hdr;
+ uint32_t ip_dst;
- /* Update time to live and header checksum */
- --(ipv4_hdr->time_to_live);
- ++(ipv4_hdr->hdr_checksum);
-#endif
+ ip_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
/* if it is a fragmented packet, then try to reassemble. */
- if (rte_ipv4_frag_pkt_is_fragmented(ipv4_hdr)) {
-
+ if (rte_ipv4_frag_pkt_is_fragmented(ip_hdr)) {
struct rte_mbuf *mo;
- struct rte_ip_frag_tbl *tbl;
- struct rte_ip_frag_death_row *dr;
- tbl = qconf->frag_tbl[queue];
+ tbl = rxq->frag_tbl;
dr = &qconf->death_row;
/* prepare mbuf: setup l2_len/l3_len. */
m->pkt.vlan_macip.f.l2_len = sizeof(*eth_hdr);
- m->pkt.vlan_macip.f.l3_len = sizeof(*ipv4_hdr);
+ m->pkt.vlan_macip.f.l3_len = sizeof(*ip_hdr);
/* process this fragment. */
- if ((mo = rte_ipv4_frag_reassemble_packet(tbl, dr, m, tms,
- ipv4_hdr)) == NULL)
+ if ((mo = rte_ipv4_frag_reassemble_packet(tbl, dr, m, tms, ip_hdr)) == NULL)
/* no packet to send out. */
return;
@@ -689,47 +425,67 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
m = mo;
eth_hdr = rte_pktmbuf_mtod(m,
struct ether_hdr *);
- ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+ ip_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
}
}
+ ip_dst = rte_be_to_cpu_32(ip_hdr->dst_addr);
- dst_port = get_ipv4_dst_port(ipv4_hdr, portid,
- qconf->ipv4_lookup_struct);
- if (dst_port >= MAX_PORTS ||
- (enabled_port_mask & 1 << dst_port) == 0)
- dst_port = portid;
+ /* Find destination port */
+ if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop) == 0 &&
+ (enabled_port_mask & 1 << next_hop) != 0) {
+ dst_port = next_hop;
+ }
- /* 02:00:00:00:00:xx */
- d_addr_bytes = ð_hdr->d_addr.addr_bytes[0];
- *((uint64_t *)d_addr_bytes) = 0x000000000002 + ((uint64_t)dst_port << 40);
+ eth_hdr->ether_type = rte_cpu_to_be_16(ETHER_TYPE_IPv4);
+ }
+ /* if packet is IPv6 */
+ else if (m->ol_flags & (PKT_RX_IPV6_HDR | PKT_RX_IPV6_HDR_EXT)) {
+ struct ipv6_extension_fragment *frag_hdr;
+ struct ipv6_hdr *ip_hdr;
- /* src addr */
- ether_addr_copy(&ports_eth_addr[dst_port], ð_hdr->s_addr);
+ ip_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
- send_single_packet(m, dst_port);
- }
- else {
- /* Handle IPv6 headers.*/
- struct ipv6_hdr *ipv6_hdr;
+ frag_hdr = rte_ipv6_frag_get_ipv6_fragment_header(ip_hdr);
- ipv6_hdr = (struct ipv6_hdr *)(rte_pktmbuf_mtod(m, unsigned char *) +
- sizeof(struct ether_hdr));
+ if (frag_hdr != NULL) {
+ struct rte_mbuf *mo;
- dst_port = get_ipv6_dst_port(ipv6_hdr, portid, qconf->ipv6_lookup_struct);
+ tbl = rxq->frag_tbl;
+ dr = &qconf->death_row;
- if (dst_port >= MAX_PORTS || (enabled_port_mask & 1 << dst_port) == 0)
- dst_port = portid;
+ /* prepare mbuf: setup l2_len/l3_len. */
+ m->pkt.vlan_macip.f.l2_len = sizeof(*eth_hdr);
+ m->pkt.vlan_macip.f.l3_len = sizeof(*ip_hdr) + sizeof(*frag_hdr);
- /* 02:00:00:00:00:xx */
- d_addr_bytes = ð_hdr->d_addr.addr_bytes[0];
- *((uint64_t *)d_addr_bytes) = 0x000000000002 + ((uint64_t)dst_port << 40);
+ if ((mo = rte_ipv6_frag_reassemble_packet(tbl, dr, m, tms, ip_hdr,
+ frag_hdr)) == NULL)
+ return;
- /* src addr */
- ether_addr_copy(&ports_eth_addr[dst_port], ð_hdr->s_addr);
+ if (mo != m) {
+ m = mo;
+ eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
+ ip_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
+ }
+ }
- send_single_packet(m, dst_port);
+ /* Find destination port */
+ if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr, &next_hop) == 0 &&
+ (enabled_port_mask & 1 << next_hop) != 0) {
+ dst_port = next_hop;
+ }
+
+ eth_hdr->ether_type = rte_cpu_to_be_16(ETHER_TYPE_IPv6);
}
+ /* if the packet was neither IPv4 nor IPv6, send it back out the port it arrived on */
+
+ /* 02:00:00:00:00:xx */
+ d_addr_bytes = ð_hdr->d_addr.addr_bytes[0];
+ *((uint64_t *)d_addr_bytes) = 0x000000000002 + ((uint64_t)dst_port << 40);
+ /* src addr */
+ ether_addr_copy(&ports_eth_addr[dst_port], ð_hdr->s_addr);
+
+ send_single_packet(m, dst_port);
}
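The `0x000000000002 + ((uint64_t)dst_port << 40)` store above builds a locally administered destination MAC of 02:00:00:00:00:xx in one 64-bit write. This only works on a little-endian CPU (the low byte of the value becomes the first address byte, and the port lands in the sixth), and the 8-byte store spills two zero bytes into s_addr, which is immediately overwritten anyway. A sketch of the byte layout, copying only the 6 MAC bytes:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reproduce the 02:00:00:00:00:xx destination-MAC trick. Assumes a
 * little-endian host, as the original uint64_t store does. */
static void set_dst_mac(uint8_t mac[6], uint8_t port)
{
    uint64_t v = 0x000000000002ULL + ((uint64_t)port << 40);

    memcpy(mac, &v, 6);   /* take only the 6 address bytes */
}
```

For port 3 this yields the bytes 02:00:00:00:00:03, i.e. a locally administered unicast address that encodes the output port in its last octet.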
/* main processing loop */
@@ -740,28 +496,27 @@ main_loop(__attribute__((unused)) void *dummy)
unsigned lcore_id;
uint64_t diff_tsc, cur_tsc, prev_tsc;
int i, j, nb_rx;
- uint8_t portid, queueid;
- struct lcore_conf *qconf;
+ uint8_t portid;
+ struct lcore_queue_conf *qconf;
const uint64_t drain_tsc = (rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
prev_tsc = 0;
lcore_id = rte_lcore_id();
- qconf = &lcore_conf[lcore_id];
+ qconf = &lcore_queue_conf[lcore_id];
if (qconf->n_rx_queue == 0) {
- RTE_LOG(INFO, L3FWD, "lcore %u has nothing to do\n", lcore_id);
+ RTE_LOG(INFO, IP_RSMBL, "lcore %u has nothing to do\n", lcore_id);
return 0;
}
- RTE_LOG(INFO, L3FWD, "entering main loop on lcore %u\n", lcore_id);
+ RTE_LOG(INFO, IP_RSMBL, "entering main loop on lcore %u\n", lcore_id);
for (i = 0; i < qconf->n_rx_queue; i++) {
- portid = qconf->rx_queue_list[i].port_id;
- queueid = qconf->rx_queue_list[i].queue_id;
- RTE_LOG(INFO, L3FWD, " -- lcoreid=%u portid=%hhu rxqueueid=%hhu\n", lcore_id,
- portid, queueid);
+ portid = qconf->rx_queue_list[i].portid;
+ RTE_LOG(INFO, IP_RSMBL, " -- lcoreid=%u portid=%hhu\n", lcore_id,
+ portid);
}
while (1) {
@@ -778,7 +533,7 @@ main_loop(__attribute__((unused)) void *dummy)
* This could be optimized (use queueid instead of
* portid), but it is not called so often
*/
- for (portid = 0; portid < MAX_PORTS; portid++) {
+ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
if ((enabled_port_mask & (1 << portid)) != 0)
send_burst(qconf, 1, portid);
}
@@ -791,10 +546,9 @@ main_loop(__attribute__((unused)) void *dummy)
*/
for (i = 0; i < qconf->n_rx_queue; ++i) {
- portid = qconf->rx_queue_list[i].port_id;
- queueid = qconf->rx_queue_list[i].queue_id;
+ portid = qconf->rx_queue_list[i].portid;
- nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
+ nb_rx = rte_eth_rx_burst(portid, 0, pkts_burst,
MAX_PKT_BURST);
/* Prefetch first packets */
@@ -807,13 +561,13 @@ main_loop(__attribute__((unused)) void *dummy)
for (j = 0; j < (nb_rx - PREFETCH_OFFSET); j++) {
rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[
j + PREFETCH_OFFSET], void *));
- l3fwd_simple_forward(pkts_burst[j], portid,
+ reassemble(pkts_burst[j], portid,
i, qconf, cur_tsc);
}
/* Forward remaining prefetched packets */
for (; j < nb_rx; j++) {
- l3fwd_simple_forward(pkts_burst[j], portid,
+ reassemble(pkts_burst[j], portid,
i, qconf, cur_tsc);
}
@@ -823,104 +577,15 @@ main_loop(__attribute__((unused)) void *dummy)
}
}
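reassemble() only enters the reassembly path when rte_ipv4_frag_pkt_is_fragmented() fires; per RFC 791 that predicate amounts to "More Fragments set, or fragment offset non-zero". A sketch of that test on the 16-bit flags/offset field (host byte order assumed here; the header field on the wire is big-endian, and the flag values below are the standard ones rather than DPDK identifiers):

```c
#include <assert.h>
#include <stdint.h>

#define IPV4_HDR_MF_FLAG     0x2000  /* More Fragments bit */
#define IPV4_HDR_OFFSET_MASK 0x1FFF  /* 13-bit fragment offset */

/* A datagram is a fragment if MF is set (first/middle fragment)
 * or the offset is non-zero (middle/last fragment). */
static int ipv4_is_fragment(uint16_t frag_field)
{
    return (frag_field & (IPV4_HDR_MF_FLAG | IPV4_HDR_OFFSET_MASK)) != 0;
}
```

Note that the Don't Fragment bit (0x4000) plays no part in the test: a DF-only datagram is a complete packet and is forwarded straight through.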
-static int
-check_lcore_params(void)
-{
- uint8_t queue, lcore;
- uint16_t i;
- int socketid;
-
- for (i = 0; i < nb_lcore_params; ++i) {
- queue = lcore_params[i].queue_id;
- if (queue >= MAX_RX_QUEUE_PER_PORT) {
- printf("invalid queue number: %hhu\n", queue);
- return -1;
- }
- lcore = lcore_params[i].lcore_id;
- if (!rte_lcore_is_enabled(lcore)) {
- printf("error: lcore %hhu is not enabled in lcore mask\n", lcore);
- return -1;
- }
- if ((socketid = rte_lcore_to_socket_id(lcore) != 0) &&
- (numa_on == 0)) {
- printf("warning: lcore %hhu is on socket %d with numa off \n",
- lcore, socketid);
- }
- }
- return 0;
-}
-
-static int
-check_port_config(const unsigned nb_ports)
-{
- unsigned portid;
- uint16_t i;
-
- for (i = 0; i < nb_lcore_params; ++i) {
- portid = lcore_params[i].port_id;
- if ((enabled_port_mask & (1 << portid)) == 0) {
- printf("port %u is not enabled in port mask\n", portid);
- return -1;
- }
- if (portid >= nb_ports) {
- printf("port %u is not present on the board\n", portid);
- return -1;
- }
- }
- return 0;
-}
-
-static uint8_t
-get_port_n_rx_queues(const uint8_t port)
-{
- int queue = -1;
- uint16_t i;
-
- for (i = 0; i < nb_lcore_params; ++i) {
- if (lcore_params[i].port_id == port && lcore_params[i].queue_id > queue)
- queue = lcore_params[i].queue_id;
- }
- return (uint8_t)(++queue);
-}
-
-static int
-init_lcore_rx_queues(void)
-{
- uint16_t i, nb_rx_queue;
- uint8_t lcore;
-
- for (i = 0; i < nb_lcore_params; ++i) {
- lcore = lcore_params[i].lcore_id;
- nb_rx_queue = lcore_conf[lcore].n_rx_queue;
- if (nb_rx_queue >= MAX_RX_QUEUE_PER_LCORE) {
- printf("error: too many queues (%u) for lcore: %u\n",
- (unsigned)nb_rx_queue + 1, (unsigned)lcore);
- return -1;
- } else {
- lcore_conf[lcore].rx_queue_list[nb_rx_queue].port_id =
- lcore_params[i].port_id;
- lcore_conf[lcore].rx_queue_list[nb_rx_queue].queue_id =
- lcore_params[i].queue_id;
- lcore_conf[lcore].n_rx_queue++;
- }
- }
- return 0;
-}
-
/* display usage */
static void
print_usage(const char *prgname)
{
- printf ("%s [EAL options] -- -p PORTMASK -P"
- " [--config (port,queue,lcore)[,(port,queue,lcore]]"
- " [--enable-jumbo [--max-pkt-len PKTLEN]]"
+ printf ("%s [EAL options] -- -p PORTMASK [-q NQ]"
+ " [--max-pkt-len PKTLEN]"
" [--maxflows=<flows>] [--flowttl=<ttl>[(s|ms)]]\n"
" -p PORTMASK: hexadecimal bitmask of ports to configure\n"
- " -P : enable promiscuous mode\n"
- " --config (port,queue,lcore): rx queues configuration\n"
- " --no-numa: optional, disable numa awareness\n"
- " --enable-jumbo: enable jumbo frame"
- " which max packet len is PKTLEN in decimal (64-9600)\n"
+ " -q NQ: number of RX queues per lcore\n"
" --maxflows=<flows>: optional, maximum number of flows "
"supported\n"
" --flowttl=<ttl>[(s|ms)]: optional, maximum TTL for each "
@@ -953,8 +618,8 @@ parse_flow_ttl(const char *str, uint32_t min, uint32_t max, uint32_t *val)
char *end;
uint64_t v;
- static const char frmt_sec[] = "s";
- static const char frmt_msec[] = "ms";
+ static const char frmt_sec[] = "s";
+ static const char frmt_msec[] = "ms";
/* parse decimal string */
errno = 0;
@@ -976,23 +641,6 @@ parse_flow_ttl(const char *str, uint32_t min, uint32_t max, uint32_t *val)
return (0);
}
-
-static int parse_max_pkt_len(const char *pktlen)
-{
- char *end = NULL;
- unsigned long len;
-
- /* parse decimal string */
- len = strtoul(pktlen, &end, 10);
- if ((pktlen[0] == '\0') || (end == NULL) || (*end != '\0'))
- return -1;
-
- if (len == 0)
- return -1;
-
- return len;
-}
-
static int
parse_portmask(const char *portmask)
{
@@ -1011,54 +659,23 @@ parse_portmask(const char *portmask)
}
static int
-parse_config(const char *q_arg)
+parse_nqueue(const char *q_arg)
{
- char s[256];
- const char *p, *p0 = q_arg;
- char *end;
- enum fieldnames {
- FLD_PORT = 0,
- FLD_QUEUE,
- FLD_LCORE,
- _NUM_FLD
- };
- unsigned long int_fld[_NUM_FLD];
- char *str_fld[_NUM_FLD];
- int i;
- unsigned size;
-
- nb_lcore_params = 0;
+ char *end = NULL;
+ unsigned long n;
- while ((p = strchr(p0,'(')) != NULL) {
- ++p;
- if((p0 = strchr(p,')')) == NULL)
- return -1;
- size = p0 - p;
- if(size >= sizeof(s))
- return -1;
+ /* parse decimal string */
+ n = strtoul(q_arg, &end, 10);
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return -1;
+ if (n == 0)
+ return -1;
+ if (n >= MAX_RX_QUEUE_PER_LCORE)
+ return -1;
- rte_snprintf(s, sizeof(s), "%.*s", size, p);
- if (rte_strsplit(s, sizeof(s), str_fld, _NUM_FLD, ',') != _NUM_FLD)
- return -1;
- for (i = 0; i < _NUM_FLD; i++){
- errno = 0;
- int_fld[i] = strtoul(str_fld[i], &end, 0);
- if (errno != 0 || end == str_fld[i] || int_fld[i] > 255)
- return -1;
- }
- if (nb_lcore_params >= MAX_LCORE_PARAMS) {
- printf("exceeded max number of lcore params: %hu\n",
- nb_lcore_params);
- return -1;
- }
- lcore_params_array[nb_lcore_params].port_id = (uint8_t)int_fld[FLD_PORT];
- lcore_params_array[nb_lcore_params].queue_id = (uint8_t)int_fld[FLD_QUEUE];
- lcore_params_array[nb_lcore_params].lcore_id = (uint8_t)int_fld[FLD_LCORE];
- ++nb_lcore_params;
- }
- lcore_params = lcore_params_array;
- return 0;
+ return n;
}
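parse_nqueue() above follows the usual strtoul() validation pattern: reject an empty string, reject trailing garbage via the end pointer, then range-check. A standalone sketch of the same shape (the bound is an assumed value for illustration):

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_RX_QUEUE_PER_LCORE 16  /* assumed bound for this sketch */

/* Accept a decimal queue count in (0, MAX_RX_QUEUE_PER_LCORE);
 * return -1 for anything else, like parse_nqueue() does. */
static long parse_count(const char *s)
{
    char *end = NULL;
    unsigned long n = strtoul(s, &end, 10);

    if (s[0] == '\0' || end == NULL || *end != '\0')
        return -1;              /* empty input or trailing garbage */
    if (n == 0 || n >= MAX_RX_QUEUE_PER_LCORE)
        return -1;              /* out of range */
    return (long)n;
}
```

Checking `*end != '\0'` is what distinguishes "4" from "4x": strtoul() happily parses the leading digits of the latter, so the end pointer is the only way to tell the whole argument was numeric.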
/* Parse the argument given in the command line of the application */
@@ -1070,9 +687,7 @@ parse_args(int argc, char **argv)
int option_index;
char *prgname = argv[0];
static struct option lgopts[] = {
- {"config", 1, 0, 0},
- {"no-numa", 0, 0, 0},
- {"enable-jumbo", 0, 0, 0},
+ {"max-pkt-len", 1, 0, 0},
{"maxflows", 1, 0, 0},
{"flowttl", 1, 0, 0},
{NULL, 0, 0, 0}
@@ -1080,7 +695,7 @@ parse_args(int argc, char **argv)
argvopt = argv;
- while ((opt = getopt_long(argc, argvopt, "p:P",
+ while ((opt = getopt_long(argc, argvopt, "p:q:",
lgopts, &option_index)) != EOF) {
switch (opt) {
@@ -1093,27 +708,19 @@ parse_args(int argc, char **argv)
return -1;
}
break;
- case 'P':
- printf("Promiscuous mode selected\n");
- promiscuous_on = 1;
+
+ /* nqueue */
+ case 'q':
+ rx_queue_per_lcore = parse_nqueue(optarg);
+ if (rx_queue_per_lcore < 0) {
+ printf("invalid queue number\n");
+ print_usage(prgname);
+ return -1;
+ }
break;
/* long options */
case 0:
- if (!strncmp(lgopts[option_index].name, "config", 6)) {
- ret = parse_config(optarg);
- if (ret) {
- printf("invalid config\n");
- print_usage(prgname);
- return -1;
- }
- }
-
- if (!strncmp(lgopts[option_index].name, "no-numa", 7)) {
- printf("numa is disabled \n");
- numa_on = 0;
- }
-
if (!strncmp(lgopts[option_index].name,
"maxflows", 8)) {
if ((ret = parse_flow_num(optarg, MIN_FLOW_NUM,
@@ -1127,7 +734,7 @@ parse_args(int argc, char **argv)
return (ret);
}
}
-
+
if (!strncmp(lgopts[option_index].name, "flowttl", 7)) {
if ((ret = parse_flow_ttl(optarg, MIN_FLOW_TTL,
MAX_FLOW_TTL,
@@ -1141,26 +748,6 @@ parse_args(int argc, char **argv)
}
}
- if (!strncmp(lgopts[option_index].name, "enable-jumbo", 12)) {
- struct option lenopts = {"max-pkt-len", required_argument, 0, 0};
-
- printf("jumbo frame is enabled \n");
- port_conf.rxmode.jumbo_frame = 1;
-
- /* if no max-pkt-len set, use the default value ETHER_MAX_LEN */
- if (0 == getopt_long(argc, argvopt, "", &lenopts, &option_index)) {
- ret = parse_max_pkt_len(optarg);
- if ((ret < 64) || (ret > MAX_JUMBO_PKT_LEN)){
- printf("invalid packet length\n");
- print_usage(prgname);
- return -1;
- }
- port_conf.rxmode.max_rx_pkt_len = ret;
- }
- printf("set jumbo frame max packet length to %u\n",
- (unsigned int)port_conf.rxmode.max_rx_pkt_len);
- }
-
break;
default:
@@ -1189,182 +776,6 @@ print_ethaddr(const char *name, const struct ether_addr *eth_addr)
eth_addr->addr_bytes[5]);
}
-#if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
-static void
-setup_hash(int socketid)
-{
- struct rte_hash_parameters ipv4_l3fwd_hash_params = {
- .name = NULL,
- .entries = L3FWD_HASH_ENTRIES,
- .bucket_entries = 4,
- .key_len = sizeof(struct ipv4_5tuple),
- .hash_func = DEFAULT_HASH_FUNC,
- .hash_func_init_val = 0,
- };
-
- struct rte_hash_parameters ipv6_l3fwd_hash_params = {
- .name = NULL,
- .entries = L3FWD_HASH_ENTRIES,
- .bucket_entries = 4,
- .key_len = sizeof(struct ipv6_5tuple),
- .hash_func = DEFAULT_HASH_FUNC,
- .hash_func_init_val = 0,
- };
-
- unsigned i;
- int ret;
- char s[64];
-
- /* create ipv4 hash */
- rte_snprintf(s, sizeof(s), "ipv4_l3fwd_hash_%d", socketid);
- ipv4_l3fwd_hash_params.name = s;
- ipv4_l3fwd_hash_params.socket_id = socketid;
- ipv4_l3fwd_lookup_struct[socketid] = rte_hash_create(&ipv4_l3fwd_hash_params);
- if (ipv4_l3fwd_lookup_struct[socketid] == NULL)
- rte_exit(EXIT_FAILURE, "Unable to create the l3fwd hash on "
- "socket %d\n", socketid);
-
- /* create ipv6 hash */
- rte_snprintf(s, sizeof(s), "ipv6_l3fwd_hash_%d", socketid);
- ipv6_l3fwd_hash_params.name = s;
- ipv6_l3fwd_hash_params.socket_id = socketid;
- ipv6_l3fwd_lookup_struct[socketid] = rte_hash_create(&ipv6_l3fwd_hash_params);
- if (ipv6_l3fwd_lookup_struct[socketid] == NULL)
- rte_exit(EXIT_FAILURE, "Unable to create the l3fwd hash on "
- "socket %d\n", socketid);
-
-
- /* populate the ipv4 hash */
- for (i = 0; i < IPV4_L3FWD_NUM_ROUTES; i++) {
- ret = rte_hash_add_key (ipv4_l3fwd_lookup_struct[socketid],
- (void *) &ipv4_l3fwd_route_array[i].key);
- if (ret < 0) {
- rte_exit(EXIT_FAILURE, "Unable to add entry %u to the"
- "l3fwd hash on socket %d\n", i, socketid);
- }
- ipv4_l3fwd_out_if[ret] = ipv4_l3fwd_route_array[i].if_out;
- printf("Hash: Adding key\n");
- print_ipv4_key(ipv4_l3fwd_route_array[i].key);
- }
-
- /* populate the ipv6 hash */
- for (i = 0; i < IPV6_L3FWD_NUM_ROUTES; i++) {
- ret = rte_hash_add_key (ipv6_l3fwd_lookup_struct[socketid],
- (void *) &ipv6_l3fwd_route_array[i].key);
- if (ret < 0) {
- rte_exit(EXIT_FAILURE, "Unable to add entry %u to the"
- "l3fwd hash on socket %d\n", i, socketid);
- }
- ipv6_l3fwd_out_if[ret] = ipv6_l3fwd_route_array[i].if_out;
- printf("Hash: Adding key\n");
- print_ipv6_key(ipv6_l3fwd_route_array[i].key);
- }
-}
-#endif
-
-#if (APP_LOOKUP_METHOD == APP_LOOKUP_LPM)
-static void
-setup_lpm(int socketid)
-{
- struct rte_lpm6_config config;
- unsigned i;
- int ret;
- char s[64];
-
- /* create the LPM table */
- rte_snprintf(s, sizeof(s), "IPV4_L3FWD_LPM_%d", socketid);
- ipv4_l3fwd_lookup_struct[socketid] = rte_lpm_create(s, socketid,
- IPV4_L3FWD_LPM_MAX_RULES, 0);
- if (ipv4_l3fwd_lookup_struct[socketid] == NULL)
- rte_exit(EXIT_FAILURE, "Unable to create the l3fwd LPM table"
- " on socket %d\n", socketid);
-
- /* populate the LPM table */
- for (i = 0; i < IPV4_L3FWD_NUM_ROUTES; i++) {
- ret = rte_lpm_add(ipv4_l3fwd_lookup_struct[socketid],
- ipv4_l3fwd_route_array[i].ip,
- ipv4_l3fwd_route_array[i].depth,
- ipv4_l3fwd_route_array[i].if_out);
-
- if (ret < 0) {
- rte_exit(EXIT_FAILURE, "Unable to add entry %u to the "
- "l3fwd LPM table on socket %d\n",
- i, socketid);
- }
-
- printf("LPM: Adding route 0x%08x / %d (%d)\n",
- (unsigned)ipv4_l3fwd_route_array[i].ip,
- ipv4_l3fwd_route_array[i].depth,
- ipv4_l3fwd_route_array[i].if_out);
- }
-
- /* create the LPM6 table */
- rte_snprintf(s, sizeof(s), "IPV6_L3FWD_LPM_%d", socketid);
-
- config.max_rules = IPV6_L3FWD_LPM_MAX_RULES;
- config.number_tbl8s = IPV6_L3FWD_LPM_NUMBER_TBL8S;
- config.flags = 0;
- ipv6_l3fwd_lookup_struct[socketid] = rte_lpm6_create(s, socketid,
- &config);
- if (ipv6_l3fwd_lookup_struct[socketid] == NULL)
- rte_exit(EXIT_FAILURE, "Unable to create the l3fwd LPM table"
- " on socket %d\n", socketid);
-
- /* populate the LPM table */
- for (i = 0; i < IPV6_L3FWD_NUM_ROUTES; i++) {
- ret = rte_lpm6_add(ipv6_l3fwd_lookup_struct[socketid],
- ipv6_l3fwd_route_array[i].ip,
- ipv6_l3fwd_route_array[i].depth,
- ipv6_l3fwd_route_array[i].if_out);
-
- if (ret < 0) {
- rte_exit(EXIT_FAILURE, "Unable to add entry %u to the "
- "l3fwd LPM table on socket %d\n",
- i, socketid);
- }
-
- printf("LPM: Adding route %s / %d (%d)\n",
- "IPV6",
- ipv6_l3fwd_route_array[i].depth,
- ipv6_l3fwd_route_array[i].if_out);
- }
-}
-#endif
-
-static int
-init_mem(void)
-{
- struct lcore_conf *qconf;
- int socketid;
- unsigned lcore_id;
-
- for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
- if (rte_lcore_is_enabled(lcore_id) == 0)
- continue;
-
- if (numa_on)
- socketid = rte_lcore_to_socket_id(lcore_id);
- else
- socketid = 0;
-
- if (socketid >= NB_SOCKETS) {
- rte_exit(EXIT_FAILURE,
- "Socket %d of lcore %u is out of range %d\n",
- socketid, lcore_id, NB_SOCKETS);
- }
-
-#if (APP_LOOKUP_METHOD == APP_LOOKUP_LPM)
- setup_lpm(socketid);
-#else
- setup_hash(socketid);
-#endif
- qconf = &lcore_conf[lcore_id];
- qconf->ipv4_lookup_struct = ipv4_l3fwd_lookup_struct[socketid];
- qconf->ipv6_lookup_struct = ipv6_l3fwd_lookup_struct[socketid];
- }
- return 0;
-}
-
/* Check the link status of all ports in up to 9s, and print them finally */
static void
check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
@@ -1415,12 +826,73 @@ check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
/* set the print_flag if all ports up or timeout */
if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
print_flag = 1;
- printf("done\n");
+ printf("\ndone\n");
}
}
}
-static void
-setup_port_tbl(struct lcore_conf *qconf, uint32_t lcore, int socket,
+
+static int
+init_routing_table(void)
+{
+ struct rte_lpm *lpm;
+ struct rte_lpm6 *lpm6;
+ int socket, ret;
+ unsigned i;
+
+ for (socket = 0; socket < RTE_MAX_NUMA_NODES; socket++) {
+ if (socket_lpm[socket]) {
+ lpm = socket_lpm[socket];
+ /* populate the LPM table */
+ for (i = 0; i < RTE_DIM(l3fwd_ipv4_route_array); i++) {
+ ret = rte_lpm_add(lpm,
+ l3fwd_ipv4_route_array[i].ip,
+ l3fwd_ipv4_route_array[i].depth,
+ l3fwd_ipv4_route_array[i].if_out);
+
+ if (ret < 0) {
+ RTE_LOG(ERR, IP_RSMBL, "Unable to add entry %i to the l3fwd "
+ "LPM table\n", i);
+ return -1;
+ }
+
+ RTE_LOG(INFO, IP_RSMBL, "Socket %i: adding route " IPv4_BYTES_FMT
+ "/%d (port %d)\n",
+ socket,
+ IPv4_BYTES(l3fwd_ipv4_route_array[i].ip),
+ l3fwd_ipv4_route_array[i].depth,
+ l3fwd_ipv4_route_array[i].if_out);
+ }
+ }
+
+ if (socket_lpm6[socket]) {
+ lpm6 = socket_lpm6[socket];
+ /* populate the LPM6 table */
+ for (i = 0; i < RTE_DIM(l3fwd_ipv6_route_array); i++) {
+ ret = rte_lpm6_add(lpm6,
+ l3fwd_ipv6_route_array[i].ip,
+ l3fwd_ipv6_route_array[i].depth,
+ l3fwd_ipv6_route_array[i].if_out);
+
+ if (ret < 0) {
+ RTE_LOG(ERR, IP_RSMBL, "Unable to add entry %i to the l3fwd "
+ "LPM6 table\n", i);
+ return -1;
+ }
+
+ RTE_LOG(INFO, IP_RSMBL, "Socket %i: adding route " IPv6_BYTES_FMT
+ "/%d (port %d)\n",
+ socket,
+ IPv6_BYTES(l3fwd_ipv6_route_array[i].ip),
+ l3fwd_ipv6_route_array[i].depth,
+ l3fwd_ipv6_route_array[i].if_out);
+ }
+ }
+ }
+ return 0;
+}
+
+static int
+setup_port_tbl(struct lcore_queue_conf *qconf, uint32_t lcore, int socket,
uint32_t port)
{
struct mbuf_table *mtb;
@@ -1431,73 +903,136 @@ setup_port_tbl(struct lcore_conf *qconf, uint32_t lcore, int socket,
sz = sizeof (*mtb) + sizeof (mtb->m_table[0]) * n;
if ((mtb = rte_zmalloc_socket(__func__, sz, CACHE_LINE_SIZE,
- socket)) == NULL)
- rte_exit(EXIT_FAILURE, "%s() for lcore: %u, port: %u "
+ socket)) == NULL) {
+ RTE_LOG(ERR, IP_RSMBL, "%s() for lcore: %u, port: %u "
"failed to allocate %zu bytes\n",
__func__, lcore, port, sz);
+ return -1;
+ }
mtb->len = n;
qconf->tx_mbufs[port] = mtb;
+
+ return 0;
}
-static void
-setup_queue_tbl(struct lcore_conf *qconf, uint32_t lcore, int socket,
- uint32_t queue)
+static int
+setup_queue_tbl(struct rx_queue *rxq, uint32_t lcore, uint32_t queue)
{
+ int socket;
uint32_t nb_mbuf;
uint64_t frag_cycles;
char buf[RTE_MEMPOOL_NAMESIZE];
+ socket = rte_lcore_to_socket_id(lcore);
+ if (socket == SOCKET_ID_ANY)
+ socket = 0;
+
frag_cycles = (rte_get_tsc_hz() + MS_PER_S - 1) / MS_PER_S *
max_flow_ttl;
- if ((qconf->frag_tbl[queue] = rte_ip_frag_table_create(max_flow_num,
- IPV4_FRAG_TBL_BUCKET_ENTRIES, max_flow_num, frag_cycles,
- socket)) == NULL)
- rte_exit(EXIT_FAILURE, "ipv4_frag_tbl_create(%u) on "
+ if ((rxq->frag_tbl = rte_ip_frag_table_create(max_flow_num,
+ IP_FRAG_TBL_BUCKET_ENTRIES, max_flow_num, frag_cycles,
+ socket)) == NULL) {
+ RTE_LOG(ERR, IP_RSMBL, "ip_frag_tbl_create(%u) on "
"lcore: %u for queue: %u failed\n",
max_flow_num, lcore, queue);
+ return -1;
+ }
/*
* At any given moment up to <max_flow_num * (MAX_FRAG_NUM - 1)>
* mbufs could be stored int the fragment table.
* Plus, each TX queue can hold up to <max_flow_num> packets.
- */
+ */
- nb_mbuf = 2 * RTE_MAX(max_flow_num, 2UL * MAX_PKT_BURST) *
- RTE_LIBRTE_IP_FRAG_MAX_FRAG;
+ nb_mbuf = 2 * RTE_MAX(max_flow_num, 2UL * MAX_PKT_BURST) * MAX_FRAG_NUM;
nb_mbuf *= (port_conf.rxmode.max_rx_pkt_len + BUF_SIZE - 1) / BUF_SIZE;
nb_mbuf += RTE_TEST_RX_DESC_DEFAULT + RTE_TEST_TX_DESC_DEFAULT;
+ nb_mbuf *= 2; /* ipv4 and ipv6 */
+
+ nb_mbuf = RTE_MAX(nb_mbuf, (uint32_t)NB_MBUF);
- nb_mbuf = RTE_MAX(nb_mbuf, (uint32_t)DEF_MBUF_NUM);
-
rte_snprintf(buf, sizeof(buf), "mbuf_pool_%u_%u", lcore, queue);
- if ((qconf->pool[queue] = rte_mempool_create(buf, nb_mbuf, MBUF_SIZE, 0,
+ if ((rxq->pool = rte_mempool_create(buf, nb_mbuf, MBUF_SIZE, 0,
sizeof(struct rte_pktmbuf_pool_private),
rte_pktmbuf_pool_init, NULL, rte_pktmbuf_init, NULL,
- socket, MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET)) == NULL)
- rte_exit(EXIT_FAILURE, "mempool_create(%s) failed", buf);
+ socket, MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET)) == NULL) {
+ RTE_LOG(ERR, IP_RSMBL, "mempool_create(%s) failed", buf);
+ return -1;
+ }
+
+ return 0;
+}
+
+static int
+init_mem(void)
+{
+ char buf[PATH_MAX];
+ struct rte_lpm * lpm;
+ struct rte_lpm6 * lpm6;
+ int socket;
+ unsigned lcore_id;
+
+ /* traverse through lcores and initialize structures on each socket */
+
+ for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+
+ if (rte_lcore_is_enabled(lcore_id) == 0)
+ continue;
+
+ socket = rte_lcore_to_socket_id(lcore_id);
+
+ if (socket == SOCKET_ID_ANY)
+ socket = 0;
+
+ if (socket_lpm[socket] == NULL) {
+ RTE_LOG(INFO, IP_RSMBL, "Creating LPM table on socket %i\n", socket);
+ rte_snprintf(buf, sizeof(buf), "IP_RSMBL_LPM_%i", socket);
+
+ lpm = rte_lpm_create(buf, socket, LPM_MAX_RULES, 0);
+ if (lpm == NULL) {
+ RTE_LOG(ERR, IP_RSMBL, "Cannot create LPM table\n");
+ return -1;
+ }
+ socket_lpm[socket] = lpm;
+ }
+
+ if (socket_lpm6[socket] == NULL) {
+ RTE_LOG(INFO, IP_RSMBL, "Creating LPM6 table on socket %i\n", socket);
+ rte_snprintf(buf, sizeof(buf), "IP_RSMBL_LPM_%i", socket);
+
+ lpm6 = rte_lpm6_create("IP_RSMBL_LPM6", socket, &lpm6_config);
+ if (lpm6 == NULL) {
+ RTE_LOG(ERR, IP_RSMBL, "Cannot create LPM table\n");
+ return -1;
+ }
+ socket_lpm6[socket] = lpm6;
+ }
+ }
+
+ return 0;
}
static void
queue_dump_stat(void)
{
uint32_t i, lcore;
- const struct lcore_conf *qconf;
+ const struct lcore_queue_conf *qconf;
for (lcore = 0; lcore < RTE_MAX_LCORE; lcore++) {
if (rte_lcore_is_enabled(lcore) == 0)
continue;
- qconf = lcore_conf + lcore;
+ qconf = &lcore_queue_conf[lcore];
for (i = 0; i < qconf->n_rx_queue; i++) {
fprintf(stdout, " -- lcoreid=%u portid=%hhu "
- "rxqueueid=%hhu frag tbl stat:\n",
- lcore, qconf->rx_queue_list[i].port_id,
- qconf->rx_queue_list[i].queue_id);
- rte_ip_frag_table_statistics_dump(stdout, qconf->frag_tbl[i]);
+ "frag tbl stat:\n",
+ lcore, qconf->rx_queue_list[i].portid);
+ rte_ip_frag_table_statistics_dump(stdout,
+ qconf->rx_queue_list[i].frag_tbl);
fprintf(stdout, "TX bursts:\t%" PRIu64 "\n"
"TX packets _queued:\t%" PRIu64 "\n"
"TX packets dropped:\t%" PRIu64 "\n"
@@ -1521,13 +1056,14 @@ signal_handler(int signum)
int
MAIN(int argc, char **argv)
{
- struct lcore_conf *qconf;
- int ret;
+ struct lcore_queue_conf *qconf;
+ struct rx_queue * rxq;
+ int ret, socket;
unsigned nb_ports;
uint16_t queueid;
- unsigned lcore_id;
+ unsigned lcore_id = 0, rx_lcore_id = 0;
uint32_t n_tx_queue, nb_lcores;
- uint8_t portid, nb_rx_queue, queue, socketid;
+ uint8_t portid;
/* init EAL */
ret = rte_eal_init(argc, argv);
@@ -1539,28 +1075,23 @@ MAIN(int argc, char **argv)
/* parse application arguments (after the EAL ones) */
ret = parse_args(argc, argv);
if (ret < 0)
- rte_exit(EXIT_FAILURE, "Invalid L3FWD parameters\n");
-
- if (check_lcore_params() < 0)
- rte_exit(EXIT_FAILURE, "check_lcore_params failed\n");
-
- ret = init_lcore_rx_queues();
- if (ret < 0)
- rte_exit(EXIT_FAILURE, "init_lcore_rx_queues failed\n");
-
+ rte_exit(EXIT_FAILURE, "Invalid IP reassembly parameters\n");
if (rte_eal_pci_probe() < 0)
rte_exit(EXIT_FAILURE, "Cannot probe PCI\n");
nb_ports = rte_eth_dev_count();
- if (nb_ports > MAX_PORTS)
- nb_ports = MAX_PORTS;
-
- if (check_port_config(nb_ports) < 0)
- rte_exit(EXIT_FAILURE, "check_port_config failed\n");
+ if (nb_ports > RTE_MAX_ETHPORTS)
+ nb_ports = RTE_MAX_ETHPORTS;
+ else if (nb_ports == 0)
+ rte_exit(EXIT_FAILURE, "No ports found!\n");
nb_lcores = rte_lcore_count();
+ /* initialize structures (mempools, lpm etc.) */
+ if (init_mem() < 0)
+ rte_panic("Cannot initialize memory structures!\n");
+
/* initialize all ports */
for (portid = 0; portid < nb_ports; portid++) {
/* skip ports that are not enabled */
@@ -1569,30 +1100,62 @@ MAIN(int argc, char **argv)
continue;
}
+ qconf = &lcore_queue_conf[rx_lcore_id];
+
+ /* get the lcore_id for this port */
+ while (rte_lcore_is_enabled(rx_lcore_id) == 0 ||
+ qconf->n_rx_queue == (unsigned)rx_queue_per_lcore) {
+
+ rx_lcore_id ++;
+ if (rx_lcore_id >= RTE_MAX_LCORE)
+ rte_exit(EXIT_FAILURE, "Not enough cores\n");
+
+ qconf = &lcore_queue_conf[rx_lcore_id];
+ }
+
+ socket = rte_eth_dev_socket_id(portid);
+ if (socket == SOCKET_ID_ANY)
+ socket = 0;
+
+ queueid = qconf->n_rx_queue;
+ rxq = &qconf->rx_queue_list[queueid];
+ rxq->portid = portid;
+ rxq->lpm = socket_lpm[socket];
+ rxq->lpm6 = socket_lpm6[socket];
+ if (setup_queue_tbl(rxq, rx_lcore_id, queueid) < 0)
+ rte_exit(EXIT_FAILURE, "Failed to set up queue table\n");
+ qconf->n_rx_queue++;
+
/* init port */
printf("Initializing port %d ... ", portid );
fflush(stdout);
- nb_rx_queue = get_port_n_rx_queues(portid);
n_tx_queue = nb_lcores;
if (n_tx_queue > MAX_TX_QUEUE_PER_PORT)
n_tx_queue = MAX_TX_QUEUE_PER_PORT;
- printf("Creating queues: nb_rxq=%d nb_txq=%u... ",
- nb_rx_queue, (unsigned)n_tx_queue );
- ret = rte_eth_dev_configure(portid, nb_rx_queue,
- (uint16_t)n_tx_queue, &port_conf);
- if (ret < 0)
- rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%d\n",
+ ret = rte_eth_dev_configure(portid, 1, (uint16_t)n_tx_queue,
+ &port_conf);
+ if (ret < 0) {
+ printf("\n");
+ rte_exit(EXIT_FAILURE, "Cannot configure device: "
+ "err=%d, port=%d\n",
ret, portid);
+ }
+
+ /* init one RX queue */
+ ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
+ socket, &rx_conf,
+ rxq->pool);
+ if (ret < 0) {
+ printf("\n");
+ rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup: "
+ "err=%d, port=%d\n",
+ ret, portid);
+ }
rte_eth_macaddr_get(portid, &ports_eth_addr[portid]);
print_ethaddr(" Address:", &ports_eth_addr[portid]);
- printf(", ");
-
- /* init memory */
- ret = init_mem();
- if (ret < 0)
- rte_exit(EXIT_FAILURE, "init_mem failed\n");
+ printf("\n");
/* init one TX queue per couple (lcore,port) */
queueid = 0;
@@ -1600,57 +1163,24 @@ MAIN(int argc, char **argv)
if (rte_lcore_is_enabled(lcore_id) == 0)
continue;
- if (numa_on)
- socketid = (uint8_t)rte_lcore_to_socket_id(lcore_id);
- else
- socketid = 0;
+ socket = (int) rte_lcore_to_socket_id(lcore_id);
- printf("txq=%u,%d,%d ", lcore_id, queueid, socketid);
+ printf("txq=%u,%d,%d ", lcore_id, queueid, socket);
fflush(stdout);
ret = rte_eth_tx_queue_setup(portid, queueid, nb_txd,
- socketid, &tx_conf);
+ socket, &tx_conf);
if (ret < 0)
rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup: err=%d, "
"port=%d\n", ret, portid);
- qconf = &lcore_conf[lcore_id];
+ qconf = &lcore_queue_conf[lcore_id];
qconf->tx_queue_id[portid] = queueid;
- setup_port_tbl(qconf, lcore_id, socketid, portid);
+ setup_port_tbl(qconf, lcore_id, socket, portid);
queueid++;
}
printf("\n");
}
- for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
- if (rte_lcore_is_enabled(lcore_id) == 0)
- continue;
- qconf = &lcore_conf[lcore_id];
- printf("\nInitializing rx queues on lcore %u ... ", lcore_id );
- fflush(stdout);
- /* init RX queues */
- for(queue = 0; queue < qconf->n_rx_queue; ++queue) {
- portid = qconf->rx_queue_list[queue].port_id;
- queueid = qconf->rx_queue_list[queue].queue_id;
-
- if (numa_on)
- socketid = (uint8_t)rte_lcore_to_socket_id(lcore_id);
- else
- socketid = 0;
-
- printf("rxq=%d,%d,%d ", portid, queueid, socketid);
- fflush(stdout);
-
- setup_queue_tbl(qconf, lcore_id, socketid, queue);
-
- ret = rte_eth_rx_queue_setup(portid, queueid, nb_rxd,
- socketid, &rx_conf, qconf->pool[queue]);
- if (ret < 0)
- rte_exit(EXIT_FAILURE,
- "rte_eth_rx_queue_setup: err=%d,"
- "port=%d\n", ret, portid);
- }
- }
-
printf("\n");
/* start ports */
@@ -1664,16 +1194,12 @@ MAIN(int argc, char **argv)
rte_exit(EXIT_FAILURE, "rte_eth_dev_start: err=%d, port=%d\n",
ret, portid);
- /*
- * If enabled, put device in promiscuous mode.
- * This allows IO forwarding mode to forward packets
- * to itself through 2 cross-connected ports of the
- * target machine.
- */
- if (promiscuous_on)
- rte_eth_promiscuous_enable(portid);
+ rte_eth_promiscuous_enable(portid);
}
+ if (init_routing_table() < 0)
+ rte_exit(EXIT_FAILURE, "Cannot init routing table\n");
+
check_all_ports_link_status((uint8_t)nb_ports, enabled_port_mask);
signal(SIGUSR1, signal_handler);
--
1.8.1.4
* Re: [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE ***
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
` (13 preceding siblings ...)
2014-05-28 17:32 ` [dpdk-dev] [PATCH 13/13] examples: overhaul of ip_reassembly app Anatoly Burakov
@ 2014-05-28 17:34 ` Burakov, Anatoly
2014-06-06 15:58 ` [dpdk-dev] [PATCH 00/13] IPv4/IPv6 fragmentation/reassembly library Cao, Waterman
2014-06-16 16:59 ` [dpdk-dev] [PATCH 00/13] IP fragmentation and reassembly Thomas Monjalon
16 siblings, 0 replies; 18+ messages in thread
From: Burakov, Anatoly @ 2014-05-28 17:34 UTC (permalink / raw)
To: dev
Sorry, for some reason two cover letters were sent.
> Subject: [PATCH 00/13] *** SUBJECT HERE ***
>
> *** BLURB HERE ***
Best regards,
Anatoly Burakov
DPDK SW Engineer
* Re: [dpdk-dev] [PATCH 00/13] IPv4/IPv6 fragmentation/reassembly library
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
` (14 preceding siblings ...)
2014-05-28 17:34 ` [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Burakov, Anatoly
@ 2014-06-06 15:58 ` Cao, Waterman
2014-06-16 16:59 ` [dpdk-dev] [PATCH 00/13] IP fragmentation and reassembly Thomas Monjalon
16 siblings, 0 replies; 18+ messages in thread
From: Cao, Waterman @ 2014-06-06 15:58 UTC (permalink / raw)
To: Burakov, Anatoly, dev
Tested-by: Waterman Cao <waterman.cao@intel.com>
This series includes 13 patches; the ip_fragmentation and ip_reassembly apps have been tested by Intel.
We verified the IP fragmentation/reassembly library with both IPv4 and IPv6, and all cases passed.
Test guidance follows:
IP reassembly:
1. ./examples/ip_reassembly/build/ip_reassembly -c f -n 3 -- -p 0x30
2. Build and send the test packet with Scapy:
packet = Ether() / IPv6() / IPv6ExtHdrFragment() / TCP() / ("X" * 3000)
packet[IPv6].dst = 'fe80::92e2:baff:fe48:81b5'
sendp(packet, iface="eth5")
3. Use Wireshark to capture the traffic and confirm it was reassembled correctly.
IP fragmentation:
./ip_fragmentation -c <LCOREMASK> -n 4 -- [-P] -p PORTMASK
-q <NUM_OF_PORTS_PER_THREAD>
Test environment:
Fedora 20 x86_64, Linux Kernel 3.11.10-301, GCC 4.8.2
Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz NIC: Niantic 82599
* Re: [dpdk-dev] [PATCH 00/13] IP fragmentation and reassembly
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
` (15 preceding siblings ...)
2014-06-06 15:58 ` [dpdk-dev] [PATCH 00/13] IPv4/IPv6 fragmentation/reassembly library Cao, Waterman
@ 2014-06-16 16:59 ` Thomas Monjalon
16 siblings, 0 replies; 18+ messages in thread
From: Thomas Monjalon @ 2014-06-16 16:59 UTC (permalink / raw)
To: Anatoly Burakov; +Cc: dev
> Anatoly Burakov (13):
> ip_frag: Moving fragmentation/reassembly headers into a separate
> library
> Refactored IPv4 fragmentation into a proper library
> Fixing issues reported by checkpatch
> ip_frag: new internal common header
> ip_frag: removed unneeded check and macro
> ip_frag: renaming structures in fragmentation table to be more generic
> ip_frag: refactored reassembly code and made it a proper library
> ip_frag: renamed ipv4 frag function
> ip_frag: added IPv6 fragmentation support
> examples: renamed ipv4_frag example app to ip_fragmentation
> example: overhaul of ip_fragmentation example app
> ip_frag: add support for IPv6 reassembly
> examples: overhaul of ip_reassembly app
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
I've fixed a few code style issues and added the library to the doxygen documentation.
Applied for version 1.7.0.
Thanks
--
Thomas
end of thread, other threads:[~2014-06-16 16:59 UTC | newest]
Thread overview: 18+ messages
2014-05-28 17:32 [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 00/13] IPv4/IPv6 fragmentation/reassembly library Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 01/13] ip_frag: Moving fragmentation/reassembly headers into a separate library Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 02/13] Refactored IPv4 fragmentation into a proper library Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 03/13] Fixing issues reported by checkpatch Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 04/13] ip_frag: new internal common header Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 05/13] ip_frag: removed unneeded check and macro Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 06/13] ip_frag: renaming structures in fragmentation table to be more generic Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 07/13] ip_frag: refactored reassembly code and made it a proper library Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 08/13] ip_frag: renamed ipv4 frag function Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 09/13] ip_frag: added IPv6 fragmentation support Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 10/13] examples: renamed ipv4_frag example app to ip_fragmentation Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 11/13] example: overhaul of ip_fragmentation example app Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 12/13] ip_frag: add support for IPv6 reassembly Anatoly Burakov
2014-05-28 17:32 ` [dpdk-dev] [PATCH 13/13] examples: overhaul of ip_reassembly app Anatoly Burakov
2014-05-28 17:34 ` [dpdk-dev] [PATCH 00/13] *** SUBJECT HERE *** Burakov, Anatoly
2014-06-06 15:58 ` [dpdk-dev] [PATCH 00/13] IPv4/IPv6 fragmentation/reassembly library Cao, Waterman
2014-06-16 16:59 ` [dpdk-dev] [PATCH 00/13] IP fragmentation and reassembly Thomas Monjalon