* Re: [PATCH] ip_frag: add IPv4 options fragment and unit test data
@ 2022-02-10 12:21 Ananyev, Konstantin
2022-02-11 2:12 ` Huichao Cai
2022-02-11 2:20 ` Huichao Cai
0 siblings, 2 replies; 9+ messages in thread
From: Ananyev, Konstantin @ 2022-02-10 12:21 UTC (permalink / raw)
To: chcchc88; +Cc: dev
>
> According to RFC791,the options may appear or not in datagrams.
> They must be implemented by all IP modules (host and gateways).
> What is optional is their transmission in any particular datagram,
> not their implementation.So we have to deal with it during the
> fragmenting process.Add some test data for the IPv4 header optional
> field fragmenting.
>
> Signed-off-by: Huichao Cai <chcchc88@163.com>
> ---
....
> diff --git a/lib/ip_frag/rte_ipv4_fragmentation.c b/lib/ip_frag/rte_ipv4_fragmentation.c
> index 2e7739d..bcafa29 100644
> --- a/lib/ip_frag/rte_ipv4_fragmentation.c
> +++ b/lib/ip_frag/rte_ipv4_fragmentation.c
> @@ -1,4 +1,4 @@
> -/* SPDX-License-Identifier: BSD-3-Clause
> +/* SPDX-License-Identifier: (BSD-3-Clause OR GPL-2.0)
> * Copyright(c) 2010-2014 Intel Corporation
> */
>
> @@ -12,6 +12,13 @@
>
> #include "ip_frag_common.h"
>
> +/* IP options */
> +#define RTE_IPOPT_COPY 0x80
> +#define RTE_IPOPT_CONTROL 0x00
> +#define RTE_IPOPT_END (0 | RTE_IPOPT_CONTROL)
> +#define RTE_IPOPT_NOOP (1 | RTE_IPOPT_CONTROL)
> +#define RTE_IPOPT_COPIED(o) ((o) & RTE_IPOPT_COPY)
> +
> /* Fragment Offset */
> #define RTE_IPV4_HDR_DF_SHIFT 14
> #define RTE_IPV4_HDR_MF_SHIFT 13
> @@ -41,6 +48,38 @@ static inline void __free_fragments(struct rte_mbuf *mb[], uint32_t num)
> rte_pktmbuf_free(mb[i]);
> }
>
> +/*
> + * Options "fragmenting", just fill options not
> + * allowed in fragments with NOOPs.
> + * Simple and stupid 8), but the most efficient way.
> + */
> +static inline void ip_options_fragment(struct rte_ipv4_hdr *iph)
> +{
> + unsigned char *optptr = (unsigned char *)iph +
> + sizeof(struct rte_ipv4_hdr);
As a nit, why not 'uint8_t *', to keep style the same through all file?
> + int l = (iph->version_ihl & RTE_IPV4_HDR_IHL_MASK) *
> + RTE_IPV4_IHL_MULTIPLIER - sizeof(struct rte_ipv4_hdr);
We already done such calculation in rte_ipv4_fragment_packet(),
so can re-use header_len value here.
> + int optlen;
> +
> + while (l > 0) {
> + switch (*optptr) {
> + case RTE_IPOPT_END:
> + return;
> + case RTE_IPOPT_NOOP:
> + l--;
> + optptr++;
> + continue;
> + }
> + optlen = optptr[1];
> + if (optlen < 2 || optlen > l)
Why optlen==1 is not considered as valid one?
> + return;
> + if (!RTE_IPOPT_COPIED(*optptr))
> + memset(optptr, RTE_IPOPT_NOOP, optlen);
> + l -= optlen;
> + optptr += optlen;
> + }
> +}
> +
> /**
> * IPv4 fragmentation.
> *
> @@ -188,6 +227,17 @@ static inline void __free_fragments(struct rte_mbuf *mb[], uint32_t num)
> (uint16_t)out_pkt->pkt_len,
> flag_offset, fragment_offset, more_in_segs);
>
> + /*
> + * ANK:
What means 'ANK' here?
> dirty, but effective trick. Upgrade options only if
> + * the segment to be fragmented was THE FIRST (otherwise,
> + * options are already fixed) and make it ONCE
> + * on the initial mbuf, so that all the following fragments
> + * will inherit fixed options.
> + */
> + if ((fragment_offset == 0) &&
> + ((flag_offset & RTE_IPV4_HDR_OFFSET_MASK) == 0))
> + ip_options_fragment(in_hdr);
> +
I see two problems with that approach:
- you modify incoming packet's header - which is the change in behaviour,
and doesn't look right at all.
- you remove not-copied options from all fragments.
As I can read RFC 791 - first fragment should have a copy of all options present
in original packet, while other fragments need to have only those that have to be
copied.
> fragment_offset = (uint16_t)(fragment_offset +
> out_pkt->pkt_len - header_len);
>
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re:Re: [PATCH] ip_frag: add IPv4 options fragment and unit test data
2022-02-10 12:21 [PATCH] ip_frag: add IPv4 options fragment and unit test data Ananyev, Konstantin
@ 2022-02-11 2:12 ` Huichao Cai
2022-02-11 9:41 ` Ananyev, Konstantin
2022-02-11 2:20 ` Huichao Cai
1 sibling, 1 reply; 9+ messages in thread
From: Huichao Cai @ 2022-02-11 2:12 UTC (permalink / raw)
To: Ananyev, Konstantin; +Cc: dev
[-- Attachment #1: Type: text/plain, Size: 1797 bytes --]
Hi,Konstantin
Thank you for your reply!
>As a nit, why not 'uint8_t *', to keep style the same through all file?
Yes,I can use 'uint8_t *.Thank you for your correction.
>We already done such calculation in rte_ipv4_fragment_packet(),
>so can re-use header_len value here.
Yes,I can re-use header_len.Thank you for your correction.
>Why optlen==1 is not considered as valid one?
RFC791:
Case 2: An option-type octet, an option-length octet, and the actual option-data octets.
The option-length octet counts the option-type octet and the option-length octet as well as the option-data octets.
So for case 2, the value of optlen is at least 2.
>What means 'ANK' here?
Because this code comes from the Linux kernel and is licensed under the GPL, I kept the original comments, I looked up the Linux kernel code, and ‘ANK’ should be the name of a developer.
>I see two problems with that approach: >- you modify incoming packet's header - which is the change in behaviour,
> and doesn't look right at all.
Because the incoming packet is fragmented, the incoming packet IP header is no longer used, so this behavior does not cause problems.At the same time, in order to consider efficiency, modify the original IP header directly, rather than creating a separate IP header.Of course, if this method is not reasonable, I can create a separate IP header to modify.
>- you remove not-copied options from all fragments. > As I can read RFC 791 - first fragment should have a copy of all options present > in original packet, while other fragments need to have only those that have to be
> copied.
This function(ip_options_fragment) is called under that ‘__fill_ipv4hdr_frag’,so the first fragment have a copy of all options present in original packet.
Huichao Cai
[-- Attachment #2: Type: text/html, Size: 2480 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Re:Re: [PATCH] ip_frag: add IPv4 options fragment and unit test data
2022-02-11 2:12 ` Huichao Cai
@ 2022-02-11 9:41 ` Ananyev, Konstantin
2022-02-11 10:00 ` Huichao Cai
0 siblings, 1 reply; 9+ messages in thread
From: Ananyev, Konstantin @ 2022-02-11 9:41 UTC (permalink / raw)
To: Huichao Cai; +Cc: dev
Hi Huichao
>>As a nit, why not 'uint8_t *', to keep style the same through all file?
>Yes,I can use 'uint8_t *.Thank you for your correction.
>
>>We already done such calculation in rte_ipv4_fragment_packet(),
>>so can re-use header_len value here.
>Yes,I can re-use header_len.Thank you for your correction.
>
>>Why optlen==1 is not considered as valid one?
>RFC791:
> Case 2: An option-type octet, an option-length octet, and the actual option-data octets.
> The option-length octet counts the option-type octet and the option-length octet as well as the option-data octets.
>So for case 2, the value of optlen is at least 2.
Ok, thanks for explanation.
>>What means 'ANK' here?
>Because this code comes from the Linux kernel and is licensed under the GPL, I kept the original comments, I looked up the Linux kernel code, and ‘ANK’ should be the name of a >developer.
AFAIK, we can't copy-paste code from Linux kernel.
As you noted it is under GPL, while DPDK is under BSD-3 license.
>>I see two problems with that approach:
>>- you modify incoming packet's header - which is the change in behaviour,
>> and doesn't look right at all.
>Because the incoming packet is fragmented, the incoming packet IP header is no longer used, so this behavior does not cause problems.At the same time, in order to consider >efficiency, modify the original IP header directly, rather than creating a separate IP header.Of course, if this method is not reasonable, I can create a separate IP header to modify.
Library routine has no idea would original IP packet will be used later or not.
In your particular case it might be not needed, but there might be other usages,
that do use it (logging, send un-fragmented via other port, etc.).
So I think we have to preserve original behaviour.
>>- you remove not-copied options from all fragments.
>> As I can read RFC 791 - first fragment should have a copy of all options present
>> in original packet, while other fragments need to have only those that have to be
>> copied.
>This function(ip_options_fragment) is called under that ‘__fill_ipv4hdr_frag’,so the first fragment have a copy of all options present in original packet.
Right, I missed the fact that you do modify original packet after making a first copy.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re:RE: Re:Re: [PATCH] ip_frag: add IPv4 options fragment and unit test data
2022-02-11 9:41 ` Ananyev, Konstantin
@ 2022-02-11 10:00 ` Huichao Cai
0 siblings, 0 replies; 9+ messages in thread
From: Huichao Cai @ 2022-02-11 10:00 UTC (permalink / raw)
To: Ananyev, Konstantin; +Cc: dev
[-- Attachment #1: Type: text/plain, Size: 461 bytes --]
>AFAIK, we can't copy-paste code from Linux kernel.
>As you noted it is under GPL, while DPDK is under BSD-3 license.
Well, I'll rewrite the code.
>Library routine has no idea would original IP packet will be used later or not.
>In your particular case it might be not needed, but there might be other usages,
>that do use it (logging, send un-fragmented via other port, etc.).
>So I think we have to preserve original behaviour.
Ok,I will preserve original.
[-- Attachment #2: Type: text/html, Size: 965 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re:Re: [PATCH] ip_frag: add IPv4 options fragment and unit test data
2022-02-10 12:21 [PATCH] ip_frag: add IPv4 options fragment and unit test data Ananyev, Konstantin
2022-02-11 2:12 ` Huichao Cai
@ 2022-02-11 2:20 ` Huichao Cai
2022-02-11 10:11 ` Ferruh Yigit
1 sibling, 1 reply; 9+ messages in thread
From: Huichao Cai @ 2022-02-11 2:20 UTC (permalink / raw)
To: Ananyev, Konstantin; +Cc: dev
[-- Attachment #1: Type: text/plain, Size: 103 bytes --]
A small problem.Why is the content of the email just sent to you not visible at Patchwork (this patch).
[-- Attachment #2: Type: text/html, Size: 348 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] ip_frag: add IPv4 options fragment and unit test data
@ 2021-12-02 9:35 Dariusz Sosnowski
0 siblings, 0 replies; 9+ messages in thread
From: Dariusz Sosnowski @ 2021-12-02 9:35 UTC (permalink / raw)
To: Huichao Cai; +Cc: konstantin.ananyev, dev
Hi,
On Thu, 2 Dec 2021 10:24:40 +0800, Huichao Cai wrote:
> > Substituting options with NOOP might cause rte_ipv4_fragment_packet to produce more fragments than necessary, since options with copied flag unset will still occupy space in IPv4 header.
> --The "ip_options_fragment" just make a replacement and doesn't change the length of the IPv4 header.So I don't quite understand why it leads to produce more fragments.
If options with copied flag unset are not copied, then IPv4 headers in the fragments (despite 1st fragment) will be shorter. This leaves more byte space for the payload and in effect fragmentation might produce less fragments.
Best regards,
Dariusz Sosnowski
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] ip_frag: add IPv4 options fragment and unit test data
@ 2021-11-24 8:47 Huichao Cai
2021-12-01 11:49 ` Dariusz Sosnowski
0 siblings, 1 reply; 9+ messages in thread
From: Huichao Cai @ 2021-11-24 8:47 UTC (permalink / raw)
To: dev; +Cc: konstantin.ananyev
According to RFC791,the options may appear or not in datagrams.
They must be implemented by all IP modules (host and gateways).
What is optional is their transmission in any particular datagram,
not their implementation.So we have to deal with it during the
fragmenting process.Add some test data for the IPv4 header optional
field fragmenting.
Signed-off-by: Huichao Cai <chcchc88@163.com>
---
app/test/test_ipfrag.c | 269 ++++++++++++++++++++++++++++++++---
lib/ip_frag/rte_ipv4_fragmentation.c | 52 ++++++-
2 files changed, 301 insertions(+), 20 deletions(-)
diff --git a/app/test/test_ipfrag.c b/app/test/test_ipfrag.c
index 1ced25a..ecb9426 100644
--- a/app/test/test_ipfrag.c
+++ b/app/test/test_ipfrag.c
@@ -18,10 +18,101 @@
#define NUM_MBUFS 128
#define BURST 32
+/* IP options */
+#define RTE_IPOPT_COPY 0x80
+#define RTE_IPOPT_CONTROL 0x00
+#define RTE_IPOPT_END (0 | RTE_IPOPT_CONTROL)
+#define RTE_IPOPT_NOOP (1 | RTE_IPOPT_CONTROL)
+#define RTE_IPOPT_COPIED(o) ((o) & RTE_IPOPT_COPY)
+#define RTE_IPOPT_MAX_LEN 40
+
+#define IPOPT_MANUAL
+
+#ifdef IPOPT_MANUAL
+uint8_t expected_first_frag_ipv4_opts[] = {
+ 0x07, 0x0b, 0x04, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x83,
+ 0x07, 0x04, 0xc0, 0xa8,
+ 0xe3, 0x96, 0x00, 0x00,
+};
+
+uint8_t expected_sub_frag_ipv4_opts[] = {
+ RTE_IPOPT_NOOP, RTE_IPOPT_NOOP, RTE_IPOPT_NOOP, RTE_IPOPT_NOOP,
+ RTE_IPOPT_NOOP, RTE_IPOPT_NOOP, RTE_IPOPT_NOOP, RTE_IPOPT_NOOP,
+ RTE_IPOPT_NOOP, RTE_IPOPT_NOOP, RTE_IPOPT_NOOP, 0x83,
+ 0x07, 0x04, 0xc0, 0xa8,
+ 0xe3, 0x96, 0x00, 0x00,
+};
+#else
+/**
+ * IPv4 Options
+ */
+struct test_ipv4_opt {
+ __extension__
+ union {
+ uint8_t type; /**< option type */
+ struct {
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+ uint8_t number:5; /**< option number */
+ uint8_t category:2; /**< option class */
+ uint8_t copied:1; /**< option copy flag */
+#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
+ uint8_t copied:1; /**< option copy flag */
+ uint8_t category:2; /**< option class */
+ uint8_t number:5; /**< option number */
+#endif
+ } s_type;
+ };
+ uint8_t length; /**< option length */
+ uint8_t pointer; /**< option pointer */
+ uint8_t data[37]; /**< option data */
+} __rte_packed;
+
+struct test_ipv4_opt test_ipv4_opts[] = {
+ {
+ .s_type.copied = 0,
+ .s_type.category = 0,
+ .s_type.number = 7,
+ .length = 11,
+ .pointer = 4,
+ },
+ {
+ .s_type.copied = 1,
+ .s_type.category = 0,
+ .s_type.number = 3,
+ .length = 7,
+ .pointer = 4,
+ .data[0] = 0xc0,
+ .data[1] = 0xa8,
+ .data[2] = 0xe3,
+ .data[3] = 0x96,
+ },
+};
+#endif
+
+struct test_opt_data {
+ bool is_first_frag; /**< offset is 0 */
+ uint16_t len; /**< option data len */
+ uint8_t data[RTE_IPOPT_MAX_LEN]; /**< option data */
+};
+
static struct rte_mempool *pkt_pool,
*direct_pool,
*indirect_pool;
+static inline void
+hex_to_str(uint8_t *hex, uint16_t len, char *str)
+{
+ int i;
+
+ for (i = 0; i < len; i++) {
+ sprintf(str, "%02x", hex[i]);
+ str += 2;
+ }
+ *str = 0;
+}
+
static int
setup_buf_pool(void)
{
@@ -88,23 +179,78 @@ static void ut_teardown(void)
{
}
+static inline void
+test_get_ipv4_opt(bool is_first_frag,
+ struct test_opt_data *expected_opt)
+{
+#ifdef IPOPT_MANUAL
+ if (is_first_frag) {
+ expected_opt->len = sizeof(expected_first_frag_ipv4_opts);
+ rte_memcpy(expected_opt->data, expected_first_frag_ipv4_opts,
+ sizeof(expected_first_frag_ipv4_opts));
+ } else {
+ expected_opt->len = sizeof(expected_sub_frag_ipv4_opts);
+ rte_memcpy(expected_opt->data, expected_sub_frag_ipv4_opts,
+ sizeof(expected_sub_frag_ipv4_opts));
+ }
+#else
+ uint16_t i;
+ uint16_t pos = 0;
+ expected_opt->len = 0;
+
+ for (i = 0; i < RTE_DIM(test_ipv4_opts); i++) {
+ if (unlikely(pos + test_ipv4_opts[i].length >
+ RTE_IPOPT_MAX_LEN))
+ return;
+
+ if (is_first_frag) {
+ rte_memcpy(expected_opt->data + pos, &test_ipv4_opts[i],
+ test_ipv4_opts[i].length);
+ } else {
+ if (test_ipv4_opts[i].s_type.copied)
+ rte_memcpy(expected_opt->data + pos,
+ &test_ipv4_opts[i],
+ test_ipv4_opts[i].length);
+ else
+ memset(expected_opt->data + pos, RTE_IPOPT_NOOP,
+ test_ipv4_opts[i].length);
+ }
+ expected_opt->len += test_ipv4_opts[i].length;
+ pos += test_ipv4_opts[i].length;
+ }
+
+ expected_opt->len = RTE_ALIGN_CEIL(expected_opt->len, 4);
+ memset(expected_opt->data + pos, RTE_IPOPT_END,
+ expected_opt->len - pos);
+#endif
+}
+
static void
-v4_allocate_packet_of(struct rte_mbuf *b, int fill,
- size_t s, int df, uint8_t mf, uint16_t off,
- uint8_t ttl, uint8_t proto, uint16_t pktid)
+v4_allocate_packet_of(struct rte_mbuf *b, int fill, size_t s,
+ int df, uint8_t mf, uint16_t off, uint8_t ttl, uint8_t proto,
+ uint16_t pktid, bool have_opt, bool is_first_frag)
{
/* Create a packet, 2k bytes long */
b->data_off = 0;
char *data = rte_pktmbuf_mtod(b, char *);
- rte_be16_t fragment_offset = 0; /**< fragmentation offset */
+ rte_be16_t fragment_offset = 0; /* fragmentation offset */
+ uint16_t iph_len;
+ struct test_opt_data opt;
- memset(data, fill, sizeof(struct rte_ipv4_hdr) + s);
+ opt.len = 0;
+
+ if (have_opt)
+ test_get_ipv4_opt(is_first_frag, &opt);
+
+ iph_len = sizeof(struct rte_ipv4_hdr) + opt.len;
+ memset(data, fill, iph_len + s);
struct rte_ipv4_hdr *hdr = (struct rte_ipv4_hdr *)data;
- hdr->version_ihl = 0x45; /* standard IP header... */
+ hdr->version_ihl = 0x40; /* ipv4 */
+ hdr->version_ihl += (iph_len / 4);
hdr->type_of_service = 0;
- b->pkt_len = s + sizeof(struct rte_ipv4_hdr);
+ b->pkt_len = s + iph_len;
b->data_len = b->pkt_len;
hdr->total_length = rte_cpu_to_be_16(b->pkt_len);
hdr->packet_id = rte_cpu_to_be_16(pktid);
@@ -131,6 +277,8 @@ static void ut_teardown(void)
hdr->hdr_checksum = 0;
hdr->src_addr = rte_cpu_to_be_32(0x8080808);
hdr->dst_addr = rte_cpu_to_be_32(0x8080404);
+
+ rte_memcpy(hdr + 1, opt.data, opt.len);
}
static void
@@ -187,6 +335,43 @@ static void ut_teardown(void)
}
}
+static inline void
+test_get_frag_opt(struct rte_mbuf **mb, int32_t num,
+ struct test_opt_data *opt, int ipv)
+{
+ int32_t i;
+
+ for (i = 0; i < num; i++) {
+ if (ipv == 4) {
+ struct rte_ipv4_hdr *iph =
+ rte_pktmbuf_mtod(mb[i], struct rte_ipv4_hdr *);
+ uint16_t header_len = (iph->version_ihl &
+ RTE_IPV4_HDR_IHL_MASK) *
+ RTE_IPV4_IHL_MULTIPLIER;
+ uint16_t opt_len = header_len -
+ sizeof(struct rte_ipv4_hdr);
+
+ if ((rte_be_to_cpu_16(iph->fragment_offset) &
+ RTE_IPV4_HDR_OFFSET_MASK) == 0)
+ opt->is_first_frag = true;
+ else
+ opt->is_first_frag = false;
+
+ if (opt_len && (opt_len <= RTE_IPOPT_MAX_LEN)) {
+ char *iph_opt = rte_pktmbuf_mtod_offset(mb[i],
+ char *, sizeof(struct rte_ipv4_hdr));
+ opt->len = opt_len;
+ rte_memcpy(opt->data, iph_opt, opt_len);
+ } else {
+ opt->len = RTE_IPOPT_MAX_LEN;
+ memset(opt->data, RTE_IPOPT_END,
+ sizeof(opt->data));
+ }
+ opt++;
+ }
+ }
+}
+
static int
test_ip_frag(void)
{
@@ -206,32 +391,43 @@ static void ut_teardown(void)
uint16_t pkt_id;
int expected_frags;
uint16_t expected_fragment_offset[BURST];
+ bool have_opt;
+ bool is_first_frag;
} tests[] = {
{4, 1280, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID, 2,
- {0x2000, 0x009D}},
+ {0x2000, 0x009D}, 0},
{4, 1280, 1400, 0, 0, 0, 64, IPPROTO_ICMP, 0, 2,
- {0x2000, 0x009D}},
+ {0x2000, 0x009D}, 0},
{4, 600, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID, 3,
- {0x2000, 0x2048, 0x0090}},
+ {0x2000, 0x2048, 0x0090}, 0},
{4, 4, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID, -EINVAL},
{4, 600, 1400, 1, 0, 0, 64, IPPROTO_ICMP, RND_ID, -ENOTSUP},
{4, 600, 1400, 0, 0, 0, 0, IPPROTO_ICMP, RND_ID, 3,
- {0x2000, 0x2048, 0x0090}},
- {4, 68, 104, 0, 1, 13, 0, IPPROTO_ICMP, RND_ID, 3,
- {0x200D, 0x2013, 0x2019}},
+ {0x2000, 0x2046, 0x008C}, 1, 1},
+ /* The first fragment */
+ {4, 68, 104, 0, 1, 0, 0, IPPROTO_ICMP, RND_ID, 5,
+ {0x2000, 0x2003, 0x2006, 0x2009, 0x200C}, 1, 1},
+ /* The middle fragment */
+ {4, 68, 104, 0, 1, 13, 0, IPPROTO_ICMP, RND_ID, 5,
+ {0x200D, 0x2010, 0x2013, 0x2016, 0x2019}, 1, 0},
+ /* The last fragment */
+ {4, 68, 104, 0, 0, 26, 0, IPPROTO_ICMP, RND_ID, 5,
+ {0x201A, 0x201D, 0x2020, 0x2023, 0x0026}, 1, 0},
{6, 1280, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID, 2,
- {0x0001, 0x04D0}},
+ {0x0001, 0x04D0}, 0},
{6, 1300, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID, 2,
- {0x0001, 0x04E0}},
+ {0x0001, 0x04E0}, 0},
{6, 4, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID, -EINVAL},
{6, 1300, 1400, 0, 0, 0, 0, IPPROTO_ICMP, RND_ID, 2,
- {0x0001, 0x04E0}},
+ {0x0001, 0x04E0}, 0},
};
for (i = 0; i < RTE_DIM(tests); i++) {
int32_t len = 0;
uint16_t fragment_offset[BURST];
+ struct test_opt_data opt_res[BURST];
+ struct test_opt_data opt_exp;
uint16_t pktid = tests[i].pkt_id;
struct rte_mbuf *pkts_out[BURST];
struct rte_mbuf *b = rte_pktmbuf_alloc(pkt_pool);
@@ -250,7 +446,9 @@ static void ut_teardown(void)
tests[i].set_of,
tests[i].ttl,
tests[i].proto,
- pktid);
+ pktid,
+ tests[i].have_opt,
+ tests[i].is_first_frag);
} else if (tests[i].ipv == 6) {
v6_allocate_packet_of(b, 0x41414141,
tests[i].pkt_size,
@@ -275,17 +473,21 @@ static void ut_teardown(void)
if (len > 0) {
test_get_offset(pkts_out, len,
fragment_offset, tests[i].ipv);
+ if (tests[i].have_opt)
+ test_get_frag_opt(pkts_out, len,
+ opt_res,
+ tests[i].ipv);
test_free_fragments(pkts_out, len);
}
- printf("%zd: checking %d with %d\n", i, len,
+ printf("[check frag number]%zd: checking %d with %d\n", i, len,
tests[i].expected_frags);
RTE_TEST_ASSERT_EQUAL(len, tests[i].expected_frags,
"Failed case %zd.\n", i);
if (len > 0) {
for (j = 0; j < (size_t)len; j++) {
- printf("%zd-%zd: checking %d with %d\n",
+ printf("[check offset]%zd-%zd: checking %d with %d\n",
i, j, fragment_offset[j],
rte_cpu_to_be_16(
tests[i].expected_fragment_offset[j]));
@@ -294,6 +496,35 @@ static void ut_teardown(void)
tests[i].expected_fragment_offset[j]),
"Failed case %zd.\n", i);
}
+
+ if (tests[i].have_opt && (tests[i].ipv == 4)) {
+ for (j = 0; j < (size_t)len; j++) {
+ char opt_res_str[2 *
+ RTE_IPOPT_MAX_LEN + 1];
+ char opt_exp_str[2 *
+ RTE_IPOPT_MAX_LEN + 1];
+
+ test_get_ipv4_opt(
+ opt_res[j].is_first_frag,
+ &opt_exp);
+ hex_to_str(opt_res[j].data,
+ opt_res[j].len,
+ opt_res_str);
+ hex_to_str(opt_exp.data,
+ opt_exp.len,
+ opt_exp_str);
+
+ printf(
+ "[check ipv4 option]%zd-%zd: checking (%u)%s with (%u)%s\n",
+ i, j,
+ opt_res[j].len, opt_res_str,
+ opt_exp.len, opt_exp_str);
+ RTE_TEST_ASSERT_SUCCESS(
+ strcmp(opt_res_str,
+ opt_exp_str),
+ "Failed case %zd.\n", i);
+ }
+ }
}
}
diff --git a/lib/ip_frag/rte_ipv4_fragmentation.c b/lib/ip_frag/rte_ipv4_fragmentation.c
index 2e7739d..bcafa29 100644
--- a/lib/ip_frag/rte_ipv4_fragmentation.c
+++ b/lib/ip_frag/rte_ipv4_fragmentation.c
@@ -1,4 +1,4 @@
-/* SPDX-License-Identifier: BSD-3-Clause
+/* SPDX-License-Identifier: (BSD-3-Clause OR GPL-2.0)
* Copyright(c) 2010-2014 Intel Corporation
*/
@@ -12,6 +12,13 @@
#include "ip_frag_common.h"
+/* IP options */
+#define RTE_IPOPT_COPY 0x80
+#define RTE_IPOPT_CONTROL 0x00
+#define RTE_IPOPT_END (0 | RTE_IPOPT_CONTROL)
+#define RTE_IPOPT_NOOP (1 | RTE_IPOPT_CONTROL)
+#define RTE_IPOPT_COPIED(o) ((o) & RTE_IPOPT_COPY)
+
/* Fragment Offset */
#define RTE_IPV4_HDR_DF_SHIFT 14
#define RTE_IPV4_HDR_MF_SHIFT 13
@@ -41,6 +48,38 @@ static inline void __free_fragments(struct rte_mbuf *mb[], uint32_t num)
rte_pktmbuf_free(mb[i]);
}
+/*
+ * Options "fragmenting", just fill options not
+ * allowed in fragments with NOOPs.
+ * Simple and stupid 8), but the most efficient way.
+ */
+static inline void ip_options_fragment(struct rte_ipv4_hdr *iph)
+{
+ unsigned char *optptr = (unsigned char *)iph +
+ sizeof(struct rte_ipv4_hdr);
+ int l = (iph->version_ihl & RTE_IPV4_HDR_IHL_MASK) *
+ RTE_IPV4_IHL_MULTIPLIER - sizeof(struct rte_ipv4_hdr);
+ int optlen;
+
+ while (l > 0) {
+ switch (*optptr) {
+ case RTE_IPOPT_END:
+ return;
+ case RTE_IPOPT_NOOP:
+ l--;
+ optptr++;
+ continue;
+ }
+ optlen = optptr[1];
+ if (optlen < 2 || optlen > l)
+ return;
+ if (!RTE_IPOPT_COPIED(*optptr))
+ memset(optptr, RTE_IPOPT_NOOP, optlen);
+ l -= optlen;
+ optptr += optlen;
+ }
+}
+
/**
* IPv4 fragmentation.
*
@@ -188,6 +227,17 @@ static inline void __free_fragments(struct rte_mbuf *mb[], uint32_t num)
(uint16_t)out_pkt->pkt_len,
flag_offset, fragment_offset, more_in_segs);
+ /*
+ * ANK: dirty, but effective trick. Upgrade options only if
+ * the segment to be fragmented was THE FIRST (otherwise,
+ * options are already fixed) and make it ONCE
+ * on the initial mbuf, so that all the following fragments
+ * will inherit fixed options.
+ */
+ if ((fragment_offset == 0) &&
+ ((flag_offset & RTE_IPV4_HDR_OFFSET_MASK) == 0))
+ ip_options_fragment(in_hdr);
+
fragment_offset = (uint16_t)(fragment_offset +
out_pkt->pkt_len - header_len);
--
1.8.3.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] ip_frag: add IPv4 options fragment and unit test data
2021-11-24 8:47 Huichao Cai
@ 2021-12-01 11:49 ` Dariusz Sosnowski
0 siblings, 0 replies; 9+ messages in thread
From: Dariusz Sosnowski @ 2021-12-01 11:49 UTC (permalink / raw)
To: Huichao Cai; +Cc: konstantin.ananyev, dev
On Wed, 24 Nov 2021 16:47:06 +0800, Huichao Cai wrote:
> +/*
> + * Options "fragmenting", just fill options not
> + * allowed in fragments with NOOPs.
> + * Simple and stupid 8), but the most efficient way.
> + */
> +static inline void ip_options_fragment(struct rte_ipv4_hdr *iph)
> +{
> + unsigned char *optptr = (unsigned char *)iph +
> + sizeof(struct rte_ipv4_hdr);
> + int l = (iph->version_ihl & RTE_IPV4_HDR_IHL_MASK) *
> + RTE_IPV4_IHL_MULTIPLIER - sizeof(struct rte_ipv4_hdr);
> + int optlen;
> +
> + while (l > 0) {
> + switch (*optptr) {
> + case RTE_IPOPT_END:
> + return;
> + case RTE_IPOPT_NOOP:
> + l--;
> + optptr++;
> + continue;
> + }
> + optlen = optptr[1];
> + if (optlen < 2 || optlen > l)
> + return;
> + if (!RTE_IPOPT_COPIED(*optptr))
> + memset(optptr, RTE_IPOPT_NOOP, optlen);
> + l -= optlen;
> + optptr += optlen;
> + }
> +}
> +
I have a few concerns regarding this implementation:
- Any IPv4 option longer than 2 bytes with copied flag unset, will not be substituted by NOOP option. In effect it will be copied to all fragments.
- Substituting options with NOOP might cause rte_ipv4_fragment_packet to produce more fragments than necessary, since options with copied flag unset will still occupy space in IPv4 header.
It would require some benchmarking, but maybe a better solution would be to prepare a separate IPv4 header for fragments without unnecessary options.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2022-02-11 10:11 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-10 12:21 [PATCH] ip_frag: add IPv4 options fragment and unit test data Ananyev, Konstantin
2022-02-11 2:12 ` Huichao Cai
2022-02-11 9:41 ` Ananyev, Konstantin
2022-02-11 10:00 ` Huichao Cai
2022-02-11 2:20 ` Huichao Cai
2022-02-11 10:11 ` Ferruh Yigit
-- strict thread matches above, loose matches on Subject: below --
2021-12-02 9:35 Dariusz Sosnowski
2021-11-24 8:47 Huichao Cai
2021-12-01 11:49 ` Dariusz Sosnowski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).