From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: Yicai Lu <luyicai@huawei.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "zhoujingbin@huawei.com" <zhoujingbin@huawei.com>,
"chenchanghu@huawei.com" <chenchanghu@huawei.com>,
"jerry.lilijun@huawei.com" <jerry.lilijun@huawei.com>,
"haifeng.lin@huawei.com" <haifeng.lin@huawei.com>,
"guohongzhi1@huawei.com" <guohongzhi1@huawei.com>,
"wangyunjian@huawei.com" <wangyunjian@huawei.com>,
"stable@dpdk.org" <stable@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v6] ip_frag: remove padding length of fragment
Date: Fri, 18 Dec 2020 11:41:33 +0000 [thread overview]
Message-ID: <BYAPR11MB3301E8D991652111ACCD32CE9AC30@BYAPR11MB3301.namprd11.prod.outlook.com> (raw)
In-Reply-To: <1608125790-26496-1-git-send-email-luyicai@huawei.com>
>
> In some situations, we would get several ip fragments, which total
> data length is less than min_ip_len(64) and padding with zeros.
> We simulated intermediate fragments by modifying the MTU.
> To illustrate the problem, we simplify the packet format and
> ignore the impact of the packet header.In namespace2,
> a packet whose data length is 1520 is sent.
> When the packet passes tap2, the packet is divided into two
> fragments: fragment A and B, similar to (1520 = 1510 + 10).
> When the packet passes tap3, the larger fragment packet A is
> divided into two fragments A1 and A2, similar to (1510 = 1500 + 10).
> Finally, the bond interface receives three fragments:
> A1, A2, and B (1520 = 1500 + 10 + 10).
> One fragmented packet A2 is smaller than the minimum Ethernet
> frame length, so it needs to be padded.
>
> |---------------------------------------------------|
> | HOST |
> | |--------------| |----------------------------| |
> | | ns2 | | |--------------| | |
> | | |--------| | | |--------| |--------| | |
> | | | tap1 | | | | tap2 | ns1| tap3 | | |
> | | |mtu=1510| | | |mtu=1510| |mtu=1500| | |
> | |--|1.1.1.1 |--| |--|1.1.1.2 |----|2.1.1.1 |--| |
> | |--------| |--------| |--------| |
> | | | | |
> | |-----------------| | |
> | | |
> | |--------| |
> | | bond | |
> |--------------------------------------|mtu=1500|---|
> |--------|
>
> When processing the preceding packets above,
> DPDK would aggregate fragmented packets A2 and B.
> And error packets are generated, which padding(zero)
> is displayed in the middle of the packet.
>
> A2 + B:
> 0000 fa 16 3e 9f fb 82 fa 47 b2 57 dc 20 08 00 45 00
> 0010 00 33 b4 66 00 ba 3f 01 c1 a5 01 01 01 01 02 01
> 0020 01 02 c0 c1 c2 c3 c4 c5 c6 c7 00 00 00 00 00 00
> 0030 00 00 00 00 00 00 00 00 00 00 00 00 c8 c9 ca cb
> 0040 cc cd ce cf d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db
> 0050 dc dd de df e0 e1 e2 e3 e4 e5 e6
>
> So, we would calculate the length of padding, and remove
> the padding in pkt_len and data_len before aggregation.
> And also we have the fix for both ipv4 and ipv6.
>
> Fixes: 7f0983ee331c ("ip_frag: check fragment length of incoming packet")
> Cc: stable@dpdk.org
>
> Signed-off-by: Yicai Lu <luyicai@huawei.com>
> ---
> v5 -> v6: Have the fix for ipv6.
> ---
> lib/librte_ip_frag/rte_ipv4_reassembly.c | 11 ++++++++---
> lib/librte_ip_frag/rte_ipv6_reassembly.c | 9 +++++++--
> 2 files changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c b/lib/librte_ip_frag/rte_ipv4_reassembly.c
> index 1dda8ac..69666c8 100644
> --- a/lib/librte_ip_frag/rte_ipv4_reassembly.c
> +++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c
> @@ -104,6 +104,7 @@ struct rte_mbuf *
> const unaligned_uint64_t *psd;
> uint16_t flag_offset, ip_ofs, ip_flag;
> int32_t ip_len;
> + int32_t trim;
>
> flag_offset = rte_be_to_cpu_16(ip_hdr->fragment_offset);
> ip_ofs = (uint16_t)(flag_offset & RTE_IPV4_HDR_OFFSET_MASK);
> @@ -117,14 +118,15 @@ struct rte_mbuf *
>
> ip_ofs *= RTE_IPV4_HDR_OFFSET_UNITS;
> ip_len = rte_be_to_cpu_16(ip_hdr->total_length) - mb->l3_len;
> + trim = mb->pkt_len - (ip_len + mb->l3_len + mb->l2_len);
>
> IP_FRAG_LOG(DEBUG, "%s:%d:\n"
> - "mbuf: %p, tms: %" PRIu64
> - ", key: <%" PRIx64 ", %#x>, ofs: %u, len: %d, flags: %#x\n"
> + "mbuf: %p, tms: %" PRIu64 ", key: <%" PRIx64 ", %#x>"
> + "ofs: %u, len: %d, padding: %d, flags: %#x\n"
> "tbl: %p, max_cycles: %" PRIu64 ", entry_mask: %#x, "
> "max_entries: %u, use_entries: %u\n\n",
> __func__, __LINE__,
> - mb, tms, key.src_dst[0], key.id, ip_ofs, ip_len, ip_flag,
> + mb, tms, key.src_dst[0], key.id, ip_ofs, ip_len, trim, ip_flag,
> tbl, tbl->max_cycles, tbl->entry_mask, tbl->max_entries,
> tbl->use_entries);
>
> @@ -134,6 +136,9 @@ struct rte_mbuf *
> return NULL;
> }
>
> + if (unlikely(trim > 0))
> + rte_pktmbuf_trim(mb, trim);
> +
> /* try to find/add entry into the fragment's table. */
> if ((fp = ip_frag_find(tbl, dr, &key, tms)) == NULL) {
> IP_FRAG_MBUF2DR(dr, mb);
> diff --git a/lib/librte_ip_frag/rte_ipv6_reassembly.c b/lib/librte_ip_frag/rte_ipv6_reassembly.c
> index ad01055..4037da0 100644
> --- a/lib/librte_ip_frag/rte_ipv6_reassembly.c
> +++ b/lib/librte_ip_frag/rte_ipv6_reassembly.c
> @@ -142,6 +142,7 @@ struct rte_mbuf *
> struct ip_frag_key key;
> uint16_t ip_ofs;
> int32_t ip_len;
> + int32_t trim;
>
> rte_memcpy(&key.src_dst[0], ip_hdr->src_addr, 16);
> rte_memcpy(&key.src_dst[2], ip_hdr->dst_addr, 16);
> @@ -158,16 +159,17 @@ struct rte_mbuf *
> * this is what we remove from the payload len.
> */
> ip_len = rte_be_to_cpu_16(ip_hdr->payload_len) - sizeof(*frag_hdr);
> + trim = mb->pkt_len - (ip_len + mb->l3_len + mb->l2_len);
>
> IP_FRAG_LOG(DEBUG, "%s:%d:\n"
> "mbuf: %p, tms: %" PRIu64
> ", key: <" IPv6_KEY_BYTES_FMT ", %#x>, "
> - "ofs: %u, len: %d, flags: %#x\n"
> + "ofs: %u, len: %d, padding: %d, flags: %#x\n"
> "tbl: %p, max_cycles: %" PRIu64 ", entry_mask: %#x, "
> "max_entries: %u, use_entries: %u\n\n",
> __func__, __LINE__,
> mb, tms, IPv6_KEY_BYTES(key.src_dst), key.id, ip_ofs, ip_len,
> - RTE_IPV6_GET_MF(frag_hdr->frag_data),
> + trim, RTE_IPV6_GET_MF(frag_hdr->frag_data),
> tbl, tbl->max_cycles, tbl->entry_mask, tbl->max_entries,
> tbl->use_entries);
>
> @@ -177,6 +179,9 @@ struct rte_mbuf *
> return NULL;
> }
>
> + if (unlikely(trim > 0))
> + rte_pktmbuf_trim(mb, trim);
> +
> /* try to find/add entry into the fragment's table. */
> fp = ip_frag_find(tbl, dr, &key, tms);
> if (fp == NULL) {
> --
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> 1.9.5.msysgit.1
next prev parent reply other threads:[~2020-12-18 11:41 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-18 13:29 [dpdk-dev] [PATCH] ip_frag: recalculate data " Yicai Lu
2020-11-22 11:22 ` Thomas Monjalon
2020-11-23 2:27 ` [dpdk-dev] [PATCH v2] " Yicai Lu
2020-12-12 11:05 ` [dpdk-dev] [PATCH v5] ip_frag: remove padding " Yicai Lu
2020-12-14 14:44 ` Ananyev, Konstantin
2020-12-15 3:18 ` luyicai
2020-12-16 10:44 ` Ananyev, Konstantin
2020-12-16 10:54 ` luyicai
2020-12-16 13:36 ` [dpdk-dev] [PATCH v6] " Yicai Lu
2020-12-18 11:41 ` Ananyev, Konstantin [this message]
2021-01-15 10:29 ` [dpdk-dev] [dpdk-stable] " Thomas Monjalon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=BYAPR11MB3301E8D991652111ACCD32CE9AC30@BYAPR11MB3301.namprd11.prod.outlook.com \
--to=konstantin.ananyev@intel.com \
--cc=chenchanghu@huawei.com \
--cc=dev@dpdk.org \
--cc=guohongzhi1@huawei.com \
--cc=haifeng.lin@huawei.com \
--cc=jerry.lilijun@huawei.com \
--cc=luyicai@huawei.com \
--cc=stable@dpdk.org \
--cc=wangyunjian@huawei.com \
--cc=zhoujingbin@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).