From: Bruce Richardson <bruce.richardson@intel.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "Tao, Zhe" <zhe.tao@intel.com>, "dev@dpdk.org" <dev@dpdk.org>,
"Wu, Jingjing" <jingjing.wu@intel.com>
Subject: Re: [dpdk-dev] [PATCH v2] i40: fix the VXLAN TSO issue
Date: Fri, 15 Jul 2016 16:40:16 +0100
Message-ID: <20160715154015.GA51144@bricha3-MOBL3>
In-Reply-To: <2601191342CEEE43887BDE71AB97725836B7B3ED@irsmsx105.ger.corp.intel.com>
Since we are now heading to RC3 for 16.07 and quite a number of open
comments on this patch remain unresolved, it is going to be deferred until
16.11, when it can get more review and discussion.
/Bruce
On Thu, Jul 07, 2016 at 12:24:43PM +0000, Ananyev, Konstantin wrote:
>
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> > Sent: Thursday, July 07, 2016 11:51 AM
> > To: Tao, Zhe; dev@dpdk.org
> > Cc: Tao, Zhe; Wu, Jingjing
> > Subject: Re: [dpdk-dev] [PATCH v2] i40: fix the VXLAN TSO issue
> >
> >
> > Hi Tao,
> >
> > Sorry, I hit the send button too early by accident :)
> >
> > > >
> > > > Problem:
> > > > When using the TSO + VXLAN feature in i40e, the outer UDP length fields in
> > > > the multiple UDP segments which are TSOed by the i40e will have the
> > > > wrong value.
> > > >
> > > > Fix this problem by adding the tunnel type field in the i40e descriptor
> > > > which was missed before.
> > > >
> > > > Fixes: 77b8301733c3 ("i40e: VXLAN Tx checksum offload")
> > > >
> > > > Signed-off-by: Zhe Tao <zhe.tao@intel.com>
> > > > ---
> > > > V2: Edited some comments for mbuf structure and i40e driver.
> > > >
> > > > app/test-pmd/csumonly.c | 26 +++++++++++++++++++-------
> > > > drivers/net/i40e/i40e_rxtx.c | 12 +++++++++---
> > > > lib/librte_mbuf/rte_mbuf.h | 16 +++++++++++++++-
> > > > 3 files changed, 43 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
> > > > index ac4bd8f..d423c20 100644
> > > > --- a/app/test-pmd/csumonly.c
> > > > +++ b/app/test-pmd/csumonly.c
> > > > @@ -204,7 +204,8 @@ parse_ethernet(struct ether_hdr *eth_hdr, struct testpmd_offload_info *info)
> > > > static void
> > > > parse_vxlan(struct udp_hdr *udp_hdr,
> > > > struct testpmd_offload_info *info,
> > > > - uint32_t pkt_type)
> > > > + uint32_t pkt_type,
> > > > + uint64_t *ol_flags)
> > > > {
> > > > struct ether_hdr *eth_hdr;
> > > >
> > > > @@ -215,6 +216,7 @@ parse_vxlan(struct udp_hdr *udp_hdr,
> > > > RTE_ETH_IS_TUNNEL_PKT(pkt_type) == 0)
> > > > return;
> > > >
> > > > + *ol_flags |= PKT_TX_TUNNEL_VXLAN;
>
> Do we always have to set the tunnelling flags here?
> That would mean an extra context descriptor (CTD) per packet and might
> slow things down.
> In fact, I think the current patch wouldn't work correctly if
> TESTPMD_TX_OFFLOAD_OUTER_IP_CKSUM is not set.
> So, can we do it only when TSO or outer IP checksum is enabled?
>
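To make the suggestion above concrete, here is a minimal sketch of gating the tunnel flag on the two offloads mentioned. It is not taken from any posted version of the patch: the `sketch_`/`SKETCH_` names are hypothetical, the `0x0020` value is a placeholder standing in for TESTPMD_TX_OFFLOAD_OUTER_IP_CKSUM, and "TSO enabled" is assumed to mean a non-zero segment size.

```c
#include <stdint.h>

/* Tunnel type value as defined in the quoted rte_mbuf.h hunk. */
#define SKETCH_TX_TUNNEL_VXLAN   (1ULL << 45)
/* Placeholder bit, standing in for TESTPMD_TX_OFFLOAD_OUTER_IP_CKSUM. */
#define SKETCH_OL_OUTER_IP_CKSUM 0x0020u

/*
 * Return the VXLAN tunnel flag only when an offload that needs it is
 * requested, so that plain forwarding does not pay for an extra
 * context descriptor per packet.
 */
static uint64_t
sketch_vxlan_tunnel_flag(uint16_t testpmd_ol_flags, uint16_t tso_segsz)
{
	/* Assumption: a non-zero segment size means TSO is requested. */
	if (tso_segsz != 0 ||
	    (testpmd_ol_flags & SKETCH_OL_OUTER_IP_CKSUM) != 0)
		return SKETCH_TX_TUNNEL_VXLAN;

	return 0;
}
```

With something like this, parse_vxlan() would OR the returned value into *ol_flags instead of setting PKT_TX_TUNNEL_VXLAN unconditionally.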
> > > > info->is_tunnel = 1;
> > > > info->outer_ethertype = info->ethertype;
> > > > info->outer_l2_len = info->l2_len;
> > > > @@ -231,7 +233,9 @@ parse_vxlan(struct udp_hdr *udp_hdr,
> > > >
> > > > /* Parse a gre header */
> > > > static void
> > > > -parse_gre(struct simple_gre_hdr *gre_hdr, struct testpmd_offload_info *info)
> > > > +parse_gre(struct simple_gre_hdr *gre_hdr,
> > > > + struct testpmd_offload_info *info,
> > > > + uint64_t *ol_flags)
> > > > {
> > > > struct ether_hdr *eth_hdr;
> > > > struct ipv4_hdr *ipv4_hdr;
> > > > @@ -242,6 +246,8 @@ parse_gre(struct simple_gre_hdr *gre_hdr, struct testpmd_offload_info *info)
> > > > if ((gre_hdr->flags & _htons(~GRE_SUPPORTED_FIELDS)) != 0)
> > > > return;
> > > >
> > > > + *ol_flags |= PKT_TX_TUNNEL_GRE;
> > > > +
> > > > gre_len += sizeof(struct simple_gre_hdr);
> > > >
> > > > if (gre_hdr->flags & _htons(GRE_KEY_PRESENT))
> > > > @@ -417,7 +423,7 @@ process_inner_cksums(void *l3_hdr, const struct testpmd_offload_info *info,
> > > > * packet */
> > > > static uint64_t
> > > > process_outer_cksums(void *outer_l3_hdr, struct testpmd_offload_info *info,
> > > > - uint16_t testpmd_ol_flags)
> > > > + uint16_t testpmd_ol_flags, uint64_t orig_ol_flags)
> > > > {
> > > > struct ipv4_hdr *ipv4_hdr = outer_l3_hdr;
> > > > struct ipv6_hdr *ipv6_hdr = outer_l3_hdr;
> > > > @@ -442,6 +448,9 @@ process_outer_cksums(void *outer_l3_hdr, struct testpmd_offload_info *info,
> > > > * hardware supporting it today, and no API for it. */
> > > >
> > > > udp_hdr = (struct udp_hdr *)((char *)outer_l3_hdr + info->outer_l3_len);
> > > > + if ((orig_ol_flags & PKT_TX_TCP_SEG) &&
> > > > + ((orig_ol_flags & PKT_TX_TUNNEL_MASK) == PKT_TX_TUNNEL_VXLAN))
> > > > + udp_hdr->dgram_cksum = 0;
> > > > /* do not recalculate udp cksum if it was 0 */
> > > > if (udp_hdr->dgram_cksum != 0) {
> > > > udp_hdr->dgram_cksum = 0;
> > > > @@ -705,15 +714,18 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
> > > > if (info.l4_proto == IPPROTO_UDP) {
> > > > struct udp_hdr *udp_hdr;
> > > > udp_hdr = (struct udp_hdr *)((char *)l3_hdr +
> > > > - info.l3_len);
> > > > - parse_vxlan(udp_hdr, &info, m->packet_type);
> > > > + info.l3_len);
> > > > + parse_vxlan(udp_hdr, &info, m->packet_type,
> > > > + &ol_flags);
> > > > } else if (info.l4_proto == IPPROTO_GRE) {
> > > > struct simple_gre_hdr *gre_hdr;
> > > > gre_hdr = (struct simple_gre_hdr *)
> > > > ((char *)l3_hdr + info.l3_len);
> > > > - parse_gre(gre_hdr, &info);
> > > > + parse_gre(gre_hdr, &info, &ol_flags);
> > > > } else if (info.l4_proto == IPPROTO_IPIP) {
> > > > void *encap_ip_hdr;
> > > > +
> > > > + ol_flags |= PKT_TX_TUNNEL_IPIP;
> > > > encap_ip_hdr = (char *)l3_hdr + info.l3_len;
> > > > parse_encap_ip(encap_ip_hdr, &info);
> > > > }
> > > > @@ -745,7 +757,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
> > > > * processed in hardware. */
> > > > if (info.is_tunnel == 1) {
> > > > ol_flags |= process_outer_cksums(outer_l3_hdr, &info,
> > > > - testpmd_ol_flags);
> > > > + testpmd_ol_flags, ol_flags);
> > > > }
> > > >
> > > > /* step 4: fill the mbuf meta data (flags and header lengths) */
> > > > diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> > > > index 049a813..4c987f2 100644
> > > > --- a/drivers/net/i40e/i40e_rxtx.c
> > > > +++ b/drivers/net/i40e/i40e_rxtx.c
> > > > @@ -801,6 +801,12 @@ i40e_txd_enable_checksum(uint64_t ol_flags,
> > > > union i40e_tx_offload tx_offload,
> > > > uint32_t *cd_tunneling)
> > > > {
> > > > + /* Tx pkts tunnel type*/
> > > > + if ((ol_flags & PKT_TX_TUNNEL_MASK) == PKT_TX_TUNNEL_VXLAN)
> > > > + *cd_tunneling |= I40E_TXD_CTX_UDP_TUNNELING;
> > > > + else if ((ol_flags & PKT_TX_TUNNEL_MASK) == PKT_TX_TUNNEL_GRE)
> > > > + *cd_tunneling |= I40E_TXD_CTX_GRE_TUNNELING;
> > > > +
> >
> > As I understand it, this fix is needed to enable TSO for tunnelling packets, correct?
> > In that case, should we also set up EIPLEN, regardless of whether
> > PKT_TX_OUTER_IP_CKSUM is on or off?
>
> In fact, it seems we always have to set up all three (EIPLEN, MACLEN and
> L4TUNLEN) when L4TUNT != 0.
>
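A rough sketch of that point: whenever a tunnel type is programmed, the length fields are filled in too, independent of the outer IP checksum flag. The field positions and units below are assumptions for illustration only (the driver has its own I40E_TXD_CTX_QW0_* definitions), the `sketch_`/`SKETCH_` names are hypothetical, and MACLEN is left out to keep the example to a single descriptor word.

```c
#include <stdint.h>

/* Assumed field positions, for illustration only. */
#define SKETCH_EIPLEN_SHIFT   2
#define SKETCH_L4TUNT_SHIFT   9
#define SKETCH_L4TUNLEN_SHIFT 12
#define SKETCH_L4TUNT_UDP     (1u << SKETCH_L4TUNT_SHIFT)

/*
 * Build the tunneling word of a context descriptor. When l4tunt is
 * non-zero the length fields are always filled in; when it is zero
 * nothing is programmed.
 */
static uint32_t
sketch_fill_tunneling(uint32_t l4tunt, uint16_t outer_l3_len, uint16_t l2_len)
{
	uint32_t cd_tunneling = l4tunt;

	if (l4tunt == 0)
		return 0;

	/* Assumption: EIPLEN is expressed in 4-byte units. */
	cd_tunneling |= (uint32_t)(outer_l3_len >> 2) << SKETCH_EIPLEN_SHIFT;
	/* Assumption: L4TUNLEN is expressed in 2-byte units. */
	cd_tunneling |= (uint32_t)(l2_len >> 1) << SKETCH_L4TUNLEN_SHIFT;

	return cd_tunneling;
}
```

For a VXLAN packet with a 20-byte outer IPv4 header, l2_len would be 30 under the definition proposed in the rte_mbuf.h hunk (8-byte outer UDP + 8-byte VXLAN + 14-byte inner Ethernet).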
> >
> > > > /* UDP tunneling packet TX checksum offload */
> > > > if (ol_flags & PKT_TX_OUTER_IP_CKSUM) {
> > > >
> > > > @@ -1510,7 +1516,8 @@ i40e_calc_context_desc(uint64_t flags)
> > > >
> > > > /* set i40e TSO context descriptor */
> > > > static inline uint64_t
> > > > -i40e_set_tso_ctx(struct rte_mbuf *mbuf, union i40e_tx_offload tx_offload)
> > > > +i40e_set_tso_ctx(struct rte_mbuf *mbuf,
> > > > + union i40e_tx_offload tx_offload)
> > > > {
> > > > uint64_t ctx_desc = 0;
> > > > uint32_t cd_cmd, hdr_len, cd_tso_len;
> > > > @@ -1521,7 +1528,7 @@ i40e_set_tso_ctx(struct rte_mbuf *mbuf, union i40e_tx_offload tx_offload)
> > > > }
> > > >
> > > > /**
> > > > - * in case of tunneling packet, the outer_l2_len and
> > > > + * in case of non tunneling packet, the outer_l2_len and
> > > > * outer_l3_len must be 0.
> > > > */
> > > > hdr_len = tx_offload.outer_l2_len +
> > > > @@ -1537,7 +1544,6 @@ i40e_set_tso_ctx(struct rte_mbuf *mbuf, union i40e_tx_offload tx_offload)
> > > > I40E_TXD_CTX_QW1_TSO_LEN_SHIFT) |
> > > > ((uint64_t)mbuf->tso_segsz <<
> > > > I40E_TXD_CTX_QW1_MSS_SHIFT);
> > > > -
> > > > return ctx_desc;
> > > > }
> > > >
> > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > index 15e3a10..8eb0d33 100644
> > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > @@ -133,6 +133,17 @@ extern "C" {
> > > > /* add new TX flags here */
> > > >
> > > > /**
> > > > + * Bits 45:48 used for the tunnel type.
> > > > + * When doing Tx offload like TSO or checksum, the HW needs to configure the
> > > > + * tunnel type into the HW descriptors.
> > > > + */
> > > > +#define PKT_TX_TUNNEL_VXLAN (1ULL << 45)
> > > > +#define PKT_TX_TUNNEL_GRE (2ULL << 45)
> > > > +#define PKT_TX_TUNNEL_IPIP (3ULL << 45)
> > > > +/* add new TX TUNNEL type here */
> > > > +#define PKT_TX_TUNNEL_MASK (0xFULL << 45)
> > > > +
> > > > +/**
> > > > * Second VLAN insertion (QinQ) flag.
> > > > */
> > > > #define PKT_TX_QINQ_PKT (1ULL << 49) /**< TX packet with double VLAN inserted. */
> > > > @@ -867,7 +878,10 @@ struct rte_mbuf {
> > > > union {
> > > > uint64_t tx_offload; /**< combined for easy fetch */
> > > > struct {
> > > > - uint64_t l2_len:7; /**< L2 (MAC) Header Length. */
> > > > + /* L2 (MAC) Header Length if it is not a tunneling pkt.
> > > > + * for tunnel it is outer L4 len+tunnel len+inner L2 len
> > > > + */
> >
> > As a nit: that doesn't look like a doxygen-style comment to me.
> > Konstantin
> >
> > > > + uint64_t l2_len:7;
> > > > uint64_t l3_len:9; /**< L3 (IP) Header Length. */
> > > > uint64_t l4_len:8; /**< L4 (TCP/UDP) Header Length. */
> > > > uint64_t tso_segsz:16; /**< TCP TSO segment size */
> > > > --
> > > > 2.1.4
>
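One detail of the quoted patch that may be worth spelling out: the tunnel type is a multi-bit field, not a set of independent flag bits, which is why the patch compares against PKT_TX_TUNNEL_MASK rather than testing a single bit. The sketch below copies the four definitions from the quoted rte_mbuf.h hunk; the two helper functions are hypothetical and exist only to show the difference.

```c
#include <stdint.h>

/* Values copied from the quoted rte_mbuf.h hunk. */
#define PKT_TX_TUNNEL_VXLAN (1ULL << 45)
#define PKT_TX_TUNNEL_GRE   (2ULL << 45)
#define PKT_TX_TUNNEL_IPIP  (3ULL << 45)
#define PKT_TX_TUNNEL_MASK  (0xFULL << 45)

/* Correct test: extract the whole field, then compare. */
static int
sketch_is_vxlan(uint64_t ol_flags)
{
	return (ol_flags & PKT_TX_TUNNEL_MASK) == PKT_TX_TUNNEL_VXLAN;
}

/* Naive single-bit test: also matches IPIP, since 3 == (1 | 2). */
static int
sketch_is_vxlan_naive(uint64_t ol_flags)
{
	return (ol_flags & PKT_TX_TUNNEL_VXLAN) != 0;
}
```

An IPIP packet passes the naive test but not the masked one, so any driver consuming these flags needs the masked comparison used in i40e_txd_enable_checksum() above.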
Thread overview: 31+ messages
2016-07-05 20:59 [dpdk-dev] [PATCH v1] " Zhe Tao
2016-07-06 5:38 ` Wu, Jingjing
2016-07-07 4:27 ` [dpdk-dev] [PATCH v2] " Zhe Tao
2016-07-07 10:01 ` Ananyev, Konstantin
2016-07-07 10:50 ` Ananyev, Konstantin
2016-07-07 12:24 ` Ananyev, Konstantin
2016-07-15 15:40 ` Bruce Richardson [this message]
2016-07-18 2:57 ` Zhe Tao
2016-07-18 11:56 ` [dpdk-dev] [PATCH v3] " Zhe Tao
2016-07-19 10:29 ` Ananyev, Konstantin
2016-07-26 12:22 ` Tan, Jianfeng
2016-07-29 7:11 ` Tan, Jianfeng
2016-07-29 8:45 ` Ananyev, Konstantin
2016-07-29 10:11 ` Tan, Jianfeng
2016-10-10 3:58 ` [dpdk-dev] [PATCH v2] " Wu, Jingjing
2016-10-10 4:14 ` Yuanhan Liu
2016-08-01 3:56 ` [dpdk-dev] [PATCH v4 0/3] Add TSO on tunneling packet Jianfeng Tan
2016-08-01 3:56 ` [dpdk-dev] [PATCH v4 1/3] mbuf: add Tx side tunneling type Jianfeng Tan
2016-08-01 3:56 ` [dpdk-dev] [PATCH v4 2/3] net/i40e: add TSO support on tunneling packet Jianfeng Tan
2016-08-01 3:56 ` [dpdk-dev] [PATCH v4 3/3] app/testpmd: fix Tx offload " Jianfeng Tan
2016-09-19 12:09 ` Ananyev, Konstantin
2016-09-21 12:36 ` Tan, Jianfeng
2016-09-21 15:47 ` Ananyev, Konstantin
2016-09-22 1:29 ` Tan, Jianfeng
2016-09-22 9:15 ` Ananyev, Konstantin
[not found] ` <ED26CBA2FAD1BF48A8719AEF02201E364E5E09BC@SHSMSX103.ccr.corp.intel.com>
[not found] ` <2601191342CEEE43887BDE71AB97725836BA2698@irsmsx105.ger.corp.intel.com>
2016-09-27 17:29 ` [dpdk-dev] [PATCH v4 0/3] Add TSO " Ananyev, Konstantin
2016-09-27 17:52 ` Tan, Jianfeng
2016-09-27 19:47 ` Thomas Monjalon
2016-10-09 21:27 ` Thomas Monjalon
2016-09-26 13:48 ` [dpdk-dev] [PATCH v5 3/3] app/testpmd: support tunneled TSO in csum fwd engine Jianfeng Tan
2016-09-27 17:25 ` Ananyev, Konstantin