DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Hu, Jiayu" <jiayu.hu@intel.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	Ophir Munk <ophirmu@mellanox.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: Thomas Monjalon <thomas@monjalon.net>,
	Olga Shern <olgas@mellanox.com>,
	Pascal Mazon <pascal.mazon@6wind.com>,
	"stable@dpdk.org" <stable@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
Date: Tue, 24 Apr 2018 12:55:30 +0000	[thread overview]
Message-ID: <ED946F0BEFE0A141B63BABBD629A2A9B388B5640@shsmsx102.ccr.corp.intel.com> (raw)
In-Reply-To: <2601191342CEEE43887BDE71AB977258AEA5221E@IRSMSX102.ger.corp.intel.com>

Hi Konstantin and Ophir,

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Tuesday, April 24, 2018 8:31 PM
> To: Ophir Munk <ophirmu@mellanox.com>; Hu, Jiayu <jiayu.hu@intel.com>;
> dev@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> stable@dpdk.org
> Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> segments
> 
> 
> Hi Ophir,
> 
> >
> > Hi Konstantin,
> > Please see inline
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > Sent: Tuesday, April 24, 2018 1:56 PM
> > > To: Ophir Munk <ophirmu@mellanox.com>; Hu, Jiayu
> <jiayu.hu@intel.com>;
> > > dev@dpdk.org
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > stable@dpdk.org
> > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> segments
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > > Sent: Tuesday, April 24, 2018 10:44 AM
> > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org; Ananyev,
> Konstantin
> > > > <konstantin.ananyev@intel.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > > stable@dpdk.org
> > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > segments
> > > >
> > > > Hi Jiayu,
> > > > Please find comments inline
> > > >
> > > > > -----Original Message-----
> > > > > From: Hu, Jiayu [mailto:jiayu.hu@intel.com]
> > > > > Sent: Monday, April 23, 2018 7:14 AM
> > > > > To: Ophir Munk <ophirmu@mellanox.com>; dev@dpdk.org;
> Ananyev,
> > > > > Konstantin <konstantin.ananyev@intel.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > > <olgas@mellanox.com>; Pascal Mazon
> <pascal.mazon@6wind.com>;
> > > > > stable@dpdk.org
> > > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > > segments
> > > > >
> > > > > Hi Ophir,
> > > > >
> > > > > In the GSO design, the GSO library doesn't care about checksums,
> > > > > which means it doesn't check if input packets have correct
> > > > > checksums, and it doesn't do any checksum related work for the
> > > > > output GSO segments. It depends on the callers to use HW or SW
> > > > > checksum calculation for output packets. This is why the GSO library
> > > > > doesn't set PKT_TX_TCP_CKSUM. So I don't think it's a bug.
> > > > >
> > > >
> > > > Can you please reconsider this design? I think the GSO library should
> > > > imitate the HW behavior where TCP segments checksum is
> automatically
> > > > calculated without explicitly requesting it. I am not saying that GSO
> library
> > > itself should calculate the checksums - but at least it should mark each
> > > segment as requiring this calculation.
> > >
> > > But gso has no idea how this packet will be processed after it.
> >
> > GSO shouldn't know. It should only mark the fact that a new TCP segment
> was created without a TCP checksum.
> >
> 
> Ok, but new IP header was also created. And might be outer ip/udp (in case
> of tunnel).
> If we go that way we'll have to set flags for each them.
> 
> > > Caller can choose to calculate L3/L4 cksum in SW or might be going to
> use
> > > HW offloads.
> >
> > Assuming TSO is configured then you suggest that TAP PMD will mark by
> itself the TCP_CKSUM flag for all packets returned from GSO
> > library?
> 
> Yes.
> 
> >
> > > In later case nothing stops the caller to update mbuf->ol_flags in a way
> he
> > > likes (TCP_CKSUM, IP_CKSUM, etc.).
> > > Konstantin
> > >
> >
> > Please note that TCP_SEG flag is cleared by GSO library in 2 different cases:
> > 1. Packet length equals to or is bigger than GSO size. In this case new TCP
> segments are created with no TCP checksum.
> > 2. Packet length is smaller than GSO size. In this case no TCP
> segmentation is required. The original packet is returned and its existing
> TCP
> > checksum is OK.
> >
> > In the latter case TAP PMD will always calculate TCP checksum in SW
> (performance concerns) where this could have been saved.
> > I am thinking of a practical scenario where TSO is configured but all
> packets are smaller than GSO size, however TAP PMD unnecessarily
> > recalculates their checksums.
> >
> > How do you suggest to avoid this scenario?
> 
> Probably something like that:
> 
> rc = rte_gso_segment(pkt_in, gso_ctx, pkts_out, nb_pkts_out);
> if (rc == 1 && pkt_in == pkts_out[0] == pkt_in) {
>      /* no gso was performed */
> } else {
>    /* new packets, update ol_flags if needed */
> }
> 
> ?
> 
> Another possibility - might be make chages in librte_gso to allow user to
> specify what flags to set for the output packets (probably via
> rte_gso_ctx.flag)

It would be a solution. We can add flags to rte_gso_ctx.flag and the GSO library
can do some extra work (like checksum) according to the flags.

Thanks,
Jiayu
> 
> Konstantin
> 
> > >
> > > > > In my opinion, it's not a good idea to enable HW TCP checksum
> > > > > calculation silently, and without the aware of the caller. In fact,
> > > > > the caller always know it does SW TSO (i.e. GSO), instead of real HW
> TSO.
> > > >
> > > > This is not correct. Consider net_failsafe with 2 sub-devices: one is
> > > > a HW PCI device, the other one is a SW TAP device. Failsafe must work
> > > transparently with these two sub-devices and the caller cannot tell if
> TSO is
> > > done in SW or HW.
> > > >
> > > > > If the caller wants HW
> > > > > checksum calculation, it can add PKT_TX_TCP_CKSUM to ol_flags
> before
> > > > > or after calling the GSO library.
> > > > >
> > > >
> > > > FYI - TAP TSO patches were submitted to dpdk.org mailing list. These
> > > patches use the GSO library.
> > > >
> > >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdp
> d
> > > >
> > >
> k.org%2Fdev%2Fpatchwork%2Fpatch%2F38666%2F&data=02%7C01%7Cop
> h
> > > irmu%40me
> > > >
> > >
> llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4
> d
> > > 9ba6a4d1
> > > >
> > >
> 49256f461b%7C0%7C0%7C636601641779974217&sdata=CF7EvhXG%2BrH%
> > > 2BPiQEbvM0
> > > > mC%2FSpqobneKaoV03j5VrSDw%3D&reserved=0
> > > >
> > >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdp
> d
> > > >
> > >
> k.org%2Fdev%2Fpatchwork%2Fpatch%2F38667%2F&data=02%7C01%7Cop
> h
> > > irmu%40me
> > > >
> > >
> llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4
> d
> > > 9ba6a4d1
> > > >
> > >
> 49256f461b%7C0%7C0%7C636601641779974217&sdata=j9WVIj%2FKq6EN
> > > WXu3mr5By1
> > > > toSowU8bqJRGZ19SxiGoI%3D&reserved=0
> > > >
> > > > Running testpmd with TAP TSO is currently broken without the
> suggested
> > > librte_gso patch.
> > > > Please note testpmd implementation (app/test-pmd/csumonly.c
> > > > b/app/test-pmd/csumonly.c) in case *both* TSO and TCP CKSUM are
> > > > configured:
> > > >
> > > >   if (tso_segsz)
> > > >       ol_flags |= PKT_TX_TCP_SEG;    // *** if TSO is applicable - the
> packet
> > > flags are only marked with PKT_TX_TCP_SEG and no
> > > > PKT_TX_TCP_CKSUM ***
> > > >    else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM)
> > > >        ol_flags |= PKT_TX_TCP_CKSUM;     // *** PKT_TX_TCP_CKSUM is
> > > marked only if TSO is not applicable ***
> > > >   else {
> > > >       tcp_hdr->cksum =
> > > >          get_udptcp_checksum(l3_hdr, tcp_hdr,
> > > >
> > > > In other words - testpmd does not set TCP_CKSUM along with
> TCP_SEG
> > > > therefore using testpmd with TAP/TSO will result in TCP segments with
> 0
> > > (incorrect) TCP checksums.
> > > >
> > > > In addition - please note the comments in lib/librte_mbuf/rte_mbuf.h
> > > > which specify that PKT_TX_TCP_SEG flag implies the
> PKT_TX_TCP_CKSUM
> > > > (hence it is not required to be explicitly set by the caller)
> > > >
> > > > /**
> > > > * TCP segmentation offload. To enable this offload feature for a
> > > > * packet to be transmitted on hardware supporting TSO:
> > > > *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> > > > *    PKT_TX_TCP_CKSUM)
> > > > ...
> > > >
> > > > > Add Konstantin for more suggestions.
> > > > >
> > > > > Thanks,
> > > > > Jiayu
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > > > > Sent: Sunday, April 22, 2018 10:21 PM
> > > > > > To: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>
> > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > > > <olgas@mellanox.com>; Pascal Mazon
> <pascal.mazon@6wind.com>;
> > > > > Ophir
> > > > > > Munk <ophirmu@mellanox.com>; stable@dpdk.org
> > > > > > Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > > > segments
> > > > > >
> > > > > > Large TCP packets which are marked with PKT_TX_TCP_SEG flag
> are
> > > > > > segmented and the flag is cleared in the resulting segments,
> > > > > > however, the segments checksum is not updated. It is therefore
> > > > > > required to set the PKT_TX_TCP_CKSUM flag in each TCP segment
> in
> > > > > > order to mark for the sending driver the need to update the TCP
> > > > > > checksum before transmitting the segment.
> > > > > >
> > > > > > Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> > > > > > Cc: stable@dpdk.org
> > > > > >
> > > > > > Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
> > > > > > ---
> > > > > >  lib/librte_gso/rte_gso.c | 2 ++
> > > > > >  1 file changed, 2 insertions(+)
> > > > > >
> > > > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > > > > index a44e3d4..e9ce9ce 100644
> > > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > > @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
> > > > > >  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
> > > > > >  			 (gso_ctx->gso_types &
> > > > > > DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
> > > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > > >  		ret = gso_tunnel_tcp4_segment(pkt, gso_size,
> ipid_delta,
> > > > > >  				direct_pool, indirect_pool,
> > > > > >  				pkts_out, nb_pkts_out);
> > > > > >  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
> > > > > >  			(gso_ctx->gso_types &
> > > > > > DEV_TX_OFFLOAD_TCP_TSO)) {
> > > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > > >  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > > > > >  				direct_pool, indirect_pool,
> > > > > >  				pkts_out, nb_pkts_out);
> > > > > > --
> > > > > > 2.7.4

  reply	other threads:[~2018-04-24 12:55 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-22 14:20 Ophir Munk
2018-04-22 14:47 ` Ophir Munk
2018-04-23  4:13 ` Hu, Jiayu
2018-04-24  9:44   ` Ophir Munk
2018-04-24 10:56     ` Ananyev, Konstantin
2018-04-24 11:45       ` Ophir Munk
2018-04-24 12:31         ` Ananyev, Konstantin
2018-04-24 12:55           ` Hu, Jiayu [this message]
2018-04-24 13:53             ` Ophir Munk
2018-04-25  1:51               ` Hu, Jiayu
2018-04-24 13:41           ` Ophir Munk
2018-04-24 14:26             ` Ananyev, Konstantin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ED946F0BEFE0A141B63BABBD629A2A9B388B5640@shsmsx102.ccr.corp.intel.com \
    --to=jiayu.hu@intel.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    --cc=olgas@mellanox.com \
    --cc=ophirmu@mellanox.com \
    --cc=pascal.mazon@6wind.com \
    --cc=stable@dpdk.org \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).