From: "Liu, Jijiang"
Date: Tue, 11 Nov 2014 03:17:36 +0000
To: Olivier Matz, "dev@dpdk.org"
Cc: "jigsaw@gmail.com"
Subject: Re: [dpdk-dev] [PATCH 07/12] mbuf: generic support for TCP segmentation offload

> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Monday, November 10, 2014 11:59 PM
> To: dev@dpdk.org
> Cc: olivier.matz@6wind.com; Walukiewicz, Miroslaw; Liu, Jijiang; Liu, Yong;
> jigsaw@gmail.com; Richardson, Bruce; Ananyev, Konstantin
> Subject: [PATCH 07/12] mbuf: generic support for TCP segmentation offload
>
> Some of the NICs supported by DPDK can accelerate TCP traffic by using
> segmentation offload. The application prepares a packet with a valid TCP
> header of up to 64K and delegates the segmentation to the NIC.
>
> Implement the generic part of TCP segmentation offload in rte_mbuf. It
> introduces two new fields in rte_mbuf: l4_len (length of the L4 header in
> bytes) and tso_segsz (MSS of packets).
>
> To delegate the TCP segmentation to the hardware, the user has to:
>
> - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
>   PKT_TX_TCP_CKSUM)
> - set PKT_TX_IP_CKSUM if it's IPv4, and set the IP checksum to 0 in
>   the packet
> - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
> - calculate the pseudo header checksum and set it in the TCP header,
>   as required when doing hardware TCP checksum offload
>
> The API is inspired by the ixgbe hardware (the next commit adds support for
> ixgbe), but it seems generic enough to be used for other hw/drivers in the
> future.
>
> This commit also reworks the way l2_len and l3_len are used in the igb and
> ixgbe drivers, since l2_l3_len is no longer available in mbuf.
>
> Signed-off-by: Mirek Walukiewicz
> Signed-off-by: Olivier Matz
> ---
>  app/test-pmd/testpmd.c            |  3 ++-
>  examples/ipv4_multicast/main.c    |  3 ++-
>  lib/librte_mbuf/rte_mbuf.h        | 44 +++++++++++++++++++++++----------------
>  lib/librte_pmd_e1000/igb_rxtx.c   | 11 +++++++++-
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 11 +++++++++-
>  5 files changed, 50 insertions(+), 22 deletions(-)
>
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 12adafa..a831e31 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -408,7 +408,8 @@ testpmd_mbuf_ctor(struct rte_mempool *mp,
>          mb->ol_flags = 0;
>          mb->data_off = RTE_PKTMBUF_HEADROOM;
>          mb->nb_segs = 1;
> -        mb->l2_l3_len = 0;
> +        mb->l2_len = 0;
> +        mb->l3_len = 0;

The mb->inner_l2_len and mb->inner_l3_len are missing here; I can also add
them later.

>          mb->vlan_tci = 0;
>          mb->hash.rss = 0;
>  }
> diff --git a/examples/ipv4_multicast/main.c b/examples/ipv4_multicast/main.c
> index de5e6be..a31d43d 100644
> --- a/examples/ipv4_multicast/main.c
> +++ b/examples/ipv4_multicast/main.c
> @@ -302,7 +302,8 @@ mcast_out_pkt(struct rte_mbuf *pkt, int use_clone)
>          /* copy metadata from source packet*/
>          hdr->port = pkt->port;
>          hdr->vlan_tci = pkt->vlan_tci;
> -        hdr->l2_l3_len = pkt->l2_l3_len;
> +        hdr->l2_len = pkt->l2_len;
> +        hdr->l3_len = pkt->l3_len;

The mb->inner_l2_len and mb->inner_l3_len are missing here, too.

>          hdr->hash = pkt->hash;
>
>          hdr->ol_flags = pkt->ol_flags;
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index bcd8996..f76b768 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -126,6 +126,19 @@ extern "C" {
>
>  #define PKT_TX_VXLAN_CKSUM (1ULL << 50) /**< TX checksum of VXLAN computed by NIC */
>
> +/**
> + * TCP segmentation offload.
> + * To enable this offload feature for a packet to be transmitted on
> + * hardware supporting TSO:
> + *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> + *    PKT_TX_TCP_CKSUM)
> + *  - if it's IPv4, set the PKT_TX_IP_CKSUM flag and write the IP checksum
> + *    to 0 in the packet
> + *  - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
> + *  - calculate the pseudo header checksum and set it in the TCP header,
> + *    as required when doing hardware TCP checksum offload
> + */
> +#define PKT_TX_TCP_SEG (1ULL << 49)
> +
>  /* Use final bit of flags to indicate a control mbuf */
>  #define CTRL_MBUF_FLAG (1ULL << 63) /**< Mbuf contains control data */
>
> @@ -185,6 +198,7 @@ static inline const char *rte_get_tx_ol_flag_name(uint64_t mask)
>          case PKT_TX_UDP_CKSUM: return "PKT_TX_UDP_CKSUM";
>          case PKT_TX_IEEE1588_TMST: return "PKT_TX_IEEE1588_TMST";
>          case PKT_TX_VXLAN_CKSUM: return "PKT_TX_VXLAN_CKSUM";
> +        case PKT_TX_TCP_SEG: return "PKT_TX_TCP_SEG";
>          default: return NULL;
>          }
>  }
> @@ -264,22 +278,18 @@ struct rte_mbuf {
>
>          /* fields to support TX offloads */
>          union {
> -                uint16_t l2_l3_len; /**< combined l2/l3 lengths as single var */
> +                uint64_t tx_offload; /**< combined for easy fetch */
>                  struct {
> -                        uint16_t l3_len:9; /**< L3 (IP) Header Length. */
> -                        uint16_t l2_len:7; /**< L2 (MAC) Header Length. */
> -                };
> -        };
> +                        uint64_t l2_len:7; /**< L2 (MAC) Header Length. */
> +                        uint64_t l3_len:9; /**< L3 (IP) Header Length. */
> +                        uint64_t l4_len:8; /**< L4 (TCP/UDP) Header Length. */
> +                        uint64_t tso_segsz:16; /**< TCP TSO segment size */
>
> -        /* fields for TX offloading of tunnels */
> -        union {
> -                uint16_t inner_l2_l3_len;
> -                /**< combined inner l2/l3 lengths as single var */
> -                struct {
> -                        uint16_t inner_l3_len:9;
> -                        /**< inner L3 (IP) Header Length. */
> -                        uint16_t inner_l2_len:7;
> -                        /**< inner L2 (MAC) Header Length. */
> +                        /* fields for TX offloading of tunnels */
> +                        uint16_t inner_l3_len:9; /**< inner L3 (IP) Hdr Length. */
> +                        uint16_t inner_l2_len:7; /**< inner L2 (MAC) Hdr Length. */
> +
> +                        /* uint64_t unused:8; */
>                  };
>          };
>  } __rte_cache_aligned;
> @@ -631,8 +641,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
>  {
>          m->next = NULL;
>          m->pkt_len = 0;
> -        m->l2_l3_len = 0;
> -        m->inner_l2_l3_len = 0;
> +        m->tx_offload = 0;
>          m->vlan_tci = 0;
>          m->nb_segs = 1;
>          m->port = 0xff;
> @@ -701,8 +710,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *md)
>          mi->data_len = md->data_len;
>          mi->port = md->port;
>          mi->vlan_tci = md->vlan_tci;
> -        mi->l2_l3_len = md->l2_l3_len;
> -        mi->inner_l2_l3_len = md->inner_l2_l3_len;
> +        mi->tx_offload = md->tx_offload;
>          mi->hash = md->hash;
>
>          mi->next = NULL;
> diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
> index dbf5074..0a9447e 100644
> --- a/lib/librte_pmd_e1000/igb_rxtx.c
> +++ b/lib/librte_pmd_e1000/igb_rxtx.c
> @@ -361,6 +361,13 @@ eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>          struct rte_mbuf *tx_pkt;
>          struct rte_mbuf *m_seg;
>          union igb_vlan_macip vlan_macip_lens;
> +        union {
> +                uint16_t u16;
> +                struct {
> +                        uint16_t l3_len:9;
> +                        uint16_t l2_len:7;
> +                };
> +        } l2_l3_len;
>          uint64_t buf_dma_addr;
>          uint32_t olinfo_status;
>          uint32_t cmd_type_len;
> @@ -398,8 +405,10 @@ eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>                  tx_last = (uint16_t) (tx_id + tx_pkt->nb_segs - 1);
>
>                  ol_flags = tx_pkt->ol_flags;
> +                l2_l3_len.l2_len = tx_pkt->l2_len;
> +                l2_l3_len.l3_len = tx_pkt->l3_len;
>                  vlan_macip_lens.f.vlan_tci = tx_pkt->vlan_tci;
> -                vlan_macip_lens.f.l2_l3_len = tx_pkt->l2_l3_len;
> +                vlan_macip_lens.f.l2_l3_len = l2_l3_len.u16;
>                  tx_ol_req = ol_flags & (PKT_TX_VLAN_PKT | PKT_TX_IP_CKSUM |
>                                  PKT_TX_L4_MASK);
>
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index 70ca254..54a0fc1 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> @@ -540,6 +540,13 @@ ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>          struct rte_mbuf *tx_pkt;
>          struct rte_mbuf *m_seg;
>          union ixgbe_vlan_macip vlan_macip_lens;
> +        union {
> +                uint16_t u16;
> +                struct {
> +                        uint16_t l3_len:9;
> +                        uint16_t l2_len:7;
> +                };
> +        } l2_l3_len;
>          uint64_t buf_dma_addr;
>          uint32_t olinfo_status;
>          uint32_t cmd_type_len;
> @@ -583,8 +590,10 @@ ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>                  tx_ol_req = ol_flags & (PKT_TX_VLAN_PKT | PKT_TX_IP_CKSUM |
>                                  PKT_TX_L4_MASK);
>                  if (tx_ol_req) {
> +                        l2_l3_len.l2_len = tx_pkt->l2_len;
> +                        l2_l3_len.l3_len = tx_pkt->l3_len;
>                          vlan_macip_lens.f.vlan_tci = tx_pkt->vlan_tci;
> -                        vlan_macip_lens.f.l2_l3_len = tx_pkt->l2_l3_len;
> +                        vlan_macip_lens.f.l2_l3_len = l2_l3_len.u16;
>
>                          /* If new context need be built or reuse the exist ctx. */
>                          ctx = what_advctx_update(txq, tx_ol_req,
> --
> 2.1.0
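
To make the four steps from the commit message concrete, here is a minimal,
illustrative sketch (not part of the patch) of the application-side setup for
a plain Ethernet/IPv4/TCP packet. It uses the l2_len/l3_len/l4_len/tso_segsz
fields and the PKT_TX_TCP_SEG flag introduced above; rte_ipv4_phdr_cksum() is
assumed to be available as a pseudo header checksum helper, otherwise the
application has to compute that value itself.

#include <rte_mbuf.h>
#include <rte_ether.h>
#include <rte_ip.h>
#include <rte_tcp.h>

/* Sketch: request TSO for an Ethernet/IPv4/TCP packet already built in m. */
static void
prepare_ipv4_tso(struct rte_mbuf *m, uint16_t mss)
{
        struct ether_hdr *eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
        struct ipv4_hdr *ip = (struct ipv4_hdr *)(eth + 1);
        struct tcp_hdr *tcp;

        /* header lengths the PMD needs to build the TSO context */
        m->l2_len = sizeof(struct ether_hdr);
        m->l3_len = (ip->version_ihl & 0x0f) * 4;
        tcp = (struct tcp_hdr *)((char *)ip + m->l3_len);
        m->l4_len = (tcp->data_off >> 4) * 4;
        m->tso_segsz = mss;

        /* PKT_TX_TCP_SEG implies PKT_TX_TCP_CKSUM; for IPv4, also request
         * IP checksum offload and zero the checksum field in the packet */
        m->ol_flags |= PKT_TX_TCP_SEG | PKT_TX_IP_CKSUM;
        ip->hdr_checksum = 0;

        /* seed the TCP checksum with the pseudo header checksum, as for
         * plain hardware TCP checksum offload (helper assumed, see above) */
        tcp->cksum = rte_ipv4_phdr_cksum(ip, m->ol_flags);
}

If a helper like rte_ipv4_phdr_cksum() is used, passing ol_flags matters: with
PKT_TX_TCP_SEG set, the pseudo header checksum is expected without the TCP
payload length, because the NIC rewrites the length fields for every segment
it emits.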