From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 416F3223 for ; Thu, 14 Sep 2017 10:39:45 +0200 (CEST) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 14 Sep 2017 01:39:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.42,391,1500966000"; d="scan'208";a="900152430" Received: from irsmsx152.ger.corp.intel.com ([163.33.192.66]) by FMSMGA003.fm.intel.com with ESMTP; 14 Sep 2017 01:39:43 -0700 Received: from irsmsx105.ger.corp.intel.com ([169.254.7.75]) by IRSMSX152.ger.corp.intel.com ([169.254.6.87]) with mapi id 14.03.0319.002; Thu, 14 Sep 2017 09:39:43 +0100 From: "Ananyev, Konstantin" To: "Kavanagh, Mark B" , "Hu, Jiayu" CC: "dev@dpdk.org" , "Tan, Jianfeng" Thread-Topic: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support Thread-Index: AQHTK3CNNYr96WyXIECiBIjbMlkDdaKxEVgwgAD8cICAAH4TIIAATEKAgAAWLkCAAJOiAIAAf0WAgAAReUA= Date: Thu, 14 Sep 2017 08:39:42 +0000 Message-ID: <2601191342CEEE43887BDE71AB9772584F24ADBC@irsmsx105.ger.corp.intel.com> References: <1504598270-60080-1-git-send-email-jiayu.hu@intel.com> <1505184211-36728-1-git-send-email-jiayu.hu@intel.com> <1505184211-36728-3-git-send-email-jiayu.hu@intel.com> <2601191342CEEE43887BDE71AB9772584F249E8E@irsmsx105.ger.corp.intel.com> <20170913024801.GB44293@dpdk15.sh.intel.com> <2601191342CEEE43887BDE71AB9772584F24A622@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772584F24A843@irsmsx105.ger.corp.intel.com> In-Reply-To: Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiZTVmZjZmZjYtZDQ4Mi00MzNjLTgzNTctYjY0Mjc2ZjMyMDkxIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE2LjUuOS4zIiwiVHJ1c3RlZExhYmVsSGFzaCI6ImFTaWJBUGFtT21kRkR1eXpnTVM2ck9oeHdEWk9EcmVmZUllOVpsamlCUlE9In0= x-ctpclassification: CTP_IC dlp-product: dlpe-windows dlp-version: 11.0.0.116 dlp-reaction: no-action x-originating-ip: [163.33.239.182] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v3 2/5] gso: add TCP/IPv4 GSO support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Sep 2017 08:39:46 -0000 > -----Original Message----- > From: Kavanagh, Mark B > Sent: Thursday, September 14, 2017 9:35 AM > To: Hu, Jiayu ; Ananyev, Konstantin > Cc: dev@dpdk.org; Tan, Jianfeng > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support >=20 > >From: Hu, Jiayu > >Sent: Thursday, September 14, 2017 2:00 AM > >To: Ananyev, Konstantin ; Kavanagh, Mark B > > > >Cc: dev@dpdk.org; Tan, Jianfeng > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support > > > >Hi Konstantin, > > > >> -----Original Message----- > >> From: Ananyev, Konstantin > >> Sent: Wednesday, September 13, 2017 11:13 PM > >> To: Kavanagh, Mark B ; Hu, Jiayu > >> > >> Cc: dev@dpdk.org; Tan, Jianfeng > >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support > >> > >> Hi Mark, > >> > >> > -----Original Message----- > >> > From: Kavanagh, Mark B > >> > Sent: Wednesday, September 13, 2017 3:52 PM > >> > To: Ananyev, Konstantin ; Hu, Jiayu > >> > >> > Cc: dev@dpdk.org; Tan, Jianfeng > >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support > >> > > >> > >From: Ananyev, Konstantin > >> > >Sent: Wednesday, September 13, 2017 10:38 AM > >> > >To: Hu, Jiayu > >> > >Cc: dev@dpdk.org; Kavanagh, Mark B ; > >> Tan, Jianfeng > >> > > > >> > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support > >> > > > >> > > > >> > > > >> > >> > > + > >> > >> > > +int > >> > >> > > +gso_tcp4_segment(struct rte_mbuf *pkt, > >> > >> > > + uint16_t gso_size, > >> > >> > > + uint8_t ipid_delta, > >> > >> > > + struct rte_mempool *direct_pool, > >> > >> > > + struct rte_mempool *indirect_pool, > >> > >> > > + struct rte_mbuf **pkts_out, > >> > >> > > + uint16_t nb_pkts_out) > >> > >> > > +{ > >> > >> > > + struct ipv4_hdr *ipv4_hdr; > >> > >> > > + uint16_t tcp_dl; > >> > >> > > + uint16_t pyld_unit_size; > >> > >> > > + uint16_t hdr_offset; > >> > >> > > + int ret =3D 1; > >> > >> > > + > >> > >> > > + ipv4_hdr =3D (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char= *) > >> + > >> > >> > > + pkt->l2_len); > >> > >> > > + /* Don't process the fragmented packet */ > >> > >> > > + if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16( > >> > >> > > + > >> IPV4_HDR_DF_MASK)) =3D=3D 0)) { > >> > >> > > >> > >> > > >> > >> > It is not a check for fragmented packet - it is a check that > >> fragmentation > >> > >is allowed for that packet. > >> > >> > Should be IPV4_HDR_DF_MASK - 1, I think. > >> > > > >> > >DF bit doesn't indicate is packet fragmented or not. > >> > >It forbids to fragment packet any further. > >> > >To check is packet already fragmented or not, you have to check MF = bit > >> and > >> > >frag_offset. > >> > >Both have to be zero for un-fragmented packets. > >> > > > >> > >> > >> > >> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF = bit. > >> It's a > >> > >> little-endian value. But ipv4_hdr->fragment_offset is big-endian = order. > >> > >> So the value of DF bit should be "ipv4_hdr->fragment_offset & > >> > >rte_cpu_to_be_16( > >> > >> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is fragm= ented. > >> > >> > >> > >> > > >> > >> > > + pkts_out[0] =3D pkt; > >> > >> > > + return ret; > >> > >> > > + } > >> > >> > > + > >> > >> > > + tcp_dl =3D rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt- > >> >l3_len - > >> > >> > > + pkt->l4_len; > >> > >> > > >> > >> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len= ? > >> > >> > >> > >> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_l= en > >here. > >> > >> > >> > >> > > >> > >> > > + /* Don't process the packet without data */ > >> > >> > > + if (unlikely(tcp_dl =3D=3D 0)) { > >> > >> > > + pkts_out[0] =3D pkt; > >> > >> > > + return ret; > >> > >> > > + } > >> > >> > > + > >> > >> > > + hdr_offset =3D pkt->l2_len + pkt->l3_len + pkt->l4_len; > >> > >> > > + pyld_unit_size =3D gso_size - hdr_offset - ETHER_CRC_LEN; > >> > >> > > >> > >> > Hmm, why do we need to count CRC_LEN here? > >> > >> > >> > >> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be > >> > >> included in gso_size. > >> > > > >> > >Why? > >> > >What is the point to account crc len into this computation? > >> > >Why not just assume that gso_size is already a max_frame_size - crc= _len > >> > >As I remember, when we RX packet crc bytes will be already stripped= , > >> > >when user populates the packet, he doesn't care about crc bytes too= . > >> > > >> > Hi Konstantin, > >> > > >> > When packet is tx'd, the 4B for CRC are added back into the packet; = if the > >> payload is already at max capacity, then the actual segment size > >> > will be 4B larger than expected (e.g. 1522B, as opposed to 1518B). > >> > To prevent that from happening, we account for the CRC len in this > >> calculation. > >> > >> > >> Ok, and what prevents you to set gso_ctx.gso_size =3D 1514; /*ether f= rame > >> size without crc bytes */ > >> ? >=20 > Hey Konstantin, >=20 > If the user sets the gso_size to 1514, the resultant output segments' siz= e should be 1514, and not 1518. Yes and then NIC HW will add CRC bytes for you. You are not filling CRC bytes in HW, and when providing to the HW size to s= end - it is a payload size (CRC bytes are not accounted). Konstantin Consequently, the payload capacity > of each segment would be reduced accordingly. > The user only cares about the output segment size (i.e. gso_ctx.gso_size)= ; we need to ensure that the size of the segments that are > produced is consistent with that. As a result, we need to ensure that any= packet overhead is accounted for in the segment size, before we > can determine how much space remains for data. >=20 > Hope this makes sense. >=20 > Thanks, > Mark >=20 > > > >Exactly, applications can set 1514 to gso_segsz instead of 1518, if the = lower > >layer > >will add CRC to the packet. > > > >Jiayu > > > >> Konstantin > >> > >> > > >> > If I've missed anything, please do let me know! > >> > > >> > Thanks, > >> > Mark > >> > > >> > > > >> > >Konstantin