From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by dpdk.org (Postfix) with ESMTP id DD15F7F6D for ; Thu, 13 Oct 2016 17:28:41 +0200 (CEST) Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga104.jf.intel.com with ESMTP; 13 Oct 2016 08:28:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.31,340,1473145200"; d="scan'208";a="19232294" Received: from yliu-dev.sh.intel.com (HELO yliu-dev) ([10.239.67.162]) by fmsmga006.fm.intel.com with ESMTP; 13 Oct 2016 08:28:38 -0700 Date: Thu, 13 Oct 2016 23:29:35 +0800 From: Yuanhan Liu To: Olivier MATZ Cc: dev@dpdk.org, konstantin.ananyev@intel.com, sugesh.chandran@intel.com, bruce.richardson@intel.com, jianfeng.tan@intel.com, helin.zhang@intel.com, adrien.mazarguil@6wind.com, stephen@networkplumber.org, dprovan@bivio.net, xiao.w.wang@intel.com Message-ID: <20161013152935.GZ16751@yliu-dev.sh.intel.com> References: <1475485223-30566-1-git-send-email-olivier.matz@6wind.com> <1475485223-30566-13-git-send-email-olivier.matz@6wind.com> <20161013081839.GT16751@yliu-dev.sh.intel.com> <57FF9409.2090205@6wind.com> <20161013141654.GW16751@yliu-dev.sh.intel.com> <57FF9FA9.8020403@6wind.com> <20161013150107.GX16751@yliu-dev.sh.intel.com> <57FFA50C.6030805@6wind.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <57FFA50C.6030805@6wind.com> User-Agent: Mutt/1.5.23 (2014-03-12) Subject: Re: [dpdk-dev] [PATCH v2 12/12] virtio: add Tso support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Oct 2016 15:28:42 -0000 On Thu, Oct 13, 2016 at 05:15:24PM +0200, Olivier MATZ wrote: > > > On 10/13/2016 05:01 PM, Yuanhan Liu wrote: > >On Thu, Oct 13, 2016 at 04:52:25PM +0200, Olivier MATZ wrote: > >> > >> > >>On 10/13/2016 04:16 PM, Yuanhan Liu wrote: > >>>On Thu, Oct 13, 2016 at 04:02:49PM +0200, Olivier MATZ wrote: > >>>> > >>>> > >>>>On 10/13/2016 10:18 AM, Yuanhan Liu wrote: > >>>>>On Mon, Oct 03, 2016 at 11:00:23AM +0200, Olivier Matz wrote: > >>>>>>+/* When doing TSO, the IP length is not included in the pseudo header > >>>>>>+ * checksum of the packet given to the PMD, but for virtio it is > >>>>>>+ * expected. > >>>>>>+ */ > >>>>>>+static void > >>>>>>+virtio_tso_fix_cksum(struct rte_mbuf *m) > >>>>>>+{ > >>>>>>+ /* common case: header is not fragmented */ > >>>>>>+ if (likely(rte_pktmbuf_data_len(m) >= m->l2_len + m->l3_len + > >>>>>>+ m->l4_len)) { > >>>>>... > >>>>>>+ /* replace it in the packet */ > >>>>>>+ th->cksum = new_cksum; > >>>>>>+ } else { > >>>>>... > >>>>>>+ /* replace it in the packet */ > >>>>>>+ *rte_pktmbuf_mtod_offset(m, uint8_t *, > >>>>>>+ m->l2_len + m->l3_len + 16) = new_cksum.u8[0]; > >>>>>>+ *rte_pktmbuf_mtod_offset(m, uint8_t *, > >>>>>>+ m->l2_len + m->l3_len + 17) = new_cksum.u8[1]; > >>>>>>+ } > >>>>> > >>>>>The tcp header will always be in the mbuf, right? Otherwise, you can't > >>>>>update the cksum field here. What's the point of introducing the "else > >>>>>clause" then? > >>>> > >>>>Sorry, I don't see the problem you're pointing out here. > >>>> > >>>>What I want to solve here is to support the cases where the mbuf is > >>>>segmented in the middle of the network header (which is probably a rare > >>>>case). > >>> > >>>How it's gonna segmented? > >> > >>The mbuf is given by the application. So if the application generates a > >>segmented mbuf, it should work. > >> > >>This could happen for instance if the application uses mbuf clones to share > >>the IP/TCP/data part of the mbuf and prepend a specific Ethernet/vlan for > >>different destination. > >> > >> > >>>>In the "else" part, I only access the mbuf byte by byte using the > >>>>rte_pktmbuf_mtod_offset() accessor. An alternative would have been to copy > >>>>the header in a linear buffer, fix the checksum, then copy it again in the > >>>>packet, but there is no mbuf helpers to do these copies for now. > >>> > >>>In the "else" clause, the ip header is still in the mbuf, right? > >>>Why do you have to access it the way like: > >>> > >>> ip_version = *rte_pktmbuf_mtod_offset(m, uint8_t *, > >>> m->l2_len) >> 4; > >>> > >>>Why can't you just use > >>> > >>> iph = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, m->l2_len); > >>> iph->version_ihl ....; > >> > >>AFAIK, there is no requirement that each network header has to be contiguous > >>in a mbuf segment. > >> > >>Of course, a split in the middle of a network header probably never > >>happens... but we never knows, as it is not forbidden. I think the code > >>should be robust enough to avoid accesses to wrong addresses. > >> > >>Hope it's clear enough :) > > > >Thanks, but not really. Maybe let me ask this way: what wrong would > >happen if we use > > iph = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, m->l2_len); > >to access the IP header? Is it about the endian? > > If you have a packet split like this: > > mbuf segment 1 mbuf segment 2 > ---------------------------- ------------------------------ > | Ethernet header | IP hea| |der | TCP header | data > ---------------------------- ------------------------------ > ^ > iph Thanks, that's clear. How could you be able to access the tcp header from the first mbuf then? I mean, how is the following code supposed to work? prev_cksum.u8[0] = *rte_pktmbuf_mtod_offset(m, uint8_t *, m->l2_len + m->l3_len + 16); > The IP header is not contiguous. So accessing to the end of the structure > will access to a wrong location. > > >One more question is do you have any case to trigger the "else" clause? > > No, but I think it may happen. A piece of untest code is not trusted though ... --yliu