From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from proxy.6wind.com (host.76.145.23.62.rev.coltfrance.com [62.23.145.76]) by dpdk.org (Postfix) with ESMTP id DC1E4558B for ; Thu, 13 Oct 2016 17:15:28 +0200 (CEST) Received: from [10.16.0.195] (unknown [10.16.0.195]) by proxy.6wind.com (Postfix) with ESMTP id 474DE24F19; Thu, 13 Oct 2016 17:15:26 +0200 (CEST) To: Yuanhan Liu References: <1475485223-30566-1-git-send-email-olivier.matz@6wind.com> <1475485223-30566-13-git-send-email-olivier.matz@6wind.com> <20161013081839.GT16751@yliu-dev.sh.intel.com> <57FF9409.2090205@6wind.com> <20161013141654.GW16751@yliu-dev.sh.intel.com> <57FF9FA9.8020403@6wind.com> <20161013150107.GX16751@yliu-dev.sh.intel.com> Cc: dev@dpdk.org, konstantin.ananyev@intel.com, sugesh.chandran@intel.com, bruce.richardson@intel.com, jianfeng.tan@intel.com, helin.zhang@intel.com, adrien.mazarguil@6wind.com, stephen@networkplumber.org, dprovan@bivio.net, xiao.w.wang@intel.com From: Olivier MATZ Message-ID: <57FFA50C.6030805@6wind.com> Date: Thu, 13 Oct 2016 17:15:24 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.6.0 MIME-Version: 1.0 In-Reply-To: <20161013150107.GX16751@yliu-dev.sh.intel.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v2 12/12] virtio: add Tso support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Oct 2016 15:15:29 -0000 On 10/13/2016 05:01 PM, Yuanhan Liu wrote: > On Thu, Oct 13, 2016 at 04:52:25PM +0200, Olivier MATZ wrote: >> >> >> On 10/13/2016 04:16 PM, Yuanhan Liu wrote: >>> On Thu, Oct 13, 2016 at 04:02:49PM +0200, Olivier MATZ wrote: >>>> >>>> >>>> On 10/13/2016 10:18 AM, Yuanhan Liu wrote: >>>>> On Mon, Oct 03, 2016 at 11:00:23AM +0200, Olivier Matz wrote: >>>>>> +/* When doing TSO, the IP length is not included in the pseudo header >>>>>> + * checksum of the packet given to the PMD, but for virtio it is >>>>>> + * expected. >>>>>> + */ >>>>>> +static void >>>>>> +virtio_tso_fix_cksum(struct rte_mbuf *m) >>>>>> +{ >>>>>> + /* common case: header is not fragmented */ >>>>>> + if (likely(rte_pktmbuf_data_len(m) >= m->l2_len + m->l3_len + >>>>>> + m->l4_len)) { >>>>> ... >>>>>> + /* replace it in the packet */ >>>>>> + th->cksum = new_cksum; >>>>>> + } else { >>>>> ... >>>>>> + /* replace it in the packet */ >>>>>> + *rte_pktmbuf_mtod_offset(m, uint8_t *, >>>>>> + m->l2_len + m->l3_len + 16) = new_cksum.u8[0]; >>>>>> + *rte_pktmbuf_mtod_offset(m, uint8_t *, >>>>>> + m->l2_len + m->l3_len + 17) = new_cksum.u8[1]; >>>>>> + } >>>>> >>>>> The tcp header will always be in the mbuf, right? Otherwise, you can't >>>>> update the cksum field here. What's the point of introducing the "else >>>>> clause" then? >>>> >>>> Sorry, I don't see the problem you're pointing out here. >>>> >>>> What I want to solve here is to support the cases where the mbuf is >>>> segmented in the middle of the network header (which is probably a rare >>>> case). >>> >>> How it's gonna segmented? >> >> The mbuf is given by the application. So if the application generates a >> segmented mbuf, it should work. >> >> This could happen for instance if the application uses mbuf clones to share >> the IP/TCP/data part of the mbuf and prepend a specific Ethernet/vlan for >> different destination. >> >> >>>> In the "else" part, I only access the mbuf byte by byte using the >>>> rte_pktmbuf_mtod_offset() accessor. An alternative would have been to copy >>>> the header in a linear buffer, fix the checksum, then copy it again in the >>>> packet, but there is no mbuf helpers to do these copies for now. >>> >>> In the "else" clause, the ip header is still in the mbuf, right? >>> Why do you have to access it the way like: >>> >>> ip_version = *rte_pktmbuf_mtod_offset(m, uint8_t *, >>> m->l2_len) >> 4; >>> >>> Why can't you just use >>> >>> iph = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, m->l2_len); >>> iph->version_ihl ....; >> >> AFAIK, there is no requirement that each network header has to be contiguous >> in a mbuf segment. >> >> Of course, a split in the middle of a network header probably never >> happens... but we never knows, as it is not forbidden. I think the code >> should be robust enough to avoid accesses to wrong addresses. >> >> Hope it's clear enough :) > > Thanks, but not really. Maybe let me ask this way: what wrong would > happen if we use > iph = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, m->l2_len); > to access the IP header? Is it about the endian? If you have a packet split like this: mbuf segment 1 mbuf segment 2 ---------------------------- ------------------------------ | Ethernet header | IP hea| |der | TCP header | data ---------------------------- ------------------------------ ^ iph The IP header is not contiguous. So accessing to the end of the structure will access to a wrong location. > One more question is do you have any case to trigger the "else" clause? No, but I think it may happen. Olivier