From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f52.google.com (mail-wm0-f52.google.com [74.125.82.52]) by dpdk.org (Postfix) with ESMTP id 47843DE5 for ; Tue, 17 Jan 2017 12:18:31 +0100 (CET) Received: by mail-wm0-f52.google.com with SMTP id f73so38186651wmf.1 for ; Tue, 17 Jan 2017 03:18:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=65gii7sI/OjRAFtWGdzPWSqi78FG0eLa3nlditNDTtE=; b=CR1Z5llU5TZlP/X36mBkuIdO1D3codiicXhfT4mXncu1dFQvkdwm0Cua5gQDDHxmZZ UI4x84W4y1d2e/iRLM+Fgq6cNtJa82ruEu/XqjTwibNlW1yLVaPGUFhqZYnhaZyosdaV T7SZoo/6XKjQLJCSK46kA4AGGw2O/aqEcHuI6o3F4KxvOCU5jfyz9JtOZspfr5U/BfrP LKWTsxAJyozvEb26BmZjbC0eSqkswGfxrGmrRlyTANR5zoDd8ylsvEt1l0WJOQr1C0Ho pCTHGs5Q+YYKFHpPZHKxC6+Zgm/s6gUTcPyU7KOex+ZchaAc6rs/i0Xxy7lALxrAUEJL Jdxg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=65gii7sI/OjRAFtWGdzPWSqi78FG0eLa3nlditNDTtE=; b=BHFQRfcbG08+Pz4BhNfyheF7dxyg6LyyJEKpydJUlPPYaMFlr1BZLY/QFOWUkV8cjV S99TF+EaMtgR0zhVBi5BzxB2/BFpuLvn+kgBri8xyJYluVyZAFAPck15sqOCTm8XDZZN KJKsc1JoXb1f+ocq0H0t9t/6p50eMNz34IwqOi4nsxG8Qf/VXy/CUXkO4PJojwrK1S2J gDy8h7Ndi/mo7lgapmmtKtB6OiYp1wdemLDbe47gD+gI42LEzBor9f6ar31/b5A9V9f5 rmHCxVVK8T6o66wkxgbeGqaiRlh4yX/f0+rjHlcuC2DrsDFutu6kcZ9TED/C2DVdwr6N N4gw== X-Gm-Message-State: AIkVDXI7KAjlTCUdY76UHlbIVT9NcQg1/b6dI+baG3IjcwxWb78xIhU71IWdSQDRNIyFk3MP X-Received: by 10.223.168.111 with SMTP id l102mr10319045wrc.150.1484651910904; Tue, 17 Jan 2017 03:18:30 -0800 (PST) Received: from platinum (2a01cb0c03c651000226b0fffeed02fc.ipv6.abo.wanadoo.fr. [2a01:cb0c:3c6:5100:226:b0ff:feed:2fc]) by smtp.gmail.com with ESMTPSA id c81sm36107175wmf.22.2017.01.17.03.18.29 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 17 Jan 2017 03:18:30 -0800 (PST) Date: Tue, 17 Jan 2017 12:18:25 +0100 From: Olivier Matz To: Yuanhan Liu Cc: dev@dpdk.org, maxime.coquelin@redhat.com, huawei.xie@intel.com, stephen@networkplumber.org, "Tan, Jianfeng" , Tomasz Kulasek , Thomas Monjalon , Konstantin Ananyev Message-ID: <20170117121825.3db48369@platinum> In-Reply-To: <20170116064819.GL9770@yliu-dev.sh.intel.com> References: <1479977798-13417-1-git-send-email-olivier.matz@6wind.com> <1479977798-13417-6-git-send-email-olivier.matz@6wind.com> <20161214072750.GK18991@yliu-dev.sh.intel.com> <20170109184625.7884290a@platinum> <20170116064819.GL9770@yliu-dev.sh.intel.com> X-Mailer: Claws Mail 3.14.1 (GTK+ 2.24.31; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH 5/5] net/virtio: fix Tso when mbuf is shared X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 Jan 2017 11:18:31 -0000 Hi Yuanhan, On Mon, 16 Jan 2017 14:48:19 +0800, Yuanhan Liu wrote: > On Mon, Jan 09, 2017 at 06:46:25PM +0100, Olivier Matz wrote: > > The virtio specifications requires that the L4 checksum is set to > > the pseudo header checksum. You can search for "pseudo header" in > > the following doc: > > http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.pdf > > > > Especially in 5.1.6.2.1, we can see that if we use the csum flag, we > > must set the checksum to phdr, and if we do tso, we must set the > > csum flag. > > > > We can check that this is really needed with Linux vhost by > > replaying the test plan described at [1]. > > > > [1] http://dpdk.org/ml/archives/dev/2016-October/048793.html > > > > If we add the following patch to disable the checksum fix (on top of > > this patchset), the test1 "large packets (lro/tso)" won't work. > > > > --- a/drivers/net/virtio/virtio_rxtx.c > > +++ b/drivers/net/virtio/virtio_rxtx.c > > @@ -224,6 +224,9 @@ > > uint32_t tmp; > > int shared = 0; > > > > + if (1) > > + return 0; > > + > > /* mbuf is write-only, we need to copy the headers in a > > linear buffer */ if (unlikely(rte_pktmbuf_data_is_shared(m, 0, > > hdrlen))) { shared = 1; > > > > > > In one direction ("flow1" in the test desc), large packets are > > transmitted from host on the ixgbe interface, and received by the > > guest. Then, testpmd bridges the packet to the virtio interface. But > > the packet is not received by the host. > > I hope I could have time to dig this further, since, honestly, I don't > quite like this patch: it makes things un-maintainable. Well, I'm not that proud of the patch, but that's the best solution I've found. Nevertheless saying it makes things un-maintainable looks a bit excessive to me :) The option of reallocating a mbuf, copy and fix network headers in it looks even more complex to me (that was my first approach). > Besides that, I think we have similar issue with nic drivers. See the > rte_net_intel_cksum_flags_prepare() function introduced at commit > 4fb7e803eb1a ("ethdev: add Tx preparation"). Yes, that was discussed a bit. See [1] and the subsequent mails. http://dpdk.org/ml/archives/dev/2016-December/051014.html My opinion is that tx_burst() should not change the mbuf data, it's always been like this. For Intel NICs, there is no issue since the DPDK API is derived from Intel NICs API, so there is no fix to do in the mbuf data. For tx_prepare(), it's explicitly said that it can update the data. If tx_prepare() becomes mandatory, it will naturally fix this issue without modifying the driver, because the phdr csum calculation will be done in tx_prepare(). An alternative is to mark this as a known issue for now, and wait until tx_prepare() is mandatory. > Cc more people here. And here is a quick background for them: NIC > drivers doing TSO need change the mbuf (say, for cksum updating), > however, as Stephen pointed out, we could not do that if the mbuf is > shared: I don't see such checks in the driver code as well. > > > There are at least 2 options for this one: > > > > - try to use 2 different descriptors (the patch is probably harder, > > and it may slow-down the case where ANY_LAYOUT is supported) > > > > - refuse to initialize with TSO enabled if ANY_LAYOUT is not > > supported. > > > > If you think ANY_LAYOUT is most likely true today, we could choose > > option 2. Let me know what's your preference here. > > Maybe we could go with a simpler one: COW. Yeah, it costs more, but > this would be rare, that it should be OKay, right? Besides, we just > need copy the heading mbuf. Could you detail what you mean by COW in this context? Do you mean reallocating a new mbuf? If yes, it's not only a problem of cost: - There is no mempool pointer associated to tx queues, so we cannot allocate a mbuf. Reusing a mempool pointer from the current mbuf looks risky, because it can be a special pool, like a pool dedicated for clones, without data. - It makes allocation error quite hard to manage, it would require some rework in tx functions. Thanks for your review and the discussion. Let me know what you think. Regards, Olivier