DPDK patches and discussions
 help / color / mirror / Atom feed
From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
To: Olivier Matz <olivier.matz@6wind.com>
Cc: "Xia, Chenbo" <chenbo.xia@intel.com>,
	Ivan Malov <ivan.malov@oktetlabs.ru>,
	"dev@dpdk.org" <dev@dpdk.org>,
	Maxime Coquelin <maxime.coquelin@redhat.com>,
	"stable@dpdk.org" <stable@dpdk.org>,
	Yuanhan Liu <yuanhan.liu@linux.intel.com>
Subject: Re: [dpdk-dev] [PATCH v2] net/virtio: handle Tx checksums correctly for tunnel packets
Date: Mon, 18 Oct 2021 11:20:25 +0300
Message-ID: <42d842c8-d0e9-f9a3-495c-0a693f146a32@oktetlabs.ru> (raw)
In-Reply-To: <YW0oIquEX7Q52Ucb@platinum>

On 10/18/21 10:54 AM, Olivier Matz wrote:
> On Mon, Oct 18, 2021 at 10:04:46AM +0300, Andrew Rybchenko wrote:
>> On 10/15/21 11:32 AM, Olivier Matz wrote:
>>> On Thu, Oct 14, 2021 at 07:12:29AM +0000, Xia, Chenbo wrote:
>>>>> -----Original Message-----
>>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>> Sent: Friday, September 17, 2021 2:50 AM
>>>>> To: dev@dpdk.org
>>>>> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; stable@dpdk.org; Andrew
>>>>> Rybchenko <andrew.rybchenko@oktetlabs.ru>; Xia, Chenbo <chenbo.xia@intel.com>;
>>>>> Yuanhan Liu <yuanhan.liu@linux.intel.com>; Olivier Matz
>>>>> <olivier.matz@6wind.com>
>>>>> Subject: [PATCH v2] net/virtio: handle Tx checksums correctly for tunnel
>>>>> packets
>>>>>
>>>>> Tx prepare method calls rte_net_intel_cksum_prepare(), which
>>>>> handles tunnel packets correctly, but Tx burst path does not
>>>>> take tunnel presence into account when computing the offsets.
>>>>>
>>>>> Fixes: 58169a9c8153 ("net/virtio: support Tx checksum offload")
>>>>> Cc: stable@dpdk.org
>>>>>
>>>>> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>>> ---
>>>>>  drivers/net/virtio/virtqueue.h | 9 ++++++---
>>>>>  1 file changed, 6 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
>>>>> index 03957b2bd0..b83ff32efb 100644
>>>>> --- a/drivers/net/virtio/virtqueue.h
>>>>> +++ b/drivers/net/virtio/virtqueue.h
>>>>> @@ -620,19 +620,21 @@ static inline void
>>>>>  virtqueue_xmit_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *cookie)
>>>>>  {
>>>>>  	uint64_t csum_l4 = cookie->ol_flags & PKT_TX_L4_MASK;
>>>>> +	uint16_t o_l23_len = (cookie->ol_flags & PKT_TX_TUNNEL_MASK) ?
>>>>> +			     cookie->outer_l2_len + cookie->outer_l3_len : 0;
>>>>>
>>>>>  	if (cookie->ol_flags & PKT_TX_TCP_SEG)
>>>>>  		csum_l4 |= PKT_TX_TCP_CKSUM;
>>>>>
>>>>>  	switch (csum_l4) {
>>>>>  	case PKT_TX_UDP_CKSUM:
>>>>> -		hdr->csum_start = cookie->l2_len + cookie->l3_len;
>>>>> +		hdr->csum_start = o_l23_len + cookie->l2_len + cookie->l3_len;
>>>>>  		hdr->csum_offset = offsetof(struct rte_udp_hdr, dgram_cksum);
>>>>>  		hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
>>>>>  		break;
>>>>>
>>>>>  	case PKT_TX_TCP_CKSUM:
>>>>> -		hdr->csum_start = cookie->l2_len + cookie->l3_len;
>>>>> +		hdr->csum_start = o_l23_len + cookie->l2_len + cookie->l3_len;
>>>>>  		hdr->csum_offset = offsetof(struct rte_tcp_hdr, cksum);
>>>>>  		hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
>>>>>  		break;
>>>>> @@ -650,7 +652,8 @@ virtqueue_xmit_offload(struct virtio_net_hdr *hdr, struct
>>>>> rte_mbuf *cookie)
>>>>>  			VIRTIO_NET_HDR_GSO_TCPV6 :
>>>>>  			VIRTIO_NET_HDR_GSO_TCPV4;
>>>>>  		hdr->gso_size = cookie->tso_segsz;
>>>>> -		hdr->hdr_len = cookie->l2_len + cookie->l3_len + cookie->l4_len;
>>>>> +		hdr->hdr_len = o_l23_len + cookie->l2_len + cookie->l3_len +
>>>>> +			       cookie->l4_len;
>>>>>  	} else {
>>>>>  		ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
>>>>>  		ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
>>>>> --
>>>>> 2.20.1
>>>>
>>>> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
>>>>
>>>
>>> I have one comment to mention that from application perspective, it has
>>> to take care that the driver does not support outer tunnel offload (this
>>> matches the advertised capabilities). For instance, in case of a vxlan
>>> tunnel, if the outer checksum needs to be calculated, it has to be done
>>> by the application. In short, the application can ask to offload the
>>> inner part if no offload is required on the outer part.
>>
>> Yes. It is really complicated.
>>
>> Let's name below case (A).
>>
>>> Also, since grep "PKT_TX_TUNNEL" in driver/net/ixgbe gives nothing, it
>>> seems the ixgbe driver does not support the same offload request than
>>> described in this patch:
>>>   (m->ol_flags & PKT_TX_TUNNEL_MASK) == PKT_TX_TUNNEL_XXXXX
>>>   m->outer_l2_len = outer l2 length
>>>   m->outer_l3_len = outer l3 length
>>>   m->l2_len = outer l4 length + tunnel len + inner l2 len
>>>   m->l3_len = inner l3 len
>>>   m->l4_len = inner l4 len
>>
>> True.
>>
>> Let's name below case (B).
>>
>>> An alternative for doing the same (that would work with ixgbe and
>>> current virtio) is to give:
>>>   (m->ol_flags & PKT_TX_TUNNEL_MASK) == 0
>>>   m->l2_len = outer lengths + tunnel len + inner l2 len
>>>   m->l3_len = inner l3 len
>>>   m->l4_len = inner l4 len
>>
>> Possible, but I think it is a bit less friendly.
>> If application supports both cases with and without
>> outer IPv4 checksum offload, it will require more extra
>> steps from the application to remove tunnel mask and
>> collapse outer lengths into l2_len (in addition outer
>> IPv4 checksum calculation in SW).
>>
>>> I think a capability may be missing to differentiate which drivers
>>> support which mode.
>>
>> Yes, something like TX_INNER_CKSUM.
>> Some drivers already support it, some do not.
>>  > Or, all drivers could be fixed to support both modes
>>> (and this would make this patch valid).
>>
>> I disagree with the point that something is required to
>> make the patch valid. Before the patch the driver supports
>> just (B). After the patch the driver supports both (A) and (B).
>> It makes the driver to support more cases.
> 
> Yes, I misspoke, I wanted to say that today it is difficult for an
> application to use this new mode because it does not know which driver
> supports it. Fixing all drivers would solve the issue without a
> capability.
> 
> A drivers/hw supporting one mode can easily support the other one (this
> patch demonstrates it), so maybe it's not worth having a capability.

Good point. I'll keep it in mind and try to find time to
take a look at other drivers as well.

> 
> But yes, this is not a good reason to reject the patch.
> 
> So:
> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
> 
> 
>> Please, note that rte_net_intel_cksum_prepare() supports
>> both (A) and (B) as well.


  reply	other threads:[~2021-10-18  8:20 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-30 14:26 [dpdk-dev] [PATCH] " Ivan Malov
2021-09-13 20:06 ` Maxime Coquelin
2021-09-16 18:49 ` [dpdk-dev] [PATCH v2] " Ivan Malov
2021-10-14  6:45   ` Andrew Rybchenko
2021-10-14  7:12   ` Xia, Chenbo
2021-10-15  8:32     ` Olivier Matz
2021-10-18  7:04       ` Andrew Rybchenko
2021-10-18  7:54         ` Olivier Matz
2021-10-18  8:20           ` Andrew Rybchenko [this message]
2021-10-21 12:28   ` Maxime Coquelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=42d842c8-d0e9-f9a3-495c-0a693f146a32@oktetlabs.ru \
    --to=andrew.rybchenko@oktetlabs.ru \
    --cc=chenbo.xia@intel.com \
    --cc=dev@dpdk.org \
    --cc=ivan.malov@oktetlabs.ru \
    --cc=maxime.coquelin@redhat.com \
    --cc=olivier.matz@6wind.com \
    --cc=stable@dpdk.org \
    --cc=yuanhan.liu@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror http://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ http://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git