Re: [dpdk-dev] [PATCH] mbuf: move headers not fragmented check to checksum

From: Andrew Rybchenko <arybchenko@solarflare.com>
To: Olivier Matz <olivier.matz@6wind.com>
Cc: Tomasz Kulasek <tomaszx.kulasek@intel.com>, <dev@dpdk.org>,
	"Konstantin Ananyev" <konstantin.ananyev@intel.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	Ferruh Yigit <ferruh.yigit@intel.com>
Subject: Re: [dpdk-dev] [PATCH] mbuf: move headers not fragmented check to checksum
Date: Fri, 29 Mar 2019 16:30:48 +0300
Message-ID: <4a844c8d-d466-d58e-80a3-9473b1943c80@solarflare.com>
Message-ID: <20190329133048.bOku7seChiHWgW7M8WJGwDniq3_4eVBCEiXwbuEif5g@z>
In-Reply-To: <20190329130949.tjjo2e5onssvoru4@platinum>

Hi Olivier,

On 3/29/19 4:09 PM, Olivier Matz wrote:
> Hi Andrew,
>
> On Thu, Mar 28, 2019 at 08:04:31PM +0300, Andrew Rybchenko wrote:
>> Ping? (I have a number of net/sfc patches which heavily depend on this
>> one and must not be applied without this one)
>>
>> Andrew.
>>
>> On 2/19/19 9:30 AM, Andrew Rybchenko wrote:
>>> rte_validate_tx_offload() is used in Tx prepare callbacks
>>> (RTE_LIBRTE_ETHDEV_DEBUG only) to check Tx offloads consistency.
>>> The requirement that packet headers must not be fragmented is not
>>> documented, and it is unclear where it comes from, apart from
>>> rte_net_intel_cksum_prepare(), which relies on it.
>>>
>>> It could be a NIC vendor-specific driver or hardware limitation, but,
>>> if so, it should be documented and checked in the corresponding Tx
>>> prepare callbacks.
>>>
>>> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
>>> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
>>> ---
>>> Looks good to me, though extra-testing would be needed.
>>> Konstantin Ananyev <konstantin.ananyev@intel.com>
>>>
>>>    lib/librte_mbuf/rte_mbuf.h | 12 ------------
>>>    lib/librte_net/rte_net.h   | 17 +++++++++++++++++
>>>    2 files changed, 17 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>>> index d961cca..73daa81 100644
>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>> @@ -2257,23 +2257,11 @@ static inline int rte_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *tail
>>>    rte_validate_tx_offload(const struct rte_mbuf *m)
>>>    {
>>>    	uint64_t ol_flags = m->ol_flags;
>>> -	uint64_t inner_l3_offset = m->l2_len;
>>>    	/* Does packet set any of available offloads? */
>>>    	if (!(ol_flags & PKT_TX_OFFLOAD_MASK))
>>>    		return 0;
>>> -	if (ol_flags & PKT_TX_OUTER_IP_CKSUM)
>>> -		/* NB: elaborating the addition like this instead of using
>>> -		 *     += gives the result uint64_t type instead of int,
>>> -		 *     avoiding compiler warnings on gcc 8.1 at least */
>>> -		inner_l3_offset = inner_l3_offset + m->outer_l2_len +
>>> -				  m->outer_l3_len;
>>> -
>>> -	/* Headers are fragmented */
>>> -	if (rte_pktmbuf_data_len(m) < inner_l3_offset + m->l3_len + m->l4_len)
>>> -		return -ENOTSUP;
>>> -
>>>    	/* IP checksum can be counted only for IPv4 packet */
>>>    	if ((ol_flags & PKT_TX_IP_CKSUM) && (ol_flags & PKT_TX_IPV6))
>>>    		return -EINVAL;
>>> diff --git a/lib/librte_net/rte_net.h b/lib/librte_net/rte_net.h
>>> index e59760a..bd75aea 100644
>>> --- a/lib/librte_net/rte_net.h
>>> +++ b/lib/librte_net/rte_net.h
>>> @@ -118,10 +118,27 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
>>>    	struct udp_hdr *udp_hdr;
>>>    	uint64_t inner_l3_offset = m->l2_len;
>>> +	/*
>>> +	 * Does packet set any of available offloads?
>>> +	 * Mainly it is required to avoid fragmented headers check if
>>> +	 * no offloads are requested.
>>> +	 */
>>> +	if (!(ol_flags & PKT_TX_OFFLOAD_MASK))
>>> +		return 0;
>>> +
>>>    	if ((ol_flags & PKT_TX_OUTER_IP_CKSUM) ||
>>>    		(ol_flags & PKT_TX_OUTER_IPV6))
>>>    		inner_l3_offset += m->outer_l2_len + m->outer_l3_len;
>>> +	/*
>>> +	 * Check if headers are fragmented.
>>> +	 * The check could be less strict depending on which offloads are
>>> +	 * requested and headers to be used, but let's keep it simple.
>>> +	 */
>>> +	if (unlikely(rte_pktmbuf_data_len(m) <
>>> +		     inner_l3_offset + m->l3_len + m->l4_len))
>>> +		return -ENOTSUP;
>>> +
>>>    	if (ol_flags & PKT_TX_IPV4) {
>>>    		ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
>>>    				inner_l3_offset);
>
> To summarize, the previous code was in a generic part, only enabled if
> RTE_LIBRTE_ETHDEV_DEBUG is set, and it is moved to an Intel-specific part,
> but always enabled. Am I correct?

Yes, correct.

> So it may have a performance impact on Intel NICs. Shouldn't it be under
> a debug option?

Yes, to be 100% equivalent.

Maybe making these checks non-debug is a separate story, since IMHO
they should be non-debug. The code below really depends on these
checks: if the condition is violated, it will read, and could write,
outside of the provided buffer (bad checksums, corrupted memory, etc.).

I'll send v2 shortly with the checks under RTE_LIBRTE_ETHDEV_DEBUG, to
make it easy to pick up whichever version is finally chosen.
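
As a rough illustration of what "under RTE_LIBRTE_ETHDEV_DEBUG" means
(v2 itself is not part of this message, so this is only an assumed
shape), the check can be compiled in or out with the preprocessor.
sketch_cksum_prepare() and its two parameters are hypothetical and only
stand in for the real prepare function and its mbuf argument.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical prepare step: data_len is the size of the first segment,
 * hdrs_len the total length of the headers the fix-up would access.
 */
static int
sketch_cksum_prepare(uint16_t data_len, uint32_t hdrs_len)
{
#ifdef RTE_LIBRTE_ETHDEV_DEBUG
	/* Debug builds only: reject fragmented headers up front. */
	if (data_len < hdrs_len)
		return -ENOTSUP;
#else
	/* Non-debug builds: the caller must guarantee contiguity. */
	(void)data_len;
	(void)hdrs_len;
#endif

	/* ... the checksum fix-up on the headers would go here ... */
	return 0;
}
```

The check is active only when the macro is defined at compile time, for
example through the compiler's -D option. Without it the function
returns 0 even for fragmented headers, which is exactly the trade-off
discussed above.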

Thanks,
Andrew.

Thread overview:
2019-01-29  8:49 [dpdk-dev] [RFC PATCH] " Andrew Rybchenko
2019-02-13  9:50 ` Andrew Rybchenko
2019-02-13 14:48   ` Wiles, Keith
2019-02-13 23:27 ` Ananyev, Konstantin
2019-02-19  6:30 ` [dpdk-dev] [PATCH] " Andrew Rybchenko
2019-03-28 17:04   ` Andrew Rybchenko
2019-03-29 13:09     ` Olivier Matz
2019-03-29 13:30       ` Andrew Rybchenko [this message]
2019-03-29 13:42 ` [dpdk-dev] [PATCH v2] " Andrew Rybchenko
2019-03-29 14:18   ` Olivier Matz
2019-04-02 14:48     ` Thomas Monjalon
