From: Thomas Monjalon <thomas@monjalon.net>
To: Ilya Maximets <i.maximets@ovn.org>
Cc: dev <dev@dpdk.org>,
Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>,
Gaetan Rivet <grive@u256.net>, Eli Britstein <elibr@nvidia.com>,
Ivan Malov <ivan.malov@oktetlabs.ru>,
Andrew Rybchenko <Andrew.Rybchenko@oktetlabs.ru>,
Ori Kam <orika@nvidia.com>, Ian Stokes <ian.stokes@intel.com>
Subject: Re: rte_flow API change request: tunnel info restoration is too slow
Date: Wed, 16 Mar 2022 10:41:20 +0100
Message-ID: <6043769.DvuYhMxLoT@thomas>
In-Reply-To: <5248c2ca-f2a6-3fb0-38b8-7f659bfa40de@ovn.org>
15/03/2022 23:12, Ilya Maximets:
> Hi, everyone.
>
> After implementing support for tunnel offloading in OVS, we faced a
> significant performance issue caused by the requirement to call the
> rte_flow_get_restore_info() function on a per-packet basis.
>
> The main problem is that once the tunnel offloading is configured,
> there is no way for the application to tell if a particular packet
> was partially processed in the hardware and the tunnel info has to
> be restored. What we have to do right now is to call
> rte_flow_get_restore_info() unconditionally for every packet. The
> result of that call tells us whether we have the tunnel info or not.
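
For illustration, a minimal sketch (in C; the helper name and its
parameters are hypothetical) of the receive path this forces on the
application today:

    #include <rte_mbuf.h>
    #include <rte_flow.h>

    /* Hypothetical helper called on every received burst. */
    static void
    restore_all(uint16_t port_id, struct rte_mbuf **pkts, uint16_t nb_rx)
    {
        struct rte_flow_restore_info info;
        struct rte_flow_error error;
        uint16_t i;

        for (i = 0; i < nb_rx; i++) {
            /* Heavy driver call, made unconditionally per packet. */
            if (rte_flow_get_restore_info(port_id, pkts[i],
                                          &info, &error) == 0) {
                /* Partially processed in HW: apply 'info' to
                 * the packet's tunnel metadata. */
            }
        }
    }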
>
> The rte_flow_get_restore_info() call itself is very heavy. It
> involves at least a couple of indirect function calls and taking
> the device lock on the application side (the not-really-thread-safe
> nature of the rte_flow API is a separate topic). The internal info
> lookup inside the driver also costs a lot (depending on the driver).
>
> It has been measured that having this call on a per-packet basis can
> reduce application performance (OVS) by a factor of 3 in some cases:
> https://mail.openvswitch.org/pipermail/ovs-dev/2021-November/389265.html
> https://github.com/openvswitch/ovs/commit/6e50c1651869de0335eb4b7fd0960059c5505f5c
> (The above patch avoids the problem in a hacky way for devices that
> don't support tunnel offloading, but it's not applicable to the case
> where the device actually supports it, since the API has to be called.)
>
> Another tricky part is that we have to call rte_flow_get_restore_info()
> before checking other parts of the mbuf, because the mlx5 driver, for
> example, re-uses the mbuf.hash.fdir value for both tunnel info
> restoration and classification offloading, so the application has
> no way to tell which one is in use and has to call the
> restoration API first in order to find out.
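
To make that ambiguity concrete, a hypothetical sketch (handle_mark()
is an invented application callback; the interpretation in the
comments follows the description above):

    #include <rte_mbuf.h>
    #include <rte_flow.h>

    static void handle_mark(uint32_t mark); /* hypothetical */

    static void
    classify(uint16_t port_id, struct rte_mbuf *pkt)
    {
        struct rte_flow_restore_info info;
        struct rte_flow_error error;

        /* mbuf.hash.fdir may carry either the driver's tunnel-restore
         * key or the application's MARK value; the mbuf alone cannot
         * tell them apart. */
        if (rte_flow_get_restore_info(port_id, pkt, &info, &error) == 0) {
            /* hash.fdir belonged to the driver; ignore it. */
        } else {
            /* Only now is it safe to read it as a MARK value. */
            handle_mark(pkt->hash.fdir.hi);
        }
    }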
>
>
> What we need:
>
> A generic and fast (a couple of CPU cycles) API that will clearly
> say whether the heavy rte_flow_get_restore_info() has to be called
> for a particular packet or not. Ideally, that should be a static
> mbuf flag that can be easily checked by the application.
A dynamic mbuf flag, defined in the API, looks to be a good idea.
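For illustration, the fast path with a registered dynamic flag could
look like the sketch below (the dynflag registration name is purely
hypothetical; the real name would have to be defined by the rte_flow
API):

    #include <stdbool.h>
    #include <stdint.h>
    #include <rte_mbuf.h>
    #include <rte_mbuf_dyn.h>

    static uint64_t restore_flag; /* resolved once at startup */

    static int
    init_restore_flag(void)
    {
        /* "rte_flow_restore_info" is a made-up registration name. */
        int bit = rte_mbuf_dynflag_lookup("rte_flow_restore_info", NULL);

        if (bit < 0)
            return -1;
        restore_flag = UINT64_C(1) << bit;
        return 0;
    }

    /* Per packet: a single AND and branch instead of a driver call. */
    static inline bool
    needs_restore(const struct rte_mbuf *m)
    {
        return (m->ol_flags & restore_flag) != 0;
    }

The lookup cost is paid once at initialization; only packets with the
flag set would then pay for the heavy rte_flow_get_restore_info() call.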
> Calls inside the device driver are way too slow for that purpose,
> especially if they are not fully thread-safe or require complex
> lookups or calculations.
>
> I'm assuming here that packets that really need the tunnel info
> restoration should be fairly rare.
>
>
> Current state:
>
> Currently, the get_restore_info() API is implemented only in the
> mlx5 and sfc drivers, AFAICT.
> The SFC driver is already using an mbuf flag, but
> it's dynamic and not exposed to the application.
What is this flag?
> The MLX5 driver re-uses the mbuf.hash.fdir value
> and performs a heavy lookup inside the driver.
We should avoid re-using a field.
> For now, OVS doesn't formally support tunnel offload with DPDK; the
> code in OVS is under the experimental API ifdef and is not compiled
> in by default.
>
> // Let me know if there is a more formal way to submit such requests.
That's a well written request, thanks.
If you are looking for something more formal,
it could be a patch in the API.