From: Thomas Monjalon <thomas@monjalon.net>
To: Ilya Maximets <i.maximets@ovn.org>
Cc: dev <dev@dpdk.org>,
Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>,
Gaetan Rivet <grive@u256.net>, Eli Britstein <elibr@nvidia.com>,
Ivan Malov <ivan.malov@oktetlabs.ru>,
Andrew Rybchenko <Andrew.Rybchenko@oktetlabs.ru>,
Ori Kam <orika@nvidia.com>, Ian Stokes <ian.stokes@intel.com>
Subject: Re: rte_flow API change request: tunnel info restoration is too slow
Date: Wed, 16 Mar 2022 14:46:42 +0100 [thread overview]
Message-ID: <4878373.LvFx2qVVIh@thomas> (raw)
In-Reply-To: <7c70ff8a-296b-6f4e-53c7-ad0f825b838c@ovn.org>
16/03/2022 13:25, Ilya Maximets:
> On 3/16/22 10:41, Thomas Monjalon wrote:
> > 15/03/2022 23:12, Ilya Maximets:
> >> Hi, everyone.
> >>
> >> After implementing support for tunnel offloading in OVS we faced a
> >> significant performance issue caused by the requirement to call
> >> rte_flow_get_restore_info() function on a per-packet basis.
> >>
> >> The main problem is that once the tunnel offloading is configured,
> >> there is no way for the application to tell if a particular packet
> >> was partially processed in the hardware and the tunnel info has to
> >> be restored. What we have to do right now is to call the
> >> rte_flow_get_restore_info() unconditionally for every packet. The
> >> result of that call says if we have the tunnel info or not.
> >>
> >> rte_flow_get_restore_info() call itself is very heavy. It is at
> >> least a couple of indirect function calls and the device lock
> >> on the application side (not-really-thread-safety of the rte_flow
> >> API is a separate topic). Internal info lookup inside the driver
> >> also costs a lot (depends on a driver).
> >>
> >> It has been measured that having this call on a per-packet basis can
> >> reduce application performance (OVS) by a factor of 3 in some cases:
> >> https://mail.openvswitch.org/pipermail/ovs-dev/2021-November/389265.html
> >> https://github.com/openvswitch/ovs/commit/6e50c1651869de0335eb4b7fd0960059c5505f5c
> >> (Above patch avoid the problem in a hacky way for devices that doesn't
> >> support tunnel offloading, but it's not applicable to situation
> >> where device actually supports it, since the API has to be called.)
> >>
> >> Another tricky part is that we have to call rte_flow_get_restore_info()
> >> before checking other parts of the mbuf, because mlx5 driver, for
> >> example, re-uses the mbuf.hash.fdir value for both tunnel info
> >> restoration and classification offloading, so the application has
> >> no way to tell which one is used right now and has to call the
> >> restoration API first in order to find out.
> >>
> >>
> >> What we need:
> >>
> >> A generic and fast (couple of CPU cycles) API that will clearly say
> >> if the heavy rte_flow_get_restore_info() has to be called for a
> >> particular packet or not. Ideally, that should be a static mbuf
> >> flag that can be easily checked by the application.
> >
> > A dynamic mbuf flag, defined in the API, looks to be a good idea.
>
> Makes sense. OTOH, I'm not sure what is the profit of having it
> dynamically allocated if it will need to be always defined.
True.
We need to discuss whether we can have situations where it is not registered
at all. We recently introduced a function for initial config of rte_flow,
it could be the trigger to register such flag or field.
> But, well, it doesn't really matter, I guess.
That's a detail but it should be discussed.
> >> Calls inside the device driver are way too slow for that purpose,
> >> especially if they are not fully thread-safe, or require complex
> >> lookups or calculations.
> >>
> >> I'm assuming here that packets that really need the tunnel info
> >> restoration should be fairly rare.
> >>
> >>
> >> Current state:
> >>
> >> Currently, the get_restore_info() API is implemented only for mlx5
> >> and sfc drivers, AFAICT.
> >> SFC driver is already using mbuf flag, but
> >> it's dynamic and not exposed to the application.
> >
> > What is this flag?
>
> SFC driver defines a dynamic field 'rte_net_sfc_dynfield_ft_id'
> and the corresponding flag 'rte_net_sfc_dynflag_ft_id_valid' to
> check if the field contains a valid data.
OK, so we could deprecate this flag.
> >> MLX5 driver re-uses mbuf.hash.fdir value
> >> and performs a heavy lookup inside the driver.
> >
> > We should avoid re-using a field.
>
> +1 from me, but I'm not familiar with the mlx5 driver enough
> to tell how to change it.
>
> >
> >> For now OVS doesn't support tunnel offload with DPDK formally, the
> >> code in OVS is under the experimental API ifdef and not compiled-in
> >> by default.
> >>
> >> //Let me know if there is more formal way to submit such requests.
> >
> > That's a well written request, thanks.
> > If you are looking for something more formal,
> > it could be a patch in the API.
>
> I'm not looking for now. :)
> I think we need an agreement from driver maintainers first. And
> I don't think we can introduce such API change without changing
> drivers first, because otherwise we'll end up with the 'has_vlan'
> situation and the broken offloading support.
OK
next prev parent reply other threads:[~2022-03-16 13:47 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-15 22:12 Ilya Maximets
2022-03-16 9:41 ` Thomas Monjalon
2022-03-16 12:25 ` Ilya Maximets
2022-03-16 13:46 ` Thomas Monjalon [this message]
2022-03-16 14:01 ` Ori Kam
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4878373.LvFx2qVVIh@thomas \
--to=thomas@monjalon.net \
--cc=Andrew.Rybchenko@oktetlabs.ru \
--cc=dev@dpdk.org \
--cc=elibr@nvidia.com \
--cc=grive@u256.net \
--cc=i.maximets@ovn.org \
--cc=ian.stokes@intel.com \
--cc=ivan.malov@oktetlabs.ru \
--cc=orika@nvidia.com \
--cc=sriharsha.basavapatna@broadcom.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).