From: Thomas Monjalon <thomas@monjalon.net>
To: Gavin Li <gavinl@nvidia.com>,
orika@nvidia.com, andrew.rybchenko@oktetlabs.ru,
Ferruh Yigit <ferruh.yigit@amd.com>
Cc: dev@dpdk.org, jiaweiw@nvidia.com
Subject: Re: [RFC V1 1/1] net: extend VXLAN header to support more extensions
Date: Fri, 09 Feb 2024 14:44:32 +0100 [thread overview]
Message-ID: <4126957.6PsWsQAL7t@thomas> (raw)
In-Reply-To: <59a69287-cde3-4e9e-8174-c454f964fe5e@amd.com>
09/02/2024 13:11, Ferruh Yigit:
> On 2/9/2024 10:12 AM, Thomas Monjalon wrote:
> > 09/02/2024 00:54, Ferruh Yigit:
> >> On 1/30/2024 11:25 AM, Gavin Li wrote:
> >>> Currently, DPDK supports VXLAN and VXLAN-GPE with similar header
> >>> structures, and we are working on adding support for VXLAN-GBP, which is
> >>> another extension to VXLAN. More extensions of VXLAN may be added in the
> >>> future.
> >>>
> >>> VXLAN and VXLAN-GBP use the same UDP port (4789) while VXLAN-GPE uses a
> >>> different one, 4790. The three protocols have the same header length and
> >>> an overall similar header structure, as shown below.
> >>>  0                   1                   2                   3
> >>>  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |R|R|R|R|I|R|R|R|            Reserved                           |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |                VXLAN Network Identifier (VNI) |   Reserved    |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>>
> >>> Figure 1: VXLAN Header
> >>>
> >>>  0                   1                   2                   3
> >>>  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |R|R|Ver|I|P|B|O|           Reserved            |Next Protocol  |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |                VXLAN Network Identifier (VNI) |   Reserved    |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>>
> >>> Figure 2: VXLAN-GPE Header
> >>>
> >>>  0                   1                   2                   3
> >>>  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |G|R|R|R|I|R|R|R|R|D|R|R|A|R|R|R|        Group Policy ID        |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |                VXLAN Network Identifier (VNI) |   Reserved    |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>>
> >>> Figure 3: VXLAN-GBP Extension
> >>>
> >>> Both VXLAN-GPE and VXLAN-GBP extend VXLAN by redefining its reserved
> >>> bits, which means the packets can be processed with the same pattern and
> >>> most of the code can be reused. Instead of adding more new items by
> >>> copying/pasting code for future VXLAN extensions, it’s better
> >>> to use the existing VXLAN infrastructure and add support code to it.
> >>>
> >>
> >> Hi Gavin,
> >>
> >> The motivation is to prevent code duplication, and the code mentioned is
> >> the driver code, right?
> >
> > The motivation is mainly to provide a unified and more explicit API.
> >
>
> From the user perspective, I think the existing approach is more explicit,
> because it sets VXLAN or VXLAN_GPE flow types.
>
> I am trying to understand the benefit: how does unifying the flow type in
> the API help the user?
>
> >> Overall OK to unify "struct rte_vxlan_hdr", although it makes the struct
> >> a little complex; perhaps we can consider extracting some nested structs
> >> as named structs, no strong opinion.
> >>
> >>
> >> But I am not sure about removing the flow item type for VXLAN-GPE, or not
> >> adding one for VXLAN-GBP.
> >>
> >> Think about a case where the user adds a rule with item type VXLAN
> >> and some bits set in the protocol header, let's say the last byte of the
> >> first word is set; how will the driver know whether to take it as the GPE
> >> "next protocol" or the GBP "group policy id"?
> >
> > The driver may decide depending on the UDP port and some distinguishing flags.
> > If you want to match on GBP, you should include the GBP flag in your pattern;
> > no need to use a separate item.
> >
>
> Why not be more explicit?
> It helps the driver to know more about the pattern so that it can create a
> proper flow rule. If there is an obvious way for the driver to differentiate
> these protocol extensions and the flow item type is redundant, I can
> understand the proposal, but otherwise I think it is better to keep flow
> items for the extensions.
In any case we need the simple VXLAN item.
If we have specialized GPE and GBP items,
what does a match on the simple VXLAN item mean?
Does it include packets with other extensions or exclude them?
Matching the bits in the protocol makes such a decision explicit.
> When a rule is set in HW, HW may not care about the protocol, as long as
> bits in the rule match with the packet, HW can apply the action.
> But for driver to be able to set the rule properly, it needs more
> explicit information.
Yes, the information is in the pattern to match.
> Let's assume the driver API gets a pattern with 'RTE_FLOW_ITEM_TYPE_VXLAN'
> type and "struct rte_flow_item_vxlan"; at this point the driver doesn't know
> if it is VXLAN or any of the extensions.
Yes it knows, because of the matched bits in the pattern.
If the rule specifies a match on the GBP flag = 1, it is GBP only.
If the rule specifies a match on the GBP flag = 0, it excludes GBP.
If the rule does not mask the GBP flag, it includes GBP.
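
For illustration, a rough sketch of such a rule with the current API
(using the byte-level fields of struct rte_flow_item_vxlan; the 0x80 value
assumes the G bit is the most significant bit of the first byte, as in the
GBP draft, and field names may differ slightly across DPDK versions):

	/* Match only VXLAN-GBP packets: G flag set in both spec and mask. */
	struct rte_flow_item_vxlan vxlan_spec = { .flags = 0x80 };
	struct rte_flow_item_vxlan vxlan_mask = { .flags = 0x80 };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV4 },
		{ .type = RTE_FLOW_ITEM_TYPE_UDP },	/* optionally constrain dst port 4789 */
		{ .type = RTE_FLOW_ITEM_TYPE_VXLAN,
		  .spec = &vxlan_spec, .mask = &vxlan_mask },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	/* The same mask with .flags = 0x00 in the spec excludes GBP;
	 * leaving the bit out of the mask includes GBP packets. */
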
> Should the driver go and check the GBP flags to deduce if it is the GBP flag, and
Yes
> what if they are all zero? How can the driver tell whether this is the GBP
> flag that is zero, or it is the VXLAN header and the bit is reserved?
As explained above, value & mask does the matching.
> Or I will just make up a sample; it may not be accurate, but please take
> it as a case to prove a point.
> Let's assume the driver API again gets a "struct rte_flow_item_vxlan" whose
> last bit of the first word is set; which union in the struct will the driver
> check to detect this, the GPE one or the GBP one?
It depends on the G flag.
> This can be the GPE "next protocol" or the GBP "group policy id"; if the
> driver knows this, it can set the mask better for the field in the rule.
> The driver may try to deduce this extension information from other
> information, but why not a proper, specific flow type for each extension
> instead?
I believe relying on bit matching is a cleaner approach
than creating a new item each time a protocol has a variation.
And it makes the include/exclude policy easier to understand,
thanks to the value/mask logic.
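
To illustrate the driver side (pseudo-code only, not taken from any real PMD;
the bit values assume G = 0x80 from the GBP draft and P = 0x04 from the GPE
specification), given the VXLAN item of the pattern:

	const struct rte_flow_item_vxlan *spec = item->spec;
	const struct rte_flow_item_vxlan *mask = item->mask;

	if (mask != NULL && (mask->flags & 0x80)) {
		/* The rule constrains the G bit: GBP is explicitly
		 * included (spec bit = 1) or excluded (spec bit = 0). */
	} else if (mask != NULL && (mask->flags & 0x04)) {
		/* The rule constrains the P bit (outer UDP port 4790):
		 * the last byte of the first word is the GPE next protocol. */
	} else {
		/* No extension bit constrained: plain VXLAN matching,
		 * packets with extensions are included. */
	}
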
> >> And if a specific HW detects VXLAN-GPE and VXLAN-GBP as two
> >> separate protocols, won't having only the capability to describe the
> >> VXLAN protocol in SW be a limitation?
> >
> > I think the driver may know based on the flow rule.
> >
> > That's a good question about the item granularity.
> > What is the separation between a different protocol and protocol options?
> > How should a flow item match protocol variations?
> > My current thinking is that we should minimize the number of items.
> >
> >> If the target is to remove code duplication in the driver, can it be
> >> accomplished by extracting code that parses the common fields of these
> >> protocols?
> >
> > Driver code is not a concern as far as any driver can implement the API.
> >
>
> Got it, I want to clarify this one.
> Where does the code duplication via copy/paste that the commit log
> mentions occur?
>
> >
> >>> In this patch, all the VXLAN extension headers will be merged with VXLAN
> >>> as a union if the overlapping field has a different format among protocols.
> >>> The existing VXLAN-GPE will be marked as deprecated, and new extensions of
> >>> VXLAN should be added to the VXLAN item instead of as new RTE items.
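
(For illustration only -- this is not the actual patch content; the gpe_*/gbp_*
field names below are made up for this sketch. The idea of overlaying the
extension views on the reserved bits could look roughly like this:)

	struct rte_vxlan_hdr {
		union {
			/* RFC 7348 view */
			struct {
				uint8_t flags;		/* |R|R|R|R|I|R|R|R| */
				uint8_t rsvd0[3];
			};
			/* VXLAN-GPE view */
			struct {
				uint8_t gpe_flags;	/* |R|R|Ver|I|P|B|O| */
				uint8_t gpe_rsvd[2];
				uint8_t gpe_proto;	/* next protocol */
			};
			/* VXLAN-GBP view */
			struct {
				rte_be16_t gbp_flags;	/* G, D, A bits */
				rte_be16_t gbp_policy_id;
			};
			rte_be32_t vx_flags;
		};
		union {
			struct {
				uint8_t vni[3];
				uint8_t rsvd1;
			};
			rte_be32_t vx_vni;
		};
	} __rte_packed;
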
> >>>
> >>> Signed-off-by: Gavin Li <gavinl@nvidia.com>
> >>> ---
> >>> doc/guides/rel_notes/deprecation.rst | 5 +++
> >>> lib/ethdev/rte_flow.h | 13 +++++-
> >>> lib/net/rte_vxlan.h | 67 ++++++++++++++++++++++++++--
> >>> 3 files changed, 80 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> >>> index 81b93515cb..f9cf931b77 100644
> >>> --- a/doc/guides/rel_notes/deprecation.rst
> >>> +++ b/doc/guides/rel_notes/deprecation.rst
> >>> @@ -95,6 +95,11 @@ Deprecation Notices
> >>> - ``rte_flow_item_pppoe``
> >>> - ``rte_flow_item_pppoe_proto_id``
> >>>
> >>> +* ethdev: The flow item ``RTE_FLOW_ITEM_TYPE_VXLAN_GPE`` is replaced with ``RTE_FLOW_ITEM_TYPE_VXLAN``.
> >>> + The item ``RTE_FLOW_ITEM_TYPE_VXLAN_GPE``, the struct ``rte_flow_item_vxlan_gpe``, its mask ``rte_flow_item_vxlan_gpe_mask``,
> >>> + and the header struct ``rte_vxlan_gpe_hdr`` with the macro ``RTE_ETHER_VXLAN_GPE_HLEN``
> >>> + will be removed in DPDK 25.11.
> >>> +
> >>>
> >>
> >> We have 24.11 in between.
> >
> > Isn't it too soon?
> > Should we remove it at all?
Thread overview: 14+ messages
2024-01-30 11:25 [RFC V1 0/1] " Gavin Li
2024-01-30 11:25 ` [RFC V1 1/1] " Gavin Li
2024-02-06 22:51 ` Thomas Monjalon
2024-02-07 4:49 ` Ajit Khaparde
2024-02-08 23:54 ` Ferruh Yigit
2024-02-09 10:12 ` Thomas Monjalon
2024-02-09 12:11 ` Ferruh Yigit
2024-02-09 13:44 ` Thomas Monjalon [this message]
2024-02-09 14:58 ` Ferruh Yigit
2024-02-09 15:32 ` Thomas Monjalon
2024-02-09 15:58 ` Ferruh Yigit
2024-02-19 3:16 ` Gavin Li
2024-02-19 3:44 ` Gavin Li
2024-02-19 4:03 ` Gavin Li