DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Bruce Richardson" <bruce.richardson@intel.com>
Cc: "Vladimir Medvedkin" <medvedkinv@gmail.com>,
	"Dengdui Huang" <huangdengdui@huawei.com>, <dev@dpdk.org>,
	<stephen@networkplumber.org>, <thomas@monjalon.net>,
	<aman.deep.singh@intel.com>, <lihuisong@huawei.com>,
	<fengchengwen@huawei.com>, <liuyonglong@huawei.com>
Subject: RE: [PATCH] net: support VLAN stacking packet type parsing
Date: Wed, 9 Jul 2025 15:50:26 +0200	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35E9FDAA@smartserver.smartshare.dk> (raw)
In-Reply-To: <aG5WD8YfiZyRiURS@bricha3-mobl1.ger.corp.intel.com>

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Wednesday, 9 July 2025 13.44
> 
> On Tue, Jul 08, 2025 at 04:15:36PM +0100, Bruce Richardson wrote:
> > On Tue, Jul 08, 2025 at 05:07:05PM +0200, Morten Brørup wrote:
> > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > Sent: Tuesday, 8 July 2025 12.16
> > > >
> > > > On Tue, Jul 08, 2025 at 12:00:42AM +0200, Morten Brørup wrote:
> > > > >    From: Vladimir Medvedkin [mailto:medvedkinv@gmail.com]
> > > > >    Sent: Monday, 7 July 2025 22.10
> > > > >
> > > > >
> > > > >    Hi Morten, all,
> > > > >
> > > > >
> > > > >
> > > > >    пн, 7 июл. 2025 г. в 19:09, Morten Brørup
> > > > >    <[1]mb@smartsharesystems.com>:
> > > > >
> > > > >      > From: Bruce Richardson
> [mailto:[2]bruce.richardson@intel.com]
> > > > >      > Sent: Friday, 4 July 2025 13.32
> > > > >      > Hi all,
> > > > >      >
> > > > >      > this email discussion comes at a bit of a fortunate time
> for
> > > > me,
> > > > >      as I'm
> > > > >      > currently looking at our vlan tag/qinq stripping
> behaviour in
> > > > our
> > > > >      Intel
> > > > >      > NIC
> > > > >      > drivers, and there is some discussion internally as to
> what our
> > > > >      driver
> > > > >      > behaviour should be compared to what it has historically
> been.
> > > > :-)
> > > > >      >
> > > > >      > The documentation - both in the NIC guide [1] and the
> testpmd
> > > > >      guide [2]
> > > > >      > -
> > > > >      > is rather short on detail as to what exactly the
> behaviour
> > > > should
> > > > >      be
> > > > >      > when
> > > > >      > vlan strip or qinq strip is implemented. Therefore, I'd
> hope
> > > > that
> > > > >      those
> > > > >      > more familiar with networking than me would be able to
> help
> > > > >      clarify
> > > > >      > things
> > > > >      > so we can document the correct behaviour precisely - and
> > > > hopefully
> > > > >      test
> > > > >      > our
> > > > >      > drivers against it in future!
> > > > >      >
> > > >
> > > > <snipping full discussion for brevity>
> > > >
> > > > So from the discussion, would the following be a good set of
> guidelines
> > > > to
> > > > document for correct driver behaviour:
> > > >
> > > > * VLAN-strip always strips one VLAN tag if available. If multiple
> VLAN
> > > > tags
> > > >   are present, it strips the outer.
> > >
> > > Yes.
> > >
> > > > * QinQ strip, strips two VLAN tags if present. If only one tag is
> > > > present it
> > > >   behaves as VLAN-strip.
> > >
> > > Yes.
> > >
> > > > * Specifying both VLAN-strip and QinQ strip is the same as QinQ
> strip
> > > > alone (??)
> > >
> > > Yes.
> > >
> > > One more thought about this:
> > > With QinQ stripping enabled, an mbuf for a single VLAN tagged packet
> will have the RX_VLAN and RX_VLAN_STRIPPED flags set.
> > > So, considering this output, it would be reasonable requiring that
> VLAN stripping is also enabled when QinQ stripping is enabled.
> > >
> > > On the other hand, that might be a new requirement for applications.
> > > So backwards compatibility speaks for making the VLAN stripping
> configuration "don't care" when the QinQ stripping configuration is
> enabled.
> > >
> > > >
> > > > Mbuf reporting behaviour:
> > > >
> > > > Input Traffic		VLAN-strip on		QinQ strip on
> > > > --------------		-------------		-------------
> > > > Single VLAN pkts		Tag in mbuf->vlan_tci	Tag in mbuf-
> >vlan_tci
> > > >
> > > > Double VLAN pkts		Outer tag in vlan_tci	Outer tag in
> vlan_tci_outer
> > > > 								Inner tag in
> vlan_tci
> > > >
> > > >
> > > > Does the above seem reasonable and correct?
> > >
> > > Yes.
> > >
> > > BTW: I think that having double tagged packets on a link configured
> for single VLAN stripping is weird.
> > > But describing the expected behavior is good!
> > >
> > > >
> > > > My one (minor)concern would be the handling and placement of the
> single
> > > > tag
> > > > in the QinQ case. Depending on how the hardware treats a single
> tag in
> > > > that
> > > > mode, the data path may have to pay a penalty if the HW takes a
> single
> > > > VLAN
> > > > and places it in the "outer" position in the descriptor, since it
> needs
> > > > to
> > > > go in the "inner" position in the mbuf, necessitating some
> conditional
> > > > logic.  AFAIK (subject to me actually testing for confirmation),
> this
> > > > will
> > > > be the case for our Intel drivers.
> > >
> > > If that is the case, then maybe someone already thought about this
> when designing the NIC HW, and came to a different conclusion than I
> did.
> > >
> > > Inspired by Vladimir's comments about QinQ packets with an S-TAG
> (Subscriber tag) and no C-TAG (Customer tag), maybe we should consider
> the alternative:
> > > When configured for QinQ stripping, the first tag is always
> considered the S-TAG, and thus always goes to the vlan_tci_outer, and
> the second (inner) tag is optional, and goes to the vlan_tci if present.
> > >
> > > <feature creep>
> > > When configured for QinQ stripping, we could control the single VLAN
> tag placement through the VLAN stripping configuration:
> > > With VLAN stripping also enabled, the link is considered a "super
> hybrid", and the VLAN ID of single VLAN tagged packets goes into
> vlan_tci (being a normal VLAN tagged packet).
> > > With VLAN stripping disabled, the link is considered a pure QinQ
> trunk, and the VLAN ID of single VLAN tagged packets goes into
> vlan_tci_outer (being the S-TAG of a QinQ packet with no C-TAG).
> > > </feature creep>
> > >
> > > But again, this only works when VLAN/QinQ stripping is enabled. Into
> which fields the various VLAN tags are written when VLAN/QinQ stripping
> is disabled will be fixed/hardcoded.
> > > I'm not even sure the vlan_tci and vlan_tci_outer fields are filled
> when stripping is disabled. But there are separate mbuf flags for VLAN
> presence and VLAN stripped (and QinQ presence and QinQ stripped), so it
> could be supported.
> > >
> > > Furthermore, in favor of my original suggestion, consider TX:
> > > TX of an mbuf with a single VLAN tag doesn't know about QinQ with no
> C-TAG, and thus looks at vlan_tci when transmitting such packets.
> > > If we like symmetry, RX should behave similarly.
> > >
> > > I'm leaning towards my original suggestion, but I don't have a
> strong opinion about this.
> > > There are good arguments for both solutions, either vlan_tci or
> vlan_tci_outer for single VLAN tagged packets received on a link
> configured for QinQ.
> > >
> > I'd tend toward your original suggestion too, where single vlan always
> gets
> > put in vlan_tci field if stripped.
> >
> > I think if we were to go back and change things is would be to not
> have
> > "vlan_tci_outer" field, but have a "vlan_tci_2" field, where the
> *second* vlan
> > tag, if present, is put. That would mean that the behaviour would be
> > unambiguous and the field would only ever be under consideration for
> > double-vlan packets with QinQ strip enabled. Sadly, I think that would
> be
> > too big a change to make to the mbuf at this stage. :-(
> >
> 
> So, I started looking into trying to clarify things a bit in the docs,
> and
> in the process of doing so found [1] in the mbuf docs.
> 
>  * - If both RTE_MBUF_F_RX_QINQ_STRIPPED and RTE_MBUF_F_RX_VLAN_STRIPPED
> are
>  *   set, the 2 VLANs have been stripped by the hardware and their TCIs
> are
>  *   saved in mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer).
>  * - If RTE_MBUF_F_RX_QINQ_STRIPPED is set and
> RTE_MBUF_F_RX_VLAN_STRIPPED
>  *   is unset, only the outer VLAN is removed from packet data, but both
> tci
>  *   are saved in mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer
> (outer).
> 
> This documented behaviour seems to contradict our proposal above.
> However,
> is this the current validated behaviour of our NIC drivers? It also
> seems
> to make QINQ_STRIP the equivalent of VLAN strip if just one is
> specified. I
> wonder if the original intent when QinQ support was added was that vlan
> strip would strip tags with type 0x8100, and QinQ strip would strip tags
> with 0x88a8.

That interpretation seems likely.
It would also be symmetrical for TX.
The VLAN flag strips/inserts one VLAN tag with EtherType 0x8100 to/from mbuf->vlan_tci.
The QinQ flag strips/inserts one VLAN tag with EtherType 0x88a8 to/from mbuf->vlan_tci_outer.

This also allows transmitting a QinQ packet with no C-TAG, i.e. a packet with a single Ethertype 0x88a8 VLAN tag (and no inner VLAN tag).

But it doesn't support VLAN stacking, i.e. with the outer QinQ tag using EtherType 0x8100.

In another thread, Stephen made a point about the APIs reflecting hardware support.
If no NIC hardware supports VLAN stacking, there is no point in designing our API for it.

So, maybe we should give up on hardware offloaded VLAN stacking? Our proposal only supports it for RX anyway.

Or, we could extend the API with another set of flags for double-tagged packets with EtherType 0x8100 in the outer tag. (We can reuse the mbuf->vlan_tci_outer field for the outer VLAN ID.)

> 
> /Bruce
> 
> [1]
> https://doc.dpdk.org/api/rte__mbuf__core_8h.html#af720eb399a97e49e5e2195
> 9afa307a0f

  reply	other threads:[~2025-07-09 13:50 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-03  9:30 Dengdui Huang
2025-07-04 10:18 ` Morten Brørup
2025-07-04 11:32   ` Bruce Richardson
2025-07-07 18:08     ` Morten Brørup
2025-07-07 20:10       ` Vladimir Medvedkin
2025-07-07 22:00         ` Morten Brørup
2025-07-08 10:16           ` Bruce Richardson
2025-07-08 15:07             ` Morten Brørup
2025-07-08 15:15               ` Bruce Richardson
2025-07-09 11:44                 ` Bruce Richardson
2025-07-09 13:50                   ` Morten Brørup [this message]
2025-07-08 12:25           ` Medvedkin, Vladimir

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35E9FDAA@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=aman.deep.singh@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=fengchengwen@huawei.com \
    --cc=huangdengdui@huawei.com \
    --cc=lihuisong@huawei.com \
    --cc=liuyonglong@huawei.com \
    --cc=medvedkinv@gmail.com \
    --cc=stephen@networkplumber.org \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).