DPDK patches and discussions
From: Tudor Cornea <tudor.cornea@gmail.com>
To: ferruh.yigit@intel.com
Cc: linville@tuxdriver.com, stephen@networkplumber.org,
	andrew.rybchenko@oktetlabs.ru, thomas@monjalon.net,
	jerinj@marvell.com, dev@dpdk.org,
	Tudor Cornea <tudor.cornea@gmail.com>
Subject: [dpdk-dev] [PATCH v3] net/af_packet: reinsert the stripped vlan tag
Date: Fri, 24 Sep 2021 14:44:45 +0300
Message-ID: <1632483885-84732-1-git-send-email-tudor.cornea@gmail.com>
In-Reply-To: <1631091558-63337-1-git-send-email-tudor.cornea@gmail.com>

The af_packet pmd binds to a raw socket and sends and receives
packets through the kernel.

Since commit [1], the kernel strips the vlan tags early in
__netif_receive_skb_core(), so we receive untagged packets while
running with the af_packet pmd.

Luckily for us, the skb vlan-related fields are still populated from the
stripped vlan tags, so we end up having all the information
that we need in the mbuf.
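
For reference, the 16-bit TCI value that ends up in mbuf->vlan_tci packs
three 802.1Q fields. Below is a minimal standalone sketch of how it
decodes. The struct and function names are made up for illustration and
are not part of DPDK or of this patch; the value is assumed to be in
host byte order.

```c
#include <stdint.h>

/* Hypothetical container for the three 802.1Q TCI fields. */
struct vlan_tci_fields {
	uint8_t pcp;   /* priority code point, 3 bits */
	uint8_t dei;   /* drop eligible indicator, 1 bit */
	uint16_t vid;  /* VLAN identifier, 12 bits */
};

/* Split a host-order TCI value into its fields (illustrative helper). */
static struct vlan_tci_fields
vlan_tci_decode(uint16_t tci)
{
	struct vlan_tci_fields f;

	f.pcp = (uint8_t)(tci >> 13);
	f.dei = (uint8_t)((tci >> 12) & 0x1);
	f.vid = (uint16_t)(tci & 0x0fff);
	return f;
}
```

An application would call something like vlan_tci_decode(mbuf->vlan_tci)
after checking that PKT_RX_VLAN is set in ol_flags.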

Having the pmd driver support DEV_RX_OFFLOAD_VLAN_STRIP allows the
application to control the desired vlan stripping behavior,
until we have a way to describe offloads that can't be disabled by
pmd drivers.

This patch changes how the af_packet pmd treats received vlan-tagged
frames by default. Previously, the application was required to check
the PKT_RX_VLAN_STRIPPED flag. After this patch, the pmd re-inserts the
vlan tag transparently to the user, unless DEV_RX_OFFLOAD_VLAN_STRIP is
enabled in rxmode.offloads.
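
To illustrate what the re-insertion amounts to at the byte level, here
is a rough standalone sketch that works on a plain buffer instead of an
mbuf. It is not the code in this patch (the patch calls
rte_vlan_insert()); vlan_reinsert() and its parameters are hypothetical
names, and the caller is assumed to guarantee `headroom` writable bytes
in front of the frame.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN 6
#define VLAN_HDR_LEN 4  /* 2 bytes TPID + 2 bytes TCI */

/*
 * Hypothetical helper: put an 802.1Q tag back into an untagged frame.
 * The two MAC addresses are moved 4 bytes into the headroom and the tag
 * is written into the gap in front of the original EtherType.
 * Returns the new start of the frame, or NULL if it cannot be done.
 */
static uint8_t *
vlan_reinsert(uint8_t *frame, size_t len, size_t headroom, uint16_t tci)
{
	uint8_t *tagged;

	if (headroom < VLAN_HDR_LEN || len < 2 * ETH_ADDR_LEN)
		return NULL;

	tagged = frame - VLAN_HDR_LEN;
	/* Regions overlap, so memmove() rather than memcpy(). */
	memmove(tagged, frame, 2 * ETH_ADDR_LEN);

	/* TPID 0x8100 and TCI, both in network byte order. */
	tagged[12] = 0x81;
	tagged[13] = 0x00;
	tagged[14] = (uint8_t)(tci >> 8);
	tagged[15] = (uint8_t)(tci & 0xff);

	return tagged;
}
```

The frame grows by 4 bytes, so the caller also has to bump its length
fields. With DEV_RX_OFFLOAD_VLAN_STRIP enabled this step is skipped and
the frame is delivered untagged, with the TCI only in the metadata.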

I've attempted a preliminary benchmark to understand if the change could
cause a sizable performance hit.

Setup:
Two virtual machines running on top of an ESXi hypervisor

Tx: DPDK app (running on top of vmxnet3 PMD)
Rx: af_packet (running on top of a kernel vmxnet3 interface)
Packet size: 68 bytes (packet contains a vlan tag)

Rates:
Tx - 1.419 Mpps
Rx (without vlan insertion) - 1227636 pps
Rx (with vlan insertion)    - 1220081 pps

At first glance, the packet rate does not degrade much: the Rx rate with
vlan insertion is about 0.6% lower (1220081 vs. 1227636 pps).

[1] https://github.com/torvalds/linux/commit/bcc6d47903612c3861201cc3a866fb604f26b8b2

Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>

---
v3:
* Updated release note and documentation
* Updated commit with performance measurements
v2:
* Added DEV_RX_OFFLOAD_VLAN_STRIP to rxmode->offloads
---
 doc/guides/nics/af_packet.rst             | 38 +++++++++++++++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst    |  4 ++++
 drivers/net/af_packet/rte_eth_af_packet.c | 12 ++++++++++
 3 files changed, 54 insertions(+)

diff --git a/doc/guides/nics/af_packet.rst b/doc/guides/nics/af_packet.rst
index efd6f1c..97d5502 100644
--- a/doc/guides/nics/af_packet.rst
+++ b/doc/guides/nics/af_packet.rst
@@ -65,3 +65,41 @@ framecnt=512):
 .. code-block:: console
 
     --vdev=eth_af_packet0,iface=tap0,blocksz=4096,framesz=2048,framecnt=512,qpairs=1,qdisc_bypass=0
+
+Features and Limitations of the af_packet PMD
+---------------------------------------------
+
+Since the following commit, the Linux kernel strips the vlan tag
+
+.. code-block:: console
+
+    commit bcc6d47903612c3861201cc3a866fb604f26b8b2
+    Author: Jiri Pirko <jpirko@xxxxxxxxxx>
+    Date:   Thu Apr 7 19:48:33 2011 +0000
+
+     net: vlan: make non-hw-accel rx path similar to hw-accel
+
+Running on such a kernel results in receiving untagged frames while using
+the af_packet PMD. Fortunately, the stripped information is still available
+for use in ``mbuf->vlan_tci``, and applications could check ``PKT_RX_VLAN_STRIPPED``.
+
+However, we currently don't have a way to describe offloads which can't be
+disabled by PMDs, and this creates an inconsistency with the way applications
+expect the PMD offloads to work, and requires them to be aware of which
+underlying driver they use.
+
+Since release 21.11 the af_packet PMD will implement support for the
+``DEV_RX_OFFLOAD_VLAN_STRIP`` offload, and users can control the desired vlan
+stripping behavior.
+
+It's important to note that the default behavior changes. Previously, the vlan
+tag was always stripped. An application that still requires that behavior
+needs to configure ``rxmode.offloads`` with ``DEV_RX_OFFLOAD_VLAN_STRIP``.
+
+The PMD driver will re-insert the vlan tag transparently to the application
+if the kernel strips it, as long as the ``DEV_RX_OFFLOAD_VLAN_STRIP`` is not
+enabled.
+
+.. code-block:: c
+
+    port_conf.rxmode.offloads |= DEV_RX_OFFLOAD_VLAN_STRIP;
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index ad7c1af..095fd5b 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -66,6 +66,10 @@ New Features
 
   * Added rte_flow support for dual VLAN insert and strip actions.
 
+* **Updated af_packet ethdev driver.**
+
+  * Added DEV_RX_OFFLOAD_VLAN_STRIP capability.
+
 * **Updated Marvell cnxk crypto PMD.**
 
   * Added AES-CBC SHA1-HMAC support in lookaside protocol (IPsec) for CN10K.
diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index b73b211..5ed9dd6 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -48,6 +48,7 @@ struct pkt_rx_queue {
 
 	struct rte_mempool *mb_pool;
 	uint16_t in_port;
+	uint8_t vlan_strip;
 
 	volatile unsigned long rx_pkts;
 	volatile unsigned long rx_bytes;
@@ -78,6 +79,7 @@ struct pmd_internals {
 
 	struct pkt_rx_queue *rx_queue;
 	struct pkt_tx_queue *tx_queue;
+	uint8_t vlan_strip;
 };
 
 static const char *valid_arguments[] = {
@@ -148,6 +150,9 @@ eth_af_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		if (ppd->tp_status & TP_STATUS_VLAN_VALID) {
 			mbuf->vlan_tci = ppd->tp_vlan_tci;
 			mbuf->ol_flags |= (PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED);
+
+			if (!pkt_q->vlan_strip && rte_vlan_insert(&mbuf))
+				PMD_LOG(ERR, "Failed to reinsert VLAN tag");
 		}
 
 		/* release incoming frame and advance ring buffer */
@@ -302,6 +307,11 @@ eth_dev_stop(struct rte_eth_dev *dev)
 static int
 eth_dev_configure(struct rte_eth_dev *dev __rte_unused)
 {
+	struct rte_eth_conf *dev_conf = &dev->data->dev_conf;
+	const struct rte_eth_rxmode *rxmode = &dev_conf->rxmode;
+	struct pmd_internals *internals = dev->data->dev_private;
+
+	internals->vlan_strip = !!(rxmode->offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
 	return 0;
 }
 
@@ -318,6 +328,7 @@ eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 	dev_info->min_rx_bufsize = 0;
 	dev_info->tx_offload_capa = DEV_TX_OFFLOAD_MULTI_SEGS |
 		DEV_TX_OFFLOAD_VLAN_INSERT;
+	dev_info->rx_offload_capa = DEV_RX_OFFLOAD_VLAN_STRIP;
 
 	return 0;
 }
@@ -448,6 +459,7 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,
 
 	dev->data->rx_queues[rx_queue_id] = pkt_q;
 	pkt_q->in_port = dev->data->port_id;
+	pkt_q->vlan_strip = internals->vlan_strip;
 
 	return 0;
 }
-- 
2.7.4

