DPDK patches and discussions
* [dpdk-dev] [PATCH v2 0/3] AF_XDP tx halt fix, IRQ pinning and unaligned chunks
@ 2019-09-30 16:42 Ciara Loftus
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 1/3] net/af_xdp: fix Tx halt when no recv packets Ciara Loftus
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Ciara Loftus @ 2019-09-30 16:42 UTC (permalink / raw)
  To: dev, xiaolong.ye, kevin.laatz, bruce.richardson, ciara.loftus

This series contains 3 patches for the AF_XDP PMD.

Previous: http://mails.dpdk.org/archives/dev/2019-September/143910.html

Patch 1: fix Tx halt when no recv packets (Xiaolong Ye)

Patch 2: support pinning of IRQs
v1 -> v2:
* Change enum format to match coding style
* Change some variables to const
* Re-order arguments in functions
* Do not duplicate driver string in generate_search_idx
* Check if driver handler function ptr is !NULL before using
* Check return value of fwrite
* Use rte_memcpy

Patch 3: enable support for unaligned umem chunks
v1 -> v2:
* Take mbuf alloc out of reserve_fill_queue
* Free correct mbuf on reserve fail for tx

The performance of the new zero copy implementation was measured to be within 5%
of the previous zero copy implementation, generally with an improvement observed
for single-core (with need wakeup) and multiple-PMD test cases.

Ciara Loftus (2):
  net/af_xdp: support pinning of IRQs
  net/af_xdp: enable support for unaligned umem chunks

Xiaolong Ye (1):
  net/af_xdp: fix Tx halt when no recv packets

 doc/guides/nics/af_xdp.rst             |  17 +-
 doc/guides/rel_notes/release_19_11.rst |  11 +
 drivers/net/af_xdp/rte_eth_af_xdp.c    | 780 ++++++++++++++++++++++---
 3 files changed, 714 insertions(+), 94 deletions(-)

-- 
2.17.1



* [dpdk-dev] [PATCH v2 1/3] net/af_xdp: fix Tx halt when no recv packets
  2019-09-30 16:42 [dpdk-dev] [PATCH v2 0/3] AF_XDP tx halt fix, IRQ pinning and unaligned chunks Ciara Loftus
@ 2019-09-30 16:42 ` Ciara Loftus
  2019-10-22  5:32   ` Ye Xiaolong
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs Ciara Loftus
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 3/3] net/af_xdp: enable support for unaligned umem chunks Ciara Loftus
  2 siblings, 1 reply; 27+ messages in thread
From: Ciara Loftus @ 2019-09-30 16:42 UTC (permalink / raw)
  To: dev, xiaolong.ye, kevin.laatz, bruce.richardson, ciara.loftus

From: Xiaolong Ye <xiaolong.ye@intel.com>

The kernel only consumes Tx packets if there is Rx traffic on the specified
queue or if we have called send(). So we need to issue a send() even when the
allocation fails, so that the kernel will start consuming packets again.

Commit 45bba02c95b0 ("net/af_xdp: support need wakeup feature") breaks the
above rule by making the send conditional. This patch fixes that while
still keeping the need_wakeup feature for Tx.
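The errno handling in the send loop described above can be sketched as a
small classifier (the helper name and enum below are illustrative, not part
of the patch): retry on transient errors, drain the completion queue first
on EAGAIN, and abort on anything unexpected.

```c
#include <errno.h>

/* Hypothetical classifier mirroring the kick loop's policy:
 * EAGAIN means the completion queue is full, so free space before
 * retrying; EBUSY/EINTR are transient, so simply retry; anything
 * else is unexpected and the loop should give up.
 */
enum kick_action { KICK_RETRY, KICK_PULL_CQ_AND_RETRY, KICK_ABORT };

static enum kick_action
classify_send_errno(int err)
{
	if (err == EAGAIN)
		return KICK_PULL_CQ_AND_RETRY;
	if (err == EBUSY || err == EINTR)
		return KICK_RETRY;
	return KICK_ABORT;
}
```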

Fixes: 45bba02c95b0 ("net/af_xdp: support need wakeup feature")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 41ed5b2af..e496e9aaa 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -286,19 +286,16 @@ kick_tx(struct pkt_tx_queue *txq)
 {
 	struct xsk_umem_info *umem = txq->pair->umem;
 
-#if defined(XDP_USE_NEED_WAKEUP)
-	if (xsk_ring_prod__needs_wakeup(&txq->tx))
-#endif
-		while (send(xsk_socket__fd(txq->pair->xsk), NULL,
-			    0, MSG_DONTWAIT) < 0) {
-			/* some thing unexpected */
-			if (errno != EBUSY && errno != EAGAIN && errno != EINTR)
-				break;
-
-			/* pull from completion queue to leave more space */
-			if (errno == EAGAIN)
-				pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
-		}
+	while (send(xsk_socket__fd(txq->pair->xsk), NULL,
+		    0, MSG_DONTWAIT) < 0) {
+		/* some thing unexpected */
+		if (errno != EBUSY && errno != EAGAIN && errno != EINTR)
+			break;
+
+		/* pull from completion queue to leave more space */
+		if (errno == EAGAIN)
+			pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
+	}
 	pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
 }
 
@@ -367,7 +364,10 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 
 	xsk_ring_prod__submit(&txq->tx, nb_pkts);
 
-	kick_tx(txq);
+#if defined(XDP_USE_NEED_WAKEUP)
+	if (xsk_ring_prod__needs_wakeup(&txq->tx))
+#endif
+		kick_tx(txq);
 
 	txq->stats.tx_pkts += nb_pkts;
 	txq->stats.tx_bytes += tx_bytes;
-- 
2.17.1



* [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-09-30 16:42 [dpdk-dev] [PATCH v2 0/3] AF_XDP tx halt fix, IRQ pinning and unaligned chunks Ciara Loftus
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 1/3] net/af_xdp: fix Tx halt when no recv packets Ciara Loftus
@ 2019-09-30 16:42 ` Ciara Loftus
  2019-09-30 17:11   ` Stephen Hemminger
  2019-10-18 23:49   ` Ye Xiaolong
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 3/3] net/af_xdp: enable support for unaligned umem chunks Ciara Loftus
  2 siblings, 2 replies; 27+ messages in thread
From: Ciara Loftus @ 2019-09-30 16:42 UTC (permalink / raw)
  To: dev, xiaolong.ye, kevin.laatz, bruce.richardson, ciara.loftus

Network devices using the AF_XDP PMD will trigger interrupts
on reception of packets. The new PMD argument 'queue_irq'
allows the user to specify a core on which to pin interrupts
for a given queue. Multiple queue_irq arguments can be specified.
For example:

  --vdev=net_af_xdp1,iface=eth0,queue_count=2,
           queue_irq=0:2,queue_irq=1:5

This will pin queue 0 interrupts to core 2 and queue 1 interrupts
to core 5.

The queue argument refers to the ethdev queue as opposed to the
netdev queue. These values are the same unless a value greater
than 0 is specified in a start_queue argument.

The drivers supported for this feature are those with support for
AF_XDP zero copy in the kernel, namely ixgbe, i40e and mlx5_core.
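The affinity mask written to /proc/irq/<n>/smp_affinity uses one hex group
per 32 cores, with groups separated by commas. A minimal sketch of that
formatting, mirroring the patch's set_irq_affinity() logic (the function
name and buffer handling here are illustrative):

```c
#include <stdio.h>
#include <string.h>

/* Build the smp_affinity mask string for a single core: a hex digit
 * for the core's bit within its 32-core group, followed by one
 * ",00000000" group for each full 32-core block below it.
 */
static void
build_affinity_mask(int coreid, char *buf, size_t len)
{
	int i;

	snprintf(buf, len, "%x", 1u << (coreid % 32));
	for (i = 0; i < coreid / 32; i++)
		strncat(buf, ",00000000", len - strlen(buf) - 1);
}
```

For example, core 2 yields "4" and core 33 yields "2,00000000", matching
the kernel's comma-separated 32-bit group format.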

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/nics/af_xdp.rst             |  15 +
 doc/guides/rel_notes/release_19_11.rst |   7 +
 drivers/net/af_xdp/rte_eth_af_xdp.c    | 366 ++++++++++++++++++++++++-
 3 files changed, 383 insertions(+), 5 deletions(-)

diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
index ec46f08f0..a255ba4e7 100644
--- a/doc/guides/nics/af_xdp.rst
+++ b/doc/guides/nics/af_xdp.rst
@@ -36,6 +36,11 @@ The following options can be provided to set up an af_xdp port in DPDK.
 *   ``start_queue`` - starting netdev queue id (optional, default 0);
 *   ``queue_count`` - total netdev queue number (optional, default 1);
 *   ``pmd_zero_copy`` - enable zero copy or not (optional, default 0);
+*   ``queue_irq`` - pin queue irqs to specified core <queue:core> (optional,
+    default no pinning). The queue argument refers to the ethdev queue as
+    opposed to the netdev queue. These values are the same unless a value
+    greater than 0 is specified for start_queue. ixgbe, i40e and mlx5 drivers
+    supported;
 
 Prerequisites
 -------------
@@ -57,3 +62,13 @@ The following example will set up an af_xdp interface in DPDK:
 .. code-block:: console
 
     --vdev net_af_xdp,iface=ens786f1
+
+Pin queue IRQs
+--------------
+The following example will pin queue 0 interrupts to core 2 and queue 1
+interrupts to core 5.
+
+.. code-block:: console
+
+      --vdev=net_af_xdp1,iface=eth0,queue_count=2,
+               queue_irq=0:2,queue_irq=1:5
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 27cfbd9e3..06bf57c42 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -56,6 +56,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Updated the AF_XDP PMD.**
+
+  Updated the AF_XDP PMD. The new features include:
+
+  * Support for pinning netdev queue IRQs to cores specified by the user.
+    Available for ixgbe, i40e and mlx5 drivers.
+
 
 Removed Items
 -------------
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index e496e9aaa..dfb2845cb 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -3,6 +3,7 @@
  */
 #include <unistd.h>
 #include <errno.h>
+#include <regex.h>
 #include <stdlib.h>
 #include <string.h>
 #include <poll.h>
@@ -10,6 +11,7 @@
 #include <net/if.h>
 #include <sys/socket.h>
 #include <sys/ioctl.h>
+#include <sys/sysinfo.h>
 #include <linux/if_ether.h>
 #include <linux/if_xdp.h>
 #include <linux/if_link.h>
@@ -17,6 +19,8 @@
 #include <linux/sockios.h>
 #include "af_xdp_deps.h"
 #include <bpf/xsk.h>
+#include <sys/stat.h>
+#include <libgen.h>
 
 #include <rte_ethdev.h>
 #include <rte_ethdev_driver.h>
@@ -116,6 +120,7 @@ struct pmd_internals {
 	int queue_cnt;
 	int max_queue_cnt;
 	int combined_queue_cnt;
+	int queue_irqs[RTE_MAX_QUEUES_PER_PORT];
 
 	int pmd_zc;
 	struct rte_ether_addr eth_addr;
@@ -128,12 +133,14 @@ struct pmd_internals {
 #define ETH_AF_XDP_START_QUEUE_ARG		"start_queue"
 #define ETH_AF_XDP_QUEUE_COUNT_ARG		"queue_count"
 #define ETH_AF_XDP_PMD_ZC_ARG			"pmd_zero_copy"
+#define ETH_AF_XDP_QUEUE_IRQ_ARG		"queue_irq"
 
 static const char * const valid_arguments[] = {
 	ETH_AF_XDP_IFACE_ARG,
 	ETH_AF_XDP_START_QUEUE_ARG,
 	ETH_AF_XDP_QUEUE_COUNT_ARG,
 	ETH_AF_XDP_PMD_ZC_ARG,
+	ETH_AF_XDP_QUEUE_IRQ_ARG,
 	NULL
 };
 
@@ -144,6 +151,26 @@ static const struct rte_eth_link pmd_link = {
 	.link_autoneg = ETH_LINK_AUTONEG
 };
 
+/* drivers supported for the queue_irq option */
+enum supported_drivers {
+	I40E_DRIVER,
+	IXGBE_DRIVER,
+	MLX5_DRIVER,
+	NUM_DRIVERS
+};
+const char driver_array[NUM_DRIVERS][NAME_MAX] = {"i40e", "ixgbe", "mlx5_core"};
+
+/*
+ * function pointer template to be implemented for each driver in 'driver_array'
+ * to generate the appropriate regular expression to search for in
+ * /proc/interrupts in order to identify the IRQ number for the netdev_qid of
+ * the given interface.
+ */
+typedef
+int (*generate_driver_regex_func)(struct pmd_internals *internals,
+				  uint16_t netdev_qid,
+				  char *iface_regex_str);
+
 static inline int
 reserve_fill_queue(struct xsk_umem_info *umem, uint16_t reserve_size)
 {
@@ -660,6 +687,287 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 	return ret;
 }
 
+/** get interface's driver name to determine /proc/interrupts entry format */
+static int
+get_driver_name(struct pmd_internals *internals, char *driver)
+{
+	char driver_path[PATH_MAX];
+	struct stat s;
+	char link[PATH_MAX];
+	int len;
+
+	snprintf(driver_path, sizeof(driver_path),
+			"/sys/class/net/%s/device/driver", internals->if_name);
+	if (lstat(driver_path, &s)) {
+		AF_XDP_LOG(ERR, "Error reading %s: %s\n",
+					driver_path, strerror(errno));
+		return -errno;
+	}
+
+	/* driver_path should link to /sys/bus/pci/drivers/<driver_name> */
+	len = readlink(driver_path, link, PATH_MAX - 1);
+	if (len == -1) {
+		AF_XDP_LOG(ERR, "Error reading symbolic link %s: %s\n",
+					driver_path, strerror(errno));
+		return -errno;
+	}
+
+	link[len] = '\0';
+	strlcpy(driver, basename(link), NAME_MAX);
+	if (!strncmp(driver, ".", strlen(driver))) {
+		AF_XDP_LOG(ERR, "Error getting driver name from %s: %s\n",
+					link, strerror(errno));
+		return -errno;
+	}
+
+	return 0;
+}
+
+static int
+generate_ixgbe_i40e_regex(struct pmd_internals *internals, uint16_t netdev_qid,
+			  char *iface_regex_str)
+{
+	if (snprintf(iface_regex_str, 128,
+			"-%s.*-%d", internals->if_name, netdev_qid) >= 128) {
+		AF_XDP_LOG(INFO, "Cannot get interrupt for %s q %i\n",
+					internals->if_name, netdev_qid);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+generate_mlx5_regex(struct pmd_internals *internals, uint16_t netdev_qid,
+		    char *iface_regex_str)
+{
+	char pci_path[PATH_MAX];
+	char *pci;
+	int ret = -1;
+	struct stat s;
+	char *link;
+	int len;
+
+	snprintf(pci_path, sizeof(pci_path),
+			"/sys/class/net/%s/device", internals->if_name);
+	if (lstat(pci_path, &s)) {
+		AF_XDP_LOG(ERR, "Error reading %s: %s\n",
+					pci_path, strerror(errno));
+		return -errno;
+	}
+
+	/* pci_path should link to a directory whose name is the pci addr */
+	link = malloc(s.st_size + 1);
+	len = readlink(pci_path, link, PATH_MAX - 1);
+	if (len == -1) {
+		AF_XDP_LOG(ERR, "Error reading symbolic link %s: %s\n",
+					pci_path, strerror(errno));
+		ret = -errno;
+		goto out;
+	}
+
+	link[len] = '\0';
+	pci = basename(link);
+	if (!strncmp(pci, ".", strlen(pci))) {
+		AF_XDP_LOG(ERR, "Error getting pci from %s\n", link);
+		goto out;
+	}
+
+	if (snprintf(iface_regex_str, 128, ".*p%i@pci:%s", netdev_qid, pci) >=
+			128) {
+		AF_XDP_LOG(INFO, "Cannot get interrupt for %s q %i\n",
+					internals->if_name, netdev_qid);
+		goto out;
+	}
+
+	ret = 0;
+
+out:
+	if (link)
+		free(link);
+
+	return ret;
+}
+
+/*
+ * array of handlers for different drivers for generating appropriate regex
+ * format for searching /proc/interrupts
+ */
+generate_driver_regex_func driver_handlers[NUM_DRIVERS] = {
+					generate_ixgbe_i40e_regex,
+					generate_ixgbe_i40e_regex,
+					generate_mlx5_regex};
+
+/*
+ * function for getting the index into driver_handlers array that corresponds
+ * to 'driver'
+ */
+static int
+get_driver_idx(const char *driver)
+{
+	for (int i = 0; i < NUM_DRIVERS; i++) {
+		if (strncmp(driver, driver_array[i], strlen(driver_array[i])))
+			continue;
+		return i;
+	}
+
+	return -1;
+}
+
+/** generate /proc/interrupts search regex based on driver type */
+static int
+generate_search_regex(struct pmd_internals *internals, const char *driver,
+		      uint16_t netdev_qid, regex_t *r)
+{
+	char iface_regex_str[128];
+	int ret = -1;
+	int idx = get_driver_idx(driver);
+
+	if (idx == -1 || driver_handlers[idx] == NULL) {
+		AF_XDP_LOG(ERR, "Error getting driver handler for %s\n",
+					internals->if_name);
+		goto out;
+	}
+
+	if (driver_handlers[idx](internals, netdev_qid, iface_regex_str)) {
+		AF_XDP_LOG(ERR, "Error getting regex string for %s\n",
+					internals->if_name);
+		goto out;
+	}
+
+	if (regcomp(r, iface_regex_str, 0)) {
+		AF_XDP_LOG(ERR, "Error computing regex %s\n", iface_regex_str);
+		goto out;
+	}
+
+	ret = 0;
+
+out:
+	return ret;
+}
+
+/** get interrupt number associated with the given interface qid */
+static int
+get_interrupt_number(struct pmd_internals *internals, regex_t *r,
+		     int *interrupt)
+{
+	FILE *f_int_proc;
+	int found = 0;
+	char line[4096];
+	int ret = 0;
+
+	f_int_proc = fopen("/proc/interrupts", "r");
+	if (f_int_proc == NULL) {
+		AF_XDP_LOG(ERR, "Failed to open /proc/interrupts.\n");
+		return -1;
+	}
+
+	while (!feof(f_int_proc) && !found) {
+		/* Make sure to read a full line at a time */
+		if (fgets(line, sizeof(line), f_int_proc) == NULL ||
+				line[strlen(line) - 1] != '\n') {
+			AF_XDP_LOG(ERR, "Error reading from interrupts file\n");
+			ret = -1;
+			break;
+		}
+
+		/* Extract interrupt number from line */
+		if (regexec(r, line, 0, NULL, 0) == 0) {
+			*interrupt = atoi(line);
+			found = true;
+			AF_XDP_LOG(INFO, "Got interrupt %d for %s\n",
+						*interrupt, internals->if_name);
+		}
+	}
+
+	fclose(f_int_proc);
+
+	return ret;
+}
+
+/** affinitise interrupts for the given qid to the given coreid */
+static int
+set_irq_affinity(struct pmd_internals *internals, int coreid,
+		 uint16_t rx_queue_id, uint16_t netdev_qid, int interrupt)
+{
+	char bitmask[128];
+	char smp_affinity_filename[NAME_MAX];
+	FILE *f_int_smp_affinity;
+	int i, ret = 0;
+
+	/* Create affinity bitmask. Every 32 bits are separated by a comma */
+	snprintf(bitmask, sizeof(bitmask), "%x", 1 << (coreid % 32));
+	for (i = 0; i < coreid / 32; i++)
+		strlcat(bitmask, ",00000000", sizeof(bitmask));
+
+	/* Write the new affinity bitmask */
+	snprintf(smp_affinity_filename, sizeof(smp_affinity_filename),
+			"/proc/irq/%d/smp_affinity", interrupt);
+	f_int_smp_affinity = fopen(smp_affinity_filename, "w");
+	if (f_int_smp_affinity == NULL) {
+		AF_XDP_LOG(ERR, "Error opening %s\n", smp_affinity_filename);
+		return -1;
+	}
+	if (fwrite(bitmask, strlen(bitmask), 1, f_int_smp_affinity) != 1) {
+		AF_XDP_LOG(ERR, "Error writing to %s\n", smp_affinity_filename);
+		ret = -1;
+		goto out;
+	}
+
+	AF_XDP_LOG(INFO, "IRQs for %s ethdev queue %i (netdev queue %i)"
+				" affinitised to core %i\n",
+				internals->if_name, rx_queue_id,
+				netdev_qid, coreid);
+out:
+	fclose(f_int_smp_affinity);
+
+	return ret;
+}
+
+static void
+configure_irqs(struct pmd_internals *internals, uint16_t rx_queue_id)
+{
+	int coreid = internals->queue_irqs[rx_queue_id];
+	char driver[NAME_MAX];
+	uint16_t netdev_qid = rx_queue_id + internals->start_queue_idx;
+	regex_t r;
+	int interrupt;
+
+	if (coreid < 0)
+		return;
+
+	if (coreid > (get_nprocs() - 1)) {
+		AF_XDP_LOG(ERR, "Affinitisation failed - invalid coreid %i\n",
+					coreid);
+		return;
+	}
+
+	if (get_driver_name(internals, driver)) {
+		AF_XDP_LOG(ERR, "Error retrieving driver name for %s\n",
+					internals->if_name);
+		return;
+	}
+
+	if (generate_search_regex(internals, driver, netdev_qid, &r)) {
+		AF_XDP_LOG(ERR, "Error generating search regex for %s\n",
+					internals->if_name);
+		return;
+	}
+
+	if (get_interrupt_number(internals, &r, &interrupt)) {
+		AF_XDP_LOG(ERR, "Error getting interrupt number for %s\n",
+					internals->if_name);
+		return;
+	}
+
+	if (set_irq_affinity(internals, coreid, rx_queue_id, netdev_qid,
+				interrupt)) {
+		AF_XDP_LOG(ERR, "Error setting interrupt affinity for %s\n",
+					internals->if_name);
+		return;
+	}
+}
+
 static int
 eth_rx_queue_setup(struct rte_eth_dev *dev,
 		   uint16_t rx_queue_id,
@@ -697,6 +1005,8 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,
 		goto err;
 	}
 
+	configure_irqs(internals, rx_queue_id);
+
 	rxq->fds[0].fd = xsk_socket__fd(rxq->xsk);
 	rxq->fds[0].events = POLLIN;
 
@@ -834,6 +1144,39 @@ parse_name_arg(const char *key __rte_unused,
 	return 0;
 }
 
+/** parse queue irq argument */
+static int
+parse_queue_irq_arg(const char *key __rte_unused,
+		   const char *value, void *extra_args)
+{
+	int (*queue_irqs)[RTE_MAX_QUEUES_PER_PORT] = extra_args;
+	char *parse_str = strdup(value);
+	char delimiter[] = ":";
+	char *queue_str;
+
+	queue_str = strtok(parse_str, delimiter);
+	if (queue_str != NULL && strncmp(queue_str, value, strlen(value))) {
+		char *end;
+		long queue = strtol(queue_str, &end, 10);
+
+		if (*end == '\0' && queue >= 0 &&
+				queue < RTE_MAX_QUEUES_PER_PORT) {
+			char *core_str = strtok(NULL, delimiter);
+			long core = strtol(core_str, &end, 10);
+
+			if (*end == '\0' && core >= 0 && core < get_nprocs()) {
+				(*queue_irqs)[queue] = core;
+				free(parse_str);
+				return 0;
+			}
+		}
+	}
+
+	AF_XDP_LOG(ERR, "Invalid queue_irq argument.\n");
+	free(parse_str);
+	return -1;
+}
+
 static int
 xdp_get_channels_info(const char *if_name, int *max_queues,
 				int *combined_queues)
@@ -877,7 +1220,8 @@ xdp_get_channels_info(const char *if_name, int *max_queues,
 
 static int
 parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
-			int *queue_cnt, int *pmd_zc)
+			int *queue_cnt, int *pmd_zc,
+			int (*queue_irqs)[RTE_MAX_QUEUES_PER_PORT])
 {
 	int ret;
 
@@ -903,6 +1247,11 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 	if (ret < 0)
 		goto free_kvlist;
 
+	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_QUEUE_IRQ_ARG,
+				 &parse_queue_irq_arg, queue_irqs);
+	if (ret < 0)
+		goto free_kvlist;
+
 free_kvlist:
 	rte_kvargs_free(kvlist);
 	return ret;
@@ -940,7 +1289,8 @@ get_iface_info(const char *if_name,
 
 static struct rte_eth_dev *
 init_internals(struct rte_vdev_device *dev, const char *if_name,
-			int start_queue_idx, int queue_cnt, int pmd_zc)
+			int start_queue_idx, int queue_cnt, int pmd_zc,
+			int queue_irqs[RTE_MAX_QUEUES_PER_PORT])
 {
 	const char *name = rte_vdev_device_name(dev);
 	const unsigned int numa_node = dev->device.numa_node;
@@ -957,6 +1307,8 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
 	internals->queue_cnt = queue_cnt;
 	internals->pmd_zc = pmd_zc;
 	strlcpy(internals->if_name, if_name, IFNAMSIZ);
+	rte_memcpy(internals->queue_irqs, queue_irqs,
+			sizeof(int) * RTE_MAX_QUEUES_PER_PORT);
 
 	if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
 				  &internals->combined_queue_cnt)) {
@@ -1035,6 +1387,9 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 	struct rte_eth_dev *eth_dev = NULL;
 	const char *name;
 	int pmd_zc = 0;
+	int queue_irqs[RTE_MAX_QUEUES_PER_PORT];
+
+	memset(queue_irqs, -1, sizeof(int) * RTE_MAX_QUEUES_PER_PORT);
 
 	AF_XDP_LOG(INFO, "Initializing pmd_af_xdp for %s\n",
 		rte_vdev_device_name(dev));
@@ -1062,7 +1417,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 		dev->device.numa_node = rte_socket_id();
 
 	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
-			     &xsk_queue_cnt, &pmd_zc) < 0) {
+			     &xsk_queue_cnt, &pmd_zc, &queue_irqs) < 0) {
 		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
 		return -EINVAL;
 	}
@@ -1073,7 +1428,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 	}
 
 	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
-					xsk_queue_cnt, pmd_zc);
+					xsk_queue_cnt, pmd_zc, queue_irqs);
 	if (eth_dev == NULL) {
 		AF_XDP_LOG(ERR, "Failed to init internals\n");
 		return -1;
@@ -1117,7 +1472,8 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
 			      "iface=<string> "
 			      "start_queue=<int> "
 			      "queue_count=<int> "
-			      "pmd_zero_copy=<0|1>");
+			      "pmd_zero_copy=<0|1> "
+			      "queue_irq=<int>:<int>");
 
 RTE_INIT(af_xdp_init_log)
 {
-- 
2.17.1



* [dpdk-dev] [PATCH v2 3/3] net/af_xdp: enable support for unaligned umem chunks
  2019-09-30 16:42 [dpdk-dev] [PATCH v2 0/3] AF_XDP tx halt fix, IRQ pinning and unaligned chunks Ciara Loftus
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 1/3] net/af_xdp: fix Tx halt when no recv packets Ciara Loftus
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs Ciara Loftus
@ 2019-09-30 16:42 ` Ciara Loftus
  2019-10-18 23:48   ` Ye Xiaolong
  2019-10-24 11:10   ` David Marchand
  2 siblings, 2 replies; 27+ messages in thread
From: Ciara Loftus @ 2019-09-30 16:42 UTC (permalink / raw)
  To: dev, xiaolong.ye, kevin.laatz, bruce.richardson, ciara.loftus

This patch enables the unaligned chunks feature for AF_XDP, which allows
chunks to be placed at arbitrary offsets within the umem rather than
requiring 2k alignment. This allows DPDK application mempools to be mapped
directly into the umem, in turn enabling zero copy transfer between the
umem and the PMD.

This patch replaces the zero copy via external mbuf mechanism introduced
in commit e9ff8bb71943 ("net/af_xdp: enable zero copy by external mbuf").
The pmd_zero_copy vdev argument is also removed: the PMD now auto-detects
the unaligned chunks feature, enabling zero copy when it is present and
falling back to copy mode otherwise.
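With unaligned chunks, the 64-bit descriptor address returned by the kernel
packs the offset into the chunk in its upper bits and the chunk's base
address within the umem in its lower bits. A sketch of the extraction
arithmetic the Rx path relies on (the 48-bit split matches libbpf's
XSK_UNALIGNED_BUF_OFFSET_SHIFT, stated here as an assumption):

```c
#include <stdint.h>

/* Unaligned-chunk descriptor address encoding: upper 16 bits carry
 * the offset into the chunk, lower 48 bits the base address within
 * the umem. Constants shown are assumptions matching libbpf's
 * xsk_umem__extract_addr()/xsk_umem__extract_offset() helpers.
 */
#define UNALIGNED_OFFSET_SHIFT 48
#define UNALIGNED_ADDR_MASK ((1ULL << UNALIGNED_OFFSET_SHIFT) - 1)

static uint64_t
extract_addr(uint64_t desc_addr)
{
	return desc_addr & UNALIGNED_ADDR_MASK;
}

static uint64_t
extract_offset(uint64_t desc_addr)
{
	return desc_addr >> UNALIGNED_OFFSET_SHIFT;
}
```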

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
---
 doc/guides/nics/af_xdp.rst             |   2 +-
 doc/guides/rel_notes/release_19_11.rst |   4 +
 drivers/net/af_xdp/rte_eth_af_xdp.c    | 402 ++++++++++++++++++++-----
 3 files changed, 325 insertions(+), 83 deletions(-)

diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
index a255ba4e7..40a35a822 100644
--- a/doc/guides/nics/af_xdp.rst
+++ b/doc/guides/nics/af_xdp.rst
@@ -35,7 +35,6 @@ The following options can be provided to set up an af_xdp port in DPDK.
 *   ``iface`` - name of the Kernel interface to attach to (required);
 *   ``start_queue`` - starting netdev queue id (optional, default 0);
 *   ``queue_count`` - total netdev queue number (optional, default 1);
-*   ``pmd_zero_copy`` - enable zero copy or not (optional, default 0);
 *   ``queue_irq`` - pin queue irqs to specified core <queue:core> (optional,
     default no pinning). The queue argument refers to the ethdev queue as
     opposed to the netdev queue. These values are the same unless a value
@@ -53,6 +52,7 @@ This is a Linux-specific PMD, thus the following prerequisites apply:
    <kernel src tree>/tools/lib/bpf;
 *  A Kernel bound interface to attach to;
 *  For need_wakeup feature, it requires kernel version later than v5.3-rc1;
+*  For PMD zero copy, it requires kernel version later than v5.4-rc1;
 
 Set up an af_xdp interface
 -----------------------------
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 06bf57c42..22369107c 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -62,6 +62,8 @@ New Features
 
   * Support for pinning netdev queue IRQs to cores specified by the user.
     Available for ixgbe, i40e and mlx5 drivers.
+  * Enabled zero copy between application mempools and UMEM by enabling the
+    XDP_UMEM_UNALIGNED_CHUNKS UMEM flag.
 
 
 Removed Items
@@ -85,6 +87,8 @@ Removed Items
      "port config <port_id> rx_offload crc_strip|scatter|ipv4_cksum|udp_cksum|tcp_cksum|
      timestamp|vlan_strip|vlan_filter|vlan_extend on|off"
 
+   * Removed AF_XDP pmd_zero copy vdev argument. Support is now auto-detected.
+
 
 API Changes
 -----------
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index dfb2845cb..9efdb8f7c 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -62,7 +62,13 @@ static int af_xdp_logtype;
 
 #define ETH_AF_XDP_FRAME_SIZE		2048
 #define ETH_AF_XDP_NUM_BUFFERS		4096
+#ifdef XDP_UMEM_UNALIGNED_CHUNK_FLAG
+#define ETH_AF_XDP_MBUF_OVERHEAD	128 /* sizeof(struct rte_mbuf) */
+#define ETH_AF_XDP_DATA_HEADROOM \
+	(ETH_AF_XDP_MBUF_OVERHEAD + RTE_PKTMBUF_HEADROOM)
+#else
 #define ETH_AF_XDP_DATA_HEADROOM	0
+#endif
 #define ETH_AF_XDP_DFLT_NUM_DESCS	XSK_RING_CONS__DEFAULT_NUM_DESCS
 #define ETH_AF_XDP_DFLT_START_QUEUE_IDX	0
 #define ETH_AF_XDP_DFLT_QUEUE_COUNT	1
@@ -77,7 +83,8 @@ struct xsk_umem_info {
 	struct xsk_umem *umem;
 	struct rte_ring *buf_ring;
 	const struct rte_memzone *mz;
-	int pmd_zc;
+	struct rte_mempool *mb_pool;
+	void *buffer;
 };
 
 struct rx_stats {
@@ -102,10 +109,12 @@ struct pkt_rx_queue {
 struct tx_stats {
 	uint64_t tx_pkts;
 	uint64_t tx_bytes;
+	uint64_t tx_dropped;
 };
 
 struct pkt_tx_queue {
 	struct xsk_ring_prod tx;
+	struct xsk_umem_info *umem;
 
 	struct tx_stats stats;
 
@@ -122,7 +131,6 @@ struct pmd_internals {
 	int combined_queue_cnt;
 	int queue_irqs[RTE_MAX_QUEUES_PER_PORT];
 
-	int pmd_zc;
 	struct rte_ether_addr eth_addr;
 
 	struct pkt_rx_queue *rx_queues;
@@ -132,14 +140,12 @@ struct pmd_internals {
 #define ETH_AF_XDP_IFACE_ARG			"iface"
 #define ETH_AF_XDP_START_QUEUE_ARG		"start_queue"
 #define ETH_AF_XDP_QUEUE_COUNT_ARG		"queue_count"
-#define ETH_AF_XDP_PMD_ZC_ARG			"pmd_zero_copy"
 #define ETH_AF_XDP_QUEUE_IRQ_ARG		"queue_irq"
 
 static const char * const valid_arguments[] = {
 	ETH_AF_XDP_IFACE_ARG,
 	ETH_AF_XDP_START_QUEUE_ARG,
 	ETH_AF_XDP_QUEUE_COUNT_ARG,
-	ETH_AF_XDP_PMD_ZC_ARG,
 	ETH_AF_XDP_QUEUE_IRQ_ARG,
 	NULL
 };
@@ -171,8 +177,39 @@ int (*generate_driver_regex_func)(struct pmd_internals *internals,
 				  uint16_t netdev_qid,
 				  char *iface_regex_str);
 
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+static inline int
+reserve_fill_queue_zc(struct xsk_umem_info *umem, uint16_t reserve_size,
+		      struct rte_mbuf **bufs)
+{
+	struct xsk_ring_prod *fq = &umem->fq;
+	uint32_t idx;
+	uint16_t i;
+
+	if (unlikely(!xsk_ring_prod__reserve(fq, reserve_size, &idx))) {
+		for (i = 0; i < reserve_size; i++)
+			rte_pktmbuf_free(bufs[i]);
+		AF_XDP_LOG(DEBUG, "Failed to reserve enough fq descs.\n");
+		return -1;
+	}
+
+	for (i = 0; i < reserve_size; i++) {
+		__u64 *fq_addr;
+		uint64_t addr;
+
+		fq_addr = xsk_ring_prod__fill_addr(fq, idx++);
+		addr = (uint64_t)bufs[i] - (uint64_t)umem->buffer;
+		*fq_addr = addr;
+	}
+
+	xsk_ring_prod__submit(fq, reserve_size);
+
+	return 0;
+}
+#else
 static inline int
-reserve_fill_queue(struct xsk_umem_info *umem, uint16_t reserve_size)
+reserve_fill_queue_cp(struct xsk_umem_info *umem, uint16_t reserve_size,
+		      struct rte_mbuf **bufs __rte_unused)
 {
 	struct xsk_ring_prod *fq = &umem->fq;
 	void *addrs[reserve_size];
@@ -203,32 +240,99 @@ reserve_fill_queue(struct xsk_umem_info *umem, uint16_t reserve_size)
 
 	return 0;
 }
+#endif
 
-static void
-umem_buf_release_to_fq(void *addr, void *opaque)
+static inline int
+reserve_fill_queue(struct xsk_umem_info *umem, uint16_t reserve_size,
+		   struct rte_mbuf **bufs)
 {
-	struct xsk_umem_info *umem = (struct xsk_umem_info *)opaque;
-	uint64_t umem_addr = (uint64_t)addr - umem->mz->addr_64;
-
-	rte_ring_enqueue(umem->buf_ring, (void *)umem_addr);
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+	return reserve_fill_queue_zc(umem, reserve_size, bufs);
+#else
+	return reserve_fill_queue_cp(umem, reserve_size, bufs);
+#endif
 }
 
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
 static uint16_t
-eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+af_xdp_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 {
 	struct pkt_rx_queue *rxq = queue;
 	struct xsk_ring_cons *rx = &rxq->rx;
 	struct xsk_umem_info *umem = rxq->umem;
-	struct xsk_ring_prod *fq = &umem->fq;
 	uint32_t idx_rx = 0;
-	uint32_t free_thresh = fq->size >> 1;
-	int pmd_zc = umem->pmd_zc;
-	struct rte_mbuf *mbufs[ETH_AF_XDP_RX_BATCH_SIZE];
-	unsigned long dropped = 0;
 	unsigned long rx_bytes = 0;
 	int rcvd, i;
+	struct rte_mbuf *fq_bufs[ETH_AF_XDP_RX_BATCH_SIZE];
 
-	nb_pkts = RTE_MIN(nb_pkts, ETH_AF_XDP_RX_BATCH_SIZE);
+	/* allocate bufs for fill queue replenishment after rx */
+	if (rte_pktmbuf_alloc_bulk(umem->mb_pool, fq_bufs, nb_pkts)) {
+		AF_XDP_LOG(DEBUG,
+			"Failed to get enough buffers for fq.\n");
+		return -1;
+	}
+
+	rcvd = xsk_ring_cons__peek(rx, nb_pkts, &idx_rx);
+
+	if (rcvd == 0) {
+#if defined(XDP_USE_NEED_WAKEUP)
+		if (xsk_ring_prod__needs_wakeup(&umem->fq))
+			(void)poll(rxq->fds, 1, 1000);
+#endif
+
+		goto out;
+	}
+
+	for (i = 0; i < rcvd; i++) {
+		const struct xdp_desc *desc;
+		uint64_t addr;
+		uint32_t len;
+		uint64_t offset;
+
+		desc = xsk_ring_cons__rx_desc(rx, idx_rx++);
+		addr = desc->addr;
+		len = desc->len;
+
+		offset = xsk_umem__extract_offset(addr);
+		addr = xsk_umem__extract_addr(addr);
+
+		bufs[i] = (struct rte_mbuf *)
+				xsk_umem__get_data(umem->buffer, addr);
+		bufs[i]->data_off = offset - sizeof(struct rte_mbuf);
+
+		rte_pktmbuf_pkt_len(bufs[i]) = len;
+		rte_pktmbuf_data_len(bufs[i]) = len;
+		rx_bytes += len;
+	}
+
+	xsk_ring_cons__release(rx, rcvd);
+
+	(void)reserve_fill_queue(umem, rcvd, fq_bufs);
+
+	/* statistics */
+	rxq->stats.rx_pkts += rcvd;
+	rxq->stats.rx_bytes += rx_bytes;
+
+out:
+	if (rcvd != nb_pkts)
+		rte_mempool_put_bulk(umem->mb_pool, (void **)&fq_bufs[rcvd],
+				     nb_pkts - rcvd);
+
+	return rcvd;
+}
+#else
+static uint16_t
+af_xdp_rx_cp(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct pkt_rx_queue *rxq = queue;
+	struct xsk_ring_cons *rx = &rxq->rx;
+	struct xsk_umem_info *umem = rxq->umem;
+	struct xsk_ring_prod *fq = &umem->fq;
+	uint32_t idx_rx = 0;
+	unsigned long rx_bytes = 0;
+	int rcvd, i;
+	uint32_t free_thresh = fq->size >> 1;
+	struct rte_mbuf *mbufs[ETH_AF_XDP_RX_BATCH_SIZE];
 
 	if (unlikely(rte_pktmbuf_alloc_bulk(rxq->mb_pool, mbufs, nb_pkts) != 0))
 		return 0;
@@ -244,32 +348,21 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 	}
 
 	if (xsk_prod_nb_free(fq, free_thresh) >= free_thresh)
-		(void)reserve_fill_queue(umem, ETH_AF_XDP_RX_BATCH_SIZE);
+		(void)reserve_fill_queue(umem, ETH_AF_XDP_RX_BATCH_SIZE, NULL);
 
 	for (i = 0; i < rcvd; i++) {
 		const struct xdp_desc *desc;
 		uint64_t addr;
 		uint32_t len;
 		void *pkt;
-		uint16_t buf_len = ETH_AF_XDP_FRAME_SIZE;
-		struct rte_mbuf_ext_shared_info *shinfo;
 
 		desc = xsk_ring_cons__rx_desc(rx, idx_rx++);
 		addr = desc->addr;
 		len = desc->len;
 		pkt = xsk_umem__get_data(rxq->umem->mz->addr, addr);
 
-		if (pmd_zc) {
-			shinfo = rte_pktmbuf_ext_shinfo_init_helper(pkt,
-					&buf_len, umem_buf_release_to_fq, umem);
-
-			rte_pktmbuf_attach_extbuf(mbufs[i], pkt, 0, buf_len,
-						  shinfo);
-		} else {
-			rte_memcpy(rte_pktmbuf_mtod(mbufs[i], void *),
-							pkt, len);
-			rte_ring_enqueue(umem->buf_ring, (void *)addr);
-		}
+		rte_memcpy(rte_pktmbuf_mtod(mbufs[i], void *), pkt, len);
+		rte_ring_enqueue(umem->buf_ring, (void *)addr);
 		rte_pktmbuf_pkt_len(mbufs[i]) = len;
 		rte_pktmbuf_data_len(mbufs[i]) = len;
 		rx_bytes += len;
@@ -279,7 +372,7 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 	xsk_ring_cons__release(rx, rcvd);
 
 	/* statistics */
-	rxq->stats.rx_pkts += (rcvd - dropped);
+	rxq->stats.rx_pkts += rcvd;
 	rxq->stats.rx_bytes += rx_bytes;
 
 out:
@@ -289,6 +382,19 @@ eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 
 	return rcvd;
 }
+#endif
+
+static uint16_t
+eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	nb_pkts = RTE_MIN(nb_pkts, ETH_AF_XDP_RX_BATCH_SIZE);
+
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+	return af_xdp_rx_zc(queue, bufs, nb_pkts);
+#else
+	return af_xdp_rx_cp(queue, bufs, nb_pkts);
+#endif
+}
 
 static void
 pull_umem_cq(struct xsk_umem_info *umem, int size)
@@ -302,7 +408,13 @@ pull_umem_cq(struct xsk_umem_info *umem, int size)
 	for (i = 0; i < n; i++) {
 		uint64_t addr;
 		addr = *xsk_ring_cons__comp_addr(cq, idx_cq++);
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+		addr = xsk_umem__extract_addr(addr);
+		rte_pktmbuf_free((struct rte_mbuf *)
+					xsk_umem__get_data(umem->buffer, addr));
+#else
 		rte_ring_enqueue(umem->buf_ring, (void *)addr);
+#endif
 	}
 
 	xsk_ring_cons__release(cq, n);
@@ -311,7 +423,7 @@ pull_umem_cq(struct xsk_umem_info *umem, int size)
 static void
 kick_tx(struct pkt_tx_queue *txq)
 {
-	struct xsk_umem_info *umem = txq->pair->umem;
+	struct xsk_umem_info *umem = txq->umem;
 
 	while (send(xsk_socket__fd(txq->pair->xsk), NULL,
 		    0, MSG_DONTWAIT) < 0) {
@@ -323,24 +435,96 @@ kick_tx(struct pkt_tx_queue *txq)
 		if (errno == EAGAIN)
 			pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
 	}
+#ifndef XDP_UMEM_UNALIGNED_CHUNK_FLAG
 	pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
+#endif
 }
 
-static inline bool
-in_umem_range(struct xsk_umem_info *umem, uint64_t addr)
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+static uint16_t
+af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 {
-	uint64_t mz_base_addr = umem->mz->addr_64;
+	struct pkt_tx_queue *txq = queue;
+	struct xsk_umem_info *umem = txq->umem;
+	struct rte_mbuf *mbuf;
+	unsigned long tx_bytes = 0;
+	int i;
+	uint32_t idx_tx;
+	uint16_t count = 0;
+	struct xdp_desc *desc;
+	uint64_t addr, offset;
 
-	return addr >= mz_base_addr && addr < mz_base_addr + umem->mz->len;
-}
+	pull_umem_cq(umem, nb_pkts);
+
+	for (i = 0; i < nb_pkts; i++) {
+		mbuf = bufs[i];
+
+		if (mbuf->pool == umem->mb_pool) {
+			if (!xsk_ring_prod__reserve(&txq->tx, 1, &idx_tx)) {
+				kick_tx(txq);
+				goto out;
+			}
+			desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
+			desc->len = mbuf->pkt_len;
+			addr = (uint64_t)mbuf - (uint64_t)umem->buffer;
+			offset = rte_pktmbuf_mtod(mbuf, uint64_t) -
+					(uint64_t)mbuf;
+			offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT;
+			desc->addr = addr | offset;
+			count++;
+		} else {
+			struct rte_mbuf *local_mbuf =
+					rte_pktmbuf_alloc(umem->mb_pool);
+			void *pkt;
+
+			if (local_mbuf == NULL)
+				goto out;
+
+			if (!xsk_ring_prod__reserve(&txq->tx, 1, &idx_tx)) {
+				rte_pktmbuf_free(local_mbuf);
+				kick_tx(txq);
+				goto out;
+			}
+
+			desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
+			desc->len = mbuf->pkt_len;
+
+			addr = (uint64_t)local_mbuf - (uint64_t)umem->buffer;
+			offset = rte_pktmbuf_mtod(local_mbuf, uint64_t) -
+					(uint64_t)local_mbuf;
+			pkt = xsk_umem__get_data(umem->buffer, addr + offset);
+			offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT;
+			desc->addr = addr | offset;
+			rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *),
+					desc->len);
+			rte_pktmbuf_free(mbuf);
+			count++;
+		}
 
+		tx_bytes += mbuf->pkt_len;
+	}
+
+#if defined(XDP_USE_NEED_WAKEUP)
+	if (xsk_ring_prod__needs_wakeup(&txq->tx))
+#endif
+		kick_tx(txq);
+
+out:
+	xsk_ring_prod__submit(&txq->tx, count);
+
+	txq->stats.tx_pkts += count;
+	txq->stats.tx_bytes += tx_bytes;
+	txq->stats.tx_dropped += nb_pkts - count;
+
+	return count;
+}
+#else
 static uint16_t
-eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+af_xdp_tx_cp(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 {
 	struct pkt_tx_queue *txq = queue;
-	struct xsk_umem_info *umem = txq->pair->umem;
+	struct xsk_umem_info *umem = txq->umem;
 	struct rte_mbuf *mbuf;
-	int pmd_zc = umem->pmd_zc;
 	void *addrs[ETH_AF_XDP_TX_BATCH_SIZE];
 	unsigned long tx_bytes = 0;
 	int i;
@@ -369,24 +553,12 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		mbuf = bufs[i];
 		desc->len = mbuf->pkt_len;
 
-		/*
-		 * We need to make sure the external mbuf address is within
-		 * current port's umem memzone range
-		 */
-		if (pmd_zc && RTE_MBUF_HAS_EXTBUF(mbuf) &&
-				in_umem_range(umem, (uint64_t)mbuf->buf_addr)) {
-			desc->addr = (uint64_t)mbuf->buf_addr -
-				umem->mz->addr_64;
-			mbuf->buf_addr = xsk_umem__get_data(umem->mz->addr,
-					(uint64_t)addrs[i]);
-		} else {
-			desc->addr = (uint64_t)addrs[i];
-			pkt = xsk_umem__get_data(umem->mz->addr,
-					desc->addr);
-			rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *),
-					desc->len);
-		}
+		desc->addr = (uint64_t)addrs[i];
+		pkt = xsk_umem__get_data(umem->mz->addr,
+					 desc->addr);
+		rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), desc->len);
 		tx_bytes += mbuf->pkt_len;
+		rte_pktmbuf_free(mbuf);
 	}
 
 	xsk_ring_prod__submit(&txq->tx, nb_pkts);
@@ -399,11 +571,19 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 	txq->stats.tx_pkts += nb_pkts;
 	txq->stats.tx_bytes += tx_bytes;
 
-	for (i = 0; i < nb_pkts; i++)
-		rte_pktmbuf_free(bufs[i]);
-
 	return nb_pkts;
 }
+#endif
+
+static uint16_t
+eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+	return af_xdp_tx_zc(queue, bufs, nb_pkts);
+#else
+	return af_xdp_tx_cp(queue, bufs, nb_pkts);
+#endif
+}
 
 static int
 eth_dev_start(struct rte_eth_dev *dev)
@@ -473,6 +653,7 @@ eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 		stats->ipackets += stats->q_ipackets[i];
 		stats->ibytes += stats->q_ibytes[i];
 		stats->imissed += rxq->stats.rx_dropped;
+		stats->oerrors += txq->stats.tx_dropped;
 		ret = getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP,
 				XDP_STATISTICS, &xdp_stats, &optlen);
 		if (ret != 0) {
@@ -519,11 +700,16 @@ remove_xdp_program(struct pmd_internals *internals)
 static void
 xdp_umem_destroy(struct xsk_umem_info *umem)
 {
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+	rte_mempool_free(umem->mb_pool);
+	umem->mb_pool = NULL;
+#else
 	rte_memzone_free(umem->mz);
 	umem->mz = NULL;
 
 	rte_ring_free(umem->buf_ring);
 	umem->buf_ring = NULL;
+#endif
 
 	rte_free(umem);
 	umem = NULL;
@@ -573,6 +759,55 @@ eth_link_update(struct rte_eth_dev *dev __rte_unused,
 	return 0;
 }
 
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+static inline uint64_t get_base_addr(struct rte_mempool *mp)
+{
+	struct rte_mempool_memhdr *memhdr;
+
+	memhdr = STAILQ_FIRST(&mp->mem_list);
+	return (uint64_t)memhdr->addr & ~(getpagesize() - 1);
+}
+
+static struct
+xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals __rte_unused,
+				  struct pkt_rx_queue *rxq)
+{
+	struct xsk_umem_info *umem;
+	int ret;
+	struct xsk_umem_config usr_config = {
+		.fill_size = ETH_AF_XDP_DFLT_NUM_DESCS,
+		.comp_size = ETH_AF_XDP_DFLT_NUM_DESCS,
+		.flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG};
+	void *base_addr = NULL;
+	struct rte_mempool *mb_pool = rxq->mb_pool;
+
+	usr_config.frame_size = rte_pktmbuf_data_room_size(mb_pool) +
+					ETH_AF_XDP_MBUF_OVERHEAD +
+					mb_pool->private_data_size;
+	usr_config.frame_headroom = ETH_AF_XDP_DATA_HEADROOM +
+					mb_pool->private_data_size;
+
+	umem = rte_zmalloc_socket("umem", sizeof(*umem), 0, rte_socket_id());
+	if (umem == NULL) {
+		AF_XDP_LOG(ERR, "Failed to allocate umem info");
+		return NULL;
+	}
+
+	umem->mb_pool = mb_pool;
+	base_addr = (void *)get_base_addr(mb_pool);
+
+	ret = xsk_umem__create(&umem->umem, base_addr,
+			       mb_pool->populated_size * usr_config.frame_size,
+			       &umem->fq, &umem->cq,
+			       &usr_config);
+
+	if (ret) {
+		AF_XDP_LOG(ERR, "Failed to create umem");
+		goto err;
+	}
+	umem->buffer = base_addr;
+
+#else
 static struct
 xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals,
 				  struct pkt_rx_queue *rxq)
@@ -633,6 +868,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals,
 	}
 	umem->mz = mz;
 
+#endif
 	return umem;
 
 err:
@@ -647,11 +883,13 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 	struct xsk_socket_config cfg;
 	struct pkt_tx_queue *txq = rxq->pair;
 	int ret = 0;
-	int reserve_size;
+	int reserve_size = ETH_AF_XDP_DFLT_NUM_DESCS / 2;
+	struct rte_mbuf *fq_bufs[reserve_size];
 
 	rxq->umem = xdp_umem_configure(internals, rxq);
 	if (rxq->umem == NULL)
 		return -ENOMEM;
+	txq->umem = rxq->umem;
 
 	cfg.rx_size = ring_size;
 	cfg.tx_size = ring_size;
@@ -671,8 +909,13 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 		goto err;
 	}
 
-	reserve_size = ETH_AF_XDP_DFLT_NUM_DESCS / 2;
-	ret = reserve_fill_queue(rxq->umem, reserve_size);
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+	if (rte_pktmbuf_alloc_bulk(rxq->umem->mb_pool, fq_bufs, reserve_size)) {
+		AF_XDP_LOG(DEBUG, "Failed to get enough buffers for fq.\n");
+		goto err;
+	}
+#endif
+	ret = reserve_fill_queue(rxq->umem, reserve_size, fq_bufs);
 	if (ret) {
 		xsk_socket__delete(rxq->xsk);
 		AF_XDP_LOG(ERR, "Failed to reserve fill queue.\n");
@@ -977,7 +1220,6 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,
 		   struct rte_mempool *mb_pool)
 {
 	struct pmd_internals *internals = dev->data->dev_private;
-	uint32_t buf_size, data_size;
 	struct pkt_rx_queue *rxq;
 	int ret;
 
@@ -985,6 +1227,10 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,
 
 	AF_XDP_LOG(INFO, "Set up rx queue, rx queue id: %d, xsk queue id: %d\n",
 		   rx_queue_id, rxq->xsk_queue_idx);
+
+#ifndef XDP_UMEM_UNALIGNED_CHUNK_FLAG
+	uint32_t buf_size, data_size;
+
 	/* Now get the space available for data in the mbuf */
 	buf_size = rte_pktmbuf_data_room_size(mb_pool) -
 		RTE_PKTMBUF_HEADROOM;
@@ -996,6 +1242,7 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,
 		ret = -ENOMEM;
 		goto err;
 	}
+#endif
 
 	rxq->mb_pool = mb_pool;
 
@@ -1010,8 +1257,6 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,
 	rxq->fds[0].fd = xsk_socket__fd(rxq->xsk);
 	rxq->fds[0].events = POLLIN;
 
-	rxq->umem->pmd_zc = internals->pmd_zc;
-
 	dev->data->rx_queues[rx_queue_id] = rxq;
 	return 0;
 
@@ -1220,7 +1465,7 @@ xdp_get_channels_info(const char *if_name, int *max_queues,
 
 static int
 parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
-			int *queue_cnt, int *pmd_zc,
+			int *queue_cnt,
 			int (*queue_irqs)[RTE_MAX_QUEUES_PER_PORT])
 {
 	int ret;
@@ -1242,11 +1487,6 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
 		goto free_kvlist;
 	}
 
-	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_PMD_ZC_ARG,
-				 &parse_integer_arg, pmd_zc);
-	if (ret < 0)
-		goto free_kvlist;
-
 	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_QUEUE_IRQ_ARG,
 				 &parse_queue_irq_arg, queue_irqs);
 	if (ret < 0)
@@ -1289,7 +1529,7 @@ get_iface_info(const char *if_name,
 
 static struct rte_eth_dev *
 init_internals(struct rte_vdev_device *dev, const char *if_name,
-			int start_queue_idx, int queue_cnt, int pmd_zc,
+			int start_queue_idx, int queue_cnt,
 			int queue_irqs[RTE_MAX_QUEUES_PER_PORT])
 {
 	const char *name = rte_vdev_device_name(dev);
@@ -1305,7 +1545,6 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
 
 	internals->start_queue_idx = start_queue_idx;
 	internals->queue_cnt = queue_cnt;
-	internals->pmd_zc = pmd_zc;
 	strlcpy(internals->if_name, if_name, IFNAMSIZ);
 	rte_memcpy(internals->queue_irqs, queue_irqs,
 			sizeof(int) * RTE_MAX_QUEUES_PER_PORT);
@@ -1363,8 +1602,9 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
 	/* Let rte_eth_dev_close() release the port resources. */
 	eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
 
-	if (internals->pmd_zc)
-		AF_XDP_LOG(INFO, "Zero copy between umem and mbuf enabled.\n");
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+	AF_XDP_LOG(INFO, "Zero copy between umem and mbuf enabled.\n");
+#endif
 
 	return eth_dev;
 
@@ -1386,7 +1626,6 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 	int xsk_queue_cnt = ETH_AF_XDP_DFLT_QUEUE_COUNT;
 	struct rte_eth_dev *eth_dev = NULL;
 	const char *name;
-	int pmd_zc = 0;
 	int queue_irqs[RTE_MAX_QUEUES_PER_PORT];
 
 	memset(queue_irqs, -1, sizeof(int) * RTE_MAX_QUEUES_PER_PORT);
@@ -1417,7 +1656,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 		dev->device.numa_node = rte_socket_id();
 
 	if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
-			     &xsk_queue_cnt, &pmd_zc, &queue_irqs) < 0) {
+			     &xsk_queue_cnt, &queue_irqs) < 0) {
 		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
 		return -EINVAL;
 	}
@@ -1428,7 +1667,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
 	}
 
 	eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
-					xsk_queue_cnt, pmd_zc, queue_irqs);
+					xsk_queue_cnt, queue_irqs);
 	if (eth_dev == NULL) {
 		AF_XDP_LOG(ERR, "Failed to init internals\n");
 		return -1;
@@ -1472,7 +1711,6 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
 			      "iface=<string> "
 			      "start_queue=<int> "
 			      "queue_count=<int> "
-			      "pmd_zero_copy=<0|1> "
 			      "queue_irq=<int>:<int>");
 
 RTE_INIT(af_xdp_init_log)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs Ciara Loftus
@ 2019-09-30 17:11   ` Stephen Hemminger
  2019-10-03 13:23     ` Loftus, Ciara
  2019-10-18 23:49   ` Ye Xiaolong
  1 sibling, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2019-09-30 17:11 UTC (permalink / raw)
  To: Ciara Loftus; +Cc: dev, xiaolong.ye, kevin.laatz, bruce.richardson

On Mon, 30 Sep 2019 16:42:04 +0000
Ciara Loftus <ciara.loftus@intel.com> wrote:

> +/* drivers supported for the queue_irq option */
> +enum supported_drivers {
> +	I40E_DRIVER,
> +	IXGBE_DRIVER,
> +	MLX5_DRIVER,
> +	NUM_DRIVERS
> +};

Anything device specific like this raises a red flag to me.

This regex etc. seems like a huge hack. Is there a better way using
irqbalance and smp_affinity in kernel drivers?

NACK


* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-09-30 17:11   ` Stephen Hemminger
@ 2019-10-03 13:23     ` Loftus, Ciara
  2019-10-14 14:43       ` Bruce Richardson
                         ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Loftus, Ciara @ 2019-10-03 13:23 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Ye, Xiaolong, Laatz, Kevin, Richardson, Bruce



> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Monday 30 September 2019 18:12
> To: Loftus, Ciara <ciara.loftus@intel.com>
> Cc: dev@dpdk.org; Ye, Xiaolong <xiaolong.ye@intel.com>; Laatz, Kevin
> <kevin.laatz@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
> 
> On Mon, 30 Sep 2019 16:42:04 +0000
> Ciara Loftus <ciara.loftus@intel.com> wrote:
> 
> > +/* drivers supported for the queue_irq option */
> > +enum supported_drivers {
> > +	I40E_DRIVER,
> > +	IXGBE_DRIVER,
> > +	MLX5_DRIVER,
> > +	NUM_DRIVERS
> > +};
> 
> Anything device specific like this raises a red flag to me.
> 
> This regex etc, seems like a huge hack. Is there a better way  using
> irqbalance and smp_affinity in kernel drivers?
> 
> NACK

Hi Stephen,
 
Thanks for looking at the patch. I understand your concern; however, unfortunately I haven't been able to identify a way to achieve the desired outcome using your suggestions of irqbalance and smp_affinity. Did you have something specific in mind, or are you aware of any generic way of retrieving interrupt numbers for NICs regardless of vendor or range?
 
I think this feature is really important for the usability of this PMD. Without it, to configure the IRQs the user has to open /proc/interrupts, trawl through it to identify the correct IRQ number for their given NIC and qid (the format of which is unlikely to be known off-hand), and manually pin the interrupts by writing the appropriate values, in the appropriate format, to the appropriate file - a process prone to error if not automated IMO.
If the user fails to set the affinity it's probably fine for a single PMD; however, with multiple PMDs all IRQs will by default land on core 0 and lead to terrible performance.

It should be possible to rework the code to remove the regexes and use a direct string compare. Would that make the solution more palatable?
 
Let me know what you think.
 
Thanks,
Ciara


* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-03 13:23     ` Loftus, Ciara
@ 2019-10-14 14:43       ` Bruce Richardson
  2019-10-21 15:24         ` Ferruh Yigit
  2019-10-15 11:14       ` Ray Kinsella
  2019-10-21 10:04       ` Loftus, Ciara
  2 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2019-10-14 14:43 UTC (permalink / raw)
  To: Loftus, Ciara
  Cc: Stephen Hemminger, dev, Ye, Xiaolong, Laatz, Kevin, ferruh.yigit,
	arybchenko

On Thu, Oct 03, 2019 at 02:23:07PM +0100, Loftus, Ciara wrote:
> 
> 
> > -----Original Message----- From: Stephen Hemminger
> > <stephen@networkplumber.org> Sent: Monday 30 September 2019 18:12 To:
> > Loftus, Ciara <ciara.loftus@intel.com> Cc: dev@dpdk.org; Ye, Xiaolong
> > <xiaolong.ye@intel.com>; Laatz, Kevin <kevin.laatz@intel.com>;
> > Richardson, Bruce <bruce.richardson@intel.com> Subject: Re: [dpdk-dev]
> > [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
> > 
> > On Mon, 30 Sep 2019 16:42:04 +0000 Ciara Loftus
> > <ciara.loftus@intel.com> wrote:
> > 
> > > +/* drivers supported for the queue_irq option */ +enum
> > > supported_drivers { +	I40E_DRIVER, +	IXGBE_DRIVER, +
> > > MLX5_DRIVER, +	NUM_DRIVERS +};
> > 
> > Anything device specific like this raises a red flag to me.
> > 
> > This regex etc, seems like a huge hack. Is there a better way  using
> > irqbalance and smp_affinity in kernel drivers?
> > 
> > NACK
> 
> Hi Stephen,
>  
> Thanks for looking at the patch. I understand your concern however
> unfortunately I haven't been able to identify a way to achieve the
> desired outcome by using your suggestions of irqbalance and smp_affinity.
> Did you have something specific in mind or are aware of any generic way
> of retrieving interrupt numbers for NICs regardless of vendor or range?
>  
> I think this feature is really important for the usability of this PMD.
> Without it, to configure the IRQs the user has to open up
> /proc/interrupts, trawl through it and identify the correct IRQ number
> for their given NIC and qid (the format for which is unlikely to be known
> off-hand), and manually pin them by writing the appropriate values in the
> appropriate format to the appropriate file - prone to error if not
> automated IMO.  If the user fails to set the affinity it's probably fine
> for a single pmd, however with multiple pmds all irqs will by default
> land on core 0 and lead to terrible performance.
> 
> It should be possible to rework the code to remove the regexes and use a
> direct string compare. Would that make the solution more palatable?
>  

Hi Ciara, Stephen,

is there any way forward on this patch?

From my experience with using AF_XDP the pinning of interrupts is both
necessary for performance and sadly rather awkward to implement in
practice. If we can't find a better way to do this, I think merging this
patch is the best thing to do. It may be a bit messy, but the overall user
experience should be far improved over not having it.

Regards,
/Bruce


* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-03 13:23     ` Loftus, Ciara
  2019-10-14 14:43       ` Bruce Richardson
@ 2019-10-15 11:14       ` Ray Kinsella
  2019-10-21 10:04       ` Loftus, Ciara
  2 siblings, 0 replies; 27+ messages in thread
From: Ray Kinsella @ 2019-10-15 11:14 UTC (permalink / raw)
  To: dev



On 03/10/2019 14:23, Loftus, Ciara wrote:
> 
> 
>> -----Original Message-----
>> From: Stephen Hemminger <stephen@networkplumber.org>
>> Sent: Monday 30 September 2019 18:12
>> To: Loftus, Ciara <ciara.loftus@intel.com>
>> Cc: dev@dpdk.org; Ye, Xiaolong <xiaolong.ye@intel.com>; Laatz, Kevin
>> <kevin.laatz@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
>> Subject: Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
>>
>> On Mon, 30 Sep 2019 16:42:04 +0000
>> Ciara Loftus <ciara.loftus@intel.com> wrote:
>>
>>> +/* drivers supported for the queue_irq option */
>>> +enum supported_drivers {
>>> +	I40E_DRIVER,
>>> +	IXGBE_DRIVER,
>>> +	MLX5_DRIVER,
>>> +	NUM_DRIVERS
>>> +};
>>
>> Anything device specific like this raises a red flag to me.
>>
>> This regex etc, seems like a huge hack. Is there a better way  using
>> irqbalance and smp_affinity in kernel drivers?
>>
>> NACK
> 
> Hi Stephen,
>  
> Thanks for looking at the patch. I understand your concern however unfortunately I haven't been able to identify a way to achieve the desired outcome by using your suggestions of irqbalance and smp_affinity. Did you have something specific in mind or are aware of any generic way of retrieving interrupt numbers for NICs regardless of vendor or range?
>  
> I think this feature is really important for the usability of this PMD. Without it, to configure the IRQs the user has to open up /proc/interrupts, trawl through it and identify the correct IRQ number for their given NIC and qid (the format for which is unlikely to be known off-hand), and manually pin them by writing the appropriate values in the appropriate format to the appropriate file - prone to error if not automated IMO.
> If the user fails to set the affinity it's probably fine for a single pmd, however with multiple pmds all irqs will by default land on core 0 and lead to terrible performance.
> 
> It should be possible to rework the code to remove the regexes and use a direct string compare. Would that make the solution more palatable?
>  
> Let me know what you think.
>  
> Thanks,
> Ciara
> 

Assuming there is no easier way to correlate an Ethernet device with an interrupt to make the strcmp's go away, my preference is for DPDK to take care of its requirements - even in a less-than-ideal way - in preference to asking a user to figure it out.

Ray K



* Re: [dpdk-dev] [PATCH v2 3/3] net/af_xdp: enable support for unaligned umem chunks
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 3/3] net/af_xdp: enable support for unaligned umem chunks Ciara Loftus
@ 2019-10-18 23:48   ` Ye Xiaolong
  2019-10-22 14:28     ` Ferruh Yigit
  2019-10-24 11:10   ` David Marchand
  1 sibling, 1 reply; 27+ messages in thread
From: Ye Xiaolong @ 2019-10-18 23:48 UTC (permalink / raw)
  To: Ciara Loftus; +Cc: dev, kevin.laatz, bruce.richardson

On 09/30, Ciara Loftus wrote:
>This patch enables the unaligned chunks feature for AF_XDP which allows
>chunks to be placed at arbitrary places in the umem, as opposed to them
>being required to be aligned to 2k. This allows for DPDK application
>mempools to be mapped directly into the umem and in turn enable zero copy
>transfer between umem and the PMD.
>
>This patch replaces the zero copy via external mbuf mechanism introduced
>in commit e9ff8bb71943 ("net/af_xdp: enable zero copy by external mbuf").
>The pmd_zero copy vdev argument is also removed as now the PMD will
>auto-detect presence of the unaligned chunks feature and enable it if so
>and otherwise fall back to copy mode if not detected.
>
>Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
>Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
>---
> doc/guides/nics/af_xdp.rst             |   2 +-
> doc/guides/rel_notes/release_19_11.rst |   4 +
> drivers/net/af_xdp/rte_eth_af_xdp.c    | 402 ++++++++++++++++++++-----
> 3 files changed, 325 insertions(+), 83 deletions(-)
>

Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>


* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs Ciara Loftus
  2019-09-30 17:11   ` Stephen Hemminger
@ 2019-10-18 23:49   ` Ye Xiaolong
  1 sibling, 0 replies; 27+ messages in thread
From: Ye Xiaolong @ 2019-10-18 23:49 UTC (permalink / raw)
  To: Ciara Loftus; +Cc: dev, kevin.laatz, bruce.richardson

On 09/30, Ciara Loftus wrote:
>Network devices using the AF_XDP PMD will trigger interrupts
>on reception of packets. The new PMD argument 'queue_irq'
>allows the user to specify a core on which to pin interrupts
>for a given queue. Multiple queue_irq arguments can be specified.
>For example:
>
>  --vdev=net_af_xdp1,iface=eth0,queue_count=2,
>           queue_irq=0:2,queue_irq=1:5
>
>..will pin queue 0 interrupts to core 2 and queue 1 interrupts
>to core 5.
>
>The queue argument refers to the ethdev queue as opposed to the
>netdev queue. These values are the same unless a value greater
>than 0 is specified in a start_queue argument.
>
>The drivers supported for this feature are those with support for
>AF_XDP zero copy in the kernel, namely ixgbe, i40e and mlx5_core.
>
>Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
>Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>---
> doc/guides/nics/af_xdp.rst             |  15 +
> doc/guides/rel_notes/release_19_11.rst |   7 +
> drivers/net/af_xdp/rte_eth_af_xdp.c    | 366 ++++++++++++++++++++++++-
> 3 files changed, 383 insertions(+), 5 deletions(-)
>

Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>


* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-03 13:23     ` Loftus, Ciara
  2019-10-14 14:43       ` Bruce Richardson
  2019-10-15 11:14       ` Ray Kinsella
@ 2019-10-21 10:04       ` Loftus, Ciara
  2019-10-21 12:52         ` Varghese, Vipin
  2 siblings, 1 reply; 27+ messages in thread
From: Loftus, Ciara @ 2019-10-21 10:04 UTC (permalink / raw)
  To: 'Stephen Hemminger'
  Cc: 'dev@dpdk.org',
	Ye, Xiaolong, Laatz, Kevin, Richardson, Bruce, Yigit, Ferruh

> > On Mon, 30 Sep 2019 16:42:04 +0000
> > Ciara Loftus <ciara.loftus@intel.com> wrote:
> >
> > > +/* drivers supported for the queue_irq option */
> > > +enum supported_drivers {
> > > +	I40E_DRIVER,
> > > +	IXGBE_DRIVER,
> > > +	MLX5_DRIVER,
> > > +	NUM_DRIVERS
> > > +};
> >
> > Anything device specific like this raises a red flag to me.
> >
> > This regex etc, seems like a huge hack. Is there a better way  using
> > irqbalance and smp_affinity in kernel drivers?
> >
> > NACK
> 
> Hi Stephen,
> 
> Thanks for looking at the patch. I understand your concern however
> unfortunately I haven't been able to identify a way to achieve the desired
> outcome by using your suggestions of irqbalance and smp_affinity. Did you
> have something specific in mind or are aware of any generic way of retrieving
> interrupt numbers for NICs regardless of vendor or range?
> 
> I think this feature is really important for the usability of this PMD. Without it,
> to configure the IRQs the user has to open up /proc/interrupts, trawl through
> it and identify the correct IRQ number for their given NIC and qid (the format
> for which is unlikely to be known off-hand), and manually pin them by writing
> the appropriate values in the appropriate format to the appropriate file -
> prone to error if not automated IMO.
> If the user fails to set the affinity it's probably fine for a single pmd, however
> with multiple pmds all irqs will by default land on core 0 and lead to terrible
> performance.

Hi,

Following this up with some performance data which shows the impact of no pinning.

The test case is N instances of testpmd macswap where N= the number of interfaces.

ifaces  no pinning  pinning
1       9059100     9171612
2       9261635     18376552
3       9332804     27696702

For the no-pinning case, all IRQs are landing on the default core 0, which results in very poor scaling versus the pinned case where scaling is linear.

Thanks,
Ciara

> 
> It should be possible to rework the code to remove the regexes and use a
> direct string compare. Would that make the solution more palatable?
> 
> Let me know what you think.
> 
> Thanks,
> Ciara


* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-21 10:04       ` Loftus, Ciara
@ 2019-10-21 12:52         ` Varghese, Vipin
  2019-10-21 13:04           ` Bruce Richardson
  0 siblings, 1 reply; 27+ messages in thread
From: Varghese, Vipin @ 2019-10-21 12:52 UTC (permalink / raw)
  To: Loftus, Ciara, 'Stephen Hemminger'
  Cc: 'dev@dpdk.org',
	Ye, Xiaolong, Laatz, Kevin, Richardson, Bruce, Yigit, Ferruh

Hi Ciara,

snipped
> 
> ifaces  no pinning  pinning
> 1       9059100     9171612
> 2       9261635     18376552
> 3       9332804     27696702
> 
> For the no-pinning case, all IRQs are landing on the default core 0, which
> results in very poor scaling versus the pinned case where scaling is linear.

Thanks for the information, but a question here: is the reason all IRQs land on core '0' that the kernel command line ('isol' or 'no interrupts' settings) is applied to all cores except core 0?

If the cores are not isolated and no interrupts are redirected, normally `cat /proc/interrupts` shows the IRQ mask per core. Depending upon FDIR (Intel X522 and X710), this could be core 0 or 'n-1'?

> 
> Thanks,
> Ciara
> 
> >
> > It should be possible to rework the code to remove the regexes and use
> > a direct string compare. Would that make the solution more palatable?
> >
> > Let me know what you think.
> >
> > Thanks,
> > Ciara


* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-21 12:52         ` Varghese, Vipin
@ 2019-10-21 13:04           ` Bruce Richardson
  2019-10-21 13:11             ` Varghese, Vipin
  0 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2019-10-21 13:04 UTC (permalink / raw)
  To: Varghese, Vipin
  Cc: Loftus, Ciara, 'Stephen Hemminger',
	'dev@dpdk.org',
	Ye, Xiaolong, Laatz, Kevin, Yigit, Ferruh

On Mon, Oct 21, 2019 at 01:52:27PM +0100, Varghese, Vipin wrote:
> Hi Ciara,
> 
> snipped
> > 
> > ifaces  no pinning  pinning
> > 1       9059100     9171612
> > 2       9261635     18376552
> > 3       9332804     27696702
> > 
> > For the no-pinning case, all IRQs are landing on the default core 0, which
> > results in very poor scaling versus the pinned case where scaling is linear.
> 
> Thanks for the information, but a question here `Is the reason for landing all IRQ on core '0' is because the Kernel CMD line 'isol or no interupts' is done for all expect core 0?`
> 
> If the cores are not isolated and no interrupts are redirected; normally `cat /proc/interrupts` shows IRQ mask to cores. Depending upon FDIR (intel X522 and X710) this could be core 0 or 'n-1'?
> 
Yes, the interrupt pinning default is somewhat dependent on the exact
setup, but the fact remains that in just about any setup the interrupts for
an AF_XDP queue are unlikely to end up on the exactly the one core that the
user wants them on. This is what makes this patch so necessary, both from a
usability and performance point of view.

In the absense of alternatives, I really think this patch should be merged,
since with more than one point the difference between having correctly or
incorrectly pinned interrupts is huge. I'd also point out that in my
testing the interrupts need to be pinned each and every time an app is run -
it's not a set once and forget thing. This ability to have the driver pin
the interrupts for the user would be a big timesaver for developers too,
who may be constantly re-running apps when testing.

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-21 13:04           ` Bruce Richardson
@ 2019-10-21 13:11             ` Varghese, Vipin
  2019-10-21 13:17               ` Bruce Richardson
  0 siblings, 1 reply; 27+ messages in thread
From: Varghese, Vipin @ 2019-10-21 13:11 UTC (permalink / raw)
  To: Richardson, Bruce
  Cc: Loftus, Ciara, 'Stephen Hemminger',
	'dev@dpdk.org',
	Ye, Xiaolong, Laatz, Kevin, Yigit, Ferruh

Hi Bruce,

snipped
> > >
> > > For the no-pinning case, all IRQs are landing on the default core 0,
> > > which results in very poor scaling versus the pinned case where scaling is
> linear.
> >
> > Thanks for the information, but a question here `Is the reason for
> > landing all IRQ on core '0' is because the Kernel CMD line 'isol or no
> > interupts' is done for all expect core 0?`
> >
> > If the cores are not isolated and no interrupts are redirected; normally `cat
> /proc/interrupts` shows IRQ mask to cores. Depending upon FDIR (intel X522
> and X710) this could be core 0 or 'n-1'?
> >
> Yes, the interrupt pinning default is somewhat dependent on the exact setup,
> but the fact remains that in just about any setup the interrupts for an AF_XDP
> queue are unlikely to end up on the exactly the one core that the user wants
> them on. This is what makes this patch so necessary, both from a usability and
> performance point of view.
> 
> In the absense of alternatives, I really think this patch should be merged, since
> with more than one point the difference between having correctly or
> incorrectly pinned interrupts is huge. I'd also point out that in my testing the
> interrupts need to be pinned each and every time an app is run - it's not a set
> once and forget thing.
Yes, I agree with you as in my testing with XDP and FDIR we had to do this in each test run.

 This ability to have the driver pin the interrupts for the
> user would be a big timesaver for developers too, who may be constantly re-
> running apps when testing.
Here my understanding, user can not or should not pass DPDK cores for interrupt pinning. So should we ask the driver to fetch `rte_eal_configuration` and ensure the same?

> 
> Regards,
> /Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-21 13:11             ` Varghese, Vipin
@ 2019-10-21 13:17               ` Bruce Richardson
  2019-10-21 13:45                 ` Varghese, Vipin
  0 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2019-10-21 13:17 UTC (permalink / raw)
  To: Varghese, Vipin
  Cc: Loftus, Ciara, 'Stephen Hemminger',
	'dev@dpdk.org',
	Ye, Xiaolong, Laatz, Kevin, Yigit, Ferruh

On Mon, Oct 21, 2019 at 02:11:39PM +0100, Varghese, Vipin wrote:
> Hi Bruce,
> 
> snipped
> > > >
> > > > For the no-pinning case, all IRQs are landing on the default core 0,
> > > > which results in very poor scaling versus the pinned case where scaling is
> > linear.
> > >
> > > Thanks for the information, but a question here `Is the reason for
> > > landing all IRQ on core '0' is because the Kernel CMD line 'isol or no
> > > interupts' is done for all expect core 0?`
> > >
> > > If the cores are not isolated and no interrupts are redirected; normally `cat
> > /proc/interrupts` shows IRQ mask to cores. Depending upon FDIR (intel X522
> > and X710) this could be core 0 or 'n-1'?
> > >
> > Yes, the interrupt pinning default is somewhat dependent on the exact setup,
> > but the fact remains that in just about any setup the interrupts for an AF_XDP
> > queue are unlikely to end up on the exactly the one core that the user wants
> > them on. This is what makes this patch so necessary, both from a usability and
> > performance point of view.
> > 
> > In the absense of alternatives, I really think this patch should be merged, since
> > with more than one point the difference between having correctly or
> > incorrectly pinned interrupts is huge. I'd also point out that in my testing the
> > interrupts need to be pinned each and every time an app is run - it's not a set
> > once and forget thing.
> Yes, I agree with you as in my testing with XDP and FDIR we had to do this in each test run.
> 
>  This ability to have the driver pin the interrupts for the
> > user would be a big timesaver for developers too, who may be constantly re-
> > running apps when testing.
> Here my understanding, user can not or should not pass DPDK cores for interrupt pinning. So should we ask the driver to fetch `rte_eal_configuration` and ensure the same?
> 

Actually I disagree. I think the user should pass the cores for interrupt
pinning, because unlike other PMDs it is perfectly valid to have the
interrupts pinned to dedicated cores separate from those used by DPDK.

Or taking another example, suppose the app takes 8 cores in the coremask,
but only one of those cores is to be used for I/O, what cores should the
driver pin the interrupts to? It probably should be the same core used for
I/O, but the driver can't know which cores will be for that, or
alternatively the user might want to use AF_XDP split across two cores, in
which case any core on the system might be the intended one for interrupts.

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-21 13:17               ` Bruce Richardson
@ 2019-10-21 13:45                 ` Varghese, Vipin
  2019-10-21 13:56                   ` Bruce Richardson
  0 siblings, 1 reply; 27+ messages in thread
From: Varghese, Vipin @ 2019-10-21 13:45 UTC (permalink / raw)
  To: Richardson, Bruce
  Cc: Loftus, Ciara, 'Stephen Hemminger',
	'dev@dpdk.org',
	Ye, Xiaolong, Laatz, Kevin, Yigit, Ferruh

Hi Bruce,

snipped
> >  This ability to have the driver pin the interrupts for the
> > > user would be a big timesaver for developers too, who may be
> > > constantly re- running apps when testing.
> > Here my understanding, user can not or should not pass DPDK cores for
> interrupt pinning. So should we ask the driver to fetch `rte_eal_configuration`
> and ensure the same?
> >
> 
> Actually I disagree. I think the user should pass the cores for interrupt pinning,
I agree to this.

> because unlike other PMDs it is perfectly valid to have the interrupts pinned to
> dedicated cores separate from those used by DPDK.
My point is the same, but not on DPDK DP or service cores.

> 
> Or taking another example, suppose the app takes 8 cores in the coremask, but
> only one of those cores is to be used for I/O, what cores should the driver pin
> the interrupts to?
It can be cores on machine (guest or host) which is not used by DPDK.

 It probably should be the same core used for I/O, but the
> driver can't know which cores will be for that, or alternatively the user might
> want to use AF_XDP split across two cores, in which case any core on the
> system might be the intended one for interrupts.
I agree to the patch, only difference in dev->probe function, should not there be validation to ensure the IRQ core is not DPDK core or Service core as the Interface is owned by kernel and for non matched eBPF skb buff is used by kernel.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-21 13:45                 ` Varghese, Vipin
@ 2019-10-21 13:56                   ` Bruce Richardson
  2019-10-21 14:06                     ` Varghese, Vipin
  0 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2019-10-21 13:56 UTC (permalink / raw)
  To: Varghese, Vipin
  Cc: Loftus, Ciara, 'Stephen Hemminger',
	'dev@dpdk.org',
	Ye, Xiaolong, Laatz, Kevin, Yigit, Ferruh

On Mon, Oct 21, 2019 at 02:45:05PM +0100, Varghese, Vipin wrote:
> Hi Bruce,
> 
> snipped
> > >  This ability to have the driver pin the interrupts for the
> > > > user would be a big timesaver for developers too, who may be
> > > > constantly re- running apps when testing.
> > > Here my understanding, user can not or should not pass DPDK cores for
> > interrupt pinning. So should we ask the driver to fetch `rte_eal_configuration`
> > and ensure the same?
> > >
> > 
> > Actually I disagree. I think the user should pass the cores for interrupt pinning,
> I agree to this.
> 
> > because unlike other PMDs it is perfectly valid to have the interrupts pinned to
> > dedicated cores separate from those used by DPDK.
> My point is the same, but not on DPDK DP or service cores.
> 
> > 
> > Or taking another example, suppose the app takes 8 cores in the coremask, but
> > only one of those cores is to be used for I/O, what cores should the driver pin
> > the interrupts to?
> It can be cores on machine (guest or host) which is not used by DPDK.
> 
>  It probably should be the same core used for I/O, but the
> > driver can't know which cores will be for that, or alternatively the user might
> > want to use AF_XDP split across two cores, in which case any core on the
> > system might be the intended one for interrupts.
> I agree to the patch, only difference in dev->probe function, should not there be validation to ensure the IRQ core is not DPDK core or Service core as the Interface is owned by kernel and for non matched eBPF skb buff is used by kernel.
> 
No. Since the 5.4 kernel, it's a usable configuration to run both the
kernel and userspace portions of AF_XDP on the same core. In order to get
best performance with a fixed number of cores, this setup - with interrupts
pinned to the polling RX core - is now recommended. [For absolute best perf
using any number of cores, a separate interrupt core may still work best,
though]

/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-21 13:56                   ` Bruce Richardson
@ 2019-10-21 14:06                     ` Varghese, Vipin
  0 siblings, 0 replies; 27+ messages in thread
From: Varghese, Vipin @ 2019-10-21 14:06 UTC (permalink / raw)
  To: Richardson, Bruce
  Cc: Loftus, Ciara, 'Stephen Hemminger',
	'dev@dpdk.org',
	Ye, Xiaolong, Laatz, Kevin, Yigit, Ferruh

Ok thanks

> -----Original Message-----
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Monday, October 21, 2019 7:26 PM
> To: Varghese, Vipin <vipin.varghese@intel.com>
> Cc: Loftus, Ciara <ciara.loftus@intel.com>; 'Stephen Hemminger'
> <stephen@networkplumber.org>; 'dev@dpdk.org' <dev@dpdk.org>; Ye,
> Xiaolong <xiaolong.ye@intel.com>; Laatz, Kevin <kevin.laatz@intel.com>;
> Yigit, Ferruh <ferruh.yigit@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
> 
> On Mon, Oct 21, 2019 at 02:45:05PM +0100, Varghese, Vipin wrote:
> > Hi Bruce,
> >
> > snipped
> > > >  This ability to have the driver pin the interrupts for the
> > > > > user would be a big timesaver for developers too, who may be
> > > > > constantly re- running apps when testing.
> > > > Here my understanding, user can not or should not pass DPDK cores
> > > > for
> > > interrupt pinning. So should we ask the driver to fetch
> > > `rte_eal_configuration` and ensure the same?
> > > >
> > >
> > > Actually I disagree. I think the user should pass the cores for
> > > interrupt pinning,
> > I agree to this.
> >
> > > because unlike other PMDs it is perfectly valid to have the
> > > interrupts pinned to dedicated cores separate from those used by DPDK.
> > My point is the same, but not on DPDK DP or service cores.
> >
> > >
> > > Or taking another example, suppose the app takes 8 cores in the
> > > coremask, but only one of those cores is to be used for I/O, what
> > > cores should the driver pin the interrupts to?
> > It can be cores on machine (guest or host) which is not used by DPDK.
> >
> >  It probably should be the same core used for I/O, but the
> > > driver can't know which cores will be for that, or alternatively the
> > > user might want to use AF_XDP split across two cores, in which case
> > > any core on the system might be the intended one for interrupts.
> > I agree to the patch, only difference in dev->probe function, should not there
> be validation to ensure the IRQ core is not DPDK core or Service core as the
> Interface is owned by kernel and for non matched eBPF skb buff is used by
> kernel.
> >
> No. Since the 5.4 kernel, it's a usable configuration to run both the kernel and
> userspace portions of AF_XDP on the same core. In order to get best
> performance with a fixed number of cores, this setup - with interrupts pinned
> to the polling RX core - is now recommended. [For absolute best perf using any
> number of cores, a separate interrupt core may still work best, though]
> 
> /Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-14 14:43       ` Bruce Richardson
@ 2019-10-21 15:24         ` Ferruh Yigit
  2019-10-21 15:54           ` Bruce Richardson
  0 siblings, 1 reply; 27+ messages in thread
From: Ferruh Yigit @ 2019-10-21 15:24 UTC (permalink / raw)
  To: Bruce Richardson, Loftus, Ciara
  Cc: Stephen Hemminger, dev, Ye, Xiaolong, Laatz, Kevin, arybchenko

On 10/14/2019 3:43 PM, Bruce Richardson wrote:
> On Thu, Oct 03, 2019 at 02:23:07PM +0100, Loftus, Ciara wrote:
>>
>>
>>> -----Original Message----- From: Stephen Hemminger
>>> <stephen@networkplumber.org> Sent: Monday 30 September 2019 18:12 To:
>>> Loftus, Ciara <ciara.loftus@intel.com> Cc: dev@dpdk.org; Ye, Xiaolong
>>> <xiaolong.ye@intel.com>; Laatz, Kevin <kevin.laatz@intel.com>;
>>> Richardson, Bruce <bruce.richardson@intel.com> Subject: Re: [dpdk-dev]
>>> [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
>>>
>>> On Mon, 30 Sep 2019 16:42:04 +0000 Ciara Loftus
>>> <ciara.loftus@intel.com> wrote:
>>>
>>>> +/* drivers supported for the queue_irq option */ +enum
>>>> supported_drivers { +	I40E_DRIVER, +	IXGBE_DRIVER, +
>>>> MLX5_DRIVER, +	NUM_DRIVERS +};
>>>
>>> Anything device specific like this raises a red flag to me.
>>>
>>> This regex etc, seems like a huge hack. Is there a better way  using
>>> irqbalance and smp_affinity in kernel drivers?
>>>
>>> NACK
>>
>> Hi Stephen,
>>  
>> Thanks for looking at the patch. I understand your concern however
>> unfortunately I haven't been able to identify a way to achieve the
>> desired outcome by using your suggestions of irqbalance and smp_affinity.
>> Did you have something specific in mind or are aware of any generic way
>> of retrieving interrupt numbers for NICs regardless of vendor or range?
>>  
>> I think this feature is really important for the usability of this PMD.
>> Without it, to configure the IRQs the user has to open up
>> /proc/interrupts, trawl through it and identify the correct IRQ number
>> for their given NIC and qid (the format for which is unlikely to be known
>> off-hand), and manually pin them by writing the appropriate values in the
>> appropriate format to the appropriate file - prone to error if not
>> automated IMO.  If the user fails to set the affinity it's probably fine
>> for a single pmd, however with multiple pmds all irqs will by default
>> land on core 0 and lead to terrible performance.
>>
>> It should be possible to rework the code to remove the regexes and use a
>> direct string compare. Would that make the solution more palatable?
>>  
> 
> Hi Ciara, Stephen,
> 
> is there any way forward on this patch?
> 
> From my experience with using AF_XDP the pinning of interrupts is both
> necessary for performance and sadly rather awkward to implement in
> practice. If we can't find a better way to do this, I think merging this
> patch is the best thing to do. It may be a bit messy, but the overall user
> experience should be far improved over not having it.
> 

Can we have this external to the PMD, like a helper script that run after you
start the DPDK app?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-21 15:24         ` Ferruh Yigit
@ 2019-10-21 15:54           ` Bruce Richardson
  2019-10-21 16:02             ` Ferruh Yigit
  0 siblings, 1 reply; 27+ messages in thread
From: Bruce Richardson @ 2019-10-21 15:54 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Loftus, Ciara, Stephen Hemminger, dev, Ye, Xiaolong, Laatz,
	Kevin, arybchenko

On Mon, Oct 21, 2019 at 04:24:26PM +0100, Ferruh Yigit wrote:
> On 10/14/2019 3:43 PM, Bruce Richardson wrote:
> > On Thu, Oct 03, 2019 at 02:23:07PM +0100, Loftus, Ciara wrote:
> >>
> >>
> >>> -----Original Message----- From: Stephen Hemminger
> >>> <stephen@networkplumber.org> Sent: Monday 30 September 2019 18:12 To:
> >>> Loftus, Ciara <ciara.loftus@intel.com> Cc: dev@dpdk.org; Ye, Xiaolong
> >>> <xiaolong.ye@intel.com>; Laatz, Kevin <kevin.laatz@intel.com>;
> >>> Richardson, Bruce <bruce.richardson@intel.com> Subject: Re: [dpdk-dev]
> >>> [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
> >>>
> >>> On Mon, 30 Sep 2019 16:42:04 +0000 Ciara Loftus
> >>> <ciara.loftus@intel.com> wrote:
> >>>
> >>>> +/* drivers supported for the queue_irq option */ +enum
> >>>> supported_drivers { +	I40E_DRIVER, +	IXGBE_DRIVER, +
> >>>> MLX5_DRIVER, +	NUM_DRIVERS +};
> >>>
> >>> Anything device specific like this raises a red flag to me.
> >>>
> >>> This regex etc, seems like a huge hack. Is there a better way  using
> >>> irqbalance and smp_affinity in kernel drivers?
> >>>
> >>> NACK
> >>
> >> Hi Stephen,
> >>  
> >> Thanks for looking at the patch. I understand your concern however
> >> unfortunately I haven't been able to identify a way to achieve the
> >> desired outcome by using your suggestions of irqbalance and smp_affinity.
> >> Did you have something specific in mind or are aware of any generic way
> >> of retrieving interrupt numbers for NICs regardless of vendor or range?
> >>  
> >> I think this feature is really important for the usability of this PMD.
> >> Without it, to configure the IRQs the user has to open up
> >> /proc/interrupts, trawl through it and identify the correct IRQ number
> >> for their given NIC and qid (the format for which is unlikely to be known
> >> off-hand), and manually pin them by writing the appropriate values in the
> >> appropriate format to the appropriate file - prone to error if not
> >> automated IMO.  If the user fails to set the affinity it's probably fine
> >> for a single pmd, however with multiple pmds all irqs will by default
> >> land on core 0 and lead to terrible performance.
> >>
> >> It should be possible to rework the code to remove the regexes and use a
> >> direct string compare. Would that make the solution more palatable?
> >>  
> > 
> > Hi Ciara, Stephen,
> > 
> > is there any way forward on this patch?
> > 
> > From my experience with using AF_XDP the pinning of interrupts is both
> > necessary for performance and sadly rather awkward to implement in
> > practice. If we can't find a better way to do this, I think merging this
> > patch is the best thing to do. It may be a bit messy, but the overall user
> > experience should be far improved over not having it.
> > 
> 
> Can we have this external to the PMD, like a helper script that run after you
> start the DPDK app?

It could be, but the main objection here seems to be with the method used
to find the interrupts, which would not change based on a script version,
not to mention the fact that the usability would be far less in that case.
For ease of use, this needs to be part of the driver, and the current
implementation is the best we can do right now, and nobody has suggested a
concrete alternative to it.

/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-21 15:54           ` Bruce Richardson
@ 2019-10-21 16:02             ` Ferruh Yigit
  2019-10-21 16:14               ` Bruce Richardson
  0 siblings, 1 reply; 27+ messages in thread
From: Ferruh Yigit @ 2019-10-21 16:02 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Loftus, Ciara, Stephen Hemminger, dev, Ye, Xiaolong, Laatz,
	Kevin, arybchenko

On 10/21/2019 4:54 PM, Bruce Richardson wrote:
> On Mon, Oct 21, 2019 at 04:24:26PM +0100, Ferruh Yigit wrote:
>> On 10/14/2019 3:43 PM, Bruce Richardson wrote:
>>> On Thu, Oct 03, 2019 at 02:23:07PM +0100, Loftus, Ciara wrote:
>>>>
>>>>
>>>>> -----Original Message----- From: Stephen Hemminger
>>>>> <stephen@networkplumber.org> Sent: Monday 30 September 2019 18:12 To:
>>>>> Loftus, Ciara <ciara.loftus@intel.com> Cc: dev@dpdk.org; Ye, Xiaolong
>>>>> <xiaolong.ye@intel.com>; Laatz, Kevin <kevin.laatz@intel.com>;
>>>>> Richardson, Bruce <bruce.richardson@intel.com> Subject: Re: [dpdk-dev]
>>>>> [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
>>>>>
>>>>> On Mon, 30 Sep 2019 16:42:04 +0000 Ciara Loftus
>>>>> <ciara.loftus@intel.com> wrote:
>>>>>
>>>>>> +/* drivers supported for the queue_irq option */ +enum
>>>>>> supported_drivers { +	I40E_DRIVER, +	IXGBE_DRIVER, +
>>>>>> MLX5_DRIVER, +	NUM_DRIVERS +};
>>>>>
>>>>> Anything device specific like this raises a red flag to me.
>>>>>
>>>>> This regex etc, seems like a huge hack. Is there a better way  using
>>>>> irqbalance and smp_affinity in kernel drivers?
>>>>>
>>>>> NACK
>>>>
>>>> Hi Stephen,
>>>>  
>>>> Thanks for looking at the patch. I understand your concern however
>>>> unfortunately I haven't been able to identify a way to achieve the
>>>> desired outcome by using your suggestions of irqbalance and smp_affinity.
>>>> Did you have something specific in mind or are aware of any generic way
>>>> of retrieving interrupt numbers for NICs regardless of vendor or range?
>>>>  
>>>> I think this feature is really important for the usability of this PMD.
>>>> Without it, to configure the IRQs the user has to open up
>>>> /proc/interrupts, trawl through it and identify the correct IRQ number
>>>> for their given NIC and qid (the format for which is unlikely to be known
>>>> off-hand), and manually pin them by writing the appropriate values in the
>>>> appropriate format to the appropriate file - prone to error if not
>>>> automated IMO.  If the user fails to set the affinity it's probably fine
>>>> for a single pmd, however with multiple pmds all irqs will by default
>>>> land on core 0 and lead to terrible performance.
>>>>
>>>> It should be possible to rework the code to remove the regexes and use a
>>>> direct string compare. Would that make the solution more palatable?
>>>>  
>>>
>>> Hi Ciara, Stephen,
>>>
>>> is there any way forward on this patch?
>>>
>>> From my experience with using AF_XDP the pinning of interrupts is both
>>> necessary for performance and sadly rather awkward to implement in
>>> practice. If we can't find a better way to do this, I think merging this
>>> patch is the best thing to do. It may be a bit messy, but the overall user
>>> experience should be far improved over not having it.
>>>
>>
>> Can we have this external to the PMD, like a helper script that run after you
>> start the DPDK app?
> 
> It could be, but the main objection here seems to be with the method used
> to find the interrupts, which would not change based on a script version,
> not to mention the fact that the usability would be far less in that case.
> For ease of use, this needs to be part of the driver, and the current
> implementation is the best we can do right now, and nobody has suggested a
> concrete alternative to it.
> 

The method used to find interrupts can be same but it being embedded into a
specific PMD with the list of supported driver hardcoded versus putting this
information into a script differs I think.

I tend to agree that this looks hack for the PMD.

And not sure if the usability will be far less, instead of providing parameters
to the PMD, you will provide to the script. And script becomes something more
generic.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
  2019-10-21 16:02             ` Ferruh Yigit
@ 2019-10-21 16:14               ` Bruce Richardson
  0 siblings, 0 replies; 27+ messages in thread
From: Bruce Richardson @ 2019-10-21 16:14 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Loftus, Ciara, Stephen Hemminger, dev, Ye, Xiaolong, Laatz,
	Kevin, arybchenko

On Mon, Oct 21, 2019 at 05:02:25PM +0100, Ferruh Yigit wrote:
> On 10/21/2019 4:54 PM, Bruce Richardson wrote:
> > On Mon, Oct 21, 2019 at 04:24:26PM +0100, Ferruh Yigit wrote:
> >> On 10/14/2019 3:43 PM, Bruce Richardson wrote:
> >>> On Thu, Oct 03, 2019 at 02:23:07PM +0100, Loftus, Ciara wrote:
> >>>>
> >>>>
> >>>>> -----Original Message----- From: Stephen Hemminger
> >>>>> <stephen@networkplumber.org> Sent: Monday 30 September 2019 18:12 To:
> >>>>> Loftus, Ciara <ciara.loftus@intel.com> Cc: dev@dpdk.org; Ye, Xiaolong
> >>>>> <xiaolong.ye@intel.com>; Laatz, Kevin <kevin.laatz@intel.com>;
> >>>>> Richardson, Bruce <bruce.richardson@intel.com> Subject: Re: [dpdk-dev]
> >>>>> [PATCH v2 2/3] net/af_xdp: support pinning of IRQs
> >>>>>
> >>>>> On Mon, 30 Sep 2019 16:42:04 +0000 Ciara Loftus
> >>>>> <ciara.loftus@intel.com> wrote:
> >>>>>
> >>>>>> +/* drivers supported for the queue_irq option */ +enum
> >>>>>> supported_drivers { +	I40E_DRIVER, +	IXGBE_DRIVER, +
> >>>>>> MLX5_DRIVER, +	NUM_DRIVERS +};
> >>>>>
> >>>>> Anything device specific like this raises a red flag to me.
> >>>>>
> >>>>> This regex etc, seems like a huge hack. Is there a better way  using
> >>>>> irqbalance and smp_affinity in kernel drivers?
> >>>>>
> >>>>> NACK
> >>>>
> >>>> Hi Stephen,
> >>>>  
> >>>> Thanks for looking at the patch. I understand your concern however
> >>>> unfortunately I haven't been able to identify a way to achieve the
> >>>> desired outcome by using your suggestions of irqbalance and smp_affinity.
> >>>> Did you have something specific in mind or are aware of any generic way
> >>>> of retrieving interrupt numbers for NICs regardless of vendor or range?
> >>>>  
> >>>> I think this feature is really important for the usability of this PMD.
> >>>> Without it, to configure the IRQs the user has to open up
> >>>> /proc/interrupts, trawl through it and identify the correct IRQ number
> >>>> for their given NIC and qid (the format for which is unlikely to be known
> >>>> off-hand), and manually pin them by writing the appropriate values in the
> >>>> appropriate format to the appropriate file - prone to error if not
> >>>> automated IMO.  If the user fails to set the affinity it's probably fine
> >>>> for a single pmd, however with multiple pmds all irqs will by default
> >>>> land on core 0 and lead to terrible performance.
> >>>>
> >>>> It should be possible to rework the code to remove the regexes and use a
> >>>> direct string compare. Would that make the solution more palatable?
> >>>>  
> >>>
> >>> Hi Ciara, Stephen,
> >>>
> >>> is there any way forward on this patch?
> >>>
> >>> From my experience with using AF_XDP the pinning of interrupts is both
> >>> necessary for performance and sadly rather awkward to implement in
> >>> practice. If we can't find a better way to do this, I think merging this
> >>> patch is the best thing to do. It may be a bit messy, but the overall user
> >>> experience should be far improved over not having it.
> >>>
> >>
> >> Can we have this external to the PMD, like a helper script that run after you
> >> start the DPDK app?
> > 
> > It could be, but the main objection here seems to be with the method used
> > to find the interrupts, which would not change based on a script version,
> > not to mention the fact that the usability would be far less in that case.
> > For ease of use, this needs to be part of the driver, and the current
> > implementation is the best we can do right now, and nobody has suggested a
> > concrete alternative to it.
> > 
> 
> The method used to find interrupts can be same but it being embedded into a
> specific PMD with the list of supported driver hardcoded versus putting this
> information into a script differs I think.
> 
> I tend to agree that this looks hack for the PMD.
> 
We are happy to work to make it less of a hack. However, no better methods
are forthcoming.

> And not sure if the usability will be far less, instead of providing parameters
> to the PMD, you will provide to the script. And script becomes something more
> generic.
Unlikely to be generic. It will still be very much tied to AF_XDP PMD, I
think. A script would be workable option except for the fact that the
script has to be run each time *after* the application starts up. This is
very different to e.g. dev_bind, where it only needs to be done once, and
can be done ahead of application startup. Using a script in this case means
that the user needs to either start their app in the background or use a
separate terminal session to run the script once app startup has occurred.
This also makes the integration into a CI automation system that bit more
complex too, I believe.

Beyond this, this patch solves an immediate, visible, user problem
illustrated with measurable performance data above, while the issues with
it are concerning possible future maintenance issues which may never arise.

/Bruce

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/3] net/af_xdp: fix Tx halt when no recv packets
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 1/3] net/af_xdp: fix Tx halt when no recv packets Ciara Loftus
@ 2019-10-22  5:32   ` Ye Xiaolong
  0 siblings, 0 replies; 27+ messages in thread
From: Ye Xiaolong @ 2019-10-22  5:32 UTC (permalink / raw)
  To: Ciara Loftus; +Cc: dev, kevin.laatz, bruce.richardson

Hi, 

As this is indeed a kernel driver issue, and Magnus is working on the workaound/fix
in i40e kernel driver, as long as the workaound/fix can be merged in v5.4, this
patch can be dropped.

Thanks,
Xiaolong

On 09/30, Ciara Loftus wrote:
>From: Xiaolong Ye <xiaolong.ye@intel.com>
>
>The kernel only consumes Tx packets if we have some Rx traffic on specified
>queue or we have called send(). So we need to issue a send() even when the
>allocation fails so that kernel will start to consume packets again.
>
>Commit 45bba02c95b0 ("net/af_xdp: support need wakeup feature") breaks
>above rule by adding some condition to send, this patch fixes it while
>still keeps the need_wakeup feature for Tx.
>
>Fixes: 45bba02c95b0 ("net/af_xdp: support need wakeup feature")
>Cc: stable@dpdk.org
>
>Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
>Tested-by: Ciara Loftus <ciara.loftus@intel.com>
>---
> drivers/net/af_xdp/rte_eth_af_xdp.c | 28 ++++++++++++++--------------
> 1 file changed, 14 insertions(+), 14 deletions(-)
>
>diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
>index 41ed5b2af..e496e9aaa 100644
>--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
>+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
>@@ -286,19 +286,16 @@ kick_tx(struct pkt_tx_queue *txq)
> {
> 	struct xsk_umem_info *umem = txq->pair->umem;
> 
>-#if defined(XDP_USE_NEED_WAKEUP)
>-	if (xsk_ring_prod__needs_wakeup(&txq->tx))
>-#endif
>-		while (send(xsk_socket__fd(txq->pair->xsk), NULL,
>-			    0, MSG_DONTWAIT) < 0) {
>-			/* some thing unexpected */
>-			if (errno != EBUSY && errno != EAGAIN && errno != EINTR)
>-				break;
>-
>-			/* pull from completion queue to leave more space */
>-			if (errno == EAGAIN)
>-				pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
>-		}
>+	while (send(xsk_socket__fd(txq->pair->xsk), NULL,
>+		    0, MSG_DONTWAIT) < 0) {
>+		/* some thing unexpected */
>+		if (errno != EBUSY && errno != EAGAIN && errno != EINTR)
>+			break;
>+
>+		/* pull from completion queue to leave more space */
>+		if (errno == EAGAIN)
>+			pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
>+	}
> 	pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
> }
> 
>@@ -367,7 +364,10 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
> 
> 	xsk_ring_prod__submit(&txq->tx, nb_pkts);
> 
>-	kick_tx(txq);
>+#if defined(XDP_USE_NEED_WAKEUP)
>+	if (xsk_ring_prod__needs_wakeup(&txq->tx))
>+#endif
>+		kick_tx(txq);
> 
> 	txq->stats.tx_pkts += nb_pkts;
> 	txq->stats.tx_bytes += tx_bytes;
>-- 
>2.17.1
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] net/af_xdp: enable support for unaligned umem chunks
  2019-10-18 23:48   ` Ye Xiaolong
@ 2019-10-22 14:28     ` Ferruh Yigit
  0 siblings, 0 replies; 27+ messages in thread
From: Ferruh Yigit @ 2019-10-22 14:28 UTC (permalink / raw)
  To: Ye Xiaolong, Ciara Loftus; +Cc: dev, kevin.laatz, bruce.richardson

On 10/19/2019 12:48 AM, Ye Xiaolong wrote:
> On 09/30, Ciara Loftus wrote:
>> This patch enables the unaligned chunks feature for AF_XDP, which allows
>> chunks to be placed at arbitrary offsets in the umem rather than requiring
>> them to be aligned to 2k. This allows DPDK application mempools to be
>> mapped directly into the umem, in turn enabling zero copy transfer between
>> the umem and the PMD.
>>
>> This patch replaces the zero copy via external mbuf mechanism introduced
>> in commit e9ff8bb71943 ("net/af_xdp: enable zero copy by external mbuf").
>> The pmd_zero copy vdev argument is also removed: the PMD now auto-detects
>> the presence of the unaligned chunks feature, enabling it if available and
>> otherwise falling back to copy mode.
>>
>> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
>> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
>> ---
>> doc/guides/nics/af_xdp.rst             |   2 +-
>> doc/guides/rel_notes/release_19_11.rst |   4 +
>> drivers/net/af_xdp/rte_eth_af_xdp.c    | 402 ++++++++++++++++++++-----
>> 3 files changed, 325 insertions(+), 83 deletions(-)
>>
> 
> Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
> 

We tend to not split patches but since this patch doesn't depend on others, I
will get this while the discussion is going on with the other one.

Applied to dpdk-next-net/master, thanks.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] net/af_xdp: enable support for unaligned umem chunks
  2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 3/3] net/af_xdp: enable support for unaligned umem chunks Ciara Loftus
  2019-10-18 23:48   ` Ye Xiaolong
@ 2019-10-24 11:10   ` David Marchand
  2019-10-24 12:18     ` David Marchand
  1 sibling, 1 reply; 27+ messages in thread
From: David Marchand @ 2019-10-24 11:10 UTC (permalink / raw)
  To: Ciara Loftus, Yigit, Ferruh
  Cc: dev, Xiaolong Ye, Kevin Laatz, Bruce Richardson, Maxime Coquelin

On Mon, Sep 30, 2019 at 6:45 PM Ciara Loftus <ciara.loftus@intel.com> wrote:
> diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
> index 06bf57c42..22369107c 100644
> --- a/doc/guides/rel_notes/release_19_11.rst
> +++ b/doc/guides/rel_notes/release_19_11.rst
> @@ -62,6 +62,8 @@ New Features
>
>    * Support for pinning netdev queue IRQs to cores specified by the user.
>      Available for ixgbe, i40e and mlx5 drivers.
> +  * Enabled zero copy between application mempools and UMEM by enabling the
> +    XDP_UMEM_UNALIGNED_CHUNKS UMEM flag.
>
>
>  Removed Items
> @@ -85,6 +87,8 @@ Removed Items
>       "port config <port_id> rx_offload crc_strip|scatter|ipv4_cksum|udp_cksum|tcp_cksum|
>       timestamp|vlan_strip|vlan_filter|vlan_extend on|off"
>
> +   * Removed AF_XDP pmd_zero copy vdev argument. Support is now auto-detected.
> +
>

Caught by Maxime, this part is indented and interpreted as a comment
when generating the doc.

Ferruh, how do you want to deal with this?
I can merge http://patchwork.dpdk.org/patch/60798/ in master for testpmd.

Or you merge it and fix the AF_XDP issue in next-net.


Thanks.
-- 
David Marchand

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] net/af_xdp: enable support for unaligned umem chunks
  2019-10-24 11:10   ` David Marchand
@ 2019-10-24 12:18     ` David Marchand
  2019-10-24 14:18       ` Ferruh Yigit
  0 siblings, 1 reply; 27+ messages in thread
From: David Marchand @ 2019-10-24 12:18 UTC (permalink / raw)
  To: Yigit, Ferruh
  Cc: dev, Xiaolong Ye, Kevin Laatz, Bruce Richardson, Maxime Coquelin,
	Ciara Loftus, Thomas Monjalon

On Thu, Oct 24, 2019 at 1:10 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> On Mon, Sep 30, 2019 at 6:45 PM Ciara Loftus <ciara.loftus@intel.com> wrote:
> > diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
> > index 06bf57c42..22369107c 100644
> > --- a/doc/guides/rel_notes/release_19_11.rst
> > +++ b/doc/guides/rel_notes/release_19_11.rst
> > @@ -62,6 +62,8 @@ New Features
> >
> >    * Support for pinning netdev queue IRQs to cores specified by the user.
> >      Available for ixgbe, i40e and mlx5 drivers.
> > +  * Enabled zero copy between application mempools and UMEM by enabling the
> > +    XDP_UMEM_UNALIGNED_CHUNKS UMEM flag.
> >
> >
> >  Removed Items
> > @@ -85,6 +87,8 @@ Removed Items
> >       "port config <port_id> rx_offload crc_strip|scatter|ipv4_cksum|udp_cksum|tcp_cksum|
> >       timestamp|vlan_strip|vlan_filter|vlan_extend on|off"
> >
> > +   * Removed AF_XDP pmd_zero copy vdev argument. Support is now auto-detected.
> > +
> >
>
> Caught by Maxime, this part is indented and interpreted as a comment
> when generating the doc.
>
> Ferruh, how do you want to deal with this?
> I can merge http://patchwork.dpdk.org/patch/60798/ in master for testpmd.
>

Synced with Thomas.
The first fix is going through master.
Thomas will fix when pulling next-net.


-- 
David Marchand

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] net/af_xdp: enable support for unaligned umem chunks
  2019-10-24 12:18     ` David Marchand
@ 2019-10-24 14:18       ` Ferruh Yigit
  0 siblings, 0 replies; 27+ messages in thread
From: Ferruh Yigit @ 2019-10-24 14:18 UTC (permalink / raw)
  To: David Marchand
  Cc: dev, Xiaolong Ye, Kevin Laatz, Bruce Richardson, Maxime Coquelin,
	Ciara Loftus, Thomas Monjalon

On 10/24/2019 1:18 PM, David Marchand wrote:
> On Thu, Oct 24, 2019 at 1:10 PM David Marchand
> <david.marchand@redhat.com> wrote:
>>
>> On Mon, Sep 30, 2019 at 6:45 PM Ciara Loftus <ciara.loftus@intel.com> wrote:
>>> diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
>>> index 06bf57c42..22369107c 100644
>>> --- a/doc/guides/rel_notes/release_19_11.rst
>>> +++ b/doc/guides/rel_notes/release_19_11.rst
>>> @@ -62,6 +62,8 @@ New Features
>>>
>>>    * Support for pinning netdev queue IRQs to cores specified by the user.
>>>      Available for ixgbe, i40e and mlx5 drivers.
>>> +  * Enabled zero copy between application mempools and UMEM by enabling the
>>> +    XDP_UMEM_UNALIGNED_CHUNKS UMEM flag.
>>>
>>>
>>>  Removed Items
>>> @@ -85,6 +87,8 @@ Removed Items
>>>       "port config <port_id> rx_offload crc_strip|scatter|ipv4_cksum|udp_cksum|tcp_cksum|
>>>       timestamp|vlan_strip|vlan_filter|vlan_extend on|off"
>>>
>>> +   * Removed AF_XDP pmd_zero copy vdev argument. Support is now auto-detected.
>>> +
>>>
>>
>> Caught by Maxime, this part is indented and interpreted as a comment
>> when generating the doc.
>>
>> Ferruh, how do you want to deal with this?
>> I can merge http://patchwork.dpdk.org/patch/60798/ in master for testpmd.
>>
> 
> Synced with Thomas.
> The first fix is going through master.
> Thomas will fix when pulling next-net.
> 

Thanks!

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-10-24 14:18 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-30 16:42 [dpdk-dev] [PATCH v2 0/3] AF_XDP tx halt fix, IRQ pinning and unaligned chunks Ciara Loftus
2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 1/3] net/af_xdp: fix Tx halt when no recv packets Ciara Loftus
2019-10-22  5:32   ` Ye Xiaolong
2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 2/3] net/af_xdp: support pinning of IRQs Ciara Loftus
2019-09-30 17:11   ` Stephen Hemminger
2019-10-03 13:23     ` Loftus, Ciara
2019-10-14 14:43       ` Bruce Richardson
2019-10-21 15:24         ` Ferruh Yigit
2019-10-21 15:54           ` Bruce Richardson
2019-10-21 16:02             ` Ferruh Yigit
2019-10-21 16:14               ` Bruce Richardson
2019-10-15 11:14       ` Ray Kinsella
2019-10-21 10:04       ` Loftus, Ciara
2019-10-21 12:52         ` Varghese, Vipin
2019-10-21 13:04           ` Bruce Richardson
2019-10-21 13:11             ` Varghese, Vipin
2019-10-21 13:17               ` Bruce Richardson
2019-10-21 13:45                 ` Varghese, Vipin
2019-10-21 13:56                   ` Bruce Richardson
2019-10-21 14:06                     ` Varghese, Vipin
2019-10-18 23:49   ` Ye Xiaolong
2019-09-30 16:42 ` [dpdk-dev] [PATCH v2 3/3] net/af_xdp: enable support for unaligned umem chunks Ciara Loftus
2019-10-18 23:48   ` Ye Xiaolong
2019-10-22 14:28     ` Ferruh Yigit
2019-10-24 11:10   ` David Marchand
2019-10-24 12:18     ` David Marchand
2019-10-24 14:18       ` Ferruh Yigit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).