DPDK patches and discussions
* [dpdk-dev] [PATCH 0/2] LACP control packet filtering offload
@ 2017-05-27 11:27 Tomasz Kulasek
  2017-05-27 11:27 ` [dpdk-dev] [PATCH 1/2] " Tomasz Kulasek
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Tomasz Kulasek @ 2017-05-27 11:27 UTC (permalink / raw)
  To: dev; +Cc: declan.doherty

1. Overview

  Packet processing in the current path for bonding in mode 4 requires
  parsing all packets in the fast path to classify and process LACP
  packets.

  The idea behind this performance improvement is to use hardware
  offloads to improve packet classification.

2. Scope of work

   a) Optimization of software LACP packet classification by using
      packet_type metadata to eliminate the requirement of parsing each
      packet in the received burst.

   b) Implementation of a classification mechanism using the flow
      director to redirect LACP packets to a dedicated queue (not
      visible to the application).

      - Filter pattern choosing (not all filters are supported by all
        devices),
      - Changing the processing path to speed up non-LACP packet
        processing,
      - Handle LACP packets from dedicated Rx queue and send to the
        dedicated Tx queue,

   c) Creation of a fallback mechanism allowing selection of the most
      preferable processing method:

      - Flow director,
      - Packet type metadata,
      - Software parsing,

3. Implementation

3.1. Packet type

   The packet_type approach would result in a performance improvement
   as packet data would no longer need to be read, but with this
   approach the bonded driver would still need to look at the mbuf of
   each packet, thereby having an impact on the achievable Rx
   performance.

   There is no packet_type value describing LACP packets directly.
   However, it can be used to limit the number of packets that must be
   parsed, e.g. if packet_type indicates >L2 packets.

   This should improve performance, since well-known non-LACP packets
   can be skipped without the need to look into their data.
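
   As a rough illustration (not part of the patch set itself), the
   per-packet test boils down to a packet_type check like the sketch
   below; it assumes the slave NIC fills in packet_type metadata and
   mirrors the check added to the Rx burst path in patch 1/2:

      #include <rte_mbuf.h>

      /* LACPDUs are plain L2 frames, so any mbuf the NIC has already
       * classified as more than RTE_PTYPE_L2_ETHER cannot be a slow
       * packet and needs no further parsing. */
      static inline int
      may_be_slow_packet(const struct rte_mbuf *m)
      {
              return (m->packet_type & ~RTE_PTYPE_L2_ETHER) == 0;
      }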

3.2. Flow director

   Using the rte_flow API and a pattern matching the Ethernet type of
   slow packets (0x8809), we can configure the flow director to
   redirect slow packets to a separate queue.

   An independent Rx queue for LACP would remove the requirement to
   filter all ingress traffic in software, which should result in a
   performance increase. Other queues stay untouched and processing of
   packets on the fast path is reduced to simply collecting packets
   from the slaves.

   A separate Tx queue for the LACP daemon allows LACP responses to be
   sent immediately, without interfering with the Tx fast path.
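
   A minimal sketch of the flow rule involved (illustrative only; it
   mirrors the rule patch 1/2 installs on each slave, with the port id
   and queue index as placeholders):

      #include <rte_flow.h>
      #include <rte_ether.h>
      #include <rte_byteorder.h>

      /* Steer slow (ethertype 0x8809) frames on a slave port to a
       * dedicated Rx queue.  Returns 0 on success, -1 on failure. */
      static int
      steer_slow_frames(uint8_t slave_port_id, uint16_t slow_rx_queue)
      {
              struct rte_flow_error err;
              const struct rte_flow_attr attr = { .ingress = 1 };
              struct rte_flow_item_eth eth_spec = {
                      .type = rte_cpu_to_be_16(ETHER_TYPE_SLOW),
              };
              struct rte_flow_item_eth eth_mask = { .type = 0xFFFF };
              const struct rte_flow_item pattern[] = {
                      { .type = RTE_FLOW_ITEM_TYPE_ETH,
                        .spec = &eth_spec, .mask = &eth_mask },
                      { .type = RTE_FLOW_ITEM_TYPE_END },
              };
              struct rte_flow_action_queue queue = {
                      .index = slow_rx_queue,
              };
              const struct rte_flow_action actions[] = {
                      { .type = RTE_FLOW_ACTION_TYPE_QUEUE,
                        .conf = &queue },
                      { .type = RTE_FLOW_ACTION_TYPE_END },
              };

              if (rte_flow_create(slave_port_id, &attr, pattern,
                              actions, &err) == NULL)
                      return -1;
              return 0;
      }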

   RECEIVE

         .---------------.
         | Slave 0       |
         |      .------. |
         |  Fd  | Rxq  | |
   Rx ======o==>|      |==============.
         |  |   +======+ |            |      .---------------.
         |  `-->| LACP |--------.     |      | Bonding       |
         |      `------' |      |     |      |      .------. |
         `---------------'      |     |      |      |      | |
                                |     >============>|      |=======> Rx
         .---------------.      |     |      |      +======+ |
         | Slave 1       |      |     |      |      | XXXX | |
         |      .------. |      |     |      |      `------' |
         |  Fd  | Rxq  | |      |     |      `---------------'
   Rx ======o==>|      |=============='        .-----------.
         |  |   +======+ |      |             /             \
         |  `-->| LACP |--------+----------->+  LACP DAEMON  |
         |      `------' |             Tx <---\             /
         `---------------'                     `-----------'

   All slow packets received by slaves in the bond are redirected to the
   separate queue using the flow director. Other packets are collected
   from the slaves and exposed to the application with an Rx burst on
   the bonded device.

   TRANSMIT

         .---------------.
         | Slave 0       |
         |      .------. |
         |      |      | |
   Tx <=====+===|      |<=============.
         |  |   |------| |            |      .---------------.
         |  `---| LACP |<-------.     |      | Bonding       |
         |      `------' |      |     |      |      .------. |
         `---------------'      |     |      |      |      | |
                                |     +<============|      |<====== Tx
         .---------------.      |     |      |      +======+ |
         | Slave 1       |      |     |      |      | XXXX | |
         |      .------. |      |     |      |      `------' |
         |      |      | |      |     |      `---------------'
   Tx <=====+===|      |<============='  Rx    .-----------.
         |  |   |------| |      |         `-->/             \
         |  `---| LACP |<-------+------------+  LACP DAEMON  |
         |      `------' |                    \             /
         `---------------'                     `-----------'

   On transmit, packets are propagated to the slaves. Since we have a
   separate Tx queue for LACP responses, they can be sent independently
   of the fast path.

   LACP DAEMON

   In this mode all slow packets are handled in the LACP DAEMON.


Tomasz Kulasek (2):
  LACP control packet filtering offload
  test-pmd: add set bonding slow_queue hw/sw

 app/test-pmd/cmdline.c                            |  58 ++++
 drivers/net/bonding/rte_eth_bond_8023ad.c         | 141 +++++++--
 drivers/net/bonding/rte_eth_bond_8023ad.h         |   6 +
 drivers/net/bonding/rte_eth_bond_8023ad_private.h |  15 +
 drivers/net/bonding/rte_eth_bond_pmd.c            | 345 +++++++++++++++++++++-
 drivers/net/bonding/rte_eth_bond_version.map      |   9 +
 6 files changed, 539 insertions(+), 35 deletions(-)

-- 
1.9.1


* [dpdk-dev] [PATCH 1/2] LACP control packet filtering offload
  2017-05-27 11:27 [dpdk-dev] [PATCH 0/2] LACP control packet filtering offload Tomasz Kulasek
@ 2017-05-27 11:27 ` Tomasz Kulasek
  2017-05-29  8:10   ` Adrien Mazarguil
  2017-06-29  9:18   ` Declan Doherty
  2017-05-27 11:27 ` [dpdk-dev] [PATCH 2/2] test-pmd: add set bonding slow_queue hw/sw Tomasz Kulasek
  2017-06-29 16:20 ` [dpdk-dev] [PATCH v2 0/2] LACP control packet filtering offload Tomasz Kulasek
  2 siblings, 2 replies; 23+ messages in thread
From: Tomasz Kulasek @ 2017-05-27 11:27 UTC (permalink / raw)
  To: dev; +Cc: declan.doherty

New API functions implemented:

   rte_eth_bond_8023ad_slow_queue_enable(uint8_t port_id);
   rte_eth_bond_8023ad_slow_queue_disable(uint8_t port_id);

rte_eth_bond_8023ad_slow_queue_enable should be called before the bonding
port is started to enable the new path.

When this option is enabled, all slaves must support flow director
filtering by Ethernet type and must provide one additional Rx and Tx
queue each.
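
A minimal usage sketch (illustrative only; the bonded port id below is a
placeholder):

   uint8_t bond_port_id = 0;   /* id of the bonded device */

   /* Must be called while the bonded port is stopped. */
   if (rte_eth_bond_8023ad_slow_queue_enable(bond_port_id) != 0)
           rte_exit(EXIT_FAILURE, "cannot enable the slow queue path\n");

   /* Slave slow queues and flow rules are set up when the bonded
    * device (and therefore each slave) is started. */
   rte_eth_dev_start(bond_port_id);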

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
---
 drivers/net/bonding/rte_eth_bond_8023ad.c         | 141 +++++++--
 drivers/net/bonding/rte_eth_bond_8023ad.h         |   6 +
 drivers/net/bonding/rte_eth_bond_8023ad_private.h |  15 +
 drivers/net/bonding/rte_eth_bond_pmd.c            | 345 +++++++++++++++++++++-
 drivers/net/bonding/rte_eth_bond_version.map      |   9 +
 5 files changed, 481 insertions(+), 35 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index 7b863d6..125eb45 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -632,12 +632,20 @@
 	lacpdu->tlv_type_terminator = TLV_TYPE_TERMINATOR_INFORMATION;
 	lacpdu->terminator_length = 0;
 
-	if (rte_ring_enqueue(port->tx_ring, lacp_pkt) == -ENOBUFS) {
-		/* If TX ring full, drop packet and free message. Retransmission
-		 * will happen in next function call. */
-		rte_pktmbuf_free(lacp_pkt);
-		set_warning_flags(port, WRN_TX_QUEUE_FULL);
-		return;
+	if (internals->mode4.slow_rx_queue == 0) {
+		if (rte_ring_enqueue(port->tx_ring, lacp_pkt) == -ENOBUFS) {
+			/* If TX ring full, drop packet and free message. Retransmission
+			 * will happen in next function call. */
+			rte_pktmbuf_free(lacp_pkt);
+			set_warning_flags(port, WRN_TX_QUEUE_FULL);
+			return;
+		}
+	} else {
+		if (rte_eth_tx_burst(slave_id, internals->mode4.slow_tx_queue, &lacp_pkt, 1) == 0) {
+			rte_pktmbuf_free(lacp_pkt);
+			set_warning_flags(port, WRN_TX_QUEUE_FULL);
+			return;
+		}
 	}
 
 	MODE4_DEBUG("sending LACP frame\n");
@@ -741,6 +749,25 @@
 }
 
 static void
+rx_machine_update(struct bond_dev_private *internals, uint8_t slave_id,
+		struct rte_mbuf *lacp_pkt) {
+
+	/* Find LACP packet to this port. Do not check subtype, it is done in
+	 * function that queued packet */
+	if (lacp_pkt != NULL) {
+		struct lacpdu_header *lacp;
+
+		lacp = rte_pktmbuf_mtod(lacp_pkt, struct lacpdu_header *);
+		RTE_ASSERT(lacp->lacpdu.subtype == SLOW_SUBTYPE_LACP);
+
+		/* This is LACP frame so pass it to rx_machine */
+		rx_machine(internals, slave_id, &lacp->lacpdu);
+		rte_pktmbuf_free(lacp_pkt);
+	} else
+		rx_machine(internals, slave_id, NULL);
+}
+
+static void
 bond_mode_8023ad_periodic_cb(void *arg)
 {
 	struct rte_eth_dev *bond_dev = arg;
@@ -809,20 +836,21 @@
 
 		SM_FLAG_SET(port, LACP_ENABLED);
 
-		/* Find LACP packet to this port. Do not check subtype, it is done in
-		 * function that queued packet */
-		if (rte_ring_dequeue(port->rx_ring, &pkt) == 0) {
-			struct rte_mbuf *lacp_pkt = pkt;
-			struct lacpdu_header *lacp;
+		struct rte_mbuf *lacp_pkt = NULL;
 
-			lacp = rte_pktmbuf_mtod(lacp_pkt, struct lacpdu_header *);
-			RTE_ASSERT(lacp->lacpdu.subtype == SLOW_SUBTYPE_LACP);
+		if (internals->mode4.slow_rx_queue == 0) {
+			/* Find LACP packet to this port. Do not check subtype, it is done in
+			 * function that queued packet */
+			if (rte_ring_dequeue(port->rx_ring, &pkt) == 0)
+				lacp_pkt = pkt;
 
-			/* This is LACP frame so pass it to rx_machine */
-			rx_machine(internals, slave_id, &lacp->lacpdu);
-			rte_pktmbuf_free(lacp_pkt);
-		} else
-			rx_machine(internals, slave_id, NULL);
+			rx_machine_update(internals, slave_id, lacp_pkt);
+		} else {
+			if (rte_eth_rx_burst(slave_id, internals->mode4.slow_rx_queue, &lacp_pkt, 1) == 1)
+				bond_mode_8023ad_handle_slow_pkt(internals, slave_id, lacp_pkt);
+			else
+				rx_machine_update(internals, slave_id, NULL);
+		}
 
 		periodic_machine(internals, slave_id);
 		mux_machine(internals, slave_id);
@@ -1188,18 +1216,36 @@
 		m_hdr->marker.tlv_type_marker = MARKER_TLV_TYPE_RESP;
 		rte_eth_macaddr_get(slave_id, &m_hdr->eth_hdr.s_addr);
 
-		if (unlikely(rte_ring_enqueue(port->tx_ring, pkt) == -ENOBUFS)) {
-			/* reset timer */
-			port->rx_marker_timer = 0;
-			wrn = WRN_TX_QUEUE_FULL;
-			goto free_out;
+		if (internals->mode4.slow_tx_queue == 0) {
+			if (unlikely(rte_ring_enqueue(port->tx_ring, pkt) ==
+					-ENOBUFS)) {
+				/* reset timer */
+				port->rx_marker_timer = 0;
+				wrn = WRN_TX_QUEUE_FULL;
+				goto free_out;
+			}
+		} else {
+			/* Send packet directly to the slow queue */
+			if (unlikely(rte_eth_tx_burst(slave_id,
+					internals->mode4.slow_tx_queue,
+					&pkt, 1) == 0)) {
+				/* reset timer */
+				port->rx_marker_timer = 0;
+				wrn = WRN_TX_QUEUE_FULL;
+				goto free_out;
+			}
 		}
 	} else if (likely(subtype == SLOW_SUBTYPE_LACP)) {
-		if (unlikely(rte_ring_enqueue(port->rx_ring, pkt) == -ENOBUFS)) {
-			/* If RX fing full free lacpdu message and drop packet */
-			wrn = WRN_RX_QUEUE_FULL;
-			goto free_out;
-		}
+
+		if (internals->mode4.slow_rx_queue == 0) {
+			if (unlikely(rte_ring_enqueue(port->rx_ring, pkt) == -ENOBUFS)) {
+				/* If RX fing full free lacpdu message and drop packet */
+				wrn = WRN_RX_QUEUE_FULL;
+				goto free_out;
+			}
+		} else
+			rx_machine_update(internals, slave_id, pkt);
+
 	} else {
 		wrn = WRN_UNKNOWN_SLOW_TYPE;
 		goto free_out;
@@ -1504,3 +1550,42 @@
 	rte_eal_alarm_set(internals->mode4.update_timeout_us,
 			bond_mode_8023ad_ext_periodic_cb, arg);
 }
+
+#define MBUF_CACHE_SIZE 250
+#define NUM_MBUFS 8191
+
+int
+rte_eth_bond_8023ad_slow_queue_enable(uint8_t port)
+{
+	int retval = 0;
+	struct rte_eth_dev *dev = &rte_eth_devices[port];
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		dev->data->dev_private;
+
+	if (check_for_bonded_ethdev(dev) != 0)
+		return -1;
+
+	internals->mode4.slow_rx_queue = dev->data->nb_rx_queues;
+	internals->mode4.slow_tx_queue = dev->data->nb_tx_queues;
+
+	bond_ethdev_mode_set(dev, internals->mode);
+	return retval;
+}
+
+int
+rte_eth_bond_8023ad_slow_queue_disable(uint8_t port)
+{
+	int retval = 0;
+	struct rte_eth_dev *dev = &rte_eth_devices[port];
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		dev->data->dev_private;
+
+	if (check_for_bonded_ethdev(dev) != 0)
+		return -1;
+
+	internals->mode4.slow_rx_queue = 0;
+	internals->mode4.slow_tx_queue = 0;
+
+	bond_ethdev_mode_set(dev, internals->mode);
+	return retval;
+}
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.h b/drivers/net/bonding/rte_eth_bond_8023ad.h
index 6b8ff57..8d21c7a 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.h
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.h
@@ -302,4 +302,10 @@ struct rte_eth_bond_8023ad_slave_info {
 rte_eth_bond_8023ad_ext_slowtx(uint8_t port_id, uint8_t slave_id,
 		struct rte_mbuf *lacp_pkt);
 
+int
+rte_eth_bond_8023ad_slow_queue_enable(uint8_t port_id);
+
+int
+rte_eth_bond_8023ad_slow_queue_disable(uint8_t port_id);
+
 #endif /* RTE_ETH_BOND_8023AD_H_ */
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad_private.h b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
index ca8858b..3963714 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad_private.h
+++ b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
@@ -39,6 +39,7 @@
 #include <rte_ether.h>
 #include <rte_byteorder.h>
 #include <rte_atomic.h>
+#include <rte_flow.h>
 
 #include "rte_eth_bond_8023ad.h"
 
@@ -162,6 +163,9 @@ struct port {
 
 	uint64_t warning_timer;
 	volatile uint16_t warnings_to_show;
+
+	/** Memory pool used to allocate slow queues */
+	struct rte_mempool *slow_pool;
 };
 
 struct mode8023ad_private {
@@ -175,6 +179,10 @@ struct mode8023ad_private {
 	uint64_t update_timeout_us;
 	rte_eth_bond_8023ad_ext_slowrx_fn slowrx_cb;
 	uint8_t external_sm;
+
+	uint8_t slow_rx_queue; /**< Queue no for slow packets, or 0 if no accel */
+	uint8_t slow_tx_queue;
+	struct rte_flow *slow_flow[RTE_MAX_ETHPORTS];
 };
 
 /**
@@ -295,4 +303,11 @@ struct mode8023ad_private {
 void
 bond_mode_8023ad_mac_address_update(struct rte_eth_dev *bond_dev);
 
+int
+bond_ethdev_8023ad_flow_verify(struct rte_eth_dev *bond_dev,
+		uint8_t slave_port);
+
+int
+bond_ethdev_8023ad_flow_set(struct rte_eth_dev *bond_dev, uint8_t slave_port);
+
 #endif /* RTE_ETH_BOND_8023AD_H_ */
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 82959ab..558682c 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -59,6 +59,12 @@
 /* Table for statistics in mode 5 TLB */
 static uint64_t tlb_last_obytets[RTE_MAX_ETHPORTS];
 
+#if  __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+#define _htons(x) ((uint16_t)((((x) & 0x00ffU) << 8) | (((x) & 0xff00U) >> 8)))
+#else
+#define _htons(x) (x)
+#endif
+
 static inline size_t
 get_vlan_offset(struct ether_hdr *eth_hdr, uint16_t *proto)
 {
@@ -133,6 +139,215 @@
 		(subtype == SLOW_SUBTYPE_MARKER || subtype == SLOW_SUBTYPE_LACP));
 }
 
+/*****************************************************************************
+ * Flow director's setup for mode 4 optimization
+ */
+
+static struct rte_flow_item_eth flow_item_eth_type_8023ad = {
+	.dst.addr_bytes = { 0 },
+	.src.addr_bytes = { 0 },
+	.type = _htons(ETHER_TYPE_SLOW),
+};
+
+static struct rte_flow_item_eth flow_item_eth_mask_type_8023ad = {
+	.dst.addr_bytes = { 0 },
+	.src.addr_bytes = { 0 },
+	.type = 0xFFFF,
+};
+
+static struct rte_flow_item flow_item_8023ad[] = {
+	{
+		.type = RTE_FLOW_ITEM_TYPE_ETH,
+		.spec = &flow_item_eth_type_8023ad,
+		.last = NULL,
+		.mask = &flow_item_eth_mask_type_8023ad,
+	},
+	{
+		.type = RTE_FLOW_ITEM_TYPE_END,
+		.spec = NULL,
+		.last = NULL,
+		.mask = NULL,
+	}
+};
+
+const struct rte_flow_attr flow_attr_8023ad = {
+	.group = 0,
+	.priority = 0,
+	.ingress = 1,
+	.egress = 0,
+	.reserved = 0,
+};
+
+int
+bond_ethdev_8023ad_flow_verify(struct rte_eth_dev *bond_dev,
+		uint8_t slave_port) {
+
+	struct rte_flow_error error;
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+			(bond_dev->data->dev_private);
+
+	struct rte_flow_action_queue lacp_queue_conf = {
+		.index = internals->mode4.slow_rx_queue,
+	};
+
+	const struct rte_flow_action actions[] = {
+		{
+			.type = RTE_FLOW_ACTION_TYPE_QUEUE,
+			.conf = &lacp_queue_conf
+		},
+		{
+			.type = RTE_FLOW_ACTION_TYPE_END,
+		}
+	};
+
+	int ret = rte_flow_validate(slave_port, &flow_attr_8023ad,
+			flow_item_8023ad, actions, &error);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+int
+bond_ethdev_8023ad_flow_set(struct rte_eth_dev *bond_dev, uint8_t slave_port) {
+
+	struct rte_flow_error error;
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+			(bond_dev->data->dev_private);
+
+	struct rte_flow_action_queue lacp_queue_conf = {
+		.index = internals->mode4.slow_rx_queue,
+	};
+
+	const struct rte_flow_action actions[] = {
+		{
+			.type = RTE_FLOW_ACTION_TYPE_QUEUE,
+			.conf = &lacp_queue_conf
+		},
+		{
+			.type = RTE_FLOW_ACTION_TYPE_END,
+		}
+	};
+
+	internals->mode4.slow_flow[slave_port] = rte_flow_create(slave_port,
+			&flow_attr_8023ad, flow_item_8023ad, actions, &error);
+	if (internals->mode4.slow_flow[slave_port] == NULL) {
+		RTE_BOND_LOG(ERR,
+			"bond_ethdev_8023ad_flow_set: %s (slave_port=%d queue_id=%d)",
+			error.message, slave_port, internals->mode4.slow_rx_queue);
+		return -1;
+	}
+
+	return 0;
+}
+
+static uint16_t
+bond_ethdev_rx_burst_8023ad_fast_queue(void *queue, struct rte_mbuf **bufs,
+		uint16_t nb_pkts)
+{
+	struct bond_rx_queue *bd_rx_q = (struct bond_rx_queue *)queue;
+	struct bond_dev_private *internals = bd_rx_q->dev_private;
+	uint16_t num_rx_total = 0;	/* Total number of received packets */
+	uint8_t slaves[RTE_MAX_ETHPORTS];
+	uint8_t slave_count;
+
+	uint8_t i;
+
+	/* Copy slave list to protect against slave up/down changes during tx
+	 * bursting */
+	slave_count = internals->active_slave_count;
+	memcpy(slaves, internals->active_slaves,
+			sizeof(internals->active_slaves[0]) * slave_count);
+
+	for (i = 0; i < slave_count && num_rx_total < nb_pkts; i++) {
+		/* Read packets from this slave */
+		num_rx_total += rte_eth_rx_burst(slaves[i], bd_rx_q->queue_id,
+				&bufs[num_rx_total], nb_pkts - num_rx_total);
+	}
+
+	return num_rx_total;
+}
+
+static uint16_t
+bond_ethdev_tx_burst_8023ad_fast_queue(void *queue, struct rte_mbuf **bufs,
+		uint16_t nb_pkts)
+{
+	struct bond_dev_private *internals;
+	struct bond_tx_queue *bd_tx_q;
+
+	uint8_t num_of_slaves;
+	uint8_t slaves[RTE_MAX_ETHPORTS];
+	 /* positions in slaves, not ID */
+	uint8_t distributing_offsets[RTE_MAX_ETHPORTS];
+	uint8_t distributing_count;
+
+	uint16_t num_tx_slave, num_tx_total = 0, num_tx_fail_total = 0;
+	uint16_t i, op_slave_idx;
+
+	struct rte_mbuf *slave_bufs[RTE_MAX_ETHPORTS][nb_pkts];
+
+	/* Total amount of packets in slave_bufs */
+	uint16_t slave_nb_pkts[RTE_MAX_ETHPORTS] = { 0 };
+	/* Slow packets placed in each slave */
+
+	if (unlikely(nb_pkts == 0))
+		return 0;
+
+	bd_tx_q = (struct bond_tx_queue *)queue;
+	internals = bd_tx_q->dev_private;
+
+	/* Copy slave list to protect against slave up/down changes during tx
+	 * bursting */
+	num_of_slaves = internals->active_slave_count;
+	if (num_of_slaves < 1)
+		return num_tx_total;
+
+	memcpy(slaves, internals->active_slaves, sizeof(slaves[0]) *
+			num_of_slaves);
+
+	distributing_count = 0;
+	for (i = 0; i < num_of_slaves; i++) {
+		struct port *port = &mode_8023ad_ports[slaves[i]];
+		if (ACTOR_STATE(port, DISTRIBUTING))
+			distributing_offsets[distributing_count++] = i;
+	}
+
+	if (likely(distributing_count > 0)) {
+		/* Populate slaves mbuf with the packets which are to be sent on it */
+		for (i = 0; i < nb_pkts; i++) {
+			/* Select output slave using hash based on xmit policy */
+			op_slave_idx = internals->xmit_hash(bufs[i], distributing_count);
+
+			/* Populate slave mbuf arrays with mbufs for that slave. Use only
+			 * slaves that are currently distributing. */
+			uint8_t slave_offset = distributing_offsets[op_slave_idx];
+			slave_bufs[slave_offset][slave_nb_pkts[slave_offset]] = bufs[i];
+			slave_nb_pkts[slave_offset]++;
+		}
+	}
+
+	/* Send packet burst on each slave device */
+	for (i = 0; i < num_of_slaves; i++) {
+		if (slave_nb_pkts[i] == 0)
+			continue;
+
+		num_tx_slave = rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
+				slave_bufs[i], slave_nb_pkts[i]);
+
+		num_tx_total += num_tx_slave;
+		num_tx_fail_total += slave_nb_pkts[i] - num_tx_slave;
+
+		/* If tx burst fails move packets to end of bufs */
+		if (unlikely(num_tx_slave < slave_nb_pkts[i])) {
+			uint16_t j = nb_pkts - num_tx_fail_total;
+			for ( ; num_tx_slave < slave_nb_pkts[i]; j++, num_tx_slave++)
+				bufs[j] = slave_bufs[i][num_tx_slave];
+		}
+	}
+
+	return num_tx_total;
+}
+
 static uint16_t
 bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 		uint16_t nb_pkts)
@@ -180,6 +395,13 @@
 
 		/* Handle slow protocol packets. */
 		while (j < num_rx_total) {
+
+			/* if packet is not pure L2 and is known, skip it */
+			if ((bufs[j]->packet_type & ~RTE_PTYPE_L2_ETHER) != 0) {
+				j++;
+				continue;
+			}
+
 			if (j + 3 < num_rx_total)
 				rte_prefetch0(rte_pktmbuf_mtod(bufs[j + 3], void *));
 
@@ -1295,11 +1517,19 @@ struct bwg_slave {
 		if (bond_mode_8023ad_enable(eth_dev) != 0)
 			return -1;
 
-		eth_dev->rx_pkt_burst = bond_ethdev_rx_burst_8023ad;
-		eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_8023ad;
-		RTE_LOG(WARNING, PMD,
-				"Using mode 4, it is necessary to do TX burst and RX burst "
-				"at least every 100ms.\n");
+		if (internals->mode4.slow_rx_queue == 0) {
+			eth_dev->rx_pkt_burst = bond_ethdev_rx_burst_8023ad;
+			eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_8023ad;
+			RTE_LOG(WARNING, PMD,
+				"Using mode 4, it is necessary to do TX burst "
+				"and RX burst at least every 100ms.\n");
+		} else {
+			/* Use flow director's optimization */
+			eth_dev->rx_pkt_burst =
+					bond_ethdev_rx_burst_8023ad_fast_queue;
+			eth_dev->tx_pkt_burst =
+					bond_ethdev_tx_burst_8023ad_fast_queue;
+		}
 		break;
 	case BONDING_MODE_TLB:
 		eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_tlb;
@@ -1321,6 +1551,72 @@ struct bwg_slave {
 	return 0;
 }
 
+static int
+slave_configure_slow_queue(struct rte_eth_dev *bonded_eth_dev,
+		struct rte_eth_dev *slave_eth_dev)
+{
+	int errval = 0;
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		bonded_eth_dev->data->dev_private;
+	struct port *port = &mode_8023ad_ports[slave_eth_dev->data->port_id];
+
+	if ((internals->mode != BONDING_MODE_8023AD) ||
+			(internals->mode4.slow_rx_queue == 0) ||
+			(internals->mode4.slow_tx_queue == 0))
+		return 0;
+
+	if (port->slow_pool == NULL) {
+		char mem_name[256];
+		int slave_id = slave_eth_dev->data->port_id;
+
+		snprintf(mem_name, RTE_DIM(mem_name), "slave_port%u_slow_pool",
+				slave_id);
+		port->slow_pool = rte_pktmbuf_pool_create(mem_name, 8191,
+			250, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
+			slave_eth_dev->data->numa_node);
+
+		/* Any memory allocation failure in initialization is critical because
+		 * resources can't be free, so reinitialization is impossible. */
+		if (port->slow_pool == NULL) {
+			rte_panic("Slave %u: Failed to create memory pool '%s': %s\n",
+				slave_id, mem_name, rte_strerror(rte_errno));
+		}
+	}
+
+	if (internals->mode4.slow_rx_queue > 0) {
+		/* Configure slow Rx queue */
+
+		errval = rte_eth_rx_queue_setup(slave_eth_dev->data->port_id,
+				internals->mode4.slow_rx_queue, 128,
+				rte_eth_dev_socket_id(slave_eth_dev->data->port_id),
+				NULL, port->slow_pool);
+		if (errval != 0) {
+			RTE_BOND_LOG(ERR,
+					"rte_eth_rx_queue_setup: port=%d queue_id %d, err (%d)",
+					slave_eth_dev->data->port_id,
+					internals->mode4.slow_rx_queue,
+					errval);
+			return errval;
+		}
+	}
+
+	if (internals->mode4.slow_tx_queue > 0) {
+		errval = rte_eth_tx_queue_setup(slave_eth_dev->data->port_id,
+				internals->mode4.slow_tx_queue, 512,
+				rte_eth_dev_socket_id(slave_eth_dev->data->port_id),
+				NULL);
+		if (errval != 0) {
+			RTE_BOND_LOG(ERR,
+				"rte_eth_tx_queue_setup: port=%d queue_id %d, err (%d)",
+				slave_eth_dev->data->port_id,
+				internals->mode4.slow_tx_queue,
+				errval);
+			return errval;
+		}
+	}
+	return 0;
+}
+
 int
 slave_configure(struct rte_eth_dev *bonded_eth_dev,
 		struct rte_eth_dev *slave_eth_dev)
@@ -1330,6 +1626,10 @@ struct bwg_slave {
 
 	int errval;
 	uint16_t q_id;
+	struct rte_flow_error flow_error;
+
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		bonded_eth_dev->data->dev_private;
 
 	/* Stop slave */
 	rte_eth_dev_stop(slave_eth_dev->data->port_id);
@@ -1359,10 +1659,19 @@ struct bwg_slave {
 	slave_eth_dev->data->dev_conf.rxmode.hw_vlan_filter =
 			bonded_eth_dev->data->dev_conf.rxmode.hw_vlan_filter;
 
+	uint16_t nb_rx_queues = bonded_eth_dev->data->nb_rx_queues;
+	uint16_t nb_tx_queues = bonded_eth_dev->data->nb_tx_queues;
+
+	if (internals->mode == BONDING_MODE_8023AD) {
+		if (internals->mode4.slow_rx_queue > 0)
+			nb_rx_queues++;
+		if (internals->mode4.slow_tx_queue > 0)
+			nb_tx_queues++;
+	}
+
 	/* Configure device */
 	errval = rte_eth_dev_configure(slave_eth_dev->data->port_id,
-			bonded_eth_dev->data->nb_rx_queues,
-			bonded_eth_dev->data->nb_tx_queues,
+			nb_rx_queues, nb_tx_queues,
 			&(slave_eth_dev->data->dev_conf));
 	if (errval != 0) {
 		RTE_BOND_LOG(ERR, "Cannot configure slave device: port %u , err (%d)",
@@ -1402,6 +1711,28 @@ struct bwg_slave {
 		}
 	}
 
+	slave_configure_slow_queue(bonded_eth_dev, slave_eth_dev);
+
+	if ((internals->mode == BONDING_MODE_8023AD) &&
+			(internals->mode4.slow_rx_queue > 0)) {
+
+		if (bond_ethdev_8023ad_flow_verify(bonded_eth_dev,
+				slave_eth_dev->data->port_id) != 0) {
+			RTE_BOND_LOG(ERR,
+					"rte_eth_tx_queue_setup: port=%d queue_id %d, err (%d)",
+					slave_eth_dev->data->port_id, q_id, errval);
+			return -1;
+		}
+
+		if (internals->mode4.slow_flow[slave_eth_dev->data->port_id] != NULL)
+			rte_flow_destroy(slave_eth_dev->data->port_id,
+					internals->mode4.slow_flow[slave_eth_dev->data->port_id],
+					&flow_error);
+
+		bond_ethdev_8023ad_flow_set(bonded_eth_dev,
+				slave_eth_dev->data->port_id);
+	}
+
 	/* Start device */
 	errval = rte_eth_dev_start(slave_eth_dev->data->port_id);
 	if (errval != 0) {
diff --git a/drivers/net/bonding/rte_eth_bond_version.map b/drivers/net/bonding/rte_eth_bond_version.map
index 2de0a7d..6f1f13a 100644
--- a/drivers/net/bonding/rte_eth_bond_version.map
+++ b/drivers/net/bonding/rte_eth_bond_version.map
@@ -43,3 +43,12 @@ DPDK_16.07 {
 	rte_eth_bond_8023ad_setup;
 
 } DPDK_16.04;
+
+DPDK_17.08 {
+	global:
+
+	rte_eth_bond_8023ad_slow_queue_enable;
+	rte_eth_bond_8023ad_slow_queue_disable;
+
+	local: *;
+} DPDK_16.07;
-- 
1.9.1


* [dpdk-dev] [PATCH 2/2] test-pmd: add set bonding slow_queue hw/sw
  2017-05-27 11:27 [dpdk-dev] [PATCH 0/2] LACP control packet filtering offload Tomasz Kulasek
  2017-05-27 11:27 ` [dpdk-dev] [PATCH 1/2] " Tomasz Kulasek
@ 2017-05-27 11:27 ` Tomasz Kulasek
  2017-06-29 16:20 ` [dpdk-dev] [PATCH v2 0/2] LACP control packet filtering offload Tomasz Kulasek
  2 siblings, 0 replies; 23+ messages in thread
From: Tomasz Kulasek @ 2017-05-27 11:27 UTC (permalink / raw)
  To: dev; +Cc: declan.doherty

This patch adds a new command:

    set bonding slow_queue <port_id> sw|hw

"set bonding slow_queue <bonding_port_id> hw" sets hardware management
of slow packets and chooses simplified paths for tx/rx bursts.

"set bonding slow_queue <bonding_port_id> sw" turns back to the software
handling of slow packets. This option is default.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
---
 app/test-pmd/cmdline.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0afac68..11fa4a5 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -87,6 +87,7 @@
 #include <cmdline.h>
 #ifdef RTE_LIBRTE_PMD_BOND
 #include <rte_eth_bond.h>
+#include <rte_eth_bond_8023ad.h>
 #endif
 #ifdef RTE_LIBRTE_IXGBE_PMD
 #include <rte_pmd_ixgbe.h>
@@ -4279,6 +4280,62 @@ static void cmd_set_bonding_mode_parsed(void *parsed_result,
 		}
 };
 
+/* *** SET BONDING SLOW_QUEUE SW/HW *** */
+struct cmd_set_bonding_slow_queue_result {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t bonding;
+	cmdline_fixed_string_t slow_queue;
+	uint8_t port_id;
+	cmdline_fixed_string_t mode;
+};
+
+static void cmd_set_bonding_slow_queue_parsed(void *parsed_result,
+		__attribute__((unused))  struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_set_bonding_slow_queue_result *res = parsed_result;
+	portid_t port_id = res->port_id;
+
+	if (!strcmp(res->mode, "hw")) {
+		rte_eth_bond_8023ad_slow_queue_enable(port_id);
+		printf("Hardware slow queue enabled\n");
+	} else if (!strcmp(res->mode, "sw")) {
+		rte_eth_bond_8023ad_slow_queue_disable(port_id);
+	}
+}
+
+cmdline_parse_token_string_t cmd_setbonding_slow_queue_set =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_slow_queue_result,
+		set, "set");
+cmdline_parse_token_string_t cmd_setbonding_slow_queue_bonding =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_slow_queue_result,
+		bonding, "bonding");
+cmdline_parse_token_string_t cmd_setbonding_slow_queue_slow_queue =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_slow_queue_result,
+		slow_queue, "slow_queue");
+cmdline_parse_token_num_t cmd_setbonding_slow_queue_port =
+TOKEN_NUM_INITIALIZER(struct cmd_set_bonding_slow_queue_result,
+		port_id, UINT8);
+cmdline_parse_token_string_t cmd_setbonding_slow_queue_mode =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_slow_queue_result,
+		mode, "sw#hw");
+
+cmdline_parse_inst_t cmd_set_slow_queue = {
+		.f = cmd_set_bonding_slow_queue_parsed,
+		.help_str = "set bonding slow_queue <port_id> "
+			"sw|hw: "
+			"Set the bonding slow queue acceleration for port_id",
+		.data = NULL,
+		.tokens = {
+				(void *)&cmd_setbonding_slow_queue_set,
+				(void *)&cmd_setbonding_slow_queue_bonding,
+				(void *)&cmd_setbonding_slow_queue_slow_queue,
+				(void *)&cmd_setbonding_slow_queue_port,
+				(void *)&cmd_setbonding_slow_queue_mode,
+				NULL
+		}
+};
+
 /* *** SET BALANCE XMIT POLICY *** */
 struct cmd_set_bonding_balance_xmit_policy_result {
 	cmdline_fixed_string_t set;
@@ -13613,6 +13670,7 @@ struct cmd_cmdfile_result {
 	(cmdline_parse_inst_t *) &cmd_set_bond_mac_addr,
 	(cmdline_parse_inst_t *) &cmd_set_balance_xmit_policy,
 	(cmdline_parse_inst_t *) &cmd_set_bond_mon_period,
+	(cmdline_parse_inst_t *) &cmd_set_slow_queue,
 #endif
 	(cmdline_parse_inst_t *)&cmd_vlan_offload,
 	(cmdline_parse_inst_t *)&cmd_vlan_tpid,
-- 
1.9.1


* Re: [dpdk-dev] [PATCH 1/2] LACP control packet filtering offload
  2017-05-27 11:27 ` [dpdk-dev] [PATCH 1/2] " Tomasz Kulasek
@ 2017-05-29  8:10   ` Adrien Mazarguil
  2017-06-29  9:18   ` Declan Doherty
  1 sibling, 0 replies; 23+ messages in thread
From: Adrien Mazarguil @ 2017-05-29  8:10 UTC (permalink / raw)
  To: Tomasz Kulasek; +Cc: dev, declan.doherty

Hi Tomasz,

On Sat, May 27, 2017 at 01:27:43PM +0200, Tomasz Kulasek wrote:
> New API funtions implemented:
> 
>    rte_eth_bond_8023ad_slow_queue_enable(uint8_t port_id);
>    rte_eth_bond_8023ad_slow_queue_disable(uint8_t port_id);
> 
> rte_eth_bond_8023ad_slow_queue_enable should be called before bonding port
> start to enable new path.
> 
> When this option is enabled all slaves must support flow director's
> filtering by ethernet type and support one additional queue on slaves
> tx/rx.
> 
> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
[...]
> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
> index 82959ab..558682c 100644
> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> @@ -59,6 +59,12 @@
>  /* Table for statistics in mode 5 TLB */
>  static uint64_t tlb_last_obytets[RTE_MAX_ETHPORTS];
>  
> +#if  __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
> +#define _htons(x) ((uint16_t)((((x) & 0x00ffU) << 8) | (((x) & 0xff00U) >> 8)))
> +#else
> +#define _htons(x) (x)
> +#endif
> +
[...]
>  static inline size_t
>  get_vlan_offset(struct ether_hdr *eth_hdr, uint16_t *proto)
>  {
> @@ -133,6 +139,215 @@
>  		(subtype == SLOW_SUBTYPE_MARKER || subtype == SLOW_SUBTYPE_LACP));
>  }
>  
> +/*****************************************************************************
> + * Flow director's setup for mode 4 optimization
> + */
> +
> +static struct rte_flow_item_eth flow_item_eth_type_8023ad = {
> +	.dst.addr_bytes = { 0 },
> +	.src.addr_bytes = { 0 },
> +	.type = _htons(ETHER_TYPE_SLOW),
> +};

Might I interest you in a more generic alternative [1]?

[1] http://dpdk.org/ml/archives/dev/2017-May/066097.html

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [PATCH 1/2] LACP control packet filtering offload
  2017-05-27 11:27 ` [dpdk-dev] [PATCH 1/2] " Tomasz Kulasek
  2017-05-29  8:10   ` Adrien Mazarguil
@ 2017-06-29  9:18   ` Declan Doherty
  1 sibling, 0 replies; 23+ messages in thread
From: Declan Doherty @ 2017-06-29  9:18 UTC (permalink / raw)
  To: Tomasz Kulasek, dev

On 27/05/17 12:27, Tomasz Kulasek wrote:
> New API funtions implemented:
> 
>     rte_eth_bond_8023ad_slow_queue_enable(uint8_t port_id);
>     rte_eth_bond_8023ad_slow_queue_disable(uint8_t port_id);
> 
> rte_eth_bond_8023ad_slow_queue_enable should be called before bonding port
> start to enable new path.
> 
> When this option is enabled all slaves must support flow director's
> filtering by ethernet type and support one additional queue on slaves
> tx/rx.
> 
> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> ---
>   drivers/net/bonding/rte_eth_bond_8023ad.c         | 141 +++++++--
>   drivers/net/bonding/rte_eth_bond_8023ad.h         |   6 +
>   drivers/net/bonding/rte_eth_bond_8023ad_private.h |  15 +
>   drivers/net/bonding/rte_eth_bond_pmd.c            | 345 +++++++++++++++++++++-
>   drivers/net/bonding/rte_eth_bond_version.map      |   9 +
>   5 files changed, 481 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
> index 7b863d6..125eb45 100644
> --- a/drivers/net/bonding/rte_eth_bond_8023ad.c
> +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
> @@ -632,12 +632,20 @@
>   	lacpdu->tlv_type_terminator = TLV_TYPE_TERMINATOR_INFORMATION;
>   	lacpdu->terminator_length = 0;
>   
> -	if (rte_ring_enqueue(port->tx_ring, lacp_pkt) == -ENOBUFS) {
> -		/* If TX ring full, drop packet and free message. Retransmission
> -		 * will happen in next function call. */
> -		rte_pktmbuf_free(lacp_pkt);
> -		set_warning_flags(port, WRN_TX_QUEUE_FULL);
> -		return;
> +	if (internals->mode4.slow_rx_queue == 0) {

I think we should have an explicit flag indicating whether hw filtering
of slow packets is enabled, instead of checking the rx/tx queue id as
above.

> +		if (rte_ring_enqueue(port->tx_ring, lacp_pkt) == -ENOBUFS) {
> +			/* If TX ring full, drop packet and free message. Retransmission
> +			 * will happen in next function call. */
> +			rte_pktmbuf_free(lacp_pkt);
> +			set_warning_flags(port, WRN_TX_QUEUE_FULL);
> +			return;
> +		}
> +	} else {
> +		if (rte_eth_tx_burst(slave_id, internals->mode4.slow_tx_queue, &lacp_pkt, 1) == 0) {
> +			rte_pktmbuf_free(lacp_pkt);
> +			set_warning_flags(port, WRN_TX_QUEUE_FULL);
> +			return;
> +		}
>   	}
>   
>   	MODE4_DEBUG("sending LACP frame\n");
> @@ -741,6 +749,25 @@
>   }
>   
>   static void
> +rx_machine_update(struct bond_dev_private *internals, uint8_t slave_id,
> +		struct rte_mbuf *lacp_pkt) {
> +
> +	/* Find LACP packet to this port. Do not check subtype, it is done in
> +	 * function that queued packet */
> +	if (lacp_pkt != NULL) {
> +		struct lacpdu_header *lacp;
> +
> +		lacp = rte_pktmbuf_mtod(lacp_pkt, struct lacpdu_header *);
> +		RTE_ASSERT(lacp->lacpdu.subtype == SLOW_SUBTYPE_LACP);
> +
> +		/* This is LACP frame so pass it to rx_machine */
> +		rx_machine(internals, slave_id, &lacp->lacpdu);
> +		rte_pktmbuf_free(lacp_pkt);
> +	} else
> +		rx_machine(internals, slave_id, NULL);
> +}
> +
> +static void
>   bond_mode_8023ad_periodic_cb(void *arg)
>   {
>   	struct rte_eth_dev *bond_dev = arg;
> @@ -809,20 +836,21 @@
>   
>   		SM_FLAG_SET(port, LACP_ENABLED);
>   
> -		/* Find LACP packet to this port. Do not check subtype, it is done in
> -		 * function that queued packet */
> -		if (rte_ring_dequeue(port->rx_ring, &pkt) == 0) {
> -			struct rte_mbuf *lacp_pkt = pkt;
> -			struct lacpdu_header *lacp;
> +		struct rte_mbuf *lacp_pkt = NULL;
>   
> -			lacp = rte_pktmbuf_mtod(lacp_pkt, struct lacpdu_header *);
> -			RTE_ASSERT(lacp->lacpdu.subtype == SLOW_SUBTYPE_LACP);
> +		if (internals->mode4.slow_rx_queue == 0) {
 >

As above, instead of checking the rx queue id, an explicit
enable/disable flag would be clearer.

> +			/* Find LACP packet to this port. Do not check subtype, it is done in
> +			 * function that queued packet */
> +			if (rte_ring_dequeue(port->rx_ring, &pkt) == 0)
> +				lacp_pkt = pkt;
>   
> -			/* This is LACP frame so pass it to rx_machine */
> -			rx_machine(internals, slave_id, &lacp->lacpdu);
> -			rte_pktmbuf_free(lacp_pkt);
> -		} else
> -			rx_machine(internals, slave_id, NULL);
> +			rx_machine_update(internals, slave_id, lacp_pkt);
> +		} else {
> +			if (rte_eth_rx_burst(slave_id, internals->mode4.slow_rx_queue, &lacp_pkt, 1) == 1)
> +				bond_mode_8023ad_handle_slow_pkt(internals, slave_id, lacp_pkt);
> +			else
> +				rx_machine_update(internals, slave_id, NULL);
> +		}


If possible it would be good if the hw-filtered path and the sw queue
path followed the same code path here. We are now calling
bond_mode_8023ad_handle_slow_pkt from both
bond_mode_8023ad_periodic_cb and bond_ethdev_tx_burst_8023ad; it would
be clearer if both followed the same processing path and
bond_mode_8023ad_handle_slow_pkt wasn't called within
bond_ethdev_tx_burst_8023ad.

>   
>   		periodic_machine(internals, slave_id);
>   		mux_machine(internals, slave_id);
> @@ -1188,18 +1216,36 @@
>   		m_hdr->marker.tlv_type_marker = MARKER_TLV_TYPE_RESP;
>   		rte_eth_macaddr_get(slave_id, &m_hdr->eth_hdr.s_addr);
>   
> -		if (unlikely(rte_ring_enqueue(port->tx_ring, pkt) == -ENOBUFS)) {
> -			/* reset timer */
> -			port->rx_marker_timer = 0;
> -			wrn = WRN_TX_QUEUE_FULL;
> -			goto free_out;
> +		if (internals->mode4.slow_tx_queue == 0) {
> +			if (unlikely(rte_ring_enqueue(port->tx_ring, pkt) ==
> +					-ENOBUFS)) {
> +				/* reset timer */
> +				port->rx_marker_timer = 0;
> +				wrn = WRN_TX_QUEUE_FULL;
> +				goto free_out;
> +			}
> +		} else {
> +			/* Send packet directly to the slow queue */
> +			if (unlikely(rte_eth_tx_burst(slave_id,
> +					internals->mode4.slow_tx_queue,
> +					&pkt, 1) == 0)) {
> +				/* reset timer */
> +				port->rx_marker_timer = 0;
> +				wrn = WRN_TX_QUEUE_FULL;
> +				goto free_out;
> +			}
>   		}
>   	} else if (likely(subtype == SLOW_SUBTYPE_LACP)) {
> -		if (unlikely(rte_ring_enqueue(port->rx_ring, pkt) == -ENOBUFS)) {
> -			/* If RX fing full free lacpdu message and drop packet */
> -			wrn = WRN_RX_QUEUE_FULL;
> -			goto free_out;
> -		}
> +
> +		if (internals->mode4.slow_rx_queue == 0) {
> +			if (unlikely(rte_ring_enqueue(port->rx_ring, pkt) == -ENOBUFS)) {
> +				/* If RX fing full free lacpdu message and drop packet */
> +				wrn = WRN_RX_QUEUE_FULL;
> +				goto free_out;
> +			}
> +		} else
> +			rx_machine_update(internals, slave_id, pkt);
> +
>   	} else {
>   		wrn = WRN_UNKNOWN_SLOW_TYPE;
>   		goto free_out;
> @@ -1504,3 +1550,42 @@
>   	rte_eal_alarm_set(internals->mode4.update_timeout_us,
>   			bond_mode_8023ad_ext_periodic_cb, arg);
>   }
> +
> +#define MBUF_CACHE_SIZE 250
> +#define NUM_MBUFS 8191
> +
> +int
> +rte_eth_bond_8023ad_slow_queue_enable(uint8_t port)
> +{
> +	int retval = 0;
> +	struct rte_eth_dev *dev = &rte_eth_devices[port];
> +	struct bond_dev_private *internals = (struct bond_dev_private *)
> +		dev->data->dev_private;
> +
> +	if (check_for_bonded_ethdev(dev) != 0)
> +		return -1;
> +
> +	internals->mode4.slow_rx_queue = dev->data->nb_rx_queues;
> +	internals->mode4.slow_tx_queue = dev->data->nb_tx_queues;
> +

We shouldn't be setting the slow queues here as they won't necessarily
be the right values; as mentioned above, just an enable flag would be
sufficient.

Also, we should really be testing whether all the slaves of the bond
can support applying the filtering rule required here, and then fail
enablement if they don't.
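
Something along these lines (just a hypothetical sketch, reusing the
bond_ethdev_8023ad_flow_verify() helper added by this patch) would do:

        uint8_t i;

        /* Fail enablement unless every current slave accepts the
         * ethertype 0x8809 flow rule. */
        for (i = 0; i < internals->slave_count; i++) {
                if (bond_ethdev_8023ad_flow_verify(dev,
                                internals->slaves[i].port_id) != 0)
                        return -1;
        }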

> +	bond_ethdev_mode_set(dev, internals->mode);
> +	return retval;
> +}
> +
> +int
> +rte_eth_bond_8023ad_slow_queue_disable(uint8_t port)
> +{
> +	int retval = 0;
> +	struct rte_eth_dev *dev = &rte_eth_devices[port];
> +	struct bond_dev_private *internals = (struct bond_dev_private *)
> +		dev->data->dev_private;
> +
> +	if (check_for_bonded_ethdev(dev) != 0)
> +		return -1;
> +
> +	internals->mode4.slow_rx_queue = 0;
> +	internals->mode4.slow_tx_queue = 0;
> +


As above, with regard to the enable flag.

> +	bond_ethdev_mode_set(dev, internals->mode);
> +	return retval;
> +}
> diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.h b/drivers/net/bonding/rte_eth_bond_8023ad.h
> index 6b8ff57..8d21c7a 100644
> --- a/drivers/net/bonding/rte_eth_bond_8023ad.h
> +++ b/drivers/net/bonding/rte_eth_bond_8023ad.h
> @@ -302,4 +302,10 @@ struct rte_eth_bond_8023ad_slave_info {
>   rte_eth_bond_8023ad_ext_slowtx(uint8_t port_id, uint8_t slave_id,
>   		struct rte_mbuf *lacp_pkt);
>   
> +int
> +rte_eth_bond_8023ad_slow_queue_enable(uint8_t port_id);
> 
> +int
> +rte_eth_bond_8023ad_slow_queue_disable(uint8_t port_id);
> +


We need to include the doxygen here, with some details on what is being
enabled here, i.e. that dedicated rx/tx queues are being created on the
slaves to filter the LACP control plane traffic from the data path
traffic, so filtering in the data path is not required.

Also, I think that these functions' purpose would be clearer if they
were called rte_eth_bond_8023ad_slow_pkt_hw_filter_enable/disable.

>   #endif /* RTE_ETH_BOND_8023AD_H_ */
> diff --git a/drivers/net/bonding/rte_eth_bond_8023ad_private.h b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
> index ca8858b..3963714 100644
....
> 

One thing missing is reporting to the application that a reduced number
of tx/rx queues is available when hw filtering is enabled. Looking at
bond_ethdev_info() it doesn't look like this is getting reported
correctly at the moment anyway, but it should be the smallest value of
the max number of queues of the slave devices minus one. So if we had
3 slaves, one supporting 8 rx queues and the other 2 supporting 16,
then we should report 7 (8-1) as the maximum number of rx queues for
the bonded device.
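
For example (a hypothetical helper, not something in this patch), the
value reported for the bonded device could be derived roughly as:

        static uint16_t
        bond_max_rx_queues(struct bond_dev_private *internals)
        {
                struct rte_eth_dev_info slave_info;
                uint16_t min_q = UINT16_MAX;
                uint8_t i;

                for (i = 0; i < internals->slave_count; i++) {
                        rte_eth_dev_info_get(internals->slaves[i].port_id,
                                        &slave_info);
                        if (slave_info.max_rx_queues < min_q)
                                min_q = slave_info.max_rx_queues;
                }

                /* One queue per slave is reserved for slow packets,
                 * e.g. slaves with 8, 16 and 16 rx queues -> report 7. */
                return min_q > 0 ? min_q - 1 : 0;
        }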


Finally, we are missing some updated documentation about this new 
feature. The information in the cover note should be added to the 
bonding documentation at a minimum.


* [dpdk-dev] [PATCH v2 0/2] LACP control packet filtering offload
  2017-05-27 11:27 [dpdk-dev] [PATCH 0/2] LACP control packet filtering offload Tomasz Kulasek
  2017-05-27 11:27 ` [dpdk-dev] [PATCH 1/2] " Tomasz Kulasek
  2017-05-27 11:27 ` [dpdk-dev] [PATCH 2/2] test-pmd: add set bonding slow_queue hw/sw Tomasz Kulasek
@ 2017-06-29 16:20 ` Tomasz Kulasek
  2017-06-29 16:20   ` [dpdk-dev] [PATCH v2 1/2] " Tomasz Kulasek
                     ` (2 more replies)
  2 siblings, 3 replies; 23+ messages in thread
From: Tomasz Kulasek @ 2017-06-29 16:20 UTC (permalink / raw)
  To: dev

1. Overview

  Packet processing in the current path for bonding in mode 4 requires
  parsing all packets in the fast path to classify and process LACP
  packets.

  The idea behind this performance improvement is to use hardware
  offloads to improve packet classification.

2. Scope of work

   a) Optimization of software LACP packet classification by using
      packet_type metadata to eliminate the requirement of parsing each
      packet in the received burst.

   b) Implementation of a classification mechanism using the flow
      director to redirect LACP packets to a dedicated queue (not
      visible to the application).

      - Filter pattern choosing (not all filters are supported by all
        devices),
      - Changing the processing path to speed up non-LACP packet
        processing,
      - Handle LACP packets from dedicated Rx queue and send to the
        dedicated Tx queue,

   c) Creation of a fallback mechanism allowing selection of the most
      preferable processing method:

      - Flow director,
      - Packet type metadata,
      - Software parsing,

3. Implementation

3.1. Packet type

   The packet_type approach would result in a performance improvement
   as packet data would no longer need to be read, but with this
   approach the bonded driver would still need to look at the mbuf of
   each packet, thereby having an impact on the achievable Rx
   performance.

   There is no packet_type value describing LACP packets directly.
   However, it can be used to limit the number of packets that must be
   parsed, e.g. if packet_type indicates >L2 packets.

   This should improve performance, since well-known non-LACP packets
   can be skipped without the need to look into their data.

3.2. Flow director

   Using the rte_flow API and a pattern matching the Ethernet type of
   slow packets (0x8809), we can configure the flow director to
   redirect slow packets to a separate queue.

   An independent Rx queue for LACP would remove the requirement to
   filter all ingress traffic in software, which should result in a
   performance increase. Other queues stay untouched and processing of
   packets on the fast path is reduced to simply collecting packets
   from the slaves.

   A separate Tx queue for the LACP daemon allows LACP responses to be
   sent immediately, without interfering with the Tx fast path.

   RECEIVE

         .---------------.
         | Slave 0       |
         |      .------. |
         |  Fd  | Rxq  | |
   Rx ======o==>|      |==============.
         |  |   +======+ |            |      .---------------.
         |  `-->| LACP |--------.     |      | Bonding       |
         |      `------' |      |     |      |      .------. |
         `---------------'      |     |      |      |      | |
                                |     >============>|      |=======> Rx
         .---------------.      |     |      |      +======+ |
         | Slave 1       |      |     |      |      | XXXX | |
         |      .------. |      |     |      |      `------' |
         |  Fd  | Rxq  | |      |     |      `---------------'
   Rx ======o==>|      |=============='        .-----------.
         |  |   +======+ |      |             /             \
         |  `-->| LACP |--------+----------->+  LACP DAEMON  |
         |      `------' |             Tx <---\             /
         `---------------'                     `-----------'

   All slow packets received by slaves in the bond are redirected to the
   separate queue using the flow director. Other packets are collected
   from the slaves and exposed to the application with an Rx burst on
   the bonded device.

   TRANSMIT

         .---------------.
         | Slave 0       |
         |      .------. |
         |      |      | |
   Tx <=====+===|      |<=============.
         |  |   |------| |            |      .---------------.
         |  `---| LACP |<-------.     |      | Bonding       |
         |      `------' |      |     |      |      .------. |
         `---------------'      |     |      |      |      | |
                                |     +<============|      |<====== Tx
         .---------------.      |     |      |      +======+ |
         | Slave 1       |      |     |      |      | XXXX | |
         |      .------. |      |     |      |      `------' |
         |      |      | |      |     |      `---------------'
   Tx <=====+===|      |<============='  Rx    .-----------.
         |  |   |------| |      |         `-->/             \
         |  `---| LACP |<-------+------------+  LACP DAEMON  |
         |      `------' |                    \             /
         `---------------'                     `-----------'

   On transmit, packets are propagated to the slaves. Since we have a
   separate Tx queue for LACP responses, they can be sent independently
   of the fast path.

   LACP DAEMON

   In this mode all slow packets are handled in the LACP DAEMON.

Tomasz Kulasek (2):
  LACP control packet filtering offload
  test-pmd: add set bonding slow_queue hw/sw

 app/test-pmd/cmdline.c                            |  75 ++++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst       |   8 +
 drivers/net/bonding/rte_eth_bond_8023ad.c         | 160 ++++++--
 drivers/net/bonding/rte_eth_bond_8023ad.h         |  35 ++
 drivers/net/bonding/rte_eth_bond_8023ad_private.h |  24 ++
 drivers/net/bonding/rte_eth_bond_pmd.c            | 424 +++++++++++++++++++++-
 drivers/net/bonding/rte_eth_bond_version.map      |   9 +
 7 files changed, 693 insertions(+), 42 deletions(-)

-- 
1.9.1


* [dpdk-dev] [PATCH v2 1/2] LACP control packet filtering offload
  2017-06-29 16:20 ` [dpdk-dev] [PATCH v2 0/2] LACP control packet filtering offload Tomasz Kulasek
@ 2017-06-29 16:20   ` Tomasz Kulasek
  2017-06-29 16:20   ` [dpdk-dev] [PATCH v2 2/2] test-pmd: add set bonding slow_queue hw/sw Tomasz Kulasek
  2017-07-04 16:46   ` [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration Declan Doherty
  2 siblings, 0 replies; 23+ messages in thread
From: Tomasz Kulasek @ 2017-06-29 16:20 UTC (permalink / raw)
  To: dev

New API functions implemented:

   rte_eth_bond_8023ad_slow_pkt_hw_filter_enable(uint8_t port_id);
   rte_eth_bond_8023ad_slow_pkt_hw_filter_disable(uint8_t port_id);

rte_eth_bond_8023ad_slow_pkt_hw_filter_enable should be called before the
bonding port is started to enable the new path.

When this option is enabled, all slaves must support flow director
filtering by Ethernet type and must provide one additional Rx and Tx
queue each.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
---
v2 changes:
 - changed name of rte_eth_bond_8023ad_slow_queue_enable/disable to
   rte_eth_bond_8023ad_slow_pkt_hw_filter_enable/disable,
 - propagated number of tx/rx queues available for bonding based on the
   attached slaves and slow packet filtering requirements,
 - improved validation of slaves,
 - introduced one structure to organize all slow queue settings,
 - use of RTE_BE16() instead of locally defined macro,
 - some comments improvements
---
 drivers/net/bonding/rte_eth_bond_8023ad.c         | 160 ++++++--
 drivers/net/bonding/rte_eth_bond_8023ad.h         |  35 ++
 drivers/net/bonding/rte_eth_bond_8023ad_private.h |  24 ++
 drivers/net/bonding/rte_eth_bond_pmd.c            | 424 +++++++++++++++++++++-
 drivers/net/bonding/rte_eth_bond_version.map      |   9 +
 5 files changed, 610 insertions(+), 42 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index d2b7592..6b37a61 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -632,15 +632,25 @@
 	lacpdu->tlv_type_terminator = TLV_TYPE_TERMINATOR_INFORMATION;
 	lacpdu->terminator_length = 0;
 
-	if (rte_ring_enqueue(port->tx_ring, lacp_pkt) == -ENOBUFS) {
-		/* If TX ring full, drop packet and free message. Retransmission
-		 * will happen in next function call. */
-		rte_pktmbuf_free(lacp_pkt);
-		set_warning_flags(port, WRN_TX_QUEUE_FULL);
-		return;
+	if (!internals->mode4.slow_pkts.hw_filtering_en) {
+		if (rte_ring_enqueue(port->tx_ring, lacp_pkt) == -ENOBUFS) {
+			/* If TX ring full, drop packet and free message.
+			   Retransmission will happen in next function call. */
+			rte_pktmbuf_free(lacp_pkt);
+			set_warning_flags(port, WRN_TX_QUEUE_FULL);
+			return;
+		}
+	} else {
+		if (rte_eth_tx_burst(slave_id,
+				internals->mode4.slow_pkts.tx_queue_id,
+				&lacp_pkt, 1) == 0) {
+			rte_pktmbuf_free(lacp_pkt);
+			set_warning_flags(port, WRN_TX_QUEUE_FULL);
+			return;
+		}
 	}
 
-	MODE4_DEBUG("sending LACP frame\n");
+	MODE4_DEBUG("Sending LACP frame\n");
 	BOND_PRINT_LACP(lacpdu);
 
 	timer_set(&port->tx_machine_timer, internals->mode4.tx_period_timeout);
@@ -741,6 +751,22 @@
 }
 
 static void
+rx_machine_update(struct bond_dev_private *internals, uint8_t slave_id,
+		struct rte_mbuf *lacp_pkt) {
+	struct lacpdu_header *lacp;
+
+	if (lacp_pkt != NULL) {
+		lacp = rte_pktmbuf_mtod(lacp_pkt, struct lacpdu_header *);
+		RTE_ASSERT(lacp->lacpdu.subtype == SLOW_SUBTYPE_LACP);
+
+		/* This is LACP frame so pass it to rx_machine */
+		rx_machine(internals, slave_id, &lacp->lacpdu);
+		rte_pktmbuf_free(lacp_pkt);
+	} else
+		rx_machine(internals, slave_id, NULL);
+}
+
+static void
 bond_mode_8023ad_periodic_cb(void *arg)
 {
 	struct rte_eth_dev *bond_dev = arg;
@@ -809,20 +835,24 @@
 
 		SM_FLAG_SET(port, LACP_ENABLED);
 
-		/* Find LACP packet to this port. Do not check subtype, it is done in
-		 * function that queued packet */
-		if (rte_ring_dequeue(port->rx_ring, &pkt) == 0) {
-			struct rte_mbuf *lacp_pkt = pkt;
-			struct lacpdu_header *lacp;
+		struct rte_mbuf *lacp_pkt = NULL;
 
-			lacp = rte_pktmbuf_mtod(lacp_pkt, struct lacpdu_header *);
-			RTE_ASSERT(lacp->lacpdu.subtype == SLOW_SUBTYPE_LACP);
+		if (!internals->mode4.slow_pkts.hw_filtering_en) {
+			/* Find LACP packet to this port. Do not check subtype,
+			 * it is done in function that queued packet
+			 */
+			if (rte_ring_dequeue(port->rx_ring, &pkt) == 0)
+				lacp_pkt = pkt;
 
-			/* This is LACP frame so pass it to rx_machine */
-			rx_machine(internals, slave_id, &lacp->lacpdu);
-			rte_pktmbuf_free(lacp_pkt);
-		} else
-			rx_machine(internals, slave_id, NULL);
+			rx_machine_update(internals, slave_id, lacp_pkt);
+		} else {
+			if (rte_eth_rx_burst(slave_id,
+					internals->mode4.slow_rx_queue,
+					&lacp_pkt, 1) == 1)
+				bond_mode_8023ad_handle_slow_pkt(internals, slave_id, lacp_pkt);
+			else
+				rx_machine_update(internals, slave_id, NULL);
+		}
 
 		periodic_machine(internals, slave_id);
 		mux_machine(internals, slave_id);
@@ -1064,6 +1094,10 @@
 	mode4->tx_period_timeout = conf->tx_period_ms * ms_ticks;
 	mode4->rx_marker_timeout = conf->rx_marker_period_ms * ms_ticks;
 	mode4->update_timeout_us = conf->update_timeout_ms * 1000;
+
+	mode4->slow_pkts.hw_filtering_en = 0;
+	mode4->slow_pkts.rx_queue_id = UINT16_MAX;
+	mode4->slow_pkts.tx_queue_id = UINT16_MAX;
 }
 
 static void
@@ -1188,18 +1222,34 @@
 		m_hdr->marker.tlv_type_marker = MARKER_TLV_TYPE_RESP;
 		rte_eth_macaddr_get(slave_id, &m_hdr->eth_hdr.s_addr);
 
-		if (unlikely(rte_ring_enqueue(port->tx_ring, pkt) == -ENOBUFS)) {
-			/* reset timer */
-			port->rx_marker_timer = 0;
-			wrn = WRN_TX_QUEUE_FULL;
-			goto free_out;
+		if (internals->mode4.slow_pkts.hw_filtering_en == 0) {
+			if (unlikely(rte_ring_enqueue(port->tx_ring, pkt) ==
+					-ENOBUFS)) {
+				/* reset timer */
+				port->rx_marker_timer = 0;
+				wrn = WRN_TX_QUEUE_FULL;
+				goto free_out;
+			}
+		} else {
+			/* Send packet directly to the slow queue */
+			if (unlikely(rte_eth_tx_burst(slave_id,
+					internals->mode4.slow_pkts.tx_queue_id,
+					&pkt, 1) == 0)) {
+				/* reset timer */
+				port->rx_marker_timer = 0;
+				wrn = WRN_TX_QUEUE_FULL;
+				goto free_out;
+			}
 		}
 	} else if (likely(subtype == SLOW_SUBTYPE_LACP)) {
-		if (unlikely(rte_ring_enqueue(port->rx_ring, pkt) == -ENOBUFS)) {
-			/* If RX fing full free lacpdu message and drop packet */
-			wrn = WRN_RX_QUEUE_FULL;
-			goto free_out;
-		}
+		if (!internals->mode4.slow_pkts.hw_filtering_en) {
+			if (unlikely(rte_ring_enqueue(port->rx_ring, pkt) == -ENOBUFS)) {
+				/* If RX fing full free lacpdu message and drop packet */
+				wrn = WRN_RX_QUEUE_FULL;
+				goto free_out;
+			}
+		} else
+			rx_machine_update(internals, slave_id, pkt);
 	} else {
 		wrn = WRN_UNKNOWN_SLOW_TYPE;
 		goto free_out;
@@ -1504,3 +1554,55 @@
 	rte_eal_alarm_set(internals->mode4.update_timeout_us,
 			bond_mode_8023ad_ext_periodic_cb, arg);
 }
+
+#define MBUF_CACHE_SIZE 250
+#define NUM_MBUFS 8191
+
+int
+rte_eth_bond_8023ad_slow_pkt_hw_filter_enable(uint8_t port)
+{
+	int retval = 0;
+	struct rte_eth_dev *dev = &rte_eth_devices[port];
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		dev->data->dev_private;
+
+	if (check_for_bonded_ethdev(dev) != 0)
+		return -1;
+
+	if (bond_8023ad_slow_pkt_hw_filter_supported(port) != 0)
+		return -1;
+
+	/* Device must be stopped to set up slow queue */
+	if (dev->data->dev_started)
+		return -1;
+
+	internals->mode4.slow_pkts.hw_filtering_en = 1;
+
+	bond_ethdev_mode_set(dev, internals->mode);
+	return retval;
+}
+
+int
+rte_eth_bond_8023ad_slow_pkt_hw_filter_disable(uint8_t port)
+{
+	int retval = 0;
+	struct rte_eth_dev *dev = &rte_eth_devices[port];
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		dev->data->dev_private;
+
+	if (check_for_bonded_ethdev(dev) != 0)
+		return -1;
+
+	/* Device must be stopped to set up slow queue */
+	if (dev->data->dev_started)
+		return -1;
+
+	internals->mode4.slow_pkts.hw_filtering_en = 0;
+
+	bond_ethdev_mode_set(dev, internals->mode);
+
+	internals->mode4.slow_pkts.rx_queue_id = UINT16_MAX;
+	internals->mode4.slow_pkts.tx_queue_id = UINT16_MAX;
+
+	return retval;
+}
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.h b/drivers/net/bonding/rte_eth_bond_8023ad.h
index 6b8ff57..d527970 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.h
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.h
@@ -302,4 +302,39 @@ struct rte_eth_bond_8023ad_slave_info {
 rte_eth_bond_8023ad_ext_slowtx(uint8_t port_id, uint8_t slave_id,
 		struct rte_mbuf *lacp_pkt);
 
+/**
+ * Enable slow queue on slaves
+ *
+ * This function creates an additional rx and tx queue on each slave and uses
+ * flow director to redirect all slow packets to the new rx queue, so that
+ * they can be processed in the LACP daemon. To use this feature every slave
+ * must support at least one more rx and tx queue than the bonded device uses.
+ *
+ * Bonding port must be stopped to change this configuration.
+ *
+ * @param port_id      Bonding device id
+ *
+ * @return
+ *   0 on success, negative value otherwise.
+ */
+int
+rte_eth_bond_8023ad_slow_pkt_hw_filter_enable(uint8_t port_id);
+
+/**
+ * Disable slow queue on slaves
+ *
+ * This function disables the hardware slow packet filter.
+ *
+ * Bonding port must be stopped to change this configuration.
+ *
+ * @see rte_eth_bond_8023ad_slow_pkt_hw_filter_enable
+ *
+ * @param port_id      Bonding device id
+ * @return
+ *   0 on success, negative value otherwise.
+ *
+ */
+int
+rte_eth_bond_8023ad_slow_pkt_hw_filter_disable(uint8_t port_id);
+
 #endif /* RTE_ETH_BOND_8023AD_H_ */
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad_private.h b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
index ca8858b..40a4320 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad_private.h
+++ b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
@@ -39,6 +39,7 @@
 #include <rte_ether.h>
 #include <rte_byteorder.h>
 #include <rte_atomic.h>
+#include <rte_flow.h>
 
 #include "rte_eth_bond_8023ad.h"
 
@@ -162,6 +163,9 @@ struct port {
 
 	uint64_t warning_timer;
 	volatile uint16_t warnings_to_show;
+
+	/** Memory pool used to allocate mbufs for the slow rx queue */
+	struct rte_mempool *slow_pool;
 };
 
 struct mode8023ad_private {
@@ -175,6 +179,16 @@ struct mode8023ad_private {
 	uint64_t update_timeout_us;
 	rte_eth_bond_8023ad_ext_slowrx_fn slowrx_cb;
 	uint8_t external_sm;
+
+
+	struct {
+		uint8_t hw_filtering_en;
+
+		struct rte_flow *flow[RTE_MAX_ETHPORTS];
+
+		uint16_t rx_queue_id;
+		uint16_t tx_queue_id;
+	} slow_pkts;
 };
 
 /**
@@ -295,4 +309,14 @@ struct mode8023ad_private {
 void
 bond_mode_8023ad_mac_address_update(struct rte_eth_dev *bond_dev);
 
+int
+bond_ethdev_8023ad_flow_verify(struct rte_eth_dev *bond_dev,
+		uint8_t slave_port);
+
+int
+bond_ethdev_8023ad_flow_set(struct rte_eth_dev *bond_dev, uint8_t slave_port);
+
+int
+bond_8023ad_slow_pkt_hw_filter_supported(uint8_t port_id);
+
 #endif /* RTE_ETH_BOND_8023AD_H_ */
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 37f3d43..46b7d80 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -133,6 +133,250 @@
 		(subtype == SLOW_SUBTYPE_MARKER || subtype == SLOW_SUBTYPE_LACP));
 }
 
+/*****************************************************************************
+ * Flow director's setup for mode 4 optimization
+ */
+
+static struct rte_flow_item_eth flow_item_eth_type_8023ad = {
+	.dst.addr_bytes = { 0 },
+	.src.addr_bytes = { 0 },
+	.type = RTE_BE16(ETHER_TYPE_SLOW),
+};
+
+static struct rte_flow_item_eth flow_item_eth_mask_type_8023ad = {
+	.dst.addr_bytes = { 0 },
+	.src.addr_bytes = { 0 },
+	.type = 0xFFFF,
+};
+
+static struct rte_flow_item flow_item_8023ad[] = {
+	{
+		.type = RTE_FLOW_ITEM_TYPE_ETH,
+		.spec = &flow_item_eth_type_8023ad,
+		.last = NULL,
+		.mask = &flow_item_eth_mask_type_8023ad,
+	},
+	{
+		.type = RTE_FLOW_ITEM_TYPE_END,
+		.spec = NULL,
+		.last = NULL,
+		.mask = NULL,
+	}
+};
+
+const struct rte_flow_attr flow_attr_8023ad = {
+	.group = 0,
+	.priority = 0,
+	.ingress = 1,
+	.egress = 0,
+	.reserved = 0,
+};
+
+int
+bond_ethdev_8023ad_flow_verify(struct rte_eth_dev *bond_dev,
+		uint8_t slave_port) {
+	struct rte_flow_error error;
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+			(bond_dev->data->dev_private);
+
+	struct rte_flow_action_queue lacp_queue_conf = {
+		.index = internals->mode4.slow_pkts.rx_queue_id,
+	};
+
+	const struct rte_flow_action actions[] = {
+		{
+			.type = RTE_FLOW_ACTION_TYPE_QUEUE,
+			.conf = &lacp_queue_conf
+		},
+		{
+			.type = RTE_FLOW_ACTION_TYPE_END,
+		}
+	};
+
+	int ret = rte_flow_validate(slave_port, &flow_attr_8023ad,
+			flow_item_8023ad, actions, &error);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+int
+bond_8023ad_slow_pkt_hw_filter_supported(uint8_t port_id) {
+	struct rte_eth_dev *bond_dev = &rte_eth_devices[port_id];
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+			(bond_dev->data->dev_private);
+	struct rte_eth_dev_info bond_info, slave_info;
+	uint8_t idx;
+
+	/* Verify that all slaves in the bonding support flow director */
+	if (internals->slave_count > 0) {
+		rte_eth_dev_info_get(bond_dev->data->port_id, &bond_info);
+		internals->mode4.slow_pkts.rx_queue_id = bond_info.nb_rx_queues;
+		internals->mode4.slow_pkts.tx_queue_id = bond_info.nb_tx_queues;
+		for (idx = 0; idx < internals->slave_count; idx++) {
+			rte_eth_dev_info_get(internals->slaves[idx].port_id,
+					&slave_info);
+			if ((slave_info.max_rx_queues < bond_info.nb_rx_queues)
+					|| (slave_info.max_tx_queues <
+						bond_info.nb_tx_queues))
+				return -1;
+
+			if (bond_ethdev_8023ad_flow_verify(bond_dev,
+					internals->slaves[idx].port_id) != 0)
+				return -1;
+		}
+	}
+
+	return 0;
+}
+
+int
+bond_ethdev_8023ad_flow_set(struct rte_eth_dev *bond_dev, uint8_t slave_port) {
+
+	struct rte_flow_error error;
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+			(bond_dev->data->dev_private);
+
+	struct rte_flow_action_queue lacp_queue_conf = {
+		.index = internals->mode4.slow_pkts.rx_queue_id,
+	};
+
+	const struct rte_flow_action actions[] = {
+		{
+			.type = RTE_FLOW_ACTION_TYPE_QUEUE,
+			.conf = &lacp_queue_conf
+		},
+		{
+			.type = RTE_FLOW_ACTION_TYPE_END,
+		}
+	};
+
+	internals->mode4.slow_pkts.flow[slave_port] = rte_flow_create(slave_port,
+			&flow_attr_8023ad, flow_item_8023ad, actions, &error);
+	if (internals->mode4.slow_pkts.flow[slave_port] == NULL) {
+		RTE_BOND_LOG(ERR, "bond_ethdev_8023ad_flow_set: %s "
+				"(slave_port=%d queue_id=%d)",
+				error.message, slave_port,
+				internals->mode4.slow_pkts.rx_queue_id);
+		return -1;
+	}
+
+	return 0;
+}
+
+static uint16_t
+bond_ethdev_rx_burst_8023ad_fast_queue(void *queue, struct rte_mbuf **bufs,
+		uint16_t nb_pkts)
+{
+	struct bond_rx_queue *bd_rx_q = (struct bond_rx_queue *)queue;
+	struct bond_dev_private *internals = bd_rx_q->dev_private;
+	uint16_t num_rx_total = 0;	/* Total number of received packets */
+	uint8_t slaves[RTE_MAX_ETHPORTS];
+	uint8_t slave_count;
+
+	uint8_t i;
+
+	/* Copy slave list to protect against slave up/down changes during rx
+	 * bursting */
+	slave_count = internals->active_slave_count;
+	memcpy(slaves, internals->active_slaves,
+			sizeof(internals->active_slaves[0]) * slave_count);
+
+	for (i = 0; i < slave_count && num_rx_total < nb_pkts; i++) {
+		/* Read packets from this slave */
+		num_rx_total += rte_eth_rx_burst(slaves[i], bd_rx_q->queue_id,
+				&bufs[num_rx_total], nb_pkts - num_rx_total);
+	}
+
+	return num_rx_total;
+}
+
+static uint16_t
+bond_ethdev_tx_burst_8023ad_fast_queue(void *queue, struct rte_mbuf **bufs,
+		uint16_t nb_pkts)
+{
+	struct bond_dev_private *internals;
+	struct bond_tx_queue *bd_tx_q;
+
+	uint8_t num_of_slaves;
+	uint8_t slaves[RTE_MAX_ETHPORTS];
+	 /* positions in slaves, not ID */
+	uint8_t distributing_offsets[RTE_MAX_ETHPORTS];
+	uint8_t distributing_count;
+
+	uint16_t num_tx_slave, num_tx_total = 0, num_tx_fail_total = 0;
+	uint16_t i, op_slave_idx;
+
+	struct rte_mbuf *slave_bufs[RTE_MAX_ETHPORTS][nb_pkts];
+
+	/* Total amount of packets in slave_bufs */
+	uint16_t slave_nb_pkts[RTE_MAX_ETHPORTS] = { 0 };
+	/* Slow packets placed in each slave */
+
+	if (unlikely(nb_pkts == 0))
+		return 0;
+
+	bd_tx_q = (struct bond_tx_queue *)queue;
+	internals = bd_tx_q->dev_private;
+
+	/* Copy slave list to protect against slave up/down changes during tx
+	 * bursting */
+	num_of_slaves = internals->active_slave_count;
+	if (num_of_slaves < 1)
+		return num_tx_total;
+
+	memcpy(slaves, internals->active_slaves, sizeof(slaves[0]) *
+			num_of_slaves);
+
+	distributing_count = 0;
+	for (i = 0; i < num_of_slaves; i++) {
+		struct port *port = &mode_8023ad_ports[slaves[i]];
+		if (ACTOR_STATE(port, DISTRIBUTING))
+			distributing_offsets[distributing_count++] = i;
+	}
+
+	if (likely(distributing_count > 0)) {
+		/* Populate slaves mbuf with the packets which are to be sent */
+		for (i = 0; i < nb_pkts; i++) {
+			/* Select output slave using hash based on xmit policy */
+			op_slave_idx = internals->xmit_hash(bufs[i],
+					distributing_count);
+
+			/* Populate slave mbuf arrays with mbufs for that slave.
+			 * Use only slaves that are currently distributing.
+			 */
+			uint8_t slave_offset =
+					distributing_offsets[op_slave_idx];
+			slave_bufs[slave_offset][slave_nb_pkts[slave_offset]] =
+					bufs[i];
+			slave_nb_pkts[slave_offset]++;
+		}
+	}
+
+	/* Send packet burst on each slave device */
+	for (i = 0; i < num_of_slaves; i++) {
+		if (slave_nb_pkts[i] == 0)
+			continue;
+
+		num_tx_slave = rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
+				slave_bufs[i], slave_nb_pkts[i]);
+
+		num_tx_total += num_tx_slave;
+		num_tx_fail_total += slave_nb_pkts[i] - num_tx_slave;
+
+		/* If tx burst fails move packets to end of bufs */
+		if (unlikely(num_tx_slave < slave_nb_pkts[i])) {
+			uint16_t j = nb_pkts - num_tx_fail_total;
+			for ( ; num_tx_slave < slave_nb_pkts[i]; j++,
+					num_tx_slave++)
+				bufs[j] = slave_bufs[i][num_tx_slave];
+		}
+	}
+
+	return num_tx_total;
+}
+
 static uint16_t
 bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 		uint16_t nb_pkts)
@@ -180,6 +424,13 @@
 
 		/* Handle slow protocol packets. */
 		while (j < num_rx_total) {
+
+			/* Not pure L2, so it cannot be LACP: skip parsing */
+			if ((bufs[j]->packet_type & ~RTE_PTYPE_L2_ETHER) != 0) {
+				j++;
+				continue;
+			}
+
 			if (j + 3 < num_rx_total)
 				rte_prefetch0(rte_pktmbuf_mtod(bufs[j + 3], void *));
 
@@ -187,7 +438,7 @@
 			subtype = ((struct slow_protocol_frame *)hdr)->slow_protocol.subtype;
 
 			/* Remove packet from array if it is slow packet or slave is not
-			 * in collecting state or bondign interface is not in promiscus
+			 * in collecting state or bonding interface is not in promiscuous
 			 * mode and packet address does not match. */
 			if (unlikely(is_lacp_packets(hdr->ether_type, subtype, bufs[j]->vlan_tci) ||
 				!collecting || (!promisc &&
@@ -204,7 +455,8 @@
 				num_rx_total--;
 				if (j < num_rx_total) {
 					memmove(&bufs[j], &bufs[j + 1], sizeof(bufs[0]) *
-						(num_rx_total - j));
+							(num_rx_total - j));
+
 				}
 			} else
 				j++;
@@ -1295,11 +1547,19 @@ struct bwg_slave {
 		if (bond_mode_8023ad_enable(eth_dev) != 0)
 			return -1;
 
-		eth_dev->rx_pkt_burst = bond_ethdev_rx_burst_8023ad;
-		eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_8023ad;
-		RTE_LOG(WARNING, PMD,
-				"Using mode 4, it is necessary to do TX burst and RX burst "
-				"at least every 100ms.\n");
+		if (!internals->mode4.slow_pkts.hw_filtering_en) {
+			eth_dev->rx_pkt_burst = bond_ethdev_rx_burst_8023ad;
+			eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_8023ad;
+			RTE_LOG(WARNING, PMD,
+				"Using mode 4, it is necessary to do TX burst "
+				"and RX burst at least every 100ms.\n");
+		} else {
+			/* Use flow director's optimization */
+			eth_dev->rx_pkt_burst =
+					bond_ethdev_rx_burst_8023ad_fast_queue;
+			eth_dev->tx_pkt_burst =
+					bond_ethdev_tx_burst_8023ad_fast_queue;
+		}
 		break;
 	case BONDING_MODE_TLB:
 		eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_tlb;
@@ -1321,15 +1581,80 @@ struct bwg_slave {
 	return 0;
 }
 
+static int
+slave_configure_slow_queue(struct rte_eth_dev *bonded_eth_dev,
+		struct rte_eth_dev *slave_eth_dev)
+{
+	int errval = 0;
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		bonded_eth_dev->data->dev_private;
+	struct port *port = &mode_8023ad_ports[slave_eth_dev->data->port_id];
+
+	if (port->slow_pool == NULL) {
+		char mem_name[256];
+		int slave_id = slave_eth_dev->data->port_id;
+
+		snprintf(mem_name, RTE_DIM(mem_name), "slave_port%u_slow_pool",
+				slave_id);
+		port->slow_pool = rte_pktmbuf_pool_create(mem_name, 8191,
+			250, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
+			slave_eth_dev->data->numa_node);
+
+		/* Any memory allocation failure in initialization is critical because
+		 * resources can't be freed, so reinitialization is impossible. */
+		if (port->slow_pool == NULL) {
+			rte_panic("Slave %u: Failed to create memory pool '%s': %s\n",
+				slave_id, mem_name, rte_strerror(rte_errno));
+		}
+	}
+
+	if (internals->mode4.slow_pkts.hw_filtering_en) {
+		/* Configure slow Rx queue */
+
+		errval = rte_eth_rx_queue_setup(slave_eth_dev->data->port_id,
+				internals->mode4.slow_pkts.rx_queue_id, 128,
+				rte_eth_dev_socket_id(slave_eth_dev->data->port_id),
+				NULL, port->slow_pool);
+		if (errval != 0) {
+			RTE_BOND_LOG(ERR,
+					"rte_eth_rx_queue_setup: port=%d queue_id %d, err (%d)",
+					slave_eth_dev->data->port_id,
+					internals->mode4.slow_pkts.rx_queue_id,
+					errval);
+			return errval;
+		}
+
+		errval = rte_eth_tx_queue_setup(slave_eth_dev->data->port_id,
+				internals->mode4.slow_pkts.tx_queue_id, 512,
+				rte_eth_dev_socket_id(slave_eth_dev->data->port_id),
+				NULL);
+		if (errval != 0) {
+			RTE_BOND_LOG(ERR,
+				"rte_eth_tx_queue_setup: port=%d queue_id %d, err (%d)",
+				slave_eth_dev->data->port_id,
+				internals->mode4.slow_pkts.tx_queue_id,
+				errval);
+			return errval;
+		}
+	}
+	return 0;
+}
+
 int
 slave_configure(struct rte_eth_dev *bonded_eth_dev,
 		struct rte_eth_dev *slave_eth_dev)
 {
 	struct bond_rx_queue *bd_rx_q;
 	struct bond_tx_queue *bd_tx_q;
+	uint16_t nb_rx_queues;
+	uint16_t nb_tx_queues;
 
 	int errval;
 	uint16_t q_id;
+	struct rte_flow_error flow_error;
+
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		bonded_eth_dev->data->dev_private;
 
 	/* Stop slave */
 	rte_eth_dev_stop(slave_eth_dev->data->port_id);
@@ -1359,10 +1684,19 @@ struct bwg_slave {
 	slave_eth_dev->data->dev_conf.rxmode.hw_vlan_filter =
 			bonded_eth_dev->data->dev_conf.rxmode.hw_vlan_filter;
 
+	nb_rx_queues = bonded_eth_dev->data->nb_rx_queues;
+	nb_tx_queues = bonded_eth_dev->data->nb_tx_queues;
+
+	if (internals->mode == BONDING_MODE_8023AD) {
+		if (internals->mode4.slow_pkts.hw_filtering_en) {
+			nb_rx_queues++;
+			nb_tx_queues++;
+		}
+	}
+
 	/* Configure device */
 	errval = rte_eth_dev_configure(slave_eth_dev->data->port_id,
-			bonded_eth_dev->data->nb_rx_queues,
-			bonded_eth_dev->data->nb_tx_queues,
+			nb_rx_queues, nb_tx_queues,
 			&(slave_eth_dev->data->dev_conf));
 	if (errval != 0) {
 		RTE_BOND_LOG(ERR, "Cannot configure slave device: port %u , err (%d)",
@@ -1396,10 +1730,33 @@ struct bwg_slave {
 				&bd_tx_q->tx_conf);
 		if (errval != 0) {
 			RTE_BOND_LOG(ERR,
-					"rte_eth_tx_queue_setup: port=%d queue_id %d, err (%d)",
-					slave_eth_dev->data->port_id, q_id, errval);
+				"rte_eth_tx_queue_setup: port=%d queue_id %d, err (%d)",
+				slave_eth_dev->data->port_id, q_id, errval);
+			return errval;
+		}
+	}
+
+	if (internals->mode == BONDING_MODE_8023AD &&
+			internals->mode4.slow_pkts.hw_filtering_en) {
+		if (slave_configure_slow_queue(bonded_eth_dev, slave_eth_dev)
+				!= 0)
 			return errval;
+
+		if (bond_ethdev_8023ad_flow_verify(bonded_eth_dev,
+				slave_eth_dev->data->port_id) != 0) {
+			RTE_BOND_LOG(ERR,
+				"bond_ethdev_8023ad_flow_verify: port=%d queue_id %d, err (%d)",
+				slave_eth_dev->data->port_id, q_id, errval);
+			return -1;
 		}
+
+		if (internals->mode4.slow_pkts.flow[slave_eth_dev->data->port_id] != NULL)
+			rte_flow_destroy(slave_eth_dev->data->port_id,
+					internals->mode4.slow_pkts.flow[slave_eth_dev->data->port_id],
+					&flow_error);
+
+		bond_ethdev_8023ad_flow_set(bonded_eth_dev,
+				slave_eth_dev->data->port_id);
 	}
 
 	/* Start device */
@@ -1559,6 +1916,15 @@ struct bwg_slave {
 	if (internals->promiscuous_en)
 		bond_ethdev_promiscuous_enable(eth_dev);
 
+	if (internals->mode == BONDING_MODE_8023AD) {
+		if (internals->mode4.slow_pkts.hw_filtering_en) {
+			internals->mode4.slow_pkts.rx_queue_id =
+					eth_dev->data->nb_rx_queues;
+			internals->mode4.slow_pkts.tx_queue_id =
+					eth_dev->data->nb_tx_queues;
+		}
+	}
+
 	/* Reconfigure each slave device if starting bonded device */
 	for (i = 0; i < internals->slave_count; i++) {
 		if (slave_configure(eth_dev,
@@ -1688,8 +2054,10 @@ struct bwg_slave {
 
 static void
 bond_ethdev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+
 {
 	struct bond_dev_private *internals = dev->data->dev_private;
+	uint16_t max_nb_rx_queues = 0, max_nb_tx_queues = 0;
 
 	dev_info->max_mac_addrs = 1;
 
@@ -1697,8 +2065,38 @@ struct bwg_slave {
 				  ? internals->candidate_max_rx_pktlen
 				  : ETHER_MAX_JUMBO_FRAME_LEN;
 
-	dev_info->max_rx_queues = (uint16_t)128;
-	dev_info->max_tx_queues = (uint16_t)512;
+	if (internals->slave_count > 0) {
+		/* Max number of tx/rx queues that the bonded device can
+		 * support is the minimum of the values reported by the bonded slaves */
+		struct rte_eth_dev_info slave_info;
+		uint8_t idx;
+
+		max_nb_rx_queues = UINT16_MAX;
+		max_nb_tx_queues = UINT16_MAX;
+		for (idx = 0; idx < internals->slave_count; idx++) {
+			rte_eth_dev_info_get(internals->slaves[idx].port_id,
+					&slave_info);
+
+			if (max_nb_rx_queues == 0 ||
+				slave_info.max_rx_queues < max_nb_rx_queues)
+				max_nb_rx_queues = slave_info.max_rx_queues;
+
+			if (max_nb_tx_queues == 0 ||
+				slave_info.max_tx_queues < max_nb_tx_queues)
+				max_nb_tx_queues = slave_info.max_tx_queues;
+		}
+		dev_info->max_rx_queues = max_nb_rx_queues;
+		dev_info->max_tx_queues = max_nb_tx_queues;
+	} else {
+		dev_info->max_rx_queues = (uint16_t)128;
+		dev_info->max_tx_queues = (uint16_t)512;
+	}
+
+	if (internals->mode == BONDING_MODE_8023AD &&
+		internals->mode4.slow_pkts.hw_filtering_en) {
+		dev_info->max_rx_queues--;
+		dev_info->max_tx_queues--;
+	}
 
 	dev_info->min_rx_bufsize = 0;
 
diff --git a/drivers/net/bonding/rte_eth_bond_version.map b/drivers/net/bonding/rte_eth_bond_version.map
index 2de0a7d..0ad2ba4 100644
--- a/drivers/net/bonding/rte_eth_bond_version.map
+++ b/drivers/net/bonding/rte_eth_bond_version.map
@@ -43,3 +43,12 @@ DPDK_16.07 {
 	rte_eth_bond_8023ad_setup;
 
 } DPDK_16.04;
+
+DPDK_17.08 {
+	global:
+
+	rte_eth_bond_8023ad_slow_pkt_hw_filter_enable;
+	rte_eth_bond_8023ad_slow_pkt_hw_filter_disable;
+
+	local: *;
+} DPDK_16.07;
-- 
1.9.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [dpdk-dev] [PATCH v2 2/2] test-pmd: add set bonding slow_queue hw/sw
  2017-06-29 16:20 ` [dpdk-dev] [PATCH v2 0/2] LACP control packet filtering offload Tomasz Kulasek
  2017-06-29 16:20   ` [dpdk-dev] [PATCH v2 1/2] " Tomasz Kulasek
@ 2017-06-29 16:20   ` Tomasz Kulasek
  2017-07-04 16:46   ` [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration Declan Doherty
  2 siblings, 0 replies; 23+ messages in thread
From: Tomasz Kulasek @ 2017-06-29 16:20 UTC (permalink / raw)
  To: dev

This patch adds a new command:

    set bonding slow_queue <port_id> [sw|hw]

"set bonding slow_queue <bonding_port_id> hw" sets hardware management
of slow packets and chooses simplified paths for tx/rx bursts.

"set bonding slow_queue <bonding_port_id> sw" turns back to the software
handling of slow packets. This option is default.

Example:

    testpmd> create bonded device 4 0
    testpmd> add bonding slave 0 <bond_id>
    testpmd> add bonding slave 1 <bond_id>
    testpmd> set bonding slow_queue <bond_id> [sw|hw]
    testpmd> port start <bond_id>

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
---
v2 changes:
 - changed name of rte_eth_bond_8023ad_slow_queue_enable/disable to
   rte_eth_bond_8023ad_slow_pkt_hw_filter_enable/disable,
 - added "set bonding slow_queue <port_id> [sw|hw]" description in
   documentation
---
 app/test-pmd/cmdline.c                      | 75 +++++++++++++++++++++++++++++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  8 +++
 2 files changed, 83 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 632d6f0..194d986 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -87,6 +87,7 @@
 #include <cmdline.h>
 #ifdef RTE_LIBRTE_PMD_BOND
 #include <rte_eth_bond.h>
+#include <rte_eth_bond_8023ad.h>
 #endif
 #ifdef RTE_LIBRTE_IXGBE_PMD
 #include <rte_pmd_ixgbe.h>
@@ -4300,6 +4301,79 @@ static void cmd_set_bonding_mode_parsed(void *parsed_result,
 		}
 };
 
+/* *** SET BONDING SLOW_QUEUE SW/HW *** */
+struct cmd_set_bonding_slow_queue_result {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t bonding;
+	cmdline_fixed_string_t slow_queue;
+	uint8_t port_id;
+	cmdline_fixed_string_t mode;
+};
+
+static void cmd_set_bonding_slow_queue_parsed(void *parsed_result,
+		__attribute__((unused))  struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_set_bonding_slow_queue_result *res = parsed_result;
+	portid_t port_id = res->port_id;
+	struct rte_port *port;
+
+	port = &ports[port_id];
+
+	/* Check if the port is not started */
+	if (port->port_status != RTE_PORT_STOPPED) {
+		printf("Please stop port %d first\n", port_id);
+		return;
+	}
+
+	if (!strcmp(res->mode, "hw")) {
+		if (rte_eth_bond_8023ad_slow_pkt_hw_filter_enable(port_id) == 0)
+			printf("Hardware slow queue enabled\n");
+		else
+			printf("Enabling hardware slow queue on port %d "
+					"failed\n", port_id);
+	} else if (!strcmp(res->mode, "sw")) {
+		if (rte_eth_bond_8023ad_slow_pkt_hw_filter_disable(port_id)
+				== 0)
+			printf("Software slow queue enabled\n");
+		else
+			printf("Enabling software slow queue on port %d "
+					"failed\n", port_id);
+	}
+}
+
+cmdline_parse_token_string_t cmd_setbonding_slow_queue_set =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_slow_queue_result,
+		set, "set");
+cmdline_parse_token_string_t cmd_setbonding_slow_queue_bonding =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_slow_queue_result,
+		bonding, "bonding");
+cmdline_parse_token_string_t cmd_setbonding_slow_queue_slow_queue =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_slow_queue_result,
+		slow_queue, "slow_queue");
+cmdline_parse_token_num_t cmd_setbonding_slow_queue_port =
+TOKEN_NUM_INITIALIZER(struct cmd_set_bonding_slow_queue_result,
+		port_id, UINT8);
+cmdline_parse_token_string_t cmd_setbonding_slow_queue_mode =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_slow_queue_result,
+		mode, "sw#hw");
+
+cmdline_parse_inst_t cmd_set_slow_queue = {
+		.f = cmd_set_bonding_slow_queue_parsed,
+		.help_str = "set bonding slow_queue <port_id> "
+			"sw|hw: "
+			"Set the bonding slow queue acceleration for port_id",
+		.data = NULL,
+		.tokens = {
+				(void *)&cmd_setbonding_slow_queue_set,
+				(void *)&cmd_setbonding_slow_queue_bonding,
+				(void *)&cmd_setbonding_slow_queue_slow_queue,
+				(void *)&cmd_setbonding_slow_queue_port,
+				(void *)&cmd_setbonding_slow_queue_mode,
+				NULL
+		}
+};
+
 /* *** SET BALANCE XMIT POLICY *** */
 struct cmd_set_bonding_balance_xmit_policy_result {
 	cmdline_fixed_string_t set;
@@ -13846,6 +13920,7 @@ struct cmd_cmdfile_result {
 	(cmdline_parse_inst_t *) &cmd_set_bond_mac_addr,
 	(cmdline_parse_inst_t *) &cmd_set_balance_xmit_policy,
 	(cmdline_parse_inst_t *) &cmd_set_bond_mon_period,
+	(cmdline_parse_inst_t *) &cmd_set_slow_queue,
 #endif
 	(cmdline_parse_inst_t *)&cmd_vlan_offload,
 	(cmdline_parse_inst_t *)&cmd_vlan_tpid,
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 18ee8a3..3da2a38 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1759,6 +1759,14 @@ For example, to set the link status monitoring polling period of bonded device (
    testpmd> set bonding mon_period 5 150
 
 
+set bonding slow_queue
+~~~~~~~~~~~~~~~~~~~~~~
+
+Set software or hardware slow packet processing in mode 4::
+
+   testpmd> set bonding slow_queue (port_id) (sw|hw)
+
+
 show bonding config
 ~~~~~~~~~~~~~~~~~~~
 
-- 
1.9.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration
  2017-06-29 16:20 ` [dpdk-dev] [PATCH v2 0/2] LACP control packet filtering offload Tomasz Kulasek
  2017-06-29 16:20   ` [dpdk-dev] [PATCH v2 1/2] " Tomasz Kulasek
  2017-06-29 16:20   ` [dpdk-dev] [PATCH v2 2/2] test-pmd: add set bonding slow_queue hw/sw Tomasz Kulasek
@ 2017-07-04 16:46   ` Declan Doherty
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 1/4] net/bond: calculate number of bonding tx/rx queues Declan Doherty
                       ` (4 more replies)
  2 siblings, 5 replies; 23+ messages in thread
From: Declan Doherty @ 2017-07-04 16:46 UTC (permalink / raw)
  To: dev; +Cc: Declan Doherty

1. Overview

  Packet processing in the current path for bonding in mode 4 requires
  parsing all packets in the fast path, to classify and process LACP
  packets.

  The idea of performance improvement is to use hardware offloads to
  improve packet classification.

2. Scope of work

   a) Optimization of software LACP packet classification by using
      packet_type metadata to eliminate the requirement of parsing each
      packet in the received burst.

   b) Implementation of classification mechanism using flow director to
      redirect LACP packets to the dedicated queue (not visible to the
      application).

      - Filter pattern choosing (not all filters are supported by all
        devices),
      - Changing processing path to speed up non-LACP packets
        processing,
      - Handle LACP packets from dedicated Rx queue and send to the
        dedicated Tx queue,

   c) Creation of fallback mechanism allowing to select the most
      preferable method of processing:

      - Flow director,
      - Packet type metadata,
      - Software parsing,

3. Implementation

3.1. Packet type

   The packet_type approach would result in a performance improvement
   as packet data would no longer be required to be read, but with this
   approach the bonded driver would still need to look at the mbuf of
   each packet thereby having an impact on the achievable Rx
   performance.

   There is no packet_type value describing LACP packets directly.
   However, packet_type can be used to limit the number of packets that
   must be parsed, e.g. if it indicates a >L2 packet.

   This should improve performance, since well-known non-LACP packets
   can be skipped without looking into their data.
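
   As an illustration only (not part of the patches), a minimal sketch of
   this ptype-based skip, assuming the slave PMD fills in mbuf->packet_type:

      #include <rte_mbuf.h>
      #include <rte_ether.h>
      #include <rte_byteorder.h>

      /* Return 1 if the mbuf may carry a slow protocol (LACP/marker)
       * frame and therefore needs full parsing; 0 if the ptype hint lets
       * us skip it without touching the packet data. */
      static inline int
      may_be_slow_pkt(const struct rte_mbuf *m)
      {
              const struct ether_hdr *hdr;

              /* Anything classified beyond plain L2 ethernet cannot be LACP. */
              if ((m->packet_type & ~RTE_PTYPE_L2_ETHER) != 0)
                      return 0;

              hdr = rte_pktmbuf_mtod(m, const struct ether_hdr *);
              return hdr->ether_type == rte_cpu_to_be_16(ETHER_TYPE_SLOW);
      }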

3.2. Flow director

   Using the rte_flow API with a pattern on the ethernet type of the
   packet (0x8809), we can configure flow director to redirect slow
   packets to a separate queue.

   An independent Rx queue for LACP removes the requirement to filter
   all ingress traffic in software, which should result in a performance
   increase. Other queues stay untouched and processing of packets on
   the fast path is reduced to simply collecting packets from the
   slaves.

   A separate Tx queue for the LACP daemon allows LACP responses to be
   sent immediately, without interfering with the Tx fast path.
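
   A minimal sketch of the flow rule (illustrative only; the slave port id
   and the dedicated queue index are placeholders):

      #include <rte_flow.h>
      #include <rte_byteorder.h>

      /* Steer ethertype 0x8809 (slow protocol) frames on one slave to the
       * dedicated LACP rx queue.  Returns the created flow or NULL. */
      static struct rte_flow *
      steer_lacp_to_queue(uint8_t slave_port, uint16_t lacp_rx_qid)
      {
              struct rte_flow_item_eth spec = { .type = RTE_BE16(0x8809) };
              struct rte_flow_item_eth mask = { .type = 0xFFFF };
              struct rte_flow_item pattern[] = {
                      { .type = RTE_FLOW_ITEM_TYPE_ETH,
                        .spec = &spec, .mask = &mask },
                      { .type = RTE_FLOW_ITEM_TYPE_END },
              };
              struct rte_flow_action_queue queue = { .index = lacp_rx_qid };
              struct rte_flow_action actions[] = {
                      { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
                      { .type = RTE_FLOW_ACTION_TYPE_END },
              };
              struct rte_flow_attr attr = { .ingress = 1 };
              struct rte_flow_error err;

              return rte_flow_create(slave_port, &attr, pattern, actions, &err);
      }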

   RECEIVE

         .---------------.
         | Slave 0       |
         |      .------. |
         |  Fd  | Rxq  | |
   Rx ======o==>|      |==============.
         |  |   +======+ |            |      .---------------.
         |  `-->| LACP |--------.     |      | Bonding       |
         |      `------' |      |     |      |      .------. |
         `---------------'      |     |      |      |      | |
                                |     >============>|      |=======> Rx
         .---------------.      |     |      |      +======+ |
         | Slave 1       |      |     |      |      | XXXX | |
         |      .------. |      |     |      |      `------' |
         |  Fd  | Rxq  | |      |     |      `---------------'
   Rx ======o==>|      |=============='        .-----------.
         |  |   +======+ |      |             /             \
         |  `-->| LACP |--------+----------->+  LACP DAEMON  |
         |      `------' |             Tx <---\             /
         `---------------'                     `-----------'

   All slow packets received by slaves in the bonding are redirected to
   the dedicated queue using flow director. Other packets are collected
   from the slaves and exposed to the application with an Rx burst on
   the bonded device.

   TRANSMIT

         .---------------.
         | Slave 0       |
         |      .------. |
         |      |      | |
   Tx <=====+===|      |<=============.
         |  |   |------| |            |      .---------------.
         |  `---| LACP |<-------.     |      | Bonding       |
         |      `------' |      |     |      |      .------. |
         `---------------'      |     |      |      |      | |
                                |     +<============|      |<====== Tx
         .---------------.      |     |      |      +======+ |
         | Slave 1       |      |     |      |      | XXXX | |
         |      .------. |      |     |      |      `------' |
         |      |      | |      |     |      `---------------'
   Tx <=====+===|      |<============='  Rx    .-----------.
         |  |   |------| |      |         `-->/             \
         |  `---| LACP |<-------+------------+  LACP DAEMON  |
         |      `------' |                    \             /
         `---------------'                     `-----------'

   On transmit, packets are distributed to the slaves. Since there is a
   separate Tx queue for LACP responses, they can be sent independently
   of the fast path.

   LACP DAEMON

   In this mode all slow packets are handled in the LACP daemon.

V3:
 - Split hw filtering patch into 3 patches:
 	- fix for calculating maximum number of tx/rx queues of bonding device
	- enable use of ptype hint for filtering of control plane packets in
	default enablement
	- enablement of dedicated queues for LACP control packet filtering.

Declan Doherty (1):
  net/bond: calculate number of bonding tx/rx queues

Tomasz Kulasek (3):
  net/bond: use ptype flags for LACP rx filtering
  net/bond: dedicated hw queues for LACP control traffic
  app/test-pmd: add cmd for dedicated LACP  rx/tx queues

 app/test-pmd/cmdline.c                            |  85 +++++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst       |   9 +
 drivers/net/bonding/rte_eth_bond_8023ad.c         | 167 ++++++--
 drivers/net/bonding/rte_eth_bond_8023ad.h         |  42 ++
 drivers/net/bonding/rte_eth_bond_8023ad_private.h |  27 ++
 drivers/net/bonding/rte_eth_bond_pmd.c            | 445 +++++++++++++++++++++-
 drivers/net/bonding/rte_eth_bond_version.map      |   9 +
 7 files changed, 734 insertions(+), 50 deletions(-)

-- 
2.9.4

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [dpdk-dev] [PATCH v3 1/4] net/bond: calculate number of bonding tx/rx queues
  2017-07-04 16:46   ` [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration Declan Doherty
@ 2017-07-04 16:46     ` Declan Doherty
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 2/4] net/bond: use ptype flags for LACP rx filtering Declan Doherty
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Declan Doherty @ 2017-07-04 16:46 UTC (permalink / raw)
  To: dev; +Cc: Declan Doherty

Fixes: 2efb58cb ("bond: new link bonding library")

This patch fixes the maximum number of tx and rx queues supported by a
bonding device, as returned by the rte_eth_dev_info_get function. The
bonding device now calculates the maximum number of supported tx and rx
queues based on the slaves bound to the bonded device, with the minimum
values of tx and rx queues across the slaves being the bonded device's
maximum, as each slave must be able to support the same number of tx
and rx queues.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index dccc016..f428e96 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1691,6 +1691,8 @@ static void
 bond_ethdev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 {
 	struct bond_dev_private *internals = dev->data->dev_private;
+	uint16_t max_nb_rx_queues = UINT16_MAX;
+	uint16_t max_nb_tx_queues = UINT16_MAX;
 
 	dev_info->max_mac_addrs = 1;
 
@@ -1698,8 +1700,29 @@ bond_ethdev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 				  ? internals->candidate_max_rx_pktlen
 				  : ETHER_MAX_JUMBO_FRAME_LEN;
 
-	dev_info->max_rx_queues = (uint16_t)128;
-	dev_info->max_tx_queues = (uint16_t)512;
+	if (internals->slave_count > 0) {
+		/* Max number of tx/rx queues that the bonded device can
+		 * support is the minimum values of the bonded slaves, as
+		 * all slaves must be capable of supporting the same number
+		 * of tx/rx queues.
+		 */
+		struct rte_eth_dev_info slave_info;
+		uint8_t idx;
+
+		for (idx = 0; idx < internals->slave_count; idx++) {
+			rte_eth_dev_info_get(internals->slaves[idx].port_id,
+					&slave_info);
+
+			if (slave_info.max_rx_queues < max_nb_rx_queues)
+				max_nb_rx_queues = slave_info.max_rx_queues;
+
+			if (slave_info.max_tx_queues < max_nb_tx_queues)
+				max_nb_tx_queues = slave_info.max_tx_queues;
+		}
+	}
+
+	dev_info->max_rx_queues = max_nb_rx_queues;
+	dev_info->max_tx_queues = max_nb_tx_queues;
 
 	dev_info->min_rx_bufsize = 0;
 
-- 
2.9.4

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [dpdk-dev] [PATCH v3 2/4] net/bond: use ptype flags for LACP rx filtering
  2017-07-04 16:46   ` [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration Declan Doherty
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 1/4] net/bond: calculate number of bonding tx/rx queues Declan Doherty
@ 2017-07-04 16:46     ` Declan Doherty
  2017-07-04 19:54       ` Declan Doherty
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic Declan Doherty
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Declan Doherty @ 2017-07-04 16:46 UTC (permalink / raw)
  To: dev; +Cc: Tomasz Kulasek, Declan Doherty

From: Tomasz Kulasek <tomaszx.kulasek@intel.com>

Use the packet type flags in the mbuf as a hint for filtering LACP
control plane traffic from the data path.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index f428e96..9730ae0 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -180,6 +180,13 @@ bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 
 		/* Handle slow protocol packets. */
 		while (j < num_rx_total) {
+
+			/* If packet is not pure L2 and is known, skip it */
+			if ((bufs[j]->packet_type & ~RTE_PTYPE_L2_ETHER) != 0) {
+				j++;
+				continue;
+			}
+
 			if (j + 3 < num_rx_total)
 				rte_prefetch0(rte_pktmbuf_mtod(bufs[j + 3], void *));
 
@@ -187,7 +194,7 @@ bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 			subtype = ((struct slow_protocol_frame *)hdr)->slow_protocol.subtype;
 
 			/* Remove packet from array if it is slow packet or slave is not
-			 * in collecting state or bondign interface is not in promiscus
+			 * in collecting state or bonding interface is not in promiscuous
 			 * mode and packet address does not match. */
 			if (unlikely(is_lacp_packets(hdr->ether_type, subtype, bufs[j]->vlan_tci) ||
 				!collecting || (!promisc &&
-- 
2.9.4

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic
  2017-07-04 16:46   ` [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration Declan Doherty
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 1/4] net/bond: calculate number of bonding tx/rx queues Declan Doherty
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 2/4] net/bond: use ptype flags for LACP rx filtering Declan Doherty
@ 2017-07-04 16:46     ` Declan Doherty
  2017-07-04 19:55       ` Declan Doherty
                         ` (3 more replies)
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 4/4] app/test-pmd: add cmd for dedicated LACP rx/tx queues Declan Doherty
  2017-07-05 11:35     ` [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration Ferruh Yigit
  4 siblings, 4 replies; 23+ messages in thread
From: Declan Doherty @ 2017-07-04 16:46 UTC (permalink / raw)
  To: dev; +Cc: Tomasz Kulasek, Declan Doherty

From: Tomasz Kulasek <tomaszx.kulasek@intel.com>

Add support for hardware flow classification of LACP control plane
traffic so that it is redirected to a dedicated receive queue on each
slave which is not visible to the application. Also enable a dedicated
transmit queue for LACP traffic, which allows complete decoupling of
control and data paths.

This only applies to bonding devices running in mode 4
(link-aggregation-802.3ad).

Introduce two new APIs to support enabling/disabling of the dedicated
queues:

- rte_eth_bond_8023ad_dedicated_queues_enable
- rte_eth_bond_8023ad_dedicated_queues_disable

rte_eth_bond_8023ad_dedicated_queues_enable must be called before the
bonding port is configured or started, to reserve and configure the
dedicated queues.

When this option is enabled, every slave must support flow filtering
by ethernet type and provide one additional tx and rx queue.
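
An illustrative usage sketch (not part of this patch; the vdev name, slave
port ids, queue counts and 'port_conf' are placeholders):

    #include <rte_ethdev.h>
    #include <rte_eth_bond.h>
    #include <rte_eth_bond_8023ad.h>

    /* Dedicated queues must be enabled while the bonded port is stopped,
     * before it is (re)configured and started. */
    static int
    bond_with_dedicated_queues(uint8_t slave0, uint8_t slave1,
                               const struct rte_eth_conf *port_conf)
    {
            int bond = rte_eth_bond_create("net_bond0", BONDING_MODE_8023AD, 0);

            if (bond < 0)
                    return -1;

            rte_eth_bond_slave_add(bond, slave0);
            rte_eth_bond_slave_add(bond, slave1);

            /* Reserves one extra rx/tx queue per slave for LACP traffic;
             * fails if a slave cannot filter ethertype 0x8809 in hw. */
            if (rte_eth_bond_8023ad_dedicated_queues_enable(bond) != 0)
                    return -1;

            if (rte_eth_dev_configure(bond, 1, 1, port_conf) != 0)
                    return -1;
            /* ... set up the bonded device's own rx/tx queues here ... */
            return rte_eth_dev_start(bond);
    }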

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
---
 drivers/net/bonding/rte_eth_bond_8023ad.c         | 167 +++++++--
 drivers/net/bonding/rte_eth_bond_8023ad.h         |  42 +++
 drivers/net/bonding/rte_eth_bond_8023ad_private.h |  27 ++
 drivers/net/bonding/rte_eth_bond_pmd.c            | 419 ++++++++++++++++++++--
 drivers/net/bonding/rte_eth_bond_version.map      |   9 +
 5 files changed, 612 insertions(+), 52 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index 65dc75b..a2313b3 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -632,16 +632,29 @@ tx_machine(struct bond_dev_private *internals, uint8_t slave_id)
 	lacpdu->tlv_type_terminator = TLV_TYPE_TERMINATOR_INFORMATION;
 	lacpdu->terminator_length = 0;
 
-	if (rte_ring_enqueue(port->tx_ring, lacp_pkt) == -ENOBUFS) {
-		/* If TX ring full, drop packet and free message. Retransmission
-		 * will happen in next function call. */
-		rte_pktmbuf_free(lacp_pkt);
-		set_warning_flags(port, WRN_TX_QUEUE_FULL);
-		return;
+	MODE4_DEBUG("Sending LACP frame\n");
+	BOND_PRINT_LACP(lacpdu);
+
+	if (internals->mode4.dedicated_queues.enabled == 0) {
+		int retval = rte_ring_enqueue(port->tx_ring, lacp_pkt);
+		if (retval != 0) {
+			/* If TX ring full, drop packet and free message.
+			   Retransmission will happen in next function call. */
+			rte_pktmbuf_free(lacp_pkt);
+			set_warning_flags(port, WRN_TX_QUEUE_FULL);
+			return;
+		}
+	} else {
+		uint16_t pkts_sent = rte_eth_tx_burst(slave_id,
+				internals->mode4.dedicated_queues.tx_qid,
+				&lacp_pkt, 1);
+		if (pkts_sent != 1) {
+			rte_pktmbuf_free(lacp_pkt);
+			set_warning_flags(port, WRN_TX_QUEUE_FULL);
+			return;
+		}
 	}
 
-	MODE4_DEBUG("sending LACP frame\n");
-	BOND_PRINT_LACP(lacpdu);
 
 	timer_set(&port->tx_machine_timer, internals->mode4.tx_period_timeout);
 	SM_FLAG_CLR(port, NTT);
@@ -741,6 +754,22 @@ link_speed_key(uint16_t speed) {
 }
 
 static void
+rx_machine_update(struct bond_dev_private *internals, uint8_t slave_id,
+		struct rte_mbuf *lacp_pkt) {
+	struct lacpdu_header *lacp;
+
+	if (lacp_pkt != NULL) {
+		lacp = rte_pktmbuf_mtod(lacp_pkt, struct lacpdu_header *);
+		RTE_ASSERT(lacp->lacpdu.subtype == SLOW_SUBTYPE_LACP);
+
+		/* This is LACP frame so pass it to rx_machine */
+		rx_machine(internals, slave_id, &lacp->lacpdu);
+		rte_pktmbuf_free(lacp_pkt);
+	} else
+		rx_machine(internals, slave_id, NULL);
+}
+
+static void
 bond_mode_8023ad_periodic_cb(void *arg)
 {
 	struct rte_eth_dev *bond_dev = arg;
@@ -748,8 +777,8 @@ bond_mode_8023ad_periodic_cb(void *arg)
 	struct port *port;
 	struct rte_eth_link link_info;
 	struct ether_addr slave_addr;
+	struct rte_mbuf *lacp_pkt = NULL;
 
-	void *pkt = NULL;
 	uint8_t i, slave_id;
 
 
@@ -809,20 +838,28 @@ bond_mode_8023ad_periodic_cb(void *arg)
 
 		SM_FLAG_SET(port, LACP_ENABLED);
 
-		/* Find LACP packet to this port. Do not check subtype, it is done in
-		 * function that queued packet */
-		if (rte_ring_dequeue(port->rx_ring, &pkt) == 0) {
-			struct rte_mbuf *lacp_pkt = pkt;
-			struct lacpdu_header *lacp;
+		if (internals->mode4.dedicated_queues.enabled == 0) {
+			/* Find LACP packet to this port. Do not check subtype,
+			 * it is done in function that queued packet
+			 */
+			int retval = rte_ring_dequeue(port->rx_ring,
+					(void **)&lacp_pkt);
 
-			lacp = rte_pktmbuf_mtod(lacp_pkt, struct lacpdu_header *);
-			RTE_ASSERT(lacp->lacpdu.subtype == SLOW_SUBTYPE_LACP);
+			if (retval != 0)
+				lacp_pkt = NULL;
 
-			/* This is LACP frame so pass it to rx_machine */
-			rx_machine(internals, slave_id, &lacp->lacpdu);
-			rte_pktmbuf_free(lacp_pkt);
-		} else
-			rx_machine(internals, slave_id, NULL);
+			rx_machine_update(internals, slave_id, lacp_pkt);
+		} else {
+			uint16_t rx_count = rte_eth_rx_burst(slave_id,
+					internals->mode4.dedicated_queues.rx_qid,
+					&lacp_pkt, 1);
+
+			if (rx_count == 1)
+				bond_mode_8023ad_handle_slow_pkt(internals,
+						slave_id, lacp_pkt);
+			else
+				rx_machine_update(internals, slave_id, NULL);
+		}
 
 		periodic_machine(internals, slave_id);
 		mux_machine(internals, slave_id);
@@ -1067,6 +1104,10 @@ bond_mode_8023ad_conf_assign(struct mode8023ad_private *mode4,
 	mode4->tx_period_timeout = conf->tx_period_ms * ms_ticks;
 	mode4->rx_marker_timeout = conf->rx_marker_period_ms * ms_ticks;
 	mode4->update_timeout_us = conf->update_timeout_ms * 1000;
+
+	mode4->dedicated_queues.enabled = 0;
+	mode4->dedicated_queues.rx_qid = UINT16_MAX;
+	mode4->dedicated_queues.tx_qid = UINT16_MAX;
 }
 
 static void
@@ -1191,18 +1232,36 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
 		m_hdr->marker.tlv_type_marker = MARKER_TLV_TYPE_RESP;
 		rte_eth_macaddr_get(slave_id, &m_hdr->eth_hdr.s_addr);
 
-		if (unlikely(rte_ring_enqueue(port->tx_ring, pkt) == -ENOBUFS)) {
-			/* reset timer */
-			port->rx_marker_timer = 0;
-			wrn = WRN_TX_QUEUE_FULL;
-			goto free_out;
+		if (internals->mode4.dedicated_queues.enabled == 0) {
+			int retval = rte_ring_enqueue(port->tx_ring, pkt);
+			if (retval != 0) {
+				/* reset timer */
+				port->rx_marker_timer = 0;
+				wrn = WRN_TX_QUEUE_FULL;
+				goto free_out;
+			}
+		} else {
+			/* Send packet directly to the slow queue */
+			uint16_t tx_count = rte_eth_tx_burst(slave_id,
+					internals->mode4.dedicated_queues.tx_qid,
+					&pkt, 1);
+			if (tx_count != 1) {
+				/* reset timer */
+				port->rx_marker_timer = 0;
+				wrn = WRN_TX_QUEUE_FULL;
+				goto free_out;
+			}
 		}
 	} else if (likely(subtype == SLOW_SUBTYPE_LACP)) {
-		if (unlikely(rte_ring_enqueue(port->rx_ring, pkt) == -ENOBUFS)) {
-			/* If RX fing full free lacpdu message and drop packet */
-			wrn = WRN_RX_QUEUE_FULL;
-			goto free_out;
-		}
+		if (internals->mode4.dedicated_queues.enabled == 0) {
+			int retval = rte_ring_enqueue(port->rx_ring, pkt);
+			if (retval != 0) {
+				/* If RX ring is full, free lacpdu message and drop packet */
+				wrn = WRN_RX_QUEUE_FULL;
+				goto free_out;
+			}
+		} else
+			rx_machine_update(internals, slave_id, pkt);
 	} else {
 		wrn = WRN_UNKNOWN_SLOW_TYPE;
 		goto free_out;
@@ -1507,3 +1566,49 @@ bond_mode_8023ad_ext_periodic_cb(void *arg)
 	rte_eal_alarm_set(internals->mode4.update_timeout_us,
 			bond_mode_8023ad_ext_periodic_cb, arg);
 }
+
+int
+rte_eth_bond_8023ad_dedicated_queues_enable(uint8_t port)
+{
+	int retval = 0;
+	struct rte_eth_dev *dev = &rte_eth_devices[port];
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		dev->data->dev_private;
+
+	if (check_for_bonded_ethdev(dev) != 0)
+		return -1;
+
+	if (bond_8023ad_slow_pkt_hw_filter_supported(port) != 0)
+		return -1;
+
+	/* Device must be stopped to set up slow queue */
+	if (dev->data->dev_started)
+		return -1;
+
+	internals->mode4.dedicated_queues.enabled = 1;
+
+	bond_ethdev_mode_set(dev, internals->mode);
+	return retval;
+}
+
+int
+rte_eth_bond_8023ad_dedicated_queues_disable(uint8_t port)
+{
+	int retval = 0;
+	struct rte_eth_dev *dev = &rte_eth_devices[port];
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		dev->data->dev_private;
+
+	if (check_for_bonded_ethdev(dev) != 0)
+		return -1;
+
+	/* Device must be stopped to set up slow queue */
+	if (dev->data->dev_started)
+		return -1;
+
+	internals->mode4.dedicated_queues.enabled = 0;
+
+	bond_ethdev_mode_set(dev, internals->mode);
+
+	return retval;
+}
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.h b/drivers/net/bonding/rte_eth_bond_8023ad.h
index 6b8ff57..5c61e66 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.h
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.h
@@ -302,4 +302,46 @@ int
 rte_eth_bond_8023ad_ext_slowtx(uint8_t port_id, uint8_t slave_id,
 		struct rte_mbuf *lacp_pkt);
 
+/**
+ * Enable dedicated hw queues for 802.3ad control plane traffic on slaves
+ *
+ * This function creates an additional tx and rx queue on each slave for
+ * dedicated 802.3ad control plane traffic. A flow filtering rule is
+ * programmed on each slave to redirect all LACP slow packets to that rx queue
+ * for processing in the LACP state machine; this removes the need to filter
+ * these packets in the bonded device's data path. The additional tx queue is
+ * used to enable the LACP state machine to enqueue LACP packets directly to
+ * slave hw, independently of the bonded device's data path.
+ *
+ * To use this feature all slaves must support the programming of the flow
+ * filter rule required for rx and have enough queues that one rx and tx queue
+ * can be reserved for the LACP state machine's control packets.
+ *
+ * Bonding port must be stopped to change this configuration.
+ *
+ * @param port_id      Bonding device id
+ *
+ * @return
+ *   0 on success, negative value otherwise.
+ */
+int
+rte_eth_bond_8023ad_dedicated_queues_enable(uint8_t port_id);
+
+/**
+ * Disable dedicated hw queues for 802.3ad control plane traffic on slaves
+ *
+ * This function disables the dedicated hw queues for LACP control traffic.
+ *
+ * Bonding port must be stopped to change this configuration.
+ *
+ * @see rte_eth_bond_8023ad_dedicated_queues_enable
+ *
+ * @param port_id      Bonding device id
+ * @return
+ *   0 on success, negative value otherwise.
+ *
+ */
+int
+rte_eth_bond_8023ad_dedicated_queues_disable(uint8_t port_id);
+
 #endif /* RTE_ETH_BOND_8023AD_H_ */
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad_private.h b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
index ca8858b..c16dba8 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad_private.h
+++ b/drivers/net/bonding/rte_eth_bond_8023ad_private.h
@@ -39,6 +39,7 @@
 #include <rte_ether.h>
 #include <rte_byteorder.h>
 #include <rte_atomic.h>
+#include <rte_flow.h>
 
 #include "rte_eth_bond_8023ad.h"
 
@@ -162,6 +163,9 @@ struct port {
 
 	uint64_t warning_timer;
 	volatile uint16_t warnings_to_show;
+
+	/** Memory pool used to allocate mbufs for the dedicated rx queue */
+	struct rte_mempool *slow_pool;
 };
 
 struct mode8023ad_private {
@@ -175,6 +179,19 @@ struct mode8023ad_private {
 	uint64_t update_timeout_us;
 	rte_eth_bond_8023ad_ext_slowrx_fn slowrx_cb;
 	uint8_t external_sm;
+
+	/**
+	 * Configuration of dedicated hardware queues for control plane
+	 * traffic
+	 */
+	struct {
+		uint8_t enabled;
+
+		struct rte_flow *flow[RTE_MAX_ETHPORTS];
+
+		uint16_t rx_qid;
+		uint16_t tx_qid;
+	} dedicated_queues;
 };
 
 /**
@@ -295,4 +312,14 @@ bond_mode_8023ad_deactivate_slave(struct rte_eth_dev *dev, uint8_t slave_pos);
 void
 bond_mode_8023ad_mac_address_update(struct rte_eth_dev *bond_dev);
 
+int
+bond_ethdev_8023ad_flow_verify(struct rte_eth_dev *bond_dev,
+		uint8_t slave_port);
+
+int
+bond_ethdev_8023ad_flow_set(struct rte_eth_dev *bond_dev, uint8_t slave_port);
+
+int
+bond_8023ad_slow_pkt_hw_filter_supported(uint8_t port_id);
+
 #endif /* RTE_ETH_BOND_8023AD_H_ */
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 9730ae0..4d1b262 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -133,6 +133,254 @@ is_lacp_packets(uint16_t ethertype, uint8_t subtype, uint16_t vlan_tci)
 		(subtype == SLOW_SUBTYPE_MARKER || subtype == SLOW_SUBTYPE_LACP));
 }
 
+/*****************************************************************************
+ * Flow director's setup for mode 4 optimization
+ */
+
+static struct rte_flow_item_eth flow_item_eth_type_8023ad = {
+	.dst.addr_bytes = { 0 },
+	.src.addr_bytes = { 0 },
+	.type = RTE_BE16(ETHER_TYPE_SLOW),
+};
+
+static struct rte_flow_item_eth flow_item_eth_mask_type_8023ad = {
+	.dst.addr_bytes = { 0 },
+	.src.addr_bytes = { 0 },
+	.type = 0xFFFF,
+};
+
+static struct rte_flow_item flow_item_8023ad[] = {
+	{
+		.type = RTE_FLOW_ITEM_TYPE_ETH,
+		.spec = &flow_item_eth_type_8023ad,
+		.last = NULL,
+		.mask = &flow_item_eth_mask_type_8023ad,
+	},
+	{
+		.type = RTE_FLOW_ITEM_TYPE_END,
+		.spec = NULL,
+		.last = NULL,
+		.mask = NULL,
+	}
+};
+
+const struct rte_flow_attr flow_attr_8023ad = {
+	.group = 0,
+	.priority = 0,
+	.ingress = 1,
+	.egress = 0,
+	.reserved = 0,
+};
+
+int
+bond_ethdev_8023ad_flow_verify(struct rte_eth_dev *bond_dev,
+		uint8_t slave_port) {
+	struct rte_flow_error error;
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+			(bond_dev->data->dev_private);
+
+	struct rte_flow_action_queue lacp_queue_conf = {
+		.index = internals->mode4.dedicated_queues.rx_qid,
+	};
+
+	const struct rte_flow_action actions[] = {
+		{
+			.type = RTE_FLOW_ACTION_TYPE_QUEUE,
+			.conf = &lacp_queue_conf
+		},
+		{
+			.type = RTE_FLOW_ACTION_TYPE_END,
+		}
+	};
+
+	int ret = rte_flow_validate(slave_port, &flow_attr_8023ad,
+			flow_item_8023ad, actions, &error);
+	if (ret < 0)
+		return -1;
+
+	return 0;
+}
+
+int
+bond_8023ad_slow_pkt_hw_filter_supported(uint8_t port_id) {
+	struct rte_eth_dev *bond_dev = &rte_eth_devices[port_id];
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+			(bond_dev->data->dev_private);
+	struct rte_eth_dev_info bond_info, slave_info;
+	uint8_t idx;
+
+	/* Verify that all slaves in the bonding support flow director */
+	if (internals->slave_count > 0) {
+		rte_eth_dev_info_get(bond_dev->data->port_id, &bond_info);
+
+		internals->mode4.dedicated_queues.rx_qid = bond_info.nb_rx_queues;
+		internals->mode4.dedicated_queues.tx_qid = bond_info.nb_tx_queues;
+
+		for (idx = 0; idx < internals->slave_count; idx++) {
+			rte_eth_dev_info_get(internals->slaves[idx].port_id,
+					&slave_info);
+
+			if (bond_ethdev_8023ad_flow_verify(bond_dev,
+					internals->slaves[idx].port_id) != 0)
+				return -1;
+		}
+	}
+
+	return 0;
+}
+
+int
+bond_ethdev_8023ad_flow_set(struct rte_eth_dev *bond_dev, uint8_t slave_port) {
+
+	struct rte_flow_error error;
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+			(bond_dev->data->dev_private);
+
+	struct rte_flow_action_queue lacp_queue_conf = {
+		.index = internals->mode4.dedicated_queues.rx_qid,
+	};
+
+	const struct rte_flow_action actions[] = {
+		{
+			.type = RTE_FLOW_ACTION_TYPE_QUEUE,
+			.conf = &lacp_queue_conf
+		},
+		{
+			.type = RTE_FLOW_ACTION_TYPE_END,
+		}
+	};
+
+	internals->mode4.dedicated_queues.flow[slave_port] = rte_flow_create(slave_port,
+			&flow_attr_8023ad, flow_item_8023ad, actions, &error);
+	if (internals->mode4.dedicated_queues.flow[slave_port] == NULL) {
+		RTE_BOND_LOG(ERR, "bond_ethdev_8023ad_flow_set: %s "
+				"(slave_port=%d queue_id=%d)",
+				error.message, slave_port,
+				internals->mode4.dedicated_queues.rx_qid);
+		return -1;
+	}
+
+	return 0;
+}
+
+static uint16_t
+bond_ethdev_rx_burst_8023ad_fast_queue(void *queue, struct rte_mbuf **bufs,
+		uint16_t nb_pkts)
+{
+	struct bond_rx_queue *bd_rx_q = (struct bond_rx_queue *)queue;
+	struct bond_dev_private *internals = bd_rx_q->dev_private;
+	uint16_t num_rx_total = 0;	/* Total number of received packets */
+	uint8_t slaves[RTE_MAX_ETHPORTS];
+	uint8_t slave_count;
+
+	uint8_t i, idx;
+
+	/* Copy slave list to protect against slave up/down changes during rx
+	 * bursting */
+	slave_count = internals->active_slave_count;
+	memcpy(slaves, internals->active_slaves,
+			sizeof(internals->active_slaves[0]) * slave_count);
+
+	for (i = 0, idx = internals->active_slave;
+			i < slave_count && num_rx_total < nb_pkts; i++, idx++) {
+		idx = idx % slave_count;
+
+		/* Read packets from this slave */
+		num_rx_total += rte_eth_rx_burst(slaves[idx], bd_rx_q->queue_id,
+				&bufs[num_rx_total], nb_pkts - num_rx_total);
+	}
+
+	internals->active_slave = idx;
+
+	return num_rx_total;
+}
+
+static uint16_t
+bond_ethdev_tx_burst_8023ad_fast_queue(void *queue, struct rte_mbuf **bufs,
+		uint16_t nb_pkts)
+{
+	struct bond_dev_private *internals;
+	struct bond_tx_queue *bd_tx_q;
+
+	uint8_t num_of_slaves;
+	uint8_t slaves[RTE_MAX_ETHPORTS];
+	 /* positions in slaves, not ID */
+	uint8_t distributing_offsets[RTE_MAX_ETHPORTS];
+	uint8_t distributing_count;
+
+	uint16_t num_tx_slave, num_tx_total = 0, num_tx_fail_total = 0;
+	uint16_t i, op_slave_idx;
+
+	struct rte_mbuf *slave_bufs[RTE_MAX_ETHPORTS][nb_pkts];
+
+	/* Total amount of packets in slave_bufs */
+	uint16_t slave_nb_pkts[RTE_MAX_ETHPORTS] = { 0 };
+	/* Slow packets placed in each slave */
+
+	if (unlikely(nb_pkts == 0))
+		return 0;
+
+	bd_tx_q = (struct bond_tx_queue *)queue;
+	internals = bd_tx_q->dev_private;
+
+	/* Copy slave list to protect against slave up/down changes during tx
+	 * bursting */
+	num_of_slaves = internals->active_slave_count;
+	if (num_of_slaves < 1)
+		return num_tx_total;
+
+	memcpy(slaves, internals->active_slaves, sizeof(slaves[0]) *
+			num_of_slaves);
+
+	distributing_count = 0;
+	for (i = 0; i < num_of_slaves; i++) {
+		struct port *port = &mode_8023ad_ports[slaves[i]];
+		if (ACTOR_STATE(port, DISTRIBUTING))
+			distributing_offsets[distributing_count++] = i;
+	}
+
+	if (likely(distributing_count > 0)) {
+		/* Populate slaves mbuf with the packets which are to be sent */
+		for (i = 0; i < nb_pkts; i++) {
+			/* Select output slave using hash based on xmit policy */
+			op_slave_idx = internals->xmit_hash(bufs[i],
+					distributing_count);
+
+			/* Populate slave mbuf arrays with mbufs for that slave.
+			 * Use only slaves that are currently distributing.
+			 */
+			uint8_t slave_offset =
+					distributing_offsets[op_slave_idx];
+			slave_bufs[slave_offset][slave_nb_pkts[slave_offset]] =
+					bufs[i];
+			slave_nb_pkts[slave_offset]++;
+		}
+	}
+
+	/* Send packet burst on each slave device */
+	for (i = 0; i < num_of_slaves; i++) {
+		if (slave_nb_pkts[i] == 0)
+			continue;
+
+		num_tx_slave = rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
+				slave_bufs[i], slave_nb_pkts[i]);
+
+		num_tx_total += num_tx_slave;
+		num_tx_fail_total += slave_nb_pkts[i] - num_tx_slave;
+
+		/* If tx burst fails move packets to end of bufs */
+		if (unlikely(num_tx_slave < slave_nb_pkts[i])) {
+			uint16_t j = nb_pkts - num_tx_fail_total;
+			for ( ; num_tx_slave < slave_nb_pkts[i]; j++,
+					num_tx_slave++)
+				bufs[j] = slave_bufs[i][num_tx_slave];
+		}
+	}
+
+	return num_tx_total;
+}
+
+
 static uint16_t
 bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 		uint16_t nb_pkts)
@@ -1302,11 +1550,19 @@ bond_ethdev_mode_set(struct rte_eth_dev *eth_dev, int mode)
 		if (bond_mode_8023ad_enable(eth_dev) != 0)
 			return -1;
 
-		eth_dev->rx_pkt_burst = bond_ethdev_rx_burst_8023ad;
-		eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_8023ad;
-		RTE_LOG(WARNING, PMD,
-				"Using mode 4, it is necessary to do TX burst and RX burst "
-				"at least every 100ms.\n");
+		if (internals->mode4.dedicated_queues.enabled == 0) {
+			eth_dev->rx_pkt_burst = bond_ethdev_rx_burst_8023ad;
+			eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_8023ad;
+			RTE_LOG(WARNING, PMD,
+				"Using mode 4, it is necessary to do TX burst "
+				"and RX burst at least every 100ms.\n");
+		} else {
+			/* Use flow director's optimization */
+			eth_dev->rx_pkt_burst =
+					bond_ethdev_rx_burst_8023ad_fast_queue;
+			eth_dev->tx_pkt_burst =
+					bond_ethdev_tx_burst_8023ad_fast_queue;
+		}
 		break;
 	case BONDING_MODE_TLB:
 		eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_tlb;
@@ -1328,15 +1584,81 @@ bond_ethdev_mode_set(struct rte_eth_dev *eth_dev, int mode)
 	return 0;
 }
 
+
+static int
+slave_configure_slow_queue(struct rte_eth_dev *bonded_eth_dev,
+		struct rte_eth_dev *slave_eth_dev)
+{
+	int errval = 0;
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		bonded_eth_dev->data->dev_private;
+	struct port *port = &mode_8023ad_ports[slave_eth_dev->data->port_id];
+
+	if (port->slow_pool == NULL) {
+		char mem_name[256];
+		int slave_id = slave_eth_dev->data->port_id;
+
+		snprintf(mem_name, RTE_DIM(mem_name), "slave_port%u_slow_pool",
+				slave_id);
+		port->slow_pool = rte_pktmbuf_pool_create(mem_name, 8191,
+			250, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
+			slave_eth_dev->data->numa_node);
+
+		/* Any memory allocation failure in initialization is critical because
+		 * resources can't be freed, so reinitialization is impossible. */
+		if (port->slow_pool == NULL) {
+			rte_panic("Slave %u: Failed to create memory pool '%s': %s\n",
+				slave_id, mem_name, rte_strerror(rte_errno));
+		}
+	}
+
+	if (internals->mode4.dedicated_queues.enabled == 1) {
+		/* Configure slow Rx queue */
+
+		errval = rte_eth_rx_queue_setup(slave_eth_dev->data->port_id,
+				internals->mode4.dedicated_queues.rx_qid, 128,
+				rte_eth_dev_socket_id(slave_eth_dev->data->port_id),
+				NULL, port->slow_pool);
+		if (errval != 0) {
+			RTE_BOND_LOG(ERR,
+					"rte_eth_rx_queue_setup: port=%d queue_id %d, err (%d)",
+					slave_eth_dev->data->port_id,
+					internals->mode4.dedicated_queues.rx_qid,
+					errval);
+			return errval;
+		}
+
+		errval = rte_eth_tx_queue_setup(slave_eth_dev->data->port_id,
+				internals->mode4.dedicated_queues.tx_qid, 512,
+				rte_eth_dev_socket_id(slave_eth_dev->data->port_id),
+				NULL);
+		if (errval != 0) {
+			RTE_BOND_LOG(ERR,
+				"rte_eth_tx_queue_setup: port=%d queue_id %d, err (%d)",
+				slave_eth_dev->data->port_id,
+				internals->mode4.dedicated_queues.tx_qid,
+				errval);
+			return errval;
+		}
+	}
+	return 0;
+}
+
 int
 slave_configure(struct rte_eth_dev *bonded_eth_dev,
 		struct rte_eth_dev *slave_eth_dev)
 {
 	struct bond_rx_queue *bd_rx_q;
 	struct bond_tx_queue *bd_tx_q;
+	uint16_t nb_rx_queues;
+	uint16_t nb_tx_queues;
 
 	int errval;
 	uint16_t q_id;
+	struct rte_flow_error flow_error;
+
+	struct bond_dev_private *internals = (struct bond_dev_private *)
+		bonded_eth_dev->data->dev_private;
 
 	/* Stop slave */
 	rte_eth_dev_stop(slave_eth_dev->data->port_id);
@@ -1366,10 +1688,19 @@ slave_configure(struct rte_eth_dev *bonded_eth_dev,
 	slave_eth_dev->data->dev_conf.rxmode.hw_vlan_filter =
 			bonded_eth_dev->data->dev_conf.rxmode.hw_vlan_filter;
 
+	nb_rx_queues = bonded_eth_dev->data->nb_rx_queues;
+	nb_tx_queues = bonded_eth_dev->data->nb_tx_queues;
+
+	if (internals->mode == BONDING_MODE_8023AD) {
+		if (internals->mode4.dedicated_queues.enabled == 1) {
+			nb_rx_queues++;
+			nb_tx_queues++;
+		}
+	}
+
 	/* Configure device */
 	errval = rte_eth_dev_configure(slave_eth_dev->data->port_id,
-			bonded_eth_dev->data->nb_rx_queues,
-			bonded_eth_dev->data->nb_tx_queues,
+			nb_rx_queues, nb_tx_queues,
 			&(slave_eth_dev->data->dev_conf));
 	if (errval != 0) {
 		RTE_BOND_LOG(ERR, "Cannot configure slave device: port %u , err (%d)",
@@ -1403,12 +1734,35 @@ slave_configure(struct rte_eth_dev *bonded_eth_dev,
 				&bd_tx_q->tx_conf);
 		if (errval != 0) {
 			RTE_BOND_LOG(ERR,
-					"rte_eth_tx_queue_setup: port=%d queue_id %d, err (%d)",
-					slave_eth_dev->data->port_id, q_id, errval);
+				"rte_eth_tx_queue_setup: port=%d queue_id %d, err (%d)",
+				slave_eth_dev->data->port_id, q_id, errval);
 			return errval;
 		}
 	}
 
+	if (internals->mode == BONDING_MODE_8023AD &&
+			internals->mode4.dedicated_queues.enabled == 1) {
+		errval = slave_configure_slow_queue(bonded_eth_dev, slave_eth_dev);
+		if (errval != 0)
+			return errval;
+
+		if (bond_ethdev_8023ad_flow_verify(bonded_eth_dev,
+				slave_eth_dev->data->port_id) != 0) {
+			RTE_BOND_LOG(ERR,
+				"bond_ethdev_8023ad_flow_verify failed: port=%d",
+				slave_eth_dev->data->port_id);
+			return -1;
+		}
+
+		if (internals->mode4.dedicated_queues.flow[slave_eth_dev->data->port_id] != NULL)
+			rte_flow_destroy(slave_eth_dev->data->port_id,
+					internals->mode4.dedicated_queues.flow[slave_eth_dev->data->port_id],
+					&flow_error);
+
+		bond_ethdev_8023ad_flow_set(bonded_eth_dev,
+				slave_eth_dev->data->port_id);
+	}
+
 	/* Start device */
 	errval = rte_eth_dev_start(slave_eth_dev->data->port_id);
 	if (errval != 0) {
@@ -1567,13 +1921,26 @@ bond_ethdev_start(struct rte_eth_dev *eth_dev)
 	if (internals->promiscuous_en)
 		bond_ethdev_promiscuous_enable(eth_dev);
 
+	if (internals->mode == BONDING_MODE_8023AD) {
+		if (internals->mode4.dedicated_queues.enabled == 1) {
+			internals->mode4.dedicated_queues.rx_qid =
+					eth_dev->data->nb_rx_queues;
+			internals->mode4.dedicated_queues.tx_qid =
+					eth_dev->data->nb_tx_queues;
+		}
+	}
+
+
 	/* Reconfigure each slave device if starting bonded device */
 	for (i = 0; i < internals->slave_count; i++) {
-		if (slave_configure(eth_dev,
-				&(rte_eth_devices[internals->slaves[i].port_id])) != 0) {
+		struct rte_eth_dev *slave_ethdev =
+				&(rte_eth_devices[internals->slaves[i].port_id]);
+		if (slave_configure(eth_dev, slave_ethdev) != 0) {
 			RTE_BOND_LOG(ERR,
-					"bonded port (%d) failed to reconfigure slave device (%d)",
-					eth_dev->data->port_id, internals->slaves[i].port_id);
+					"bonded port (%d) failed to reconfigure"
+					"slave device (%d)",
+					eth_dev->data->port_id,
+					internals->slaves[i].port_id);
 			return -1;
 		}
 		/* We will need to poll for link status if any slave doesn't
@@ -1698,21 +2065,21 @@ static void
 bond_ethdev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 {
 	struct bond_dev_private *internals = dev->data->dev_private;
+
 	uint16_t max_nb_rx_queues = UINT16_MAX;
 	uint16_t max_nb_tx_queues = UINT16_MAX;
 
 	dev_info->max_mac_addrs = 1;
 
-	dev_info->max_rx_pktlen = internals->candidate_max_rx_pktlen
-				  ? internals->candidate_max_rx_pktlen
-				  : ETHER_MAX_JUMBO_FRAME_LEN;
+	dev_info->max_rx_pktlen = internals->candidate_max_rx_pktlen ?
+			internals->candidate_max_rx_pktlen :
+			ETHER_MAX_JUMBO_FRAME_LEN;
 
+	/* Max number of tx/rx queues that the bonded device can support is the
+	 * minimum values of the bonded slaves, as all slaves must be capable
+	 * of supporting the same number of tx/rx queues.
+	 */
 	if (internals->slave_count > 0) {
-		/* Max number of tx/rx queues that the bonded device can
-		 * support is the minimum values of the bonded slaves, as
-		 * all slaves must be capable of supporting the same number
-		 * of tx/rx queues.
-		 */
 		struct rte_eth_dev_info slave_info;
 		uint8_t idx;
 
@@ -1731,6 +2098,16 @@ bond_ethdev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 	dev_info->max_rx_queues = max_nb_rx_queues;
 	dev_info->max_tx_queues = max_nb_tx_queues;
 
+	/**
+	 * If dedicated hw queues are enabled for the link bonding device in LACP mode
+	 * then we need to reduce the maximum number of data path queues by 1.
+	 */
+	if (internals->mode == BONDING_MODE_8023AD &&
+		internals->mode4.dedicated_queues.enabled == 1) {
+		dev_info->max_rx_queues--;
+		dev_info->max_tx_queues--;
+	}
+
 	dev_info->min_rx_bufsize = 0;
 
 	dev_info->rx_offload_capa = internals->rx_offload_capa;
diff --git a/drivers/net/bonding/rte_eth_bond_version.map b/drivers/net/bonding/rte_eth_bond_version.map
index 2de0a7d..9c15864 100644
--- a/drivers/net/bonding/rte_eth_bond_version.map
+++ b/drivers/net/bonding/rte_eth_bond_version.map
@@ -43,3 +43,12 @@ DPDK_16.07 {
 	rte_eth_bond_8023ad_setup;
 
 } DPDK_16.04;
+
+DPDK_17.08 {
+	global:
+
+	rte_eth_bond_8023ad_dedicated_queues_enable;
+	rte_eth_bond_8023ad_dedicated_queues_disable;
+
+	local: *;
+} DPDK_17.05;
-- 
2.9.4

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [dpdk-dev] [PATCH v3 4/4] app/test-pmd: add cmd for dedicated LACP rx/tx queues
  2017-07-04 16:46   ` [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration Declan Doherty
                       ` (2 preceding siblings ...)
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic Declan Doherty
@ 2017-07-04 16:46     ` Declan Doherty
  2017-07-04 19:56       ` Declan Doherty
  2017-07-05 11:33       ` Ferruh Yigit
  2017-07-05 11:35     ` [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration Ferruh Yigit
  4 siblings, 2 replies; 23+ messages in thread
From: Declan Doherty @ 2017-07-04 16:46 UTC (permalink / raw)
  To: dev; +Cc: Tomasz Kulasek, Declan Doherty

From: Tomasz Kulasek <tomaszx.kulasek@intel.com>

Add new command to support enable/disable of dedicated tx/rx queue on
each slave of a bond device for LACP control plane traffic.

    set bonding lacp dedicated_queues <port_id> [enable|disable]

When enabled this option creates dedicated queues on each slave device
for LACP control plane traffic. This removes the need to filter control
plane packets in the data path.
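
For illustration, a typical testpmd sequence using the new command might look
like the following (port ids are examples only; here ports 0 and 1 are the
slaves and port 2 is the bonded device created by testpmd):

    testpmd> create bonded device 4 0
    testpmd> add bonding slave 0 2
    testpmd> add bonding slave 1 2
    testpmd> set bonding lacp dedicated_queues 2 enable
    testpmd> port start 2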

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
---
 app/test-pmd/cmdline.c                      | 85 +++++++++++++++++++++++++++++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  9 +++
 2 files changed, 94 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0fc40a6..486252a 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -87,6 +87,7 @@
 #include <cmdline.h>
 #ifdef RTE_LIBRTE_PMD_BOND
 #include <rte_eth_bond.h>
+#include <rte_eth_bond_8023ad.h>
 #endif
 #ifdef RTE_LIBRTE_IXGBE_PMD
 #include <rte_pmd_ixgbe.h>
@@ -575,6 +576,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 
 			"set bonding mon_period (port_id) (value)\n"
 			"	Set the bonding link status monitoring polling period in ms.\n\n"
+
+			"set bonding lacp dedicated_queues <port_id> (enable|disable)\n"
+			"	Enable/disable dedicated queues for LACP control traffic.\n\n"
+
 #endif
 			"set link-up port (port_id)\n"
 			"	Set link up for a port.\n\n"
@@ -4303,6 +4308,85 @@ cmdline_parse_inst_t cmd_set_bonding_mode = {
 		}
 };
 
+/* *** SET BONDING LACP DEDICATED QUEUES ENABLE/DISABLE *** */
+struct cmd_set_bonding_lacp_dedicated_queues_result {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t bonding;
+	cmdline_fixed_string_t lacp;
+	cmdline_fixed_string_t dedicated_queues;
+	uint8_t port_id;
+	cmdline_fixed_string_t mode;
+};
+
+static void cmd_set_bonding_lacp_dedicated_queues_parsed(void *parsed_result,
+		__attribute__((unused))  struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_set_bonding_lacp_dedicated_queues_result *res = parsed_result;
+	portid_t port_id = res->port_id;
+	struct rte_port *port;
+
+	port = &ports[port_id];
+
+	/** Check if the port is not started **/
+	if (port->port_status != RTE_PORT_STOPPED) {
+		printf("Please stop port %d first\n", port_id);
+		return;
+	}
+
+	if (!strcmp(res->mode, "enable")) {
+		if (rte_eth_bond_8023ad_dedicated_queues_enable(port_id) == 0)
+			printf("Dedicate queues for LACP control packets"
+					" enabled\n");
+		else
+			printf("Enabling dedicate queues for LACP control "
+					"packets on port %d failed\n", port_id);
+	} else if (!strcmp(res->mode, "disable")) {
+		if (rte_eth_bond_8023ad_dedicated_queues_disable(port_id) == 0)
+			printf("Dedicated queues for LACP control packets "
+					"disabled\n");
+		else
+			printf("Disabling dedicated queues for LACP control "
+					"traffic on port %d failed\n", port_id);
+	}
+}
+
+cmdline_parse_token_string_t cmd_setbonding_lacp_dedicated_queues_set =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_lacp_dedicated_queues_result,
+		set, "set");
+cmdline_parse_token_string_t cmd_setbonding_lacp_dedicated_queues_bonding =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_lacp_dedicated_queues_result,
+		bonding, "bonding");
+cmdline_parse_token_string_t cmd_setbonding_lacp_dedicated_queues_lacp =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_lacp_dedicated_queues_result,
+		lacp, "lacp");
+cmdline_parse_token_string_t cmd_setbonding_lacp_dedicated_queues_dedicated_queues =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_lacp_dedicated_queues_result,
+		dedicated_queues, "dedicated_queues");
+cmdline_parse_token_num_t cmd_setbonding_lacp_dedicated_queues_port_id =
+TOKEN_NUM_INITIALIZER(struct cmd_set_bonding_lacp_dedicated_queues_result,
+		port_id, UINT8);
+cmdline_parse_token_string_t cmd_setbonding_lacp_dedicated_queues_mode =
+TOKEN_STRING_INITIALIZER(struct cmd_set_bonding_lacp_dedicated_queues_result,
+		mode, "enable#disable");
+
+cmdline_parse_inst_t cmd_set_lacp_dedicated_queues = {
+		.f = cmd_set_bonding_lacp_dedicated_queues_parsed,
+		.help_str = "set bonding lacp dedicated_queues <port_id> "
+			"enable|disable: "
+			"Enable/disable dedicated queues for LACP control traffic for port_id",
+		.data = NULL,
+		.tokens = {
+			(void *)&cmd_setbonding_lacp_dedicated_queues_set,
+			(void *)&cmd_setbonding_lacp_dedicated_queues_bonding,
+			(void *)&cmd_setbonding_lacp_dedicated_queues_lacp,
+			(void *)&cmd_setbonding_lacp_dedicated_queues_dedicated_queues,
+			(void *)&cmd_setbonding_lacp_dedicated_queues_port_id,
+			(void *)&cmd_setbonding_lacp_dedicated_queues_mode,
+			NULL
+		}
+};
+
 /* *** SET BALANCE XMIT POLICY *** */
 struct cmd_set_bonding_balance_xmit_policy_result {
 	cmdline_fixed_string_t set;
@@ -13934,6 +14018,7 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *) &cmd_set_bond_mac_addr,
 	(cmdline_parse_inst_t *) &cmd_set_balance_xmit_policy,
 	(cmdline_parse_inst_t *) &cmd_set_bond_mon_period,
+	(cmdline_parse_inst_t *) &cmd_set_lacp_dedicated_queues,
 #endif
 	(cmdline_parse_inst_t *)&cmd_vlan_offload,
 	(cmdline_parse_inst_t *)&cmd_vlan_tpid,
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index b8f47fd..35d0b1f 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1766,6 +1766,15 @@ For example, to set the link status monitoring polling period of bonded device (
    testpmd> set bonding mon_period 5 150
 
 
+set bonding lacp dedicated_queue 
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Enable dedicated tx/rx queues on bonding device slaves to handle LACP control plane traffic
+when in mode 4 (link-aggregation-802.3ad).
+
+   testpmd> set bonding lacp dedicated_queues (port_id) (enable|disable)
+
+
 show bonding config
 ~~~~~~~~~~~~~~~~~~~
 
-- 
2.9.4

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/4] net/bond: use ptype flags for LACP rx filtering
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 2/4] net/bond: use ptype flags for LACP rx filtering Declan Doherty
@ 2017-07-04 19:54       ` Declan Doherty
  0 siblings, 0 replies; 23+ messages in thread
From: Declan Doherty @ 2017-07-04 19:54 UTC (permalink / raw)
  To: dev; +Cc: Tomasz Kulasek

On 04/07/17 17:46, Declan Doherty wrote:
> From: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> 
> Use packet types flags in mbuf to provide hint for filtering of LACP
> control plane traffic from the data path.
> 
> Signed-off-by: Declan Doherty <declan.doherty@intel.com>
> ---
...
> 

Acked-by: Declan Doherty <declan.doherty@intel.com>
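
As a rough illustration of the ptype hint, the check performed on the rx path
boils down to something like the sketch below (illustrative only, not the code
added by the patch; the helper name is made up):

#include <rte_mbuf.h>

/* Return non-zero only for mbufs that still have to be parsed to decide
 * whether they are LACP (ethertype 0x8809) slow-protocol frames. */
static inline int
mbuf_may_be_slow_packet(const struct rte_mbuf *m)
{
	/* Packets the NIC already classified as IPv4/IPv6 cannot be LACP,
	 * so their data is never read. */
	if (RTE_ETH_IS_IPV4_HDR(m->packet_type) ||
	    RTE_ETH_IS_IPV6_HDR(m->packet_type))
		return 0;

	return 1;	/* unknown/other types: fall back to header parsing */
}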

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic Declan Doherty
@ 2017-07-04 19:55       ` Declan Doherty
  2017-07-05 11:19       ` Ferruh Yigit
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Declan Doherty @ 2017-07-04 19:55 UTC (permalink / raw)
  To: dev; +Cc: Tomasz Kulasek

On 04/07/17 17:46, Declan Doherty wrote:
> From: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> 
> Add support for hardware flow classification of LACP control plane
> traffic to be redirected to a dedicated receive queue on each slave which
> is not visible to the application. Also enables a dedicated transmit queue
> for LACP traffic which allows complete decoupling of control and data
> paths.
> 
> This only applies to bonding devices running in mode 4
> (link-aggregation-802.3ad).
> 
> Introduce two new APIs to support enable/disable of dedicated
> queues.
> 
> - rte_eth_bond_8023ad_dedicated_queues_enable
> - rte_eth_bond_8023ad_dedicated_queues_disable
> 
> rte_eth_bond_8023ad_dedicated_queues_enable must be called before
> bonding port is configured or started, to reserve and configure the
> dedicated queues.
> 
> When this option is enabled all slaves must support flow filtering
> by ethernet type and support one additional tx and rx queue on
> each slave.
> 
> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> Signed-off-by: Declan Doherty <declan.doherty@intel.com>
> ---
...

> 

Acked-by: Declan Doherty <declan.doherty@intel.com>
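
For completeness, a minimal sketch of how an application could consume the two
new APIs is shown below (the function name, device name, port ids and queue
sizes are illustrative assumptions, not part of the patch):

#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>
#include <rte_eth_bond.h>
#include <rte_eth_bond_8023ad.h>

/* Create a mode 4 bond, enable the dedicated LACP queues while the bonded
 * port is still stopped and unconfigured, then bring the port up. */
static int
bond_setup_with_dedicated_queues(uint8_t slave0, uint8_t slave1,
		struct rte_mempool *mb_pool)
{
	struct rte_eth_conf conf = { 0 };
	int bond_port;

	bond_port = rte_eth_bond_create("net_bonding0", BONDING_MODE_8023AD,
			rte_socket_id());
	if (bond_port < 0)
		return bond_port;

	rte_eth_bond_slave_add(bond_port, slave0);
	rte_eth_bond_slave_add(bond_port, slave1);

	/* Must be called before the bonded port is configured or started. */
	if (rte_eth_bond_8023ad_dedicated_queues_enable(bond_port) != 0)
		return -1;

	if (rte_eth_dev_configure(bond_port, 1, 1, &conf) != 0)
		return -1;
	if (rte_eth_rx_queue_setup(bond_port, 0, 256,
			rte_eth_dev_socket_id(bond_port), NULL, mb_pool) != 0)
		return -1;
	if (rte_eth_tx_queue_setup(bond_port, 0, 256,
			rte_eth_dev_socket_id(bond_port), NULL) != 0)
		return -1;

	return rte_eth_dev_start(bond_port);
}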

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v3 4/4] app/test-pmd: add cmd for dedicated LACP rx/tx queues
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 4/4] app/test-pmd: add cmd for dedicated LACP rx/tx queues Declan Doherty
@ 2017-07-04 19:56       ` Declan Doherty
  2017-07-05 11:33       ` Ferruh Yigit
  1 sibling, 0 replies; 23+ messages in thread
From: Declan Doherty @ 2017-07-04 19:56 UTC (permalink / raw)
  To: dev; +Cc: Tomasz Kulasek

On 04/07/17 17:46, Declan Doherty wrote:
> From: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> 
> Add new command to support enable/disable of dedicated tx/rx queue on
> each slave of a bond device for LACP control plane traffic.
> 
>      set bonding lacp dedicated_queues <port_id> [enable|disable]
> 
> When enabled this option creates dedicated queues on each slave device
> for LACP control plane traffic. This removes the need to filter control
> plane packets in the data path.
> 
> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> Signed-off-by: Declan Doherty <declan.doherty@intel.com>
> ---
...
> 

Acked-by: Declan Doherty <declan.doherty@intel.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic Declan Doherty
  2017-07-04 19:55       ` Declan Doherty
@ 2017-07-05 11:19       ` Ferruh Yigit
  2017-07-05 11:33       ` Ferruh Yigit
  2017-12-13  8:16       ` linhaifeng
  3 siblings, 0 replies; 23+ messages in thread
From: Ferruh Yigit @ 2017-07-05 11:19 UTC (permalink / raw)
  To: Declan Doherty, dev; +Cc: Tomasz Kulasek

On 7/4/2017 5:46 PM, Declan Doherty wrote:
> From: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> 
> Add support for hardware flow classification of LACP control plane
> traffic to be redirected to a dedicated receive queue on each slave which
> is not visible to the application. Also enables a dedicated transmit queue
> for LACP traffic which allows complete decoupling of control and data
> paths.
> 
> This only applies to bonding devices running in mode 4
> (link-aggregation-802.3ad).
> 
> Introduce two new APIs to support enable/disable of dedicated
> queues.
> 
> - rte_eth_bond_8023ad_dedicated_queues_enable
> - rte_eth_bond_8023ad_dedicated_queues_disable
> 
> rte_eth_bond_8023ad_dedicated_queues_enable must be called before
> bonding port is configured or started, to reserve and configure the
> dedicated queues.
> 
> When this option is enabled all slaves must support flow filtering 
> by ethernet type and support one additional tx and rx queue on 
> each slave.
> 
> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> Signed-off-by: Declan Doherty <declan.doherty@intel.com>

<...>

> +
> +DPDK_17.08 {
> +	global:
> +
> +	rte_eth_bond_8023ad_dedicated_queues_enable;
> +	rte_eth_bond_8023ad_dedicated_queues_disable;
> +
> +	local: *;

This line is not required.

> +} DPDK_17.05;

And this should be DPDK_16.07, otherwise breaking shared build.

I will fix above ones while applying.

> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v3 4/4] app/test-pmd: add cmd for dedicated LACP rx/tx queues
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 4/4] app/test-pmd: add cmd for dedicated LACP rx/tx queues Declan Doherty
  2017-07-04 19:56       ` Declan Doherty
@ 2017-07-05 11:33       ` Ferruh Yigit
  1 sibling, 0 replies; 23+ messages in thread
From: Ferruh Yigit @ 2017-07-05 11:33 UTC (permalink / raw)
  To: Declan Doherty, dev; +Cc: Tomasz Kulasek

On 7/4/2017 5:46 PM, Declan Doherty wrote:
> From: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> 
> Add new command to support enable/disable of dedicated tx/rx queue on
> each slave of a bond device for LACP control plane traffic.
> 
>     set bonding lacp dedicated_queues <port_id> [enable|disable]
> 
> When enabled this option creates dedicated queues on each slave device
> for LACP control plane traffic. This removes the need to filter control
> plane packets in the data path.
> 
> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> Signed-off-by: Declan Doherty <declan.doherty@intel.com>

<...>

> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> @@ -1766,6 +1766,15 @@ For example, to set the link status monitoring polling period of bonded device (
>     testpmd> set bonding mon_period 5 150
>  
>  
> +set bonding lacp dedicated_queue 

trailing white-space removed.

> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<...>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic Declan Doherty
  2017-07-04 19:55       ` Declan Doherty
  2017-07-05 11:19       ` Ferruh Yigit
@ 2017-07-05 11:33       ` Ferruh Yigit
  2017-12-13  8:16       ` linhaifeng
  3 siblings, 0 replies; 23+ messages in thread
From: Ferruh Yigit @ 2017-07-05 11:33 UTC (permalink / raw)
  To: Declan Doherty, dev; +Cc: Tomasz Kulasek

On 7/4/2017 5:46 PM, Declan Doherty wrote:
> From: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> 
> Add support for hardware flow classification of LACP control plane
> traffic to be redirected to a dedicated receive queue on each slave which
> is not visible to the application. Also enables a dedicated transmit queue
> for LACP traffic which allows complete decoupling of control and data
> paths.
> 
> This only applies to bonding devices running in mode 4
> (link-aggregation-802.3ad).
> 
> Introduce two new APIs to support enable/disable of dedicated
> queues.
> 
> - rte_eth_bond_8023ad_dedicated_queues_enable
> - rte_eth_bond_8023ad_dedicated_queues_disable
> 
> rte_eth_bond_8023ad_dedicated_queues_enable must be called before
> bonding port is configured or started, to reserve and configure the
> dedicated queues.
> 
> When this option is enabled all slaves must support flow filtering 
> by ethernet type and support one additional tx and rx queue on 
> each slave.
> 
> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> Signed-off-by: Declan Doherty <declan.doherty@intel.com>

<...>

> -					"bonded port (%d) failed to reconfigure slave device (%d)",
> -					eth_dev->data->port_id, internals->slaves[i].port_id);
> +					"bonded port (%d) failed to reconfigure"
> +					"slave device (%d)",

Log string merged into single line.

<...>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration
  2017-07-04 16:46   ` [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration Declan Doherty
                       ` (3 preceding siblings ...)
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 4/4] app/test-pmd: add cmd for dedicated LACP rx/tx queues Declan Doherty
@ 2017-07-05 11:35     ` Ferruh Yigit
  4 siblings, 0 replies; 23+ messages in thread
From: Ferruh Yigit @ 2017-07-05 11:35 UTC (permalink / raw)
  To: Declan Doherty, dev

On 7/4/2017 5:46 PM, Declan Doherty wrote:
> 1. Overview
> 
>   Packet processing in the current path for bonding in mode 4, requires
>   parse all packets in the fast path, to classify and process LACP
>   packets.
> 
>   The idea of performance improvement is to use hardware offloads to
>   improve packet classification.
> 
> 2. Scope of work
> 
>    a) Optimization of software LACP packet classification by using
>       packet_type metadata to eliminate the requirement of parsing each
>       packet in the received burst.
> 
>    b) Implementation of classification mechanism using flow director to
>       redirect LACP packets to the dedicated queue (not visible by
>       application).
> 
>       - Filter pattern choosing (not all filters are supported by all
>         devices),
>       - Changing processing path to speed up non-LACP packets
>         processing,
>       - Handle LACP packets from dedicated Rx queue and send to the
>         dedicated Tx queue,
> 
>    c) Creation of fallback mechanism allowing to select the most
>       preferable method of processing:
> 
>       - Flow director,
>       - Packet type metadata,
>       - Software parsing,
> 
> 3. Implementation
> 
> 3.1. Packet type
> 
>    The packet_type approach would result in a performance improvement
>    as packets data would no longer be required to be read, but with this
>    approach the bonded driver would still need to look at the mbuf of
>    each packet thereby having an impact on the achievable Rx
>    performance.
> 
>    There's not packet_type value describing LACP packets directly.
>    However, it can be used to limit number of packets required to be
>    parsed, e.g. if packet_type indicates >L2 packets.
> 
>    It should improve performance while well-known non-LACP packets can
>    be skipped without the need to look up into its data.
> 
> 3.2. Flow director
> 
>    Using rte_flow API and pattern on ethernet type of packet (0x8809),
>    we can configure flow director to redirect slow packets to separated
>    queue.
> 
>    An independent Rx queues for LACP would remove the requirement to
>    filter all ingress traffic in sw which should result in a performance
>    increase. Other queues stay untouched and processing of packets on
>    the fast path will be reduced to simple packet collecting from
>    slaves.
> 
>    Separated Tx queue for LACP daemon allows to send LACP responses
>    immediately, without interfering into Tx fast path.
> 
>    RECEIVE
> 
>          .---------------.
>          | Slave 0       |
>          |      .------. |
>          |  Fd  | Rxq  | |
>    Rx ======o==>|      |==============.
>          |  |   +======+ |            |      .---------------.
>          |  `-->| LACP |--------.     |      | Bonding       |
>          |      `------' |      |     |      |      .------. |
>          `---------------'      |     |      |      |      | |
>                                 |     >============>|      |=======> Rx
>          .---------------.      |     |      |      +======+ |
>          | Slave 1       |      |     |      |      | XXXX | |
>          |      .------. |      |     |      |      `------' |
>          |  Fd  | Rxq  | |      |     |      `---------------'
>    Rx ======o==>|      |=============='        .-----------.
>          |  |   +======+ |      |             /             \
>          |  `-->| LACP |--------+----------->+  LACP DAEMON  |
>          |      `------' |             Tx <---\             /
>          `---------------'                     `-----------'
> 
>    All slow packets received by slaves in bonding are redirected to the
>    separated queue using flow director. Other packets are collected from
>    slaves and exposed to the application with Rx burst on bonded device.
> 
>    TRANSMIT
> 
>          .---------------.
>          | Slave 0       |
>          |      .------. |
>          |      |      | |
>    Tx <=====+===|      |<=============.
>          |  |   |------| |            |      .---------------.
>          |  `---| LACP |<-------.     |      | Bonding       |
>          |      `------' |      |     |      |      .------. |
>          `---------------'      |     |      |      |      | |
>                                 |     +<============|      |<====== Tx
>          .---------------.      |     |      |      +======+ |
>          | Slave 1       |      |     |      |      | XXXX | |
>          |      .------. |      |     |      |      `------' |
>          |      |      | |      |     |      `---------------'
>    Tx <=====+===|      |<============='  Rx    .-----------.
>          |  |   |------| |      |         `-->/             \
>          |  `---| LACP |<-------+------------+  LACP DAEMON  |
>          |      `------' |                    \             /
>          `---------------'                     `-----------'
> 
>    On transmit, packets are propagated on the slaves. While we have
>    separated Tx queue for LACP responses, it can be sent regardless of
>    the fast path.
> 
>    LACP DAEMON
> 
>    In this mode whole slow packets are handled in LACP DAEMON.
> 
> V3:
>  - Split hw filtering patch into 3 patches:
>  	- fix for calculating maximum number of tx/rx queues of bonding device
> 	- enable use of ptype hint for filtering of control plane packets in
> 	default enablement
> 	- enablement of dedicated queues for LACP control packet filtering.
> 
> Declan Doherty (1):
>   net/bond: calculate number of bonding tx/rx queues
> 
> Tomasz Kulasek (3):
>   net/bond: use ptype flags for LACP rx filtering
>   net/bond: dedicated hw queues for LACP control traffic
>   app/test-pmd: add cmd for dedicated LACP  rx/tx queues

Series applied to dpdk-next-net/master, thanks.

(minor updates, mentioned in mail thread, done while applying.)
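
As an aside, the flow director rule described in section 3.2 of the cover
letter quoted above is conceptually equivalent to the rte_flow sketch below
(the queue index, function name and error handling are illustrative; the real
rule is built inside the bonding PMD):

#include <rte_flow.h>
#include <rte_byteorder.h>

/* Redirect ethernet slow-protocol frames (ethertype 0x8809, i.e. LACP) to a
 * dedicated rx queue on one slave port. */
static int
redirect_lacp_to_queue(uint8_t port_id, uint16_t lacp_queue_id)
{
	struct rte_flow_error error;
	const struct rte_flow_attr attr = { .ingress = 1 };

	const struct rte_flow_item_eth eth_spec = {
		.type = rte_cpu_to_be_16(0x8809),
	};
	const struct rte_flow_item_eth eth_mask = {
		.type = 0xFFFF,
	};
	const struct rte_flow_item pattern[] = {
		{
			.type = RTE_FLOW_ITEM_TYPE_ETH,
			.spec = &eth_spec,
			.mask = &eth_mask,
		},
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};

	const struct rte_flow_action_queue queue = { .index = lacp_queue_id };
	const struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	if (rte_flow_create(port_id, &attr, pattern, actions, &error) == NULL)
		return -1;

	return 0;
}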

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic
  2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic Declan Doherty
                         ` (2 preceding siblings ...)
  2017-07-05 11:33       ` Ferruh Yigit
@ 2017-12-13  8:16       ` linhaifeng
  2017-12-13 12:41         ` Kulasek, TomaszX
  3 siblings, 1 reply; 23+ messages in thread
From: linhaifeng @ 2017-12-13  8:16 UTC (permalink / raw)
  To: Declan Doherty, dev; +Cc: Tomasz Kulasek

Hi,

What is the purpose of this patch? fix problem or improve performance?

On 2017/7/5 0:46, Declan Doherty wrote:
> From: Tomasz Kulasek <tomaszx.kulasek@intel.com>
>
> Add support for hardware flow classification of LACP control plane
> traffic to be redirected to a dedicated receive queue on each slave which
> is not visible to the application. Also enables a dedicated transmit queue
> for LACP traffic which allows complete decoupling of control and data
> paths.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic
  2017-12-13  8:16       ` linhaifeng
@ 2017-12-13 12:41         ` Kulasek, TomaszX
  0 siblings, 0 replies; 23+ messages in thread
From: Kulasek, TomaszX @ 2017-12-13 12:41 UTC (permalink / raw)
  To: linhaifeng, Doherty, Declan, dev

Hi,

> -----Original Message-----
> From: linhaifeng [mailto:haifeng.lin@huawei.com]
> Sent: Wednesday, December 13, 2017 09:16
> To: Doherty, Declan <declan.doherty@intel.com>; dev@dpdk.org
> Cc: Kulasek, TomaszX <tomaszx.kulasek@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP
> control traffic
> 
> Hi,
> 
> What is the purpose of this patch? fix problem or improve performance?
> 
> On 2017/7/5 0:46, Declan Doherty wrote:
> > From: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> >
> > Add support for hardware flow classification of LACP control plane
> > traffic to be redirected to a dedicated receive queue on each slave which
> > is not visible to the application. Also enables a dedicated transmit queue
> > for LACP traffic which allows complete decoupling of control and data
> > paths.
> 

This is a performance improvement.

Tomasz

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic
@ 2017-12-14  3:13 Linhaifeng
  0 siblings, 0 replies; 23+ messages in thread
From: Linhaifeng @ 2017-12-14  3:13 UTC (permalink / raw)
  To: Kulasek, TomaszX, Doherty, Declan, dev; +Cc: Lilijun (Jerry), jichaoyang

Hi, Tomasz

Thanks for the reply!

I thought that the patch was meant to resolve the "lacp loss" problem. We know that when the traffic
is large enough the bond may lose LACP packets and the slaves would go out of sync.

My question is, are there any solutions to the "lacp loss" problem?


-----Original Message-----
From: Kulasek, TomaszX [mailto:tomaszx.kulasek@intel.com]
Sent: 13 December 2017 20:42
To: Linhaifeng <haifeng.lin@huawei.com>; Doherty, Declan <declan.doherty@intel.com>; dev@dpdk.org
Subject: RE: [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic

Hi,

> -----Original Message-----
> From: linhaifeng [mailto:haifeng.lin@huawei.com]
> Sent: Wednesday, December 13, 2017 09:16
> To: Doherty, Declan <declan.doherty@intel.com>; dev@dpdk.org
> Cc: Kulasek, TomaszX <tomaszx.kulasek@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues 
> for LACP control traffic
> 
> Hi,
> 
> What is the purpose of this patch? fix problem or improve performance?
> 
> On 2017/7/5 0:46, Declan Doherty wrote:
> > From: Tomasz Kulasek <tomaszx.kulasek@intel.com>
> >
> > Add support for hardware flow classification of LACP control plane 
> > traffic to be redirected to a dedicated receive queue on each slave
> > which is not visible to the application. Also enables a dedicated
> > transmit queue for LACP traffic which allows complete decoupling of 
> > control and data paths.
> 

This is a performance improvement.

Tomasz

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2017-12-14  3:13 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-27 11:27 [dpdk-dev] [PATCH 0/2] LACP control packet filtering offload Tomasz Kulasek
2017-05-27 11:27 ` [dpdk-dev] [PATCH 1/2] " Tomasz Kulasek
2017-05-29  8:10   ` Adrien Mazarguil
2017-06-29  9:18   ` Declan Doherty
2017-05-27 11:27 ` [dpdk-dev] [PATCH 2/2] test-pmd: add set bonding slow_queue hw/sw Tomasz Kulasek
2017-06-29 16:20 ` [dpdk-dev] [PATCH v2 0/2] LACP control packet filtering offload Tomasz Kulasek
2017-06-29 16:20   ` [dpdk-dev] [PATCH v2 1/2] " Tomasz Kulasek
2017-06-29 16:20   ` [dpdk-dev] [PATCH v2 2/2] test-pmd: add set bonding slow_queue hw/sw Tomasz Kulasek
2017-07-04 16:46   ` [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration Declan Doherty
2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 1/4] net/bond: calculate number of bonding tx/rx queues Declan Doherty
2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 2/4] net/bond: use ptype flags for LACP rx filtering Declan Doherty
2017-07-04 19:54       ` Declan Doherty
2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic Declan Doherty
2017-07-04 19:55       ` Declan Doherty
2017-07-05 11:19       ` Ferruh Yigit
2017-07-05 11:33       ` Ferruh Yigit
2017-12-13  8:16       ` linhaifeng
2017-12-13 12:41         ` Kulasek, TomaszX
2017-07-04 16:46     ` [dpdk-dev] [PATCH v3 4/4] app/test-pmd: add cmd for dedicated LACP rx/tx queues Declan Doherty
2017-07-04 19:56       ` Declan Doherty
2017-07-05 11:33       ` Ferruh Yigit
2017-07-05 11:35     ` [dpdk-dev] [PATCH v3 0/4] LACP control packet filtering acceleration Ferruh Yigit
2017-12-14  3:13 [dpdk-dev] [PATCH v3 3/4] net/bond: dedicated hw queues for LACP control traffic Linhaifeng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).