DPDK patches and discussions
* [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance
@ 2017-03-02  7:07 Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 01/13] net/sfc: callbacks should depend on EvQ usage Andrew Rybchenko
                   ` (14 more replies)
  0 siblings, 15 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Implement EF10 (SFN7xxx and SFN8xxx) native datapaths, which may be
chosen per device using PCI whitelist device arguments.

The libefx-based datapath implementation is bound to the API and
structures imposed by libefx. It makes many indirect function calls to
provide HW abstraction (bad for the CPU pipeline) and touches many data
structures: driver Rx/Tx queue, driver event queue, libefx Rx/Tx
queue, libefx event queue and libefx NIC (bad for cache).

The native datapath implementation is fully separated from the control
path so that an alternative control path (e.g. kernel-aware) can be
used if required.

Native datapaths show better performance than the libefx-based one.

Andrew Rybchenko (13):
  net/sfc: callbacks should depend on EvQ usage
  net/sfc: emphasize that RSS hash flag is an Rx queue flag
  net/sfc: do not use Rx queue control state on datapath
  net/sfc: factor out libefx-based Rx datapath
  net/sfc: Rx scatter is a datapath-dependent feature
  net/sfc: implement EF10 native Rx datapath
  net/sfc: factor out libefx-based Tx datapath
  net/sfc: VLAN insertion is a datapath-dependent feature
  net/sfc: TSO is a datapath-dependent feature
  net/sfc: implement EF10 native Tx datapath
  net/sfc: multi-segment support is a Tx datapath feature
  net/sfc: implement simple EF10 native Tx datapath
  net/sfc: support Rx packed stream EF10-specific datapath

 doc/guides/nics/sfc_efx.rst      |  45 +++
 drivers/net/sfc/Makefile         |   4 +
 drivers/net/sfc/efsys.h          |   2 +-
 drivers/net/sfc/sfc.h            |   4 +
 drivers/net/sfc/sfc_dp.c         |  89 +++++
 drivers/net/sfc/sfc_dp.h         |  91 +++++
 drivers/net/sfc/sfc_dp_rx.h      | 216 ++++++++++++
 drivers/net/sfc/sfc_dp_tx.h      | 177 ++++++++++
 drivers/net/sfc/sfc_ef10_ps_rx.c | 659 ++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ef10_rx.c    | 713 +++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ef10_tx.c    | 517 ++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c     | 171 ++++++++--
 drivers/net/sfc/sfc_ev.c         | 272 +++++++++++++--
 drivers/net/sfc/sfc_ev.h         |  27 +-
 drivers/net/sfc/sfc_kvargs.c     |  11 +
 drivers/net/sfc/sfc_kvargs.h     |  19 ++
 drivers/net/sfc/sfc_rx.c         | 329 ++++++++++++++----
 drivers/net/sfc/sfc_rx.h         |  79 +++--
 drivers/net/sfc/sfc_tso.c        |  22 +-
 drivers/net/sfc/sfc_tx.c         | 331 +++++++++++++-----
 drivers/net/sfc/sfc_tx.h         |  95 ++++--
 21 files changed, 3604 insertions(+), 269 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_dp.c
 create mode 100644 drivers/net/sfc/sfc_dp.h
 create mode 100644 drivers/net/sfc/sfc_dp_rx.h
 create mode 100644 drivers/net/sfc/sfc_dp_tx.h
 create mode 100644 drivers/net/sfc/sfc_ef10_ps_rx.c
 create mode 100644 drivers/net/sfc/sfc_ef10_rx.c
 create mode 100644 drivers/net/sfc/sfc_ef10_tx.c

-- 
1.8.2.3


* [dpdk-dev] [PATCH 01/13] net/sfc: callbacks should depend on EvQ usage
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-04 21:04   ` Ferruh Yigit
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 02/13] net/sfc: emphasize that RSS hash flag is an Rx queue flag Andrew Rybchenko
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Use different sets of libefx EvQ callbacks for management,
transmit and receive event queues. This makes event handling
more robust against unexpected events.

It is also required for alternative datapath support.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_ev.c | 107 +++++++++++++++++++++++++++++++++++++++++++++--
 drivers/net/sfc/sfc_ev.h |  19 +++++----
 2 files changed, 114 insertions(+), 12 deletions(-)

diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index f717faa..c0f2218 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -69,6 +69,18 @@
 }
 
 static boolean_t
+sfc_ev_nop_rx(void *arg, uint32_t label, uint32_t id,
+	      uint32_t size, uint16_t flags)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa,
+		"EVQ %u unexpected Rx event label=%u id=%#x size=%u flags=%#x",
+		evq->evq_index, label, id, size, flags);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
 	  uint32_t size, uint16_t flags)
 {
@@ -142,6 +154,16 @@
 }
 
 static boolean_t
+sfc_ev_nop_tx(void *arg, uint32_t label, uint32_t id)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa, "EVQ %u unexpected Tx event label=%u id=%#x",
+		evq->evq_index, label, id);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_tx(void *arg, __rte_unused uint32_t label, uint32_t id)
 {
 	struct sfc_evq *evq = arg;
@@ -196,6 +218,16 @@
 }
 
 static boolean_t
+sfc_ev_nop_rxq_flush_done(void *arg, uint32_t rxq_hw_index)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa, "EVQ %u unexpected RxQ %u flush done",
+		evq->evq_index, rxq_hw_index);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_rxq_flush_done(void *arg, __rte_unused uint32_t rxq_hw_index)
 {
 	struct sfc_evq *evq = arg;
@@ -211,6 +243,16 @@
 }
 
 static boolean_t
+sfc_ev_nop_rxq_flush_failed(void *arg, uint32_t rxq_hw_index)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa, "EVQ %u unexpected RxQ %u flush failed",
+		evq->evq_index, rxq_hw_index);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_rxq_flush_failed(void *arg, __rte_unused uint32_t rxq_hw_index)
 {
 	struct sfc_evq *evq = arg;
@@ -226,6 +268,16 @@
 }
 
 static boolean_t
+sfc_ev_nop_txq_flush_done(void *arg, uint32_t txq_hw_index)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa, "EVQ %u unexpected TxQ %u flush done",
+		evq->evq_index, txq_hw_index);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_txq_flush_done(void *arg, __rte_unused uint32_t txq_hw_index)
 {
 	struct sfc_evq *evq = arg;
@@ -281,6 +333,16 @@
 }
 
 static boolean_t
+sfc_ev_nop_link_change(void *arg, __rte_unused efx_link_mode_t link_mode)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa, "EVQ %u unexpected link change event",
+		evq->evq_index);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_link_change(void *arg, efx_link_mode_t link_mode)
 {
 	struct sfc_evq *evq = arg;
@@ -312,17 +374,47 @@
 
 static const efx_ev_callbacks_t sfc_ev_callbacks = {
 	.eec_initialized	= sfc_ev_initialized,
+	.eec_rx			= sfc_ev_nop_rx,
+	.eec_tx			= sfc_ev_nop_tx,
+	.eec_exception		= sfc_ev_exception,
+	.eec_rxq_flush_done	= sfc_ev_nop_rxq_flush_done,
+	.eec_rxq_flush_failed	= sfc_ev_nop_rxq_flush_failed,
+	.eec_txq_flush_done	= sfc_ev_nop_txq_flush_done,
+	.eec_software		= sfc_ev_software,
+	.eec_sram		= sfc_ev_sram,
+	.eec_wake_up		= sfc_ev_wake_up,
+	.eec_timer		= sfc_ev_timer,
+	.eec_link_change	= sfc_ev_link_change,
+};
+
+static const efx_ev_callbacks_t sfc_ev_callbacks_rx = {
+	.eec_initialized	= sfc_ev_initialized,
 	.eec_rx			= sfc_ev_rx,
-	.eec_tx			= sfc_ev_tx,
+	.eec_tx			= sfc_ev_nop_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_rxq_flush_done,
 	.eec_rxq_flush_failed	= sfc_ev_rxq_flush_failed,
+	.eec_txq_flush_done	= sfc_ev_nop_txq_flush_done,
+	.eec_software		= sfc_ev_software,
+	.eec_sram		= sfc_ev_sram,
+	.eec_wake_up		= sfc_ev_wake_up,
+	.eec_timer		= sfc_ev_timer,
+	.eec_link_change	= sfc_ev_nop_link_change,
+};
+
+static const efx_ev_callbacks_t sfc_ev_callbacks_tx = {
+	.eec_initialized	= sfc_ev_initialized,
+	.eec_rx			= sfc_ev_nop_rx,
+	.eec_tx			= sfc_ev_tx,
+	.eec_exception		= sfc_ev_exception,
+	.eec_rxq_flush_done	= sfc_ev_nop_rxq_flush_done,
+	.eec_rxq_flush_failed	= sfc_ev_nop_rxq_flush_failed,
 	.eec_txq_flush_done	= sfc_ev_txq_flush_done,
 	.eec_software		= sfc_ev_software,
 	.eec_sram		= sfc_ev_sram,
 	.eec_wake_up		= sfc_ev_wake_up,
 	.eec_timer		= sfc_ev_timer,
-	.eec_link_change	= sfc_ev_link_change,
+	.eec_link_change	= sfc_ev_nop_link_change,
 };
 
 
@@ -334,7 +426,7 @@
 
 	/* Synchronize the DMA memory for reading not required */
 
-	efx_ev_qpoll(evq->common, &evq->read_ptr, &sfc_ev_callbacks, evq);
+	efx_ev_qpoll(evq->common, &evq->read_ptr, evq->callbacks, evq);
 
 	if (unlikely(evq->exception) && sfc_adapter_trylock(evq->sa)) {
 		struct sfc_adapter *sa = evq->sa;
@@ -425,6 +517,14 @@
 	if (rc != 0)
 		goto fail_ev_qcreate;
 
+	SFC_ASSERT(evq->rxq == NULL || evq->txq == NULL);
+	if (evq->rxq != 0)
+		evq->callbacks = &sfc_ev_callbacks_rx;
+	else if (evq->txq != 0)
+		evq->callbacks = &sfc_ev_callbacks_tx;
+	else
+		evq->callbacks = &sfc_ev_callbacks;
+
 	evq->init_state = SFC_EVQ_STARTING;
 
 	/* Wait for the initialization event */
@@ -483,6 +583,7 @@
 		return;
 
 	evq->init_state = SFC_EVQ_INITIALIZED;
+	evq->callbacks = NULL;
 	evq->read_ptr = 0;
 	evq->exception = B_FALSE;
 
diff --git a/drivers/net/sfc/sfc_ev.h b/drivers/net/sfc/sfc_ev.h
index 346e3ec..41a37f4 100644
--- a/drivers/net/sfc/sfc_ev.h
+++ b/drivers/net/sfc/sfc_ev.h
@@ -54,17 +54,18 @@ enum sfc_evq_state {
 
 struct sfc_evq {
 	/* Used on datapath */
-	efx_evq_t		*common;
-	unsigned int		read_ptr;
-	boolean_t		exception;
-	efsys_mem_t		mem;
-	struct sfc_rxq		*rxq;
-	struct sfc_txq		*txq;
+	efx_evq_t			*common;
+	const efx_ev_callbacks_t	*callbacks;
+	unsigned int			read_ptr;
+	boolean_t			exception;
+	efsys_mem_t			mem;
+	struct sfc_rxq			*rxq;
+	struct sfc_txq			*txq;
 
 	/* Not used on datapath */
-	struct sfc_adapter	*sa;
-	unsigned int		evq_index;
-	enum sfc_evq_state	init_state;
+	struct sfc_adapter		*sa;
+	unsigned int			evq_index;
+	enum sfc_evq_state		init_state;
 };
 
 struct sfc_evq_info {
-- 
1.8.2.3


* [dpdk-dev] [PATCH 02/13] net/sfc: emphasize that RSS hash flag is an Rx queue flag
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 01/13] net/sfc: callbacks should depend on EvQ usage Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 03/13] net/sfc: do not use Rx queue control state on datapath Andrew Rybchenko
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Style fix to establish a namespace for the Rx queue flag defines.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_rx.c | 4 ++--
 drivers/net/sfc/sfc_rx.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 906536e..e72cc3b 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -192,7 +192,7 @@
 	uint8_t *mbuf_data;
 
 
-	if ((rxq->flags & SFC_RXQ_RSS_HASH) == 0)
+	if ((rxq->flags & SFC_RXQ_FLAG_RSS_HASH) == 0)
 		return;
 
 	mbuf_data = rte_pktmbuf_mtod(m, uint8_t *);
@@ -715,7 +715,7 @@
 
 #if EFSYS_OPT_RX_SCALE
 	if (sa->hash_support == EFX_RX_HASH_AVAILABLE)
-		rxq->flags |= SFC_RXQ_RSS_HASH;
+		rxq->flags |= SFC_RXQ_FLAG_RSS_HASH;
 #endif
 
 	rxq->state = SFC_RXQ_INITIALIZED;
diff --git a/drivers/net/sfc/sfc_rx.h b/drivers/net/sfc/sfc_rx.h
index 45b1d77..622d66a 100644
--- a/drivers/net/sfc/sfc_rx.h
+++ b/drivers/net/sfc/sfc_rx.h
@@ -85,7 +85,7 @@ struct sfc_rxq {
 	uint16_t		prefix_size;
 #if EFSYS_OPT_RX_SCALE
 	unsigned int		flags;
-#define SFC_RXQ_RSS_HASH	0x1
+#define SFC_RXQ_FLAG_RSS_HASH	0x1
 #endif
 
 	/* Used on refill */
-- 
1.8.2.3


* [dpdk-dev] [PATCH 03/13] net/sfc: do not use Rx queue control state on datapath
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 01/13] net/sfc: callbacks should depend on EvQ usage Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 02/13] net/sfc: emphasize that RSS hash flag is an Rx queue flag Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 04/13] net/sfc: factor out libefx-based Rx datapath Andrew Rybchenko
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Rx queue flags should keep the information required on the datapath.

This is a preparation step for splitting the control and data paths.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_ev.c |  5 +++--
 drivers/net/sfc/sfc_rx.c | 12 +++++++-----
 drivers/net/sfc/sfc_rx.h | 12 +++++-------
 3 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index c0f2218..7f79c7e 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -99,7 +99,7 @@
 
 	SFC_ASSERT(rxq != NULL);
 	SFC_ASSERT(rxq->evq == evq);
-	SFC_ASSERT(rxq->state & SFC_RXQ_STARTED);
+	SFC_ASSERT(rxq->flags & SFC_RXQ_FLAG_STARTED);
 
 	stop = (id + 1) & rxq->ptr_mask;
 	pending_id = rxq->pending & rxq->ptr_mask;
@@ -432,7 +432,8 @@
 		struct sfc_adapter *sa = evq->sa;
 		int rc;
 
-		if ((evq->rxq != NULL) && (evq->rxq->state & SFC_RXQ_RUNNING)) {
+		if ((evq->rxq != NULL) &&
+		    (evq->rxq->flags & SFC_RXQ_FLAG_RUNNING)) {
 			unsigned int rxq_sw_index = sfc_rxq_sw_index(evq->rxq);
 
 			sfc_warn(sa,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index e72cc3b..61654fc 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -217,7 +217,7 @@
 	boolean_t discard_next = B_FALSE;
 	struct rte_mbuf *scatter_pkt = NULL;
 
-	if (unlikely((rxq->state & SFC_RXQ_RUNNING) == 0))
+	if (unlikely((rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0))
 		return 0;
 
 	sfc_ev_qpoll(rxq->evq);
@@ -318,7 +318,7 @@
 	SFC_ASSERT(sw_index < sa->rxq_count);
 	rxq = sa->rxq_info[sw_index].rxq;
 
-	if (rxq == NULL || (rxq->state & SFC_RXQ_RUNNING) == 0)
+	if (rxq == NULL || (rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0)
 		return 0;
 
 	sfc_ev_qpoll(rxq->evq);
@@ -329,7 +329,7 @@
 int
 sfc_rx_qdesc_done(struct sfc_rxq *rxq, unsigned int offset)
 {
-	if ((rxq->state & SFC_RXQ_RUNNING) == 0)
+	if ((rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0)
 		return 0;
 
 	sfc_ev_qpoll(rxq->evq);
@@ -434,7 +434,8 @@
 
 	rxq->pending = rxq->completed = rxq->added = rxq->pushed = 0;
 
-	rxq->state |= (SFC_RXQ_STARTED | SFC_RXQ_RUNNING);
+	rxq->state |= SFC_RXQ_STARTED;
+	rxq->flags |= SFC_RXQ_FLAG_STARTED | SFC_RXQ_FLAG_RUNNING;
 
 	sfc_rx_qrefill(rxq);
 
@@ -483,13 +484,14 @@
 	sa->eth_dev->data->rx_queue_state[sw_index] =
 		RTE_ETH_QUEUE_STATE_STOPPED;
 
-	rxq->state &= ~SFC_RXQ_RUNNING;
+	rxq->flags &= ~SFC_RXQ_FLAG_RUNNING;
 
 	if (sw_index == 0)
 		efx_mac_filter_default_rxq_clear(sa->nic);
 
 	sfc_rx_qflush(sa, sw_index);
 
+	rxq->flags &= ~SFC_RXQ_FLAG_STARTED;
 	rxq->state = SFC_RXQ_INITIALIZED;
 
 	efx_rx_qdestroy(rxq->common);
diff --git a/drivers/net/sfc/sfc_rx.h b/drivers/net/sfc/sfc_rx.h
index 622d66a..881453c 100644
--- a/drivers/net/sfc/sfc_rx.h
+++ b/drivers/net/sfc/sfc_rx.h
@@ -59,8 +59,6 @@ enum sfc_rxq_state_bit {
 #define SFC_RXQ_INITIALIZED	(1 << SFC_RXQ_INITIALIZED_BIT)
 	SFC_RXQ_STARTED_BIT,
 #define SFC_RXQ_STARTED		(1 << SFC_RXQ_STARTED_BIT)
-	SFC_RXQ_RUNNING_BIT,
-#define SFC_RXQ_RUNNING		(1 << SFC_RXQ_RUNNING_BIT)
 	SFC_RXQ_FLUSHING_BIT,
 #define SFC_RXQ_FLUSHING	(1 << SFC_RXQ_FLUSHING_BIT)
 	SFC_RXQ_FLUSHED_BIT,
@@ -77,16 +75,15 @@ struct sfc_rxq {
 	/* Used on data path */
 	struct sfc_evq		*evq;
 	struct sfc_rx_sw_desc	*sw_desc;
-	unsigned int		state;
+	unsigned int		flags;
+#define SFC_RXQ_FLAG_STARTED	0x1
+#define SFC_RXQ_FLAG_RUNNING	0x2
+#define SFC_RXQ_FLAG_RSS_HASH	0x4
 	unsigned int		ptr_mask;
 	unsigned int		pending;
 	unsigned int		completed;
 	uint16_t		batch_max;
 	uint16_t		prefix_size;
-#if EFSYS_OPT_RX_SCALE
-	unsigned int		flags;
-#define SFC_RXQ_FLAG_RSS_HASH	0x1
-#endif
 
 	/* Used on refill */
 	unsigned int		added;
@@ -100,6 +97,7 @@ struct sfc_rxq {
 
 	/* Not used on data path */
 	unsigned int		hw_index;
+	unsigned int		state;
 };
 
 static inline unsigned int
-- 
1.8.2.3


* [dpdk-dev] [PATCH 04/13] net/sfc: factor out libefx-based Rx datapath
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (2 preceding siblings ...)
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 03/13] net/sfc: do not use Rx queue control state on datapath Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-04 21:05   ` Ferruh Yigit
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 05/13] net/sfc: Rx scatter is a datapath-dependent feature Andrew Rybchenko
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Split the control path and datapath to make the datapath substitutable
and possibly reusable with an alternative control path.

The libefx-based Rx datapath is bound to the libefx control path, but
it should be possible to use other datapaths with alternative control
path(s).

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst  |   7 +
 drivers/net/sfc/Makefile     |   1 +
 drivers/net/sfc/sfc.h        |   3 +
 drivers/net/sfc/sfc_dp.c     |  87 ++++++++++++
 drivers/net/sfc/sfc_dp.h     |  86 ++++++++++++
 drivers/net/sfc/sfc_dp_rx.h  | 184 +++++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c | 105 ++++++++++++---
 drivers/net/sfc/sfc_ev.c     |  75 ++++++++---
 drivers/net/sfc/sfc_ev.h     |   4 +-
 drivers/net/sfc/sfc_kvargs.c |  10 ++
 drivers/net/sfc/sfc_kvargs.h |   8 ++
 drivers/net/sfc/sfc_rx.c     | 313 ++++++++++++++++++++++++++++++++++---------
 drivers/net/sfc/sfc_rx.h     |  73 ++++++----
 13 files changed, 824 insertions(+), 132 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_dp.c
 create mode 100644 drivers/net/sfc/sfc_dp.h
 create mode 100644 drivers/net/sfc/sfc_dp_rx.h

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 0a05a0a..8bd8a0c 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -181,6 +181,13 @@ whitelist option like "-w 02:00.0,arg1=value1,...".
 Case-insensitive 1/y/yes/on or 0/n/no/off may be used to specify
 boolean parameters value.
 
+- ``rx_datapath`` [auto|efx] (default **auto**)
+
+  Choose receive datapath implementation.
+  **auto** allows the driver itself to make a choice based on firmware
+  features available and required by the datapath implementation.
+  **efx** chooses libefx-based datapath which supports Rx scatter.
+
 - ``perf_profile`` [auto|throughput|low-latency] (default **throughput**)
 
   Choose hardware tunning to be optimized for either throughput or
diff --git a/drivers/net/sfc/Makefile b/drivers/net/sfc/Makefile
index 619a0ed..befff4c 100644
--- a/drivers/net/sfc/Makefile
+++ b/drivers/net/sfc/Makefile
@@ -90,6 +90,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_port.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_rx.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tx.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tso.c
+SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_dp.c
 
 VPATH += $(SRCDIR)/base
 
diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 8c6c02f..2512f2e 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -136,6 +136,7 @@ struct sfc_intr {
 struct sfc_evq_info;
 struct sfc_rxq_info;
 struct sfc_txq_info;
+struct sfc_dp_rx;
 
 struct sfc_port {
 	unsigned int			lsc_seq;
@@ -209,6 +210,8 @@ struct sfc_adapter {
 	unsigned int			rss_tbl[EFX_RSS_TBL_SIZE];
 	uint8_t				rss_key[SFC_RSS_KEY_SIZE];
 #endif
+
+	const struct sfc_dp_rx		*dp_rx;
 };
 
 /*
diff --git a/drivers/net/sfc/sfc_dp.c b/drivers/net/sfc/sfc_dp.c
new file mode 100644
index 0000000..c4d5fb3
--- /dev/null
+++ b/drivers/net/sfc/sfc_dp.c
@@ -0,0 +1,87 @@
+/*-
+ * Copyright (c) 2017 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <sys/queue.h>
+#include <string.h>
+#include <errno.h>
+
+#include <rte_log.h>
+
+#include "sfc_dp.h"
+
+struct sfc_dp *
+sfc_dp_find_by_name(struct sfc_dp_list *head, enum sfc_dp_type type,
+		    const char *name)
+{
+	struct sfc_dp *entry;
+
+	TAILQ_FOREACH(entry, head, links) {
+		if (entry->type != type)
+			continue;
+
+		if (strcmp(entry->name, name) == 0)
+			return entry;
+	}
+
+	return NULL;
+}
+
+struct sfc_dp *
+sfc_dp_find_by_caps(struct sfc_dp_list *head, enum sfc_dp_type type,
+		    unsigned int avail_caps)
+{
+	struct sfc_dp *entry;
+
+	TAILQ_FOREACH(entry, head, links) {
+		if (entry->type != type)
+			continue;
+
+		/* Take the first matching */
+		if (sfc_dp_match_hw_fw_caps(entry, avail_caps))
+			return entry;
+	}
+
+	return NULL;
+}
+
+int
+sfc_dp_register(struct sfc_dp_list *head, struct sfc_dp *entry)
+{
+	if (sfc_dp_find_by_name(head, entry->type, entry->name) != NULL) {
+		rte_log(RTE_LOG_ERR, RTE_LOGTYPE_PMD,
+			"sfc %s datapath '%s' already registered\n",
+			entry->type == SFC_DP_RX ? "Rx" : "unknown",
+			entry->name);
+		return EEXIST;
+	}
+
+	TAILQ_INSERT_TAIL(head, entry, links);
+
+	return 0;
+}
diff --git a/drivers/net/sfc/sfc_dp.h b/drivers/net/sfc/sfc_dp.h
new file mode 100644
index 0000000..8f78f98
--- /dev/null
+++ b/drivers/net/sfc/sfc_dp.h
@@ -0,0 +1,86 @@
+/*-
+ * Copyright (c) 2017 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _SFC_DP_H
+#define _SFC_DP_H
+
+#include <stdbool.h>
+#include <sys/queue.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define SFC_DIV_ROUND_UP(a, b) \
+	__extension__ ({		\
+		typeof(a) _a = (a);	\
+		typeof(b) _b = (b);	\
+					\
+		(_a + (_b - 1)) / _b;	\
+	})
+
+/**
+ * Datapath exception handler to be provided by the control path.
+ */
+typedef void (sfc_dp_exception_t)(void *ctrl);
+
+enum sfc_dp_type {
+	SFC_DP_RX = 0,	/**< Receive datapath */
+};
+
+/** Datapath definition */
+struct sfc_dp {
+	TAILQ_ENTRY(sfc_dp)		links;
+	const char			*name;
+	enum sfc_dp_type		type;
+	/* Mask of required hardware/firmware capabilities */
+	unsigned int			hw_fw_caps;
+};
+
+/** List of datapath variants */
+TAILQ_HEAD(sfc_dp_list, sfc_dp);
+
+/* Check if available HW/FW capabilities are sufficient for the datapath */
+static inline bool
+sfc_dp_match_hw_fw_caps(const struct sfc_dp *dp, unsigned int avail_caps)
+{
+	return (dp->hw_fw_caps & avail_caps) == dp->hw_fw_caps;
+}
+
+struct sfc_dp *sfc_dp_find_by_name(struct sfc_dp_list *head,
+				   enum sfc_dp_type type, const char *name);
+struct sfc_dp *sfc_dp_find_by_caps(struct sfc_dp_list *head,
+				   enum sfc_dp_type type,
+				   unsigned int avail_caps);
+int sfc_dp_register(struct sfc_dp_list *head, struct sfc_dp *entry);
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* _SFC_DP_H */
diff --git a/drivers/net/sfc/sfc_dp_rx.h b/drivers/net/sfc/sfc_dp_rx.h
new file mode 100644
index 0000000..5a714a1
--- /dev/null
+++ b/drivers/net/sfc/sfc_dp_rx.h
@@ -0,0 +1,184 @@
+/*-
+ * Copyright (c) 2017 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _SFC_DP_RX_H
+#define _SFC_DP_RX_H
+
+#include <rte_mempool.h>
+#include <rte_ethdev.h>
+
+#include "sfc_dp.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct sfc_dp_rxq;
+
+/**
+ * Callback to get control path receive queue by datapath receive queue
+ * handle.
+ */
+typedef void * (sfc_dp_rxq_get_ctrl_t)(struct sfc_dp_rxq *dp_rxq);
+
+/** Datapath receive queue operations */
+struct sfc_dp_rxq_ops {
+	sfc_dp_rxq_get_ctrl_t		*get_ctrl;
+};
+
+/**
+ * Generic receive queue information used on data path.
+ * It must be kept as small as possible since it is built into
+ * the structure used on datapath.
+ */
+struct sfc_dp_rxq {
+	const struct sfc_dp_rxq_ops	*ops;
+};
+
+/**
+ * Datapath receive queue creation arguments.
+ *
+ * The structure is used just to pass information from the control path
+ * to the datapath. It could be just function arguments, but that would
+ * hardly be readable.
+ */
+struct sfc_dp_rx_qcreate_args {
+	/** Memory pool to allocate Rx buffer from */
+	struct rte_mempool	*refill_mb_pool;
+	/** Minimum number of unused Rx descriptors to do refill */
+	unsigned int		refill_threshold;
+	/**
+	 * Usable mbuf data space in accordance with alignment and
+	 * padding requirements imposed by HW.
+	 */
+	unsigned int		buf_size;
+	/** Port number to be set in the mbuf */
+	uint8_t			port_id;
+
+	/**
+	 * Maximum number of Rx descriptors completed in one Rx event.
+	 * Just for sanity checks, if the datapath would like to do any.
+	 */
+	unsigned int		batch_max;
+
+	/** Pseudo-header size */
+	unsigned int		prefix_size;
+
+	/** Receive queue flags initializer */
+	unsigned int		flags;
+#define SFC_RXQ_FLAG_RSS_HASH	0x1
+
+	/** Rx queue size */
+	unsigned int		rxq_entries;
+};
+
+/**
+ * Allocate and initialize datapath receive queue.
+ *
+ * @param ctrl		Control path Rx queue opaque handle
+ * @param exception	Datapath exception handler to bail out to control path
+ * @param socket_id	Socket ID to allocate memory
+ * @param args		Function arguments wrapped in structure
+ * @param dp_rxqp	Location for generic datapath receive queue pointer
+ *
+ * @return 0 or positive errno.
+ */
+typedef int (sfc_dp_rx_qcreate_t)(void *ctrl, sfc_dp_exception_t *exception,
+				  int socket_id,
+				  const struct sfc_dp_rx_qcreate_args *args,
+				  struct sfc_dp_rxq **dp_rxqp);
+
+/**
+ * Free resources allocated for datapath receive queue.
+ */
+typedef void (sfc_dp_rx_qdestroy_t)(struct sfc_dp_rxq *dp_rxq);
+
+/**
+ * Receive queue start callback.
+ *
+ * It hands over the EvQ to the datapath.
+ */
+typedef int (sfc_dp_rx_qstart_t)(struct sfc_dp_rxq *dp_rxq,
+				 unsigned int evq_read_ptr);
+
+/**
+ * Receive queue stop function called before flush.
+ */
+typedef void (sfc_dp_rx_qstop_t)(struct sfc_dp_rxq *dp_rxq,
+				 unsigned int *evq_read_ptr);
+
+/**
+ * Receive queue purge function called after queue flush.
+ *
+ * Should be used to free unused receive buffers.
+ */
+typedef void (sfc_dp_rx_qpurge_t)(struct sfc_dp_rxq *dp_rxq);
+
+/** Get packet types recognized/classified */
+typedef const uint32_t * (sfc_dp_rx_supported_ptypes_get_t)(void);
+
+/** Get number of pending Rx descriptors */
+typedef unsigned int (sfc_dp_rx_qdesc_npending_t)(struct sfc_dp_rxq *dp_rxq);
+
+/** Receive datapath definition */
+struct sfc_dp_rx {
+	struct sfc_dp				dp;
+
+	sfc_dp_rx_qcreate_t			*qcreate;
+	sfc_dp_rx_qdestroy_t			*qdestroy;
+	sfc_dp_rx_qstart_t			*qstart;
+	sfc_dp_rx_qstop_t			*qstop;
+	sfc_dp_rx_qpurge_t			*qpurge;
+	sfc_dp_rx_supported_ptypes_get_t	*supported_ptypes_get;
+	sfc_dp_rx_qdesc_npending_t		*qdesc_npending;
+	eth_rx_burst_t				pkt_burst;
+};
+
+static inline struct sfc_dp_rx *
+sfc_dp_find_rx_by_name(struct sfc_dp_list *head, const char *name)
+{
+	struct sfc_dp *p = sfc_dp_find_by_name(head, SFC_DP_RX, name);
+
+	return (p == NULL) ? NULL : container_of(p, struct sfc_dp_rx, dp);
+}
+
+static inline struct sfc_dp_rx *
+sfc_dp_find_rx_by_caps(struct sfc_dp_list *head, unsigned int avail_caps)
+{
+	struct sfc_dp *p = sfc_dp_find_by_caps(head, SFC_DP_RX, avail_caps);
+
+	return (p == NULL) ? NULL : container_of(p, struct sfc_dp_rx, dp);
+}
+
+extern struct sfc_dp_rx sfc_efx_rx;
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* _SFC_DP_RX_H */
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 71587fb..3207cf4 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -30,6 +30,7 @@
 #include <rte_dev.h>
 #include <rte_ethdev.h>
 #include <rte_pci.h>
+#include <rte_errno.h>
 
 #include "efx.h"
 
@@ -40,7 +41,11 @@
 #include "sfc_ev.h"
 #include "sfc_rx.h"
 #include "sfc_tx.h"
+#include "sfc_dp.h"
+#include "sfc_dp_rx.h"
 
+static struct sfc_dp_list sfc_dp_head =
+	TAILQ_HEAD_INITIALIZER(sfc_dp_head);
 
 static void
 sfc_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
@@ -114,19 +119,9 @@
 static const uint32_t *
 sfc_dev_supported_ptypes_get(struct rte_eth_dev *dev)
 {
-	static const uint32_t ptypes[] = {
-		RTE_PTYPE_L2_ETHER,
-		RTE_PTYPE_L3_IPV4_EXT_UNKNOWN,
-		RTE_PTYPE_L3_IPV6_EXT_UNKNOWN,
-		RTE_PTYPE_L4_TCP,
-		RTE_PTYPE_L4_UDP,
-		RTE_PTYPE_UNKNOWN
-	};
-
-	if (dev->rx_pkt_burst == sfc_recv_pkts)
-		return ptypes;
-
-	return NULL;
+	struct sfc_adapter *sa = dev->data->dev_private;
+
+	return sa->dp_rx->supported_ptypes_get();
 }
 
 static int
@@ -366,7 +361,7 @@
 	if (rc != 0)
 		goto fail_rx_qinit;
 
-	dev->data->rx_queues[rx_queue_id] = sa->rxq_info[rx_queue_id].rxq;
+	dev->data->rx_queues[rx_queue_id] = sa->rxq_info[rx_queue_id].rxq->dp;
 
 	sfc_adapter_unlock(sa);
 
@@ -381,13 +376,15 @@
 static void
 sfc_rx_queue_release(void *queue)
 {
-	struct sfc_rxq *rxq = queue;
+	struct sfc_dp_rxq *dp_rxq = queue;
+	struct sfc_rxq *rxq;
 	struct sfc_adapter *sa;
 	unsigned int sw_index;
 
-	if (rxq == NULL)
+	if (dp_rxq == NULL)
 		return;
 
+	rxq = dp_rxq->ops->get_ctrl(dp_rxq);
 	sa = rxq->evq->sa;
 	sfc_adapter_lock(sa);
 
@@ -905,9 +902,9 @@
 static int
 sfc_rx_descriptor_done(void *queue, uint16_t offset)
 {
-	struct sfc_rxq *rxq = queue;
+	struct sfc_dp_rxq *dp_rxq = queue;
 
-	return sfc_rx_qdesc_done(rxq, offset);
+	return sfc_rx_qdesc_done(dp_rxq, offset);
 }
 
 static int
@@ -1230,6 +1227,69 @@
 };
 
 static int
+sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
+{
+	struct sfc_adapter *sa = dev->data->dev_private;
+	unsigned int avail_caps = 0;
+	const char *rx_name = NULL;
+	int rc;
+
+	if (sa == NULL || sa->state == SFC_ADAPTER_UNINITIALIZED)
+		return -E_RTE_SECONDARY;
+
+	rc = sfc_kvargs_process(sa, SFC_KVARG_RX_DATAPATH,
+				sfc_kvarg_string_handler, &rx_name);
+	if (rc != 0)
+		goto fail_kvarg_rx_datapath;
+
+	if (rx_name != NULL) {
+		sa->dp_rx = sfc_dp_find_rx_by_name(&sfc_dp_head, rx_name);
+		if (sa->dp_rx == NULL) {
+			sfc_err(sa, "Rx datapath %s not found", rx_name);
+			rc = ENOENT;
+			goto fail_dp_rx;
+		}
+		if (!sfc_dp_match_hw_fw_caps(&sa->dp_rx->dp, avail_caps)) {
+			sfc_err(sa,
+				"Insufficient HW/FW capabilities to use Rx datapath %s",
+				rx_name);
+			rc = EINVAL;
+			goto fail_dp_rx;
+		}
+	} else {
+		sa->dp_rx = sfc_dp_find_rx_by_caps(&sfc_dp_head, avail_caps);
+		if (sa->dp_rx == NULL) {
+			sfc_err(sa, "Rx datapath by caps %#x not found",
+				avail_caps);
+			rc = ENOENT;
+			goto fail_dp_rx;
+		}
+	}
+
+	sfc_info(sa, "use %s Rx datapath", sa->dp_rx->dp.name);
+
+	dev->rx_pkt_burst = sa->dp_rx->pkt_burst;
+
+	dev->tx_pkt_burst = sfc_xmit_pkts;
+
+	dev->dev_ops = &sfc_eth_dev_ops;
+
+	return 0;
+
+fail_dp_rx:
+fail_kvarg_rx_datapath:
+	return rc;
+}
+
+static void
+sfc_register_dp(void)
+{
+	/* Register once */
+	if (TAILQ_EMPTY(&sfc_dp_head))
+		sfc_dp_register(&sfc_dp_head, &sfc_efx_rx.dp);
+}
+
+static int
 sfc_eth_dev_init(struct rte_eth_dev *dev)
 {
 	struct sfc_adapter *sa = dev->data->dev_private;
@@ -1238,6 +1298,8 @@
 	const efx_nic_cfg_t *encp;
 	const struct ether_addr *from;
 
+	sfc_register_dp();
+
 	/* Required for logging */
 	sa->eth_dev = dev;
 
@@ -1278,12 +1340,10 @@
 	from = (const struct ether_addr *)(encp->enc_mac_addr);
 	ether_addr_copy(from, &dev->data->mac_addrs[0]);
 
-	dev->dev_ops = &sfc_eth_dev_ops;
-	dev->rx_pkt_burst = &sfc_recv_pkts;
-	dev->tx_pkt_burst = &sfc_xmit_pkts;
-
 	sfc_adapter_unlock(sa);
 
+	sfc_eth_dev_set_ops(dev);
+
 	sfc_log_init(sa, "done");
 	return 0;
 
@@ -1358,6 +1418,7 @@
 RTE_PMD_REGISTER_PCI_TABLE(net_sfc_efx, pci_id_sfc_efx_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_sfc_efx, "* igb_uio | uio_pci_generic | vfio");
 RTE_PMD_REGISTER_PARAM_STRING(net_sfc_efx,
+	SFC_KVARG_RX_DATAPATH "=" SFC_KVARG_VALUES_RX_DATAPATH " "
 	SFC_KVARG_PERF_PROFILE "=" SFC_KVARG_VALUES_PERF_PROFILE " "
 	SFC_KVARG_MCDI_LOGGING "=" SFC_KVARG_VALUES_BOOL " "
 	SFC_KVARG_DEBUG_INIT "=" SFC_KVARG_VALUES_BOOL);
diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index 7f79c7e..dca3a81 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -81,25 +81,25 @@
 }
 
 static boolean_t
-sfc_ev_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
-	  uint32_t size, uint16_t flags)
+sfc_ev_efx_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
+	      uint32_t size, uint16_t flags)
 {
 	struct sfc_evq *evq = arg;
-	struct sfc_rxq *rxq;
+	struct sfc_efx_rxq *rxq;
 	unsigned int stop;
 	unsigned int pending_id;
 	unsigned int delta;
 	unsigned int i;
-	struct sfc_rx_sw_desc *rxd;
+	struct sfc_efx_rx_sw_desc *rxd;
 
 	if (unlikely(evq->exception))
 		goto done;
 
-	rxq = evq->rxq;
+	rxq = sfc_efx_rxq_by_dp_rxq(evq->dp_rxq);
 
 	SFC_ASSERT(rxq != NULL);
 	SFC_ASSERT(rxq->evq == evq);
-	SFC_ASSERT(rxq->flags & SFC_RXQ_FLAG_STARTED);
+	SFC_ASSERT(rxq->flags & SFC_EFX_RXQ_FLAG_STARTED);
 
 	stop = (id + 1) & rxq->ptr_mask;
 	pending_id = rxq->pending & rxq->ptr_mask;
@@ -117,7 +117,9 @@
 			sfc_err(evq->sa,
 				"EVQ %u RxQ %u invalid RX abort "
 				"(id=%#x size=%u flags=%#x); needs restart",
-				evq->evq_index, sfc_rxq_sw_index(rxq),
+				evq->evq_index,
+				sfc_rxq_sw_index_by_hw_index(
+					rxq->ctrl->hw_index),
 				id, size, flags);
 			goto done;
 		}
@@ -132,8 +134,9 @@
 		sfc_err(evq->sa,
 			"EVQ %u RxQ %u completion out of order "
 			"(id=%#x delta=%u flags=%#x); needs restart",
-			evq->evq_index, sfc_rxq_sw_index(rxq), id, delta,
-			flags);
+			evq->evq_index,
+			sfc_rxq_sw_index_by_hw_index(rxq->ctrl->hw_index),
+			id, delta, flags);
 
 		goto done;
 	}
@@ -231,9 +234,13 @@
 sfc_ev_rxq_flush_done(void *arg, __rte_unused uint32_t rxq_hw_index)
 {
 	struct sfc_evq *evq = arg;
+	struct sfc_dp_rxq *dp_rxq;
 	struct sfc_rxq *rxq;
 
-	rxq = evq->rxq;
+	dp_rxq = evq->dp_rxq;
+	SFC_ASSERT(dp_rxq != NULL);
+
+	rxq = dp_rxq->ops->get_ctrl(dp_rxq);
 	SFC_ASSERT(rxq != NULL);
 	SFC_ASSERT(rxq->hw_index == rxq_hw_index);
 	SFC_ASSERT(rxq->evq == evq);
@@ -256,9 +263,13 @@
 sfc_ev_rxq_flush_failed(void *arg, __rte_unused uint32_t rxq_hw_index)
 {
 	struct sfc_evq *evq = arg;
+	struct sfc_dp_rxq *dp_rxq;
 	struct sfc_rxq *rxq;
 
-	rxq = evq->rxq;
+	dp_rxq = evq->dp_rxq;
+	SFC_ASSERT(dp_rxq != NULL);
+
+	rxq = dp_rxq->ops->get_ctrl(dp_rxq);
 	SFC_ASSERT(rxq != NULL);
 	SFC_ASSERT(rxq->hw_index == rxq_hw_index);
 	SFC_ASSERT(rxq->evq == evq);
@@ -387,9 +398,24 @@
 	.eec_link_change	= sfc_ev_link_change,
 };
 
-static const efx_ev_callbacks_t sfc_ev_callbacks_rx = {
+static const efx_ev_callbacks_t sfc_ev_callbacks_efx_rx = {
 	.eec_initialized	= sfc_ev_initialized,
-	.eec_rx			= sfc_ev_rx,
+	.eec_rx			= sfc_ev_efx_rx,
+	.eec_tx			= sfc_ev_nop_tx,
+	.eec_exception		= sfc_ev_exception,
+	.eec_rxq_flush_done	= sfc_ev_rxq_flush_done,
+	.eec_rxq_flush_failed	= sfc_ev_rxq_flush_failed,
+	.eec_txq_flush_done	= sfc_ev_nop_txq_flush_done,
+	.eec_software		= sfc_ev_software,
+	.eec_sram		= sfc_ev_sram,
+	.eec_wake_up		= sfc_ev_wake_up,
+	.eec_timer		= sfc_ev_timer,
+	.eec_link_change	= sfc_ev_nop_link_change,
+};
+
+static const efx_ev_callbacks_t sfc_ev_callbacks_dp_rx = {
+	.eec_initialized	= sfc_ev_initialized,
+	.eec_rx			= sfc_ev_nop_rx,
 	.eec_tx			= sfc_ev_nop_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_rxq_flush_done,
@@ -432,9 +458,12 @@
 		struct sfc_adapter *sa = evq->sa;
 		int rc;
 
-		if ((evq->rxq != NULL) &&
-		    (evq->rxq->flags & SFC_RXQ_FLAG_RUNNING)) {
-			unsigned int rxq_sw_index = sfc_rxq_sw_index(evq->rxq);
+		if (evq->dp_rxq != NULL) {
+			struct sfc_rxq *rxq;
+			unsigned int rxq_sw_index;
+
+			rxq = evq->dp_rxq->ops->get_ctrl(evq->dp_rxq);
+			rxq_sw_index = sfc_rxq_sw_index(rxq);
 
 			sfc_warn(sa,
 				 "restart RxQ %u because of exception on its EvQ %u",
@@ -518,13 +547,17 @@
 	if (rc != 0)
 		goto fail_ev_qcreate;
 
-	SFC_ASSERT(evq->rxq == NULL || evq->txq == NULL);
-	if (evq->rxq != 0)
-		evq->callbacks = &sfc_ev_callbacks_rx;
-	else if (evq->txq != 0)
+	SFC_ASSERT(evq->dp_rxq == NULL || evq->txq == NULL);
+	if (evq->dp_rxq != 0) {
+		if (strcmp(sa->dp_rx->dp.name, SFC_KVARG_DATAPATH_EFX) == 0)
+			evq->callbacks = &sfc_ev_callbacks_efx_rx;
+		else
+			evq->callbacks = &sfc_ev_callbacks_dp_rx;
+	} else if (evq->txq != 0) {
 		evq->callbacks = &sfc_ev_callbacks_tx;
-	else
+	} else {
 		evq->callbacks = &sfc_ev_callbacks;
+	}
 
 	evq->init_state = SFC_EVQ_STARTING;
 
diff --git a/drivers/net/sfc/sfc_ev.h b/drivers/net/sfc/sfc_ev.h
index 41a37f4..e99cd74 100644
--- a/drivers/net/sfc/sfc_ev.h
+++ b/drivers/net/sfc/sfc_ev.h
@@ -40,7 +40,7 @@
 #define SFC_MGMT_EVQ_ENTRIES	(EFX_EVQ_MINNEVS)
 
 struct sfc_adapter;
-struct sfc_rxq;
+struct sfc_dp_rxq;
 struct sfc_txq;
 
 enum sfc_evq_state {
@@ -59,7 +59,7 @@ struct sfc_evq {
 	unsigned int			read_ptr;
 	boolean_t			exception;
 	efsys_mem_t			mem;
-	struct sfc_rxq			*rxq;
+	struct sfc_dp_rxq		*dp_rxq;
 	struct sfc_txq			*txq;
 
 	/* Not used on datapath */
diff --git a/drivers/net/sfc/sfc_kvargs.c b/drivers/net/sfc/sfc_kvargs.c
index 227a8db..d8529fa 100644
--- a/drivers/net/sfc/sfc_kvargs.c
+++ b/drivers/net/sfc/sfc_kvargs.c
@@ -45,6 +45,7 @@
 		SFC_KVARG_DEBUG_INIT,
 		SFC_KVARG_MCDI_LOGGING,
 		SFC_KVARG_PERF_PROFILE,
+		SFC_KVARG_RX_DATAPATH,
 		NULL,
 	};
 
@@ -110,3 +111,12 @@
 
 	return 0;
 }
+
+int
+sfc_kvarg_string_handler(__rte_unused const char *key,
+			 const char *value_str, void *opaque)
+{
+	*(const char **)opaque = value_str;
+
+	return 0;
+}
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index 2fea9c7..2d0ffde 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -52,6 +52,12 @@
 	    SFC_KVARG_PERF_PROFILE_THROUGHPUT "|" \
 	    SFC_KVARG_PERF_PROFILE_LOW_LATENCY "]"
 
+#define SFC_KVARG_DATAPATH_EFX		"efx"
+
+#define SFC_KVARG_RX_DATAPATH		"rx_datapath"
+#define SFC_KVARG_VALUES_RX_DATAPATH \
+	"[" SFC_KVARG_DATAPATH_EFX "]"
+
 struct sfc_adapter;
 
 int sfc_kvargs_parse(struct sfc_adapter *sa);
@@ -62,6 +68,8 @@ int sfc_kvargs_process(struct sfc_adapter *sa, const char *key_match,
 
 int sfc_kvarg_bool_handler(const char *key, const char *value_str,
 			   void *opaque);
+int sfc_kvarg_string_handler(const char *key, const char *value_str,
+			     void *opaque);
 
 #ifdef __cplusplus
 }
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 61654fc..e56837d 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -36,6 +36,7 @@
 #include "sfc_log.h"
 #include "sfc_ev.h"
 #include "sfc_rx.h"
+#include "sfc_kvargs.h"
 #include "sfc_tweak.h"
 
 /*
@@ -72,7 +73,7 @@
 }
 
 static void
-sfc_rx_qrefill(struct sfc_rxq *rxq)
+sfc_efx_rx_qrefill(struct sfc_efx_rxq *rxq)
 {
 	unsigned int free_space;
 	unsigned int bulks;
@@ -81,7 +82,7 @@
 	unsigned int added = rxq->added;
 	unsigned int id;
 	unsigned int i;
-	struct sfc_rx_sw_desc *rxd;
+	struct sfc_efx_rx_sw_desc *rxd;
 	struct rte_mbuf *m;
 	uint8_t port_id = rxq->port_id;
 
@@ -135,7 +136,7 @@
 }
 
 static uint64_t
-sfc_rx_desc_flags_to_offload_flags(const unsigned int desc_flags)
+sfc_efx_rx_desc_flags_to_offload_flags(const unsigned int desc_flags)
 {
 	uint64_t mbuf_flags = 0;
 
@@ -174,7 +175,7 @@
 }
 
 static uint32_t
-sfc_rx_desc_flags_to_packet_type(const unsigned int desc_flags)
+sfc_efx_rx_desc_flags_to_packet_type(const unsigned int desc_flags)
 {
 	return RTE_PTYPE_L2_ETHER |
 		((desc_flags & EFX_PKT_IPV4) ?
@@ -185,14 +186,30 @@
 		((desc_flags & EFX_PKT_UDP) ? RTE_PTYPE_L4_UDP : 0);
 }
 
+static const uint32_t *
+sfc_efx_supported_ptypes_get(void)
+{
+	static const uint32_t ptypes[] = {
+		RTE_PTYPE_L2_ETHER,
+		RTE_PTYPE_L3_IPV4_EXT_UNKNOWN,
+		RTE_PTYPE_L3_IPV6_EXT_UNKNOWN,
+		RTE_PTYPE_L4_TCP,
+		RTE_PTYPE_L4_UDP,
+		RTE_PTYPE_UNKNOWN
+	};
+
+	return ptypes;
+}
+
 static void
-sfc_rx_set_rss_hash(struct sfc_rxq *rxq, unsigned int flags, struct rte_mbuf *m)
+sfc_efx_rx_set_rss_hash(struct sfc_efx_rxq *rxq, unsigned int flags,
+			struct rte_mbuf *m)
 {
 #if EFSYS_OPT_RX_SCALE
 	uint8_t *mbuf_data;
 
 
-	if ((rxq->flags & SFC_RXQ_FLAG_RSS_HASH) == 0)
+	if ((rxq->flags & SFC_EFX_RXQ_FLAG_RSS_HASH) == 0)
 		return;
 
 	mbuf_data = rte_pktmbuf_mtod(m, uint8_t *);
@@ -207,17 +224,18 @@
 #endif
 }
 
-uint16_t
-sfc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+static uint16_t
+sfc_efx_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 {
-	struct sfc_rxq *rxq = rx_queue;
+	struct sfc_dp_rxq *dp_rxq = rx_queue;
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
 	unsigned int completed;
 	unsigned int prefix_size = rxq->prefix_size;
 	unsigned int done_pkts = 0;
 	boolean_t discard_next = B_FALSE;
 	struct rte_mbuf *scatter_pkt = NULL;
 
-	if (unlikely((rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0))
+	if (unlikely((rxq->flags & SFC_EFX_RXQ_FLAG_RUNNING) == 0))
 		return 0;
 
 	sfc_ev_qpoll(rxq->evq);
@@ -225,7 +243,7 @@
 	completed = rxq->completed;
 	while (completed != rxq->pending && done_pkts < nb_pkts) {
 		unsigned int id;
-		struct sfc_rx_sw_desc *rxd;
+		struct sfc_efx_rx_sw_desc *rxd;
 		struct rte_mbuf *m;
 		unsigned int seg_len;
 		unsigned int desc_flags;
@@ -279,14 +297,16 @@
 		/* The first fragment of the packet has prefix */
 		prefix_size = rxq->prefix_size;
 
-		m->ol_flags = sfc_rx_desc_flags_to_offload_flags(desc_flags);
-		m->packet_type = sfc_rx_desc_flags_to_packet_type(desc_flags);
+		m->ol_flags =
+			sfc_efx_rx_desc_flags_to_offload_flags(desc_flags);
+		m->packet_type =
+			sfc_efx_rx_desc_flags_to_packet_type(desc_flags);
 
 		/*
 		 * Extract RSS hash from the packet prefix and
 		 * set the corresponding field (if needed and possible)
 		 */
-		sfc_rx_set_rss_hash(rxq, desc_flags, m);
+		sfc_efx_rx_set_rss_hash(rxq, desc_flags, m);
 
 		m->data_off += prefix_size;
 
@@ -305,20 +325,18 @@
 
 	rxq->completed = completed;
 
-	sfc_rx_qrefill(rxq);
+	sfc_efx_rx_qrefill(rxq);
 
 	return done_pkts;
 }
 
-unsigned int
-sfc_rx_qdesc_npending(struct sfc_adapter *sa, unsigned int sw_index)
+static sfc_dp_rx_qdesc_npending_t sfc_efx_rx_qdesc_npending;
+static unsigned int
+sfc_efx_rx_qdesc_npending(struct sfc_dp_rxq *dp_rxq)
 {
-	struct sfc_rxq *rxq;
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
 
-	SFC_ASSERT(sw_index < sa->rxq_count);
-	rxq = sa->rxq_info[sw_index].rxq;
-
-	if (rxq == NULL || (rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0)
+	if ((rxq->flags & SFC_EFX_RXQ_FLAG_RUNNING) == 0)
 		return 0;
 
 	sfc_ev_qpoll(rxq->evq);
@@ -326,28 +344,170 @@
 	return rxq->pending - rxq->completed;
 }
 
-int
-sfc_rx_qdesc_done(struct sfc_rxq *rxq, unsigned int offset)
+
+static void *
+sfc_efx_rxq_get_ctrl(struct sfc_dp_rxq *dp_rxq)
 {
-	if ((rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0)
-		return 0;
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
 
-	sfc_ev_qpoll(rxq->evq);
+	return rxq->ctrl;
+}
 
-	return offset < (rxq->pending - rxq->completed);
+static const struct sfc_dp_rxq_ops sfc_efx_ops = {
+	.get_ctrl	= sfc_efx_rxq_get_ctrl,
+};
+
+static sfc_dp_rx_qcreate_t sfc_efx_rx_qcreate;
+static int
+sfc_efx_rx_qcreate(void *ctrl, __rte_unused sfc_dp_exception_t *exception,
+		   int socket_id,
+		   const struct sfc_dp_rx_qcreate_args *args,
+		   struct sfc_dp_rxq **dp_rxqp)
+{
+	struct sfc_efx_rxq *rxq;
+	int rc;
+
+	rc = ENOMEM;
+	rxq = rte_zmalloc_socket("sfc-efx-rxq", sizeof(*rxq),
+				 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq == NULL)
+		goto fail_rxq_alloc;
+
+	rc = ENOMEM;
+	rxq->sw_desc = rte_calloc_socket("sfc-efx-rxq-sw_desc",
+					 args->rxq_entries,
+					 sizeof(*rxq->sw_desc),
+					 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq->sw_desc == NULL)
+		goto fail_desc_alloc;
+
+	rxq->ctrl = ctrl;
+	rxq->dp.ops = &sfc_efx_ops;
+
+	rxq->evq = rxq->ctrl->evq;
+	if (args->flags & SFC_RXQ_FLAG_RSS_HASH)
+		rxq->flags |= SFC_EFX_RXQ_FLAG_RSS_HASH;
+	rxq->ptr_mask = args->rxq_entries - 1;
+	rxq->batch_max = args->batch_max;
+	rxq->prefix_size = args->prefix_size;
+	rxq->refill_threshold = args->refill_threshold;
+	rxq->port_id = args->port_id;
+	rxq->buf_size = args->buf_size;
+	rxq->refill_mb_pool = args->refill_mb_pool;
+
+	*dp_rxqp = &rxq->dp;
+	return 0;
+
+fail_desc_alloc:
+	rte_free(rxq);
+
+fail_rxq_alloc:
+	return rc;
+}
+
+static sfc_dp_rx_qdestroy_t sfc_efx_rx_qdestroy;
+static void
+sfc_efx_rx_qdestroy(struct sfc_dp_rxq *dp_rxq)
+{
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
+
+	rte_free(rxq->sw_desc);
+	rte_free(rxq);
+}
+
+static sfc_dp_rx_qstart_t sfc_efx_rx_qstart;
+static int
+sfc_efx_rx_qstart(struct sfc_dp_rxq *dp_rxq,
+		  __rte_unused unsigned int evq_read_ptr)
+{
+	/* libefx-based datapath is specific to libefx-based PMD */
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
+
+	rxq->common = rxq->ctrl->common;
+
+	rxq->pending = rxq->completed = rxq->added = rxq->pushed = 0;
+
+	sfc_efx_rx_qrefill(rxq);
+
+	rxq->flags |= (SFC_EFX_RXQ_FLAG_STARTED | SFC_EFX_RXQ_FLAG_RUNNING);
+
+	return 0;
+}
+
+static sfc_dp_rx_qstop_t sfc_efx_rx_qstop;
+static void
+sfc_efx_rx_qstop(struct sfc_dp_rxq *dp_rxq,
+		 __rte_unused unsigned int *evq_read_ptr)
+{
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
+
+	rxq->flags &= ~SFC_EFX_RXQ_FLAG_RUNNING;
+
+	/* libefx-based datapath is bound to the libefx-based PMD and uses
+	 * the event queue structure directly, so there is no need to
+	 * return the EvQ read pointer.
+	 */
 }
 
+static sfc_dp_rx_qpurge_t sfc_efx_rx_qpurge;
 static void
-sfc_rx_qpurge(struct sfc_rxq *rxq)
+sfc_efx_rx_qpurge(struct sfc_dp_rxq *dp_rxq)
 {
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
 	unsigned int i;
-	struct sfc_rx_sw_desc *rxd;
+	struct sfc_efx_rx_sw_desc *rxd;
 
 	for (i = rxq->completed; i != rxq->added; ++i) {
 		rxd = &rxq->sw_desc[i & rxq->ptr_mask];
 		rte_mempool_put(rxq->refill_mb_pool, rxd->mbuf);
 		rxd->mbuf = NULL;
+		/* Packed stream relies on 0 in an inactive SW descriptor.
+		 * Rx queue stop is not performance critical, so there is
+		 * no harm in doing it unconditionally.
+		 */
+		rxd->flags = 0;
+		rxd->size = 0;
 	}
+
+	rxq->flags &= ~SFC_EFX_RXQ_FLAG_STARTED;
+}
+
+struct sfc_dp_rx sfc_efx_rx = {
+	.dp = {
+		.name		= SFC_KVARG_DATAPATH_EFX,
+		.type		= SFC_DP_RX,
+		.hw_fw_caps	= 0,
+	},
+	.qcreate		= sfc_efx_rx_qcreate,
+	.qdestroy		= sfc_efx_rx_qdestroy,
+	.qstart			= sfc_efx_rx_qstart,
+	.qstop			= sfc_efx_rx_qstop,
+	.qpurge			= sfc_efx_rx_qpurge,
+	.supported_ptypes_get	= sfc_efx_supported_ptypes_get,
+	.qdesc_npending		= sfc_efx_rx_qdesc_npending,
+	.pkt_burst		= sfc_efx_recv_pkts,
+};
+
+unsigned int
+sfc_rx_qdesc_npending(struct sfc_adapter *sa, unsigned int sw_index)
+{
+	struct sfc_rxq *rxq;
+
+	SFC_ASSERT(sw_index < sa->rxq_count);
+	rxq = sa->rxq_info[sw_index].rxq;
+
+	if (rxq == NULL || (rxq->state & SFC_RXQ_STARTED) == 0)
+		return 0;
+
+	return sa->dp_rx->qdesc_npending(rxq->dp);
+}
+
+int
+sfc_rx_qdesc_done(struct sfc_dp_rxq *dp_rxq, unsigned int offset)
+{
+	struct sfc_rxq *rxq = dp_rxq->ops->get_ctrl(dp_rxq);
+
+	return offset < rxq->evq->sa->dp_rx->qdesc_npending(dp_rxq);
 }
 
 static void
@@ -398,7 +558,7 @@
 			sfc_info(sa, "RxQ %u flushed", sw_index);
 	}
 
-	sfc_rx_qpurge(rxq);
+	sa->dp_rx->qpurge(rxq->dp);
 }
 
 int
@@ -432,12 +592,11 @@
 
 	efx_rx_qenable(rxq->common);
 
-	rxq->pending = rxq->completed = rxq->added = rxq->pushed = 0;
+	rc = sa->dp_rx->qstart(rxq->dp, evq->read_ptr);
+	if (rc != 0)
+		goto fail_dp_qstart;
 
 	rxq->state |= SFC_RXQ_STARTED;
-	rxq->flags |= SFC_RXQ_FLAG_STARTED | SFC_RXQ_FLAG_RUNNING;
-
-	sfc_rx_qrefill(rxq);
 
 	if (sw_index == 0) {
 		rc = efx_mac_filter_default_rxq_set(sa->nic, rxq->common,
@@ -454,6 +613,9 @@
 	return 0;
 
 fail_mac_filter_default_rxq_set:
+	sa->dp_rx->qstop(rxq->dp, &rxq->evq->read_ptr);
+
+fail_dp_qstart:
 	sfc_rx_qflush(sa, sw_index);
 
 fail_rx_qcreate:
@@ -484,14 +646,13 @@
 	sa->eth_dev->data->rx_queue_state[sw_index] =
 		RTE_ETH_QUEUE_STATE_STOPPED;
 
-	rxq->flags &= ~SFC_RXQ_FLAG_RUNNING;
+	sa->dp_rx->qstop(rxq->dp, &rxq->evq->read_ptr);
 
 	if (sw_index == 0)
 		efx_mac_filter_default_rxq_clear(sa->nic);
 
 	sfc_rx_qflush(sa, sw_index);
 
-	rxq->flags &= ~SFC_RXQ_FLAG_STARTED;
 	rxq->state = SFC_RXQ_INITIALIZED;
 
 	efx_rx_qdestroy(rxq->common);
@@ -629,6 +790,31 @@
 	return buf_size;
 }
 
+static sfc_dp_exception_t sfc_rx_dp_exception;
+static void
+sfc_rx_dp_exception(void *ctrl)
+{
+	struct sfc_rxq *rxq = ctrl;
+	struct sfc_adapter *sa = rxq->evq->sa;
+	unsigned int rxq_sw_index;
+	int rc;
+
+	if (!sfc_adapter_trylock(sa))
+		return;
+
+	rxq_sw_index = sfc_rxq_sw_index(rxq);
+
+	sfc_warn(sa, "restart RxQ %u because of datapath exception",
+		 rxq_sw_index);
+
+	sfc_rx_qstop(sa, rxq_sw_index);
+	rc = sfc_rx_qstart(sa, rxq_sw_index);
+	if (rc != 0)
+		sfc_err(sa, "cannot restart RxQ %u", rxq_sw_index);
+
+	sfc_adapter_unlock(sa);
+}
+
 int
 sfc_rx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 	     uint16_t nb_rx_desc, unsigned int socket_id,
@@ -642,6 +828,7 @@
 	unsigned int evq_index;
 	struct sfc_evq *evq;
 	struct sfc_rxq *rxq;
+	struct sfc_dp_rx_qcreate_args args;
 
 	rc = sfc_rx_qcheck_conf(sa, nb_rx_desc, rx_conf);
 	if (rc != 0)
@@ -690,47 +877,51 @@
 	if (rxq == NULL)
 		goto fail_rxq_alloc;
 
-	rc = sfc_dma_alloc(sa, "rxq", sw_index, EFX_RXQ_SIZE(rxq_info->entries),
-			   socket_id, &rxq->mem);
-	if (rc != 0)
-		goto fail_dma_alloc;
-
-	rc = ENOMEM;
-	rxq->sw_desc = rte_calloc_socket("sfc-rxq-sw_desc", rxq_info->entries,
-					 sizeof(*rxq->sw_desc),
-					 RTE_CACHE_LINE_SIZE, socket_id);
-	if (rxq->sw_desc == NULL)
-		goto fail_desc_alloc;
+	rxq_info->rxq = rxq;
 
-	evq->rxq = rxq;
 	rxq->evq = evq;
-	rxq->ptr_mask = rxq_info->entries - 1;
+	rxq->hw_index = sw_index;
 	rxq->refill_threshold = rx_conf->rx_free_thresh;
 	rxq->refill_mb_pool = mb_pool;
-	rxq->buf_size = buf_size;
-	rxq->hw_index = sw_index;
-	rxq->port_id = sa->eth_dev->data->port_id;
 
-	/* Cache limits required on datapath in RxQ structure */
-	rxq->batch_max = encp->enc_rx_batch_max;
-	rxq->prefix_size = encp->enc_rx_prefix_size;
+	rc = sfc_dma_alloc(sa, "rxq", sw_index, EFX_RXQ_SIZE(rxq_info->entries),
+			   socket_id, &rxq->mem);
+	if (rc != 0)
+		goto fail_dma_alloc;
+
+	memset(&args, 0, sizeof(args));
+	args.refill_mb_pool = rxq->refill_mb_pool;
+	args.refill_threshold = rxq->refill_threshold;
+	args.buf_size = buf_size;
+	args.port_id = sa->eth_dev->data->port_id;
+	args.batch_max = encp->enc_rx_batch_max;
+	args.prefix_size = encp->enc_rx_prefix_size;
 
 #if EFSYS_OPT_RX_SCALE
 	if (sa->hash_support == EFX_RX_HASH_AVAILABLE)
-		rxq->flags |= SFC_RXQ_FLAG_RSS_HASH;
+		args.flags |= SFC_RXQ_FLAG_RSS_HASH;
 #endif
 
+	args.rxq_entries = rxq_info->entries;
+
+	rc = sa->dp_rx->qcreate(rxq, sfc_rx_dp_exception, socket_id, &args,
+				&rxq->dp);
+	if (rc != 0)
+		goto fail_dp_rx_qcreate;
+
+	evq->dp_rxq = rxq->dp;
+
 	rxq->state = SFC_RXQ_INITIALIZED;
 
-	rxq_info->rxq = rxq;
 	rxq_info->deferred_start = (rx_conf->rx_deferred_start != 0);
 
 	return 0;
 
-fail_desc_alloc:
+fail_dp_rx_qcreate:
 	sfc_dma_free(sa, &rxq->mem);
 
 fail_dma_alloc:
+	rxq_info->rxq = NULL;
 	rte_free(rxq);
 
 fail_rxq_alloc:
@@ -757,10 +948,12 @@
 	rxq = rxq_info->rxq;
 	SFC_ASSERT(rxq->state == SFC_RXQ_INITIALIZED);
 
+	sa->dp_rx->qdestroy(rxq->dp);
+	rxq->dp = NULL;
+
 	rxq_info->rxq = NULL;
 	rxq_info->entries = 0;
 
-	rte_free(rxq->sw_desc);
 	sfc_dma_free(sa, &rxq->mem);
 	rte_free(rxq);
 }
diff --git a/drivers/net/sfc/sfc_rx.h b/drivers/net/sfc/sfc_rx.h
index 881453c..6407406 100644
--- a/drivers/net/sfc/sfc_rx.h
+++ b/drivers/net/sfc/sfc_rx.h
@@ -36,6 +36,8 @@
 
 #include "efx.h"
 
+#include "sfc_dp_rx.h"
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -47,7 +49,7 @@
  * Software Rx descriptor information associated with hardware Rx
  * descriptor.
  */
-struct sfc_rx_sw_desc {
+struct sfc_efx_rx_sw_desc {
 	struct rte_mbuf		*mbuf;
 	unsigned int		flags;
 	unsigned int		size;
@@ -68,35 +70,17 @@ enum sfc_rxq_state_bit {
 };
 
 /**
- * Receive queue information used on data path.
+ * Receive queue control information.
  * Allocated on the socket specified on the queue setup.
  */
 struct sfc_rxq {
-	/* Used on data path */
 	struct sfc_evq		*evq;
-	struct sfc_rx_sw_desc	*sw_desc;
-	unsigned int		flags;
-#define SFC_RXQ_FLAG_STARTED	0x1
-#define SFC_RXQ_FLAG_RUNNING	0x2
-#define SFC_RXQ_FLAG_RSS_HASH	0x4
-	unsigned int		ptr_mask;
-	unsigned int		pending;
-	unsigned int		completed;
-	uint16_t		batch_max;
-	uint16_t		prefix_size;
-
-	/* Used on refill */
-	unsigned int		added;
-	unsigned int		pushed;
-	unsigned int		refill_threshold;
-	uint8_t			port_id;
-	uint16_t		buf_size;
-	struct rte_mempool	*refill_mb_pool;
 	efx_rxq_t		*common;
 	efsys_mem_t		mem;
-
-	/* Not used on data path */
 	unsigned int		hw_index;
+	unsigned int		refill_threshold;
+	struct rte_mempool	*refill_mb_pool;
+	struct sfc_dp_rxq	*dp;
 	unsigned int		state;
 };
 
@@ -113,6 +97,44 @@ struct sfc_rxq {
 }
 
 /**
+ * Receive queue information used on libefx-based data path.
+ * Allocated on the socket specified on the queue setup.
+ */
+struct sfc_efx_rxq {
+	/* Used on data path */
+	struct sfc_evq			*evq;
+	unsigned int			flags;
+#define SFC_EFX_RXQ_FLAG_STARTED	0x1
+#define SFC_EFX_RXQ_FLAG_RUNNING	0x2
+#define SFC_EFX_RXQ_FLAG_RSS_HASH	0x4
+	unsigned int			ptr_mask;
+	unsigned int			pending;
+	unsigned int			completed;
+	uint16_t			batch_max;
+	uint16_t			prefix_size;
+	struct sfc_efx_rx_sw_desc	*sw_desc;
+
+	/* Used on refill */
+	unsigned int			added;
+	unsigned int			pushed;
+	unsigned int			refill_threshold;
+	uint8_t				port_id;
+	uint16_t			buf_size;
+	struct rte_mempool		*refill_mb_pool;
+	efx_rxq_t			*common;
+
+	/* Datapath receive queue anchor */
+	struct sfc_dp_rxq		dp;
+	struct sfc_rxq			*ctrl;
+};
+
+static inline struct sfc_efx_rxq *
+sfc_efx_rxq_by_dp_rxq(struct sfc_dp_rxq *dp_rxq)
+{
+	return container_of(dp_rxq, struct sfc_efx_rxq, dp);
+}
+
+/**
  * Receive queue information used during setup/release only.
  * Allocated on the same socket as adapter data.
  */
@@ -141,12 +163,9 @@ int sfc_rx_qinit(struct sfc_adapter *sa, unsigned int rx_queue_id,
 void sfc_rx_qflush_done(struct sfc_rxq *rxq);
 void sfc_rx_qflush_failed(struct sfc_rxq *rxq);
 
-uint16_t sfc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
-		       uint16_t nb_pkts);
-
 unsigned int sfc_rx_qdesc_npending(struct sfc_adapter *sa,
 				   unsigned int sw_index);
-int sfc_rx_qdesc_done(struct sfc_rxq *rxq, unsigned int offset);
+int sfc_rx_qdesc_done(struct sfc_dp_rxq *dp_rxq, unsigned int offset);
 
 #if EFSYS_OPT_RX_SCALE
 efx_rx_hash_type_t sfc_rte_to_efx_hash_type(uint64_t rss_hf);
-- 
1.8.2.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH 05/13] net/sfc: Rx scatter is a datapath-dependent feature
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (3 preceding siblings ...)
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 04/13] net/sfc: factor out libefx-based Rx datapath Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 06/13] net/sfc: implement EF10 native Rx datapath Andrew Rybchenko
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_dp_rx.h | 2 ++
 drivers/net/sfc/sfc_rx.c    | 8 ++++++++
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/sfc/sfc_dp_rx.h b/drivers/net/sfc/sfc_dp_rx.h
index 5a714a1..66d655f 100644
--- a/drivers/net/sfc/sfc_dp_rx.h
+++ b/drivers/net/sfc/sfc_dp_rx.h
@@ -150,6 +150,8 @@ typedef void (sfc_dp_rx_qstop_t)(struct sfc_dp_rxq *dp_rxq,
 struct sfc_dp_rx {
 	struct sfc_dp				dp;
 
+	unsigned int				features;
+#define SFC_DP_RX_FEAT_SCATTER			0x1
 	sfc_dp_rx_qcreate_t			*qcreate;
 	sfc_dp_rx_qdestroy_t			*qdestroy;
 	sfc_dp_rx_qstart_t			*qstart;
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index e56837d..0095afd 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -478,6 +478,7 @@ struct sfc_dp_rx sfc_efx_rx = {
 		.type		= SFC_DP_RX,
 		.hw_fw_caps	= 0,
 	},
+	.features		= SFC_DP_RX_FEAT_SCATTER,
 	.qcreate		= sfc_efx_rx_qcreate,
 	.qdestroy		= sfc_efx_rx_qdestroy,
 	.qstart			= sfc_efx_rx_qstart,
@@ -1149,6 +1150,13 @@ struct sfc_dp_rx sfc_efx_rx = {
 		rxmode->hw_strip_crc = 1;
 	}
 
+	if (rxmode->enable_scatter &&
+	    (~sa->dp_rx->features & SFC_DP_RX_FEAT_SCATTER)) {
+		sfc_err(sa, "Rx scatter not supported by %s datapath",
+			sa->dp_rx->dp.name);
+		rc = EINVAL;
+	}
+
 	if (rxmode->enable_lro) {
 		sfc_err(sa, "LRO not supported");
 		rc = EINVAL;
-- 
1.8.2.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH 06/13] net/sfc: implement EF10 native Rx datapath
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (4 preceding siblings ...)
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 05/13] net/sfc: Rx scatter is a datapath-dependent feature Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 07/13] net/sfc: factory out libefx-based Tx datapath Andrew Rybchenko
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst   |   5 +-
 drivers/net/sfc/Makefile      |   1 +
 drivers/net/sfc/sfc_dp.h      |   1 +
 drivers/net/sfc/sfc_dp_rx.h   |  22 ++
 drivers/net/sfc/sfc_ef10_rx.c | 713 ++++++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c  |  14 +-
 drivers/net/sfc/sfc_ev.c      |  16 +-
 drivers/net/sfc/sfc_kvargs.h  |   4 +-
 drivers/net/sfc/sfc_rx.c      |   5 +
 9 files changed, 777 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_ef10_rx.c

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 8bd8a0c..0aa6740 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -181,12 +181,15 @@ whitelist option like "-w 02:00.0,arg1=value1,...".
 Case-insensitive 1/y/yes/on or 0/n/no/off may be used to specify
 boolean parameters value.
 
-- ``rx_datapath`` [auto|efx] (default **auto**)
+- ``rx_datapath`` [auto|efx|ef10] (default **auto**)
 
   Choose receive datapath implementation.
   **auto** allows the driver itself to make a choice based on firmware
   features available and required by the datapath implementation.
   **efx** chooses libefx-based datapath which supports Rx scatter.
+  **ef10** chooses the EF10 (SFN7xxx, SFN8xxx) native datapath which is
+  more efficient than the libefx-based one and provides richer packet
+  type classification, but lacks Rx scatter support.
 
 - ``perf_profile`` [auto|throughput|low-latency] (default **throughput**)
 
diff --git a/drivers/net/sfc/Makefile b/drivers/net/sfc/Makefile
index befff4c..3c15722 100644
--- a/drivers/net/sfc/Makefile
+++ b/drivers/net/sfc/Makefile
@@ -91,6 +91,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_rx.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tx.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tso.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_dp.c
+SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_ef10_rx.c
 
 VPATH += $(SRCDIR)/base
 
diff --git a/drivers/net/sfc/sfc_dp.h b/drivers/net/sfc/sfc_dp.h
index 8f78f98..39e1e70 100644
--- a/drivers/net/sfc/sfc_dp.h
+++ b/drivers/net/sfc/sfc_dp.h
@@ -61,6 +61,7 @@ struct sfc_dp {
 	enum sfc_dp_type		type;
 	/* Mask of required hardware/firmware capabilities */
 	unsigned int			hw_fw_caps;
+#define SFC_DP_HW_FW_CAP_EF10		0x1
 };
 
 /** List of datapath variants */
diff --git a/drivers/net/sfc/sfc_dp_rx.h b/drivers/net/sfc/sfc_dp_rx.h
index 66d655f..944d366 100644
--- a/drivers/net/sfc/sfc_dp_rx.h
+++ b/drivers/net/sfc/sfc_dp_rx.h
@@ -96,6 +96,21 @@ struct sfc_dp_rx_qcreate_args {
 
 	/** Rx queue size */
 	unsigned int		rxq_entries;
+	/** DMA-mapped Rx descriptors ring */
+	void			*rxq_hw_ring;
+
+	/** Associated event queue size */
+	unsigned int		evq_entries;
+	/** Hardware event ring */
+	void			*evq_hw_ring;
+
+	/** The queue index in hardware (required to push the right doorbell) */
+	unsigned int		hw_index;
+	/**
+	 * Virtual address of the memory-mapped BAR to push Rx refill
+	 * doorbell
+	 */
+	volatile void		*mem_bar;
 };
 
 /**
@@ -134,6 +149,11 @@ typedef void (sfc_dp_rx_qstop_t)(struct sfc_dp_rxq *dp_rxq,
 				 unsigned int *evq_read_ptr);
 
 /**
+ * Receive event handler used during queue flush only.
+ */
+typedef bool (sfc_dp_rx_qrx_ev_t)(struct sfc_dp_rxq *dp_rxq, unsigned int id);
+
+/**
  * Receive queue purge function called after queue flush.
  *
 * Should be used to free unused receive buffers.
@@ -156,6 +176,7 @@ struct sfc_dp_rx {
 	sfc_dp_rx_qdestroy_t			*qdestroy;
 	sfc_dp_rx_qstart_t			*qstart;
 	sfc_dp_rx_qstop_t			*qstop;
+	sfc_dp_rx_qrx_ev_t			*qrx_ev;
 	sfc_dp_rx_qpurge_t			*qpurge;
 	sfc_dp_rx_supported_ptypes_get_t	*supported_ptypes_get;
 	sfc_dp_rx_qdesc_npending_t		*qdesc_npending;
@@ -179,6 +200,7 @@ struct sfc_dp_rx {
 }
 
 extern struct sfc_dp_rx sfc_efx_rx;
+extern struct sfc_dp_rx sfc_ef10_rx;
 
 #ifdef __cplusplus
 }
diff --git a/drivers/net/sfc/sfc_ef10_rx.c b/drivers/net/sfc/sfc_ef10_rx.c
new file mode 100644
index 0000000..2b5e885
--- /dev/null
+++ b/drivers/net/sfc/sfc_ef10_rx.c
@@ -0,0 +1,713 @@
+/*-
+ * Copyright (c) 2016 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* EF10 native datapath implementation */
+
+#include <stdbool.h>
+
+#include <rte_byteorder.h>
+#include <rte_mbuf_ptype.h>
+#include <rte_mbuf.h>
+#include <rte_io.h>
+
+#include "efx.h"
+#include "efx_types.h"
+#include "efx_regs.h"
+#include "efx_regs_ef10.h"
+
+#include "sfc_tweak.h"
+#include "sfc_dp_rx.h"
+#include "sfc_kvargs.h"
+
+/* Alignment requirement for value written to RX WPTR:
+ *  the WPTR must be aligned to an 8 descriptor boundary
+ */
+#define	EF10_RX_WPTR_ALIGN 8
+
+
+struct sfc_ef10_rx_sw_desc {
+	struct rte_mbuf			*mbuf;
+};
+
+struct sfc_ef10_rxq {
+	/* Used on data path */
+	unsigned int			flags;
+#define SFC_EF10_RXQ_STARTED		0x1
+#define SFC_EF10_RXQ_RUNNING		0x2
+#define SFC_EF10_RXQ_EXCEPTION		0x4
+#define SFC_EF10_RXQ_RSS_HASH		0x8
+	unsigned int			ptr_mask;
+	unsigned int			prepared;
+	unsigned int			completed;
+	unsigned int			evq_read_ptr;
+	volatile efx_qword_t		*evq_hw_ring;
+	struct sfc_ef10_rx_sw_desc	*sw_ring;
+	uint64_t			rearm_data;
+	uint16_t			prefix_size;
+
+	/* Used on refill */
+	uint16_t			buf_size;
+	uint8_t				port_id;
+	unsigned int			added;
+	unsigned int			refill_threshold;
+	struct rte_mempool		*refill_mb_pool;
+	efx_qword_t			*rxq_hw_ring;
+	volatile void			*doorbell;
+
+	/* Datapath receive queue anchor */
+	struct sfc_dp_rxq		dp;
+	void				*ctrl;
+	sfc_dp_exception_t		*exception;
+};
+
+static inline struct sfc_ef10_rxq *
+sfc_ef10_rxq_by_dp_rxq(struct sfc_dp_rxq *dp_rxq)
+{
+	return container_of(dp_rxq, struct sfc_ef10_rxq, dp);
+}
+
+static void
+sfc_ef10_rx_qpush(struct sfc_ef10_rxq *rxq)
+{
+	efx_dword_t dword;
+
+	/* Hardware has alignment restriction for WPTR */
+	RTE_BUILD_BUG_ON(SFC_RX_REFILL_BULK % EF10_RX_WPTR_ALIGN != 0);
+	SFC_ASSERT(RTE_ALIGN(rxq->added, EF10_RX_WPTR_ALIGN) == rxq->added);
+
+	EFX_POPULATE_DWORD_1(dword, ERF_DZ_RX_DESC_WPTR,
+			     rxq->added & rxq->ptr_mask);
+
+	/* Make sure that all descriptor updates (Rx and event) reach memory */
+	rte_wmb();
+
+	/* DMA sync to device is not required */
+
+	rte_write32(dword.ed_u32[0], rxq->doorbell);
+}
+
+static void
+sfc_ef10_rx_qrefill(struct sfc_ef10_rxq *rxq)
+{
+	const unsigned int ptr_mask = rxq->ptr_mask;
+	const uint32_t buf_size = rxq->buf_size;
+	unsigned int free_space;
+	unsigned int bulks;
+	void *objs[SFC_RX_REFILL_BULK];
+	unsigned int added = rxq->added;
+
+	free_space = EFX_RXQ_LIMIT(ptr_mask + 1) - (added - rxq->completed);
+
+	if (free_space < rxq->refill_threshold)
+		return;
+
+	bulks = free_space / RTE_DIM(objs);
+
+	while (bulks-- > 0) {
+		unsigned int id;
+		unsigned int i;
+
+		if (unlikely(rte_mempool_get_bulk(rxq->refill_mb_pool, objs,
+						  RTE_DIM(objs)) < 0)) {
+			struct rte_eth_dev_data *dev_data =
+				rte_eth_devices[rxq->port_id].data;
+
+			/*
+			 * It is hardly a safe way to increment counter
+			 * from different contexts, but all PMDs do it.
+			 */
+			dev_data->rx_mbuf_alloc_failed += RTE_DIM(objs);
+			break;
+		}
+
+		for (i = 0, id = added & ptr_mask;
+		     i < RTE_DIM(objs);
+		     ++i, ++id) {
+			struct rte_mbuf *m = objs[i];
+			struct sfc_ef10_rx_sw_desc *rxd;
+			phys_addr_t phys_addr;
+
+			SFC_ASSERT((id & ~ptr_mask) == 0);
+			rxd = &rxq->sw_ring[id];
+			rxd->mbuf = m;
+
+			/*
+			 * Avoid writing to mbuf. It is cheaper to do it
+			 * when we receive packet and fill in nearby
+			 * structure members.
+			 */
+
+			phys_addr = rte_mbuf_data_dma_addr_default(m);
+			EFX_POPULATE_QWORD_2(rxq->rxq_hw_ring[id],
+			    ESF_DZ_RX_KER_BYTE_CNT, buf_size,
+			    ESF_DZ_RX_KER_BUF_ADDR, phys_addr);
+		}
+
+		added += RTE_DIM(objs);
+	}
+
+	/* Push doorbell if something is posted */
+	if (likely(rxq->added != added)) {
+		rxq->added = added;
+		sfc_ef10_rx_qpush(rxq);
+	}
+}
+
+static void
+sfc_ef10_rx_prefetch_next(struct sfc_ef10_rxq *rxq, unsigned int next_id)
+{
+	struct rte_mbuf *next_mbuf;
+
+	/* Prefetch next bunch of software descriptors */
+	if ((next_id % (RTE_CACHE_LINE_SIZE / sizeof(rxq->sw_ring[0]))) == 0)
+		rte_prefetch0(&rxq->sw_ring[next_id]);
+
+	/*
+	 * It looks strange to prefetch depending on previous prefetch
+	 * data, but measurements show that it is really efficient and
+	 * increases packet rate.
+	 */
+	next_mbuf = rxq->sw_ring[next_id].mbuf;
+	if (likely(next_mbuf != NULL)) {
+		/* Prefetch the next mbuf structure */
+		rte_mbuf_prefetch_part1(next_mbuf);
+
+		/*
+		 * Prefetch the pseudo header of the next packet.
+		 * data_off is not filled in yet, and the packet data
+		 * itself may not have landed yet, but the prefetch
+		 * still pays off on average.
+		 */
+		rte_prefetch0((uint8_t *)next_mbuf->buf_addr +
+			      RTE_PKTMBUF_HEADROOM);
+	}
+}
+
+static uint16_t
+sfc_ef10_rx_prepared(struct sfc_ef10_rxq *rxq, struct rte_mbuf **rx_pkts,
+		     uint16_t nb_pkts)
+{
+	uint16_t n_rx_pkts = RTE_MIN(nb_pkts, rxq->prepared);
+	unsigned int completed = rxq->completed;
+	unsigned int i;
+
+	rxq->prepared -= n_rx_pkts;
+	rxq->completed = completed + n_rx_pkts;
+
+	for (i = 0; i < n_rx_pkts; ++i, ++completed)
+		rx_pkts[i] = rxq->sw_ring[completed & rxq->ptr_mask].mbuf;
+
+	return n_rx_pkts;
+}
+
+static void
+sfc_ef10_rx_ev_to_offloads(struct sfc_ef10_rxq *rxq, const efx_qword_t rx_ev,
+			   struct rte_mbuf *m)
+{
+	uint32_t l2_ptype = 0;
+	uint32_t l3_ptype = 0;
+	uint32_t l4_ptype = 0;
+	uint64_t ol_flags = 0;
+
+	if (unlikely(EFX_TEST_QWORD_BIT(rx_ev, ESF_DZ_RX_PARSE_INCOMPLETE_LBN)))
+		goto done;
+
+	switch (EFX_QWORD_FIELD(rx_ev, ESF_DZ_RX_ETH_TAG_CLASS)) {
+	case ESE_DZ_ETH_TAG_CLASS_NONE:
+		l2_ptype = RTE_PTYPE_L2_ETHER;
+		break;
+	case ESE_DZ_ETH_TAG_CLASS_VLAN1:
+		l2_ptype = RTE_PTYPE_L2_ETHER_VLAN;
+		break;
+	case ESE_DZ_ETH_TAG_CLASS_VLAN2:
+		l2_ptype = RTE_PTYPE_L2_ETHER_QINQ;
+		break;
+	default:
+		/* Unexpected Eth tag class */
+		SFC_ASSERT(false);
+	}
+
+	switch (EFX_QWORD_FIELD(rx_ev, ESF_DZ_RX_L3_CLASS)) {
+	case ESE_DZ_L3_CLASS_IP4_FRAG:
+		l4_ptype = RTE_PTYPE_L4_FRAG;
+		/* FALLTHROUGH */
+	case ESE_DZ_L3_CLASS_IP4:
+		l3_ptype = RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
+		ol_flags |= PKT_RX_RSS_HASH |
+			((EFX_TEST_QWORD_BIT(rx_ev,
+					     ESF_DZ_RX_IPCKSUM_ERR_LBN)) ?
+			 PKT_RX_IP_CKSUM_BAD : PKT_RX_IP_CKSUM_GOOD);
+		break;
+	case ESE_DZ_L3_CLASS_IP6_FRAG:
+		l4_ptype |= RTE_PTYPE_L4_FRAG;
+		/* FALLTHROUGH */
+	case ESE_DZ_L3_CLASS_IP6:
+		l3_ptype |= RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
+		ol_flags |= PKT_RX_RSS_HASH;
+		break;
+	case ESE_DZ_L3_CLASS_ARP:
+		/* Override Layer 2 packet type */
+		l2_ptype = RTE_PTYPE_L2_ETHER_ARP;
+		break;
+	default:
+		/* Unexpected Layer 3 class */
+		SFC_ASSERT(false);
+	}
+
+	switch (EFX_QWORD_FIELD(rx_ev, ESF_DZ_RX_L4_CLASS)) {
+	case ESE_DZ_L4_CLASS_TCP:
+		l4_ptype = RTE_PTYPE_L4_TCP;
+		ol_flags |=
+			(EFX_TEST_QWORD_BIT(rx_ev,
+					    ESF_DZ_RX_TCPUDP_CKSUM_ERR_LBN)) ?
+			PKT_RX_L4_CKSUM_BAD : PKT_RX_L4_CKSUM_GOOD;
+		break;
+	case ESE_DZ_L4_CLASS_UDP:
+		l4_ptype = RTE_PTYPE_L4_UDP;
+		ol_flags |=
+			(EFX_TEST_QWORD_BIT(rx_ev,
+					    ESF_DZ_RX_TCPUDP_CKSUM_ERR_LBN)) ?
+			PKT_RX_L4_CKSUM_BAD : PKT_RX_L4_CKSUM_GOOD;
+		break;
+	case ESE_DZ_L4_CLASS_UNKNOWN:
+		break;
+	default:
+		/* Unexpected Layer 4 class */
+		SFC_ASSERT(false);
+	}
+
+	/* Remove RSS hash offload flag if RSS is not enabled */
+	if (~rxq->flags & SFC_EF10_RXQ_RSS_HASH)
+		ol_flags &= ~PKT_RX_RSS_HASH;
+
+done:
+	m->ol_flags = ol_flags;
+	m->packet_type = l2_ptype | l3_ptype | l4_ptype;
+}
+
+static uint16_t
+sfc_ef10_rx_pseudo_hdr_get_len(const uint8_t *pseudo_hdr)
+{
+	return rte_le_to_cpu_16(*(const uint16_t *)&pseudo_hdr[8]);
+}
+
+static uint32_t
+sfc_ef10_rx_pseudo_hdr_get_hash(const uint8_t *pseudo_hdr)
+{
+	return rte_le_to_cpu_32(*(const uint32_t *)pseudo_hdr);
+}
+
+static uint16_t
+sfc_ef10_rx_process_event(struct sfc_ef10_rxq *rxq, efx_qword_t rx_ev,
+			  struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+{
+	const unsigned int ptr_mask = rxq->ptr_mask;
+	unsigned int completed = rxq->completed;
+	unsigned int ready;
+	struct sfc_ef10_rx_sw_desc *rxd;
+	struct rte_mbuf *m;
+	struct rte_mbuf *m0;
+	uint16_t n_rx_pkts;
+	const uint8_t *pseudo_hdr;
+	uint16_t pkt_len;
+
+	ready = (EFX_QWORD_FIELD(rx_ev, ESF_DZ_RX_DSC_PTR_LBITS) - completed) &
+		EFX_MASK32(ESF_DZ_RX_DSC_PTR_LBITS);
+	SFC_ASSERT(ready > 0);
+
+	if (rx_ev.eq_u64[0] &
+	    rte_cpu_to_le_64((1ull << ESF_DZ_RX_ECC_ERR_LBN) |
+			     (1ull << ESF_DZ_RX_ECRC_ERR_LBN))) {
+		SFC_ASSERT(rxq->prepared == 0);
+		rxq->completed += ready;
+		while (ready-- > 0) {
+			rxd = &rxq->sw_ring[completed++ & ptr_mask];
+			rte_mempool_put(rxq->refill_mb_pool, rxd->mbuf);
+		}
+		return 0;
+	}
+
+	n_rx_pkts = RTE_MIN(ready, nb_pkts);
+	rxq->prepared = ready - n_rx_pkts;
+	rxq->completed += n_rx_pkts;
+
+	rxd = &rxq->sw_ring[completed++ & ptr_mask];
+
+	sfc_ef10_rx_prefetch_next(rxq, completed & ptr_mask);
+
+	m = rxd->mbuf;
+
+	*rx_pkts++ = m;
+
+	*(uint64_t *)(&m->rearm_data) = rxq->rearm_data;
+	/* rearm_data rewrites ol_flags which is updated below */
+	rte_compiler_barrier();
+
+	/* Classify packet based on Rx event */
+	sfc_ef10_rx_ev_to_offloads(rxq, rx_ev, m);
+
+	/* data_off already moved past pseudo header */
+	pseudo_hdr = (uint8_t *)m->buf_addr + RTE_PKTMBUF_HEADROOM;
+
+	if (m->ol_flags & PKT_RX_RSS_HASH)
+		m->hash.rss = sfc_ef10_rx_pseudo_hdr_get_hash(pseudo_hdr);
+
+	if (ready == 1)
+		pkt_len = EFX_QWORD_FIELD(rx_ev, ESF_DZ_RX_BYTES) -
+			rxq->prefix_size;
+	else
+		pkt_len = sfc_ef10_rx_pseudo_hdr_get_len(pseudo_hdr);
+	SFC_ASSERT(pkt_len > 0);
+	rte_pktmbuf_data_len(m) = pkt_len;
+	rte_pktmbuf_pkt_len(m) = pkt_len;
+
+	m->next = NULL;
+
+	/* Remember mbuf to copy offload flags and packet type from */
+	m0 = m;
+	for (--ready; ready > 0; --ready) {
+		rxd = &rxq->sw_ring[completed++ & ptr_mask];
+
+		sfc_ef10_rx_prefetch_next(rxq, completed & ptr_mask);
+
+		m = rxd->mbuf;
+
+		if (ready > rxq->prepared)
+			*rx_pkts++ = m;
+
+		*(uint64_t *)(&m->rearm_data) = rxq->rearm_data;
+		/* rearm_data rewrites ol_flags which is updated below */
+		rte_compiler_barrier();
+
+		/* Event-dependent information is the same */
+		m->ol_flags = m0->ol_flags;
+		m->packet_type = m0->packet_type;
+
+		/* data_off already moved past pseudo header */
+		pseudo_hdr = (uint8_t *)m->buf_addr + RTE_PKTMBUF_HEADROOM;
+
+		if (m->ol_flags & PKT_RX_RSS_HASH)
+			m->hash.rss =
+				sfc_ef10_rx_pseudo_hdr_get_hash(pseudo_hdr);
+
+		pkt_len = sfc_ef10_rx_pseudo_hdr_get_len(pseudo_hdr);
+		SFC_ASSERT(pkt_len > 0);
+		rte_pktmbuf_data_len(m) = pkt_len;
+		rte_pktmbuf_pkt_len(m) = pkt_len;
+
+		m->next = NULL;
+	}
+
+	return n_rx_pkts;
+}
+
+static bool
+sfc_ef10_rx_get_event(struct sfc_ef10_rxq *rxq, efx_qword_t *rx_ev)
+{
+	if (unlikely(rxq->flags & SFC_EF10_RXQ_EXCEPTION))
+		return false;
+
+	*rx_ev = rxq->evq_hw_ring[rxq->evq_read_ptr & rxq->ptr_mask];
+
+	if (rx_ev->eq_u64[0] == UINT64_MAX)
+		return false;
+
+	if (unlikely(EFX_QWORD_FIELD(*rx_ev, FSF_AZ_EV_CODE) !=
+		     FSE_AZ_EV_CODE_RX_EV)) {
+		/* Do not move read_ptr to keep the event for exception
+		 * handling
+		 */
+		rxq->flags |= SFC_EF10_RXQ_EXCEPTION;
+		return false;
+	}
+
+	rxq->evq_read_ptr++;
+	return true;
+}
+
+static void
+sfc_ef10_ev_qfill(struct sfc_ef10_rxq *rxq, unsigned int old_read_ptr)
+{
+	const unsigned int read_ptr = rxq->evq_read_ptr;
+	const unsigned int ptr_mask = rxq->ptr_mask;
+
+	while (old_read_ptr != read_ptr) {
+		EFX_SET_QWORD(rxq->evq_hw_ring[old_read_ptr & ptr_mask]);
+		++old_read_ptr;
+	}
+
+	/*
+	 * No barriers here.
+	 * Functions which push doorbell should care about correct
+	 * ordering: store instructions which fill in EvQ ring should be
+	 * retired from CPU and DMA sync before doorbell which will allow
+	 * to use these event entries.
+	 */
+}
+
+static uint16_t
+sfc_ef10_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+{
+	struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(rx_queue);
+	const unsigned int evq_old_read_ptr = rxq->evq_read_ptr;
+	uint16_t n_rx_pkts;
+	efx_qword_t rx_ev;
+
+	if (unlikely((rxq->flags & SFC_EF10_RXQ_RUNNING) == 0))
+		return 0;
+
+	n_rx_pkts = sfc_ef10_rx_prepared(rxq, rx_pkts, nb_pkts);
+
+	while (n_rx_pkts != nb_pkts && sfc_ef10_rx_get_event(rxq, &rx_ev)) {
+		if (EFX_TEST_QWORD_BIT(rx_ev, ESF_DZ_RX_DROP_EVENT_LBN))
+			continue;
+
+		n_rx_pkts += sfc_ef10_rx_process_event(rxq, rx_ev,
+						       rx_pkts + n_rx_pkts,
+						       nb_pkts - n_rx_pkts);
+	}
+
+	sfc_ef10_ev_qfill(rxq, evq_old_read_ptr);
+
+	if (unlikely(rxq->flags & SFC_EF10_RXQ_EXCEPTION))
+		rxq->exception(rxq->ctrl);
+	else
+		sfc_ef10_rx_qrefill(rxq);
+
+	return n_rx_pkts;
+}
+
+static const uint32_t *
+sfc_ef10_supported_ptypes_get(void)
+{
+	static const uint32_t ef10_native_ptypes[] = {
+		RTE_PTYPE_L2_ETHER,
+		RTE_PTYPE_L2_ETHER_ARP,
+		RTE_PTYPE_L2_ETHER_VLAN,
+		RTE_PTYPE_L2_ETHER_QINQ,
+		RTE_PTYPE_L3_IPV4_EXT_UNKNOWN,
+		RTE_PTYPE_L3_IPV6_EXT_UNKNOWN,
+		RTE_PTYPE_L4_FRAG,
+		RTE_PTYPE_L4_TCP,
+		RTE_PTYPE_L4_UDP,
+		RTE_PTYPE_UNKNOWN
+	};
+
+	return ef10_native_ptypes;
+}
+
+static sfc_dp_rx_qdesc_npending_t sfc_ef10_rx_qdesc_npending;
+static unsigned int
+sfc_ef10_rx_qdesc_npending(__rte_unused struct sfc_dp_rxq *dp_rxq)
+{
+	/*
+	 * Correct implementation requires EvQ polling and events
+	 * processing (keeping all ready mbufs in prepared).
+	 */
+	return -ENOTSUP;
+}
+
+static void *
+sfc_ef10_rxq_get_ctrl(struct sfc_dp_rxq *dp_rxq)
+{
+	struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(dp_rxq);
+
+	return rxq->ctrl;
+}
+
+static const struct sfc_dp_rxq_ops sfc_ef10_rxq_ops = {
+	.get_ctrl	= sfc_ef10_rxq_get_ctrl,
+};
+
+
+static uint64_t
+sfc_ef10_mk_mbuf_rearm_data(uint8_t port_id, uint16_t prefix_size)
+{
+	struct rte_mbuf m;
+
+	memset(&m, 0, sizeof(m));
+
+	rte_mbuf_refcnt_set(&m, 1);
+	m.data_off = RTE_PKTMBUF_HEADROOM + prefix_size;
+	m.nb_segs = 1;
+	m.port = port_id;
+
+	/* rearm_data covers structure members filled in above */
+	rte_compiler_barrier();
+	return *(uint64_t *)(&m.rearm_data);
+}
+
+static sfc_dp_rx_qcreate_t sfc_ef10_rx_qcreate;
+static int
+sfc_ef10_rx_qcreate(void *ctrl, sfc_dp_exception_t *exception, int socket_id,
+		    const struct sfc_dp_rx_qcreate_args *args,
+		    struct sfc_dp_rxq **dp_rxqp)
+{
+	struct sfc_ef10_rxq *rxq;
+	int rc;
+
+	rc = EINVAL;
+	if (args->rxq_entries != args->evq_entries)
+		goto fail_rxq_args;
+
+	rc = ENOMEM;
+	rxq = rte_zmalloc_socket("sfc-ef10-rxq", sizeof(*rxq),
+				 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq == NULL)
+		goto fail_rxq_alloc;
+
+	rc = ENOMEM;
+	rxq->sw_ring = rte_calloc_socket("sfc-ef10-rxq-sw_ring",
+					 args->rxq_entries,
+					 sizeof(*rxq->sw_ring),
+					 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq->sw_ring == NULL)
+		goto fail_desc_alloc;
+
+	if (args->flags & SFC_RXQ_FLAG_RSS_HASH)
+		rxq->flags |= SFC_EF10_RXQ_RSS_HASH;
+	rxq->ptr_mask = args->rxq_entries - 1;
+	rxq->evq_hw_ring = args->evq_hw_ring;
+	rxq->refill_threshold = args->refill_threshold;
+	rxq->rearm_data =
+		sfc_ef10_mk_mbuf_rearm_data(args->port_id, args->prefix_size);
+	rxq->prefix_size = args->prefix_size;
+	rxq->buf_size = args->buf_size;
+	rxq->port_id = args->port_id;
+	rxq->refill_mb_pool = args->refill_mb_pool;
+	rxq->rxq_hw_ring = args->rxq_hw_ring;
+	rxq->doorbell = (volatile uint8_t *)args->mem_bar +
+			ER_DZ_RX_DESC_UPD_REG_OFST +
+			args->hw_index * ER_DZ_RX_DESC_UPD_REG_STEP;
+
+	rxq->dp.ops = &sfc_ef10_rxq_ops;
+	rxq->ctrl = ctrl;
+	rxq->exception = exception;
+
+	*dp_rxqp = &rxq->dp;
+	return 0;
+
+fail_desc_alloc:
+	rte_free(rxq);
+
+fail_rxq_alloc:
+fail_rxq_args:
+	return rc;
+}
+
+static sfc_dp_rx_qdestroy_t sfc_ef10_rx_qdestroy;
+static void
+sfc_ef10_rx_qdestroy(struct sfc_dp_rxq *dp_rxq)
+{
+	struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(dp_rxq);
+
+	rte_free(rxq->sw_ring);
+	rte_free(rxq);
+}
+
+static sfc_dp_rx_qstart_t sfc_ef10_rx_qstart;
+static int
+sfc_ef10_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr)
+{
+	struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(dp_rxq);
+
+	rxq->prepared = 0;
+	rxq->completed = rxq->added = 0;
+
+	sfc_ef10_rx_qrefill(rxq);
+
+	rxq->evq_read_ptr = evq_read_ptr;
+
+	rxq->flags |= (SFC_EF10_RXQ_STARTED | SFC_EF10_RXQ_RUNNING);
+	rxq->flags &= ~SFC_EF10_RXQ_EXCEPTION;
+
+	return 0;
+}
+
+static sfc_dp_rx_qstop_t sfc_ef10_rx_qstop;
+static void
+sfc_ef10_rx_qstop(struct sfc_dp_rxq *dp_rxq, unsigned int *evq_read_ptr)
+{
+	struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(dp_rxq);
+
+	rxq->flags &= ~SFC_EF10_RXQ_RUNNING;
+
+	*evq_read_ptr = rxq->evq_read_ptr;
+}
+
+static sfc_dp_rx_qrx_ev_t sfc_ef10_rx_qrx_ev;
+static bool
+sfc_ef10_rx_qrx_ev(struct sfc_dp_rxq *dp_rxq, __rte_unused unsigned int id)
+{
+	__rte_unused struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(dp_rxq);
+
+	SFC_ASSERT(~rxq->flags & SFC_EF10_RXQ_RUNNING);
+
+	/*
+	 * It is safe to ignore Rx event since we free all mbufs on
+	 * queue purge anyway.
+	 */
+
+	return false;
+}
+
+static sfc_dp_rx_qpurge_t sfc_ef10_rx_qpurge;
+static void
+sfc_ef10_rx_qpurge(struct sfc_dp_rxq *dp_rxq)
+{
+	struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(dp_rxq);
+	unsigned int i;
+	struct sfc_ef10_rx_sw_desc *rxd;
+
+	for (i = rxq->completed; i != rxq->added; ++i) {
+		rxd = &rxq->sw_ring[i & rxq->ptr_mask];
+		rte_mempool_put(rxq->refill_mb_pool, rxd->mbuf);
+		rxd->mbuf = NULL;
+	}
+
+	rxq->flags &= ~SFC_EF10_RXQ_STARTED;
+}
+
+struct sfc_dp_rx sfc_ef10_rx = {
+	.dp = {
+		.name		= SFC_KVARG_DATAPATH_EF10,
+		.type		= SFC_DP_RX,
+		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF10,
+	},
+	.features		= 0,
+	.qcreate		= sfc_ef10_rx_qcreate,
+	.qdestroy		= sfc_ef10_rx_qdestroy,
+	.qstart			= sfc_ef10_rx_qstart,
+	.qstop			= sfc_ef10_rx_qstop,
+	.qrx_ev			= sfc_ef10_rx_qrx_ev,
+	.qpurge			= sfc_ef10_rx_qpurge,
+	.supported_ptypes_get	= sfc_ef10_supported_ptypes_get,
+	.qdesc_npending		= sfc_ef10_rx_qdesc_npending,
+	.pkt_burst		= sfc_ef10_recv_pkts,
+};
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 3207cf4..f6562f9 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1237,6 +1237,15 @@
 	if (sa == NULL || sa->state == SFC_ADAPTER_UNINITIALIZED)
 		return -E_RTE_SECONDARY;
 
+	switch (sa->family) {
+	case EFX_FAMILY_HUNTINGTON:
+	case EFX_FAMILY_MEDFORD:
+		avail_caps |= SFC_DP_HW_FW_CAP_EF10;
+		break;
+	default:
+		break;
+	}
+
 	rc = sfc_kvargs_process(sa, SFC_KVARG_RX_DATAPATH,
 				sfc_kvarg_string_handler, &rx_name);
 	if (rc != 0)
@@ -1285,8 +1294,11 @@
 sfc_register_dp(void)
 {
 	/* Register once */
-	if (TAILQ_EMPTY(&sfc_dp_head))
+	if (TAILQ_EMPTY(&sfc_dp_head)) {
+		/* Prefer EF10 datapath */
+		sfc_dp_register(&sfc_dp_head, &sfc_ef10_rx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_efx_rx.dp);
+	}
 }
 
 static int
diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index dca3a81..64694f5 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -157,6 +157,20 @@
 }
 
 static boolean_t
+sfc_ev_dp_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
+	     __rte_unused uint32_t size, __rte_unused uint16_t flags)
+{
+	struct sfc_evq *evq = arg;
+	struct sfc_dp_rxq *dp_rxq;
+
+	dp_rxq = evq->dp_rxq;
+	SFC_ASSERT(dp_rxq != NULL);
+
+	SFC_ASSERT(evq->sa->dp_rx->qrx_ev != NULL);
+	return evq->sa->dp_rx->qrx_ev(dp_rxq, id);
+}
+
+static boolean_t
 sfc_ev_nop_tx(void *arg, uint32_t label, uint32_t id)
 {
 	struct sfc_evq *evq = arg;
@@ -415,7 +429,7 @@
 
 static const efx_ev_callbacks_t sfc_ev_callbacks_dp_rx = {
 	.eec_initialized	= sfc_ev_initialized,
-	.eec_rx			= sfc_ev_nop_rx,
+	.eec_rx			= sfc_ev_dp_rx,
 	.eec_tx			= sfc_ev_nop_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_rxq_flush_done,
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index 2d0ffde..b57439b 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -53,10 +53,12 @@
 	    SFC_KVARG_PERF_PROFILE_LOW_LATENCY "]"
 
 #define SFC_KVARG_DATAPATH_EFX		"efx"
+#define SFC_KVARG_DATAPATH_EF10		"ef10"
 
 #define SFC_KVARG_RX_DATAPATH		"rx_datapath"
 #define SFC_KVARG_VALUES_RX_DATAPATH \
-	"[" SFC_KVARG_DATAPATH_EFX "]"
+	"[" SFC_KVARG_DATAPATH_EFX "|" \
+	    SFC_KVARG_DATAPATH_EF10 "]"
 
 struct sfc_adapter;
 
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 0095afd..2dda5c7 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -904,6 +904,11 @@ struct sfc_dp_rx sfc_efx_rx = {
 #endif
 
 	args.rxq_entries = rxq_info->entries;
+	args.rxq_hw_ring = rxq->mem.esm_base;
+	args.evq_entries = rxq_info->entries;
+	args.evq_hw_ring = evq->mem.esm_base;
+	args.hw_index = rxq->hw_index;
+	args.mem_bar = sa->mem_bar.esb_base;
 
 	rc = sa->dp_rx->qcreate(rxq, sfc_rx_dp_exception, socket_id, &args,
 				&rxq->dp);
-- 
1.8.2.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH 07/13] net/sfc: factor out libefx-based Tx datapath
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (5 preceding siblings ...)
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 06/13] net/sfc: implement EF10 native Rx datapath Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 08/13] net/sfc: VLAN insertion is a datapath dependent feature Andrew Rybchenko
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Split control and datapath to make datapath substitutable and
possibly reusable with alternative control path.

The libefx-based Tx datapath is bound to the libefx control path,
but other datapaths should remain usable with alternative
control path(s).

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst  |   8 ++
 drivers/net/sfc/sfc.h        |   1 +
 drivers/net/sfc/sfc_dp.c     |   4 +-
 drivers/net/sfc/sfc_dp.h     |   1 +
 drivers/net/sfc/sfc_dp_tx.h  | 155 ++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c |  41 +++++-
 drivers/net/sfc/sfc_ev.c     |  50 ++++++--
 drivers/net/sfc/sfc_ev.h     |   8 +-
 drivers/net/sfc/sfc_kvargs.c |   1 +
 drivers/net/sfc/sfc_kvargs.h |   4 +
 drivers/net/sfc/sfc_tso.c    |  22 ++--
 drivers/net/sfc/sfc_tx.c     | 298 ++++++++++++++++++++++++++++++++-----------
 drivers/net/sfc/sfc_tx.h     |  95 +++++++++-----
 13 files changed, 557 insertions(+), 131 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_dp_tx.h

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 0aa6740..e864ccc 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -191,6 +191,14 @@ boolean parameters value.
   more efficient than libefx-based and provides richer packet type
   classification, but lacks Rx scatter support.
 
+- ``tx_datapath`` [auto|efx] (default **auto**)
+
+  Choose transmit datapath implementation.
+  **auto** allows the driver itself to make a choice based on firmware
+  features available and required by the datapath implementation.
+  **efx** chooses libefx-based datapath which supports VLAN insertion
+  (full-feature firmware variant only), TSO and multi-segment mbufs.
+
 - ``perf_profile`` [auto|throughput|low-latency] (default **throughput**)
 
   Choose hardware tuning to be optimized for either throughput or
diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 2512f2e..893eafe 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -212,6 +212,7 @@ struct sfc_adapter {
 #endif
 
 	const struct sfc_dp_rx		*dp_rx;
+	const struct sfc_dp_tx		*dp_tx;
 };
 
 /*
diff --git a/drivers/net/sfc/sfc_dp.c b/drivers/net/sfc/sfc_dp.c
index c4d5fb3..47bd500 100644
--- a/drivers/net/sfc/sfc_dp.c
+++ b/drivers/net/sfc/sfc_dp.c
@@ -76,7 +76,9 @@ struct sfc_dp *
 	if (sfc_dp_find_by_name(head, entry->type, entry->name) != NULL) {
 		rte_log(RTE_LOG_ERR, RTE_LOGTYPE_PMD,
 			"sfc %s datapath '%s' already registered\n",
-			entry->type == SFC_DP_RX ? "Rx" : "unknown",
+			entry->type == SFC_DP_RX ? "Rx" :
+			entry->type == SFC_DP_TX ? "Tx" :
+			"unknown",
 			entry->name);
 		return EEXIST;
 	}
diff --git a/drivers/net/sfc/sfc_dp.h b/drivers/net/sfc/sfc_dp.h
index 39e1e70..d3e7007 100644
--- a/drivers/net/sfc/sfc_dp.h
+++ b/drivers/net/sfc/sfc_dp.h
@@ -52,6 +52,7 @@
 
 enum sfc_dp_type {
 	SFC_DP_RX = 0,	/**< Receive datapath */
+	SFC_DP_TX,	/**< Transmit datapath */
 };
 
 /** Datapath definition */
diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
new file mode 100644
index 0000000..4879db5
--- /dev/null
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -0,0 +1,155 @@
+/*-
+ * Copyright (c) 2016 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _SFC_DP_TX_H
+#define _SFC_DP_TX_H
+
+#include <rte_ethdev.h>
+
+#include "sfc_dp.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct sfc_dp_txq;
+
+/**
+ * Callback to get control path transmit queue by datapath transmit queue
+ * handle.
+ */
+typedef void * (sfc_dp_txq_get_ctrl_t)(struct sfc_dp_txq *dp_txq);
+
+/** Datapath transmit queue operations */
+struct sfc_dp_txq_ops {
+	sfc_dp_txq_get_ctrl_t		*get_ctrl;
+};
+
+/**
+ * Generic transmit queue information used on data path.
+ * It must be kept as small as it is possible since it is built into
+ * the structure used on datapath.
+ */
+struct sfc_dp_txq {
+	const struct sfc_dp_txq_ops	*ops;
+};
+
+/**
+ * Datapath transmit queue creation arguments.
+ *
+ * The structure is used just to pass information from control path to
+ * datapath. It could be just function arguments, but it would be hardly
+ * readable.
+ */
+struct sfc_dp_tx_qcreate_args {
+	/** Minimum number of unused Tx descriptors to do reap */
+	unsigned int		free_thresh;
+	/** Transmit queue configuration flags */
+	unsigned int		flags;
+	/** Tx queue size */
+	unsigned int		txq_entries;
+};
+
+/**
+ * Allocate and initialize datapath transmit queue.
+ *
+ * @param ctrl		Control path Tx queue opaque handle
+ * @param exception	Datapath exception handler to bail out to control path
+ * @param socket_id	Socket ID to allocate memory
+ * @param args		Tx queue details wrapped in structure
+ * @param dp_txqp	Location for generic datapath transmit queue pointer
+ *
+ * @return 0 or positive errno.
+ */
+typedef int (sfc_dp_tx_qcreate_t)(void *ctrl, sfc_dp_exception_t *exception,
+				  int socket_id,
+				  const struct sfc_dp_tx_qcreate_args *args,
+				  struct sfc_dp_txq **dp_txqp);
+
+/**
+ * Free resources allocated for datapath transmit queue.
+ */
+typedef void (sfc_dp_tx_qdestroy_t)(struct sfc_dp_txq *dp_txq);
+
+/**
+ * Transmit queue start callback.
+ *
+ * It hands over the EvQ to the datapath.
+ */
+typedef int (sfc_dp_tx_qstart_t)(struct sfc_dp_txq *dp_txq,
+				 unsigned int evq_read_ptr,
+				 unsigned int txq_desc_index);
+
+/**
+ * Transmit queue stop function called before the queue flush.
+ *
+ * It returns the EvQ to the control path.
+ */
+typedef void (sfc_dp_tx_qstop_t)(struct sfc_dp_txq *dp_txq,
+				 unsigned int *evq_read_ptr);
+
+/**
+ * Transmit queue function called after the queue flush.
+ */
+typedef void (sfc_dp_tx_qreap_t)(struct sfc_dp_txq *dp_txq);
+
+/** Transmit datapath definition */
+struct sfc_dp_tx {
+	struct sfc_dp			dp;
+
+	sfc_dp_tx_qcreate_t		*qcreate;
+	sfc_dp_tx_qdestroy_t		*qdestroy;
+	sfc_dp_tx_qstart_t		*qstart;
+	sfc_dp_tx_qstop_t		*qstop;
+	sfc_dp_tx_qreap_t		*qreap;
+	eth_tx_burst_t			pkt_burst;
+};
+
+static inline struct sfc_dp_tx *
+sfc_dp_find_tx_by_name(struct sfc_dp_list *head, const char *name)
+{
+	struct sfc_dp *p = sfc_dp_find_by_name(head, SFC_DP_TX, name);
+
+	return (p == NULL) ? NULL : container_of(p, struct sfc_dp_tx, dp);
+}
+
+static inline struct sfc_dp_tx *
+sfc_dp_find_tx_by_caps(struct sfc_dp_list *head, unsigned int avail_caps)
+{
+	struct sfc_dp *p = sfc_dp_find_by_caps(head, SFC_DP_TX, avail_caps);
+
+	return (p == NULL) ? NULL : container_of(p, struct sfc_dp_tx, dp);
+}
+
+extern struct sfc_dp_tx sfc_efx_tx;
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* _SFC_DP_TX_H */
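[Editor's illustration, not part of the patch] The create-args handoff documented in the header above (control path packs queue parameters into a struct and passes it to the datapath's qcreate callback, which returns a positive errno on failure) can be reduced to the following standalone sketch; all names are shortened, made-up stand-ins for the structures in this patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Trimmed stand-in for struct sfc_dp_tx_qcreate_args */
struct dp_tx_qcreate_args {
	unsigned int free_thresh;
	unsigned int flags;
	unsigned int txq_entries;	/* power of two */
};

/* Trimmed stand-in for a datapath Tx queue */
struct dp_txq {
	unsigned int ptr_mask;
	unsigned int free_thresh;
};

/* Sketch of a qcreate callback: returns 0 or a positive errno,
 * as the @return contract above specifies. */
static int
dp_tx_qcreate(const struct dp_tx_qcreate_args *args, struct dp_txq **dp_txqp)
{
	struct dp_txq *txq = calloc(1, sizeof(*txq));

	if (txq == NULL)
		return ENOMEM;

	/* txq_entries is a power of two, so entries - 1 is the ring mask */
	txq->ptr_mask = args->txq_entries - 1;
	txq->free_thresh = args->free_thresh;
	*dp_txqp = txq;
	return 0;
}
```

A control path would fill the args struct once and hand it over, which keeps the callback signature stable when new parameters are added.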
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index f6562f9..c6db730 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -416,7 +416,7 @@
 	if (rc != 0)
 		goto fail_tx_qinit;
 
-	dev->data->tx_queues[tx_queue_id] = sa->txq_info[tx_queue_id].txq;
+	dev->data->tx_queues[tx_queue_id] = sa->txq_info[tx_queue_id].txq->dp;
 
 	sfc_adapter_unlock(sa);
 	return 0;
@@ -1232,6 +1232,7 @@
 	struct sfc_adapter *sa = dev->data->dev_private;
 	unsigned int avail_caps = 0;
 	const char *rx_name = NULL;
+	const char *tx_name = NULL;
 	int rc;
 
 	if (sa == NULL || sa->state == SFC_ADAPTER_UNINITIALIZED)
@@ -1279,12 +1280,45 @@
 
 	dev->rx_pkt_burst = sa->dp_rx->pkt_burst;
 
-	dev->tx_pkt_burst = sfc_xmit_pkts;
+	rc = sfc_kvargs_process(sa, SFC_KVARG_TX_DATAPATH,
+				sfc_kvarg_string_handler, &tx_name);
+	if (rc != 0)
+		goto fail_kvarg_tx_datapath;
+
+	if (tx_name != NULL) {
+		sa->dp_tx = sfc_dp_find_tx_by_name(&sfc_dp_head, tx_name);
+		if (sa->dp_tx == NULL) {
+			sfc_err(sa, "Tx datapath %s not found", tx_name);
+			rc = ENOENT;
+			goto fail_dp_tx;
+		}
+		if (!sfc_dp_match_hw_fw_caps(&sa->dp_tx->dp, avail_caps)) {
+			sfc_err(sa,
+				"Insufficient HW/FW capabilities to use Tx datapath %s",
+				tx_name);
+			rc = EINVAL;
+			goto fail_dp_tx;
+		}
+	} else {
+		sa->dp_tx = sfc_dp_find_tx_by_caps(&sfc_dp_head, avail_caps);
+		if (sa->dp_tx == NULL) {
+			sfc_err(sa, "Tx datapath by caps %#x not found",
+				avail_caps);
+			rc = ENOENT;
+			goto fail_dp_tx;
+		}
+	}
+
+	sfc_info(sa, "use %s Tx datapath", sa->dp_tx->dp.name);
+
+	dev->tx_pkt_burst = sa->dp_tx->pkt_burst;
 
 	dev->dev_ops = &sfc_eth_dev_ops;
 
 	return 0;
 
+fail_dp_tx:
+fail_kvarg_tx_datapath:
 fail_dp_rx:
 fail_kvarg_rx_datapath:
 	return rc;
@@ -1298,6 +1332,8 @@
 		/* Prefer EF10 datapath */
 		sfc_dp_register(&sfc_dp_head, &sfc_ef10_rx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_efx_rx.dp);
+
+		sfc_dp_register(&sfc_dp_head, &sfc_efx_tx.dp);
 	}
 }
 
@@ -1431,6 +1467,7 @@
 RTE_PMD_REGISTER_KMOD_DEP(net_sfc_efx, "* igb_uio | uio_pci_generic | vfio");
 RTE_PMD_REGISTER_PARAM_STRING(net_sfc_efx,
 	SFC_KVARG_RX_DATAPATH "=" SFC_KVARG_VALUES_RX_DATAPATH " "
+	SFC_KVARG_TX_DATAPATH "=" SFC_KVARG_VALUES_TX_DATAPATH " "
 	SFC_KVARG_PERF_PROFILE "=" SFC_KVARG_VALUES_PERF_PROFILE " "
 	SFC_KVARG_MCDI_LOGGING "=" SFC_KVARG_VALUES_BOOL " "
 	SFC_KVARG_DEBUG_INIT "=" SFC_KVARG_VALUES_BOOL);
diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index 64694f5..04e923f 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -184,16 +184,18 @@
 sfc_ev_tx(void *arg, __rte_unused uint32_t label, uint32_t id)
 {
 	struct sfc_evq *evq = arg;
-	struct sfc_txq *txq;
+	struct sfc_dp_txq *dp_txq;
+	struct sfc_efx_txq *txq;
 	unsigned int stop;
 	unsigned int delta;
 
-	txq = evq->txq;
+	dp_txq = evq->dp_txq;
+	SFC_ASSERT(dp_txq != NULL);
 
-	SFC_ASSERT(txq != NULL);
+	txq = sfc_efx_txq_by_dp_txq(dp_txq);
 	SFC_ASSERT(txq->evq == evq);
 
-	if (unlikely((txq->state & SFC_TXQ_STARTED) == 0))
+	if (unlikely((txq->flags & SFC_EFX_TXQ_FLAG_STARTED) == 0))
 		goto done;
 
 	stop = (id + 1) & txq->ptr_mask;
@@ -306,9 +308,13 @@
 sfc_ev_txq_flush_done(void *arg, __rte_unused uint32_t txq_hw_index)
 {
 	struct sfc_evq *evq = arg;
+	struct sfc_dp_txq *dp_txq;
 	struct sfc_txq *txq;
 
-	txq = evq->txq;
+	dp_txq = evq->dp_txq;
+	SFC_ASSERT(dp_txq != NULL);
+
+	txq = dp_txq->ops->get_ctrl(dp_txq);
 	SFC_ASSERT(txq != NULL);
 	SFC_ASSERT(txq->hw_index == txq_hw_index);
 	SFC_ASSERT(txq->evq == evq);
@@ -442,7 +448,7 @@
 	.eec_link_change	= sfc_ev_nop_link_change,
 };
 
-static const efx_ev_callbacks_t sfc_ev_callbacks_tx = {
+static const efx_ev_callbacks_t sfc_ev_callbacks_efx_tx = {
 	.eec_initialized	= sfc_ev_initialized,
 	.eec_rx			= sfc_ev_nop_rx,
 	.eec_tx			= sfc_ev_tx,
@@ -457,6 +463,21 @@
 	.eec_link_change	= sfc_ev_nop_link_change,
 };
 
+static const efx_ev_callbacks_t sfc_ev_callbacks_dp_tx = {
+	.eec_initialized	= sfc_ev_initialized,
+	.eec_rx			= sfc_ev_nop_rx,
+	.eec_tx			= sfc_ev_nop_tx,
+	.eec_exception		= sfc_ev_exception,
+	.eec_rxq_flush_done	= sfc_ev_nop_rxq_flush_done,
+	.eec_rxq_flush_failed	= sfc_ev_nop_rxq_flush_failed,
+	.eec_txq_flush_done	= sfc_ev_txq_flush_done,
+	.eec_software		= sfc_ev_software,
+	.eec_sram		= sfc_ev_sram,
+	.eec_wake_up		= sfc_ev_wake_up,
+	.eec_timer		= sfc_ev_timer,
+	.eec_link_change	= sfc_ev_nop_link_change,
+};
+
 
 void
 sfc_ev_qpoll(struct sfc_evq *evq)
@@ -490,8 +511,12 @@
 					rxq_sw_index);
 		}
 
-		if (evq->txq != NULL) {
-			unsigned int txq_sw_index = sfc_txq_sw_index(evq->txq);
+		if (evq->dp_txq != NULL) {
+			struct sfc_txq *txq;
+			unsigned int txq_sw_index;
+
+			txq = evq->dp_txq->ops->get_ctrl(evq->dp_txq);
+			txq_sw_index = sfc_txq_sw_index(txq);
 
 			sfc_warn(sa,
 				 "restart TxQ %u because of exception on its EvQ %u",
@@ -561,14 +586,17 @@
 	if (rc != 0)
 		goto fail_ev_qcreate;
 
-	SFC_ASSERT(evq->dp_rxq == NULL || evq->txq == NULL);
+	SFC_ASSERT(evq->dp_rxq == NULL || evq->dp_txq == NULL);
 	if (evq->dp_rxq != 0) {
 		if (strcmp(sa->dp_rx->dp.name, SFC_KVARG_DATAPATH_EFX) == 0)
 			evq->callbacks = &sfc_ev_callbacks_efx_rx;
 		else
 			evq->callbacks = &sfc_ev_callbacks_dp_rx;
-	} else if (evq->txq != 0) {
-		evq->callbacks = &sfc_ev_callbacks_tx;
+	} else if (evq->dp_txq != 0) {
+		if (strcmp(sa->dp_tx->dp.name, SFC_KVARG_DATAPATH_EFX) == 0)
+			evq->callbacks = &sfc_ev_callbacks_efx_tx;
+		else
+			evq->callbacks = &sfc_ev_callbacks_dp_tx;
 	} else {
 		evq->callbacks = &sfc_ev_callbacks;
 	}
diff --git a/drivers/net/sfc/sfc_ev.h b/drivers/net/sfc/sfc_ev.h
index e99cd74..f7434a3 100644
--- a/drivers/net/sfc/sfc_ev.h
+++ b/drivers/net/sfc/sfc_ev.h
@@ -30,8 +30,12 @@
 #ifndef _SFC_EV_H_
 #define _SFC_EV_H_
 
+#include <rte_ethdev.h>
+
 #include "efx.h"
 
+#include "sfc.h"
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -41,7 +45,7 @@
 
 struct sfc_adapter;
 struct sfc_dp_rxq;
-struct sfc_txq;
+struct sfc_dp_txq;
 
 enum sfc_evq_state {
 	SFC_EVQ_UNINITIALIZED = 0,
@@ -60,7 +64,7 @@ struct sfc_evq {
 	boolean_t			exception;
 	efsys_mem_t			mem;
 	struct sfc_dp_rxq		*dp_rxq;
-	struct sfc_txq			*txq;
+	struct sfc_dp_txq		*dp_txq;
 
 	/* Not used on datapath */
 	struct sfc_adapter		*sa;
diff --git a/drivers/net/sfc/sfc_kvargs.c b/drivers/net/sfc/sfc_kvargs.c
index d8529fa..f23bc1a 100644
--- a/drivers/net/sfc/sfc_kvargs.c
+++ b/drivers/net/sfc/sfc_kvargs.c
@@ -46,6 +46,7 @@
 		SFC_KVARG_MCDI_LOGGING,
 		SFC_KVARG_PERF_PROFILE,
 		SFC_KVARG_RX_DATAPATH,
+		SFC_KVARG_TX_DATAPATH,
 		NULL,
 	};
 
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index b57439b..14f46db 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -60,6 +60,10 @@
 	"[" SFC_KVARG_DATAPATH_EFX "|" \
 	    SFC_KVARG_DATAPATH_EF10 "]"
 
+#define SFC_KVARG_TX_DATAPATH		"tx_datapath"
+#define SFC_KVARG_VALUES_TX_DATAPATH \
+	"[" SFC_KVARG_DATAPATH_EFX "]"
+
 struct sfc_adapter;
 
 int sfc_kvargs_parse(struct sfc_adapter *sa);
diff --git a/drivers/net/sfc/sfc_tso.c b/drivers/net/sfc/sfc_tso.c
index 68d84c9..317c805 100644
--- a/drivers/net/sfc/sfc_tso.c
+++ b/drivers/net/sfc/sfc_tso.c
@@ -42,13 +42,13 @@
 #define SFC_TSO_OPDESCS_IDX_SHIFT	2
 
 int
-sfc_tso_alloc_tsoh_objs(struct sfc_tx_sw_desc *sw_ring,
-			unsigned int txq_entries, unsigned int socket_id)
+sfc_efx_tso_alloc_tsoh_objs(struct sfc_efx_tx_sw_desc *sw_ring,
+			    unsigned int txq_entries, unsigned int socket_id)
 {
 	unsigned int i;
 
 	for (i = 0; i < txq_entries; ++i) {
-		sw_ring[i].tsoh = rte_malloc_socket("sfc-txq-tsoh-obj",
+		sw_ring[i].tsoh = rte_malloc_socket("sfc-efx-txq-tsoh-obj",
 						    SFC_TSOH_STD_LEN,
 						    SFC_TX_SEG_BOUNDARY,
 						    socket_id);
@@ -66,7 +66,8 @@
 }
 
 void
-sfc_tso_free_tsoh_objs(struct sfc_tx_sw_desc *sw_ring, unsigned int txq_entries)
+sfc_efx_tso_free_tsoh_objs(struct sfc_efx_tx_sw_desc *sw_ring,
+			   unsigned int txq_entries)
 {
 	unsigned int i;
 
@@ -77,8 +78,8 @@
 }
 
 static void
-sfc_tso_prepare_header(struct sfc_txq *txq, struct rte_mbuf **in_seg,
-		       size_t *in_off, unsigned int idx, size_t bytes_left)
+sfc_efx_tso_prepare_header(struct sfc_efx_txq *txq, struct rte_mbuf **in_seg,
+			   size_t *in_off, unsigned int idx, size_t bytes_left)
 {
 	struct rte_mbuf *m = *in_seg;
 	size_t bytes_to_copy = 0;
@@ -109,9 +110,9 @@
 }
 
 int
-sfc_tso_do(struct sfc_txq *txq, unsigned int idx, struct rte_mbuf **in_seg,
-	   size_t *in_off, efx_desc_t **pend, unsigned int *pkt_descs,
-	   size_t *pkt_len)
+sfc_efx_tso_do(struct sfc_efx_txq *txq, unsigned int idx,
+	       struct rte_mbuf **in_seg, size_t *in_off, efx_desc_t **pend,
+	       unsigned int *pkt_descs, size_t *pkt_len)
 {
 	uint8_t *tsoh;
 	const struct tcp_hdr *th;
@@ -151,7 +152,8 @@
 	 */
 	if ((m->data_len < header_len) ||
 	    ((paddr_next_frag - header_paddr) < header_len)) {
-		sfc_tso_prepare_header(txq, in_seg, in_off, idx, header_len);
+		sfc_efx_tso_prepare_header(txq, in_seg, in_off, idx,
+					   header_len);
 		tsoh = txq->sw_ring[idx & txq->ptr_mask].tsoh;
 
 		header_paddr = rte_malloc_virt2phy((void *)tsoh);
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index 5a6282c..31bccac 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -33,6 +33,7 @@
 #include "sfc_ev.h"
 #include "sfc_tx.h"
 #include "sfc_tweak.h"
+#include "sfc_kvargs.h"
 
 /*
  * Maximum number of TX queue flush attempts in case of
@@ -109,27 +110,29 @@
 	txq->state &= ~SFC_TXQ_FLUSHING;
 }
 
+static sfc_dp_exception_t sfc_tx_dp_exception;
 static void
-sfc_tx_reap(struct sfc_txq *txq)
+sfc_tx_dp_exception(void *ctrl)
 {
-	unsigned int    completed;
+	struct sfc_txq *txq = ctrl;
+	struct sfc_adapter *sa = txq->evq->sa;
+	unsigned int txq_sw_index;
+	int rc;
 
+	if (!sfc_adapter_trylock(sa))
+		return;
 
-	sfc_ev_qpoll(txq->evq);
-
-	for (completed = txq->completed;
-	     completed != txq->pending; completed++) {
-		struct sfc_tx_sw_desc *txd;
+	txq_sw_index = sfc_txq_sw_index(txq);
 
-		txd = &txq->sw_ring[completed & txq->ptr_mask];
+	sfc_warn(sa, "restart TxQ %u because of datapath exception",
+		 txq_sw_index);
 
-		if (txd->mbuf != NULL) {
-			rte_pktmbuf_free(txd->mbuf);
-			txd->mbuf = NULL;
-		}
-	}
+	sfc_tx_qstop(sa, txq_sw_index);
+	rc = sfc_tx_qstart(sa, txq_sw_index);
+	if (rc != 0)
+		sfc_err(sa, "cannot restart TxQ %u", txq_sw_index);
 
-	txq->completed = completed;
+	sfc_adapter_unlock(sa);
 }
 
 int
@@ -142,6 +145,7 @@
 	struct sfc_txq *txq;
 	unsigned int evq_index = sfc_evq_index_by_txq_sw_index(sa, sw_index);
 	int rc = 0;
+	struct sfc_dp_tx_qcreate_args args;
 
 	sfc_log_init(sa, "TxQ = %u", sw_index);
 
@@ -166,56 +170,43 @@
 	if (txq == NULL)
 		goto fail_txq_alloc;
 
+	txq_info->txq = txq;
+
+	txq->hw_index = sw_index;
+	txq->evq = evq;
+	txq->free_thresh =
+		(tx_conf->tx_free_thresh) ? tx_conf->tx_free_thresh :
+		SFC_TX_DEFAULT_FREE_THRESH;
+	txq->flags = tx_conf->txq_flags;
+
 	rc = sfc_dma_alloc(sa, "txq", sw_index, EFX_TXQ_SIZE(txq_info->entries),
 			   socket_id, &txq->mem);
 	if (rc != 0)
 		goto fail_dma_alloc;
 
-	rc = ENOMEM;
-	txq->pend_desc = rte_calloc_socket("sfc-txq-pend-desc",
-					   EFX_TXQ_LIMIT(txq_info->entries),
-					   sizeof(efx_desc_t), 0, socket_id);
-	if (txq->pend_desc == NULL)
-		goto fail_pend_desc_alloc;
+	memset(&args, 0, sizeof(args));
+	args.free_thresh = txq->free_thresh;
+	args.flags = tx_conf->txq_flags;
+	args.txq_entries = txq_info->entries;
 
-	rc = ENOMEM;
-	txq->sw_ring = rte_calloc_socket("sfc-txq-desc", txq_info->entries,
-					 sizeof(*txq->sw_ring), 0, socket_id);
-	if (txq->sw_ring == NULL)
-		goto fail_desc_alloc;
+	rc = sa->dp_tx->qcreate(txq, sfc_tx_dp_exception, socket_id, &args,
+				&txq->dp);
+	if (rc != 0)
+		goto fail_dp_tx_qinit;
 
-	if (sa->tso) {
-		rc = sfc_tso_alloc_tsoh_objs(txq->sw_ring, txq_info->entries,
-					     socket_id);
-		if (rc != 0)
-			goto fail_alloc_tsoh_objs;
-	}
+	evq->dp_txq = txq->dp;
 
 	txq->state = SFC_TXQ_INITIALIZED;
-	txq->ptr_mask = txq_info->entries - 1;
-	txq->free_thresh = (tx_conf->tx_free_thresh) ? tx_conf->tx_free_thresh :
-						     SFC_TX_DEFAULT_FREE_THRESH;
-	txq->hw_index = sw_index;
-	txq->flags = tx_conf->txq_flags;
-	txq->evq = evq;
-
-	evq->txq = txq;
 
-	txq_info->txq = txq;
 	txq_info->deferred_start = (tx_conf->tx_deferred_start != 0);
 
 	return 0;
 
-fail_alloc_tsoh_objs:
-	rte_free(txq->sw_ring);
-
-fail_desc_alloc:
-	rte_free(txq->pend_desc);
-
-fail_pend_desc_alloc:
+fail_dp_tx_qinit:
 	sfc_dma_free(sa, &txq->mem);
 
 fail_dma_alloc:
+	txq_info->txq = NULL;
 	rte_free(txq);
 
 fail_txq_alloc:
@@ -244,13 +235,12 @@
 	SFC_ASSERT(txq != NULL);
 	SFC_ASSERT(txq->state == SFC_TXQ_INITIALIZED);
 
-	sfc_tso_free_tsoh_objs(txq->sw_ring, txq_info->entries);
+	sa->dp_tx->qdestroy(txq->dp);
+	txq->dp = NULL;
 
 	txq_info->txq = NULL;
 	txq_info->entries = 0;
 
-	rte_free(txq->sw_ring);
-	rte_free(txq->pend_desc);
 	sfc_dma_free(sa, &txq->mem);
 	rte_free(txq);
 }
@@ -405,12 +395,13 @@
 		goto fail_tx_qcreate;
 	}
 
-	txq->added = txq->pending = txq->completed = desc_index;
-	txq->hw_vlan_tci = 0;
-
 	efx_tx_qenable(txq->common);
 
-	txq->state |= (SFC_TXQ_STARTED | SFC_TXQ_RUNNING);
+	txq->state |= SFC_TXQ_STARTED;
+
+	rc = sa->dp_tx->qstart(txq->dp, evq->read_ptr, desc_index);
+	if (rc != 0)
+		goto fail_dp_qstart;
 
 	/*
 	 * It seems to be used by DPDK for debug purposes only ('rte_ether')
@@ -420,6 +411,10 @@
 
 	return 0;
 
+fail_dp_qstart:
+	txq->state = SFC_TXQ_INITIALIZED;
+	efx_tx_qdestroy(txq->common);
+
 fail_tx_qcreate:
 	sfc_ev_qstop(sa, evq->evq_index);
 
@@ -435,7 +430,6 @@
 	struct sfc_txq *txq;
 	unsigned int retry_count;
 	unsigned int wait_count;
-	unsigned int txds;
 
 	sfc_log_init(sa, "TxQ = %u", sw_index);
 
@@ -449,7 +443,7 @@
 
 	SFC_ASSERT(txq->state & SFC_TXQ_STARTED);
 
-	txq->state &= ~SFC_TXQ_RUNNING;
+	sa->dp_tx->qstop(txq->dp, &txq->evq->read_ptr);
 
 	/*
 	 * Retry TX queue flushing in case of flush failed or
@@ -484,14 +478,7 @@
 			sfc_info(sa, "TxQ %u flushed", sw_index);
 	}
 
-	sfc_tx_reap(txq);
-
-	for (txds = 0; txds < txq_info->entries; txds++) {
-		if (txq->sw_ring[txds].mbuf != NULL) {
-			rte_pktmbuf_free(txq->sw_ring[txds].mbuf);
-			txq->sw_ring[txds].mbuf = NULL;
-		}
-	}
+	sa->dp_tx->qreap(txq->dp);
 
 	txq->state = SFC_TXQ_INITIALIZED;
 
@@ -563,6 +550,28 @@
 	efx_tx_fini(sa->nic);
 }
 
+static void
+sfc_efx_tx_reap(struct sfc_efx_txq *txq)
+{
+	unsigned int completed;
+
+	sfc_ev_qpoll(txq->evq);
+
+	for (completed = txq->completed;
+	     completed != txq->pending; completed++) {
+		struct sfc_efx_tx_sw_desc *txd;
+
+		txd = &txq->sw_ring[completed & txq->ptr_mask];
+
+		if (txd->mbuf != NULL) {
+			rte_pktmbuf_free(txd->mbuf);
+			txd->mbuf = NULL;
+		}
+	}
+
+	txq->completed = completed;
+}
+
 /*
  * The function is used to insert or update VLAN tag;
  * the firmware has state of the firmware tag to insert per TxQ
@@ -571,8 +580,8 @@
  * the function will update it
  */
 static unsigned int
-sfc_tx_maybe_insert_tag(struct sfc_txq *txq, struct rte_mbuf *m,
-			efx_desc_t **pend)
+sfc_efx_tx_maybe_insert_tag(struct sfc_efx_txq *txq, struct rte_mbuf *m,
+			    efx_desc_t **pend)
 {
 	uint16_t this_tag = ((m->ol_flags & PKT_TX_VLAN_PKT) ?
 			     m->vlan_tci : 0);
@@ -594,10 +603,11 @@
 	return 1;
 }
 
-uint16_t
-sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+static uint16_t
+sfc_efx_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
-	struct sfc_txq *txq = (struct sfc_txq *)tx_queue;
+	struct sfc_dp_txq *dp_txq = (struct sfc_dp_txq *)tx_queue;
+	struct sfc_efx_txq *txq = sfc_efx_txq_by_dp_txq(dp_txq);
 	unsigned int added = txq->added;
 	unsigned int pushed = added;
 	unsigned int pkts_sent = 0;
@@ -609,7 +619,7 @@
 	int rc __rte_unused;
 	struct rte_mbuf **pktp;
 
-	if (unlikely((txq->state & SFC_TXQ_RUNNING) == 0))
+	if (unlikely((txq->flags & SFC_EFX_TXQ_FLAG_RUNNING) == 0))
 		goto done;
 
 	/*
@@ -620,7 +630,7 @@
 	reap_done = (fill_level > soft_max_fill);
 
 	if (reap_done) {
-		sfc_tx_reap(txq);
+		sfc_efx_tx_reap(txq);
 		/*
 		 * Recalculate fill level since 'txq->completed'
 		 * might have changed on reap
@@ -643,15 +653,16 @@
 		 * DEV_TX_VLAN_OFFLOAD and pushes VLAN TCI, then
 		 * TX_ERROR will occur
 		 */
-		pkt_descs += sfc_tx_maybe_insert_tag(txq, m_seg, &pend);
+		pkt_descs += sfc_efx_tx_maybe_insert_tag(txq, m_seg, &pend);
 
+#ifdef RTE_LIBRTE_SFC_EFX_TSO
 		if (m_seg->ol_flags & PKT_TX_TCP_SEG) {
 			/*
 			 * We expect correct 'pkt->l[2, 3, 4]_len' values
 			 * to be set correctly by the caller
 			 */
-			if (sfc_tso_do(txq, added, &m_seg, &in_off, &pend,
-				       &pkt_descs, &pkt_len) != 0) {
+			if (sfc_efx_tso_do(txq, added, &m_seg, &in_off, &pend,
+					   &pkt_descs, &pkt_len) != 0) {
 				/* We may have reached this place for
 				 * one of the following reasons:
 				 *
@@ -682,6 +693,7 @@
 			 * as for the usual non-TSO path
 			 */
 		}
+#endif /* RTE_LIBRTE_SFC_EFX_TSO */
 
 		for (; m_seg != NULL; m_seg = m_seg->next) {
 			efsys_dma_addr_t	next_frag;
@@ -729,7 +741,7 @@
 			 * Try to reap (if we haven't yet).
 			 */
 			if (!reap_done) {
-				sfc_tx_reap(txq);
+				sfc_efx_tx_reap(txq);
 				reap_done = B_TRUE;
 				fill_level = added - txq->completed;
 				if (fill_level > hard_max_fill) {
@@ -758,9 +770,143 @@
 
 #if SFC_TX_XMIT_PKTS_REAP_AT_LEAST_ONCE
 	if (!reap_done)
-		sfc_tx_reap(txq);
+		sfc_efx_tx_reap(txq);
 #endif
 
 done:
 	return pkts_sent;
 }
+
+static sfc_dp_tx_qcreate_t sfc_efx_tx_qcreate;
+static int
+sfc_efx_tx_qcreate(void *ctrl, __rte_unused sfc_dp_exception_t *exception,
+		   int socket_id, const struct sfc_dp_tx_qcreate_args *args,
+		   struct sfc_dp_txq **dp_txqp)
+{
+	struct sfc_txq *ctrl_txq = ctrl;
+	struct sfc_efx_txq *txq;
+	int rc;
+
+	rc = ENOMEM;
+	txq = rte_zmalloc_socket("sfc-efx-txq", sizeof(*txq),
+				 RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq == NULL)
+		goto fail_txq_alloc;
+
+	rc = ENOMEM;
+	txq->pend_desc = rte_calloc_socket("sfc-efx-txq-pend-desc",
+					   EFX_TXQ_LIMIT(args->txq_entries),
+					   sizeof(*txq->pend_desc), 0,
+					   socket_id);
+	if (txq->pend_desc == NULL)
+		goto fail_pend_desc_alloc;
+
+	rc = ENOMEM;
+	txq->sw_ring = rte_calloc_socket("sfc-efx-txq-sw_ring",
+					 args->txq_entries,
+					 sizeof(*txq->sw_ring),
+					 RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq->sw_ring == NULL)
+		goto fail_sw_ring_alloc;
+
+	if (ctrl_txq->evq->sa->tso) {
+		rc = sfc_efx_tso_alloc_tsoh_objs(txq->sw_ring,
+						 args->txq_entries, socket_id);
+		if (rc != 0)
+			goto fail_alloc_tsoh_objs;
+	}
+
+	txq->ctrl = ctrl_txq;
+	txq->evq = ctrl_txq->evq;
+	txq->ptr_mask = args->txq_entries - 1;
+	txq->free_thresh = args->free_thresh;
+
+	*dp_txqp = &txq->dp;
+	return 0;
+
+fail_alloc_tsoh_objs:
+	rte_free(txq->sw_ring);
+
+fail_sw_ring_alloc:
+	rte_free(txq->pend_desc);
+
+fail_pend_desc_alloc:
+	rte_free(txq);
+
+fail_txq_alloc:
+	return rc;
+}
+
+static sfc_dp_tx_qdestroy_t sfc_efx_tx_qdestroy;
+static void
+sfc_efx_tx_qdestroy(struct sfc_dp_txq *dp_txq)
+{
+	struct sfc_efx_txq *txq = sfc_efx_txq_by_dp_txq(dp_txq);
+
+	sfc_efx_tso_free_tsoh_objs(txq->sw_ring, txq->ptr_mask + 1);
+	rte_free(txq->sw_ring);
+	rte_free(txq->pend_desc);
+	rte_free(txq);
+}
+
+static sfc_dp_tx_qstart_t sfc_efx_tx_qstart;
+static int
+sfc_efx_tx_qstart(struct sfc_dp_txq *dp_txq,
+		  __rte_unused unsigned int evq_read_ptr,
+		  unsigned int txq_desc_index)
+{
+	/* libefx-based datapath is specific to libefx-based PMD */
+	struct sfc_efx_txq *txq = sfc_efx_txq_by_dp_txq(dp_txq);
+
+	txq->common = txq->ctrl->common;
+
+	txq->pending = txq->completed = txq->added = txq_desc_index;
+	txq->hw_vlan_tci = 0;
+
+	txq->flags |= (SFC_EFX_TXQ_FLAG_STARTED | SFC_EFX_TXQ_FLAG_RUNNING);
+
+	return 0;
+}
+
+static sfc_dp_tx_qstop_t sfc_efx_tx_qstop;
+static void
+sfc_efx_tx_qstop(struct sfc_dp_txq *dp_txq,
+		 __rte_unused unsigned int *evq_read_ptr)
+{
+	struct sfc_efx_txq *txq = sfc_efx_txq_by_dp_txq(dp_txq);
+
+	txq->flags &= ~SFC_EFX_TXQ_FLAG_RUNNING;
+}
+
+static sfc_dp_tx_qreap_t sfc_efx_tx_qreap;
+static void
+sfc_efx_tx_qreap(struct sfc_dp_txq *dp_txq)
+{
+	struct sfc_efx_txq *txq = sfc_efx_txq_by_dp_txq(dp_txq);
+	unsigned int txds;
+
+	sfc_efx_tx_reap(txq);
+
+	for (txds = 0; txds <= txq->ptr_mask; txds++) {
+		if (txq->sw_ring[txds].mbuf != NULL) {
+			rte_pktmbuf_free(txq->sw_ring[txds].mbuf);
+			txq->sw_ring[txds].mbuf = NULL;
+		}
+	}
+
+	txq->flags &= ~SFC_EFX_TXQ_FLAG_STARTED;
+}
+
+struct sfc_dp_tx sfc_efx_tx = {
+	.dp = {
+		.name		= SFC_KVARG_DATAPATH_EFX,
+		.type		= SFC_DP_TX,
+		.hw_fw_caps	= 0,
+	},
+	.qcreate		= sfc_efx_tx_qcreate,
+	.qdestroy		= sfc_efx_tx_qdestroy,
+	.qstart			= sfc_efx_tx_qstart,
+	.qstop			= sfc_efx_tx_qstop,
+	.qreap			= sfc_efx_tx_qreap,
+	.pkt_burst		= sfc_efx_xmit_pkts,
+};
diff --git a/drivers/net/sfc/sfc_tx.h b/drivers/net/sfc/sfc_tx.h
index 39977a5..63c86a1 100644
--- a/drivers/net/sfc/sfc_tx.h
+++ b/drivers/net/sfc/sfc_tx.h
@@ -35,6 +35,8 @@
 
 #include "efx.h"
 
+#include "sfc_dp_tx.h"
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -48,7 +50,11 @@
 struct sfc_adapter;
 struct sfc_evq;
 
-struct sfc_tx_sw_desc {
+/**
+ * Software Tx descriptor information associated with hardware Tx
+ * descriptor.
+ */
+struct sfc_efx_tx_sw_desc {
 	struct rte_mbuf		*mbuf;
 	uint8_t			*tsoh;	/* Buffer to store TSO header */
 };
@@ -58,36 +64,69 @@ enum sfc_txq_state_bit {
 #define SFC_TXQ_INITIALIZED	(1 << SFC_TXQ_INITIALIZED_BIT)
 	SFC_TXQ_STARTED_BIT,
 #define SFC_TXQ_STARTED		(1 << SFC_TXQ_STARTED_BIT)
-	SFC_TXQ_RUNNING_BIT,
-#define SFC_TXQ_RUNNING		(1 << SFC_TXQ_RUNNING_BIT)
 	SFC_TXQ_FLUSHING_BIT,
 #define SFC_TXQ_FLUSHING	(1 << SFC_TXQ_FLUSHING_BIT)
 	SFC_TXQ_FLUSHED_BIT,
 #define SFC_TXQ_FLUSHED		(1 << SFC_TXQ_FLUSHED_BIT)
 };
 
+/**
+ * Transmit queue control information. Not used on datapath.
+ * Allocated on the socket specified at queue setup.
+ */
 struct sfc_txq {
-	struct sfc_evq		*evq;
-	struct sfc_tx_sw_desc	*sw_ring;
-	unsigned int		state;
-	unsigned int		ptr_mask;
-	efx_desc_t		*pend_desc;
-	efx_txq_t		*common;
-	efsys_mem_t		mem;
-	unsigned int		added;
-	unsigned int		pending;
-	unsigned int		completed;
-	unsigned int		free_thresh;
-	uint16_t		hw_vlan_tci;
-
-	unsigned int		hw_index;
-	unsigned int		flags;
+	unsigned int			state;
+	unsigned int			hw_index;
+	struct sfc_evq			*evq;
+	efsys_mem_t			mem;
+	struct sfc_dp_txq		*dp;
+	efx_txq_t			*common;
+	unsigned int			free_thresh;
+	unsigned int			flags;
 };
 
 static inline unsigned int
+sfc_txq_sw_index_by_hw_index(unsigned int hw_index)
+{
+	return hw_index;
+}
+
+static inline unsigned int
 sfc_txq_sw_index(const struct sfc_txq *txq)
 {
-	return txq->hw_index;
+	return sfc_txq_sw_index_by_hw_index(txq->hw_index);
+}
+
+/**
+ * Transmit queue information used on libefx-based data path.
+ * Allocated on the socket specified at queue setup.
+ */
+struct sfc_efx_txq {
+	struct sfc_evq			*evq;
+	struct sfc_efx_tx_sw_desc	*sw_ring;
+	unsigned int			ptr_mask;
+	efx_desc_t			*pend_desc;
+	efx_txq_t			*common;
+	unsigned int			added;
+	unsigned int			pending;
+	unsigned int			completed;
+	unsigned int			free_thresh;
+	uint16_t			hw_vlan_tci;
+
+	unsigned int			hw_index;
+	unsigned int			flags;
+#define SFC_EFX_TXQ_FLAG_STARTED	0x1
+#define SFC_EFX_TXQ_FLAG_RUNNING	0x2
+
+	/* Datapath transmit queue anchor */
+	struct sfc_dp_txq		dp;
+	struct sfc_txq			*ctrl;
+};
+
+static inline struct sfc_efx_txq *
+sfc_efx_txq_by_dp_txq(struct sfc_dp_txq *dp_txq)
+{
+	return container_of(dp_txq, struct sfc_efx_txq, dp);
 }
 
 struct sfc_txq_info {
@@ -111,17 +150,15 @@ int sfc_tx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 int sfc_tx_start(struct sfc_adapter *sa);
 void sfc_tx_stop(struct sfc_adapter *sa);
 
-uint16_t sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
-		       uint16_t nb_pkts);
-
 /* From 'sfc_tso.c' */
-int sfc_tso_alloc_tsoh_objs(struct sfc_tx_sw_desc *sw_ring,
-			    unsigned int txq_entries, unsigned int socket_id);
-void sfc_tso_free_tsoh_objs(struct sfc_tx_sw_desc *sw_ring,
-			    unsigned int txq_entries);
-int sfc_tso_do(struct sfc_txq *txq, unsigned int idx, struct rte_mbuf **in_seg,
-	       size_t *in_off, efx_desc_t **pend, unsigned int *pkt_descs,
-	       size_t *pkt_len);
+int sfc_efx_tso_alloc_tsoh_objs(struct sfc_efx_tx_sw_desc *sw_ring,
+				unsigned int txq_entries,
+				unsigned int socket_id);
+void sfc_efx_tso_free_tsoh_objs(struct sfc_efx_tx_sw_desc *sw_ring,
+				unsigned int txq_entries);
+int sfc_efx_tso_do(struct sfc_efx_txq *txq, unsigned int idx,
+		   struct rte_mbuf **in_seg, size_t *in_off, efx_desc_t **pend,
+		   unsigned int *pkt_descs, size_t *pkt_len);
 
 #ifdef __cplusplus
 }
-- 
1.8.2.3

^ permalink raw reply	[flat|nested] 33+ messages in thread
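[Editor's illustration, not part of the patch] The core mechanism of the patch above — a tiny generic queue structure (`struct sfc_dp_txq`) embedded into the implementation-specific one, with `container_of()` recovering the outer structure on the datapath — can be reduced to this standalone sketch (simplified fields, illustrative only):

```c
#include <assert.h>
#include <stddef.h>

/* Same construction as the kernel/DPDK container_of() macro:
 * step back from the embedded member to the enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic datapath queue: kept as small as possible */
struct sfc_dp_txq {
	int placeholder;	/* stands in for the ops pointer */
};

/* libefx-specific queue embedding the generic anchor */
struct sfc_efx_txq {
	unsigned int added;	/* implementation-specific state */
	struct sfc_dp_txq dp;	/* datapath transmit queue anchor */
};

/* Burst callbacks receive only the generic pointer and recover
 * the full implementation state from it at zero extra cost. */
static struct sfc_efx_txq *
efx_txq_by_dp_txq(struct sfc_dp_txq *dp_txq)
{
	return container_of(dp_txq, struct sfc_efx_txq, dp);
}
```

This is why `dev->data->tx_queues[]` can store `txq->dp` directly: each datapath implementation gets its own state back without indirect lookups on the hot path.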

* [dpdk-dev] [PATCH 08/13] net/sfc: VLAN insertion is a datapath dependent feature
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (6 preceding siblings ...)
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 07/13] net/sfc: factory out libefx-based Tx datapath Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 09/13] net/sfc: TSO " Andrew Rybchenko
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_dp_tx.h  |  2 ++
 drivers/net/sfc/sfc_ethdev.c |  3 ++-
 drivers/net/sfc/sfc_tx.c     | 14 +++++++++++---
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
index 4879db5..9f9948f 100644
--- a/drivers/net/sfc/sfc_dp_tx.h
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -123,6 +123,8 @@ typedef void (sfc_dp_tx_qstop_t)(struct sfc_dp_txq *dp_txq,
 struct sfc_dp_tx {
 	struct sfc_dp			dp;
 
+	unsigned int			features;
+#define SFC_DP_TX_FEAT_VLAN_INSERT	0x1
 	sfc_dp_tx_qcreate_t		*qcreate;
 	sfc_dp_tx_qdestroy_t		*qdestroy;
 	sfc_dp_tx_qstart_t		*qstart;
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index c6db730..9cf8624 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -84,7 +84,8 @@
 		DEV_TX_OFFLOAD_TCP_CKSUM;
 
 	dev_info->default_txconf.txq_flags = ETH_TXQ_FLAGS_NOXSUMSCTP;
-	if (!encp->enc_hw_tx_insert_vlan_enabled)
+	if ((~sa->dp_tx->features & SFC_DP_TX_FEAT_VLAN_INSERT) ||
+	    !encp->enc_hw_tx_insert_vlan_enabled)
 		dev_info->default_txconf.txq_flags |= ETH_TXQ_FLAGS_NOVLANOFFL;
 	else
 		dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_VLAN_INSERT;
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index 31bccac..e295454 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -82,9 +82,16 @@
 		rc = EINVAL;
 	}
 
-	if (!encp->enc_hw_tx_insert_vlan_enabled &&
-	    (flags & ETH_TXQ_FLAGS_NOVLANOFFL) == 0) {
-		sfc_err(sa, "VLAN offload is not supported");
+	if ((flags & ETH_TXQ_FLAGS_NOVLANOFFL) == 0) {
+		if (!encp->enc_hw_tx_insert_vlan_enabled) {
+			sfc_err(sa, "VLAN offload is not supported");
+			rc = EINVAL;
+		} else if (~sa->dp_tx->features & SFC_DP_TX_FEAT_VLAN_INSERT) {
+			sfc_err(sa,
+				"VLAN offload is not supported by %s datapath",
+				sa->dp_tx->dp.name);
+			rc = EINVAL;
+		}
-		rc = EINVAL;
 	}
 
@@ -903,6 +910,7 @@ struct sfc_dp_tx sfc_efx_tx = {
 		.type		= SFC_DP_TX,
 		.hw_fw_caps	= 0,
 	},
+	.features		= SFC_DP_TX_FEAT_VLAN_INSERT,
 	.qcreate		= sfc_efx_tx_qcreate,
 	.qdestroy		= sfc_efx_tx_qdestroy,
 	.qstart			= sfc_efx_tx_qstart,
-- 
1.8.2.3

^ permalink raw reply	[flat|nested] 33+ messages in thread
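[Editor's illustration, not part of the patch] The `~sa->dp_tx->features & SFC_DP_TX_FEAT_VLAN_INSERT` rejection test used in the patch above may look unusual: `~features & required` is nonzero exactly when some required bit is absent from the advertised feature mask. A minimal standalone illustration (flag values follow the patch; the helper name is made up):

```c
#include <assert.h>

/* Feature bits as defined in sfc_dp_tx.h by this patch series */
#define SFC_DP_TX_FEAT_VLAN_INSERT	0x1
#define SFC_DP_TX_FEAT_TSO		0x2

/* Hypothetical helper: nonzero iff every required bit is advertised.
 * The patch open-codes the inverse test to reject the configuration. */
static int
dp_tx_supports(unsigned int features, unsigned int required)
{
	return (~features & required) == 0;
}
```

With `features = SFC_DP_TX_FEAT_VLAN_INSERT`, a VLAN-insert requirement passes while a TSO requirement fails, which is what lets the control path fall back (e.g. force `sa->tso = B_FALSE` in the following patch) instead of silently misbehaving.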

* [dpdk-dev] [PATCH 09/13] net/sfc: TSO is a datapath dependent feature
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (7 preceding siblings ...)
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 08/13] net/sfc: VLAN insertion is a datapath dependent feature Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 10/13] net/sfc: implement EF10 native Tx datapath Andrew Rybchenko
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_dp_tx.h | 1 +
 drivers/net/sfc/sfc_tx.c    | 6 +++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
index 9f9948f..b6b7084 100644
--- a/drivers/net/sfc/sfc_dp_tx.h
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -125,6 +125,7 @@ struct sfc_dp_tx {
 
 	unsigned int			features;
 #define SFC_DP_TX_FEAT_VLAN_INSERT	0x1
+#define SFC_DP_TX_FEAT_TSO		0x2
 	sfc_dp_tx_qcreate_t		*qcreate;
 	sfc_dp_tx_qdestroy_t		*qdestroy;
 	sfc_dp_tx_qstart_t		*qstart;
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index e295454..1dc97be 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -303,6 +303,9 @@
 	unsigned int sw_index;
 	int rc = 0;
 
+	if (~sa->dp_tx->features & SFC_DP_TX_FEAT_TSO)
+		sa->tso = B_FALSE;
+
 	rc = sfc_tx_check_mode(sa, &dev_conf->txmode);
 	if (rc != 0)
 		goto fail_check_mode;
@@ -910,7 +913,8 @@ struct sfc_dp_tx sfc_efx_tx = {
 		.type		= SFC_DP_TX,
 		.hw_fw_caps	= 0,
 	},
-	.features		= SFC_DP_TX_FEAT_VLAN_INSERT,
+	.features		= SFC_DP_TX_FEAT_VLAN_INSERT |
+				  SFC_DP_TX_FEAT_TSO,
 	.qcreate		= sfc_efx_tx_qcreate,
 	.qdestroy		= sfc_efx_tx_qdestroy,
 	.qstart			= sfc_efx_tx_qstart,
-- 
1.8.2.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH 10/13] net/sfc: implement EF10 native Tx datapath
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (8 preceding siblings ...)
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 09/13] net/sfc: TSO " Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 11/13] net/sfc: multi-segment support as is Tx datapath features Andrew Rybchenko
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst   |   5 +-
 drivers/net/sfc/Makefile      |   1 +
 drivers/net/sfc/sfc_dp_tx.h   |  17 ++
 drivers/net/sfc/sfc_ef10_tx.c | 439 ++++++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c  |   1 +
 drivers/net/sfc/sfc_ev.c      |  15 +-
 drivers/net/sfc/sfc_kvargs.h  |   3 +-
 drivers/net/sfc/sfc_tx.c      |   5 +
 8 files changed, 483 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_ef10_tx.c

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index e864ccc..ed0a59f 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -191,13 +191,16 @@ boolean parameters value.
   more efficient than libefx-based and provides richer packet type
   classification, but lacks Rx scatter support.
 
-- ``tx_datapath`` [auto|efx] (default **auto**)
+- ``tx_datapath`` [auto|efx|ef10] (default **auto**)
 
   Choose transmit datapath implementation.
   **auto** allows the driver itself to make a choice based on firmware
   features available and required by the datapath implementation.
   **efx** chooses libefx-based datapath which supports VLAN insertion
   (full-feature firmware variant only), TSO and multi-segment mbufs.
+  **ef10** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which is
+  more efficient than libefx-based but has no VLAN insertion and TSO
+  support yet.
 
 - ``perf_profile`` [auto|throughput|low-latency] (default **throughput**)
 
diff --git a/drivers/net/sfc/Makefile b/drivers/net/sfc/Makefile
index 3c15722..bb7dcb2 100644
--- a/drivers/net/sfc/Makefile
+++ b/drivers/net/sfc/Makefile
@@ -92,6 +92,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tx.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tso.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_dp.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_ef10_rx.c
+SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_ef10_tx.c
 
 VPATH += $(SRCDIR)/base
 
diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
index b6b7084..8c74428 100644
--- a/drivers/net/sfc/sfc_dp_tx.h
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -74,6 +74,16 @@ struct sfc_dp_tx_qcreate_args {
 	unsigned int		flags;
 	/** Tx queue size */
 	unsigned int		txq_entries;
+	/** DMA-mapped Tx descriptors ring */
+	void			*txq_hw_ring;
+	/** Associated event queue size */
+	unsigned int		evq_entries;
+	/** Hardware event ring */
+	void			*evq_hw_ring;
+	/** The queue index in hardware (required to push right doorbell) */
+	unsigned int		hw_index;
+	/** Virtual address of the memory-mapped BAR to push Tx doorbell */
+	volatile void		*mem_bar;
 };
 
 /**
@@ -115,6 +125,11 @@ typedef void (sfc_dp_tx_qstop_t)(struct sfc_dp_txq *dp_txq,
 				 unsigned int *evq_read_ptr);
 
 /**
+ * Transmit event handler used during queue flush only.
+ */
+typedef bool (sfc_dp_tx_qtx_ev_t)(struct sfc_dp_txq *dp_txq, unsigned int id);
+
+/**
  * Transmit queue function called after the queue flush.
  */
 typedef void (sfc_dp_tx_qreap_t)(struct sfc_dp_txq *dp_txq);
@@ -130,6 +145,7 @@ struct sfc_dp_tx {
 	sfc_dp_tx_qdestroy_t		*qdestroy;
 	sfc_dp_tx_qstart_t		*qstart;
 	sfc_dp_tx_qstop_t		*qstop;
+	sfc_dp_tx_qtx_ev_t		*qtx_ev;
 	sfc_dp_tx_qreap_t		*qreap;
 	eth_tx_burst_t			pkt_burst;
 };
@@ -151,6 +167,7 @@ struct sfc_dp_tx {
 }
 
 extern struct sfc_dp_tx sfc_efx_tx;
+extern struct sfc_dp_tx sfc_ef10_tx;
 
 #ifdef __cplusplus
 }
diff --git a/drivers/net/sfc/sfc_ef10_tx.c b/drivers/net/sfc/sfc_ef10_tx.c
new file mode 100644
index 0000000..8718961
--- /dev/null
+++ b/drivers/net/sfc/sfc_ef10_tx.c
@@ -0,0 +1,439 @@
+/*-
+ * Copyright (c) 2016 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+
+#include <rte_mbuf.h>
+#include <rte_io.h>
+
+#include "efx.h"
+#include "efx_types.h"
+#include "efx_regs.h"
+#include "efx_regs_ef10.h"
+
+#include "sfc_dp_tx.h"
+#include "sfc_tweak.h"
+#include "sfc_kvargs.h"
+
+/** Maximum length of the mbuf segment data */
+#define SFC_MBUF_SEG_LEN_MAX \
+	((1u << (8 * sizeof(((struct rte_mbuf *)0)->data_len))) - 1)
+
+/** Maximum length of the DMA descriptor data */
+#define SFC_EF10_TX_DMA_DESC_LEN_MAX \
+	((1u << ESF_DZ_TX_KER_BYTE_CNT_WIDTH) - 1)
+
+/** Maximum number of DMA descriptors per mbuf segment */
+#define SFC_EF10_TX_MBUF_SEG_DESCS_MAX \
+	SFC_DIV_ROUND_UP(SFC_MBUF_SEG_LEN_MAX, \
+			 SFC_EF10_TX_DMA_DESC_LEN_MAX)
+
+struct sfc_ef10_tx_sw_desc {
+	struct rte_mbuf			*mbuf;
+};
+
+struct sfc_ef10_txq {
+	unsigned int			flags;
+#define SFC_EF10_TXQ_STARTED		0x1
+#define SFC_EF10_TXQ_NOT_RUNNING	0x2
+#define SFC_EF10_TXQ_EXCEPTION		0x4
+
+	unsigned int			ptr_mask;
+	unsigned int			added;
+	unsigned int			completed;
+	unsigned int			free_thresh;
+	unsigned int			evq_read_ptr;
+	struct sfc_ef10_tx_sw_desc	*sw_ring;
+	efx_qword_t			*txq_hw_ring;
+	volatile void			*doorbell;
+	volatile efx_qword_t		*evq_hw_ring;
+
+	/* Datapath transmit queue anchor */
+	struct sfc_dp_txq		dp;
+	void				*ctrl;
+	sfc_dp_exception_t		*exception;
+};
+
+static inline struct sfc_ef10_txq *
+sfc_ef10_txq_by_dp_txq(struct sfc_dp_txq *dp_txq)
+{
+	return container_of(dp_txq, struct sfc_ef10_txq, dp);
+}
+
+static bool
+sfc_ef10_tx_get_event(volatile efx_qword_t *evq_hw_ring,
+		      unsigned int *read_ptr, const unsigned int ptr_mask,
+		      unsigned int *flags, efx_qword_t *tx_ev)
+{
+	if (unlikely(*flags & SFC_EF10_TXQ_EXCEPTION))
+		return false;
+
+	*tx_ev = evq_hw_ring[*read_ptr & ptr_mask];
+
+	if (tx_ev->eq_u64[0] == UINT64_MAX)
+		return false;
+
+	if (unlikely(EFX_QWORD_FIELD(*tx_ev, FSF_AZ_EV_CODE) !=
+		     FSE_AZ_EV_CODE_TX_EV)) {
+		/* Do not move read_ptr to keep the event for exception
+		 * handling
+		 */
+		*flags |= SFC_EF10_TXQ_EXCEPTION;
+		return false;
+	}
+
+	++(*read_ptr);
+	return true;
+}
+
+static void
+sfc_ef10_tx_reap(struct sfc_ef10_txq *txq)
+{
+	volatile efx_qword_t * const evq_hw_ring = txq->evq_hw_ring;
+	unsigned int old_read_ptr = txq->evq_read_ptr;
+	unsigned int evq_read_ptr = old_read_ptr;
+	const unsigned int ptr_mask = txq->ptr_mask;
+	unsigned int completed = txq->completed;
+	unsigned int pending = completed;
+	const unsigned int curr_done = pending - 1;
+	unsigned int anew_done = curr_done;
+	efx_qword_t tx_ev;
+
+	while (sfc_ef10_tx_get_event(evq_hw_ring, &evq_read_ptr,
+				     ptr_mask, &txq->flags, &tx_ev)) {
+		if (EFX_TEST_QWORD_BIT(tx_ev, ESF_DZ_TX_DROP_EVENT_LBN))
+			continue;
+
+		/* Update the latest done descriptor */
+		anew_done = EFX_QWORD_FIELD(tx_ev, ESF_DZ_TX_DESCR_INDX);
+	}
+	pending += (anew_done - curr_done) & ptr_mask;
+
+	if (pending != completed) {
+		do {
+			struct sfc_ef10_tx_sw_desc *txd;
+
+			txd = &txq->sw_ring[completed & ptr_mask];
+
+			if (txd->mbuf != NULL) {
+				rte_pktmbuf_free(txd->mbuf);
+				txd->mbuf = NULL;
+			}
+		} while (++completed != pending);
+
+		txq->completed = completed;
+	}
+
+	if (old_read_ptr != evq_read_ptr) {
+		do {
+			EFX_SET_QWORD(evq_hw_ring[old_read_ptr & ptr_mask]);
+		} while (++old_read_ptr != evq_read_ptr);
+
+		txq->evq_read_ptr = evq_read_ptr;
+	}
+}
+
+static void
+sfc_ef10_tx_qdesc_dma_create(phys_addr_t addr, uint16_t size, bool eop,
+			     efx_qword_t *edp)
+{
+	EFX_POPULATE_QWORD_4(*edp,
+			     ESF_DZ_TX_KER_TYPE, 0,
+			     ESF_DZ_TX_KER_CONT, !eop,
+			     ESF_DZ_TX_KER_BYTE_CNT, size,
+			     ESF_DZ_TX_KER_BUF_ADDR, addr);
+}
+
+static inline void
+sfc_ef10_tx_qpush(struct sfc_ef10_txq *txq, unsigned int added,
+		  unsigned int pushed)
+{
+	efx_qword_t desc;
+	efx_oword_t oword;
+
+	/*
+	 * This improves performance by pushing a TX descriptor at the same
+	 * time as the doorbell. The descriptor must be added to the TXQ,
+	 * so that it can be used if the hardware decides not to use the
+	 * pushed descriptor.
+	 */
+	desc.eq_u64[0] = txq->txq_hw_ring[pushed & txq->ptr_mask].eq_u64[0];
+	EFX_POPULATE_OWORD_3(oword,
+		ERF_DZ_TX_DESC_WPTR, added & txq->ptr_mask,
+		ERF_DZ_TX_DESC_HWORD, EFX_QWORD_FIELD(desc, EFX_DWORD_1),
+		ERF_DZ_TX_DESC_LWORD, EFX_QWORD_FIELD(desc, EFX_DWORD_0));
+
+	/* Make sure that all descriptor updates (Tx and event) reach memory */
+	rte_wmb();
+
+	/* DMA sync to device is not required */
+
+	*(volatile __m128i *)txq->doorbell = oword.eo_u128[0];
+	rte_io_wmb();
+}
+
+static uint16_t
+sfc_ef10_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+	struct sfc_ef10_txq * const txq = sfc_ef10_txq_by_dp_txq(tx_queue);
+	unsigned int ptr_mask;
+	unsigned int added;
+	unsigned int dma_desc_space;
+	bool reap_done;
+	struct rte_mbuf **pktp;
+	struct rte_mbuf **pktp_end;
+
+	/* Exception handling may restart the TxQ so cache nothing before */
+	if (unlikely(txq->flags &
+		     (SFC_EF10_TXQ_NOT_RUNNING | SFC_EF10_TXQ_EXCEPTION))) {
+		if (txq->flags & SFC_EF10_TXQ_EXCEPTION)
+			txq->exception(txq->ctrl);
+		if (txq->flags & SFC_EF10_TXQ_NOT_RUNNING)
+			return 0;
+	}
+
+	ptr_mask = txq->ptr_mask;
+	added = txq->added;
+	dma_desc_space = EFX_TXQ_LIMIT(ptr_mask + 1) -
+			 (added - txq->completed);
+
+	reap_done = (dma_desc_space < txq->free_thresh);
+	if (reap_done) {
+		sfc_ef10_tx_reap(txq);
+		dma_desc_space = EFX_TXQ_LIMIT(ptr_mask + 1) -
+				 (added - txq->completed);
+	}
+
+	for (pktp = &tx_pkts[0], pktp_end = &tx_pkts[nb_pkts];
+	     pktp != pktp_end;
+	     ++pktp) {
+		struct rte_mbuf *m_seg = *pktp;
+		unsigned int pkt_start = added;
+		uint32_t pkt_len;
+
+		if (likely(pktp + 1 != pktp_end))
+			rte_mbuf_prefetch_part1(pktp[1]);
+
+		if (m_seg->nb_segs * SFC_EF10_TX_MBUF_SEG_DESCS_MAX >
+		    dma_desc_space) {
+			if (reap_done)
+				break;
+
+			sfc_ef10_tx_reap(txq);
+			reap_done = true;
+			dma_desc_space = EFX_TXQ_LIMIT(ptr_mask + 1) -
+				(added - txq->completed);
+			if (m_seg->nb_segs * SFC_EF10_TX_MBUF_SEG_DESCS_MAX >
+			    dma_desc_space)
+				break;
+		}
+
+		pkt_len = m_seg->pkt_len;
+		do {
+			phys_addr_t seg_addr = rte_mbuf_data_dma_addr(m_seg);
+			unsigned int seg_len = rte_pktmbuf_data_len(m_seg);
+
+			SFC_ASSERT(seg_len <= SFC_EF10_TX_DMA_DESC_LEN_MAX);
+
+			pkt_len -= seg_len;
+
+			sfc_ef10_tx_qdesc_dma_create(seg_addr,
+				seg_len, (pkt_len == 0),
+				&txq->txq_hw_ring[added & ptr_mask]);
+			++added;
+
+		} while ((m_seg = m_seg->next) != 0);
+
+		dma_desc_space -= (added - pkt_start);
+
+		/* Assign mbuf to the last used desc */
+		txq->sw_ring[(added - 1) & ptr_mask].mbuf = *pktp;
+	}
+
+	if (likely(added != txq->added)) {
+		sfc_ef10_tx_qpush(txq, added, txq->added);
+		txq->added = added;
+	}
+
+#if SFC_TX_XMIT_PKTS_REAP_AT_LEAST_ONCE
+	if (!reap_done)
+		sfc_ef10_tx_reap(txq);
+#endif
+
+	return pktp - &tx_pkts[0];
+}
+
+static void *
+sfc_ef10_txq_get_ctrl(struct sfc_dp_txq *dp_txq)
+{
+	struct sfc_ef10_txq *txq = sfc_ef10_txq_by_dp_txq(dp_txq);
+
+	return txq->ctrl;
+}
+
+static const struct sfc_dp_txq_ops sfc_ef10_txq_ops = {
+	.get_ctrl	= sfc_ef10_txq_get_ctrl,
+};
+
+static sfc_dp_tx_qcreate_t sfc_ef10_tx_qcreate;
+static int
+sfc_ef10_tx_qcreate(void *ctrl, sfc_dp_exception_t *exception, int socket_id,
+		    const struct sfc_dp_tx_qcreate_args *args,
+		    struct sfc_dp_txq **dp_txqp)
+{
+	struct sfc_ef10_txq *txq;
+	int rc;
+
+	rc = EINVAL;
+	if (args->txq_entries != args->evq_entries)
+		goto fail_bad_args;
+
+	rc = ENOMEM;
+	txq = rte_zmalloc_socket("sfc-ef10-txq", sizeof(*txq),
+				 RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq == NULL)
+		goto fail_txq_alloc;
+
+	rc = ENOMEM;
+	txq->sw_ring = rte_calloc_socket("sfc-ef10-txq-sw_ring",
+					 args->txq_entries,
+					 sizeof(*txq->sw_ring),
+					 RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq->sw_ring == NULL)
+		goto fail_sw_ring_alloc;
+
+	txq->flags = SFC_EF10_TXQ_NOT_RUNNING;
+	txq->ptr_mask = args->txq_entries - 1;
+	txq->free_thresh = args->free_thresh;
+	txq->txq_hw_ring = args->txq_hw_ring;
+	txq->doorbell = (volatile uint8_t *)args->mem_bar +
+			ER_DZ_TX_DESC_UPD_REG_OFST +
+			args->hw_index * ER_DZ_TX_DESC_UPD_REG_STEP;
+	txq->evq_hw_ring = args->evq_hw_ring;
+
+	txq->dp.ops = &sfc_ef10_txq_ops;
+	txq->ctrl = ctrl;
+	txq->exception = exception;
+
+	*dp_txqp = &txq->dp;
+	return 0;
+
+fail_sw_ring_alloc:
+	rte_free(txq);
+
+fail_txq_alloc:
+fail_bad_args:
+	return rc;
+}
+
+static sfc_dp_tx_qdestroy_t sfc_ef10_tx_qdestroy;
+static void
+sfc_ef10_tx_qdestroy(struct sfc_dp_txq *dp_txq)
+{
+	struct sfc_ef10_txq *txq = sfc_ef10_txq_by_dp_txq(dp_txq);
+
+	rte_free(txq->sw_ring);
+	rte_free(txq);
+}
+
+static sfc_dp_tx_qstart_t sfc_ef10_tx_qstart;
+static int
+sfc_ef10_tx_qstart(struct sfc_dp_txq *dp_txq, unsigned int evq_read_ptr,
+		   unsigned int txq_desc_index)
+{
+	struct sfc_ef10_txq *txq = sfc_ef10_txq_by_dp_txq(dp_txq);
+
+	txq->evq_read_ptr = evq_read_ptr;
+	txq->added = txq->completed = txq_desc_index;
+
+	txq->flags |= SFC_EF10_TXQ_STARTED;
+	txq->flags &= ~(SFC_EF10_TXQ_NOT_RUNNING | SFC_EF10_TXQ_EXCEPTION);
+
+	return 0;
+}
+
+static sfc_dp_tx_qstop_t sfc_ef10_tx_qstop;
+static void
+sfc_ef10_tx_qstop(struct sfc_dp_txq *dp_txq, unsigned int *evq_read_ptr)
+{
+	struct sfc_ef10_txq *txq = sfc_ef10_txq_by_dp_txq(dp_txq);
+
+	txq->flags |= SFC_EF10_TXQ_NOT_RUNNING;
+
+	*evq_read_ptr = txq->evq_read_ptr;
+}
+
+static sfc_dp_tx_qtx_ev_t sfc_ef10_tx_qtx_ev;
+static bool
+sfc_ef10_tx_qtx_ev(struct sfc_dp_txq *dp_txq, __rte_unused unsigned int id)
+{
+	__rte_unused struct sfc_ef10_txq *txq = sfc_ef10_txq_by_dp_txq(dp_txq);
+
+	SFC_ASSERT(txq->flags & SFC_EF10_TXQ_NOT_RUNNING);
+
+	/*
+	 * It is safe to ignore Tx event since we reap all mbufs on
+	 * queue purge anyway.
+	 */
+
+	return false;
+}
+
+static sfc_dp_tx_qreap_t sfc_ef10_tx_qreap;
+static void
+sfc_ef10_tx_qreap(struct sfc_dp_txq *dp_txq)
+{
+	struct sfc_ef10_txq *txq = sfc_ef10_txq_by_dp_txq(dp_txq);
+	unsigned int txds;
+
+	for (txds = 0; txds <= txq->ptr_mask; ++txds) {
+		if (txq->sw_ring[txds].mbuf != NULL) {
+			rte_pktmbuf_free(txq->sw_ring[txds].mbuf);
+			txq->sw_ring[txds].mbuf = NULL;
+		}
+	}
+
+	txq->flags &= ~SFC_EF10_TXQ_STARTED;
+}
+
+struct sfc_dp_tx sfc_ef10_tx = {
+	.dp = {
+		.name		= SFC_KVARG_DATAPATH_EF10,
+		.type		= SFC_DP_TX,
+		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF10,
+	},
+	.features		= 0,
+	.qcreate		= sfc_ef10_tx_qcreate,
+	.qdestroy		= sfc_ef10_tx_qdestroy,
+	.qstart			= sfc_ef10_tx_qstart,
+	.qtx_ev			= sfc_ef10_tx_qtx_ev,
+	.qstop			= sfc_ef10_tx_qstop,
+	.qreap			= sfc_ef10_tx_qreap,
+	.pkt_burst		= sfc_ef10_xmit_pkts,
+};
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 9cf8624..168c965 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1334,6 +1334,7 @@
 		sfc_dp_register(&sfc_dp_head, &sfc_ef10_rx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_efx_rx.dp);
 
+		sfc_dp_register(&sfc_dp_head, &sfc_ef10_tx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_efx_tx.dp);
 	}
 }
diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index 04e923f..5e7c619 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -210,6 +210,19 @@
 }
 
 static boolean_t
+sfc_ev_dp_tx(void *arg, __rte_unused uint32_t label, uint32_t id)
+{
+	struct sfc_evq *evq = arg;
+	struct sfc_dp_txq *dp_txq;
+
+	dp_txq = evq->dp_txq;
+	SFC_ASSERT(dp_txq != NULL);
+
+	SFC_ASSERT(evq->sa->dp_tx->qtx_ev != NULL);
+	return evq->sa->dp_tx->qtx_ev(dp_txq, id);
+}
+
+static boolean_t
 sfc_ev_exception(void *arg, __rte_unused uint32_t code,
 		 __rte_unused uint32_t data)
 {
@@ -466,7 +479,7 @@
 static const efx_ev_callbacks_t sfc_ev_callbacks_dp_tx = {
 	.eec_initialized	= sfc_ev_initialized,
 	.eec_rx			= sfc_ev_nop_rx,
-	.eec_tx			= sfc_ev_nop_tx,
+	.eec_tx			= sfc_ev_dp_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_nop_rxq_flush_done,
 	.eec_rxq_flush_failed	= sfc_ev_nop_rxq_flush_failed,
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index 14f46db..68cca4f 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -62,7 +62,8 @@
 
 #define SFC_KVARG_TX_DATAPATH		"tx_datapath"
 #define SFC_KVARG_VALUES_TX_DATAPATH \
-	"[" SFC_KVARG_DATAPATH_EFX "]"
+	"[" SFC_KVARG_DATAPATH_EFX "|" \
+	    SFC_KVARG_DATAPATH_EF10 "]"
 
 struct sfc_adapter;
 
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index 1dc97be..d3d5ecc 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -195,6 +195,11 @@
 	args.free_thresh = txq->free_thresh;
 	args.flags = tx_conf->txq_flags;
 	args.txq_entries = txq_info->entries;
+	args.txq_hw_ring = txq->mem.esm_base;
+	args.evq_entries = txq_info->entries;
+	args.evq_hw_ring = evq->mem.esm_base;
+	args.hw_index = txq->hw_index;
+	args.mem_bar = sa->mem_bar.esb_base;
 
 	rc = sa->dp_tx->qcreate(txq, sfc_tx_dp_exception, socket_id, &args,
 				&txq->dp);
-- 
1.8.2.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH 11/13] net/sfc: multi-segment support as is Tx datapath features
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (9 preceding siblings ...)
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 10/13] net/sfc: implement EF10 native Tx datapath Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 12/13] net/sfc: implement simple EF10 native Tx datapath Andrew Rybchenko
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_dp_tx.h   |  1 +
 drivers/net/sfc/sfc_ef10_tx.c |  2 +-
 drivers/net/sfc/sfc_ethdev.c  |  3 +++
 drivers/net/sfc/sfc_tx.c      | 10 +++++++++-
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
index 8c74428..4f01f65 100644
--- a/drivers/net/sfc/sfc_dp_tx.h
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -141,6 +141,7 @@ struct sfc_dp_tx {
 	unsigned int			features;
 #define SFC_DP_TX_FEAT_VLAN_INSERT	0x1
 #define SFC_DP_TX_FEAT_TSO		0x2
+#define SFC_DP_TX_FEAT_MULTI_SEG	0x4
 	sfc_dp_tx_qcreate_t		*qcreate;
 	sfc_dp_tx_qdestroy_t		*qdestroy;
 	sfc_dp_tx_qstart_t		*qstart;
diff --git a/drivers/net/sfc/sfc_ef10_tx.c b/drivers/net/sfc/sfc_ef10_tx.c
index 8718961..047d0e7 100644
--- a/drivers/net/sfc/sfc_ef10_tx.c
+++ b/drivers/net/sfc/sfc_ef10_tx.c
@@ -428,7 +428,7 @@ struct sfc_dp_tx sfc_ef10_tx = {
 		.type		= SFC_DP_TX,
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF10,
 	},
-	.features		= 0,
+	.features		= SFC_DP_TX_FEAT_MULTI_SEG,
 	.qcreate		= sfc_ef10_tx_qcreate,
 	.qdestroy		= sfc_ef10_tx_qdestroy,
 	.qstart			= sfc_ef10_tx_qstart,
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 168c965..ce87284 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -90,6 +90,9 @@
 	else
 		dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_VLAN_INSERT;
 
+	if (~sa->dp_tx->features & SFC_DP_TX_FEAT_MULTI_SEG)
+		dev_info->default_txconf.txq_flags |= ETH_TXQ_FLAGS_NOMULTSEGS;
+
 #if EFSYS_OPT_RX_SCALE
 	if (sa->rss_support != EFX_RX_SCALE_UNAVAILABLE) {
 		dev_info->reta_size = EFX_RSS_TBL_SIZE;
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index d3d5ecc..8062615 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -82,6 +82,13 @@
 		rc = EINVAL;
 	}
 
+	if (((flags & ETH_TXQ_FLAGS_NOMULTSEGS) == 0) &&
+	    (~sa->dp_tx->features & SFC_DP_TX_FEAT_MULTI_SEG)) {
+		sfc_err(sa, "Multi-segment is not supported by %s datapath",
+			sa->dp_tx->dp.name);
+		rc = EINVAL;
+	}
+
 	if ((flags & ETH_TXQ_FLAGS_NOVLANOFFL) == 0) {
 		if (!encp->enc_hw_tx_insert_vlan_enabled) {
 			sfc_err(sa, "VLAN offload is not supported");
@@ -919,7 +926,8 @@ struct sfc_dp_tx sfc_efx_tx = {
 		.hw_fw_caps	= 0,
 	},
 	.features		= SFC_DP_TX_FEAT_VLAN_INSERT |
-				  SFC_DP_TX_FEAT_TSO,
+				  SFC_DP_TX_FEAT_TSO |
+				  SFC_DP_TX_FEAT_MULTI_SEG,
 	.qcreate		= sfc_efx_tx_qcreate,
 	.qdestroy		= sfc_efx_tx_qdestroy,
 	.qstart			= sfc_efx_tx_qstart,
-- 
1.8.2.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH 12/13] net/sfc: implement simple EF10 native Tx datapath
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (10 preceding siblings ...)
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 11/13] net/sfc: multi-segment support as is Tx datapath features Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 13/13] net/sfc: support Rx packed stream EF10-specific datapath Andrew Rybchenko
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

The datapath does not support VLAN insertion, TSO or multi-segment
mbufs.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst   |  5 ++-
 drivers/net/sfc/sfc_dp_tx.h   |  1 +
 drivers/net/sfc/sfc_ef10_tx.c | 78 +++++++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c  |  1 +
 drivers/net/sfc/sfc_kvargs.h  |  4 ++-
 5 files changed, 87 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index ed0a59f..8e2b36f 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -191,7 +191,7 @@ boolean parameters value.
   more efficient than libefx-based and provides richer packet type
   classification, but lacks Rx scatter support.
 
-- ``tx_datapath`` [auto|efx|ef10] (default **auto**)
+- ``tx_datapath`` [auto|efx|ef10|ef10_simple] (default **auto**)
 
   Choose transmit datapath implementation.
   **auto** allows the driver itself to make a choice based on firmware
@@ -201,6 +201,9 @@ boolean parameters value.
   **ef10** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which is
   more efficient than libefx-based but has no VLAN insertion and TSO
   support yet.
+  **ef10_simple** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which
+  is even faster than **ef10** but does not support multi-segment
+  mbufs.
 
 - ``perf_profile`` [auto|throughput|low-latency] (default **throughput**)
 
diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
index 4f01f65..12d7f0a 100644
--- a/drivers/net/sfc/sfc_dp_tx.h
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -169,6 +169,7 @@ struct sfc_dp_tx {
 
 extern struct sfc_dp_tx sfc_efx_tx;
 extern struct sfc_dp_tx sfc_ef10_tx;
+extern struct sfc_dp_tx sfc_ef10_simple_tx;
 
 #ifdef __cplusplus
 }
diff --git a/drivers/net/sfc/sfc_ef10_tx.c b/drivers/net/sfc/sfc_ef10_tx.c
index 047d0e7..32bc6d9 100644
--- a/drivers/net/sfc/sfc_ef10_tx.c
+++ b/drivers/net/sfc/sfc_ef10_tx.c
@@ -289,6 +289,69 @@ struct sfc_ef10_txq {
 	return pktp - &tx_pkts[0];
 }
 
+static uint16_t
+sfc_ef10_simple_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+			  uint16_t nb_pkts)
+{
+	struct sfc_ef10_txq * const txq = sfc_ef10_txq_by_dp_txq(tx_queue);
+	unsigned int ptr_mask;
+	unsigned int added;
+	unsigned int dma_desc_space;
+	bool reap_done;
+	struct rte_mbuf **pktp;
+	struct rte_mbuf **pktp_end;
+
+	/* Exception handling may restart the TxQ so cache nothing before */
+	if (unlikely(txq->flags &
+		     (SFC_EF10_TXQ_NOT_RUNNING | SFC_EF10_TXQ_EXCEPTION))) {
+		if (txq->flags & SFC_EF10_TXQ_EXCEPTION)
+			txq->exception(txq->ctrl);
+		if (txq->flags & SFC_EF10_TXQ_NOT_RUNNING)
+			return 0;
+	}
+
+	ptr_mask = txq->ptr_mask;
+	added = txq->added;
+	dma_desc_space = EFX_TXQ_LIMIT(ptr_mask + 1) -
+			 (added - txq->completed);
+
+	reap_done = (dma_desc_space < RTE_MAX(txq->free_thresh, nb_pkts));
+	if (reap_done) {
+		sfc_ef10_tx_reap(txq);
+		dma_desc_space = EFX_TXQ_LIMIT(ptr_mask + 1) -
+				 (added - txq->completed);
+	}
+
+	pktp_end = &tx_pkts[MIN(nb_pkts, dma_desc_space)];
+	for (pktp = &tx_pkts[0]; pktp != pktp_end; ++pktp) {
+		struct rte_mbuf *pkt = *pktp;
+		unsigned int id = added & ptr_mask;
+
+		SFC_ASSERT(rte_pktmbuf_data_len(pkt) <=
+			   SFC_EF10_TX_DMA_DESC_LEN_MAX);
+
+		sfc_ef10_tx_qdesc_dma_create(rte_mbuf_data_dma_addr(pkt),
+					     rte_pktmbuf_data_len(pkt),
+					     true, &txq->txq_hw_ring[id]);
+
+		txq->sw_ring[id].mbuf = pkt;
+
+		++added;
+	}
+
+	if (likely(added != txq->added)) {
+		sfc_ef10_tx_qpush(txq, added, txq->added);
+		txq->added = added;
+	}
+
+#if SFC_TX_XMIT_PKTS_REAP_AT_LEAST_ONCE
+	if (!reap_done)
+		sfc_ef10_tx_reap(txq);
+#endif
+
+	return pktp - &tx_pkts[0];
+}
+
 static void *
 sfc_ef10_txq_get_ctrl(struct sfc_dp_txq *dp_txq)
 {
@@ -437,3 +500,18 @@ struct sfc_dp_tx sfc_ef10_tx = {
 	.qreap			= sfc_ef10_tx_qreap,
 	.pkt_burst		= sfc_ef10_xmit_pkts,
 };
+
+struct sfc_dp_tx sfc_ef10_simple_tx = {
+	.dp = {
+		.name		= SFC_KVARG_DATAPATH_EF10_SIMPLE,
+		.type		= SFC_DP_TX,
+	},
+	.features		= 0,
+	.qcreate		= sfc_ef10_tx_qcreate,
+	.qdestroy		= sfc_ef10_tx_qdestroy,
+	.qstart			= sfc_ef10_tx_qstart,
+	.qtx_ev			= sfc_ef10_tx_qtx_ev,
+	.qstop			= sfc_ef10_tx_qstop,
+	.qreap			= sfc_ef10_tx_qreap,
+	.pkt_burst		= sfc_ef10_simple_xmit_pkts,
+};
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index ce87284..ffa6abe 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1339,6 +1339,7 @@
 
 		sfc_dp_register(&sfc_dp_head, &sfc_ef10_tx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_efx_tx.dp);
+		sfc_dp_register(&sfc_dp_head, &sfc_ef10_simple_tx.dp);
 	}
 }
 
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index 68cca4f..e4ee28f 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -54,6 +54,7 @@
 
 #define SFC_KVARG_DATAPATH_EFX		"efx"
 #define SFC_KVARG_DATAPATH_EF10		"ef10"
+#define SFC_KVARG_DATAPATH_EF10_SIMPLE	"ef10_simple"
 
 #define SFC_KVARG_RX_DATAPATH		"rx_datapath"
 #define SFC_KVARG_VALUES_RX_DATAPATH \
@@ -63,7 +64,8 @@
 #define SFC_KVARG_TX_DATAPATH		"tx_datapath"
 #define SFC_KVARG_VALUES_TX_DATAPATH \
 	"[" SFC_KVARG_DATAPATH_EFX "|" \
-	    SFC_KVARG_DATAPATH_EF10 "]"
+	    SFC_KVARG_DATAPATH_EF10 "|" \
+	    SFC_KVARG_DATAPATH_EF10_SIMPLE "]"
 
 struct sfc_adapter;
 
-- 
1.8.2.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH 13/13] net/sfc: support Rx packed stream EF10-specific datapath
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (11 preceding siblings ...)
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 12/13] net/sfc: implement simple EF10 native Tx datapath Andrew Rybchenko
@ 2017-03-02  7:07 ` Andrew Rybchenko
  2017-03-04 21:07 ` [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Ferruh Yigit
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
  14 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-02  7:07 UTC (permalink / raw)
  To: dev

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst      |  23 +-
 drivers/net/sfc/Makefile         |   1 +
 drivers/net/sfc/efsys.h          |   2 +-
 drivers/net/sfc/sfc_dp.h         |   5 +-
 drivers/net/sfc/sfc_dp_rx.h      |   8 +
 drivers/net/sfc/sfc_ef10_ps_rx.c | 659 +++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c     |   7 +
 drivers/net/sfc/sfc_ev.c         |  34 ++
 drivers/net/sfc/sfc_kvargs.h     |   4 +-
 drivers/net/sfc/sfc_rx.c         |   5 +-
 drivers/net/sfc/sfc_rx.h         |   2 +-
 11 files changed, 744 insertions(+), 6 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_ef10_ps_rx.c

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 8e2b36f..d52ab45 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -114,6 +114,24 @@ required in the receive buffer.
 It should be taken into account when mbuf pool for receive is created.
 
 
+Packed stream mode
+~~~~~~~~~~~~~~~~~~
+
+When the receive queue is in packed stream mode, the driver handles mbufs
+differently.
+The mbufs that are handed over to the application are always indirect mbufs.
+The contents of mbufs may be freely modified by the application, with an
+important constraint: the indirect mbufs do not have any reserved "head room"
+before the actual contents (or rather it is used by the PMD internally),
+so the usual practice of prepending an extra header by manipulating
+the head room **will not** work and would result in corrupted packets.
+If one needs to prepend data to a packet, one should first create a copy
+of the mbuf.
+
+Another limitation of packed stream mode, imposed by the firmware, is
+that it allows only a single RSS context.
+
+
 Supported NICs
 --------------
 
@@ -181,7 +199,7 @@ whitelist option like "-w 02:00.0,arg1=value1,...".
 Case-insensitive 1/y/yes/on or 0/n/no/off may be used to specify
 boolean parameter values.
 
-- ``rx_datapath`` [auto|efx|ef10] (default **auto**)
+- ``rx_datapath`` [auto|efx|ef10|ef10_packed] (default **auto**)
 
   Choose receive datapath implementation.
   **auto** allows the driver itself to make a choice based on firmware
@@ -190,6 +208,9 @@ boolean parameters value.
   **ef10** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which is
   more efficient than libefx-based and provides richer packet type
   classification, but lacks Rx scatter support.
+  **ef10_packed** chooses EF10 (SFN7xxx, SFN8xxx) packed stream datapath
+  which may be used with the capture packed stream firmware variant only
+  (see notes about its limitations above).
 
 - ``tx_datapath`` [auto|efx|ef10|ef10_simple] (default **auto**)
 
diff --git a/drivers/net/sfc/Makefile b/drivers/net/sfc/Makefile
index bb7dcb2..dd83238 100644
--- a/drivers/net/sfc/Makefile
+++ b/drivers/net/sfc/Makefile
@@ -92,6 +92,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tx.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tso.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_dp.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_ef10_rx.c
+SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_ef10_ps_rx.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_ef10_tx.c
 
 VPATH += $(SRCDIR)/base
diff --git a/drivers/net/sfc/efsys.h b/drivers/net/sfc/efsys.h
index 60829be..98e591c 100644
--- a/drivers/net/sfc/efsys.h
+++ b/drivers/net/sfc/efsys.h
@@ -210,7 +210,7 @@
 
 #define EFSYS_OPT_ALLOW_UNCONFIGURED_NIC 0
 
-#define EFSYS_OPT_RX_PACKED_STREAM 0
+#define EFSYS_OPT_RX_PACKED_STREAM 1
 
 /* ID */
 
diff --git a/drivers/net/sfc/sfc_dp.h b/drivers/net/sfc/sfc_dp.h
index d3e7007..362a998 100644
--- a/drivers/net/sfc/sfc_dp.h
+++ b/drivers/net/sfc/sfc_dp.h
@@ -37,6 +37,8 @@
 extern "C" {
 #endif
 
+#define SFC_P2_ROUND_UP(x, align)	(-(-(x) & -(align)))
+
 #define SFC_DIV_ROUND_UP(a, b) \
 	__extension__ ({		\
 		typeof(a) _a = (a);	\
@@ -62,7 +64,8 @@ struct sfc_dp {
 	enum sfc_dp_type		type;
 	/* Mask of required hardware/firmware capabilities */
 	unsigned int			hw_fw_caps;
-#define SFC_DP_HW_FW_CAP_EF10		0x1
+#define SFC_DP_HW_FW_CAP_EF10				0x1
+#define SFC_DP_HW_FW_CAP_RX_PACKED_STREAM_64K		0x2
 };
 
 /** List of datapath variants */
diff --git a/drivers/net/sfc/sfc_dp_rx.h b/drivers/net/sfc/sfc_dp_rx.h
index 944d366..0541c43 100644
--- a/drivers/net/sfc/sfc_dp_rx.h
+++ b/drivers/net/sfc/sfc_dp_rx.h
@@ -154,6 +154,12 @@ typedef void (sfc_dp_rx_qstop_t)(struct sfc_dp_rxq *dp_rxq,
 typedef bool (sfc_dp_rx_qrx_ev_t)(struct sfc_dp_rxq *dp_rxq, unsigned int id);
 
 /**
+ * Packed stream receive event handler used during queue flush only.
+ */
+typedef bool (sfc_dp_rx_qrx_ps_ev_t)(struct sfc_dp_rxq *dp_rxq,
+				     unsigned int id);
+
+/**
  * Receive queue purge function called after queue flush.
  *
  * Should be used to free unused receive buffers.
@@ -177,6 +183,7 @@ struct sfc_dp_rx {
 	sfc_dp_rx_qstart_t			*qstart;
 	sfc_dp_rx_qstop_t			*qstop;
 	sfc_dp_rx_qrx_ev_t			*qrx_ev;
+	sfc_dp_rx_qrx_ps_ev_t			*qrx_ps_ev;
 	sfc_dp_rx_qpurge_t			*qpurge;
 	sfc_dp_rx_supported_ptypes_get_t	*supported_ptypes_get;
 	sfc_dp_rx_qdesc_npending_t		*qdesc_npending;
@@ -201,6 +208,7 @@ struct sfc_dp_rx {
 
 extern struct sfc_dp_rx sfc_efx_rx;
 extern struct sfc_dp_rx sfc_ef10_rx;
+extern struct sfc_dp_rx sfc_ef10_ps_rx;
 
 #ifdef __cplusplus
 }
diff --git a/drivers/net/sfc/sfc_ef10_ps_rx.c b/drivers/net/sfc/sfc_ef10_ps_rx.c
new file mode 100644
index 0000000..6365092
--- /dev/null
+++ b/drivers/net/sfc/sfc_ef10_ps_rx.c
@@ -0,0 +1,659 @@
+/*-
+ * Copyright (c) 2017 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* EF10 packed stream native datapath implementation */
+
+#include <stdbool.h>
+
+#include <rte_byteorder.h>
+#include <rte_mbuf_ptype.h>
+#include <rte_mbuf.h>
+#include <rte_io.h>
+
+#include "efx.h"
+#include "efx_types.h"
+#include "efx_regs.h"
+#include "efx_regs_ef10.h"
+
+#include "sfc_tweak.h"
+#include "sfc_dp_rx.h"
+#include "sfc_kvargs.h"
+
+#if 1
+/* Alignment requirement for value written to RX WPTR:
+ *  the WPTR must be aligned to an 8 descriptor boundary
+ */
+#define	EF10_RX_WPTR_ALIGN 8
+#define	EFX_RX_PACKED_STREAM_ALIGNMENT 64
+#define	EFX_RX_PACKED_STREAM_RX_PREFIX_SIZE 8
+#endif
+
+#define SFC_PACKED_STREAM_BUFSIZE (64 * 1024)
+
+
+struct sfc_ef10_ps_rx_sw_desc {
+	struct rte_mbuf			*mbuf;
+};
+
+struct sfc_ef10_ps_rxq {
+	/* Used on data path */
+	unsigned int			flags;
+#define SFC_EF10_PS_RXQ_STARTED		0x1
+#define SFC_EF10_PS_RXQ_RUNNING		0x2
+#define SFC_EF10_PS_RXQ_EXCEPTION	0x4
+	unsigned int			rxq_ptr_mask;
+	unsigned int			completed;
+	unsigned int			pending_pkts;
+	const uint8_t			*next_pkt;
+	unsigned int			packets;
+	unsigned int			evq_read_ptr;
+	unsigned int			evq_ptr_mask;
+	volatile efx_qword_t		*evq_hw_ring;
+	struct sfc_ef10_ps_rx_sw_desc	*sw_ring;
+	struct rte_mempool		*indirect_mb_pool;
+	uint8_t				port_id;
+	uint8_t				credits;
+
+	/* Used on refill */
+	unsigned int			added;
+	unsigned int			refill_threshold;
+	struct rte_mempool		*refill_mb_pool;
+	efx_qword_t			*rxq_hw_ring;
+	volatile void			*doorbell;
+
+	/* Datapath receive queue anchor */
+	struct sfc_dp_rxq		dp;
+	void				*ctrl;
+	sfc_dp_exception_t		*exception;
+};
+
+static inline struct sfc_ef10_ps_rxq *
+sfc_ef10_ps_rxq_by_dp_rxq(struct sfc_dp_rxq *dp_rxq)
+{
+	return container_of(dp_rxq, struct sfc_ef10_ps_rxq, dp);
+}
+
+static void
+sfc_ef10_ps_rx_qpush(struct sfc_ef10_ps_rxq *rxq)
+{
+	efx_dword_t dword;
+
+	/* Hardware has alignment restriction for WPTR */
+	RTE_BUILD_BUG_ON(SFC_RX_REFILL_BULK % EF10_RX_WPTR_ALIGN != 0);
+	SFC_ASSERT(RTE_ALIGN(rxq->added, EF10_RX_WPTR_ALIGN) == rxq->added);
+
+	EFX_POPULATE_DWORD_1(dword, ERF_DZ_RX_DESC_WPTR,
+			     rxq->added & rxq->rxq_ptr_mask);
+
+	/* Make sure that all descriptor updates (Rx and event) reach memory */
+	rte_wmb();
+
+	/* DMA sync to device is not required */
+
+	rte_write32(dword.ed_u32[0], rxq->doorbell);
+}
+
+static void
+sfc_ef10_ps_rx_update_credits(struct sfc_ef10_ps_rxq *rxq)
+{
+	efx_dword_t dword;
+
+	if (rxq->credits == 0)
+		return;
+
+	EFX_POPULATE_DWORD_3(dword,
+	    ERF_DZ_RX_DESC_MAGIC_DOORBELL, 1,
+	    ERF_DZ_RX_DESC_MAGIC_CMD,
+	    ERE_DZ_RX_DESC_MAGIC_CMD_PS_CREDITS,
+	    ERF_DZ_RX_DESC_MAGIC_DATA, rxq->credits);
+
+	/* Make sure that the event descriptor update reaches memory */
+	rte_wmb();
+
+	/* DMA sync to device is not required */
+
+	rte_write32(dword.ed_u32[0], rxq->doorbell);
+
+	rxq->credits = 0;
+}
+
+static void
+sfc_ef10_ps_rx_qrefill(struct sfc_ef10_ps_rxq *rxq)
+{
+	const unsigned int rxq_ptr_mask = rxq->rxq_ptr_mask;
+	unsigned int free_space;
+	unsigned int bulks;
+	void *objs[SFC_RX_REFILL_BULK];
+	unsigned int added = rxq->added;
+
+	free_space = EFX_RXQ_LIMIT(rxq_ptr_mask + 1) - (added - rxq->completed);
+
+	if (free_space < rxq->refill_threshold)
+		return;
+
+	bulks = free_space / RTE_DIM(objs);
+
+	while (bulks-- > 0) {
+		unsigned int id;
+		unsigned int i;
+
+		if (unlikely(rte_mempool_get_bulk(rxq->refill_mb_pool, objs,
+						  RTE_DIM(objs)) < 0)) {
+			struct rte_eth_dev_data *dev_data =
+				rte_eth_devices[rxq->port_id].data;
+
+			/*
+			 * Incrementing the counter from different contexts
+			 * is hardly safe, but all PMDs do it this way.
+			 */
+			dev_data->rx_mbuf_alloc_failed += RTE_DIM(objs);
+			break;
+		}
+
+		for (i = 0, id = added & rxq_ptr_mask;
+		     i < RTE_DIM(objs);
+		     ++i, ++id) {
+			struct rte_mbuf *m = objs[i];
+			struct sfc_ef10_ps_rx_sw_desc *rxd;
+			unsigned int req_align;
+			uintptr_t dp;
+			uintptr_t adjust_align;
+
+			SFC_ASSERT((id & ~rxq_ptr_mask) == 0);
+			rxd = &rxq->sw_ring[id];
+			rxd->mbuf = m;
+
+			rte_mbuf_refcnt_set(m, 1);
+
+			req_align = SFC_PACKED_STREAM_BUFSIZE;
+			dp = rte_pktmbuf_mtophys(m) + req_align;
+			adjust_align = RTE_ALIGN_CEIL(dp, req_align) - dp;
+
+			/*
+			 * Align using priv_size to be able to
+			 * find correct address of the mbuf on
+			 * detach (see rte_mbuf_from_indirect()).
+			 */
+			m->buf_addr = RTE_PTR_ADD(m->buf_addr, adjust_align);
+			m->buf_physaddr += adjust_align;
+			m->priv_size += adjust_align;
+
+			EFX_POPULATE_QWORD_2(rxq->rxq_hw_ring[id],
+			    ESF_DZ_RX_KER_BYTE_CNT,
+			    EFX_RXQ_PACKED_STREAM_FAKE_BUF_SIZE,
+			    ESF_DZ_RX_KER_BUF_ADDR, m->buf_physaddr);
+		}
+
+		added += RTE_DIM(objs);
+	}
+
+	/* Push doorbell if something is posted */
+	if (likely(rxq->added != added)) {
+		rxq->added = added;
+		sfc_ef10_ps_rx_qpush(rxq);
+	}
+}
+
+static uint16_t
+sfc_ef10_ps_rx_get_pending(struct sfc_ef10_ps_rxq *rxq,
+			   struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+{
+	uint16_t n_rx_pkts = RTE_MIN(nb_pkts, rxq->pending_pkts);
+	struct rte_mbuf *hb;
+	const uint8_t *next_pkt;
+	unsigned int i;
+
+	if (n_rx_pkts == 0)
+		return 0;
+
+	if (rte_mempool_get_bulk(rxq->indirect_mb_pool,
+				 (void **)rx_pkts,
+				 n_rx_pkts) < 0)
+		return 0;
+
+	rxq->pending_pkts -= n_rx_pkts;
+
+	hb = rxq->sw_ring[rxq->completed & rxq->rxq_ptr_mask].mbuf;
+	next_pkt = rxq->next_pkt;
+
+	for (i = 0; i < n_rx_pkts; ++i) {
+		struct rte_mbuf *m = rx_pkts[i];
+		const efx_qword_t *qwordp;
+		uint16_t pkt_len;
+		uint16_t buf_len;
+
+		/* Parse pseudo-header */
+		qwordp = (const efx_qword_t *)next_pkt;
+		pkt_len = EFX_QWORD_FIELD(*qwordp, ES_DZ_PS_RX_PREFIX_ORIG_LEN);
+		buf_len = EFX_QWORD_FIELD(*qwordp, ES_DZ_PS_RX_PREFIX_CAP_LEN);
+
+		/* Prepare indirect mbuf */
+		rte_mbuf_refcnt_set(m, 1);
+		/* reference counter is incremented on huge mbuf */
+		rte_pktmbuf_attach(m, hb);
+		m->data_off = next_pkt - (uint8_t *)hb->buf_addr +
+			      EFX_RX_PACKED_STREAM_RX_PREFIX_SIZE;
+		rte_pktmbuf_pkt_len(m) = pkt_len;
+		rte_pktmbuf_data_len(m) = pkt_len;
+		m->packet_type = RTE_PTYPE_L2_ETHER;
+
+		/* Move to the next packet */
+		buf_len = SFC_P2_ROUND_UP(buf_len +
+					  EFX_RX_PACKED_STREAM_RX_PREFIX_SIZE,
+					  EFX_RX_PACKED_STREAM_ALIGNMENT);
+		next_pkt += buf_len + EFX_RX_PACKED_STREAM_ALIGNMENT;
+	}
+
+	rxq->next_pkt = next_pkt;
+
+	return n_rx_pkts;
+}
+
+static void
+sfc_ef10_ps_rx_discard_pending(struct sfc_ef10_ps_rxq *rxq)
+{
+	const uint8_t *next_pkt;
+	unsigned int i;
+
+	next_pkt = rxq->next_pkt;
+
+	for (i = 0; i < rxq->pending_pkts; ++i) {
+		const efx_qword_t *qwordp;
+		uint16_t buf_len;
+
+		qwordp = (const efx_qword_t *)next_pkt;
+		buf_len = EFX_QWORD_FIELD(*qwordp, ES_DZ_PS_RX_PREFIX_CAP_LEN);
+		buf_len = SFC_P2_ROUND_UP(buf_len +
+					  EFX_RX_PACKED_STREAM_RX_PREFIX_SIZE,
+					  EFX_RX_PACKED_STREAM_ALIGNMENT);
+		next_pkt += buf_len + EFX_RX_PACKED_STREAM_ALIGNMENT;
+	}
+
+	rxq->next_pkt = next_pkt;
+	rxq->pending_pkts = 0;
+}
+
+static void
+sfc_ef10_ps_rx_process_ev(struct sfc_ef10_ps_rxq *rxq, efx_qword_t rx_ev)
+{
+	unsigned int ready;
+
+	SFC_ASSERT(rxq->pending_pkts == 0);
+
+	ready = (EFX_QWORD_FIELD(rx_ev, ESF_DZ_RX_DSC_PTR_LBITS) -
+		 rxq->packets) &
+		EFX_MASK32(ESF_DZ_RX_DSC_PTR_LBITS);
+
+	rxq->packets += ready;
+	rxq->pending_pkts = ready;
+
+	if (EFX_TEST_QWORD_BIT(rx_ev, ESF_DZ_RX_EV_ROTATE_LBN)) {
+		struct sfc_ef10_ps_rx_sw_desc *rxd;
+
+		/* Credit is spent by firmware */
+		rxq->credits++;
+
+		/* Drop our reference to huge buffer */
+		rxd = &rxq->sw_ring[rxq->completed & rxq->rxq_ptr_mask];
+		rte_pktmbuf_free(rxd->mbuf);
+
+		/* Switch to the next huge buffer */
+		rxq->completed++;
+		rxd = &rxq->sw_ring[rxq->completed & rxq->rxq_ptr_mask];
+		rxq->next_pkt = rxd->mbuf->buf_addr;
+	}
+
+	if (rx_ev.eq_u64[0] &
+	    rte_cpu_to_le_64((1ull << ESF_DZ_RX_ECC_ERR_LBN) |
+			     (1ull << ESF_DZ_RX_ECRC_ERR_LBN)))
+		sfc_ef10_ps_rx_discard_pending(rxq);
+}
+
+static bool
+sfc_ef10_ps_rx_event_get(struct sfc_ef10_ps_rxq *rxq, efx_qword_t *rx_ev)
+{
+	if (unlikely(rxq->flags & SFC_EF10_PS_RXQ_EXCEPTION))
+		return false;
+
+	*rx_ev = rxq->evq_hw_ring[rxq->evq_read_ptr & rxq->evq_ptr_mask];
+
+	if (rx_ev->eq_u64[0] == UINT64_MAX)
+		return false;
+
+	if (unlikely(EFX_QWORD_FIELD(*rx_ev, FSF_AZ_EV_CODE) !=
+		     FSE_AZ_EV_CODE_RX_EV)) {
+		/*
+		 * Do not move read_ptr to keep the event for exception
+		 * handling
+		 */
+		rxq->flags |= SFC_EF10_PS_RXQ_EXCEPTION;
+		return false;
+	}
+
+	rxq->evq_read_ptr++;
+	return true;
+}
+
+static void
+sfc_ef10_ps_ev_qfill(struct sfc_ef10_ps_rxq *rxq, unsigned int old_read_ptr)
+{
+	const unsigned int read_ptr = rxq->evq_read_ptr;
+	const unsigned int evq_ptr_mask = rxq->evq_ptr_mask;
+
+	while (old_read_ptr != read_ptr) {
+		EFX_SET_QWORD(rxq->evq_hw_ring[old_read_ptr & evq_ptr_mask]);
+		++old_read_ptr;
+	}
+
+	/*
+	 * No barriers here.
+	 * Functions which push the doorbell are responsible for ordering:
+	 * stores which fill in the EvQ ring must be retired from the CPU
+	 * and DMA-synced before the doorbell that makes these event
+	 * entries available to the hardware again.
+	 */
+}
+
+static uint16_t
+sfc_ef10_ps_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+		      uint16_t nb_pkts)
+{
+	struct sfc_ef10_ps_rxq *rxq = sfc_ef10_ps_rxq_by_dp_rxq(rx_queue);
+	const unsigned int evq_old_read_ptr = rxq->evq_read_ptr;
+	uint16_t n_rx_pkts;
+	efx_qword_t rx_ev;
+
+	if (unlikely((rxq->flags & SFC_EF10_PS_RXQ_RUNNING) == 0))
+		return 0;
+
+	n_rx_pkts = sfc_ef10_ps_rx_get_pending(rxq, rx_pkts, nb_pkts);
+
+	while (n_rx_pkts != nb_pkts && sfc_ef10_ps_rx_event_get(rxq, &rx_ev)) {
+		if (EFX_TEST_QWORD_BIT(rx_ev, ESF_DZ_RX_DROP_EVENT_LBN))
+			continue;
+
+		sfc_ef10_ps_rx_process_ev(rxq, rx_ev);
+		n_rx_pkts += sfc_ef10_ps_rx_get_pending(rxq,
+							rx_pkts + n_rx_pkts,
+							nb_pkts - n_rx_pkts);
+	}
+
+	sfc_ef10_ps_ev_qfill(rxq, evq_old_read_ptr);
+
+	if (unlikely(rxq->flags & SFC_EF10_PS_RXQ_EXCEPTION)) {
+		/*
+		 * Exception handling may restart the RxQ, so make sure
+		 * that no cached variables are used afterwards.
+		 */
+		rxq->exception(rxq->ctrl);
+	} else {
+		sfc_ef10_ps_rx_update_credits(rxq);
+		sfc_ef10_ps_rx_qrefill(rxq);
+	}
+
+	return n_rx_pkts;
+}
+
+static const uint32_t *
+sfc_ef10_ps_supported_ptypes_get(void)
+{
+	static const uint32_t ef10_packed_ptypes[] = {
+		RTE_PTYPE_L2_ETHER,
+		RTE_PTYPE_UNKNOWN
+	};
+
+	return ef10_packed_ptypes;
+}
+
+static sfc_dp_rx_qdesc_npending_t sfc_ef10_ps_rx_qdesc_npending;
+static unsigned int
+sfc_ef10_ps_rx_qdesc_npending(__rte_unused struct sfc_dp_rxq *dp_rxq)
+{
+	/*
+	 * A correct implementation would require EvQ polling and
+	 * event processing.
+	 */
+	return -ENOTSUP;
+}
+
+static void *
+sfc_ef10_ps_rxq_get_ctrl(struct sfc_dp_rxq *dp_rxq)
+{
+	struct sfc_ef10_ps_rxq *rxq = sfc_ef10_ps_rxq_by_dp_rxq(dp_rxq);
+
+	return rxq->ctrl;
+}
+
+static const struct sfc_dp_rxq_ops sfc_ef10_ps_rxq_ops = {
+	.get_ctrl	= sfc_ef10_ps_rxq_get_ctrl,
+};
+
+
+static struct rte_mempool *
+sfc_ef10_ps_rx_huge_pktmbuf_pool_create(struct sfc_ef10_ps_rxq *rxq,
+					unsigned int hw_index, int socket_id)
+{
+	struct rte_pktmbuf_pool_private mbp_priv;
+	unsigned int elt_size;
+	char hb_pool_name[64];
+
+	/* Twice the size to guarantee alignment */
+	elt_size = sizeof(struct rte_mbuf) + SFC_PACKED_STREAM_BUFSIZE * 2;
+
+	/* mbufs from this pool are not real mbufs, they are
+	 * used solely to hold large data buffers, so most mbuf-specific
+	 * fields are really unimportant. In fact, the mbuf data room
+	 * size does not even fit into a 16-bit integer.
+	 */
+	memset(&mbp_priv, 0, sizeof(mbp_priv));
+
+	snprintf(hb_pool_name, sizeof(hb_pool_name),
+		 "sfc-hugebuf%u.%u", rxq->port_id, hw_index);
+
+	/* ptr_mask is the number of entries in the queue minus 1,
+	 * which happens to be the optimal size for rte_mempool_create
+	 */
+	return rte_mempool_create(hb_pool_name, rxq->rxq_ptr_mask, elt_size,
+				  0, sizeof(struct rte_pktmbuf_pool_private),
+				  rte_pktmbuf_pool_init, &mbp_priv,
+				  rte_pktmbuf_init, NULL,
+				  socket_id, 0);
+}
+
+static sfc_dp_rx_qcreate_t sfc_ef10_ps_rx_qcreate;
+static int
+sfc_ef10_ps_rx_qcreate(void *ctrl, sfc_dp_exception_t *exception,
+		       int socket_id,
+		       const struct sfc_dp_rx_qcreate_args *args,
+		       struct sfc_dp_rxq **dp_rxqp)
+{
+	struct sfc_ef10_ps_rxq *rxq;
+	int rc;
+
+	rc = ENOMEM;
+	rxq = rte_zmalloc_socket("sfc-ef10-rxq", sizeof(*rxq),
+				 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq == NULL)
+		goto fail_rxq_alloc;
+
+	rc = ENOMEM;
+	rxq->sw_ring = rte_calloc_socket("sfc-ef10-rxq-sw_ring",
+					 args->rxq_entries,
+					 sizeof(*rxq->sw_ring),
+					 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq->sw_ring == NULL)
+		goto fail_desc_alloc;
+
+	rxq->rxq_ptr_mask = args->rxq_entries - 1;
+	rxq->evq_ptr_mask = args->evq_entries - 1;
+	rxq->evq_hw_ring = args->evq_hw_ring;
+	rxq->refill_threshold = args->refill_threshold;
+	rxq->port_id = args->port_id;
+	rxq->indirect_mb_pool = args->refill_mb_pool;
+	rxq->rxq_hw_ring = args->rxq_hw_ring;
+
+	rc = ENOMEM;
+	rxq->refill_mb_pool =
+		sfc_ef10_ps_rx_huge_pktmbuf_pool_create(rxq, args->hw_index,
+							socket_id);
+	if (rxq->refill_mb_pool == NULL)
+		goto fail_huge_pktmbuf_pool_create;
+
+	rxq->doorbell = (volatile uint8_t *)args->mem_bar +
+			ER_DZ_RX_DESC_UPD_REG_OFST +
+			args->hw_index * ER_DZ_RX_DESC_UPD_REG_STEP;
+
+	rxq->dp.ops = &sfc_ef10_ps_rxq_ops;
+	rxq->ctrl = ctrl;
+	rxq->exception = exception;
+
+	*dp_rxqp = &rxq->dp;
+	return 0;
+
+fail_huge_pktmbuf_pool_create:
+	rte_free(rxq->sw_ring);
+
+fail_desc_alloc:
+	rte_free(rxq);
+
+fail_rxq_alloc:
+	return rc;
+}
+
+static sfc_dp_rx_qdestroy_t sfc_ef10_ps_rx_qdestroy;
+static void
+sfc_ef10_ps_rx_qdestroy(struct sfc_dp_rxq *dp_rxq)
+{
+	struct sfc_ef10_ps_rxq *rxq = sfc_ef10_ps_rxq_by_dp_rxq(dp_rxq);
+
+	rte_mempool_free(rxq->refill_mb_pool);
+	rte_free(rxq->sw_ring);
+	rte_free(rxq);
+}
+
+static sfc_dp_rx_qstart_t sfc_ef10_ps_rx_qstart;
+static int
+sfc_ef10_ps_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr)
+{
+	struct sfc_ef10_ps_rxq *rxq = sfc_ef10_ps_rxq_by_dp_rxq(dp_rxq);
+	struct sfc_ef10_ps_rx_sw_desc *rxd;
+
+	rxq->pending_pkts = 0;
+	rxq->evq_read_ptr = evq_read_ptr;
+
+	/* Initialize before refill */
+	rxq->completed = rxq->added = 0;
+
+	sfc_ef10_ps_rx_qrefill(rxq);
+
+	/* Step back to handle the first EV_ROTATE correctly */
+	rxq->completed--;
+	/*
+	 * Allocate dummy mbuf to be freed on the first EV_ROTATE.
+	 * It is not used, so do not bother to initialize it.
+	 */
+	rxd = &rxq->sw_ring[rxq->completed & rxq->rxq_ptr_mask];
+	rxd->mbuf = rte_mbuf_raw_alloc(rxq->refill_mb_pool);
+	if (rxd->mbuf == NULL)
+		return ENOMEM;
+
+	rxq->flags |= (SFC_EF10_PS_RXQ_STARTED | SFC_EF10_PS_RXQ_RUNNING);
+	rxq->flags &= ~SFC_EF10_PS_RXQ_EXCEPTION;
+
+	/*
+	 * Control path grants initial packed stream credits to firmware
+	 * in accordance with event queue size. We simply track when
+	 * credits are spent and refill.
+	 */
+
+	return 0;
+}
+
+static sfc_dp_rx_qstop_t sfc_ef10_ps_rx_qstop;
+static void
+sfc_ef10_ps_rx_qstop(struct sfc_dp_rxq *dp_rxq, unsigned int *evq_read_ptr)
+{
+	struct sfc_ef10_ps_rxq *rxq = sfc_ef10_ps_rxq_by_dp_rxq(dp_rxq);
+
+	rxq->flags &= ~SFC_EF10_PS_RXQ_RUNNING;
+
+	*evq_read_ptr = rxq->evq_read_ptr;
+}
+
+static sfc_dp_rx_qrx_ev_t sfc_ef10_ps_rx_qrx_ev;
+static bool
+sfc_ef10_ps_rx_qrx_ev(struct sfc_dp_rxq *dp_rxq, __rte_unused unsigned int id)
+{
+	__rte_unused struct sfc_ef10_ps_rxq *rxq;
+
+	rxq = sfc_ef10_ps_rxq_by_dp_rxq(dp_rxq);
+	SFC_ASSERT(~rxq->flags & SFC_EF10_PS_RXQ_RUNNING);
+
+	/*
+	 * It is safe to ignore Rx event since we free all mbufs on
+	 * queue purge anyway.
+	 */
+
+	return false;
+}
+
+static sfc_dp_rx_qpurge_t sfc_ef10_ps_rx_qpurge;
+static void
+sfc_ef10_ps_rx_qpurge(struct sfc_dp_rxq *dp_rxq)
+{
+	struct sfc_ef10_ps_rxq *rxq = sfc_ef10_ps_rxq_by_dp_rxq(dp_rxq);
+	unsigned int i;
+	struct sfc_ef10_ps_rx_sw_desc *rxd;
+
+	for (i = rxq->completed; i != rxq->added; ++i) {
+		rxd = &rxq->sw_ring[i & rxq->rxq_ptr_mask];
+		rte_pktmbuf_free(rxd->mbuf);
+	}
+
+	rxq->flags &= ~SFC_EF10_PS_RXQ_STARTED;
+}
+
+struct sfc_dp_rx sfc_ef10_ps_rx = {
+	.dp = {
+		.name		= SFC_KVARG_DATAPATH_EF10_RX_PACKED,
+		.type		= SFC_DP_RX,
+		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF10 |
+				  SFC_DP_HW_FW_CAP_RX_PACKED_STREAM_64K,
+	},
+	.features		= 0,
+	.qcreate		= sfc_ef10_ps_rx_qcreate,
+	.qdestroy		= sfc_ef10_ps_rx_qdestroy,
+	.qstart			= sfc_ef10_ps_rx_qstart,
+	.qstop			= sfc_ef10_ps_rx_qstop,
+	.qrx_ev			= sfc_ef10_ps_rx_qrx_ev,
+	.qpurge			= sfc_ef10_ps_rx_qpurge,
+	.supported_ptypes_get	= sfc_ef10_ps_supported_ptypes_get,
+	.qdesc_npending		= sfc_ef10_ps_rx_qdesc_npending,
+	.pkt_burst		= sfc_ef10_ps_recv_pkts,
+};
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index ffa6abe..e36f18b 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1234,6 +1234,7 @@
 sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 {
 	struct sfc_adapter *sa = dev->data->dev_private;
+	const efx_nic_cfg_t *encp;
 	unsigned int avail_caps = 0;
 	const char *rx_name = NULL;
 	const char *tx_name = NULL;
@@ -1251,6 +1252,11 @@
 		break;
 	}
 
+	encp = efx_nic_cfg_get(sa->nic);
+	/* Only 64k buffers are supported */
+	if (encp->enc_rx_packed_stream_supported)
+		avail_caps |= SFC_DP_HW_FW_CAP_RX_PACKED_STREAM_64K;
+
 	rc = sfc_kvargs_process(sa, SFC_KVARG_RX_DATAPATH,
 				sfc_kvarg_string_handler, &rx_name);
 	if (rc != 0)
@@ -1334,6 +1340,7 @@
 	/* Register once */
 	if (TAILQ_EMPTY(&sfc_dp_head)) {
 		/* Prefer EF10 datapath */
+		sfc_dp_register(&sfc_dp_head, &sfc_ef10_ps_rx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_ef10_rx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_efx_rx.dp);
 
diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index 5e7c619..c26af3e 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -171,6 +171,35 @@
 }
 
 static boolean_t
+sfc_ev_nop_rx_ps(void *arg, uint32_t label, uint32_t id,
+		 uint32_t pkt_count, uint16_t flags)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa,
+		"EVQ %u unexpected packed stream Rx event label=%u id=%#x pkt_count=%u flags=%#x",
+		evq->evq_index, label, id, pkt_count, flags);
+	return B_TRUE;
+}
+
+/* It is not actually used on the datapath, but is required on RxQ flush */
+static boolean_t
+sfc_ev_dp_rx_ps(void *arg, __rte_unused uint32_t label, uint32_t id,
+		__rte_unused uint32_t pkt_count, __rte_unused uint16_t flags)
+{
+	struct sfc_evq *evq = arg;
+	struct sfc_dp_rxq *dp_rxq;
+
+	dp_rxq = evq->dp_rxq;
+	SFC_ASSERT(dp_rxq != NULL);
+
+	if (evq->sa->dp_rx->qrx_ps_ev != NULL)
+		return evq->sa->dp_rx->qrx_ps_ev(dp_rxq, id);
+	else
+		return B_FALSE;
+}
+
+static boolean_t
 sfc_ev_nop_tx(void *arg, uint32_t label, uint32_t id)
 {
 	struct sfc_evq *evq = arg;
@@ -419,6 +448,7 @@
 static const efx_ev_callbacks_t sfc_ev_callbacks = {
 	.eec_initialized	= sfc_ev_initialized,
 	.eec_rx			= sfc_ev_nop_rx,
+	.eec_rx_ps		= sfc_ev_nop_rx_ps,
 	.eec_tx			= sfc_ev_nop_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_nop_rxq_flush_done,
@@ -434,6 +464,7 @@
 static const efx_ev_callbacks_t sfc_ev_callbacks_efx_rx = {
 	.eec_initialized	= sfc_ev_initialized,
 	.eec_rx			= sfc_ev_efx_rx,
+	.eec_rx_ps		= sfc_ev_nop_rx_ps,
 	.eec_tx			= sfc_ev_nop_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_rxq_flush_done,
@@ -449,6 +480,7 @@
 static const efx_ev_callbacks_t sfc_ev_callbacks_dp_rx = {
 	.eec_initialized	= sfc_ev_initialized,
 	.eec_rx			= sfc_ev_dp_rx,
+	.eec_rx_ps		= sfc_ev_dp_rx_ps,
 	.eec_tx			= sfc_ev_nop_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_rxq_flush_done,
@@ -464,6 +496,7 @@
 static const efx_ev_callbacks_t sfc_ev_callbacks_efx_tx = {
 	.eec_initialized	= sfc_ev_initialized,
 	.eec_rx			= sfc_ev_nop_rx,
+	.eec_rx_ps		= sfc_ev_nop_rx_ps,
 	.eec_tx			= sfc_ev_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_nop_rxq_flush_done,
@@ -479,6 +512,7 @@
 static const efx_ev_callbacks_t sfc_ev_callbacks_dp_tx = {
 	.eec_initialized	= sfc_ev_initialized,
 	.eec_rx			= sfc_ev_nop_rx,
+	.eec_rx_ps		= sfc_ev_nop_rx_ps,
 	.eec_tx			= sfc_ev_dp_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_nop_rxq_flush_done,
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index e4ee28f..c319bd2 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -55,11 +55,13 @@
 #define SFC_KVARG_DATAPATH_EFX		"efx"
 #define SFC_KVARG_DATAPATH_EF10		"ef10"
 #define SFC_KVARG_DATAPATH_EF10_SIMPLE	"ef10_simple"
+#define SFC_KVARG_DATAPATH_EF10_RX_PACKED	"ef10_packed"
 
 #define SFC_KVARG_RX_DATAPATH		"rx_datapath"
 #define SFC_KVARG_VALUES_RX_DATAPATH \
 	"[" SFC_KVARG_DATAPATH_EFX "|" \
-	    SFC_KVARG_DATAPATH_EF10 "]"
+	    SFC_KVARG_DATAPATH_EF10 "|" \
+	    SFC_KVARG_DATAPATH_EF10_RX_PACKED "]"
 
 #define SFC_KVARG_TX_DATAPATH		"tx_datapath"
 #define SFC_KVARG_VALUES_TX_DATAPATH \
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 2dda5c7..ace02fa 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -861,6 +861,9 @@ struct sfc_dp_rx sfc_efx_rx = {
 	SFC_ASSERT(nb_rx_desc <= rxq_info->max_entries);
 	rxq_info->entries = nb_rx_desc;
 	rxq_info->type =
+		(sa->dp_rx->dp.hw_fw_caps &
+		 SFC_DP_HW_FW_CAP_RX_PACKED_STREAM_64K) ?
+		EFX_RXQ_TYPE_PACKED_STREAM_64K :
 		sa->eth_dev->data->dev_conf.rxmode.enable_scatter ?
 		EFX_RXQ_TYPE_SCATTER : EFX_RXQ_TYPE_DEFAULT;
 
-- 
1.8.2.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] [PATCH 01/13] net/sfc: callbacks should depend on EvQ usage
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 01/13] net/sfc: callbacks should depend on EvQ usage Andrew Rybchenko
@ 2017-03-04 21:04   ` Ferruh Yigit
  0 siblings, 0 replies; 33+ messages in thread
From: Ferruh Yigit @ 2017-03-04 21:04 UTC (permalink / raw)
  To: Andrew Rybchenko, dev

On 3/2/2017 7:07 AM, Andrew Rybchenko wrote:
> Use different sets of libefx EvQ callbacks for management,
> transmit and receive event queue. It makes event handling
> more robust against unexpected events.
> 
> Also it is required for alternative datapath support.
> 
> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>

Can you please update the patch titles to start with a verb describing what
the patch does? Most of the patches in this patchset are already this way,
but some are not.

For example, this patch title can be something like:
"net/sfc: use different callbacks for EvQ"

Thanks,
ferruh

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] [PATCH 04/13] net/sfc: factor out libefx-based Rx datapath
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 04/13] net/sfc: factor out libefx-based Rx datapath Andrew Rybchenko
@ 2017-03-04 21:05   ` Ferruh Yigit
  2017-03-13 13:12     ` Andrew Rybchenko
  0 siblings, 1 reply; 33+ messages in thread
From: Ferruh Yigit @ 2017-03-04 21:05 UTC (permalink / raw)
  To: Andrew Rybchenko, dev

On 3/2/2017 7:07 AM, Andrew Rybchenko wrote:
> Split control and datapath to make datapath substitutable and
> possibly reusable with alternative control path.

Does it make sense to document how alternative control path can be used?

> 
> libefx-based Rx datapath is bound to libefx control path, but
> other datapaths should be possible to use with alternative
> control path(s).
> 
> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
<...>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (12 preceding siblings ...)
  2017-03-02  7:07 ` [dpdk-dev] [PATCH 13/13] net/sfc: support Rx packed stream EF10-specific datapath Andrew Rybchenko
@ 2017-03-04 21:07 ` Ferruh Yigit
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
  14 siblings, 0 replies; 33+ messages in thread
From: Ferruh Yigit @ 2017-03-04 21:07 UTC (permalink / raw)
  To: Andrew Rybchenko, dev

On 3/2/2017 7:07 AM, Andrew Rybchenko wrote:
> Implement EF10 (SFN7xxx and SFN8xxx) native datapaths which may be
> chosen per device using PCI whitelist device arguments.
> 
> libefx-based datapath implementation is bound to API and structure
> imposed by the libefx. It has many indirect function calls to
> provide HW abstraction (bad for CPU pipeline) and uses many data
> structures: driver Rx/Tx queue, driver event queue, libefx Rx/Tx
> queue, libefx event queue, libefx NIC (bad for cache).
> 
> Native datapath implementation is fully separated from control
> path to be able to use alternative control path if required
> (e.g. kernel-aware).
> 
> Native datapaths show better performance than libefx-based.
> 
> Andrew Rybchenko (13):
>   net/sfc: callbacks should depend on EvQ usage
>   net/sfc: emphasis that RSS hash flag is an Rx queue flag
>   net/sfc: do not use Rx queue control state on datapath
>   net/sfc: factor out libefx-based Rx datapath
>   net/sfc: Rx scatter is a datapath-dependent feature
>   net/sfc: implement EF10 native Rx datapath
>   net/sfc: factory out libefx-based Tx datapath
>   net/sfc: VLAN insertion is a datapath dependent feature
>   net/sfc: TSO is a datapath dependent feature
>   net/sfc: implement EF10 native Tx datapath
>   net/sfc: multi-segment support as is Tx datapath features
>   net/sfc: implement simple EF10 native Tx datapath
>   net/sfc: support Rx packed stream EF10-specific datapath

Hi Andrew,

Overall, basic tests for the patchset are good. Can you please just update
some patches' titles; I commented in the patches already.

Thanks,
ferruh

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] [PATCH 04/13] net/sfc: factor out libefx-based Rx datapath
  2017-03-04 21:05   ` Ferruh Yigit
@ 2017-03-13 13:12     ` Andrew Rybchenko
  0 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-13 13:12 UTC (permalink / raw)
  To: Ferruh Yigit, dev

On 03/05/2017 12:05 AM, Ferruh Yigit wrote:
> On 3/2/2017 7:07 AM, Andrew Rybchenko wrote:
>> Split control and datapath to make datapath substitutable and
>> possibly reusable with alternative control path.
> Does it make sense to document how alternative control path can be used?

Using an alternative control path is just a raw idea right now, without a
real implementation. Yes, the split is done taking the possibility into
account, but it is just a design. So, I think right now it is sufficient
to just mention it.

>> libefx-based Rx datapath is bound to libefx control path, but
>> other datapaths should be possible to use with alternative
>> control path(s).
>>
>> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
> <...>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 00/13] Improve Solarflare PMD performance
  2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
                   ` (13 preceding siblings ...)
  2017-03-04 21:07 ` [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Ferruh Yigit
@ 2017-03-20 10:15 ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 01/13] net/sfc: use different callbacks for event queues Andrew Rybchenko
                     ` (13 more replies)
  14 siblings, 14 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Implement EF10 (SFN7xxx and SFN8xxx) native datapaths which may be
chosen per device using PCI whitelist device arguments.

libefx-based datapath implementation is bound to API and structure
imposed by the libefx. It has many indirect function calls to
provide HW abstraction (bad for CPU pipeline) and uses many data
structures: driver Rx/Tx queue, driver event queue, libefx Rx/Tx
queue, libefx event queue, libefx NIC (bad for cache).

Native datapath implementation is fully separated from control
path to be able to use alternative control path if required
(e.g. kernel-aware).

Native datapaths show better performance than libefx-based.

v2:
 - fix spelling, reword commit messages as requested
 - exclude packed stream support for now since it shows worse performance
   because of indirect mbuf usage
 - use uint16_t for port_id to avoid changes when corresponding
   mbuf patches are applied
 - add a header with functions shared by EF10 Rx and Tx
 - clear event queue entries by cache-lines
 - remove unnecessary checks in refill code
 - add missing BSD LICENSE line to new files
 - avoid use of function pointers in state structures to make the EF10
   native datapath friendly to multi-process support and avoid code
   shuffling in the future
 - remove unnecessary memory barriers, add corresponding comments
 - do not use libefx macros for Rx/Tx queue limits; define our own which
   take clearing event queue entries by cache-lines into account

Andrew Rybchenko (13):
  net/sfc: use different callbacks for event queues
  net/sfc: emphasize that RSS hash flag is an Rx queue flag
  net/sfc: do not use Rx queue control state on datapath
  net/sfc: factor out libefx-based Rx datapath
  net/sfc: make Rx scatter a datapath-dependent feature
  net/sfc: remove few conditions in Rx queue refill
  net/sfc: implement EF10 native Rx datapath
  net/sfc: factor out libefx-based Tx datapath
  net/sfc: make VLAN insertion a datapath-dependent feature
  net/sfc: make TSO a datapath-dependent feature
  net/sfc: implement EF10 native Tx datapath
  net/sfc: make multi-segment support a Tx datapath feature
  net/sfc: implement simple EF10 native Tx datapath

 doc/guides/nics/sfc_efx.rst   |  24 ++
 drivers/net/sfc/Makefile      |   3 +
 drivers/net/sfc/sfc.h         |   4 +
 drivers/net/sfc/sfc_dp.c      | 100 ++++++
 drivers/net/sfc/sfc_dp.h      | 125 ++++++++
 drivers/net/sfc/sfc_dp_rx.h   | 197 ++++++++++++
 drivers/net/sfc/sfc_dp_tx.h   | 170 ++++++++++
 drivers/net/sfc/sfc_ef10.h    | 107 +++++++
 drivers/net/sfc/sfc_ef10_rx.c | 712 ++++++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ef10_tx.c | 524 +++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c  | 165 ++++++++--
 drivers/net/sfc/sfc_ev.c      | 231 ++++++++++++--
 drivers/net/sfc/sfc_ev.h      |  27 +-
 drivers/net/sfc/sfc_kvargs.c  |  11 +
 drivers/net/sfc/sfc_kvargs.h  |  18 +-
 drivers/net/sfc/sfc_rx.c      | 333 ++++++++++++++++----
 drivers/net/sfc/sfc_rx.h      |  77 +++--
 drivers/net/sfc/sfc_tso.c     |  22 +-
 drivers/net/sfc/sfc_tx.c      | 353 +++++++++++++++------
 drivers/net/sfc/sfc_tx.h      |  98 ++++--
 20 files changed, 3012 insertions(+), 289 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_dp.c
 create mode 100644 drivers/net/sfc/sfc_dp.h
 create mode 100644 drivers/net/sfc/sfc_dp_rx.h
 create mode 100644 drivers/net/sfc/sfc_dp_tx.h
 create mode 100644 drivers/net/sfc/sfc_ef10.h
 create mode 100644 drivers/net/sfc/sfc_ef10_rx.c
 create mode 100644 drivers/net/sfc/sfc_ef10_tx.c

-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 01/13] net/sfc: use different callbacks for event queues
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 02/13] net/sfc: emphasize that RSS hash flag is an Rx queue flag Andrew Rybchenko
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Use different sets of libefx EvQ callbacks for management,
transmit and receive event queues. This makes event handling
more robust against unexpected events.

It is also required for alternative datapath support.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_ev.c | 107 +++++++++++++++++++++++++++++++++++++++++++++--
 drivers/net/sfc/sfc_ev.h |  19 +++++----
 2 files changed, 114 insertions(+), 12 deletions(-)

diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index 412645a..bb22001 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -71,6 +71,18 @@ sfc_ev_initialized(void *arg)
 }
 
 static boolean_t
+sfc_ev_nop_rx(void *arg, uint32_t label, uint32_t id,
+	      uint32_t size, uint16_t flags)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa,
+		"EVQ %u unexpected Rx event label=%u id=%#x size=%u flags=%#x",
+		evq->evq_index, label, id, size, flags);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
 	  uint32_t size, uint16_t flags)
 {
@@ -144,6 +156,16 @@ sfc_ev_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
 }
 
 static boolean_t
+sfc_ev_nop_tx(void *arg, uint32_t label, uint32_t id)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa, "EVQ %u unexpected Tx event label=%u id=%#x",
+		evq->evq_index, label, id);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_tx(void *arg, __rte_unused uint32_t label, uint32_t id)
 {
 	struct sfc_evq *evq = arg;
@@ -198,6 +220,16 @@ sfc_ev_exception(void *arg, __rte_unused uint32_t code,
 }
 
 static boolean_t
+sfc_ev_nop_rxq_flush_done(void *arg, uint32_t rxq_hw_index)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa, "EVQ %u unexpected RxQ %u flush done",
+		evq->evq_index, rxq_hw_index);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_rxq_flush_done(void *arg, __rte_unused uint32_t rxq_hw_index)
 {
 	struct sfc_evq *evq = arg;
@@ -213,6 +245,16 @@ sfc_ev_rxq_flush_done(void *arg, __rte_unused uint32_t rxq_hw_index)
 }
 
 static boolean_t
+sfc_ev_nop_rxq_flush_failed(void *arg, uint32_t rxq_hw_index)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa, "EVQ %u unexpected RxQ %u flush failed",
+		evq->evq_index, rxq_hw_index);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_rxq_flush_failed(void *arg, __rte_unused uint32_t rxq_hw_index)
 {
 	struct sfc_evq *evq = arg;
@@ -228,6 +270,16 @@ sfc_ev_rxq_flush_failed(void *arg, __rte_unused uint32_t rxq_hw_index)
 }
 
 static boolean_t
+sfc_ev_nop_txq_flush_done(void *arg, uint32_t txq_hw_index)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa, "EVQ %u unexpected TxQ %u flush done",
+		evq->evq_index, txq_hw_index);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_txq_flush_done(void *arg, __rte_unused uint32_t txq_hw_index)
 {
 	struct sfc_evq *evq = arg;
@@ -283,6 +335,16 @@ sfc_ev_timer(void *arg, uint32_t index)
 }
 
 static boolean_t
+sfc_ev_nop_link_change(void *arg, __rte_unused efx_link_mode_t link_mode)
+{
+	struct sfc_evq *evq = arg;
+
+	sfc_err(evq->sa, "EVQ %u unexpected link change event",
+		evq->evq_index);
+	return B_TRUE;
+}
+
+static boolean_t
 sfc_ev_link_change(void *arg, efx_link_mode_t link_mode)
 {
 	struct sfc_evq *evq = arg;
@@ -314,17 +376,47 @@ sfc_ev_link_change(void *arg, efx_link_mode_t link_mode)
 
 static const efx_ev_callbacks_t sfc_ev_callbacks = {
 	.eec_initialized	= sfc_ev_initialized,
+	.eec_rx			= sfc_ev_nop_rx,
+	.eec_tx			= sfc_ev_nop_tx,
+	.eec_exception		= sfc_ev_exception,
+	.eec_rxq_flush_done	= sfc_ev_nop_rxq_flush_done,
+	.eec_rxq_flush_failed	= sfc_ev_nop_rxq_flush_failed,
+	.eec_txq_flush_done	= sfc_ev_nop_txq_flush_done,
+	.eec_software		= sfc_ev_software,
+	.eec_sram		= sfc_ev_sram,
+	.eec_wake_up		= sfc_ev_wake_up,
+	.eec_timer		= sfc_ev_timer,
+	.eec_link_change	= sfc_ev_link_change,
+};
+
+static const efx_ev_callbacks_t sfc_ev_callbacks_rx = {
+	.eec_initialized	= sfc_ev_initialized,
 	.eec_rx			= sfc_ev_rx,
-	.eec_tx			= sfc_ev_tx,
+	.eec_tx			= sfc_ev_nop_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_rxq_flush_done,
 	.eec_rxq_flush_failed	= sfc_ev_rxq_flush_failed,
+	.eec_txq_flush_done	= sfc_ev_nop_txq_flush_done,
+	.eec_software		= sfc_ev_software,
+	.eec_sram		= sfc_ev_sram,
+	.eec_wake_up		= sfc_ev_wake_up,
+	.eec_timer		= sfc_ev_timer,
+	.eec_link_change	= sfc_ev_nop_link_change,
+};
+
+static const efx_ev_callbacks_t sfc_ev_callbacks_tx = {
+	.eec_initialized	= sfc_ev_initialized,
+	.eec_rx			= sfc_ev_nop_rx,
+	.eec_tx			= sfc_ev_tx,
+	.eec_exception		= sfc_ev_exception,
+	.eec_rxq_flush_done	= sfc_ev_nop_rxq_flush_done,
+	.eec_rxq_flush_failed	= sfc_ev_nop_rxq_flush_failed,
 	.eec_txq_flush_done	= sfc_ev_txq_flush_done,
 	.eec_software		= sfc_ev_software,
 	.eec_sram		= sfc_ev_sram,
 	.eec_wake_up		= sfc_ev_wake_up,
 	.eec_timer		= sfc_ev_timer,
-	.eec_link_change	= sfc_ev_link_change,
+	.eec_link_change	= sfc_ev_nop_link_change,
 };
 
 
@@ -336,7 +428,7 @@ sfc_ev_qpoll(struct sfc_evq *evq)
 
 	/* Synchronize the DMA memory for reading not required */
 
-	efx_ev_qpoll(evq->common, &evq->read_ptr, &sfc_ev_callbacks, evq);
+	efx_ev_qpoll(evq->common, &evq->read_ptr, evq->callbacks, evq);
 
 	if (unlikely(evq->exception) && sfc_adapter_trylock(evq->sa)) {
 		struct sfc_adapter *sa = evq->sa;
@@ -427,6 +519,14 @@ sfc_ev_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 	if (rc != 0)
 		goto fail_ev_qcreate;
 
+	SFC_ASSERT(evq->rxq == NULL || evq->txq == NULL);
+	if (evq->rxq != NULL)
+		evq->callbacks = &sfc_ev_callbacks_rx;
+	else if (evq->txq != NULL)
+		evq->callbacks = &sfc_ev_callbacks_tx;
+	else
+		evq->callbacks = &sfc_ev_callbacks;
+
 	evq->init_state = SFC_EVQ_STARTING;
 
 	/* Wait for the initialization event */
@@ -485,6 +585,7 @@ sfc_ev_qstop(struct sfc_adapter *sa, unsigned int sw_index)
 		return;
 
 	evq->init_state = SFC_EVQ_INITIALIZED;
+	evq->callbacks = NULL;
 	evq->read_ptr = 0;
 	evq->exception = B_FALSE;
 
diff --git a/drivers/net/sfc/sfc_ev.h b/drivers/net/sfc/sfc_ev.h
index 5bb2be4..359958e 100644
--- a/drivers/net/sfc/sfc_ev.h
+++ b/drivers/net/sfc/sfc_ev.h
@@ -56,17 +56,18 @@ enum sfc_evq_state {
 
 struct sfc_evq {
 	/* Used on datapath */
-	efx_evq_t		*common;
-	unsigned int		read_ptr;
-	boolean_t		exception;
-	efsys_mem_t		mem;
-	struct sfc_rxq		*rxq;
-	struct sfc_txq		*txq;
+	efx_evq_t			*common;
+	const efx_ev_callbacks_t	*callbacks;
+	unsigned int			read_ptr;
+	boolean_t			exception;
+	efsys_mem_t			mem;
+	struct sfc_rxq			*rxq;
+	struct sfc_txq			*txq;
 
 	/* Not used on datapath */
-	struct sfc_adapter	*sa;
-	unsigned int		evq_index;
-	enum sfc_evq_state	init_state;
+	struct sfc_adapter		*sa;
+	unsigned int			evq_index;
+	enum sfc_evq_state		init_state;
 };
 
 struct sfc_evq_info {
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 02/13] net/sfc: emphasize that RSS hash flag is an Rx queue flag
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 01/13] net/sfc: use different callbacks for event queues Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 03/13] net/sfc: do not use Rx queue control state on datapath Andrew Rybchenko
                     ` (11 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Style fix to establish a namespace for the Rx queue flag defines.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_rx.c | 4 ++--
 drivers/net/sfc/sfc_rx.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 214e640..d6ba4e9 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -194,7 +194,7 @@ sfc_rx_set_rss_hash(struct sfc_rxq *rxq, unsigned int flags, struct rte_mbuf *m)
 	uint8_t *mbuf_data;
 
 
-	if ((rxq->flags & SFC_RXQ_RSS_HASH) == 0)
+	if ((rxq->flags & SFC_RXQ_FLAG_RSS_HASH) == 0)
 		return;
 
 	mbuf_data = rte_pktmbuf_mtod(m, uint8_t *);
@@ -765,7 +765,7 @@ sfc_rx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 
 #if EFSYS_OPT_RX_SCALE
 	if (sa->hash_support == EFX_RX_HASH_AVAILABLE)
-		rxq->flags |= SFC_RXQ_RSS_HASH;
+		rxq->flags |= SFC_RXQ_FLAG_RSS_HASH;
 #endif
 
 	rxq->state = SFC_RXQ_INITIALIZED;
diff --git a/drivers/net/sfc/sfc_rx.h b/drivers/net/sfc/sfc_rx.h
index de4ac12..4e1b4c6 100644
--- a/drivers/net/sfc/sfc_rx.h
+++ b/drivers/net/sfc/sfc_rx.h
@@ -87,7 +87,7 @@ struct sfc_rxq {
 	uint16_t		prefix_size;
 #if EFSYS_OPT_RX_SCALE
 	unsigned int		flags;
-#define SFC_RXQ_RSS_HASH	0x1
+#define SFC_RXQ_FLAG_RSS_HASH	0x1
 #endif
 
 	/* Used on refill */
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 03/13] net/sfc: do not use Rx queue control state on datapath
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 01/13] net/sfc: use different callbacks for event queues Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 02/13] net/sfc: emphasize that RSS hash flag is an Rx queue flag Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 04/13] net/sfc: factor out libefx-based Rx datapath Andrew Rybchenko
                     ` (10 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Rx queue flags should keep only the information required on the datapath.

This is a preparation for splitting the control and data paths.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_ev.c |  5 +++--
 drivers/net/sfc/sfc_rx.c | 12 +++++++-----
 drivers/net/sfc/sfc_rx.h | 12 +++++-------
 3 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index bb22001..cc2a30f 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -101,7 +101,7 @@ sfc_ev_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
 
 	SFC_ASSERT(rxq != NULL);
 	SFC_ASSERT(rxq->evq == evq);
-	SFC_ASSERT(rxq->state & SFC_RXQ_STARTED);
+	SFC_ASSERT(rxq->flags & SFC_RXQ_FLAG_STARTED);
 
 	stop = (id + 1) & rxq->ptr_mask;
 	pending_id = rxq->pending & rxq->ptr_mask;
@@ -434,7 +434,8 @@ sfc_ev_qpoll(struct sfc_evq *evq)
 		struct sfc_adapter *sa = evq->sa;
 		int rc;
 
-		if ((evq->rxq != NULL) && (evq->rxq->state & SFC_RXQ_RUNNING)) {
+		if ((evq->rxq != NULL) &&
+		    (evq->rxq->flags & SFC_RXQ_FLAG_RUNNING)) {
 			unsigned int rxq_sw_index = sfc_rxq_sw_index(evq->rxq);
 
 			sfc_warn(sa,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index d6ba4e9..6af1574 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -219,7 +219,7 @@ sfc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	boolean_t discard_next = B_FALSE;
 	struct rte_mbuf *scatter_pkt = NULL;
 
-	if (unlikely((rxq->state & SFC_RXQ_RUNNING) == 0))
+	if (unlikely((rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0))
 		return 0;
 
 	sfc_ev_qpoll(rxq->evq);
@@ -320,7 +320,7 @@ sfc_rx_qdesc_npending(struct sfc_adapter *sa, unsigned int sw_index)
 	SFC_ASSERT(sw_index < sa->rxq_count);
 	rxq = sa->rxq_info[sw_index].rxq;
 
-	if (rxq == NULL || (rxq->state & SFC_RXQ_RUNNING) == 0)
+	if (rxq == NULL || (rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0)
 		return 0;
 
 	sfc_ev_qpoll(rxq->evq);
@@ -331,7 +331,7 @@ sfc_rx_qdesc_npending(struct sfc_adapter *sa, unsigned int sw_index)
 int
 sfc_rx_qdesc_done(struct sfc_rxq *rxq, unsigned int offset)
 {
-	if ((rxq->state & SFC_RXQ_RUNNING) == 0)
+	if ((rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0)
 		return 0;
 
 	sfc_ev_qpoll(rxq->evq);
@@ -486,7 +486,8 @@ sfc_rx_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 
 	rxq->pending = rxq->completed = rxq->added = rxq->pushed = 0;
 
-	rxq->state |= (SFC_RXQ_STARTED | SFC_RXQ_RUNNING);
+	rxq->state |= SFC_RXQ_STARTED;
+	rxq->flags |= SFC_RXQ_FLAG_STARTED | SFC_RXQ_FLAG_RUNNING;
 
 	sfc_rx_qrefill(rxq);
 
@@ -533,13 +534,14 @@ sfc_rx_qstop(struct sfc_adapter *sa, unsigned int sw_index)
 	sa->eth_dev->data->rx_queue_state[sw_index] =
 		RTE_ETH_QUEUE_STATE_STOPPED;
 
-	rxq->state &= ~SFC_RXQ_RUNNING;
+	rxq->flags &= ~SFC_RXQ_FLAG_RUNNING;
 
 	if (sw_index == 0)
 		efx_mac_filter_default_rxq_clear(sa->nic);
 
 	sfc_rx_qflush(sa, sw_index);
 
+	rxq->flags &= ~SFC_RXQ_FLAG_STARTED;
 	rxq->state = SFC_RXQ_INITIALIZED;
 
 	efx_rx_qdestroy(rxq->common);
diff --git a/drivers/net/sfc/sfc_rx.h b/drivers/net/sfc/sfc_rx.h
index 4e1b4c6..b2ca1fa 100644
--- a/drivers/net/sfc/sfc_rx.h
+++ b/drivers/net/sfc/sfc_rx.h
@@ -61,8 +61,6 @@ enum sfc_rxq_state_bit {
 #define SFC_RXQ_INITIALIZED	(1 << SFC_RXQ_INITIALIZED_BIT)
 	SFC_RXQ_STARTED_BIT,
 #define SFC_RXQ_STARTED		(1 << SFC_RXQ_STARTED_BIT)
-	SFC_RXQ_RUNNING_BIT,
-#define SFC_RXQ_RUNNING		(1 << SFC_RXQ_RUNNING_BIT)
 	SFC_RXQ_FLUSHING_BIT,
 #define SFC_RXQ_FLUSHING	(1 << SFC_RXQ_FLUSHING_BIT)
 	SFC_RXQ_FLUSHED_BIT,
@@ -79,16 +77,15 @@ struct sfc_rxq {
 	/* Used on data path */
 	struct sfc_evq		*evq;
 	struct sfc_rx_sw_desc	*sw_desc;
-	unsigned int		state;
+	unsigned int		flags;
+#define SFC_RXQ_FLAG_STARTED	0x1
+#define SFC_RXQ_FLAG_RUNNING	0x2
+#define SFC_RXQ_FLAG_RSS_HASH	0x4
 	unsigned int		ptr_mask;
 	unsigned int		pending;
 	unsigned int		completed;
 	uint16_t		batch_max;
 	uint16_t		prefix_size;
-#if EFSYS_OPT_RX_SCALE
-	unsigned int		flags;
-#define SFC_RXQ_FLAG_RSS_HASH	0x1
-#endif
 
 	/* Used on refill */
 	unsigned int		added;
@@ -102,6 +99,7 @@ struct sfc_rxq {
 
 	/* Not used on data path */
 	unsigned int		hw_index;
+	unsigned int		state;
 };
 
 static inline unsigned int
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 04/13] net/sfc: factor out libefx-based Rx datapath
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
                     ` (2 preceding siblings ...)
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 03/13] net/sfc: do not use Rx queue control state on datapath Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 05/13] net/sfc: make Rx scatter a datapath-dependent feature Andrew Rybchenko
                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Split the control path and datapath to make the datapath
substitutable and possibly reusable with an alternative control path.

The libefx-based Rx datapath is bound to the libefx control path,
but it should be possible to use other datapaths with alternative
control path(s).

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst  |   7 +
 drivers/net/sfc/Makefile     |   1 +
 drivers/net/sfc/sfc.h        |   3 +
 drivers/net/sfc/sfc_dp.c     |  98 ++++++++++++++
 drivers/net/sfc/sfc_dp.h     | 123 ++++++++++++++++++
 drivers/net/sfc/sfc_dp_rx.h  | 173 +++++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c | 106 +++++++++++----
 drivers/net/sfc/sfc_ev.c     |  70 +++++++---
 drivers/net/sfc/sfc_ev.h     |   4 +-
 drivers/net/sfc/sfc_kvargs.c |  10 ++
 drivers/net/sfc/sfc_kvargs.h |   9 +-
 drivers/net/sfc/sfc_rx.c     | 297 ++++++++++++++++++++++++++++++++++---------
 drivers/net/sfc/sfc_rx.h     |  73 +++++++----
 13 files changed, 840 insertions(+), 134 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_dp.c
 create mode 100644 drivers/net/sfc/sfc_dp.h
 create mode 100644 drivers/net/sfc/sfc_dp_rx.h

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index c02e1be..2d7a241 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -226,6 +226,13 @@ whitelist option like "-w 02:00.0,arg1=value1,...".
 Case-insensitive 1/y/yes/on or 0/n/no/off may be used to specify
 boolean parameters value.
 
+- ``rx_datapath`` [auto|efx] (default **auto**)
+
+  Choose receive datapath implementation.
+  **auto** allows the driver itself to make a choice based on the firmware
+  features available and those required by the datapath implementation.
+  **efx** chooses libefx-based datapath which supports Rx scatter.
+
 - ``perf_profile`` [auto|throughput|low-latency] (default **throughput**)
 
   Choose hardware tunning to be optimized for either throughput or
diff --git a/drivers/net/sfc/Makefile b/drivers/net/sfc/Makefile
index b6119fb..541c96d 100644
--- a/drivers/net/sfc/Makefile
+++ b/drivers/net/sfc/Makefile
@@ -94,6 +94,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tx.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tso.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_filter.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_flow.c
+SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_dp.c
 
 VPATH += $(SRCDIR)/base
 
diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 8cf960c..02c97d1 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -142,6 +142,7 @@ struct sfc_intr {
 struct sfc_evq_info;
 struct sfc_rxq_info;
 struct sfc_txq_info;
+struct sfc_dp_rx;
 
 struct sfc_port {
 	unsigned int			lsc_seq;
@@ -225,6 +226,8 @@ struct sfc_adapter {
 	unsigned int			rss_tbl[EFX_RSS_TBL_SIZE];
 	uint8_t				rss_key[SFC_RSS_KEY_SIZE];
 #endif
+
+	const struct sfc_dp_rx		*dp_rx;
 };
 
 /*
diff --git a/drivers/net/sfc/sfc_dp.c b/drivers/net/sfc/sfc_dp.c
new file mode 100644
index 0000000..b52b2ee
--- /dev/null
+++ b/drivers/net/sfc/sfc_dp.c
@@ -0,0 +1,98 @@
+/*-
+ *   BSD LICENSE
+ *
+ * Copyright (c) 2017 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <sys/queue.h>
+#include <string.h>
+#include <errno.h>
+
+#include <rte_log.h>
+
+#include "sfc_dp.h"
+
+void
+sfc_dp_queue_init(struct sfc_dp_queue *dpq, uint16_t port_id, uint16_t queue_id,
+		  const struct rte_pci_addr *pci_addr)
+{
+	dpq->port_id = port_id;
+	dpq->queue_id = queue_id;
+	dpq->pci_addr = *pci_addr;
+}
+
+struct sfc_dp *
+sfc_dp_find_by_name(struct sfc_dp_list *head, enum sfc_dp_type type,
+		    const char *name)
+{
+	struct sfc_dp *entry;
+
+	TAILQ_FOREACH(entry, head, links) {
+		if (entry->type != type)
+			continue;
+
+		if (strcmp(entry->name, name) == 0)
+			return entry;
+	}
+
+	return NULL;
+}
+
+struct sfc_dp *
+sfc_dp_find_by_caps(struct sfc_dp_list *head, enum sfc_dp_type type,
+		    unsigned int avail_caps)
+{
+	struct sfc_dp *entry;
+
+	TAILQ_FOREACH(entry, head, links) {
+		if (entry->type != type)
+			continue;
+
+		/* Take the first matching */
+		if (sfc_dp_match_hw_fw_caps(entry, avail_caps))
+			return entry;
+	}
+
+	return NULL;
+}
+
+int
+sfc_dp_register(struct sfc_dp_list *head, struct sfc_dp *entry)
+{
+	if (sfc_dp_find_by_name(head, entry->type, entry->name) != NULL) {
+		rte_log(RTE_LOG_ERR, RTE_LOGTYPE_PMD,
+			"sfc %s datapath '%s' already registered\n",
+			entry->type == SFC_DP_RX ? "Rx" : "unknown",
+			entry->name);
+		return EEXIST;
+	}
+
+	TAILQ_INSERT_TAIL(head, entry, links);
+
+	return 0;
+}
diff --git a/drivers/net/sfc/sfc_dp.h b/drivers/net/sfc/sfc_dp.h
new file mode 100644
index 0000000..c44c44d
--- /dev/null
+++ b/drivers/net/sfc/sfc_dp.h
@@ -0,0 +1,123 @@
+/*-
+ *   BSD LICENSE
+ *
+ * Copyright (c) 2017 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _SFC_DP_H
+#define _SFC_DP_H
+
+#include <stdbool.h>
+#include <sys/queue.h>
+
+#include <rte_pci.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define SFC_DIV_ROUND_UP(a, b) \
+	__extension__ ({		\
+		typeof(a) _a = (a);	\
+		typeof(b) _b = (b);	\
+					\
+		(_a + (_b - 1)) / _b;	\
+	})
+
+/**
+ * Datapath exception handler to be provided by the control path.
+ */
+typedef void (sfc_dp_exception_t)(void *ctrl);
+
+enum sfc_dp_type {
+	SFC_DP_RX = 0,	/**< Receive datapath */
+};
+
+
+/** Datapath queue run-time information */
+struct sfc_dp_queue {
+	uint16_t			port_id;
+	uint16_t			queue_id;
+	struct rte_pci_addr		pci_addr;
+};
+
+void sfc_dp_queue_init(struct sfc_dp_queue *dpq,
+		       uint16_t port_id, uint16_t queue_id,
+		       const struct rte_pci_addr *pci_addr);
+
+/*
+ * Helper macro to define datapath logging macros and have uniform
+ * logging.
+ */
+#define SFC_DP_LOG(dp_name, level, dpq, ...) \
+	do {								\
+		const struct sfc_dp_queue *_dpq = (dpq);		\
+		const struct rte_pci_addr *_addr = &(_dpq)->pci_addr;	\
+									\
+		RTE_LOG(level, PMD,					\
+			RTE_FMT("%s " PCI_PRI_FMT			\
+				" #%" PRIu16 ".%" PRIu16 ": "		\
+				RTE_FMT_HEAD(__VA_ARGS__,) "\n",	\
+				dp_name,				\
+				_addr->domain, _addr->bus,		\
+				_addr->devid, _addr->function,		\
+				_dpq->port_id, _dpq->queue_id,		\
+				RTE_FMT_TAIL(__VA_ARGS__,)));		\
+	} while (0)
+
+
+/** Datapath definition */
+struct sfc_dp {
+	TAILQ_ENTRY(sfc_dp)		links;
+	const char			*name;
+	enum sfc_dp_type		type;
+	/* Mask of required hardware/firmware capabilities */
+	unsigned int			hw_fw_caps;
+};
+
+/** List of datapath variants */
+TAILQ_HEAD(sfc_dp_list, sfc_dp);
+
+/* Check if available HW/FW capabilities are sufficient for the datapath */
+static inline bool
+sfc_dp_match_hw_fw_caps(const struct sfc_dp *dp, unsigned int avail_caps)
+{
+	return (dp->hw_fw_caps & avail_caps) == dp->hw_fw_caps;
+}
+
+struct sfc_dp *sfc_dp_find_by_name(struct sfc_dp_list *head,
+				   enum sfc_dp_type type, const char *name);
+struct sfc_dp *sfc_dp_find_by_caps(struct sfc_dp_list *head,
+				   enum sfc_dp_type type,
+				   unsigned int avail_caps);
+int sfc_dp_register(struct sfc_dp_list *head, struct sfc_dp *entry);
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* _SFC_DP_H */
diff --git a/drivers/net/sfc/sfc_dp_rx.h b/drivers/net/sfc/sfc_dp_rx.h
new file mode 100644
index 0000000..7e56d14
--- /dev/null
+++ b/drivers/net/sfc/sfc_dp_rx.h
@@ -0,0 +1,173 @@
+/*-
+ *   BSD LICENSE
+ *
+ * Copyright (c) 2017 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _SFC_DP_RX_H
+#define _SFC_DP_RX_H
+
+#include <rte_mempool.h>
+#include <rte_ethdev.h>
+
+#include "sfc_dp.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Generic receive queue information used on data path.
+ * It must be kept as small as possible since it is built into
+ * the structure used on the datapath.
+ */
+struct sfc_dp_rxq {
+	struct sfc_dp_queue	dpq;
+};
+
+/**
+ * Datapath receive queue creation information.
+ *
+ * The structure is used just to pass information from the control path
+ * to the datapath. It could be plain function arguments, but that would
+ * hardly be readable.
+ */
+struct sfc_dp_rx_qcreate_info {
+	/** Memory pool to allocate Rx buffer from */
+	struct rte_mempool	*refill_mb_pool;
+	/** Minimum number of unused Rx descriptors to do refill */
+	unsigned int		refill_threshold;
+	/**
+	 * Usable mbuf data space in accordance with alignment and
+	 * padding requirements imposed by HW.
+	 */
+	unsigned int		buf_size;
+
+	/**
+	 * Maximum number of Rx descriptors completed in one Rx event.
+	 * Just for sanity checks, if the datapath would like to do them.
+	 */
+	unsigned int		batch_max;
+
+	/** Pseudo-header size */
+	unsigned int		prefix_size;
+
+	/** Receive queue flags initializer */
+	unsigned int		flags;
+#define SFC_RXQ_FLAG_RSS_HASH	0x1
+
+	/** Rx queue size */
+	unsigned int		rxq_entries;
+};
+
+/**
+ * Allocate and initialize datapath receive queue.
+ *
+ * @param port_id	The port identifier
+ * @param queue_id	The queue identifier
+ * @param pci_addr	PCI function address
+ * @param socket_id	Socket identifier to allocate memory
+ * @param info		Receive queue information
+ * @param dp_rxqp	Location for generic datapath receive queue pointer
+ *
+ * @return 0 or positive errno.
+ */
+typedef int (sfc_dp_rx_qcreate_t)(uint16_t port_id, uint16_t queue_id,
+				  const struct rte_pci_addr *pci_addr,
+				  int socket_id,
+				  const struct sfc_dp_rx_qcreate_info *info,
+				  struct sfc_dp_rxq **dp_rxqp);
+
+/**
+ * Free resources allocated for the datapath receive queue.
+ */
+typedef void (sfc_dp_rx_qdestroy_t)(struct sfc_dp_rxq *dp_rxq);
+
+/**
+ * Receive queue start callback.
+ *
+ * It hands over the EvQ to the datapath.
+ */
+typedef int (sfc_dp_rx_qstart_t)(struct sfc_dp_rxq *dp_rxq,
+				 unsigned int evq_read_ptr);
+
+/**
+ * Receive queue stop function called before flush.
+ */
+typedef void (sfc_dp_rx_qstop_t)(struct sfc_dp_rxq *dp_rxq,
+				 unsigned int *evq_read_ptr);
+
+/**
+ * Receive queue purge function called after queue flush.
+ *
+ * Should be used to free unused receive buffers.
+ */
+typedef void (sfc_dp_rx_qpurge_t)(struct sfc_dp_rxq *dp_rxq);
+
+/** Get packet types recognized/classified */
+typedef const uint32_t * (sfc_dp_rx_supported_ptypes_get_t)(void);
+
+/** Get number of pending Rx descriptors */
+typedef unsigned int (sfc_dp_rx_qdesc_npending_t)(struct sfc_dp_rxq *dp_rxq);
+
+/** Receive datapath definition */
+struct sfc_dp_rx {
+	struct sfc_dp				dp;
+
+	sfc_dp_rx_qcreate_t			*qcreate;
+	sfc_dp_rx_qdestroy_t			*qdestroy;
+	sfc_dp_rx_qstart_t			*qstart;
+	sfc_dp_rx_qstop_t			*qstop;
+	sfc_dp_rx_qpurge_t			*qpurge;
+	sfc_dp_rx_supported_ptypes_get_t	*supported_ptypes_get;
+	sfc_dp_rx_qdesc_npending_t		*qdesc_npending;
+	eth_rx_burst_t				pkt_burst;
+};
+
+static inline struct sfc_dp_rx *
+sfc_dp_find_rx_by_name(struct sfc_dp_list *head, const char *name)
+{
+	struct sfc_dp *p = sfc_dp_find_by_name(head, SFC_DP_RX, name);
+
+	return (p == NULL) ? NULL : container_of(p, struct sfc_dp_rx, dp);
+}
+
+static inline struct sfc_dp_rx *
+sfc_dp_find_rx_by_caps(struct sfc_dp_list *head, unsigned int avail_caps)
+{
+	struct sfc_dp *p = sfc_dp_find_by_caps(head, SFC_DP_RX, avail_caps);
+
+	return (p == NULL) ? NULL : container_of(p, struct sfc_dp_rx, dp);
+}
+
+extern struct sfc_dp_rx sfc_efx_rx;
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* _SFC_DP_RX_H */
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index c73f228..6dffc1c 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -32,6 +32,7 @@
 #include <rte_dev.h>
 #include <rte_ethdev.h>
 #include <rte_pci.h>
+#include <rte_errno.h>
 
 #include "efx.h"
 
@@ -43,6 +44,11 @@
 #include "sfc_rx.h"
 #include "sfc_tx.h"
 #include "sfc_flow.h"
+#include "sfc_dp.h"
+#include "sfc_dp_rx.h"
+
+static struct sfc_dp_list sfc_dp_head =
+	TAILQ_HEAD_INITIALIZER(sfc_dp_head);
 
 static int
 sfc_fw_version_get(struct rte_eth_dev *dev, char *fw_version, size_t fw_size)
@@ -164,19 +170,9 @@ sfc_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 static const uint32_t *
 sfc_dev_supported_ptypes_get(struct rte_eth_dev *dev)
 {
-	static const uint32_t ptypes[] = {
-		RTE_PTYPE_L2_ETHER,
-		RTE_PTYPE_L3_IPV4_EXT_UNKNOWN,
-		RTE_PTYPE_L3_IPV6_EXT_UNKNOWN,
-		RTE_PTYPE_L4_TCP,
-		RTE_PTYPE_L4_UDP,
-		RTE_PTYPE_UNKNOWN
-	};
-
-	if (dev->rx_pkt_burst == sfc_recv_pkts)
-		return ptypes;
-
-	return NULL;
+	struct sfc_adapter *sa = dev->data->dev_private;
+
+	return sa->dp_rx->supported_ptypes_get();
 }
 
 static int
@@ -416,7 +412,7 @@ sfc_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
 	if (rc != 0)
 		goto fail_rx_qinit;
 
-	dev->data->rx_queues[rx_queue_id] = sa->rxq_info[rx_queue_id].rxq;
+	dev->data->rx_queues[rx_queue_id] = sa->rxq_info[rx_queue_id].rxq->dp;
 
 	sfc_adapter_unlock(sa);
 
@@ -431,13 +427,15 @@ sfc_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
 static void
 sfc_rx_queue_release(void *queue)
 {
-	struct sfc_rxq *rxq = queue;
+	struct sfc_dp_rxq *dp_rxq = queue;
+	struct sfc_rxq *rxq;
 	struct sfc_adapter *sa;
 	unsigned int sw_index;
 
-	if (rxq == NULL)
+	if (dp_rxq == NULL)
 		return;
 
+	rxq = sfc_rxq_by_dp_rxq(dp_rxq);
 	sa = rxq->evq->sa;
 	sfc_adapter_lock(sa);
 
@@ -973,9 +971,9 @@ sfc_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 static int
 sfc_rx_descriptor_done(void *queue, uint16_t offset)
 {
-	struct sfc_rxq *rxq = queue;
+	struct sfc_dp_rxq *dp_rxq = queue;
 
-	return sfc_rx_qdesc_done(rxq, offset);
+	return sfc_rx_qdesc_done(dp_rxq, offset);
 }
 
 static int
@@ -1358,6 +1356,69 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 };
 
 static int
+sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
+{
+	struct sfc_adapter *sa = dev->data->dev_private;
+	unsigned int avail_caps = 0;
+	const char *rx_name = NULL;
+	int rc;
+
+	if (sa == NULL || sa->state == SFC_ADAPTER_UNINITIALIZED)
+		return -E_RTE_SECONDARY;
+
+	rc = sfc_kvargs_process(sa, SFC_KVARG_RX_DATAPATH,
+				sfc_kvarg_string_handler, &rx_name);
+	if (rc != 0)
+		goto fail_kvarg_rx_datapath;
+
+	if (rx_name != NULL) {
+		sa->dp_rx = sfc_dp_find_rx_by_name(&sfc_dp_head, rx_name);
+		if (sa->dp_rx == NULL) {
+			sfc_err(sa, "Rx datapath %s not found", rx_name);
+			rc = ENOENT;
+			goto fail_dp_rx;
+		}
+		if (!sfc_dp_match_hw_fw_caps(&sa->dp_rx->dp, avail_caps)) {
+			sfc_err(sa,
+				"Insufficient HW/FW capabilities to use Rx datapath %s",
+				rx_name);
+			rc = EINVAL;
+			goto fail_dp_rx;
+		}
+	} else {
+		sa->dp_rx = sfc_dp_find_rx_by_caps(&sfc_dp_head, avail_caps);
+		if (sa->dp_rx == NULL) {
+			sfc_err(sa, "Rx datapath by caps %#x not found",
+				avail_caps);
+			rc = ENOENT;
+			goto fail_dp_rx;
+		}
+	}
+
+	sfc_info(sa, "use %s Rx datapath", sa->dp_rx->dp.name);
+
+	dev->rx_pkt_burst = sa->dp_rx->pkt_burst;
+
+	dev->tx_pkt_burst = sfc_xmit_pkts;
+
+	dev->dev_ops = &sfc_eth_dev_ops;
+
+	return 0;
+
+fail_dp_rx:
+fail_kvarg_rx_datapath:
+	return rc;
+}
+
+static void
+sfc_register_dp(void)
+{
+	/* Register once */
+	if (TAILQ_EMPTY(&sfc_dp_head))
+		sfc_dp_register(&sfc_dp_head, &sfc_efx_rx.dp);
+}
+
+static int
 sfc_eth_dev_init(struct rte_eth_dev *dev)
 {
 	struct sfc_adapter *sa = dev->data->dev_private;
@@ -1366,6 +1427,8 @@ sfc_eth_dev_init(struct rte_eth_dev *dev)
 	const efx_nic_cfg_t *encp;
 	const struct ether_addr *from;
 
+	sfc_register_dp();
+
 	/* Required for logging */
 	sa->eth_dev = dev;
 
@@ -1406,12 +1469,10 @@ sfc_eth_dev_init(struct rte_eth_dev *dev)
 	from = (const struct ether_addr *)(encp->enc_mac_addr);
 	ether_addr_copy(from, &dev->data->mac_addrs[0]);
 
-	dev->dev_ops = &sfc_eth_dev_ops;
-	dev->rx_pkt_burst = &sfc_recv_pkts;
-	dev->tx_pkt_burst = &sfc_xmit_pkts;
-
 	sfc_adapter_unlock(sa);
 
+	sfc_eth_dev_set_ops(dev);
+
 	sfc_log_init(sa, "done");
 	return 0;
 
@@ -1489,6 +1550,7 @@ RTE_PMD_REGISTER_PCI(net_sfc_efx, sfc_efx_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_sfc_efx, pci_id_sfc_efx_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_sfc_efx, "* igb_uio | uio_pci_generic | vfio");
 RTE_PMD_REGISTER_PARAM_STRING(net_sfc_efx,
+	SFC_KVARG_RX_DATAPATH "=" SFC_KVARG_VALUES_RX_DATAPATH " "
 	SFC_KVARG_PERF_PROFILE "=" SFC_KVARG_VALUES_PERF_PROFILE " "
 	SFC_KVARG_STATS_UPDATE_PERIOD_MS "=<long> "
 	SFC_KVARG_MCDI_LOGGING "=" SFC_KVARG_VALUES_BOOL " "
diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index cc2a30f..c6b02f2 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -83,25 +83,25 @@ sfc_ev_nop_rx(void *arg, uint32_t label, uint32_t id,
 }
 
 static boolean_t
-sfc_ev_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
-	  uint32_t size, uint16_t flags)
+sfc_ev_efx_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
+	      uint32_t size, uint16_t flags)
 {
 	struct sfc_evq *evq = arg;
-	struct sfc_rxq *rxq;
+	struct sfc_efx_rxq *rxq;
 	unsigned int stop;
 	unsigned int pending_id;
 	unsigned int delta;
 	unsigned int i;
-	struct sfc_rx_sw_desc *rxd;
+	struct sfc_efx_rx_sw_desc *rxd;
 
 	if (unlikely(evq->exception))
 		goto done;
 
-	rxq = evq->rxq;
+	rxq = sfc_efx_rxq_by_dp_rxq(evq->dp_rxq);
 
 	SFC_ASSERT(rxq != NULL);
 	SFC_ASSERT(rxq->evq == evq);
-	SFC_ASSERT(rxq->flags & SFC_RXQ_FLAG_STARTED);
+	SFC_ASSERT(rxq->flags & SFC_EFX_RXQ_FLAG_STARTED);
 
 	stop = (id + 1) & rxq->ptr_mask;
 	pending_id = rxq->pending & rxq->ptr_mask;
@@ -119,7 +119,7 @@ sfc_ev_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
 			sfc_err(evq->sa,
 				"EVQ %u RxQ %u invalid RX abort "
 				"(id=%#x size=%u flags=%#x); needs restart",
-				evq->evq_index, sfc_rxq_sw_index(rxq),
+				evq->evq_index, rxq->dp.dpq.queue_id,
 				id, size, flags);
 			goto done;
 		}
@@ -134,8 +134,8 @@ sfc_ev_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
 		sfc_err(evq->sa,
 			"EVQ %u RxQ %u completion out of order "
 			"(id=%#x delta=%u flags=%#x); needs restart",
-			evq->evq_index, sfc_rxq_sw_index(rxq), id, delta,
-			flags);
+			evq->evq_index, rxq->dp.dpq.queue_id,
+			id, delta, flags);
 
 		goto done;
 	}
@@ -233,9 +233,13 @@ static boolean_t
 sfc_ev_rxq_flush_done(void *arg, __rte_unused uint32_t rxq_hw_index)
 {
 	struct sfc_evq *evq = arg;
+	struct sfc_dp_rxq *dp_rxq;
 	struct sfc_rxq *rxq;
 
-	rxq = evq->rxq;
+	dp_rxq = evq->dp_rxq;
+	SFC_ASSERT(dp_rxq != NULL);
+
+	rxq = sfc_rxq_by_dp_rxq(dp_rxq);
 	SFC_ASSERT(rxq != NULL);
 	SFC_ASSERT(rxq->hw_index == rxq_hw_index);
 	SFC_ASSERT(rxq->evq == evq);
@@ -258,9 +262,13 @@ static boolean_t
 sfc_ev_rxq_flush_failed(void *arg, __rte_unused uint32_t rxq_hw_index)
 {
 	struct sfc_evq *evq = arg;
+	struct sfc_dp_rxq *dp_rxq;
 	struct sfc_rxq *rxq;
 
-	rxq = evq->rxq;
+	dp_rxq = evq->dp_rxq;
+	SFC_ASSERT(dp_rxq != NULL);
+
+	rxq = sfc_rxq_by_dp_rxq(dp_rxq);
 	SFC_ASSERT(rxq != NULL);
 	SFC_ASSERT(rxq->hw_index == rxq_hw_index);
 	SFC_ASSERT(rxq->evq == evq);
@@ -389,9 +397,24 @@ static const efx_ev_callbacks_t sfc_ev_callbacks = {
 	.eec_link_change	= sfc_ev_link_change,
 };
 
-static const efx_ev_callbacks_t sfc_ev_callbacks_rx = {
+static const efx_ev_callbacks_t sfc_ev_callbacks_efx_rx = {
 	.eec_initialized	= sfc_ev_initialized,
-	.eec_rx			= sfc_ev_rx,
+	.eec_rx			= sfc_ev_efx_rx,
+	.eec_tx			= sfc_ev_nop_tx,
+	.eec_exception		= sfc_ev_exception,
+	.eec_rxq_flush_done	= sfc_ev_rxq_flush_done,
+	.eec_rxq_flush_failed	= sfc_ev_rxq_flush_failed,
+	.eec_txq_flush_done	= sfc_ev_nop_txq_flush_done,
+	.eec_software		= sfc_ev_software,
+	.eec_sram		= sfc_ev_sram,
+	.eec_wake_up		= sfc_ev_wake_up,
+	.eec_timer		= sfc_ev_timer,
+	.eec_link_change	= sfc_ev_nop_link_change,
+};
+
+static const efx_ev_callbacks_t sfc_ev_callbacks_dp_rx = {
+	.eec_initialized	= sfc_ev_initialized,
+	.eec_rx			= sfc_ev_nop_rx,
 	.eec_tx			= sfc_ev_nop_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_rxq_flush_done,
@@ -434,9 +457,10 @@ sfc_ev_qpoll(struct sfc_evq *evq)
 		struct sfc_adapter *sa = evq->sa;
 		int rc;
 
-		if ((evq->rxq != NULL) &&
-		    (evq->rxq->flags & SFC_RXQ_FLAG_RUNNING)) {
-			unsigned int rxq_sw_index = sfc_rxq_sw_index(evq->rxq);
+		if (evq->dp_rxq != NULL) {
+			unsigned int rxq_sw_index;
+
+			rxq_sw_index = evq->dp_rxq->dpq.queue_id;
 
 			sfc_warn(sa,
 				 "restart RxQ %u because of exception on its EvQ %u",
@@ -520,13 +544,17 @@ sfc_ev_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 	if (rc != 0)
 		goto fail_ev_qcreate;
 
-	SFC_ASSERT(evq->rxq == NULL || evq->txq == NULL);
-	if (evq->rxq != 0)
-		evq->callbacks = &sfc_ev_callbacks_rx;
-	else if (evq->txq != 0)
+	SFC_ASSERT(evq->dp_rxq == NULL || evq->txq == NULL);
+	if (evq->dp_rxq != 0) {
+		if (strcmp(sa->dp_rx->dp.name, SFC_KVARG_DATAPATH_EFX) == 0)
+			evq->callbacks = &sfc_ev_callbacks_efx_rx;
+		else
+			evq->callbacks = &sfc_ev_callbacks_dp_rx;
+	} else if (evq->txq != 0) {
 		evq->callbacks = &sfc_ev_callbacks_tx;
-	else
+	} else {
 		evq->callbacks = &sfc_ev_callbacks;
+	}
 
 	evq->init_state = SFC_EVQ_STARTING;
 
diff --git a/drivers/net/sfc/sfc_ev.h b/drivers/net/sfc/sfc_ev.h
index 359958e..760df98 100644
--- a/drivers/net/sfc/sfc_ev.h
+++ b/drivers/net/sfc/sfc_ev.h
@@ -42,7 +42,7 @@ extern "C" {
 #define SFC_MGMT_EVQ_ENTRIES	(EFX_EVQ_MINNEVS)
 
 struct sfc_adapter;
-struct sfc_rxq;
+struct sfc_dp_rxq;
 struct sfc_txq;
 
 enum sfc_evq_state {
@@ -61,7 +61,7 @@ struct sfc_evq {
 	unsigned int			read_ptr;
 	boolean_t			exception;
 	efsys_mem_t			mem;
-	struct sfc_rxq			*rxq;
+	struct sfc_dp_rxq		*dp_rxq;
 	struct sfc_txq			*txq;
 
 	/* Not used on datapath */
diff --git a/drivers/net/sfc/sfc_kvargs.c b/drivers/net/sfc/sfc_kvargs.c
index e0a3ff9..01dff4c 100644
--- a/drivers/net/sfc/sfc_kvargs.c
+++ b/drivers/net/sfc/sfc_kvargs.c
@@ -48,6 +48,7 @@ sfc_kvargs_parse(struct sfc_adapter *sa)
 		SFC_KVARG_DEBUG_INIT,
 		SFC_KVARG_MCDI_LOGGING,
 		SFC_KVARG_PERF_PROFILE,
+		SFC_KVARG_RX_DATAPATH,
 		NULL,
 	};
 
@@ -132,3 +133,12 @@ sfc_kvarg_long_handler(__rte_unused const char *key,
 
 	return 0;
 }
+
+int
+sfc_kvarg_string_handler(__rte_unused const char *key,
+			 const char *value_str, void *opaque)
+{
+	*(const char **)opaque = value_str;
+
+	return 0;
+}
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index 8f53bd7..e86a7e9 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -56,6 +56,12 @@ extern "C" {
 
 #define SFC_KVARG_STATS_UPDATE_PERIOD_MS	"stats_update_period_ms"
 
+#define SFC_KVARG_DATAPATH_EFX		"efx"
+
+#define SFC_KVARG_RX_DATAPATH		"rx_datapath"
+#define SFC_KVARG_VALUES_RX_DATAPATH \
+	"[" SFC_KVARG_DATAPATH_EFX "]"
+
 struct sfc_adapter;
 
 int sfc_kvargs_parse(struct sfc_adapter *sa);
@@ -66,9 +72,10 @@ int sfc_kvargs_process(struct sfc_adapter *sa, const char *key_match,
 
 int sfc_kvarg_bool_handler(const char *key, const char *value_str,
 			   void *opaque);
-
 int sfc_kvarg_long_handler(const char *key, const char *value_str,
 			   void *opaque);
+int sfc_kvarg_string_handler(const char *key, const char *value_str,
+			     void *opaque);
 
 #ifdef __cplusplus
 }
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 6af1574..7845bd8 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -38,6 +38,7 @@
 #include "sfc_log.h"
 #include "sfc_ev.h"
 #include "sfc_rx.h"
+#include "sfc_kvargs.h"
 #include "sfc_tweak.h"
 
 /*
@@ -74,7 +75,7 @@ sfc_rx_qflush_failed(struct sfc_rxq *rxq)
 }
 
 static void
-sfc_rx_qrefill(struct sfc_rxq *rxq)
+sfc_efx_rx_qrefill(struct sfc_efx_rxq *rxq)
 {
 	unsigned int free_space;
 	unsigned int bulks;
@@ -83,9 +84,9 @@ sfc_rx_qrefill(struct sfc_rxq *rxq)
 	unsigned int added = rxq->added;
 	unsigned int id;
 	unsigned int i;
-	struct sfc_rx_sw_desc *rxd;
+	struct sfc_efx_rx_sw_desc *rxd;
 	struct rte_mbuf *m;
-	uint8_t port_id = rxq->port_id;
+	uint16_t port_id = rxq->dp.dpq.port_id;
 
 	free_space = EFX_RXQ_LIMIT(rxq->ptr_mask + 1) -
 		(added - rxq->completed);
@@ -137,7 +138,7 @@ sfc_rx_qrefill(struct sfc_rxq *rxq)
 }
 
 static uint64_t
-sfc_rx_desc_flags_to_offload_flags(const unsigned int desc_flags)
+sfc_efx_rx_desc_flags_to_offload_flags(const unsigned int desc_flags)
 {
 	uint64_t mbuf_flags = 0;
 
@@ -176,7 +177,7 @@ sfc_rx_desc_flags_to_offload_flags(const unsigned int desc_flags)
 }
 
 static uint32_t
-sfc_rx_desc_flags_to_packet_type(const unsigned int desc_flags)
+sfc_efx_rx_desc_flags_to_packet_type(const unsigned int desc_flags)
 {
 	return RTE_PTYPE_L2_ETHER |
 		((desc_flags & EFX_PKT_IPV4) ?
@@ -187,14 +188,30 @@ sfc_rx_desc_flags_to_packet_type(const unsigned int desc_flags)
 		((desc_flags & EFX_PKT_UDP) ? RTE_PTYPE_L4_UDP : 0);
 }
 
+static const uint32_t *
+sfc_efx_supported_ptypes_get(void)
+{
+	static const uint32_t ptypes[] = {
+		RTE_PTYPE_L2_ETHER,
+		RTE_PTYPE_L3_IPV4_EXT_UNKNOWN,
+		RTE_PTYPE_L3_IPV6_EXT_UNKNOWN,
+		RTE_PTYPE_L4_TCP,
+		RTE_PTYPE_L4_UDP,
+		RTE_PTYPE_UNKNOWN
+	};
+
+	return ptypes;
+}
+
 static void
-sfc_rx_set_rss_hash(struct sfc_rxq *rxq, unsigned int flags, struct rte_mbuf *m)
+sfc_efx_rx_set_rss_hash(struct sfc_efx_rxq *rxq, unsigned int flags,
+			struct rte_mbuf *m)
 {
 #if EFSYS_OPT_RX_SCALE
 	uint8_t *mbuf_data;
 
 
-	if ((rxq->flags & SFC_RXQ_FLAG_RSS_HASH) == 0)
+	if ((rxq->flags & SFC_EFX_RXQ_FLAG_RSS_HASH) == 0)
 		return;
 
 	mbuf_data = rte_pktmbuf_mtod(m, uint8_t *);
@@ -209,17 +226,18 @@ sfc_rx_set_rss_hash(struct sfc_rxq *rxq, unsigned int flags, struct rte_mbuf *m)
 #endif
 }
 
-uint16_t
-sfc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+static uint16_t
+sfc_efx_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 {
-	struct sfc_rxq *rxq = rx_queue;
+	struct sfc_dp_rxq *dp_rxq = rx_queue;
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
 	unsigned int completed;
 	unsigned int prefix_size = rxq->prefix_size;
 	unsigned int done_pkts = 0;
 	boolean_t discard_next = B_FALSE;
 	struct rte_mbuf *scatter_pkt = NULL;
 
-	if (unlikely((rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0))
+	if (unlikely((rxq->flags & SFC_EFX_RXQ_FLAG_RUNNING) == 0))
 		return 0;
 
 	sfc_ev_qpoll(rxq->evq);
@@ -227,7 +245,7 @@ sfc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 	completed = rxq->completed;
 	while (completed != rxq->pending && done_pkts < nb_pkts) {
 		unsigned int id;
-		struct sfc_rx_sw_desc *rxd;
+		struct sfc_efx_rx_sw_desc *rxd;
 		struct rte_mbuf *m;
 		unsigned int seg_len;
 		unsigned int desc_flags;
@@ -281,14 +299,16 @@ sfc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 		/* The first fragment of the packet has prefix */
 		prefix_size = rxq->prefix_size;
 
-		m->ol_flags = sfc_rx_desc_flags_to_offload_flags(desc_flags);
-		m->packet_type = sfc_rx_desc_flags_to_packet_type(desc_flags);
+		m->ol_flags =
+			sfc_efx_rx_desc_flags_to_offload_flags(desc_flags);
+		m->packet_type =
+			sfc_efx_rx_desc_flags_to_packet_type(desc_flags);
 
 		/*
 		 * Extract RSS hash from the packet prefix and
 		 * set the corresponding field (if needed and possible)
 		 */
-		sfc_rx_set_rss_hash(rxq, desc_flags, m);
+		sfc_efx_rx_set_rss_hash(rxq, desc_flags, m);
 
 		m->data_off += prefix_size;
 
@@ -307,20 +327,18 @@ sfc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 
 	rxq->completed = completed;
 
-	sfc_rx_qrefill(rxq);
+	sfc_efx_rx_qrefill(rxq);
 
 	return done_pkts;
 }
 
-unsigned int
-sfc_rx_qdesc_npending(struct sfc_adapter *sa, unsigned int sw_index)
+static sfc_dp_rx_qdesc_npending_t sfc_efx_rx_qdesc_npending;
+static unsigned int
+sfc_efx_rx_qdesc_npending(struct sfc_dp_rxq *dp_rxq)
 {
-	struct sfc_rxq *rxq;
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
 
-	SFC_ASSERT(sw_index < sa->rxq_count);
-	rxq = sa->rxq_info[sw_index].rxq;
-
-	if (rxq == NULL || (rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0)
+	if ((rxq->flags & SFC_EFX_RXQ_FLAG_RUNNING) == 0)
 		return 0;
 
 	sfc_ev_qpoll(rxq->evq);
@@ -328,28 +346,177 @@ sfc_rx_qdesc_npending(struct sfc_adapter *sa, unsigned int sw_index)
 	return rxq->pending - rxq->completed;
 }
 
-int
-sfc_rx_qdesc_done(struct sfc_rxq *rxq, unsigned int offset)
+struct sfc_rxq *
+sfc_rxq_by_dp_rxq(const struct sfc_dp_rxq *dp_rxq)
 {
-	if ((rxq->flags & SFC_RXQ_FLAG_RUNNING) == 0)
-		return 0;
+	const struct sfc_dp_queue *dpq = &dp_rxq->dpq;
+	struct rte_eth_dev *eth_dev;
+	struct sfc_adapter *sa;
+	struct sfc_rxq *rxq;
 
-	sfc_ev_qpoll(rxq->evq);
+	SFC_ASSERT(rte_eth_dev_is_valid_port(dpq->port_id));
+	eth_dev = &rte_eth_devices[dpq->port_id];
 
-	return offset < (rxq->pending - rxq->completed);
+	sa = eth_dev->data->dev_private;
+
+	SFC_ASSERT(dpq->queue_id < sa->rxq_count);
+	rxq = sa->rxq_info[dpq->queue_id].rxq;
+
+	SFC_ASSERT(rxq != NULL);
+	return rxq;
+}
+
+static sfc_dp_rx_qcreate_t sfc_efx_rx_qcreate;
+static int
+sfc_efx_rx_qcreate(uint16_t port_id, uint16_t queue_id,
+		   const struct rte_pci_addr *pci_addr, int socket_id,
+		   const struct sfc_dp_rx_qcreate_info *info,
+		   struct sfc_dp_rxq **dp_rxqp)
+{
+	struct sfc_efx_rxq *rxq;
+	int rc;
+
+	rc = ENOMEM;
+	rxq = rte_zmalloc_socket("sfc-efx-rxq", sizeof(*rxq),
+				 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq == NULL)
+		goto fail_rxq_alloc;
+
+	sfc_dp_queue_init(&rxq->dp.dpq, port_id, queue_id, pci_addr);
+
+	rc = ENOMEM;
+	rxq->sw_desc = rte_calloc_socket("sfc-efx-rxq-sw_desc",
+					 info->rxq_entries,
+					 sizeof(*rxq->sw_desc),
+					 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq->sw_desc == NULL)
+		goto fail_desc_alloc;
+
+	/* efx datapath is bound to efx control path */
+	rxq->evq = sfc_rxq_by_dp_rxq(&rxq->dp)->evq;
+	if (info->flags & SFC_RXQ_FLAG_RSS_HASH)
+		rxq->flags |= SFC_EFX_RXQ_FLAG_RSS_HASH;
+	rxq->ptr_mask = info->rxq_entries - 1;
+	rxq->batch_max = info->batch_max;
+	rxq->prefix_size = info->prefix_size;
+	rxq->refill_threshold = info->refill_threshold;
+	rxq->buf_size = info->buf_size;
+	rxq->refill_mb_pool = info->refill_mb_pool;
+
+	*dp_rxqp = &rxq->dp;
+	return 0;
+
+fail_desc_alloc:
+	rte_free(rxq);
+
+fail_rxq_alloc:
+	return rc;
+}
+
+static sfc_dp_rx_qdestroy_t sfc_efx_rx_qdestroy;
+static void
+sfc_efx_rx_qdestroy(struct sfc_dp_rxq *dp_rxq)
+{
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
+
+	rte_free(rxq->sw_desc);
+	rte_free(rxq);
+}
+
+static sfc_dp_rx_qstart_t sfc_efx_rx_qstart;
+static int
+sfc_efx_rx_qstart(struct sfc_dp_rxq *dp_rxq,
+		  __rte_unused unsigned int evq_read_ptr)
+{
+	/* libefx-based datapath is specific to libefx-based PMD */
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
+	struct sfc_rxq *crxq = sfc_rxq_by_dp_rxq(dp_rxq);
+
+	rxq->common = crxq->common;
+
+	rxq->pending = rxq->completed = rxq->added = rxq->pushed = 0;
+
+	sfc_efx_rx_qrefill(rxq);
+
+	rxq->flags |= (SFC_EFX_RXQ_FLAG_STARTED | SFC_EFX_RXQ_FLAG_RUNNING);
+
+	return 0;
 }
 
+static sfc_dp_rx_qstop_t sfc_efx_rx_qstop;
 static void
-sfc_rx_qpurge(struct sfc_rxq *rxq)
+sfc_efx_rx_qstop(struct sfc_dp_rxq *dp_rxq,
+		 __rte_unused unsigned int *evq_read_ptr)
 {
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
+
+	rxq->flags &= ~SFC_EFX_RXQ_FLAG_RUNNING;
+
+	/* libefx-based datapath is bound to libefx-based PMD and uses
+	 * event queue structure directly. So, there is no necessity to
+	 * return EvQ read pointer.
+	 */
+}
+
+static sfc_dp_rx_qpurge_t sfc_efx_rx_qpurge;
+static void
+sfc_efx_rx_qpurge(struct sfc_dp_rxq *dp_rxq)
+{
+	struct sfc_efx_rxq *rxq = sfc_efx_rxq_by_dp_rxq(dp_rxq);
 	unsigned int i;
-	struct sfc_rx_sw_desc *rxd;
+	struct sfc_efx_rx_sw_desc *rxd;
 
 	for (i = rxq->completed; i != rxq->added; ++i) {
 		rxd = &rxq->sw_desc[i & rxq->ptr_mask];
 		rte_mempool_put(rxq->refill_mb_pool, rxd->mbuf);
 		rxd->mbuf = NULL;
+		/* Packed stream relies on 0 in inactive SW desc.
+		 * Rx queue stop is not performance critical, so
+		 * there is no harm to do it always.
+		 */
+		rxd->flags = 0;
+		rxd->size = 0;
 	}
+
+	rxq->flags &= ~SFC_EFX_RXQ_FLAG_STARTED;
+}
+
+struct sfc_dp_rx sfc_efx_rx = {
+	.dp = {
+		.name		= SFC_KVARG_DATAPATH_EFX,
+		.type		= SFC_DP_RX,
+		.hw_fw_caps	= 0,
+	},
+	.qcreate		= sfc_efx_rx_qcreate,
+	.qdestroy		= sfc_efx_rx_qdestroy,
+	.qstart			= sfc_efx_rx_qstart,
+	.qstop			= sfc_efx_rx_qstop,
+	.qpurge			= sfc_efx_rx_qpurge,
+	.supported_ptypes_get	= sfc_efx_supported_ptypes_get,
+	.qdesc_npending		= sfc_efx_rx_qdesc_npending,
+	.pkt_burst		= sfc_efx_recv_pkts,
+};
+
+unsigned int
+sfc_rx_qdesc_npending(struct sfc_adapter *sa, unsigned int sw_index)
+{
+	struct sfc_rxq *rxq;
+
+	SFC_ASSERT(sw_index < sa->rxq_count);
+	rxq = sa->rxq_info[sw_index].rxq;
+
+	if (rxq == NULL || (rxq->state & SFC_RXQ_STARTED) == 0)
+		return 0;
+
+	return sa->dp_rx->qdesc_npending(rxq->dp);
+}
+
+int
+sfc_rx_qdesc_done(struct sfc_dp_rxq *dp_rxq, unsigned int offset)
+{
+	struct sfc_rxq *rxq = sfc_rxq_by_dp_rxq(dp_rxq);
+
+	return offset < rxq->evq->sa->dp_rx->qdesc_npending(dp_rxq);
 }
 
 static void
@@ -400,7 +567,7 @@ sfc_rx_qflush(struct sfc_adapter *sa, unsigned int sw_index)
 			sfc_info(sa, "RxQ %u flushed", sw_index);
 	}
 
-	sfc_rx_qpurge(rxq);
+	sa->dp_rx->qpurge(rxq->dp);
 }
 
 static int
@@ -484,12 +651,11 @@ sfc_rx_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 
 	efx_rx_qenable(rxq->common);
 
-	rxq->pending = rxq->completed = rxq->added = rxq->pushed = 0;
+	rc = sa->dp_rx->qstart(rxq->dp, evq->read_ptr);
+	if (rc != 0)
+		goto fail_dp_qstart;
 
 	rxq->state |= SFC_RXQ_STARTED;
-	rxq->flags |= SFC_RXQ_FLAG_STARTED | SFC_RXQ_FLAG_RUNNING;
-
-	sfc_rx_qrefill(rxq);
 
 	if (sw_index == 0) {
 		rc = sfc_rx_default_rxq_set_filter(sa, rxq);
@@ -504,6 +670,9 @@ sfc_rx_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 	return 0;
 
 fail_mac_filter_default_rxq_set:
+	sa->dp_rx->qstop(rxq->dp, &rxq->evq->read_ptr);
+
+fail_dp_qstart:
 	sfc_rx_qflush(sa, sw_index);
 
 fail_rx_qcreate:
@@ -534,14 +703,13 @@ sfc_rx_qstop(struct sfc_adapter *sa, unsigned int sw_index)
 	sa->eth_dev->data->rx_queue_state[sw_index] =
 		RTE_ETH_QUEUE_STATE_STOPPED;
 
-	rxq->flags &= ~SFC_RXQ_FLAG_RUNNING;
+	sa->dp_rx->qstop(rxq->dp, &rxq->evq->read_ptr);
 
 	if (sw_index == 0)
 		efx_mac_filter_default_rxq_clear(sa->nic);
 
 	sfc_rx_qflush(sa, sw_index);
 
-	rxq->flags &= ~SFC_RXQ_FLAG_STARTED;
 	rxq->state = SFC_RXQ_INITIALIZED;
 
 	efx_rx_qdestroy(rxq->common);
@@ -692,6 +860,7 @@ sfc_rx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 	unsigned int evq_index;
 	struct sfc_evq *evq;
 	struct sfc_rxq *rxq;
+	struct sfc_dp_rx_qcreate_info info;
 
 	rc = sfc_rx_qcheck_conf(sa, nb_rx_desc, rx_conf);
 	if (rc != 0)
@@ -740,47 +909,51 @@ sfc_rx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 	if (rxq == NULL)
 		goto fail_rxq_alloc;
 
-	rc = sfc_dma_alloc(sa, "rxq", sw_index, EFX_RXQ_SIZE(rxq_info->entries),
-			   socket_id, &rxq->mem);
-	if (rc != 0)
-		goto fail_dma_alloc;
-
-	rc = ENOMEM;
-	rxq->sw_desc = rte_calloc_socket("sfc-rxq-sw_desc", rxq_info->entries,
-					 sizeof(*rxq->sw_desc),
-					 RTE_CACHE_LINE_SIZE, socket_id);
-	if (rxq->sw_desc == NULL)
-		goto fail_desc_alloc;
+	rxq_info->rxq = rxq;
 
-	evq->rxq = rxq;
 	rxq->evq = evq;
-	rxq->ptr_mask = rxq_info->entries - 1;
+	rxq->hw_index = sw_index;
 	rxq->refill_threshold = rx_conf->rx_free_thresh;
 	rxq->refill_mb_pool = mb_pool;
-	rxq->buf_size = buf_size;
-	rxq->hw_index = sw_index;
-	rxq->port_id = sa->eth_dev->data->port_id;
 
-	/* Cache limits required on datapath in RxQ structure */
-	rxq->batch_max = encp->enc_rx_batch_max;
-	rxq->prefix_size = encp->enc_rx_prefix_size;
+	rc = sfc_dma_alloc(sa, "rxq", sw_index, EFX_RXQ_SIZE(rxq_info->entries),
+			   socket_id, &rxq->mem);
+	if (rc != 0)
+		goto fail_dma_alloc;
+
+	memset(&info, 0, sizeof(info));
+	info.refill_mb_pool = rxq->refill_mb_pool;
+	info.refill_threshold = rxq->refill_threshold;
+	info.buf_size = buf_size;
+	info.batch_max = encp->enc_rx_batch_max;
+	info.prefix_size = encp->enc_rx_prefix_size;
 
 #if EFSYS_OPT_RX_SCALE
 	if (sa->hash_support == EFX_RX_HASH_AVAILABLE)
-		rxq->flags |= SFC_RXQ_FLAG_RSS_HASH;
+		info.flags |= SFC_RXQ_FLAG_RSS_HASH;
 #endif
 
+	info.rxq_entries = rxq_info->entries;
+
+	rc = sa->dp_rx->qcreate(sa->eth_dev->data->port_id, sw_index,
+				&SFC_DEV_TO_PCI(sa->eth_dev)->addr,
+				socket_id, &info, &rxq->dp);
+	if (rc != 0)
+		goto fail_dp_rx_qcreate;
+
+	evq->dp_rxq = rxq->dp;
+
 	rxq->state = SFC_RXQ_INITIALIZED;
 
-	rxq_info->rxq = rxq;
 	rxq_info->deferred_start = (rx_conf->rx_deferred_start != 0);
 
 	return 0;
 
-fail_desc_alloc:
+fail_dp_rx_qcreate:
 	sfc_dma_free(sa, &rxq->mem);
 
 fail_dma_alloc:
+	rxq_info->rxq = NULL;
 	rte_free(rxq);
 
 fail_rxq_alloc:
@@ -807,10 +980,12 @@ sfc_rx_qfini(struct sfc_adapter *sa, unsigned int sw_index)
 	rxq = rxq_info->rxq;
 	SFC_ASSERT(rxq->state == SFC_RXQ_INITIALIZED);
 
+	sa->dp_rx->qdestroy(rxq->dp);
+	rxq->dp = NULL;
+
 	rxq_info->rxq = NULL;
 	rxq_info->entries = 0;
 
-	rte_free(rxq->sw_desc);
 	sfc_dma_free(sa, &rxq->mem);
 	rte_free(rxq);
 }
diff --git a/drivers/net/sfc/sfc_rx.h b/drivers/net/sfc/sfc_rx.h
index b2ca1fa..521e50e 100644
--- a/drivers/net/sfc/sfc_rx.h
+++ b/drivers/net/sfc/sfc_rx.h
@@ -38,6 +38,8 @@
 
 #include "efx.h"
 
+#include "sfc_dp_rx.h"
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -49,7 +51,7 @@ struct sfc_evq;
  * Software Rx descriptor information associated with hardware Rx
  * descriptor.
  */
-struct sfc_rx_sw_desc {
+struct sfc_efx_rx_sw_desc {
 	struct rte_mbuf		*mbuf;
 	unsigned int		flags;
 	unsigned int		size;
@@ -70,35 +72,17 @@ enum sfc_rxq_state_bit {
 };
 
 /**
- * Receive queue information used on data path.
+ * Receive queue control information.
  * Allocated on the socket specified on the queue setup.
  */
 struct sfc_rxq {
-	/* Used on data path */
 	struct sfc_evq		*evq;
-	struct sfc_rx_sw_desc	*sw_desc;
-	unsigned int		flags;
-#define SFC_RXQ_FLAG_STARTED	0x1
-#define SFC_RXQ_FLAG_RUNNING	0x2
-#define SFC_RXQ_FLAG_RSS_HASH	0x4
-	unsigned int		ptr_mask;
-	unsigned int		pending;
-	unsigned int		completed;
-	uint16_t		batch_max;
-	uint16_t		prefix_size;
-
-	/* Used on refill */
-	unsigned int		added;
-	unsigned int		pushed;
-	unsigned int		refill_threshold;
-	uint8_t			port_id;
-	uint16_t		buf_size;
-	struct rte_mempool	*refill_mb_pool;
 	efx_rxq_t		*common;
 	efsys_mem_t		mem;
-
-	/* Not used on data path */
 	unsigned int		hw_index;
+	unsigned int		refill_threshold;
+	struct rte_mempool	*refill_mb_pool;
+	struct sfc_dp_rxq	*dp;
 	unsigned int		state;
 };
 
@@ -114,6 +98,44 @@ sfc_rxq_sw_index(const struct sfc_rxq *rxq)
 	return sfc_rxq_sw_index_by_hw_index(rxq->hw_index);
 }
 
+struct sfc_rxq *sfc_rxq_by_dp_rxq(const struct sfc_dp_rxq *dp_rxq);
+
+/**
+ * Receive queue information used on libefx-based data path.
+ * Allocated on the socket specified on the queue setup.
+ */
+struct sfc_efx_rxq {
+	/* Used on data path */
+	struct sfc_evq			*evq;
+	unsigned int			flags;
+#define SFC_EFX_RXQ_FLAG_STARTED	0x1
+#define SFC_EFX_RXQ_FLAG_RUNNING	0x2
+#define SFC_EFX_RXQ_FLAG_RSS_HASH	0x4
+	unsigned int			ptr_mask;
+	unsigned int			pending;
+	unsigned int			completed;
+	uint16_t			batch_max;
+	uint16_t			prefix_size;
+	struct sfc_efx_rx_sw_desc	*sw_desc;
+
+	/* Used on refill */
+	unsigned int			added;
+	unsigned int			pushed;
+	unsigned int			refill_threshold;
+	uint16_t			buf_size;
+	struct rte_mempool		*refill_mb_pool;
+	efx_rxq_t			*common;
+
+	/* Datapath receive queue anchor */
+	struct sfc_dp_rxq		dp;
+};
+
+static inline struct sfc_efx_rxq *
+sfc_efx_rxq_by_dp_rxq(struct sfc_dp_rxq *dp_rxq)
+{
+	return container_of(dp_rxq, struct sfc_efx_rxq, dp);
+}
+
 /**
  * Receive queue information used during setup/release only.
  * Allocated on the same socket as adapter data.
@@ -143,12 +165,9 @@ void sfc_rx_qstop(struct sfc_adapter *sa, unsigned int sw_index);
 void sfc_rx_qflush_done(struct sfc_rxq *rxq);
 void sfc_rx_qflush_failed(struct sfc_rxq *rxq);
 
-uint16_t sfc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
-		       uint16_t nb_pkts);
-
 unsigned int sfc_rx_qdesc_npending(struct sfc_adapter *sa,
 				   unsigned int sw_index);
-int sfc_rx_qdesc_done(struct sfc_rxq *rxq, unsigned int offset);
+int sfc_rx_qdesc_done(struct sfc_dp_rxq *dp_rxq, unsigned int offset);
 
 #if EFSYS_OPT_RX_SCALE
 efx_rx_hash_type_t sfc_rte_to_efx_hash_type(uint64_t rss_hf);
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 05/13] net/sfc: make Rx scatter a datapath-dependent feature
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
                     ` (3 preceding siblings ...)
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 04/13] net/sfc: factor out libefx-based Rx datapath Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 06/13] net/sfc: remove few conditions in Rx queue refill Andrew Rybchenko
                     ` (8 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_dp_rx.h | 2 ++
 drivers/net/sfc/sfc_rx.c    | 8 ++++++++
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/sfc/sfc_dp_rx.h b/drivers/net/sfc/sfc_dp_rx.h
index 7e56d14..b3c6a6c 100644
--- a/drivers/net/sfc/sfc_dp_rx.h
+++ b/drivers/net/sfc/sfc_dp_rx.h
@@ -139,6 +139,8 @@ typedef unsigned int (sfc_dp_rx_qdesc_npending_t)(struct sfc_dp_rxq *dp_rxq);
 struct sfc_dp_rx {
 	struct sfc_dp				dp;
 
+	unsigned int				features;
+#define SFC_DP_RX_FEAT_SCATTER			0x1
 	sfc_dp_rx_qcreate_t			*qcreate;
 	sfc_dp_rx_qdestroy_t			*qdestroy;
 	sfc_dp_rx_qstart_t			*qstart;
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 7845bd8..56e48ab 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -487,6 +487,7 @@ struct sfc_dp_rx sfc_efx_rx = {
 		.type		= SFC_DP_RX,
 		.hw_fw_caps	= 0,
 	},
+	.features		= SFC_DP_RX_FEAT_SCATTER,
 	.qcreate		= sfc_efx_rx_qcreate,
 	.qdestroy		= sfc_efx_rx_qdestroy,
 	.qstart			= sfc_efx_rx_qstart,
@@ -1181,6 +1182,13 @@ sfc_rx_check_mode(struct sfc_adapter *sa, struct rte_eth_rxmode *rxmode)
 		rxmode->hw_strip_crc = 1;
 	}
 
+	if (rxmode->enable_scatter &&
+	    (~sa->dp_rx->features & SFC_DP_RX_FEAT_SCATTER)) {
+		sfc_err(sa, "Rx scatter not supported by %s datapath",
+			sa->dp_rx->dp.name);
+		rc = EINVAL;
+	}
+
 	if (rxmode->enable_lro) {
 		sfc_err(sa, "LRO not supported");
 		rc = EINVAL;
-- 
2.9.3


* [dpdk-dev] [PATCH v2 06/13] net/sfc: remove few conditions in Rx queue refill
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
                     ` (4 preceding siblings ...)
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 05/13] net/sfc: make Rx scatter a datapath-dependent feature Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 07/13] net/sfc: implement EF10 native Rx datapath Andrew Rybchenko
                     ` (7 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Since the Rx refill threshold guarantees that a refill posts one or
more full bulks, fewer checks are needed on the refill path.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_rx.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 56e48ab..f412376 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -95,17 +95,23 @@ sfc_efx_rx_qrefill(struct sfc_efx_rxq *rxq)
 		return;
 
 	bulks = free_space / RTE_DIM(objs);
+	/* refill_threshold guarantees that bulks is positive */
+	SFC_ASSERT(bulks > 0);
 
 	id = added & rxq->ptr_mask;
-	while (bulks-- > 0) {
-		if (rte_mempool_get_bulk(rxq->refill_mb_pool, objs,
-					 RTE_DIM(objs)) < 0) {
+	do {
+		if (unlikely(rte_mempool_get_bulk(rxq->refill_mb_pool, objs,
+						  RTE_DIM(objs)) < 0)) {
 			/*
 			 * It is hardly a safe way to increment counter
 			 * from different contexts, but all PMDs do it.
 			 */
 			rxq->evq->sa->eth_dev->data->rx_mbuf_alloc_failed +=
 				RTE_DIM(objs);
+			/* Return if we have posted nothing yet */
+			if (added == rxq->added)
+				return;
+			/* Push posted */
 			break;
 		}
 
@@ -128,13 +134,11 @@ sfc_efx_rx_qrefill(struct sfc_efx_rxq *rxq)
 		efx_rx_qpost(rxq->common, addr, rxq->buf_size,
 			     RTE_DIM(objs), rxq->completed, added);
 		added += RTE_DIM(objs);
-	}
+	} while (--bulks > 0);
 
-	/* Push doorbell if something is posted */
-	if (rxq->added != added) {
-		rxq->added = added;
-		efx_rx_qpush(rxq->common, added, &rxq->pushed);
-	}
+	SFC_ASSERT(added != rxq->added);
+	rxq->added = added;
+	efx_rx_qpush(rxq->common, added, &rxq->pushed);
 }
 
 static uint64_t
@@ -914,7 +918,8 @@ sfc_rx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 
 	rxq->evq = evq;
 	rxq->hw_index = sw_index;
-	rxq->refill_threshold = rx_conf->rx_free_thresh;
+	rxq->refill_threshold =
+		RTE_MAX(rx_conf->rx_free_thresh, SFC_RX_REFILL_BULK);
 	rxq->refill_mb_pool = mb_pool;
 
 	rc = sfc_dma_alloc(sa, "rxq", sw_index, EFX_RXQ_SIZE(rxq_info->entries),
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread
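The restructuring in this patch rests on one invariant: because the refill threshold is clamped to at least `SFC_RX_REFILL_BULK`, any refill that passes the threshold check has room for at least one full bulk, so the `while` loop can become a `do`/`while` and the doorbell push needs no "anything posted?" condition on the normal path. A hedged standalone sketch of that control flow, with hypothetical helpers in place of the mempool and hardware calls:

```c
#include <assert.h>

#define REFILL_BULK 8	/* stand-in for SFC_RX_REFILL_BULK */

/* Clamp the user threshold so at least one full bulk always fits */
static unsigned int
refill_threshold(unsigned int user_thresh)
{
	return user_thresh > REFILL_BULK ? user_thresh : REFILL_BULK;
}

/*
 * Return the number of descriptors a refill posts.  'avail' models
 * how many mbufs the mempool can supply before get-bulk fails.
 */
static unsigned int
refill(unsigned int free_space, unsigned int thresh, unsigned int avail)
{
	unsigned int bulks, posted = 0;

	if (free_space < thresh)
		return 0;		/* below threshold: no refill yet */

	bulks = free_space / REFILL_BULK;
	assert(bulks > 0);		/* guaranteed by refill_threshold() */

	do {
		if (avail < REFILL_BULK)
			break;		/* allocation failure: push what we have */
		avail -= REFILL_BULK;
		posted += REFILL_BULK;
	} while (--bulks > 0);

	/* doorbell pushed here iff posted != 0 */
	return posted;
}
```

With `thresh >= REFILL_BULK`, the only way to reach the doorbell with nothing posted is an allocation failure on the very first bulk, which the patch handles with an early return.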

* [dpdk-dev] [PATCH v2 07/13] net/sfc: implement EF10 native Rx datapath
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
                     ` (5 preceding siblings ...)
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 06/13] net/sfc: remove few conditions in Rx queue refill Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 08/13] net/sfc: factor out libefx-based Tx datapath Andrew Rybchenko
                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: David Riddoch <driddoch@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst   |   5 +-
 drivers/net/sfc/Makefile      |   1 +
 drivers/net/sfc/sfc_dp.h      |   1 +
 drivers/net/sfc/sfc_dp_rx.h   |  22 ++
 drivers/net/sfc/sfc_ef10.h    | 107 +++++++
 drivers/net/sfc/sfc_ef10_rx.c | 712 ++++++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c  |  14 +-
 drivers/net/sfc/sfc_ev.c      |  16 +-
 drivers/net/sfc/sfc_kvargs.h  |   4 +-
 drivers/net/sfc/sfc_rx.c      |   5 +
 10 files changed, 883 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_ef10.h
 create mode 100644 drivers/net/sfc/sfc_ef10_rx.c

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 2d7a241..43bd9f1 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -226,12 +226,15 @@ whitelist option like "-w 02:00.0,arg1=value1,...".
 Case-insensitive 1/y/yes/on or 0/n/no/off may be used to specify
 boolean parameters value.
 
-- ``rx_datapath`` [auto|efx] (default **auto**)
+- ``rx_datapath`` [auto|efx|ef10] (default **auto**)
 
   Choose receive datapath implementation.
   **auto** allows the driver itself to make a choice based on firmware
   features available and required by the datapath implementation.
   **efx** chooses libefx-based datapath which supports Rx scatter.
+  **ef10** chooses the EF10 (SFN7xxx, SFN8xxx) native datapath, which
+  is more efficient than the libefx-based one and provides richer
+  packet type classification, but lacks Rx scatter support.
 
 - ``perf_profile`` [auto|throughput|low-latency] (default **throughput**)
 
diff --git a/drivers/net/sfc/Makefile b/drivers/net/sfc/Makefile
index 541c96d..66b9114 100644
--- a/drivers/net/sfc/Makefile
+++ b/drivers/net/sfc/Makefile
@@ -95,6 +95,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tso.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_filter.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_dp.c
+SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_ef10_rx.c
 
 VPATH += $(SRCDIR)/base
 
diff --git a/drivers/net/sfc/sfc_dp.h b/drivers/net/sfc/sfc_dp.h
index c44c44d..f83ea58 100644
--- a/drivers/net/sfc/sfc_dp.h
+++ b/drivers/net/sfc/sfc_dp.h
@@ -98,6 +98,7 @@ struct sfc_dp {
 	enum sfc_dp_type		type;
 	/* Mask of required hardware/firmware capabilities */
 	unsigned int			hw_fw_caps;
+#define SFC_DP_HW_FW_CAP_EF10		0x1
 };
 
 /** List of datapath variants */
diff --git a/drivers/net/sfc/sfc_dp_rx.h b/drivers/net/sfc/sfc_dp_rx.h
index b3c6a6c..9d05a4b 100644
--- a/drivers/net/sfc/sfc_dp_rx.h
+++ b/drivers/net/sfc/sfc_dp_rx.h
@@ -83,6 +83,21 @@ struct sfc_dp_rx_qcreate_info {
 
 	/** Rx queue size */
 	unsigned int		rxq_entries;
+	/** DMA-mapped Rx descriptors ring */
+	void			*rxq_hw_ring;
+
+	/** Associated event queue size */
+	unsigned int		evq_entries;
+	/** Hardware event ring */
+	void			*evq_hw_ring;
+
+	/** The queue index in hardware (required to push right doorbell) */
+	unsigned int		hw_index;
+	/**
+	 * Virtual address of the memory-mapped BAR to push Rx refill
+	 * doorbell
+	 */
+	volatile void		*mem_bar;
 };
 
 /**
@@ -123,6 +138,11 @@ typedef void (sfc_dp_rx_qstop_t)(struct sfc_dp_rxq *dp_rxq,
 				 unsigned int *evq_read_ptr);
 
 /**
+ * Receive event handler used during queue flush only.
+ */
+typedef bool (sfc_dp_rx_qrx_ev_t)(struct sfc_dp_rxq *dp_rxq, unsigned int id);
+
+/**
  * Receive queue purge function called after queue flush.
  *
 * Should be used to free unused receive buffers.
@@ -145,6 +165,7 @@ struct sfc_dp_rx {
 	sfc_dp_rx_qdestroy_t			*qdestroy;
 	sfc_dp_rx_qstart_t			*qstart;
 	sfc_dp_rx_qstop_t			*qstop;
+	sfc_dp_rx_qrx_ev_t			*qrx_ev;
 	sfc_dp_rx_qpurge_t			*qpurge;
 	sfc_dp_rx_supported_ptypes_get_t	*supported_ptypes_get;
 	sfc_dp_rx_qdesc_npending_t		*qdesc_npending;
@@ -168,6 +189,7 @@ sfc_dp_find_rx_by_caps(struct sfc_dp_list *head, unsigned int avail_caps)
 }
 
 extern struct sfc_dp_rx sfc_efx_rx;
+extern struct sfc_dp_rx sfc_ef10_rx;
 
 #ifdef __cplusplus
 }
diff --git a/drivers/net/sfc/sfc_ef10.h b/drivers/net/sfc/sfc_ef10.h
new file mode 100644
index 0000000..060d8fe
--- /dev/null
+++ b/drivers/net/sfc/sfc_ef10.h
@@ -0,0 +1,107 @@
+/*-
+ *   BSD LICENSE
+ *
+ * Copyright (c) 2017 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _SFC_EF10_H
+#define _SFC_EF10_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Number of events in one cache line */
+#define SFC_EF10_EV_PER_CACHE_LINE \
+	(RTE_CACHE_LINE_SIZE / sizeof(efx_qword_t))
+
+#define SFC_EF10_EV_QCLEAR_MASK		(~(SFC_EF10_EV_PER_CACHE_LINE - 1))
+
+#if defined(SFC_EF10_EV_QCLEAR_USE_EFX)
+static inline void
+sfc_ef10_ev_qclear_cache_line(void *ptr)
+{
+	efx_qword_t *entry = ptr;
+	unsigned int i;
+
+	for (i = 0; i < SFC_EF10_EV_PER_CACHE_LINE; ++i)
+		EFX_SET_QWORD(entry[i]);
+}
+#else
+/*
+ * It is possible to do this using AVX2 or AVX-512F, but measurements
+ * show lower performance.
+ */
+static inline void
+sfc_ef10_ev_qclear_cache_line(void *ptr)
+{
+	const __m128i val = _mm_set1_epi64x(UINT64_MAX);
+	__m128i *addr = ptr;
+	unsigned int i;
+
+	RTE_BUILD_BUG_ON(sizeof(val) > RTE_CACHE_LINE_SIZE);
+	RTE_BUILD_BUG_ON(RTE_CACHE_LINE_SIZE % sizeof(val) != 0);
+
+	for (i = 0; i < RTE_CACHE_LINE_SIZE / sizeof(val); ++i)
+		_mm_store_si128(&addr[i], val);
+}
+#endif
+
+static inline void
+sfc_ef10_ev_qclear(efx_qword_t *hw_ring, unsigned int ptr_mask,
+		   unsigned int old_read_ptr, unsigned int read_ptr)
+{
+	const unsigned int clear_ptr = read_ptr & SFC_EF10_EV_QCLEAR_MASK;
+	unsigned int old_clear_ptr = old_read_ptr & SFC_EF10_EV_QCLEAR_MASK;
+
+	while (old_clear_ptr != clear_ptr) {
+		sfc_ef10_ev_qclear_cache_line(
+			&hw_ring[old_clear_ptr & ptr_mask]);
+		old_clear_ptr += SFC_EF10_EV_PER_CACHE_LINE;
+	}
+
+	/*
+	 * No barriers here.
+	 * Functions which push doorbell should care about correct
+	 * ordering: store instructions which fill in EvQ ring should be
+	 * retired from CPU and DMA sync before doorbell which will allow
+	 * to use these event entries.
+	 */
+}
+
+static inline bool
+sfc_ef10_ev_present(const efx_qword_t ev)
+{
+	return ~EFX_QWORD_FIELD(ev, EFX_DWORD_0) |
+	       ~EFX_QWORD_FIELD(ev, EFX_DWORD_1);
+}
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* _SFC_EF10_H */
diff --git a/drivers/net/sfc/sfc_ef10_rx.c b/drivers/net/sfc/sfc_ef10_rx.c
new file mode 100644
index 0000000..2a3bf89
--- /dev/null
+++ b/drivers/net/sfc/sfc_ef10_rx.c
@@ -0,0 +1,712 @@
+/*-
+ *   BSD LICENSE
+ *
+ * Copyright (c) 2016 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* EF10 native datapath implementation */
+
+#include <stdbool.h>
+
+#include <rte_byteorder.h>
+#include <rte_mbuf_ptype.h>
+#include <rte_mbuf.h>
+#include <rte_io.h>
+
+#include "efx.h"
+#include "efx_types.h"
+#include "efx_regs.h"
+#include "efx_regs_ef10.h"
+
+#include "sfc_tweak.h"
+#include "sfc_dp_rx.h"
+#include "sfc_kvargs.h"
+#include "sfc_ef10.h"
+
+#define sfc_ef10_rx_err(dpq, ...) \
+	SFC_DP_LOG(SFC_KVARG_DATAPATH_EF10, ERR, dpq, __VA_ARGS__)
+
+/**
+ * Alignment requirement for value written to RX WPTR:
+ * the WPTR must be aligned to an 8 descriptor boundary.
+ */
+#define SFC_EF10_RX_WPTR_ALIGN	8
+
+/**
+ * Maximum number of descriptors/buffers in the Rx ring.
+ * It should guarantee that the corresponding event queue never overfills.
+ * EF10 native datapath uses event queue of the same size as Rx queue.
+ * Maximum number of events on datapath can be estimated as number of
+ * Rx queue entries (one event per Rx buffer in the worst case) plus
+ * Rx error and flush events.
+ */
+#define SFC_EF10_RXQ_LIMIT(_ndesc) \
+	((_ndesc) - 1 /* head must not step on tail */ - \
+	 (SFC_EF10_EV_PER_CACHE_LINE - 1) /* max unused EvQ entries */ - \
+	 1 /* Rx error */ - 1 /* flush */)
+
+struct sfc_ef10_rx_sw_desc {
+	struct rte_mbuf			*mbuf;
+};
+
+struct sfc_ef10_rxq {
+	/* Used on data path */
+	unsigned int			flags;
+#define SFC_EF10_RXQ_STARTED		0x1
+#define SFC_EF10_RXQ_NOT_RUNNING	0x2
+#define SFC_EF10_RXQ_EXCEPTION		0x4
+#define SFC_EF10_RXQ_RSS_HASH		0x8
+	unsigned int			ptr_mask;
+	unsigned int			prepared;
+	unsigned int			completed;
+	unsigned int			evq_read_ptr;
+	efx_qword_t			*evq_hw_ring;
+	struct sfc_ef10_rx_sw_desc	*sw_ring;
+	uint64_t			rearm_data;
+	uint16_t			prefix_size;
+
+	/* Used on refill */
+	uint16_t			buf_size;
+	unsigned int			added;
+	unsigned int			refill_threshold;
+	struct rte_mempool		*refill_mb_pool;
+	efx_qword_t			*rxq_hw_ring;
+	volatile void			*doorbell;
+
+	/* Datapath receive queue anchor */
+	struct sfc_dp_rxq		dp;
+};
+
+static inline struct sfc_ef10_rxq *
+sfc_ef10_rxq_by_dp_rxq(struct sfc_dp_rxq *dp_rxq)
+{
+	return container_of(dp_rxq, struct sfc_ef10_rxq, dp);
+}
+
+static void
+sfc_ef10_rx_qpush(struct sfc_ef10_rxq *rxq)
+{
+	efx_dword_t dword;
+
+	/* Hardware has alignment restriction for WPTR */
+	RTE_BUILD_BUG_ON(SFC_RX_REFILL_BULK % SFC_EF10_RX_WPTR_ALIGN != 0);
+	SFC_ASSERT(RTE_ALIGN(rxq->added, SFC_EF10_RX_WPTR_ALIGN) == rxq->added);
+
+	EFX_POPULATE_DWORD_1(dword, ERF_DZ_RX_DESC_WPTR,
+			     rxq->added & rxq->ptr_mask);
+
+	/* DMA sync to device is not required */
+
+	/*
+	 * rte_write32() has rte_io_wmb() which guarantees that the STORE
+	 * operations (i.e. Rx and event descriptor updates) that precede
+	 * the rte_io_wmb() call are visible to NIC before the STORE
+	 * operations that follow it (i.e. doorbell write).
+	 */
+	rte_write32(dword.ed_u32[0], rxq->doorbell);
+}
+
+static void
+sfc_ef10_rx_qrefill(struct sfc_ef10_rxq *rxq)
+{
+	const unsigned int ptr_mask = rxq->ptr_mask;
+	const uint32_t buf_size = rxq->buf_size;
+	unsigned int free_space;
+	unsigned int bulks;
+	void *objs[SFC_RX_REFILL_BULK];
+	unsigned int added = rxq->added;
+
+	free_space = SFC_EF10_RXQ_LIMIT(ptr_mask + 1) -
+		(added - rxq->completed);
+
+	if (free_space < rxq->refill_threshold)
+		return;
+
+	bulks = free_space / RTE_DIM(objs);
+	/* refill_threshold guarantees that bulks is positive */
+	SFC_ASSERT(bulks > 0);
+
+	do {
+		unsigned int id;
+		unsigned int i;
+
+		if (unlikely(rte_mempool_get_bulk(rxq->refill_mb_pool, objs,
+						  RTE_DIM(objs)) < 0)) {
+			struct rte_eth_dev_data *dev_data =
+				rte_eth_devices[rxq->dp.dpq.port_id].data;
+
+			/*
+			 * It is hardly safe to increment a counter from
+			 * different contexts, but all PMDs do it.
+			 */
+			dev_data->rx_mbuf_alloc_failed += RTE_DIM(objs);
+			/* Return if we have posted nothing yet */
+			if (added == rxq->added)
+				return;
+			/* Push posted */
+			break;
+		}
+
+		for (i = 0, id = added & ptr_mask;
+		     i < RTE_DIM(objs);
+		     ++i, ++id) {
+			struct rte_mbuf *m = objs[i];
+			struct sfc_ef10_rx_sw_desc *rxd;
+			phys_addr_t phys_addr;
+
+			SFC_ASSERT((id & ~ptr_mask) == 0);
+			rxd = &rxq->sw_ring[id];
+			rxd->mbuf = m;
+
+			/*
+			 * Avoid writing to mbuf. It is cheaper to do it
+			 * when we receive the packet and fill in nearby
+			 * structure members.
+			 */
+
+			phys_addr = rte_mbuf_data_dma_addr_default(m);
+			EFX_POPULATE_QWORD_2(rxq->rxq_hw_ring[id],
+			    ESF_DZ_RX_KER_BYTE_CNT, buf_size,
+			    ESF_DZ_RX_KER_BUF_ADDR, phys_addr);
+		}
+
+		added += RTE_DIM(objs);
+	} while (--bulks > 0);
+
+	SFC_ASSERT(rxq->added != added);
+	rxq->added = added;
+	sfc_ef10_rx_qpush(rxq);
+}
+
+static void
+sfc_ef10_rx_prefetch_next(struct sfc_ef10_rxq *rxq, unsigned int next_id)
+{
+	struct rte_mbuf *next_mbuf;
+
+	/* Prefetch next bunch of software descriptors */
+	if ((next_id % (RTE_CACHE_LINE_SIZE / sizeof(rxq->sw_ring[0]))) == 0)
+		rte_prefetch0(&rxq->sw_ring[next_id]);
+
+	/*
+	 * It looks strange to prefetch depending on previous prefetch
+	 * data, but measurements show that it is really efficient and
+	 * increases packet rate.
+	 */
+	next_mbuf = rxq->sw_ring[next_id].mbuf;
+	if (likely(next_mbuf != NULL)) {
+		/* Prefetch the next mbuf structure */
+		rte_mbuf_prefetch_part1(next_mbuf);
+
+		/* Prefetch pseudo header of the next packet */
+		/* data_off is not filled in yet */
+		/* The data may not be ready yet, but we prefetch it anyway */
+		rte_prefetch0((uint8_t *)next_mbuf->buf_addr +
+			      RTE_PKTMBUF_HEADROOM);
+	}
+}
+
+static uint16_t
+sfc_ef10_rx_prepared(struct sfc_ef10_rxq *rxq, struct rte_mbuf **rx_pkts,
+		     uint16_t nb_pkts)
+{
+	uint16_t n_rx_pkts = RTE_MIN(nb_pkts, rxq->prepared);
+	unsigned int completed = rxq->completed;
+	unsigned int i;
+
+	rxq->prepared -= n_rx_pkts;
+	rxq->completed = completed + n_rx_pkts;
+
+	for (i = 0; i < n_rx_pkts; ++i, ++completed)
+		rx_pkts[i] = rxq->sw_ring[completed & rxq->ptr_mask].mbuf;
+
+	return n_rx_pkts;
+}
+
+static void
+sfc_ef10_rx_ev_to_offloads(struct sfc_ef10_rxq *rxq, const efx_qword_t rx_ev,
+			   struct rte_mbuf *m)
+{
+	uint32_t l2_ptype = 0;
+	uint32_t l3_ptype = 0;
+	uint32_t l4_ptype = 0;
+	uint64_t ol_flags = 0;
+
+	if (unlikely(EFX_TEST_QWORD_BIT(rx_ev, ESF_DZ_RX_PARSE_INCOMPLETE_LBN)))
+		goto done;
+
+	switch (EFX_QWORD_FIELD(rx_ev, ESF_DZ_RX_ETH_TAG_CLASS)) {
+	case ESE_DZ_ETH_TAG_CLASS_NONE:
+		l2_ptype = RTE_PTYPE_L2_ETHER;
+		break;
+	case ESE_DZ_ETH_TAG_CLASS_VLAN1:
+		l2_ptype = RTE_PTYPE_L2_ETHER_VLAN;
+		break;
+	case ESE_DZ_ETH_TAG_CLASS_VLAN2:
+		l2_ptype = RTE_PTYPE_L2_ETHER_QINQ;
+		break;
+	default:
+		/* Unexpected Eth tag class */
+		SFC_ASSERT(false);
+	}
+
+	switch (EFX_QWORD_FIELD(rx_ev, ESF_DZ_RX_L3_CLASS)) {
+	case ESE_DZ_L3_CLASS_IP4_FRAG:
+		l4_ptype = RTE_PTYPE_L4_FRAG;
+		/* FALLTHROUGH */
+	case ESE_DZ_L3_CLASS_IP4:
+		l3_ptype = RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
+		ol_flags |= PKT_RX_RSS_HASH |
+			((EFX_TEST_QWORD_BIT(rx_ev,
+					     ESF_DZ_RX_IPCKSUM_ERR_LBN)) ?
+			 PKT_RX_IP_CKSUM_BAD : PKT_RX_IP_CKSUM_GOOD);
+		break;
+	case ESE_DZ_L3_CLASS_IP6_FRAG:
+		l4_ptype |= RTE_PTYPE_L4_FRAG;
+		/* FALLTHROUGH */
+	case ESE_DZ_L3_CLASS_IP6:
+		l3_ptype |= RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
+		ol_flags |= PKT_RX_RSS_HASH;
+		break;
+	case ESE_DZ_L3_CLASS_ARP:
+		/* Override Layer 2 packet type */
+		l2_ptype = RTE_PTYPE_L2_ETHER_ARP;
+		break;
+	default:
+		/* Unexpected Layer 3 class */
+		SFC_ASSERT(false);
+	}
+
+	switch (EFX_QWORD_FIELD(rx_ev, ESF_DZ_RX_L4_CLASS)) {
+	case ESE_DZ_L4_CLASS_TCP:
+		l4_ptype = RTE_PTYPE_L4_TCP;
+		ol_flags |=
+			(EFX_TEST_QWORD_BIT(rx_ev,
+					    ESF_DZ_RX_TCPUDP_CKSUM_ERR_LBN)) ?
+			PKT_RX_L4_CKSUM_BAD : PKT_RX_L4_CKSUM_GOOD;
+		break;
+	case ESE_DZ_L4_CLASS_UDP:
+		l4_ptype = RTE_PTYPE_L4_UDP;
+		ol_flags |=
+			(EFX_TEST_QWORD_BIT(rx_ev,
+					    ESF_DZ_RX_TCPUDP_CKSUM_ERR_LBN)) ?
+			PKT_RX_L4_CKSUM_BAD : PKT_RX_L4_CKSUM_GOOD;
+		break;
+	case ESE_DZ_L4_CLASS_UNKNOWN:
+		break;
+	default:
+		/* Unexpected Layer 4 class */
+		SFC_ASSERT(false);
+	}
+
+	/* Remove RSS hash offload flag if RSS is not enabled */
+	if (~rxq->flags & SFC_EF10_RXQ_RSS_HASH)
+		ol_flags &= ~PKT_RX_RSS_HASH;
+
+done:
+	m->ol_flags = ol_flags;
+	m->packet_type = l2_ptype | l3_ptype | l4_ptype;
+}
+
+static uint16_t
+sfc_ef10_rx_pseudo_hdr_get_len(const uint8_t *pseudo_hdr)
+{
+	return rte_le_to_cpu_16(*(const uint16_t *)&pseudo_hdr[8]);
+}
+
+static uint32_t
+sfc_ef10_rx_pseudo_hdr_get_hash(const uint8_t *pseudo_hdr)
+{
+	return rte_le_to_cpu_32(*(const uint32_t *)pseudo_hdr);
+}
+
+static uint16_t
+sfc_ef10_rx_process_event(struct sfc_ef10_rxq *rxq, efx_qword_t rx_ev,
+			  struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+{
+	const unsigned int ptr_mask = rxq->ptr_mask;
+	unsigned int completed = rxq->completed;
+	unsigned int ready;
+	struct sfc_ef10_rx_sw_desc *rxd;
+	struct rte_mbuf *m;
+	struct rte_mbuf *m0;
+	uint16_t n_rx_pkts;
+	const uint8_t *pseudo_hdr;
+	uint16_t pkt_len;
+
+	ready = (EFX_QWORD_FIELD(rx_ev, ESF_DZ_RX_DSC_PTR_LBITS) - completed) &
+		EFX_MASK32(ESF_DZ_RX_DSC_PTR_LBITS);
+	SFC_ASSERT(ready > 0);
+
+	if (rx_ev.eq_u64[0] &
+	    rte_cpu_to_le_64((1ull << ESF_DZ_RX_ECC_ERR_LBN) |
+			     (1ull << ESF_DZ_RX_ECRC_ERR_LBN))) {
+		SFC_ASSERT(rxq->prepared == 0);
+		rxq->completed += ready;
+		while (ready-- > 0) {
+			rxd = &rxq->sw_ring[completed++ & ptr_mask];
+			rte_mempool_put(rxq->refill_mb_pool, rxd->mbuf);
+		}
+		return 0;
+	}
+
+	n_rx_pkts = RTE_MIN(ready, nb_pkts);
+	rxq->prepared = ready - n_rx_pkts;
+	rxq->completed += n_rx_pkts;
+
+	rxd = &rxq->sw_ring[completed++ & ptr_mask];
+
+	sfc_ef10_rx_prefetch_next(rxq, completed & ptr_mask);
+
+	m = rxd->mbuf;
+
+	*rx_pkts++ = m;
+
+	*(uint64_t *)(&m->rearm_data) = rxq->rearm_data;
+	/* rearm_data rewrites ol_flags which is updated below */
+	rte_compiler_barrier();
+
+	/* Classify packet based on Rx event */
+	sfc_ef10_rx_ev_to_offloads(rxq, rx_ev, m);
+
+	/* data_off already moved past pseudo header */
+	pseudo_hdr = (uint8_t *)m->buf_addr + RTE_PKTMBUF_HEADROOM;
+
+	/*
+	 * Always get RSS hash from pseudo header to avoid
+	 * a branch. Whether it is valid or not depends on
+	 * PKT_RX_RSS_HASH in m->ol_flags.
+	 */
+	m->hash.rss = sfc_ef10_rx_pseudo_hdr_get_hash(pseudo_hdr);
+
+	if (ready == 1)
+		pkt_len = EFX_QWORD_FIELD(rx_ev, ESF_DZ_RX_BYTES) -
+			rxq->prefix_size;
+	else
+		pkt_len = sfc_ef10_rx_pseudo_hdr_get_len(pseudo_hdr);
+	SFC_ASSERT(pkt_len > 0);
+	rte_pktmbuf_data_len(m) = pkt_len;
+	rte_pktmbuf_pkt_len(m) = pkt_len;
+
+	m->next = NULL;
+
+	/* Remember mbuf to copy offload flags and packet type from */
+	m0 = m;
+	for (--ready; ready > 0; --ready) {
+		rxd = &rxq->sw_ring[completed++ & ptr_mask];
+
+		sfc_ef10_rx_prefetch_next(rxq, completed & ptr_mask);
+
+		m = rxd->mbuf;
+
+		if (ready > rxq->prepared)
+			*rx_pkts++ = m;
+
+		*(uint64_t *)(&m->rearm_data) = rxq->rearm_data;
+		/* rearm_data rewrites ol_flags which is updated below */
+		rte_compiler_barrier();
+
+		/* Event-dependent information is the same */
+		m->ol_flags = m0->ol_flags;
+		m->packet_type = m0->packet_type;
+
+		/* data_off already moved past pseudo header */
+		pseudo_hdr = (uint8_t *)m->buf_addr + RTE_PKTMBUF_HEADROOM;
+
+		/*
+		 * Always get RSS hash from pseudo header to avoid
+		 * a branch. Whether it is valid or not depends on
+		 * PKT_RX_RSS_HASH in m->ol_flags.
+		 */
+		m->hash.rss = sfc_ef10_rx_pseudo_hdr_get_hash(pseudo_hdr);
+
+		pkt_len = sfc_ef10_rx_pseudo_hdr_get_len(pseudo_hdr);
+		SFC_ASSERT(pkt_len > 0);
+		rte_pktmbuf_data_len(m) = pkt_len;
+		rte_pktmbuf_pkt_len(m) = pkt_len;
+
+		m->next = NULL;
+	}
+
+	return n_rx_pkts;
+}
+
+static bool
+sfc_ef10_rx_get_event(struct sfc_ef10_rxq *rxq, efx_qword_t *rx_ev)
+{
+	*rx_ev = rxq->evq_hw_ring[rxq->evq_read_ptr & rxq->ptr_mask];
+
+	if (!sfc_ef10_ev_present(*rx_ev))
+		return false;
+
+	if (unlikely(EFX_QWORD_FIELD(*rx_ev, FSF_AZ_EV_CODE) !=
+		     FSE_AZ_EV_CODE_RX_EV)) {
+		/*
+		 * Do not move read_ptr to keep the event for exception
+		 * handling by the control path.
+		 */
+		rxq->flags |= SFC_EF10_RXQ_EXCEPTION;
+		sfc_ef10_rx_err(&rxq->dp.dpq,
+				"RxQ exception at EvQ read ptr %#x",
+				rxq->evq_read_ptr);
+		return false;
+	}
+
+	rxq->evq_read_ptr++;
+	return true;
+}
+
+static uint16_t
+sfc_ef10_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+{
+	struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(rx_queue);
+	unsigned int evq_old_read_ptr;
+	uint16_t n_rx_pkts;
+	efx_qword_t rx_ev;
+
+	if (unlikely(rxq->flags &
+		     (SFC_EF10_RXQ_NOT_RUNNING | SFC_EF10_RXQ_EXCEPTION)))
+		return 0;
+
+	n_rx_pkts = sfc_ef10_rx_prepared(rxq, rx_pkts, nb_pkts);
+
+	evq_old_read_ptr = rxq->evq_read_ptr;
+	while (n_rx_pkts != nb_pkts && sfc_ef10_rx_get_event(rxq, &rx_ev)) {
+		/*
+		 * DROP_EVENT is internal to the NIC; software should
+		 * never see it and, therefore, may ignore it.
+		 */
+
+		n_rx_pkts += sfc_ef10_rx_process_event(rxq, rx_ev,
+						       rx_pkts + n_rx_pkts,
+						       nb_pkts - n_rx_pkts);
+	}
+
+	sfc_ef10_ev_qclear(rxq->evq_hw_ring, rxq->ptr_mask, evq_old_read_ptr,
+			   rxq->evq_read_ptr);
+
+	/* It is not a problem if we refill in the case of exception */
+	sfc_ef10_rx_qrefill(rxq);
+
+	return n_rx_pkts;
+}
+
+static const uint32_t *
+sfc_ef10_supported_ptypes_get(void)
+{
+	static const uint32_t ef10_native_ptypes[] = {
+		RTE_PTYPE_L2_ETHER,
+		RTE_PTYPE_L2_ETHER_ARP,
+		RTE_PTYPE_L2_ETHER_VLAN,
+		RTE_PTYPE_L2_ETHER_QINQ,
+		RTE_PTYPE_L3_IPV4_EXT_UNKNOWN,
+		RTE_PTYPE_L3_IPV6_EXT_UNKNOWN,
+		RTE_PTYPE_L4_FRAG,
+		RTE_PTYPE_L4_TCP,
+		RTE_PTYPE_L4_UDP,
+		RTE_PTYPE_UNKNOWN
+	};
+
+	return ef10_native_ptypes;
+}
+
+static sfc_dp_rx_qdesc_npending_t sfc_ef10_rx_qdesc_npending;
+static unsigned int
+sfc_ef10_rx_qdesc_npending(__rte_unused struct sfc_dp_rxq *dp_rxq)
+{
+	/*
+	 * A correct implementation requires EvQ polling and event
+	 * processing (keeping all ready mbufs in prepared).
+	 */
+	return -ENOTSUP;
+}
+
+
+static uint64_t
+sfc_ef10_mk_mbuf_rearm_data(uint16_t port_id, uint16_t prefix_size)
+{
+	struct rte_mbuf m;
+
+	memset(&m, 0, sizeof(m));
+
+	rte_mbuf_refcnt_set(&m, 1);
+	m.data_off = RTE_PKTMBUF_HEADROOM + prefix_size;
+	m.nb_segs = 1;
+	m.port = port_id;
+
+	/* rearm_data covers structure members filled in above */
+	rte_compiler_barrier();
+	return *(uint64_t *)(&m.rearm_data);
+}
+
+static sfc_dp_rx_qcreate_t sfc_ef10_rx_qcreate;
+static int
+sfc_ef10_rx_qcreate(uint16_t port_id, uint16_t queue_id,
+		    const struct rte_pci_addr *pci_addr, int socket_id,
+		    const struct sfc_dp_rx_qcreate_info *info,
+		    struct sfc_dp_rxq **dp_rxqp)
+{
+	struct sfc_ef10_rxq *rxq;
+	int rc;
+
+	rc = EINVAL;
+	if (info->rxq_entries != info->evq_entries)
+		goto fail_rxq_args;
+
+	rc = ENOMEM;
+	rxq = rte_zmalloc_socket("sfc-ef10-rxq", sizeof(*rxq),
+				 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq == NULL)
+		goto fail_rxq_alloc;
+
+	sfc_dp_queue_init(&rxq->dp.dpq, port_id, queue_id, pci_addr);
+
+	rc = ENOMEM;
+	rxq->sw_ring = rte_calloc_socket("sfc-ef10-rxq-sw_ring",
+					 info->rxq_entries,
+					 sizeof(*rxq->sw_ring),
+					 RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq->sw_ring == NULL)
+		goto fail_desc_alloc;
+
+	rxq->flags |= SFC_EF10_RXQ_NOT_RUNNING;
+	if (info->flags & SFC_RXQ_FLAG_RSS_HASH)
+		rxq->flags |= SFC_EF10_RXQ_RSS_HASH;
+	rxq->ptr_mask = info->rxq_entries - 1;
+	rxq->evq_hw_ring = info->evq_hw_ring;
+	rxq->refill_threshold = info->refill_threshold;
+	rxq->rearm_data =
+		sfc_ef10_mk_mbuf_rearm_data(port_id, info->prefix_size);
+	rxq->prefix_size = info->prefix_size;
+	rxq->buf_size = info->buf_size;
+	rxq->refill_mb_pool = info->refill_mb_pool;
+	rxq->rxq_hw_ring = info->rxq_hw_ring;
+	rxq->doorbell = (volatile uint8_t *)info->mem_bar +
+			ER_DZ_RX_DESC_UPD_REG_OFST +
+			info->hw_index * ER_DZ_RX_DESC_UPD_REG_STEP;
+
+	*dp_rxqp = &rxq->dp;
+	return 0;
+
+fail_desc_alloc:
+	rte_free(rxq);
+
+fail_rxq_alloc:
+fail_rxq_args:
+	return rc;
+}
+
+static sfc_dp_rx_qdestroy_t sfc_ef10_rx_qdestroy;
+static void
+sfc_ef10_rx_qdestroy(struct sfc_dp_rxq *dp_rxq)
+{
+	struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(dp_rxq);
+
+	rte_free(rxq->sw_ring);
+	rte_free(rxq);
+}
+
+static sfc_dp_rx_qstart_t sfc_ef10_rx_qstart;
+static int
+sfc_ef10_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr)
+{
+	struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(dp_rxq);
+
+	rxq->prepared = 0;
+	rxq->completed = rxq->added = 0;
+
+	sfc_ef10_rx_qrefill(rxq);
+
+	rxq->evq_read_ptr = evq_read_ptr;
+
+	rxq->flags |= SFC_EF10_RXQ_STARTED;
+	rxq->flags &= ~(SFC_EF10_RXQ_NOT_RUNNING | SFC_EF10_RXQ_EXCEPTION);
+
+	return 0;
+}
+
+static sfc_dp_rx_qstop_t sfc_ef10_rx_qstop;
+static void
+sfc_ef10_rx_qstop(struct sfc_dp_rxq *dp_rxq, unsigned int *evq_read_ptr)
+{
+	struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(dp_rxq);
+
+	rxq->flags |= SFC_EF10_RXQ_NOT_RUNNING;
+
+	*evq_read_ptr = rxq->evq_read_ptr;
+}
+
+static sfc_dp_rx_qrx_ev_t sfc_ef10_rx_qrx_ev;
+static bool
+sfc_ef10_rx_qrx_ev(struct sfc_dp_rxq *dp_rxq, __rte_unused unsigned int id)
+{
+	__rte_unused struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(dp_rxq);
+
+	SFC_ASSERT(rxq->flags & SFC_EF10_RXQ_NOT_RUNNING);
+
+	/*
+	 * It is safe to ignore Rx event since we free all mbufs on
+	 * queue purge anyway.
+	 */
+
+	return false;
+}
+
+static sfc_dp_rx_qpurge_t sfc_ef10_rx_qpurge;
+static void
+sfc_ef10_rx_qpurge(struct sfc_dp_rxq *dp_rxq)
+{
+	struct sfc_ef10_rxq *rxq = sfc_ef10_rxq_by_dp_rxq(dp_rxq);
+	unsigned int i;
+	struct sfc_ef10_rx_sw_desc *rxd;
+
+	for (i = rxq->completed; i != rxq->added; ++i) {
+		rxd = &rxq->sw_ring[i & rxq->ptr_mask];
+		rte_mempool_put(rxq->refill_mb_pool, rxd->mbuf);
+		rxd->mbuf = NULL;
+	}
+
+	rxq->flags &= ~SFC_EF10_RXQ_STARTED;
+}
+
+struct sfc_dp_rx sfc_ef10_rx = {
+	.dp = {
+		.name		= SFC_KVARG_DATAPATH_EF10,
+		.type		= SFC_DP_RX,
+		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF10,
+	},
+	.features		= 0,
+	.qcreate		= sfc_ef10_rx_qcreate,
+	.qdestroy		= sfc_ef10_rx_qdestroy,
+	.qstart			= sfc_ef10_rx_qstart,
+	.qstop			= sfc_ef10_rx_qstop,
+	.qrx_ev			= sfc_ef10_rx_qrx_ev,
+	.qpurge			= sfc_ef10_rx_qpurge,
+	.supported_ptypes_get	= sfc_ef10_supported_ptypes_get,
+	.qdesc_npending		= sfc_ef10_rx_qdesc_npending,
+	.pkt_burst		= sfc_ef10_recv_pkts,
+};
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 6dffc1c..010641b 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1366,6 +1366,15 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 	if (sa == NULL || sa->state == SFC_ADAPTER_UNINITIALIZED)
 		return -E_RTE_SECONDARY;
 
+	switch (sa->family) {
+	case EFX_FAMILY_HUNTINGTON:
+	case EFX_FAMILY_MEDFORD:
+		avail_caps |= SFC_DP_HW_FW_CAP_EF10;
+		break;
+	default:
+		break;
+	}
+
 	rc = sfc_kvargs_process(sa, SFC_KVARG_RX_DATAPATH,
 				sfc_kvarg_string_handler, &rx_name);
 	if (rc != 0)
@@ -1414,8 +1423,11 @@ static void
 sfc_register_dp(void)
 {
 	/* Register once */
-	if (TAILQ_EMPTY(&sfc_dp_head))
+	if (TAILQ_EMPTY(&sfc_dp_head)) {
+		/* Prefer EF10 datapath */
+		sfc_dp_register(&sfc_dp_head, &sfc_ef10_rx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_efx_rx.dp);
+	}
 }
 
 static int
diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index c6b02f2..84b2fd1 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -156,6 +156,20 @@ sfc_ev_efx_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
 }
 
 static boolean_t
+sfc_ev_dp_rx(void *arg, __rte_unused uint32_t label, uint32_t id,
+	     __rte_unused uint32_t size, __rte_unused uint16_t flags)
+{
+	struct sfc_evq *evq = arg;
+	struct sfc_dp_rxq *dp_rxq;
+
+	dp_rxq = evq->dp_rxq;
+	SFC_ASSERT(dp_rxq != NULL);
+
+	SFC_ASSERT(evq->sa->dp_rx->qrx_ev != NULL);
+	return evq->sa->dp_rx->qrx_ev(dp_rxq, id);
+}
+
+static boolean_t
 sfc_ev_nop_tx(void *arg, uint32_t label, uint32_t id)
 {
 	struct sfc_evq *evq = arg;
@@ -414,7 +428,7 @@ static const efx_ev_callbacks_t sfc_ev_callbacks_efx_rx = {
 
 static const efx_ev_callbacks_t sfc_ev_callbacks_dp_rx = {
 	.eec_initialized	= sfc_ev_initialized,
-	.eec_rx			= sfc_ev_nop_rx,
+	.eec_rx			= sfc_ev_dp_rx,
 	.eec_tx			= sfc_ev_nop_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_rxq_flush_done,
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index e86a7e9..38d17e0 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -57,10 +57,12 @@ extern "C" {
 #define SFC_KVARG_STATS_UPDATE_PERIOD_MS	"stats_update_period_ms"
 
 #define SFC_KVARG_DATAPATH_EFX		"efx"
+#define SFC_KVARG_DATAPATH_EF10		"ef10"
 
 #define SFC_KVARG_RX_DATAPATH		"rx_datapath"
 #define SFC_KVARG_VALUES_RX_DATAPATH \
-	"[" SFC_KVARG_DATAPATH_EFX "]"
+	"[" SFC_KVARG_DATAPATH_EFX "|" \
+	    SFC_KVARG_DATAPATH_EF10 "]"
 
 struct sfc_adapter;
 
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index f412376..eef4ce0 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -940,6 +940,11 @@ sfc_rx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 #endif
 
 	info.rxq_entries = rxq_info->entries;
+	info.rxq_hw_ring = rxq->mem.esm_base;
+	info.evq_entries = rxq_info->entries;
+	info.evq_hw_ring = evq->mem.esm_base;
+	info.hw_index = rxq->hw_index;
+	info.mem_bar = sa->mem_bar.esb_base;
 
 	rc = sa->dp_rx->qcreate(sa->eth_dev->data->port_id, sw_index,
 				&SFC_DEV_TO_PCI(sa->eth_dev)->addr,
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 08/13] net/sfc: factor out libefx-based Tx datapath
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
                     ` (6 preceding siblings ...)
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 07/13] net/sfc: implement EF10 native Rx datapath Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 09/13] net/sfc: make VLAN insertion a datapath-dependent feature Andrew Rybchenko
                     ` (5 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Split the control path and datapath to make the datapath substitutable
and possibly reusable with an alternative control path.

The libefx-based Tx datapath is bound to the libefx control path, but
it should be possible to use other datapaths with alternative
control path(s).

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst  |   8 ++
 drivers/net/sfc/sfc.h        |   1 +
 drivers/net/sfc/sfc_dp.c     |   4 +-
 drivers/net/sfc/sfc_dp.h     |   1 +
 drivers/net/sfc/sfc_dp_tx.h  | 148 ++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c |  41 +++++-
 drivers/net/sfc/sfc_ev.c     |  48 +++++--
 drivers/net/sfc/sfc_ev.h     |   8 +-
 drivers/net/sfc/sfc_kvargs.c |   1 +
 drivers/net/sfc/sfc_kvargs.h |   4 +
 drivers/net/sfc/sfc_tso.c    |  22 +--
 drivers/net/sfc/sfc_tx.c     | 320 +++++++++++++++++++++++++++++++------------
 drivers/net/sfc/sfc_tx.h     |  98 +++++++++----
 13 files changed, 562 insertions(+), 142 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_dp_tx.h

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 43bd9f1..94c4d07 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -236,6 +236,14 @@ boolean parameters value.
   more efficient than libefx-based and provides richer packet type
   classification, but lacks Rx scatter support.
 
+- ``tx_datapath`` [auto|efx] (default **auto**)
+
+  Choose transmit datapath implementation.
+  **auto** allows the driver itself to make a choice based on firmware
+  features available and required by the datapath implementation.
+  **efx** chooses the libefx-based datapath which supports VLAN insertion
+  (full-feature firmware variant only), TSO and multi-segment mbufs.
+
 - ``perf_profile`` [auto|throughput|low-latency] (default **throughput**)
 
   Choose hardware tuning to be optimized for either throughput or
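With the kvargs documented above, a datapath can be pinned at startup via
PCI whitelist device arguments; a hypothetical invocation (the PCI address
and application are illustrative, not taken from this series):

```shell
# Force the libefx-based Tx datapath (and, per the Rx patch earlier in
# the series, the EF10 Rx datapath) on a hypothetical device.
testpmd -w 0000:01:00.0,rx_datapath=ef10,tx_datapath=efx -- -i
```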
diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 02c97d1..c2961ea 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -228,6 +228,7 @@ struct sfc_adapter {
 #endif
 
 	const struct sfc_dp_rx		*dp_rx;
+	const struct sfc_dp_tx		*dp_tx;
 };
 
 /*
diff --git a/drivers/net/sfc/sfc_dp.c b/drivers/net/sfc/sfc_dp.c
index b52b2ee..860aa92 100644
--- a/drivers/net/sfc/sfc_dp.c
+++ b/drivers/net/sfc/sfc_dp.c
@@ -87,7 +87,9 @@ sfc_dp_register(struct sfc_dp_list *head, struct sfc_dp *entry)
 	if (sfc_dp_find_by_name(head, entry->type, entry->name) != NULL) {
 		rte_log(RTE_LOG_ERR, RTE_LOGTYPE_PMD,
 			"sfc %s datapath '%s' already registered\n",
-			entry->type == SFC_DP_RX ? "Rx" : "unknown",
+			entry->type == SFC_DP_RX ? "Rx" :
+			entry->type == SFC_DP_TX ? "Tx" :
+			"unknown",
 			entry->name);
 		return EEXIST;
 	}
diff --git a/drivers/net/sfc/sfc_dp.h b/drivers/net/sfc/sfc_dp.h
index f83ea58..eff0aa8 100644
--- a/drivers/net/sfc/sfc_dp.h
+++ b/drivers/net/sfc/sfc_dp.h
@@ -56,6 +56,7 @@ typedef void (sfc_dp_exception_t)(void *ctrl);
 
 enum sfc_dp_type {
 	SFC_DP_RX = 0,	/**< Receive datapath */
+	SFC_DP_TX,	/**< Transmit datapath */
 };
 
 
diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
new file mode 100644
index 0000000..1f922e5
--- /dev/null
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -0,0 +1,148 @@
+/*-
+ *   BSD LICENSE
+ *
+ * Copyright (c) 2016 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _SFC_DP_TX_H
+#define _SFC_DP_TX_H
+
+#include <rte_ethdev.h>
+
+#include "sfc_dp.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Generic transmit queue information used on data path.
+ * It must be kept as small as possible since it is built into
+ * the structure used on datapath.
+ */
+struct sfc_dp_txq {
+	struct sfc_dp_queue	dpq;
+};
+
+/**
+ * Datapath transmit queue creation information.
+ *
+ * The structure is used just to pass information from control path to
+ * datapath. It could be just function arguments, but that would be
+ * hardly readable.
+ */
+struct sfc_dp_tx_qcreate_info {
+	/** Minimum number of unused Tx descriptors to do reap */
+	unsigned int		free_thresh;
+	/** Transmit queue configuration flags */
+	unsigned int		flags;
+	/** Tx queue size */
+	unsigned int		txq_entries;
+	/** Maximum size of data in the DMA descriptor */
+	uint16_t		dma_desc_size_max;
+};
+
+/**
+ * Allocate and initialize datapath transmit queue.
+ *
+ * @param port_id	The port identifier
+ * @param queue_id	The queue identifier
+ * @param pci_addr	PCI function address
+ * @param socket_id	Socket identifier to allocate memory
+ * @param info		Tx queue details wrapped in structure
+ * @param dp_txqp	Location for generic datapath transmit queue pointer
+ *
+ * @return 0 or positive errno.
+ */
+typedef int (sfc_dp_tx_qcreate_t)(uint16_t port_id, uint16_t queue_id,
+				  const struct rte_pci_addr *pci_addr,
+				  int socket_id,
+				  const struct sfc_dp_tx_qcreate_info *info,
+				  struct sfc_dp_txq **dp_txqp);
+
+/**
+ * Free resources allocated for datapath transmit queue.
+ */
+typedef void (sfc_dp_tx_qdestroy_t)(struct sfc_dp_txq *dp_txq);
+
+/**
+ * Transmit queue start callback.
+ *
+ * It hands over the EvQ to the datapath.
+ */
+typedef int (sfc_dp_tx_qstart_t)(struct sfc_dp_txq *dp_txq,
+				 unsigned int evq_read_ptr,
+				 unsigned int txq_desc_index);
+
+/**
+ * Transmit queue stop function called before the queue flush.
+ *
+ * It returns the EvQ to the control path.
+ */
+typedef void (sfc_dp_tx_qstop_t)(struct sfc_dp_txq *dp_txq,
+				 unsigned int *evq_read_ptr);
+
+/**
+ * Transmit queue function called after the queue flush.
+ */
+typedef void (sfc_dp_tx_qreap_t)(struct sfc_dp_txq *dp_txq);
+
+/** Transmit datapath definition */
+struct sfc_dp_tx {
+	struct sfc_dp			dp;
+
+	sfc_dp_tx_qcreate_t		*qcreate;
+	sfc_dp_tx_qdestroy_t		*qdestroy;
+	sfc_dp_tx_qstart_t		*qstart;
+	sfc_dp_tx_qstop_t		*qstop;
+	sfc_dp_tx_qreap_t		*qreap;
+	eth_tx_burst_t			pkt_burst;
+};
+
+static inline struct sfc_dp_tx *
+sfc_dp_find_tx_by_name(struct sfc_dp_list *head, const char *name)
+{
+	struct sfc_dp *p = sfc_dp_find_by_name(head, SFC_DP_TX, name);
+
+	return (p == NULL) ? NULL : container_of(p, struct sfc_dp_tx, dp);
+}
+
+static inline struct sfc_dp_tx *
+sfc_dp_find_tx_by_caps(struct sfc_dp_list *head, unsigned int avail_caps)
+{
+	struct sfc_dp *p = sfc_dp_find_by_caps(head, SFC_DP_TX, avail_caps);
+
+	return (p == NULL) ? NULL : container_of(p, struct sfc_dp_tx, dp);
+}
+
+extern struct sfc_dp_tx sfc_efx_tx;
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* _SFC_DP_TX_H */
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 010641b..d1ef269 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -467,7 +467,7 @@ sfc_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 	if (rc != 0)
 		goto fail_tx_qinit;
 
-	dev->data->tx_queues[tx_queue_id] = sa->txq_info[tx_queue_id].txq;
+	dev->data->tx_queues[tx_queue_id] = sa->txq_info[tx_queue_id].txq->dp;
 
 	sfc_adapter_unlock(sa);
 	return 0;
@@ -1361,6 +1361,7 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 	struct sfc_adapter *sa = dev->data->dev_private;
 	unsigned int avail_caps = 0;
 	const char *rx_name = NULL;
+	const char *tx_name = NULL;
 	int rc;
 
 	if (sa == NULL || sa->state == SFC_ADAPTER_UNINITIALIZED)
@@ -1408,12 +1409,45 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 
 	dev->rx_pkt_burst = sa->dp_rx->pkt_burst;
 
-	dev->tx_pkt_burst = sfc_xmit_pkts;
+	rc = sfc_kvargs_process(sa, SFC_KVARG_TX_DATAPATH,
+				sfc_kvarg_string_handler, &tx_name);
+	if (rc != 0)
+		goto fail_kvarg_tx_datapath;
+
+	if (tx_name != NULL) {
+		sa->dp_tx = sfc_dp_find_tx_by_name(&sfc_dp_head, tx_name);
+		if (sa->dp_tx == NULL) {
+			sfc_err(sa, "Tx datapath %s not found", tx_name);
+			rc = ENOENT;
+			goto fail_dp_tx;
+		}
+		if (!sfc_dp_match_hw_fw_caps(&sa->dp_tx->dp, avail_caps)) {
+			sfc_err(sa,
+				"Insufficient HW/FW capabilities to use Tx datapath %s",
+				tx_name);
+			rc = EINVAL;
+			goto fail_dp_tx;
+		}
+	} else {
+		sa->dp_tx = sfc_dp_find_tx_by_caps(&sfc_dp_head, avail_caps);
+		if (sa->dp_tx == NULL) {
+			sfc_err(sa, "Tx datapath by caps %#x not found",
+				avail_caps);
+			rc = ENOENT;
+			goto fail_dp_tx;
+		}
+	}
+
+	sfc_info(sa, "use %s Tx datapath", sa->dp_tx->dp.name);
+
+	dev->tx_pkt_burst = sa->dp_tx->pkt_burst;
 
 	dev->dev_ops = &sfc_eth_dev_ops;
 
 	return 0;
 
+fail_dp_tx:
+fail_kvarg_tx_datapath:
 fail_dp_rx:
 fail_kvarg_rx_datapath:
 	return rc;
@@ -1427,6 +1461,8 @@ sfc_register_dp(void)
 		/* Prefer EF10 datapath */
 		sfc_dp_register(&sfc_dp_head, &sfc_ef10_rx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_efx_rx.dp);
+
+		sfc_dp_register(&sfc_dp_head, &sfc_efx_tx.dp);
 	}
 }
 
@@ -1563,6 +1599,7 @@ RTE_PMD_REGISTER_PCI_TABLE(net_sfc_efx, pci_id_sfc_efx_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_sfc_efx, "* igb_uio | uio_pci_generic | vfio");
 RTE_PMD_REGISTER_PARAM_STRING(net_sfc_efx,
 	SFC_KVARG_RX_DATAPATH "=" SFC_KVARG_VALUES_RX_DATAPATH " "
+	SFC_KVARG_TX_DATAPATH "=" SFC_KVARG_VALUES_TX_DATAPATH " "
 	SFC_KVARG_PERF_PROFILE "=" SFC_KVARG_VALUES_PERF_PROFILE " "
 	SFC_KVARG_STATS_UPDATE_PERIOD_MS "=<long> "
 	SFC_KVARG_MCDI_LOGGING "=" SFC_KVARG_VALUES_BOOL " "
diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index 84b2fd1..2f96fb8 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -183,16 +183,18 @@ static boolean_t
 sfc_ev_tx(void *arg, __rte_unused uint32_t label, uint32_t id)
 {
 	struct sfc_evq *evq = arg;
-	struct sfc_txq *txq;
+	struct sfc_dp_txq *dp_txq;
+	struct sfc_efx_txq *txq;
 	unsigned int stop;
 	unsigned int delta;
 
-	txq = evq->txq;
+	dp_txq = evq->dp_txq;
+	SFC_ASSERT(dp_txq != NULL);
 
-	SFC_ASSERT(txq != NULL);
+	txq = sfc_efx_txq_by_dp_txq(dp_txq);
 	SFC_ASSERT(txq->evq == evq);
 
-	if (unlikely((txq->state & SFC_TXQ_STARTED) == 0))
+	if (unlikely((txq->flags & SFC_EFX_TXQ_FLAG_STARTED) == 0))
 		goto done;
 
 	stop = (id + 1) & txq->ptr_mask;
@@ -305,9 +307,13 @@ static boolean_t
 sfc_ev_txq_flush_done(void *arg, __rte_unused uint32_t txq_hw_index)
 {
 	struct sfc_evq *evq = arg;
+	struct sfc_dp_txq *dp_txq;
 	struct sfc_txq *txq;
 
-	txq = evq->txq;
+	dp_txq = evq->dp_txq;
+	SFC_ASSERT(dp_txq != NULL);
+
+	txq = sfc_txq_by_dp_txq(dp_txq);
 	SFC_ASSERT(txq != NULL);
 	SFC_ASSERT(txq->hw_index == txq_hw_index);
 	SFC_ASSERT(txq->evq == evq);
@@ -441,7 +447,7 @@ static const efx_ev_callbacks_t sfc_ev_callbacks_dp_rx = {
 	.eec_link_change	= sfc_ev_nop_link_change,
 };
 
-static const efx_ev_callbacks_t sfc_ev_callbacks_tx = {
+static const efx_ev_callbacks_t sfc_ev_callbacks_efx_tx = {
 	.eec_initialized	= sfc_ev_initialized,
 	.eec_rx			= sfc_ev_nop_rx,
 	.eec_tx			= sfc_ev_tx,
@@ -456,6 +462,21 @@ static const efx_ev_callbacks_t sfc_ev_callbacks_tx = {
 	.eec_link_change	= sfc_ev_nop_link_change,
 };
 
+static const efx_ev_callbacks_t sfc_ev_callbacks_dp_tx = {
+	.eec_initialized	= sfc_ev_initialized,
+	.eec_rx			= sfc_ev_nop_rx,
+	.eec_tx			= sfc_ev_nop_tx,
+	.eec_exception		= sfc_ev_exception,
+	.eec_rxq_flush_done	= sfc_ev_nop_rxq_flush_done,
+	.eec_rxq_flush_failed	= sfc_ev_nop_rxq_flush_failed,
+	.eec_txq_flush_done	= sfc_ev_txq_flush_done,
+	.eec_software		= sfc_ev_software,
+	.eec_sram		= sfc_ev_sram,
+	.eec_wake_up		= sfc_ev_wake_up,
+	.eec_timer		= sfc_ev_timer,
+	.eec_link_change	= sfc_ev_nop_link_change,
+};
+
 
 void
 sfc_ev_qpoll(struct sfc_evq *evq)
@@ -487,8 +508,10 @@ sfc_ev_qpoll(struct sfc_evq *evq)
 					rxq_sw_index);
 		}
 
-		if (evq->txq != NULL) {
-			unsigned int txq_sw_index = sfc_txq_sw_index(evq->txq);
+		if (evq->dp_txq != NULL) {
+			unsigned int txq_sw_index;
+
+			txq_sw_index = evq->dp_txq->dpq.queue_id;
 
 			sfc_warn(sa,
 				 "restart TxQ %u because of exception on its EvQ %u",
@@ -558,14 +581,17 @@ sfc_ev_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 	if (rc != 0)
 		goto fail_ev_qcreate;
 
-	SFC_ASSERT(evq->dp_rxq == NULL || evq->txq == NULL);
+	SFC_ASSERT(evq->dp_rxq == NULL || evq->dp_txq == NULL);
 	if (evq->dp_rxq != 0) {
 		if (strcmp(sa->dp_rx->dp.name, SFC_KVARG_DATAPATH_EFX) == 0)
 			evq->callbacks = &sfc_ev_callbacks_efx_rx;
 		else
 			evq->callbacks = &sfc_ev_callbacks_dp_rx;
-	} else if (evq->txq != 0) {
-		evq->callbacks = &sfc_ev_callbacks_tx;
+	} else if (evq->dp_txq != 0) {
+		if (strcmp(sa->dp_tx->dp.name, SFC_KVARG_DATAPATH_EFX) == 0)
+			evq->callbacks = &sfc_ev_callbacks_efx_tx;
+		else
+			evq->callbacks = &sfc_ev_callbacks_dp_tx;
 	} else {
 		evq->callbacks = &sfc_ev_callbacks;
 	}
diff --git a/drivers/net/sfc/sfc_ev.h b/drivers/net/sfc/sfc_ev.h
index 760df98..e8d3090 100644
--- a/drivers/net/sfc/sfc_ev.h
+++ b/drivers/net/sfc/sfc_ev.h
@@ -32,8 +32,12 @@
 #ifndef _SFC_EV_H_
 #define _SFC_EV_H_
 
+#include <rte_ethdev.h>
+
 #include "efx.h"
 
+#include "sfc.h"
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -43,7 +47,7 @@ extern "C" {
 
 struct sfc_adapter;
 struct sfc_dp_rxq;
-struct sfc_txq;
+struct sfc_dp_txq;
 
 enum sfc_evq_state {
 	SFC_EVQ_UNINITIALIZED = 0,
@@ -62,7 +66,7 @@ struct sfc_evq {
 	boolean_t			exception;
 	efsys_mem_t			mem;
 	struct sfc_dp_rxq		*dp_rxq;
-	struct sfc_txq			*txq;
+	struct sfc_dp_txq		*dp_txq;
 
 	/* Not used on datapath */
 	struct sfc_adapter		*sa;
diff --git a/drivers/net/sfc/sfc_kvargs.c b/drivers/net/sfc/sfc_kvargs.c
index 01dff4c..7bcd595 100644
--- a/drivers/net/sfc/sfc_kvargs.c
+++ b/drivers/net/sfc/sfc_kvargs.c
@@ -49,6 +49,7 @@ sfc_kvargs_parse(struct sfc_adapter *sa)
 		SFC_KVARG_MCDI_LOGGING,
 		SFC_KVARG_PERF_PROFILE,
 		SFC_KVARG_RX_DATAPATH,
+		SFC_KVARG_TX_DATAPATH,
 		NULL,
 	};
 
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index 38d17e0..a3a08b7 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -64,6 +64,10 @@ extern "C" {
 	"[" SFC_KVARG_DATAPATH_EFX "|" \
 	    SFC_KVARG_DATAPATH_EF10 "]"
 
+#define SFC_KVARG_TX_DATAPATH		"tx_datapath"
+#define SFC_KVARG_VALUES_TX_DATAPATH \
+	"[" SFC_KVARG_DATAPATH_EFX "]"
+
 struct sfc_adapter;
 
 int sfc_kvargs_parse(struct sfc_adapter *sa);
diff --git a/drivers/net/sfc/sfc_tso.c b/drivers/net/sfc/sfc_tso.c
index 271861f..fb79d74 100644
--- a/drivers/net/sfc/sfc_tso.c
+++ b/drivers/net/sfc/sfc_tso.c
@@ -44,13 +44,13 @@
 #define SFC_TSO_OPDESCS_IDX_SHIFT	2
 
 int
-sfc_tso_alloc_tsoh_objs(struct sfc_tx_sw_desc *sw_ring,
-			unsigned int txq_entries, unsigned int socket_id)
+sfc_efx_tso_alloc_tsoh_objs(struct sfc_efx_tx_sw_desc *sw_ring,
+			    unsigned int txq_entries, unsigned int socket_id)
 {
 	unsigned int i;
 
 	for (i = 0; i < txq_entries; ++i) {
-		sw_ring[i].tsoh = rte_malloc_socket("sfc-txq-tsoh-obj",
+		sw_ring[i].tsoh = rte_malloc_socket("sfc-efx-txq-tsoh-obj",
 						    SFC_TSOH_STD_LEN,
 						    RTE_CACHE_LINE_SIZE,
 						    socket_id);
@@ -68,7 +68,8 @@ sfc_tso_alloc_tsoh_objs(struct sfc_tx_sw_desc *sw_ring,
 }
 
 void
-sfc_tso_free_tsoh_objs(struct sfc_tx_sw_desc *sw_ring, unsigned int txq_entries)
+sfc_efx_tso_free_tsoh_objs(struct sfc_efx_tx_sw_desc *sw_ring,
+			   unsigned int txq_entries)
 {
 	unsigned int i;
 
@@ -79,8 +80,8 @@ sfc_tso_free_tsoh_objs(struct sfc_tx_sw_desc *sw_ring, unsigned int txq_entries)
 }
 
 static void
-sfc_tso_prepare_header(struct sfc_txq *txq, struct rte_mbuf **in_seg,
-		       size_t *in_off, unsigned int idx, size_t bytes_left)
+sfc_efx_tso_prepare_header(struct sfc_efx_txq *txq, struct rte_mbuf **in_seg,
+			   size_t *in_off, unsigned int idx, size_t bytes_left)
 {
 	struct rte_mbuf *m = *in_seg;
 	size_t bytes_to_copy = 0;
@@ -111,9 +112,9 @@ sfc_tso_prepare_header(struct sfc_txq *txq, struct rte_mbuf **in_seg,
 }
 
 int
-sfc_tso_do(struct sfc_txq *txq, unsigned int idx, struct rte_mbuf **in_seg,
-	   size_t *in_off, efx_desc_t **pend, unsigned int *pkt_descs,
-	   size_t *pkt_len)
+sfc_efx_tso_do(struct sfc_efx_txq *txq, unsigned int idx,
+	       struct rte_mbuf **in_seg, size_t *in_off, efx_desc_t **pend,
+	       unsigned int *pkt_descs, size_t *pkt_len)
 {
 	uint8_t *tsoh;
 	const struct tcp_hdr *th;
@@ -150,7 +151,8 @@ sfc_tso_do(struct sfc_txq *txq, unsigned int idx, struct rte_mbuf **in_seg,
 	 * limitations on address boundaries crossing by DMA descriptor data.
 	 */
 	if (m->data_len < header_len) {
-		sfc_tso_prepare_header(txq, in_seg, in_off, idx, header_len);
+		sfc_efx_tso_prepare_header(txq, in_seg, in_off, idx,
+					   header_len);
 		tsoh = txq->sw_ring[idx & txq->ptr_mask].tsoh;
 
 		header_paddr = rte_malloc_virt2phy((void *)tsoh);
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index 6131a49..442d16c 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -35,6 +35,7 @@
 #include "sfc_ev.h"
 #include "sfc_tx.h"
 #include "sfc_tweak.h"
+#include "sfc_kvargs.h"
 
 /*
  * Maximum number of TX queue flush attempts in case of
@@ -111,29 +112,6 @@ sfc_tx_qflush_done(struct sfc_txq *txq)
 	txq->state &= ~SFC_TXQ_FLUSHING;
 }
 
-static void
-sfc_tx_reap(struct sfc_txq *txq)
-{
-	unsigned int    completed;
-
-
-	sfc_ev_qpoll(txq->evq);
-
-	for (completed = txq->completed;
-	     completed != txq->pending; completed++) {
-		struct sfc_tx_sw_desc *txd;
-
-		txd = &txq->sw_ring[completed & txq->ptr_mask];
-
-		if (txd->mbuf != NULL) {
-			rte_pktmbuf_free(txd->mbuf);
-			txd->mbuf = NULL;
-		}
-	}
-
-	txq->completed = completed;
-}
-
 int
 sfc_tx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 	     uint16_t nb_tx_desc, unsigned int socket_id,
@@ -145,6 +123,7 @@ sfc_tx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 	struct sfc_txq *txq;
 	unsigned int evq_index = sfc_evq_index_by_txq_sw_index(sa, sw_index);
 	int rc = 0;
+	struct sfc_dp_tx_qcreate_info info;
 
 	sfc_log_init(sa, "TxQ = %u", sw_index);
 
@@ -169,57 +148,45 @@ sfc_tx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 	if (txq == NULL)
 		goto fail_txq_alloc;
 
+	txq_info->txq = txq;
+
+	txq->hw_index = sw_index;
+	txq->evq = evq;
+	txq->free_thresh =
+		(tx_conf->tx_free_thresh) ? tx_conf->tx_free_thresh :
+		SFC_TX_DEFAULT_FREE_THRESH;
+	txq->flags = tx_conf->txq_flags;
+
 	rc = sfc_dma_alloc(sa, "txq", sw_index, EFX_TXQ_SIZE(txq_info->entries),
 			   socket_id, &txq->mem);
 	if (rc != 0)
 		goto fail_dma_alloc;
 
-	rc = ENOMEM;
-	txq->pend_desc = rte_calloc_socket("sfc-txq-pend-desc",
-					   EFX_TXQ_LIMIT(txq_info->entries),
-					   sizeof(efx_desc_t), 0, socket_id);
-	if (txq->pend_desc == NULL)
-		goto fail_pend_desc_alloc;
+	memset(&info, 0, sizeof(info));
+	info.free_thresh = txq->free_thresh;
+	info.flags = tx_conf->txq_flags;
+	info.txq_entries = txq_info->entries;
+	info.dma_desc_size_max = encp->enc_tx_dma_desc_size_max;
 
-	rc = ENOMEM;
-	txq->sw_ring = rte_calloc_socket("sfc-txq-desc", txq_info->entries,
-					 sizeof(*txq->sw_ring), 0, socket_id);
-	if (txq->sw_ring == NULL)
-		goto fail_desc_alloc;
+	rc = sa->dp_tx->qcreate(sa->eth_dev->data->port_id, sw_index,
+				&SFC_DEV_TO_PCI(sa->eth_dev)->addr,
+				socket_id, &info, &txq->dp);
+	if (rc != 0)
+		goto fail_dp_tx_qinit;
 
-	if (sa->tso) {
-		rc = sfc_tso_alloc_tsoh_objs(txq->sw_ring, txq_info->entries,
-					     socket_id);
-		if (rc != 0)
-			goto fail_alloc_tsoh_objs;
-	}
+	evq->dp_txq = txq->dp;
 
 	txq->state = SFC_TXQ_INITIALIZED;
-	txq->ptr_mask = txq_info->entries - 1;
-	txq->free_thresh = (tx_conf->tx_free_thresh) ? tx_conf->tx_free_thresh :
-						     SFC_TX_DEFAULT_FREE_THRESH;
-	txq->dma_desc_size_max = encp->enc_tx_dma_desc_size_max;
-	txq->hw_index = sw_index;
-	txq->flags = tx_conf->txq_flags;
-	txq->evq = evq;
 
-	evq->txq = txq;
-
-	txq_info->txq = txq;
 	txq_info->deferred_start = (tx_conf->tx_deferred_start != 0);
 
 	return 0;
 
-fail_alloc_tsoh_objs:
-	rte_free(txq->sw_ring);
-
-fail_desc_alloc:
-	rte_free(txq->pend_desc);
-
-fail_pend_desc_alloc:
+fail_dp_tx_qinit:
 	sfc_dma_free(sa, &txq->mem);
 
 fail_dma_alloc:
+	txq_info->txq = NULL;
 	rte_free(txq);
 
 fail_txq_alloc:
@@ -248,13 +215,12 @@ sfc_tx_qfini(struct sfc_adapter *sa, unsigned int sw_index)
 	SFC_ASSERT(txq != NULL);
 	SFC_ASSERT(txq->state == SFC_TXQ_INITIALIZED);
 
-	sfc_tso_free_tsoh_objs(txq->sw_ring, txq_info->entries);
+	sa->dp_tx->qdestroy(txq->dp);
+	txq->dp = NULL;
 
 	txq_info->txq = NULL;
 	txq_info->entries = 0;
 
-	rte_free(txq->sw_ring);
-	rte_free(txq->pend_desc);
 	sfc_dma_free(sa, &txq->mem);
 	rte_free(txq);
 }
@@ -421,12 +387,13 @@ sfc_tx_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 		goto fail_tx_qcreate;
 	}
 
-	txq->added = txq->pending = txq->completed = desc_index;
-	txq->hw_vlan_tci = 0;
-
 	efx_tx_qenable(txq->common);
 
-	txq->state |= (SFC_TXQ_STARTED | SFC_TXQ_RUNNING);
+	txq->state |= SFC_TXQ_STARTED;
+
+	rc = sa->dp_tx->qstart(txq->dp, evq->read_ptr, desc_index);
+	if (rc != 0)
+		goto fail_dp_qstart;
 
 	/*
 	 * It seems to be used by DPDK for debug purposes only ('rte_ether')
@@ -436,6 +403,10 @@ sfc_tx_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 
 	return 0;
 
+fail_dp_qstart:
+	txq->state = SFC_TXQ_INITIALIZED;
+	efx_tx_qdestroy(txq->common);
+
 fail_tx_qcreate:
 	sfc_ev_qstop(sa, evq->evq_index);
 
@@ -451,7 +422,6 @@ sfc_tx_qstop(struct sfc_adapter *sa, unsigned int sw_index)
 	struct sfc_txq *txq;
 	unsigned int retry_count;
 	unsigned int wait_count;
-	unsigned int txds;
 
 	sfc_log_init(sa, "TxQ = %u", sw_index);
 
@@ -465,7 +435,7 @@ sfc_tx_qstop(struct sfc_adapter *sa, unsigned int sw_index)
 
 	SFC_ASSERT(txq->state & SFC_TXQ_STARTED);
 
-	txq->state &= ~SFC_TXQ_RUNNING;
+	sa->dp_tx->qstop(txq->dp, &txq->evq->read_ptr);
 
 	/*
 	 * Retry TX queue flushing in case of flush failed or
@@ -500,14 +470,7 @@ sfc_tx_qstop(struct sfc_adapter *sa, unsigned int sw_index)
 			sfc_info(sa, "TxQ %u flushed", sw_index);
 	}
 
-	sfc_tx_reap(txq);
-
-	for (txds = 0; txds < txq_info->entries; txds++) {
-		if (txq->sw_ring[txds].mbuf != NULL) {
-			rte_pktmbuf_free(txq->sw_ring[txds].mbuf);
-			txq->sw_ring[txds].mbuf = NULL;
-		}
-	}
+	sa->dp_tx->qreap(txq->dp);
 
 	txq->state = SFC_TXQ_INITIALIZED;
 
@@ -579,6 +542,28 @@ sfc_tx_stop(struct sfc_adapter *sa)
 	efx_tx_fini(sa->nic);
 }
 
+static void
+sfc_efx_tx_reap(struct sfc_efx_txq *txq)
+{
+	unsigned int completed;
+
+	sfc_ev_qpoll(txq->evq);
+
+	for (completed = txq->completed;
+	     completed != txq->pending; completed++) {
+		struct sfc_efx_tx_sw_desc *txd;
+
+		txd = &txq->sw_ring[completed & txq->ptr_mask];
+
+		if (txd->mbuf != NULL) {
+			rte_pktmbuf_free(txd->mbuf);
+			txd->mbuf = NULL;
+		}
+	}
+
+	txq->completed = completed;
+}
+
 /*
  * The function is used to insert or update VLAN tag;
  * the firmware has state of the firmware tag to insert per TxQ
@@ -587,8 +572,8 @@ sfc_tx_stop(struct sfc_adapter *sa)
  * the function will update it
  */
 static unsigned int
-sfc_tx_maybe_insert_tag(struct sfc_txq *txq, struct rte_mbuf *m,
-			efx_desc_t **pend)
+sfc_efx_tx_maybe_insert_tag(struct sfc_efx_txq *txq, struct rte_mbuf *m,
+			    efx_desc_t **pend)
 {
 	uint16_t this_tag = ((m->ol_flags & PKT_TX_VLAN_PKT) ?
 			     m->vlan_tci : 0);
@@ -610,10 +595,11 @@ sfc_tx_maybe_insert_tag(struct sfc_txq *txq, struct rte_mbuf *m,
 	return 1;
 }
 
-uint16_t
-sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+static uint16_t
+sfc_efx_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
-	struct sfc_txq *txq = (struct sfc_txq *)tx_queue;
+	struct sfc_dp_txq *dp_txq = (struct sfc_dp_txq *)tx_queue;
+	struct sfc_efx_txq *txq = sfc_efx_txq_by_dp_txq(dp_txq);
 	unsigned int added = txq->added;
 	unsigned int pushed = added;
 	unsigned int pkts_sent = 0;
@@ -625,7 +611,7 @@ sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	int rc __rte_unused;
 	struct rte_mbuf **pktp;
 
-	if (unlikely((txq->state & SFC_TXQ_RUNNING) == 0))
+	if (unlikely((txq->flags & SFC_EFX_TXQ_FLAG_RUNNING) == 0))
 		goto done;
 
 	/*
@@ -636,7 +622,7 @@ sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	reap_done = (fill_level > soft_max_fill);
 
 	if (reap_done) {
-		sfc_tx_reap(txq);
+		sfc_efx_tx_reap(txq);
 		/*
 		 * Recalculate fill level since 'txq->completed'
 		 * might have changed on reap
@@ -659,15 +645,16 @@ sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		 * DEV_TX_VLAN_OFFLOAD and pushes VLAN TCI, then
 		 * TX_ERROR will occur
 		 */
-		pkt_descs += sfc_tx_maybe_insert_tag(txq, m_seg, &pend);
+		pkt_descs += sfc_efx_tx_maybe_insert_tag(txq, m_seg, &pend);
 
+#ifdef RTE_LIBRTE_SFC_EFX_TSO
 		if (m_seg->ol_flags & PKT_TX_TCP_SEG) {
 			/*
 			 * We expect correct 'pkt->l[2, 3, 4]_len' values
 			 * to be set correctly by the caller
 			 */
-			if (sfc_tso_do(txq, added, &m_seg, &in_off, &pend,
-				       &pkt_descs, &pkt_len) != 0) {
+			if (sfc_efx_tso_do(txq, added, &m_seg, &in_off, &pend,
+					   &pkt_descs, &pkt_len) != 0) {
 				/* We may have reached this place for
 				 * one of the following reasons:
 				 *
@@ -698,6 +685,7 @@ sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 			 * as for the usual non-TSO path
 			 */
 		}
+#endif /* RTE_LIBRTE_SFC_EFX_TSO */
 
 		for (; m_seg != NULL; m_seg = m_seg->next) {
 			efsys_dma_addr_t	next_frag;
@@ -749,7 +737,7 @@ sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 			 * Try to reap (if we haven't yet).
 			 */
 			if (!reap_done) {
-				sfc_tx_reap(txq);
+				sfc_efx_tx_reap(txq);
 				reap_done = B_TRUE;
 				fill_level = added - txq->completed;
 				if (fill_level > hard_max_fill) {
@@ -778,9 +766,169 @@ sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 
 #if SFC_TX_XMIT_PKTS_REAP_AT_LEAST_ONCE
 	if (!reap_done)
-		sfc_tx_reap(txq);
+		sfc_efx_tx_reap(txq);
 #endif
 
 done:
 	return pkts_sent;
 }
+
+struct sfc_txq *
+sfc_txq_by_dp_txq(const struct sfc_dp_txq *dp_txq)
+{
+	const struct sfc_dp_queue *dpq = &dp_txq->dpq;
+	struct rte_eth_dev *eth_dev;
+	struct sfc_adapter *sa;
+	struct sfc_txq *txq;
+
+	SFC_ASSERT(rte_eth_dev_is_valid_port(dpq->port_id));
+	eth_dev = &rte_eth_devices[dpq->port_id];
+
+	sa = eth_dev->data->dev_private;
+
+	SFC_ASSERT(dpq->queue_id < sa->txq_count);
+	txq = sa->txq_info[dpq->queue_id].txq;
+
+	SFC_ASSERT(txq != NULL);
+	return txq;
+}
+
+static sfc_dp_tx_qcreate_t sfc_efx_tx_qcreate;
+static int
+sfc_efx_tx_qcreate(uint16_t port_id, uint16_t queue_id,
+		   const struct rte_pci_addr *pci_addr,
+		   int socket_id,
+		   const struct sfc_dp_tx_qcreate_info *info,
+		   struct sfc_dp_txq **dp_txqp)
+{
+	struct sfc_efx_txq *txq;
+	struct sfc_txq *ctrl_txq;
+	int rc;
+
+	rc = ENOMEM;
+	txq = rte_zmalloc_socket("sfc-efx-txq", sizeof(*txq),
+				 RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq == NULL)
+		goto fail_txq_alloc;
+
+	sfc_dp_queue_init(&txq->dp.dpq, port_id, queue_id, pci_addr);
+
+	rc = ENOMEM;
+	txq->pend_desc = rte_calloc_socket("sfc-efx-txq-pend-desc",
+					   EFX_TXQ_LIMIT(info->txq_entries),
+					   sizeof(*txq->pend_desc), 0,
+					   socket_id);
+	if (txq->pend_desc == NULL)
+		goto fail_pend_desc_alloc;
+
+	rc = ENOMEM;
+	txq->sw_ring = rte_calloc_socket("sfc-efx-txq-sw_ring",
+					 info->txq_entries,
+					 sizeof(*txq->sw_ring),
+					 RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq->sw_ring == NULL)
+		goto fail_sw_ring_alloc;
+
+	ctrl_txq = sfc_txq_by_dp_txq(&txq->dp);
+	if (ctrl_txq->evq->sa->tso) {
+		rc = sfc_efx_tso_alloc_tsoh_objs(txq->sw_ring,
+						 info->txq_entries, socket_id);
+		if (rc != 0)
+			goto fail_alloc_tsoh_objs;
+	}
+
+	txq->evq = ctrl_txq->evq;
+	txq->ptr_mask = info->txq_entries - 1;
+	txq->free_thresh = info->free_thresh;
+	txq->dma_desc_size_max = info->dma_desc_size_max;
+
+	*dp_txqp = &txq->dp;
+	return 0;
+
+fail_alloc_tsoh_objs:
+	rte_free(txq->sw_ring);
+
+fail_sw_ring_alloc:
+	rte_free(txq->pend_desc);
+
+fail_pend_desc_alloc:
+	rte_free(txq);
+
+fail_txq_alloc:
+	return rc;
+}
+
+static sfc_dp_tx_qdestroy_t sfc_efx_tx_qdestroy;
+static void
+sfc_efx_tx_qdestroy(struct sfc_dp_txq *dp_txq)
+{
+	struct sfc_efx_txq *txq = sfc_efx_txq_by_dp_txq(dp_txq);
+
+	sfc_efx_tso_free_tsoh_objs(txq->sw_ring, txq->ptr_mask + 1);
+	rte_free(txq->sw_ring);
+	rte_free(txq->pend_desc);
+	rte_free(txq);
+}
+
+static sfc_dp_tx_qstart_t sfc_efx_tx_qstart;
+static int
+sfc_efx_tx_qstart(struct sfc_dp_txq *dp_txq,
+		  __rte_unused unsigned int evq_read_ptr,
+		  unsigned int txq_desc_index)
+{
+	/* libefx-based datapath is specific to libefx-based PMD */
+	struct sfc_efx_txq *txq = sfc_efx_txq_by_dp_txq(dp_txq);
+	struct sfc_txq *ctrl_txq = sfc_txq_by_dp_txq(dp_txq);
+
+	txq->common = ctrl_txq->common;
+
+	txq->pending = txq->completed = txq->added = txq_desc_index;
+	txq->hw_vlan_tci = 0;
+
+	txq->flags |= (SFC_EFX_TXQ_FLAG_STARTED | SFC_EFX_TXQ_FLAG_RUNNING);
+
+	return 0;
+}
+
+static sfc_dp_tx_qstop_t sfc_efx_tx_qstop;
+static void
+sfc_efx_tx_qstop(struct sfc_dp_txq *dp_txq,
+		 __rte_unused unsigned int *evq_read_ptr)
+{
+	struct sfc_efx_txq *txq = sfc_efx_txq_by_dp_txq(dp_txq);
+
+	txq->flags &= ~SFC_EFX_TXQ_FLAG_RUNNING;
+}
+
+static sfc_dp_tx_qreap_t sfc_efx_tx_qreap;
+static void
+sfc_efx_tx_qreap(struct sfc_dp_txq *dp_txq)
+{
+	struct sfc_efx_txq *txq = sfc_efx_txq_by_dp_txq(dp_txq);
+	unsigned int txds;
+
+	sfc_efx_tx_reap(txq);
+
+	for (txds = 0; txds <= txq->ptr_mask; txds++) {
+		if (txq->sw_ring[txds].mbuf != NULL) {
+			rte_pktmbuf_free(txq->sw_ring[txds].mbuf);
+			txq->sw_ring[txds].mbuf = NULL;
+		}
+	}
+
+	txq->flags &= ~SFC_EFX_TXQ_FLAG_STARTED;
+}
+
+struct sfc_dp_tx sfc_efx_tx = {
+	.dp = {
+		.name		= SFC_KVARG_DATAPATH_EFX,
+		.type		= SFC_DP_TX,
+		.hw_fw_caps	= 0,
+	},
+	.qcreate		= sfc_efx_tx_qcreate,
+	.qdestroy		= sfc_efx_tx_qdestroy,
+	.qstart			= sfc_efx_tx_qstart,
+	.qstop			= sfc_efx_tx_qstop,
+	.qreap			= sfc_efx_tx_qreap,
+	.pkt_burst		= sfc_efx_xmit_pkts,
+};
diff --git a/drivers/net/sfc/sfc_tx.h b/drivers/net/sfc/sfc_tx.h
index 35b65d3..94477f7 100644
--- a/drivers/net/sfc/sfc_tx.h
+++ b/drivers/net/sfc/sfc_tx.h
@@ -37,6 +37,8 @@
 
 #include "efx.h"
 
+#include "sfc_dp_tx.h"
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -44,7 +46,11 @@ extern "C" {
 struct sfc_adapter;
 struct sfc_evq;
 
-struct sfc_tx_sw_desc {
+/**
+ * Software Tx descriptor information associated with hardware Tx
+ * descriptor.
+ */
+struct sfc_efx_tx_sw_desc {
 	struct rte_mbuf		*mbuf;
 	uint8_t			*tsoh;	/* Buffer to store TSO header */
 };
@@ -54,37 +60,71 @@ enum sfc_txq_state_bit {
 #define SFC_TXQ_INITIALIZED	(1 << SFC_TXQ_INITIALIZED_BIT)
 	SFC_TXQ_STARTED_BIT,
 #define SFC_TXQ_STARTED		(1 << SFC_TXQ_STARTED_BIT)
-	SFC_TXQ_RUNNING_BIT,
-#define SFC_TXQ_RUNNING		(1 << SFC_TXQ_RUNNING_BIT)
 	SFC_TXQ_FLUSHING_BIT,
 #define SFC_TXQ_FLUSHING	(1 << SFC_TXQ_FLUSHING_BIT)
 	SFC_TXQ_FLUSHED_BIT,
 #define SFC_TXQ_FLUSHED		(1 << SFC_TXQ_FLUSHED_BIT)
 };
 
+/**
+ * Transmit queue control information. Not used on datapath.
+ * Allocated on the socket specified at queue setup.
+ */
 struct sfc_txq {
-	struct sfc_evq		*evq;
-	struct sfc_tx_sw_desc	*sw_ring;
-	unsigned int		state;
-	unsigned int		ptr_mask;
-	efx_desc_t		*pend_desc;
-	efx_txq_t		*common;
-	efsys_mem_t		mem;
-	unsigned int		added;
-	unsigned int		pending;
-	unsigned int		completed;
-	unsigned int		free_thresh;
-	uint16_t		hw_vlan_tci;
-	uint16_t		dma_desc_size_max;
-
-	unsigned int		hw_index;
-	unsigned int		flags;
+	unsigned int			state;
+	unsigned int			hw_index;
+	struct sfc_evq			*evq;
+	efsys_mem_t			mem;
+	struct sfc_dp_txq		*dp;
+	efx_txq_t			*common;
+	unsigned int			free_thresh;
+	unsigned int			flags;
 };
 
 static inline unsigned int
+sfc_txq_sw_index_by_hw_index(unsigned int hw_index)
+{
+	return hw_index;
+}
+
+static inline unsigned int
 sfc_txq_sw_index(const struct sfc_txq *txq)
 {
-	return txq->hw_index;
+	return sfc_txq_sw_index_by_hw_index(txq->hw_index);
+}
+
+struct sfc_txq *sfc_txq_by_dp_txq(const struct sfc_dp_txq *dp_txq);
+
+/**
+ * Transmit queue information used on libefx-based data path.
+ * Allocated on the socket specified at queue setup.
+ */
+struct sfc_efx_txq {
+	struct sfc_evq			*evq;
+	struct sfc_efx_tx_sw_desc	*sw_ring;
+	unsigned int			ptr_mask;
+	efx_desc_t			*pend_desc;
+	efx_txq_t			*common;
+	unsigned int			added;
+	unsigned int			pending;
+	unsigned int			completed;
+	unsigned int			free_thresh;
+	uint16_t			hw_vlan_tci;
+	uint16_t			dma_desc_size_max;
+
+	unsigned int			hw_index;
+	unsigned int			flags;
+#define SFC_EFX_TXQ_FLAG_STARTED	0x1
+#define SFC_EFX_TXQ_FLAG_RUNNING	0x2
+
+	/* Datapath transmit queue anchor */
+	struct sfc_dp_txq		dp;
+};
+
+static inline struct sfc_efx_txq *
+sfc_efx_txq_by_dp_txq(struct sfc_dp_txq *dp_txq)
+{
+	return container_of(dp_txq, struct sfc_efx_txq, dp);
 }
 
 struct sfc_txq_info {
@@ -108,17 +148,15 @@ void sfc_tx_qstop(struct sfc_adapter *sa, unsigned int sw_index);
 int sfc_tx_start(struct sfc_adapter *sa);
 void sfc_tx_stop(struct sfc_adapter *sa);
 
-uint16_t sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
-		       uint16_t nb_pkts);
-
 /* From 'sfc_tso.c' */
-int sfc_tso_alloc_tsoh_objs(struct sfc_tx_sw_desc *sw_ring,
-			    unsigned int txq_entries, unsigned int socket_id);
-void sfc_tso_free_tsoh_objs(struct sfc_tx_sw_desc *sw_ring,
-			    unsigned int txq_entries);
-int sfc_tso_do(struct sfc_txq *txq, unsigned int idx, struct rte_mbuf **in_seg,
-	       size_t *in_off, efx_desc_t **pend, unsigned int *pkt_descs,
-	       size_t *pkt_len);
+int sfc_efx_tso_alloc_tsoh_objs(struct sfc_efx_tx_sw_desc *sw_ring,
+				unsigned int txq_entries,
+				unsigned int socket_id);
+void sfc_efx_tso_free_tsoh_objs(struct sfc_efx_tx_sw_desc *sw_ring,
+				unsigned int txq_entries);
+int sfc_efx_tso_do(struct sfc_efx_txq *txq, unsigned int idx,
+		   struct rte_mbuf **in_seg, size_t *in_off, efx_desc_t **pend,
+		   unsigned int *pkt_descs, size_t *pkt_len);
 
 #ifdef __cplusplus
 }
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 09/13] net/sfc: make VLAN insertion a datapath-dependent feature
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
                     ` (7 preceding siblings ...)
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 08/13] net/sfc: factor out libefx-based Tx datapath Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 10/13] net/sfc: make TSO " Andrew Rybchenko
                     ` (4 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_dp_tx.h  |  2 ++
 drivers/net/sfc/sfc_ethdev.c |  3 ++-
 drivers/net/sfc/sfc_tx.c     | 14 +++++++++++---
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
index 1f922e5..1a6d6c1 100644
--- a/drivers/net/sfc/sfc_dp_tx.h
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -116,6 +116,8 @@ typedef void (sfc_dp_tx_qreap_t)(struct sfc_dp_txq *dp_txq);
 struct sfc_dp_tx {
 	struct sfc_dp			dp;
 
+	unsigned int			features;
+#define SFC_DP_TX_FEAT_VLAN_INSERT	0x1
 	sfc_dp_tx_qcreate_t		*qcreate;
 	sfc_dp_tx_qdestroy_t		*qdestroy;
 	sfc_dp_tx_qstart_t		*qstart;
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index d1ef269..b064d0a 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -135,7 +135,8 @@ sfc_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 		DEV_TX_OFFLOAD_TCP_CKSUM;
 
 	dev_info->default_txconf.txq_flags = ETH_TXQ_FLAGS_NOXSUMSCTP;
-	if (!encp->enc_hw_tx_insert_vlan_enabled)
+	if ((~sa->dp_tx->features & SFC_DP_TX_FEAT_VLAN_INSERT) ||
+	    !encp->enc_hw_tx_insert_vlan_enabled)
 		dev_info->default_txconf.txq_flags |= ETH_TXQ_FLAGS_NOVLANOFFL;
 	else
 		dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_VLAN_INSERT;
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index 442d16c..225a3a6 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -84,9 +84,15 @@ sfc_tx_qcheck_conf(struct sfc_adapter *sa, uint16_t nb_tx_desc,
 		rc = EINVAL;
 	}
 
-	if (!encp->enc_hw_tx_insert_vlan_enabled &&
-	    (flags & ETH_TXQ_FLAGS_NOVLANOFFL) == 0) {
-		sfc_err(sa, "VLAN offload is not supported");
-		rc = EINVAL;
+	if ((flags & ETH_TXQ_FLAGS_NOVLANOFFL) == 0) {
+		if (!encp->enc_hw_tx_insert_vlan_enabled) {
+			sfc_err(sa, "VLAN offload is not supported");
+			rc = EINVAL;
+		} else if (~sa->dp_tx->features & SFC_DP_TX_FEAT_VLAN_INSERT) {
+			sfc_err(sa,
+				"VLAN offload is not supported by %s datapath",
+				sa->dp_tx->dp.name);
+			rc = EINVAL;
+		}
 	}
 
@@ -925,6 +932,7 @@ struct sfc_dp_tx sfc_efx_tx = {
 		.type		= SFC_DP_TX,
 		.hw_fw_caps	= 0,
 	},
+	.features		= SFC_DP_TX_FEAT_VLAN_INSERT,
 	.qcreate		= sfc_efx_tx_qcreate,
 	.qdestroy		= sfc_efx_tx_qdestroy,
 	.qstart			= sfc_efx_tx_qstart,
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 10/13] net/sfc: make TSO a datapath-dependent feature
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
                     ` (8 preceding siblings ...)
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 09/13] net/sfc: make VLAN insertion a datapath-dependent feature Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 11/13] net/sfc: implement EF10 native Tx datapath Andrew Rybchenko
                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_dp_tx.h | 1 +
 drivers/net/sfc/sfc_tx.c    | 6 +++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
index 1a6d6c1..c93932e 100644
--- a/drivers/net/sfc/sfc_dp_tx.h
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -118,6 +118,7 @@ struct sfc_dp_tx {
 
 	unsigned int			features;
 #define SFC_DP_TX_FEAT_VLAN_INSERT	0x1
+#define SFC_DP_TX_FEAT_TSO		0x2
 	sfc_dp_tx_qcreate_t		*qcreate;
 	sfc_dp_tx_qdestroy_t		*qdestroy;
 	sfc_dp_tx_qstart_t		*qstart;
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index 225a3a6..a63f746 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -294,6 +294,9 @@ sfc_tx_init(struct sfc_adapter *sa)
 		goto fail_tx_dma_desc_boundary;
 	}
 
+	if (~sa->dp_tx->features & SFC_DP_TX_FEAT_TSO)
+		sa->tso = B_FALSE;
+
 	rc = sfc_tx_check_mode(sa, &dev_conf->txmode);
 	if (rc != 0)
 		goto fail_check_mode;
@@ -932,7 +935,8 @@ struct sfc_dp_tx sfc_efx_tx = {
 		.type		= SFC_DP_TX,
 		.hw_fw_caps	= 0,
 	},
-	.features		= SFC_DP_TX_FEAT_VLAN_INSERT,
+	.features		= SFC_DP_TX_FEAT_VLAN_INSERT |
+				  SFC_DP_TX_FEAT_TSO,
 	.qcreate		= sfc_efx_tx_qcreate,
 	.qdestroy		= sfc_efx_tx_qdestroy,
 	.qstart			= sfc_efx_tx_qstart,
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 11/13] net/sfc: implement EF10 native Tx datapath
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
                     ` (9 preceding siblings ...)
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 10/13] net/sfc: make TSO " Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 12/13] net/sfc: make multi-segment support a Tx datapath feature Andrew Rybchenko
                     ` (2 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: David Riddoch <driddoch@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst   |   5 +-
 drivers/net/sfc/Makefile      |   1 +
 drivers/net/sfc/sfc_dp_tx.h   |  17 ++
 drivers/net/sfc/sfc_ef10_tx.c | 451 ++++++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c  |   1 +
 drivers/net/sfc/sfc_ev.c      |  15 +-
 drivers/net/sfc/sfc_kvargs.h  |   3 +-
 drivers/net/sfc/sfc_tx.c      |   5 +
 8 files changed, 495 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_ef10_tx.c

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 94c4d07..5c96625 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -236,13 +236,16 @@ boolean parameters value.
   more efficient than libefx-based and provides richer packet type
   classification, but lacks Rx scatter support.
 
-- ``tx_datapath`` [auto|efx] (default **auto**)
+- ``tx_datapath`` [auto|efx|ef10] (default **auto**)
 
   Choose transmit datapath implementation.
   **auto** allows the driver itself to make a choice based on firmware
   features available and required by the datapath implementation.
   **efx** chooses libefx-based datapath which supports VLAN insertion
   (full-feature firmware variant only), TSO and multi-segment mbufs.
+  **ef10** chooses the EF10 (SFN7xxx, SFN8xxx) native datapath, which
+  is more efficient than the libefx-based one but does not support
+  VLAN insertion or TSO yet.
 
 - ``perf_profile`` [auto|throughput|low-latency] (default **throughput**)
 
diff --git a/drivers/net/sfc/Makefile b/drivers/net/sfc/Makefile
index 66b9114..4ce6f9d 100644
--- a/drivers/net/sfc/Makefile
+++ b/drivers/net/sfc/Makefile
@@ -96,6 +96,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_filter.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_dp.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_ef10_rx.c
+SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_ef10_tx.c
 
 VPATH += $(SRCDIR)/base
 
diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
index c93932e..ef62378 100644
--- a/drivers/net/sfc/sfc_dp_tx.h
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -65,6 +65,16 @@ struct sfc_dp_tx_qcreate_info {
 	unsigned int		txq_entries;
 	/** Maximum size of data in the DMA descriptor */
 	uint16_t		dma_desc_size_max;
+	/** DMA-mapped Tx descriptors ring */
+	void			*txq_hw_ring;
+	/** Associated event queue size */
+	unsigned int		evq_entries;
+	/** Hardware event ring */
+	void			*evq_hw_ring;
+	/** The queue index in hardware (required to push the right doorbell) */
+	unsigned int		hw_index;
+	/** Virtual address of the memory-mapped BAR to push Tx doorbell */
+	volatile void		*mem_bar;
 };
 
 /**
@@ -108,6 +118,11 @@ typedef void (sfc_dp_tx_qstop_t)(struct sfc_dp_txq *dp_txq,
 				 unsigned int *evq_read_ptr);
 
 /**
+ * Transmit event handler used during queue flush only.
+ */
+typedef bool (sfc_dp_tx_qtx_ev_t)(struct sfc_dp_txq *dp_txq, unsigned int id);
+
+/**
  * Transmit queue function called after the queue flush.
  */
 typedef void (sfc_dp_tx_qreap_t)(struct sfc_dp_txq *dp_txq);
@@ -123,6 +138,7 @@ struct sfc_dp_tx {
 	sfc_dp_tx_qdestroy_t		*qdestroy;
 	sfc_dp_tx_qstart_t		*qstart;
 	sfc_dp_tx_qstop_t		*qstop;
+	sfc_dp_tx_qtx_ev_t		*qtx_ev;
 	sfc_dp_tx_qreap_t		*qreap;
 	eth_tx_burst_t			pkt_burst;
 };
@@ -144,6 +160,7 @@ sfc_dp_find_tx_by_caps(struct sfc_dp_list *head, unsigned int avail_caps)
 }
 
 extern struct sfc_dp_tx sfc_efx_tx;
+extern struct sfc_dp_tx sfc_ef10_tx;
 
 #ifdef __cplusplus
 }
diff --git a/drivers/net/sfc/sfc_ef10_tx.c b/drivers/net/sfc/sfc_ef10_tx.c
new file mode 100644
index 0000000..7514529
--- /dev/null
+++ b/drivers/net/sfc/sfc_ef10_tx.c
@@ -0,0 +1,451 @@
+/*-
+ *   BSD LICENSE
+ *
+ * Copyright (c) 2016 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+
+#include <rte_mbuf.h>
+#include <rte_io.h>
+
+#include "efx.h"
+#include "efx_types.h"
+#include "efx_regs.h"
+#include "efx_regs_ef10.h"
+
+#include "sfc_dp_tx.h"
+#include "sfc_tweak.h"
+#include "sfc_kvargs.h"
+#include "sfc_ef10.h"
+
+#define sfc_ef10_tx_err(dpq, ...) \
+	SFC_DP_LOG(SFC_KVARG_DATAPATH_EF10, ERR, dpq, __VA_ARGS__)
+
+/** Maximum length of the mbuf segment data */
+#define SFC_MBUF_SEG_LEN_MAX \
+	((1u << (8 * sizeof(((struct rte_mbuf *)0)->data_len))) - 1)
+
+/** Maximum length of the DMA descriptor data */
+#define SFC_EF10_TX_DMA_DESC_LEN_MAX \
+	((1u << ESF_DZ_TX_KER_BYTE_CNT_WIDTH) - 1)
+
+/** Maximum number of DMA descriptors per mbuf segment */
+#define SFC_EF10_TX_MBUF_SEG_DESCS_MAX \
+	SFC_DIV_ROUND_UP(SFC_MBUF_SEG_LEN_MAX, \
+			 SFC_EF10_TX_DMA_DESC_LEN_MAX)
+
+/**
+ * Maximum number of descriptors/buffers in the Tx ring.
+ * It must guarantee that the corresponding event queue never overfills.
+ * The EF10 native datapath uses an event queue of the same size as the
+ * Tx queue. The maximum number of events on the datapath can be
+ * estimated as the number of Tx queue entries (one event per Tx buffer
+ * in the worst case) plus Tx error and flush events.
+ */
+#define SFC_EF10_TXQ_LIMIT(_ndesc) \
+	((_ndesc) - 1 /* head must not step on tail */ - \
+	 (SFC_EF10_EV_PER_CACHE_LINE - 1) /* max unused EvQ entries */ - \
+	 1 /* Tx error */ - 1 /* flush */)
+
+struct sfc_ef10_tx_sw_desc {
+	struct rte_mbuf			*mbuf;
+};
+
+struct sfc_ef10_txq {
+	unsigned int			flags;
+#define SFC_EF10_TXQ_STARTED		0x1
+#define SFC_EF10_TXQ_NOT_RUNNING	0x2
+#define SFC_EF10_TXQ_EXCEPTION		0x4
+
+	unsigned int			ptr_mask;
+	unsigned int			added;
+	unsigned int			completed;
+	unsigned int			free_thresh;
+	unsigned int			evq_read_ptr;
+	struct sfc_ef10_tx_sw_desc	*sw_ring;
+	efx_qword_t			*txq_hw_ring;
+	volatile void			*doorbell;
+	efx_qword_t			*evq_hw_ring;
+
+	/* Datapath transmit queue anchor */
+	struct sfc_dp_txq		dp;
+};
+
+static inline struct sfc_ef10_txq *
+sfc_ef10_txq_by_dp_txq(struct sfc_dp_txq *dp_txq)
+{
+	return container_of(dp_txq, struct sfc_ef10_txq, dp);
+}
+
+static bool
+sfc_ef10_tx_get_event(struct sfc_ef10_txq *txq, efx_qword_t *tx_ev)
+{
+	volatile efx_qword_t *evq_hw_ring = txq->evq_hw_ring;
+
+	/*
+	 * The exception flag is set when reap is done.
+	 * Reap is never done twice per packet burst, and absence of the
+	 * flag is checked on entry to the packet burst function.
+	 */
+	SFC_ASSERT((txq->flags & SFC_EF10_TXQ_EXCEPTION) == 0);
+
+	*tx_ev = evq_hw_ring[txq->evq_read_ptr & txq->ptr_mask];
+
+	if (!sfc_ef10_ev_present(*tx_ev))
+		return false;
+
+	if (unlikely(EFX_QWORD_FIELD(*tx_ev, FSF_AZ_EV_CODE) !=
+		     FSE_AZ_EV_CODE_TX_EV)) {
+		/*
+		 * Do not move read_ptr to keep the event for exception
+		 * handling by the control path.
+		 */
+		txq->flags |= SFC_EF10_TXQ_EXCEPTION;
+		sfc_ef10_tx_err(&txq->dp.dpq,
+				"TxQ exception at EvQ read ptr %#x",
+				txq->evq_read_ptr);
+		return false;
+	}
+
+	txq->evq_read_ptr++;
+	return true;
+}
+
+static void
+sfc_ef10_tx_reap(struct sfc_ef10_txq *txq)
+{
+	const unsigned int old_read_ptr = txq->evq_read_ptr;
+	const unsigned int ptr_mask = txq->ptr_mask;
+	unsigned int completed = txq->completed;
+	unsigned int pending = completed;
+	const unsigned int curr_done = pending - 1;
+	unsigned int anew_done = curr_done;
+	efx_qword_t tx_ev;
+
+	while (sfc_ef10_tx_get_event(txq, &tx_ev)) {
+		/*
+		 * DROP_EVENT is internal to the NIC; software should
+		 * never see it and, therefore, may ignore it.
+		 */
+
+		/* Update the latest done descriptor */
+		anew_done = EFX_QWORD_FIELD(tx_ev, ESF_DZ_TX_DESCR_INDX);
+	}
+	pending += (anew_done - curr_done) & ptr_mask;
+
+	if (pending != completed) {
+		do {
+			struct sfc_ef10_tx_sw_desc *txd;
+
+			txd = &txq->sw_ring[completed & ptr_mask];
+
+			if (txd->mbuf != NULL) {
+				rte_pktmbuf_free(txd->mbuf);
+				txd->mbuf = NULL;
+			}
+		} while (++completed != pending);
+
+		txq->completed = completed;
+	}
+
+	sfc_ef10_ev_qclear(txq->evq_hw_ring, ptr_mask, old_read_ptr,
+			   txq->evq_read_ptr);
+}
+
+static void
+sfc_ef10_tx_qdesc_dma_create(phys_addr_t addr, uint16_t size, bool eop,
+			     efx_qword_t *edp)
+{
+	EFX_POPULATE_QWORD_4(*edp,
+			     ESF_DZ_TX_KER_TYPE, 0,
+			     ESF_DZ_TX_KER_CONT, !eop,
+			     ESF_DZ_TX_KER_BYTE_CNT, size,
+			     ESF_DZ_TX_KER_BUF_ADDR, addr);
+}
+
+static inline void
+sfc_ef10_tx_qpush(struct sfc_ef10_txq *txq, unsigned int added,
+		  unsigned int pushed)
+{
+	efx_qword_t desc;
+	efx_oword_t oword;
+
+	/*
+	 * This improves performance by pushing a TX descriptor at the same
+	 * time as the doorbell. The descriptor must also be added to the
+	 * TXQ ring, so that it can be used if the hardware decides not to
+	 * use the pushed descriptor.
+	 */
+	desc.eq_u64[0] = txq->txq_hw_ring[pushed & txq->ptr_mask].eq_u64[0];
+	EFX_POPULATE_OWORD_3(oword,
+		ERF_DZ_TX_DESC_WPTR, added & txq->ptr_mask,
+		ERF_DZ_TX_DESC_HWORD, EFX_QWORD_FIELD(desc, EFX_DWORD_1),
+		ERF_DZ_TX_DESC_LWORD, EFX_QWORD_FIELD(desc, EFX_DWORD_0));
+
+	/* DMA sync to device is not required */
+
+	/*
+	 * rte_io_wmb() guarantees that the STORE operations
+	 * (i.e. Tx and event descriptor updates) that precede
+	 * the call are visible to the NIC before the STORE
+	 * operations that follow it (i.e. the doorbell write).
+	 */
+	rte_io_wmb();
+
+	*(volatile __m128i *)txq->doorbell = oword.eo_u128[0];
+}
+
+static uint16_t
+sfc_ef10_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+	struct sfc_ef10_txq * const txq = sfc_ef10_txq_by_dp_txq(tx_queue);
+	unsigned int ptr_mask;
+	unsigned int added;
+	unsigned int dma_desc_space;
+	bool reap_done;
+	struct rte_mbuf **pktp;
+	struct rte_mbuf **pktp_end;
+
+	if (unlikely(txq->flags &
+		     (SFC_EF10_TXQ_NOT_RUNNING | SFC_EF10_TXQ_EXCEPTION)))
+		return 0;
+
+	ptr_mask = txq->ptr_mask;
+	added = txq->added;
+	dma_desc_space = SFC_EF10_TXQ_LIMIT(ptr_mask + 1) -
+			 (added - txq->completed);
+
+	reap_done = (dma_desc_space < txq->free_thresh);
+	if (reap_done) {
+		sfc_ef10_tx_reap(txq);
+		dma_desc_space = SFC_EF10_TXQ_LIMIT(ptr_mask + 1) -
+				 (added - txq->completed);
+	}
+
+	for (pktp = &tx_pkts[0], pktp_end = &tx_pkts[nb_pkts];
+	     pktp != pktp_end;
+	     ++pktp) {
+		struct rte_mbuf *m_seg = *pktp;
+		unsigned int pkt_start = added;
+		uint32_t pkt_len;
+
+		if (likely(pktp + 1 != pktp_end))
+			rte_mbuf_prefetch_part1(pktp[1]);
+
+		if (m_seg->nb_segs * SFC_EF10_TX_MBUF_SEG_DESCS_MAX >
+		    dma_desc_space) {
+			if (reap_done)
+				break;
+
+			/* Push already prepared descriptors before polling */
+			if (added != txq->added) {
+				sfc_ef10_tx_qpush(txq, added, txq->added);
+				txq->added = added;
+			}
+
+			sfc_ef10_tx_reap(txq);
+			reap_done = true;
+			dma_desc_space = SFC_EF10_TXQ_LIMIT(ptr_mask + 1) -
+				(added - txq->completed);
+			if (m_seg->nb_segs * SFC_EF10_TX_MBUF_SEG_DESCS_MAX >
+			    dma_desc_space)
+				break;
+		}
+
+		pkt_len = m_seg->pkt_len;
+		do {
+			phys_addr_t seg_addr = rte_mbuf_data_dma_addr(m_seg);
+			unsigned int seg_len = rte_pktmbuf_data_len(m_seg);
+
+			SFC_ASSERT(seg_len <= SFC_EF10_TX_DMA_DESC_LEN_MAX);
+
+			pkt_len -= seg_len;
+
+			sfc_ef10_tx_qdesc_dma_create(seg_addr,
+				seg_len, (pkt_len == 0),
+				&txq->txq_hw_ring[added & ptr_mask]);
+			++added;
+
+		} while ((m_seg = m_seg->next) != 0);
+
+		dma_desc_space -= (added - pkt_start);
+
+		/* Assign mbuf to the last used desc */
+		txq->sw_ring[(added - 1) & ptr_mask].mbuf = *pktp;
+	}
+
+	if (likely(added != txq->added)) {
+		sfc_ef10_tx_qpush(txq, added, txq->added);
+		txq->added = added;
+	}
+
+#if SFC_TX_XMIT_PKTS_REAP_AT_LEAST_ONCE
+	if (!reap_done)
+		sfc_ef10_tx_reap(txq);
+#endif
+
+	return pktp - &tx_pkts[0];
+}
+
+
+static sfc_dp_tx_qcreate_t sfc_ef10_tx_qcreate;
+static int
+sfc_ef10_tx_qcreate(uint16_t port_id, uint16_t queue_id,
+		    const struct rte_pci_addr *pci_addr, int socket_id,
+		    const struct sfc_dp_tx_qcreate_info *info,
+		    struct sfc_dp_txq **dp_txqp)
+{
+	struct sfc_ef10_txq *txq;
+	int rc;
+
+	rc = EINVAL;
+	if (info->txq_entries != info->evq_entries)
+		goto fail_bad_args;
+
+	rc = ENOMEM;
+	txq = rte_zmalloc_socket("sfc-ef10-txq", sizeof(*txq),
+				 RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq == NULL)
+		goto fail_txq_alloc;
+
+	sfc_dp_queue_init(&txq->dp.dpq, port_id, queue_id, pci_addr);
+
+	rc = ENOMEM;
+	txq->sw_ring = rte_calloc_socket("sfc-ef10-txq-sw_ring",
+					 info->txq_entries,
+					 sizeof(*txq->sw_ring),
+					 RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq->sw_ring == NULL)
+		goto fail_sw_ring_alloc;
+
+	txq->flags = SFC_EF10_TXQ_NOT_RUNNING;
+	txq->ptr_mask = info->txq_entries - 1;
+	txq->free_thresh = info->free_thresh;
+	txq->txq_hw_ring = info->txq_hw_ring;
+	txq->doorbell = (volatile uint8_t *)info->mem_bar +
+			ER_DZ_TX_DESC_UPD_REG_OFST +
+			info->hw_index * ER_DZ_TX_DESC_UPD_REG_STEP;
+	txq->evq_hw_ring = info->evq_hw_ring;
+
+	*dp_txqp = &txq->dp;
+	return 0;
+
+fail_sw_ring_alloc:
+	rte_free(txq);
+
+fail_txq_alloc:
+fail_bad_args:
+	return rc;
+}
+
+static sfc_dp_tx_qdestroy_t sfc_ef10_tx_qdestroy;
+static void
+sfc_ef10_tx_qdestroy(struct sfc_dp_txq *dp_txq)
+{
+	struct sfc_ef10_txq *txq = sfc_ef10_txq_by_dp_txq(dp_txq);
+
+	rte_free(txq->sw_ring);
+	rte_free(txq);
+}
+
+static sfc_dp_tx_qstart_t sfc_ef10_tx_qstart;
+static int
+sfc_ef10_tx_qstart(struct sfc_dp_txq *dp_txq, unsigned int evq_read_ptr,
+		   unsigned int txq_desc_index)
+{
+	struct sfc_ef10_txq *txq = sfc_ef10_txq_by_dp_txq(dp_txq);
+
+	txq->evq_read_ptr = evq_read_ptr;
+	txq->added = txq->completed = txq_desc_index;
+
+	txq->flags |= SFC_EF10_TXQ_STARTED;
+	txq->flags &= ~(SFC_EF10_TXQ_NOT_RUNNING | SFC_EF10_TXQ_EXCEPTION);
+
+	return 0;
+}
+
+static sfc_dp_tx_qstop_t sfc_ef10_tx_qstop;
+static void
+sfc_ef10_tx_qstop(struct sfc_dp_txq *dp_txq, unsigned int *evq_read_ptr)
+{
+	struct sfc_ef10_txq *txq = sfc_ef10_txq_by_dp_txq(dp_txq);
+
+	txq->flags |= SFC_EF10_TXQ_NOT_RUNNING;
+
+	*evq_read_ptr = txq->evq_read_ptr;
+}
+
+static sfc_dp_tx_qtx_ev_t sfc_ef10_tx_qtx_ev;
+static bool
+sfc_ef10_tx_qtx_ev(struct sfc_dp_txq *dp_txq, __rte_unused unsigned int id)
+{
+	__rte_unused struct sfc_ef10_txq *txq = sfc_ef10_txq_by_dp_txq(dp_txq);
+
+	SFC_ASSERT(txq->flags & SFC_EF10_TXQ_NOT_RUNNING);
+
+	/*
+	 * It is safe to ignore the Tx event since we reap all mbufs on
+	 * queue purge anyway.
+	 */
+
+	return false;
+}
+
+static sfc_dp_tx_qreap_t sfc_ef10_tx_qreap;
+static void
+sfc_ef10_tx_qreap(struct sfc_dp_txq *dp_txq)
+{
+	struct sfc_ef10_txq *txq = sfc_ef10_txq_by_dp_txq(dp_txq);
+	unsigned int txds;
+
+	for (txds = 0; txds <= txq->ptr_mask; ++txds) {
+		if (txq->sw_ring[txds].mbuf != NULL) {
+			rte_pktmbuf_free(txq->sw_ring[txds].mbuf);
+			txq->sw_ring[txds].mbuf = NULL;
+		}
+	}
+
+	txq->flags &= ~SFC_EF10_TXQ_STARTED;
+}
+
+struct sfc_dp_tx sfc_ef10_tx = {
+	.dp = {
+		.name		= SFC_KVARG_DATAPATH_EF10,
+		.type		= SFC_DP_TX,
+		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF10,
+	},
+	.features		= 0,
+	.qcreate		= sfc_ef10_tx_qcreate,
+	.qdestroy		= sfc_ef10_tx_qdestroy,
+	.qstart			= sfc_ef10_tx_qstart,
+	.qtx_ev			= sfc_ef10_tx_qtx_ev,
+	.qstop			= sfc_ef10_tx_qstop,
+	.qreap			= sfc_ef10_tx_qreap,
+	.pkt_burst		= sfc_ef10_xmit_pkts,
+};
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index b064d0a..98f57fc 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1463,6 +1463,7 @@ sfc_register_dp(void)
 		sfc_dp_register(&sfc_dp_head, &sfc_ef10_rx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_efx_rx.dp);
 
+		sfc_dp_register(&sfc_dp_head, &sfc_ef10_tx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_efx_tx.dp);
 	}
 }
diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index 2f96fb8..24071b2 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -209,6 +209,19 @@ sfc_ev_tx(void *arg, __rte_unused uint32_t label, uint32_t id)
 }
 
 static boolean_t
+sfc_ev_dp_tx(void *arg, __rte_unused uint32_t label, uint32_t id)
+{
+	struct sfc_evq *evq = arg;
+	struct sfc_dp_txq *dp_txq;
+
+	dp_txq = evq->dp_txq;
+	SFC_ASSERT(dp_txq != NULL);
+
+	SFC_ASSERT(evq->sa->dp_tx->qtx_ev != NULL);
+	return evq->sa->dp_tx->qtx_ev(dp_txq, id);
+}
+
+static boolean_t
 sfc_ev_exception(void *arg, __rte_unused uint32_t code,
 		 __rte_unused uint32_t data)
 {
@@ -465,7 +478,7 @@ static const efx_ev_callbacks_t sfc_ev_callbacks_efx_tx = {
 static const efx_ev_callbacks_t sfc_ev_callbacks_dp_tx = {
 	.eec_initialized	= sfc_ev_initialized,
 	.eec_rx			= sfc_ev_nop_rx,
-	.eec_tx			= sfc_ev_nop_tx,
+	.eec_tx			= sfc_ev_dp_tx,
 	.eec_exception		= sfc_ev_exception,
 	.eec_rxq_flush_done	= sfc_ev_nop_rxq_flush_done,
 	.eec_rxq_flush_failed	= sfc_ev_nop_rxq_flush_failed,
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index a3a08b7..e38ec17 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -66,7 +66,8 @@ extern "C" {
 
 #define SFC_KVARG_TX_DATAPATH		"tx_datapath"
 #define SFC_KVARG_VALUES_TX_DATAPATH \
-	"[" SFC_KVARG_DATAPATH_EFX "]"
+	"[" SFC_KVARG_DATAPATH_EFX "|" \
+	    SFC_KVARG_DATAPATH_EF10 "]"
 
 struct sfc_adapter;
 
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index a63f746..ffd0bc3 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -174,6 +174,11 @@ sfc_tx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 	info.flags = tx_conf->txq_flags;
 	info.txq_entries = txq_info->entries;
 	info.dma_desc_size_max = encp->enc_tx_dma_desc_size_max;
+	info.txq_hw_ring = txq->mem.esm_base;
+	info.evq_entries = txq_info->entries;
+	info.evq_hw_ring = evq->mem.esm_base;
+	info.hw_index = txq->hw_index;
+	info.mem_bar = sa->mem_bar.esb_base;
 
 	rc = sa->dp_tx->qcreate(sa->eth_dev->data->port_id, sw_index,
 				&SFC_DEV_TO_PCI(sa->eth_dev)->addr,
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 12/13] net/sfc: make multi-segment support a Tx datapath feature
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
                     ` (10 preceding siblings ...)
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 11/13] net/sfc: implement EF10 native Tx datapath Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 13/13] net/sfc: implement simple EF10 native Tx datapath Andrew Rybchenko
  2017-03-20 15:37   ` [dpdk-dev] [PATCH v2 00/13] Improve Solarflare PMD performance Ferruh Yigit
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 drivers/net/sfc/sfc_dp_tx.h   |  1 +
 drivers/net/sfc/sfc_ef10_tx.c |  2 +-
 drivers/net/sfc/sfc_ethdev.c  |  3 +++
 drivers/net/sfc/sfc_tx.c      | 10 +++++++++-
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
index ef62378..f99a53d 100644
--- a/drivers/net/sfc/sfc_dp_tx.h
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -134,6 +134,7 @@ struct sfc_dp_tx {
 	unsigned int			features;
 #define SFC_DP_TX_FEAT_VLAN_INSERT	0x1
 #define SFC_DP_TX_FEAT_TSO		0x2
+#define SFC_DP_TX_FEAT_MULTI_SEG	0x4
 	sfc_dp_tx_qcreate_t		*qcreate;
 	sfc_dp_tx_qdestroy_t		*qdestroy;
 	sfc_dp_tx_qstart_t		*qstart;
diff --git a/drivers/net/sfc/sfc_ef10_tx.c b/drivers/net/sfc/sfc_ef10_tx.c
index 7514529..1ef198a 100644
--- a/drivers/net/sfc/sfc_ef10_tx.c
+++ b/drivers/net/sfc/sfc_ef10_tx.c
@@ -440,7 +440,7 @@ struct sfc_dp_tx sfc_ef10_tx = {
 		.type		= SFC_DP_TX,
 		.hw_fw_caps	= SFC_DP_HW_FW_CAP_EF10,
 	},
-	.features		= 0,
+	.features		= SFC_DP_TX_FEAT_MULTI_SEG,
 	.qcreate		= sfc_ef10_tx_qcreate,
 	.qdestroy		= sfc_ef10_tx_qdestroy,
 	.qstart			= sfc_ef10_tx_qstart,
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 98f57fc..ca6f61c 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -141,6 +141,9 @@ sfc_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 	else
 		dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_VLAN_INSERT;
 
+	if (~sa->dp_tx->features & SFC_DP_TX_FEAT_MULTI_SEG)
+		dev_info->default_txconf.txq_flags |= ETH_TXQ_FLAGS_NOMULTSEGS;
+
 #if EFSYS_OPT_RX_SCALE
 	if (sa->rss_support != EFX_RX_SCALE_UNAVAILABLE) {
 		dev_info->reta_size = EFX_RSS_TBL_SIZE;
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index ffd0bc3..0c8ff26 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -84,6 +84,13 @@ sfc_tx_qcheck_conf(struct sfc_adapter *sa, uint16_t nb_tx_desc,
 		rc = EINVAL;
 	}
 
+	if (((flags & ETH_TXQ_FLAGS_NOMULTSEGS) == 0) &&
+	    (~sa->dp_tx->features & SFC_DP_TX_FEAT_MULTI_SEG)) {
+		sfc_err(sa, "Multi-segment is not supported by %s datapath",
+			sa->dp_tx->dp.name);
+		rc = EINVAL;
+	}
+
 	if ((flags & ETH_TXQ_FLAGS_NOVLANOFFL) == 0) {
 		if (!encp->enc_hw_tx_insert_vlan_enabled) {
 			sfc_err(sa, "VLAN offload is not supported");
@@ -941,7 +948,8 @@ struct sfc_dp_tx sfc_efx_tx = {
 		.hw_fw_caps	= 0,
 	},
 	.features		= SFC_DP_TX_FEAT_VLAN_INSERT |
-				  SFC_DP_TX_FEAT_TSO,
+				  SFC_DP_TX_FEAT_TSO |
+				  SFC_DP_TX_FEAT_MULTI_SEG,
 	.qcreate		= sfc_efx_tx_qcreate,
 	.qdestroy		= sfc_efx_tx_qdestroy,
 	.qstart			= sfc_efx_tx_qstart,
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [dpdk-dev] [PATCH v2 13/13] net/sfc: implement simple EF10 native Tx datapath
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
                     ` (11 preceding siblings ...)
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 12/13] net/sfc: make multi-segment support a Tx datapath feature Andrew Rybchenko
@ 2017-03-20 10:15   ` Andrew Rybchenko
  2017-03-20 15:37   ` [dpdk-dev] [PATCH v2 00/13] Improve Solarflare PMD performance Ferruh Yigit
  13 siblings, 0 replies; 33+ messages in thread
From: Andrew Rybchenko @ 2017-03-20 10:15 UTC (permalink / raw)
  To: dev; +Cc: Ferruh Yigit

The datapath does not support VLAN insertion, TSO, or multi-segment
mbufs.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst   |  5 ++-
 drivers/net/sfc/sfc_dp_tx.h   |  1 +
 drivers/net/sfc/sfc_ef10_tx.c | 73 +++++++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_ethdev.c  |  1 +
 drivers/net/sfc/sfc_kvargs.h  |  4 ++-
 5 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 5c96625..5f825e9 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -236,7 +236,7 @@ boolean parameters value.
   more efficient than libefx-based and provides richer packet type
   classification, but lacks Rx scatter support.
 
-- ``tx_datapath`` [auto|efx|ef10] (default **auto**)
+- ``tx_datapath`` [auto|efx|ef10|ef10_simple] (default **auto**)
 
   Choose transmit datapath implementation.
   **auto** allows the driver itself to make a choice based on firmware
@@ -246,6 +246,9 @@ boolean parameters value.
   **ef10** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which is
   more efficient than libefx-based but has no VLAN insertion and TSO
   support yet.
+  **ef10_simple** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which
+  is even faster than **ef10** but does not support multi-segment
+  mbufs.
 
 - ``perf_profile`` [auto|throughput|low-latency] (default **throughput**)
 
diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
index f99a53d..2bb9a2e 100644
--- a/drivers/net/sfc/sfc_dp_tx.h
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -162,6 +162,7 @@ sfc_dp_find_tx_by_caps(struct sfc_dp_list *head, unsigned int avail_caps)
 
 extern struct sfc_dp_tx sfc_efx_tx;
 extern struct sfc_dp_tx sfc_ef10_tx;
+extern struct sfc_dp_tx sfc_ef10_simple_tx;
 
 #ifdef __cplusplus
 }
diff --git a/drivers/net/sfc/sfc_ef10_tx.c b/drivers/net/sfc/sfc_ef10_tx.c
index 1ef198a..74bd822 100644
--- a/drivers/net/sfc/sfc_ef10_tx.c
+++ b/drivers/net/sfc/sfc_ef10_tx.c
@@ -313,6 +313,64 @@ sfc_ef10_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	return pktp - &tx_pkts[0];
 }
 
+static uint16_t
+sfc_ef10_simple_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+			  uint16_t nb_pkts)
+{
+	struct sfc_ef10_txq * const txq = sfc_ef10_txq_by_dp_txq(tx_queue);
+	unsigned int ptr_mask;
+	unsigned int added;
+	unsigned int dma_desc_space;
+	bool reap_done;
+	struct rte_mbuf **pktp;
+	struct rte_mbuf **pktp_end;
+
+	if (unlikely(txq->flags &
+		     (SFC_EF10_TXQ_NOT_RUNNING | SFC_EF10_TXQ_EXCEPTION)))
+		return 0;
+
+	ptr_mask = txq->ptr_mask;
+	added = txq->added;
+	dma_desc_space = SFC_EF10_TXQ_LIMIT(ptr_mask + 1) -
+			 (added - txq->completed);
+
+	reap_done = (dma_desc_space < RTE_MAX(txq->free_thresh, nb_pkts));
+	if (reap_done) {
+		sfc_ef10_tx_reap(txq);
+		dma_desc_space = SFC_EF10_TXQ_LIMIT(ptr_mask + 1) -
+				 (added - txq->completed);
+	}
+
+	pktp_end = &tx_pkts[MIN(nb_pkts, dma_desc_space)];
+	for (pktp = &tx_pkts[0]; pktp != pktp_end; ++pktp) {
+		struct rte_mbuf *pkt = *pktp;
+		unsigned int id = added & ptr_mask;
+
+		SFC_ASSERT(rte_pktmbuf_data_len(pkt) <=
+			   SFC_EF10_TX_DMA_DESC_LEN_MAX);
+
+		sfc_ef10_tx_qdesc_dma_create(rte_mbuf_data_dma_addr(pkt),
+					     rte_pktmbuf_data_len(pkt),
+					     true, &txq->txq_hw_ring[id]);
+
+		txq->sw_ring[id].mbuf = pkt;
+
+		++added;
+	}
+
+	if (likely(added != txq->added)) {
+		sfc_ef10_tx_qpush(txq, added, txq->added);
+		txq->added = added;
+	}
+
+#if SFC_TX_XMIT_PKTS_REAP_AT_LEAST_ONCE
+	if (!reap_done)
+		sfc_ef10_tx_reap(txq);
+#endif
+
+	return pktp - &tx_pkts[0];
+}
+
 
 static sfc_dp_tx_qcreate_t sfc_ef10_tx_qcreate;
 static int
@@ -449,3 +507,18 @@ struct sfc_dp_tx sfc_ef10_tx = {
 	.qreap			= sfc_ef10_tx_qreap,
 	.pkt_burst		= sfc_ef10_xmit_pkts,
 };
+
+struct sfc_dp_tx sfc_ef10_simple_tx = {
+	.dp = {
+		.name		= SFC_KVARG_DATAPATH_EF10_SIMPLE,
+		.type		= SFC_DP_TX,
+	},
+	.features		= 0,
+	.qcreate		= sfc_ef10_tx_qcreate,
+	.qdestroy		= sfc_ef10_tx_qdestroy,
+	.qstart			= sfc_ef10_tx_qstart,
+	.qtx_ev			= sfc_ef10_tx_qtx_ev,
+	.qstop			= sfc_ef10_tx_qstop,
+	.qreap			= sfc_ef10_tx_qreap,
+	.pkt_burst		= sfc_ef10_simple_xmit_pkts,
+};
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index ca6f61c..b745714 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1468,6 +1468,7 @@ sfc_register_dp(void)
 
 		sfc_dp_register(&sfc_dp_head, &sfc_ef10_tx.dp);
 		sfc_dp_register(&sfc_dp_head, &sfc_efx_tx.dp);
+		sfc_dp_register(&sfc_dp_head, &sfc_ef10_simple_tx.dp);
 	}
 }
 
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index e38ec17..d9c3b1d 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -58,6 +58,7 @@ extern "C" {
 
 #define SFC_KVARG_DATAPATH_EFX		"efx"
 #define SFC_KVARG_DATAPATH_EF10		"ef10"
+#define SFC_KVARG_DATAPATH_EF10_SIMPLE	"ef10_simple"
 
 #define SFC_KVARG_RX_DATAPATH		"rx_datapath"
 #define SFC_KVARG_VALUES_RX_DATAPATH \
@@ -67,7 +68,8 @@ extern "C" {
 #define SFC_KVARG_TX_DATAPATH		"tx_datapath"
 #define SFC_KVARG_VALUES_TX_DATAPATH \
 	"[" SFC_KVARG_DATAPATH_EFX "|" \
-	    SFC_KVARG_DATAPATH_EF10 "]"
+	    SFC_KVARG_DATAPATH_EF10 "|" \
+	    SFC_KVARG_DATAPATH_EF10_SIMPLE "]"
 
 struct sfc_adapter;
 
-- 
2.9.3

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] [PATCH v2 00/13] Improve Solarflare PMD performance
  2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
                     ` (12 preceding siblings ...)
  2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 13/13] net/sfc: implement simple EF10 native Tx datapath Andrew Rybchenko
@ 2017-03-20 15:37   ` Ferruh Yigit
  13 siblings, 0 replies; 33+ messages in thread
From: Ferruh Yigit @ 2017-03-20 15:37 UTC (permalink / raw)
  To: Andrew Rybchenko, dev

On 3/20/2017 10:15 AM, Andrew Rybchenko wrote:
> Implement EF10 (SFN7xxx and SFN8xxx) native datapaths which may be
> chosen per device using PCI whitelist device arguments.
> 
> libefx-based datapath implementation is bound to API and structure
> imposed by the libefx. It has many indirect function calls to
> provide HW abstraction (bad for CPU pipeline) and uses many data
> structures: driver Rx/Tx queue, driver event queue, libefx Rx/Tx
> queue, libefx event queue, libefx NIC (bad for cache).
> 
> Native datapath implementation is fully separated from control
> path to be able to use alternative control path if required
> (e.g. kernel-aware).
> 
> Native datapaths show better performance than libefx-based.
> 
> v2:
>  - fix spelling, reword commit messages as requested
>  - exclude packed stream support since it shows worse performance yet
>    because of indirect mbufs usage
>  - use uint16_t for port_id to avoid changes when corresponding
>    mbuf patches are applied
>  - add header with functions shared by EF10 Rx and Tx
>  - clear event queue entries by cache-lines
>  - remove unnecessary checks in refill code
>  - add missing BSD LICENSE line to new files
>  - avoid usage of function pointers in state structures, make EF10
>    native datapath multi-process support friendly to avoid code
>    shuffling in the future
>  - remove unnecessary memory barriers, add corresponding comments
>  - do not use libefx macros for Rx/Tx queue limit, define own which
>    take event queue clear by cache-line into account
> 
> Andrew Rybchenko (13):
>   net/sfc: use different callbacks for event queues
>   net/sfc: emphasis that RSS hash flag is an Rx queue flag
>   net/sfc: do not use Rx queue control state on datapath
>   net/sfc: factor out libefx-based Rx datapath
>   net/sfc: make Rx scatter a datapath-dependent feature
>   net/sfc: remove few conditions in Rx queue refill
>   net/sfc: implement EF10 native Rx datapath
>   net/sfc: factor out libefx-based Tx datapath
>   net/sfc: make VLAN insertion a datapath-dependent feature
>   net/sfc: make TSO a datapath-dependent feature
>   net/sfc: implement EF10 native Tx datapath
>   net/sfc: make multi-segment support a Tx datapath feature
>   net/sfc: implement simple EF10 native Tx datapath

Series applied to dpdk-next-net/master, thanks.

^ permalink raw reply	[flat|nested] 33+ messages in thread
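As the cover letter notes, the datapath is chosen per device via PCI whitelist device arguments. A hypothetical invocation sketch (the PCI address is a placeholder; `tx_datapath` and `rx_datapath` are the kvargs documented in the `sfc_efx.rst` hunks above):

```shell
# Select the simple EF10 Tx datapath and the EF10 Rx datapath for one
# whitelisted device; all values here are illustrative.
testpmd -w 0000:01:00.0,tx_datapath=ef10_simple,rx_datapath=ef10 -- -i
```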

end of thread, other threads:[~2017-03-20 15:37 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-02  7:07 [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 01/13] net/sfc: callbacks should depend on EvQ usage Andrew Rybchenko
2017-03-04 21:04   ` Ferruh Yigit
2017-03-02  7:07 ` [dpdk-dev] [PATCH 02/13] net/sfc: emphasis that RSS hash flag is an Rx queue flag Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 03/13] net/sfc: do not use Rx queue control state on datapath Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 04/13] net/sfc: factor out libefx-based Rx datapath Andrew Rybchenko
2017-03-04 21:05   ` Ferruh Yigit
2017-03-13 13:12     ` Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 05/13] net/sfc: Rx scatter is a datapath-dependent feature Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 06/13] net/sfc: implement EF10 native Rx datapath Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 07/13] net/sfc: factory out libefx-based Tx datapath Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 08/13] net/sfc: VLAN insertion is a datapath dependent feature Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 09/13] net/sfc: TSO " Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 10/13] net/sfc: implement EF10 native Tx datapath Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 11/13] net/sfc: multi-segment support as is Tx datapath features Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 12/13] net/sfc: implement simple EF10 native Tx datapath Andrew Rybchenko
2017-03-02  7:07 ` [dpdk-dev] [PATCH 13/13] net/sfc: support Rx packed stream EF10-specific datapath Andrew Rybchenko
2017-03-04 21:07 ` [dpdk-dev] [PATCH 00/13] Improve Solarflare PMD performance Ferruh Yigit
2017-03-20 10:15 ` [dpdk-dev] [PATCH v2 " Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 01/13] net/sfc: use different callbacks for event queues Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 02/13] net/sfc: emphasis that RSS hash flag is an Rx queue flag Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 03/13] net/sfc: do not use Rx queue control state on datapath Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 04/13] net/sfc: factor out libefx-based Rx datapath Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 05/13] net/sfc: make Rx scatter a datapath-dependent feature Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 06/13] net/sfc: remove few conditions in Rx queue refill Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 07/13] net/sfc: implement EF10 native Rx datapath Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 08/13] net/sfc: factor out libefx-based Tx datapath Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 09/13] net/sfc: make VLAN insertion a datapath-dependent feature Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 10/13] net/sfc: make TSO " Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 11/13] net/sfc: implement EF10 native Tx datapath Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 12/13] net/sfc: make multi-segment support a Tx datapath feature Andrew Rybchenko
2017-03-20 10:15   ` [dpdk-dev] [PATCH v2 13/13] net/sfc: implement simple EF10 native Tx datapath Andrew Rybchenko
2017-03-20 15:37   ` [dpdk-dev] [PATCH v2 00/13] Improve Solarflare PMD performance Ferruh Yigit
